Advanced Series in Neuroscience - Vol. 3
Biology and Computation: A Physicist's Choice
edited by
H. Gutfreund, The Hebrew University of Jerusalem, Mount Scopus, Jerusalem 91905, Israel
G. Toulouse, Laboratoire de Physique Statistique, École Normale Supérieure, 24 rue Lhomond, 75231 Paris Cedex 05, France
World Scientific
Singapore • New Jersey • London • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 9128
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 73 Lynton Mead, Totteridge, London N20 8DH
The editors and publisher are grateful to the authors and the following publishers for their assistance and their permission to reproduce the articles found in this volume:
Academic Press Inc.
The American Association for the Advancement of Science (Science)
American Institute of Physics (Rev. Mod. Phys.)
American Physical Society (Phys. Rev. A)
American Psychological Association (Psychological Review)
American Scientist (Am. Sci.)
Annual Reviews Inc. (Annu. Rev. Neurosci.)
Blackwell Publishers
Blackwell Scientific Publications, Inc.
Cambridge University Press
Cold Spring Harbor Laboratory Press
Dover Publications Inc.
Elsevier Science Publishers BV
Elsevier Science Publishers Ltd (UK)
IOP Publishing Ltd
Karger, Basel
Kluwer Academic Publishers
Lawrence Erlbaum Associates, Inc.
Macmillan Magazines Ltd (Nature)
Massachusetts Institute of Technology
MIT Press
National Academy of Sciences (Proc. Natl. Acad. Sci. USA)
Pergamon Press
Rockefeller University Press
Routledge
Springer Verlag
Yale University Press
BIOLOGY AND COMPUTATION: A PHYSICIST'S CHOICE
Copyright © 1994 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 27 Congress Street, Salem, MA 01970, USA.
ISBN 981-02-1405-7
ISBN 981-02-1406-5 (pbk)
Printed in Singapore
CONTENTS

GENERAL INTRODUCTION    xi

CHAPTER 1   SETTING THE STAGE    1

a) Forewords; Introductory Warnings    3

1. The Mechanical Mind    5
   H. B. Barlow, Annu. Rev. Neurosci. 13, 15-24 (1990).

2. Esthetic Analysis of Works of Art    15
   H. L. F. Helmholtz, in On the Sensations of Tone, transl. by A. J. Ellis, pp. 366-371 (Dover 1954).

3. More is Different    21
   P. W. Anderson, Science 177, 393-396 (1972).

4. What is Computational Neuroscience?    25
   P. S. Churchland, C. Koch and T. J. Sejnowski, in Computational Neuroscience, ed. E. Schwartz, pp. 46-55 (MIT Press 1990).

b) Physics, Biology, Computation    35

5. Life, Thermodynamics, and Cybernetics    37
   L. Brillouin, Am. Sci. 37, 554-568 (1949).

6. Physics, Biological Computation, and Complementarity    52
   J. J. Hopfield, in The Lesson of Quantum Theory, Proceedings of the Niels Bohr Centenary Symposium, eds. D. de Boer, E. Dahl and O. Ulfbeck, pp. 295-314 (North-Holland 1986).

7. Evolution and Tinkering    72
   F. Jacob, Science 196, 1161-1166 (1977).

8. Physics and Antiphysics    78
   V. Braitenberg, in On the Texture of Brains, Chap. 2, pp. 5-8 (Springer 1977).

9. Manifesto of Brain Science    82
   V. Braitenberg, in Information Processing in the Cortex, eds. A. Aertsen and V. Braitenberg, pp. 473-477 (Springer 1992).

c) Computer and Brain; Logic and Statistics    87

10. The Logical Structure of the Nervous System; Nature of the System of Notations Employed: Not Digital But Statistical; The Language of the Brain Not the Language of Mathematics    89
    J. von Neumann, in The Computer and the Brain, pp. 74-82 (Yale U. Press 1958).

11. Probability, Philosophy and Science: A Briefing for Bayesians    98
    A. J. M. Garrett, in Maximum Entropy and Bayesian Methods, ed. J. Skilling, pp. 107-116 (Kluwer 1989).

12. Two Revolutions — Cognitive and Probabilistic; Is the Mind a Bayesian?    108
    G. Gigerenzer and D. J. Murray, in Cognition as Intuitive Statistics, Chaps. XI-XII, pp. 147-162 (Lawrence Erlbaum 1987).

d) Some Perceptual Facts and Issues    124

13. Perception    125
    M. Delbrück, in Mind from Matter?, Chap. 8, pp. 109-119 (Blackwell 1986).

14. Understanding Images in the Brain    136
    C. Blakemore, in Images and Understanding, eds. H. Barlow, C. Blakemore and M. Weston-Smith, pp. 257-283, 386-388 (Cambridge U. Press 1990).

CHAPTER 2   BIOLOGICAL CONCEPTS AND METHODS; COMPUTATIONAL GOALS AND MEANS    164

a) Mental Representations    166

15. Concerning Imagery    168
    D. O. Hebb, Psychological Review 75, 466-477 (1968).

16. Mental Rotation of Three-Dimensional Objects    180
    R. N. Shepard and J. Metzler, Science 171, 701-703 (1971).

17. Mental Rotation of the Neuronal Population Vector    183
    A. P. Georgopoulos, J. T. Lurito, M. Petrides, A. B. Schwartz and J. T. Massey, Science 243, 234-236 (1989).

b) Information Theory and Perception    186

18. Whatever Happened to Information Theory?    188
    R. L. Gregory, in Odd Perceptions, pp. 187-194 (Methuen 1986).

19. Some Informational Aspects of Visual Perception    196
    F. Attneave, Psychological Review 61, 183-193 (1954).

20. The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information    207
    G. A. Miller, Psychological Review 63, 81-97 (1956).

21. Single Units and Sensation: A Neuron Doctrine for Perceptual Psychology?    224
    H. B. Barlow, Perception 1, 371-394 (1972).

22. Conditions for Versatile Learning, Helmholtz's Unconscious Inference, and the Task of Perception    248
    H. B. Barlow, Vision Research 30, 1561-1571 (1990).

c) Neuroanatomy    259

23. Reading the Structure of Brains    260
    V. Braitenberg, Network 1, 1-11 (1990).

24. Axonal Trees and Cortical Architecture    271
    G. J. Mitchison, Trends Neurosci. 15, 122-126 (1992).

d) Aspects of Biocomputation    276

25. Antibodies and Learning; Selection Versus Instruction    278
    N. K. Jerne, in The Neurosciences: A Study Program, eds. G. Quarton, T. Melnechuk and F. O. Schmitt, pp. 200-205 (Rockefeller U. Press 1967).

26. Early Vision and Focal Attention    284
    B. Julesz, Reviews of Modern Physics 63, 735-772 (1991).

27. Neuropsychology and the Nature of Consciousness    322
    L. Weiskrantz, in Mindwaves, eds. C. Blakemore and S. Greenfield, pp. 306-320 (Blackwell 1987).

CHAPTER 3   MODES OF COMPUTATION; PROCESSING AND LEARNING    337

a) Neural Networks    339

28. Collective Processing and Neural States    341
    J. J. Hopfield, in Modeling and Analysis in Biomedicine, ed. C. Nicolini, pp. 369-389 (World Sci. Pub. 1984).

29. The Space of Interactions in Neural Network Models    361
    E. Gardner, J. Phys. A 21, 257-270 (1988).

b) Parallel Algorithms    375

30. Optimization by Simulated Annealing    376
    S. Kirkpatrick, C. Gelatt and M. P. Vecchi, Science 220, 671-680 (1983).

31. An Analogue Approach to the Travelling Salesman Problem Using an Elastic Net Method    386
    R. Durbin and D. Willshaw, Nature 326, 689-691 (1987).

c) Generalization; Learning a Rule    389

32. Large Automatic Learning, Rule Extraction, and Generalization    391
    J. Denker, D. Schwartz, B. Wittner, S. Solla, R. Howard, L. Jackel and J. J. Hopfield, Complex Systems 1, 877-922 (1987).

33. Confronting Neural Network and Human Behavior in a Quasiregular Environment    437
    M. A. Virasoro, article based on a paper presented at the conference "From Statistical Physics to Statistical Inference, and Back", Cargèse, 1992.

34. Statistical Mechanics of Learning from Examples    447
    H. S. Seung, H. Sompolinsky and N. Tishby, Phys. Rev. A 45, 6056-6091 (1992).

35. On the Classification of Learning Machines    483
    G. Parisi, Network 3, 259-265 (1992).

d) Early Sensory Processing    490

36. Perceptual Neural Organization: Some Approaches Based on Network Models and Information Theory    492
    R. Linsker, Annu. Rev. Neurosci. 13, 257-281 (1990).

37. An Information-Theoretic View of Analog Representation in Striate Cortex    517
    J. G. Daugman, in Computational Neuroscience, ed. E. Schwartz, pp. 403-423 (MIT Press 1990).

38. What Does the Retina Know About Natural Scenes?    538
    J. J. Atick and A. N. Redlich, Neural Computation 4, 196-210 (1992).

e) Neural Codes    553

39. Finding Minimum Entropy Codes    554
    H. B. Barlow, T. P. Kaushal and G. J. Mitchison, Neural Computation 1, 412-423 (1989).

40. Reading a Neural Code    566
    W. Bialek, F. Rieke, R. R. de Ruyter van Steveninck and D. Warland, Science 252, 1854-1857 (1991).

41. Spike Arrival Times: A Highly Efficient Coding Scheme for Neural Networks    570
    S. J. Thorpe, in Parallel Processing in Neural Systems and Computers, eds. R. Eckmiller, G. Hartmann and G. Hauske, pp. 91-94 (Elsevier 1990).

CHAPTER 4   BRAIN AREAS, CIRCUITS AND DYNAMICS    574

a) Sensory and Motor Pathways    576

42. Deciphering the Brain's Codes    578
    M. Konishi, Neural Computation 3, 1-18 (1991).

43. Segregation of Form, Color, Movement, and Depth: Anatomy, Physiology, and Perception    596
    M. S. Livingstone and D. H. Hubel, Science 240, 740-749 (1988).

44. Separate Visual Pathways for Perception and Action    606
    M. A. Goodale and A. D. Milner, Trends Neurosci. 15, 20-25 (1992).

45. Behavioral Neurophysiology: Insights into Seeing and Grasping    612
    S. P. Wise and R. Desimone, Science 242, 736-741 (1988).

46. A Back-Propagation Programmed Network that Simulates Response Properties of a Subset of Posterior Parietal Neurons    618
    D. Zipser and R. A. Andersen, Nature 331, 679-684 (1988).

b) Bridges Between Psychophysics and Physiology    624

47. Neuronal Correlates of a Perceptual Decision    625
    W. T. Newsome, K. H. Britten and J. A. Movshon, Nature 341, 52-54 (1989).

48. Subjective Contours — Bridging the Gap Between Psychophysics and Physiology    627
    E. Peterhans and R. von der Heydt, Trends Neurosci. 14, 112-119 (1991).

c) Structures and Functions of Various Brain Areas    635

49. Reciprocal Links of the Corpus Striatum with the Cerebral Cortex and Limbic System: A Common Substrate for Movement and Thought?    636
    W. J. H. Nauta, in Neurology and Psychiatry: A Meeting of Minds, ed. J. Mueller, pp. 43-63 (Karger, Basel 1989).

50. The Cerebellar Network: Attempt at a Formalization of Its Structure    657
    V. Braitenberg, Network 4, 11-17 (1993).

51. Hippocampal Synaptic Enhancement and Information Storage Within a Distributed Memory System    665
    B. L. McNaughton and R. G. M. Morris, Trends Neurosci. 10, 408-415 (1987).

d) Representations of Space in the Brain    673

52. The Updating of the Representation of Visual Space in Parietal Cortex by Intended Eye Movements    675
    J. R. Duhamel, C. L. Colby and M. E. Goldberg, Science 255, 90-92 (1992).

53. On Listing's Law    678
    K. Hepp, Commun. Math. Phys. 132, 285-292 (1990).

54. Two- Rather Than Three-Dimensional Representation of Saccades in Monkey Superior Colliculus    686
    A. J. van Opstal, K. Hepp, B. J. M. Hess, D. Straumann and V. Henn, Science 252, 1313-1315 (1991).

e) Oscillations and Synchrony    689

55. Temporal Coding in the Visual Cortex: New Vistas on Integration in the Nervous System    690
    A. K. Engel, P. König, A. K. Kreiter, T. B. Schillen and W. Singer, Trends Neurosci. 15, 218-226 (1992).

CHAPTER 5   DEBATES AND SPECULATIONS    699

a) Theory-Experiment Interplay    701

56. Opening the Grey Box    702
    R. J. Douglas and K. A. C. Martin, Trends Neurosci. 14, 286-293 (1991).

57. In Defence of Single Electrode Recordings    710
    D. J. Amit, Network 3, 385-391 (1992).

b) Roles of Retroactivation    717

58. Function of the Thalamic Reticular Complex: The Searchlight Hypothesis    718
    F. Crick, Proc. Natl. Acad. Sci. USA 81, 4586-4590 (1984).

59. Functions of Neural Networks in the Hippocampus and Neocortex in Memory    723
    E. T. Rolls, in Neural Models of Plasticity, eds. J. H. Byrne and W. O. Berry, pp. 240-265 (Academic Press 1989).

60. The Brain Binds Entities and Events by Multiregional Activation from Convergence Zones    749
    A. R. Damasio, Neural Computation 1, 123-132 (1989).

61. On the Computational Architecture of the Neocortex: I — The Role of the Thalamocortical Loop    759
    D. Mumford, Biological Cybernetics 65, 135-145 (1991).

62. On the Computational Architecture of the Neocortex: II — The Role of the Cortico-Cortical Loops    770
    D. Mumford, Biological Cybernetics 66, 241-251 (1992).

c) Computational Strategies    781

63. Shifter Circuits: A Computational Strategy for Dynamic Aspects of Visual Processing    783
    C. H. Anderson and D. C. van Essen, Proc. Natl. Acad. Sci. USA 84, 6297-6301 (1987).

64. Olfactory Computation and Object Perception    788
    J. J. Hopfield, Proc. Natl. Acad. Sci. USA 88, 6462-6466 (1991).

65. A Theory of How the Brain Might Work    793
    T. Poggio, in The Brain, Cold Spring Harbor Symposium on Quantitative Biology, Vol. LV, pp. 899-910 (Cold Spring Harbor Laboratory Press 1990).

d) Language and Consciousness    805

66. Intelligence, Guesswork, Language    806
    H. B. Barlow, Nature 304, 185-195 (1987).

67. The Biological Role of Consciousness    817
    H. B. Barlow, in Mindwaves, eds. C. Blakemore and S. Greenfield, pp. 361-374 (Blackwell 1987).

68. The Problem of Consciousness and Introspection    831
    D. O. Hebb, in Brain Mechanisms and Consciousness, ed. J. F. Delafresnaye, pp. 402-417 (Blackwell 1954).

Index    847
GENERAL INTRODUCTION
This book may simply be used as a collection of articles related to neural computation. In this compound opus the reprinted texts are the prominent figures, and the editors' introductions merely provide some background. However, the book will acquire some additional meaning if the reader perceives our initial motivations, and the process by which it evolved toward its present organisation.
Purpose

The purpose of this reprint volume is two-fold. The first is to help physicists entering the field of neural networks and brain studies to obtain a broader view of the context of a domain, new for them but not for science. The second, reciprocally, is to help scientists of other disciplines to reach a better understanding of physicists' contributions within a context of perspectives they can relate to. When we entered this field in the mid-80s, we first felt acutely the difficulty of locating and accessing relevant texts of high quality and lasting value, among a myriad of books, conference proceedings and journals. Subsequently, a second difficulty was experienced as we attempted to communicate to unprepared audiences what the statistical physics of neural networks had achieved. Eventually we came to think that these two difficulties might have a common cause, and a common cure. The different scientific subcultures have developed not only different methods and terminology but also different scientific attitudes. When they meet and merge in the brain-study arena, a joint effort of cross-disciplinary understanding and dialogue is required. It is to this goal that our volume is meant to contribute.
Boundaries

In the meantime, several useful reprint volumes have appeared. Mention should be made of the two earlier books in this "Advanced Series in Neuroscience" (World Scientific Pub. Co., Singapore):
• Brain Theory edited by G. L. Shaw and G. Palm (1988); hereafter referred to as Vol. 1.
• The Neurobiology of Learning and Memory edited by G. L. Shaw, J. L. McGaugh and S. P. R. Rose (1990); Vol. 2.
Quite naturally, we decided on a book which would be complementary to these two collections as well as avoid any duplication. With another worthy pair of books, endowed with insightful introductions, Neurocomputing and Neurocomputing 2, edited by J. A. Anderson et al. (MIT Press, 1988, 1990), the overlap has been limited to five texts (our Refs. (10), (21), (30), (46), (58)), which we felt necessary for the general harmony of our selection. Some other general resource books, at the boundary of our domain, are The Oxford Companion to the Mind edited by R. L. Gregory (Oxford U. Press, 1987) and the twin volumes of Parallel Distributed Processing edited by J. McClelland and D. Rumelhart (MIT Press, 1986). Several textbooks on modern neural network theory have appeared recently, among which we will mention three: Modeling Brain Function by D. J. Amit (Cambridge U. Press, 1989), Introduction to the Theory of Neural Computation by J. Hertz, A. Krogh and R. G. Palmer (Addison-Wesley, 1991), and An Introduction to the Modeling of Neural Networks by P. Peretto (Cambridge U. Press, 1992).
Organisation

In a sense, the preceding references define the boundaries of our book, namely, what it does not attempt to cover, and what it is not. But then what is it? It is a sequence of 68 reprints, organised into 5 chapters and 22 sections. There is a certain arbitrariness at all levels: selection, grouping into sections, and successive ordering. The titles chosen for the chapters and sections are, so to speak, a guide for the eye. They were invented a posteriori, in order to emphasise some thematic relationships between papers placed in succession. But, besides the "local relations" provided by the grouping into sections, there are also many "long-range relations". It is the purpose of our introductions to the various chapters and sections to indicate the richness of these relations. Further demonstration of the many connections between the different papers may be derived from the Index. Our goal will be achieved if the collection of reprints appears to the attentive reader as a connected, yet flexible, web. Not too tight, because the field is a rapidly evolving one. Not too loose, because the appeal of this scientific enterprise rests largely on the convergence of diverse approaches.
Contents

The text selection process evolved over the span of a year. We consulted many colleagues from many disciplines. Invaluable advice was received from a number of experts. It is a pleasure to acknowledge our particular gratitude to: Roger Balian, Horace Barlow, Elie Bienenstock, Valentino Braitenberg, Jean Bullier, Pierre Buser, Miri Dick, Klaus Hepp, John Hopfield, Claude Meunier, Jean-Michel Roy, Simon Thorpe and David Willshaw, among all those who were consulted. Two articles, (35) and (57), have been specially conceived for this volume; we are grateful to their authors. Although these colleagues bear no responsibility, of course, for the final selection — where much arbitrary pruning had to be done in order to keep the project within a reasonable size — their comments and suggestions along the way have considerably enriched the breadth and quality of this book. In the process, the editors were rewarded (for all their painful awareness of limited competence and risks of shortcomings) with a sense of achieving an intellectually significant experience.

Chapter 1 sets the stage by providing historical landmarks and perspectives, and insights from various disciplines (biology, physics, mathematics, computer science, psychology, philosophy). The two perennial questions, What is intelligence? and Mind from matter?, are touched on by our authors with a standard of quality and honesty that should decisively immunise the newcomer against much of the recurrently mediocre literature to which he may otherwise fall naive prey.

Chapter 2 introduces biological ideas and approaches that are not part of the standard education of a physicist. In brief, it may be described as a presentation of biocomputation to physicists and other outsiders.

Chapter 3 discusses neural networks that solve a variety of information processing and learning tasks. Several models and algorithms, taking inspiration from information theory and statistical physics, are presented in mutually enlightening succession.
Chapter 4 is a return to brain realities. It attempts to give a state-of-the-art account of the structures and functions of various perceptual and motor areas.

Finally, Chapter 5 is deliberately turned toward the future. So much remains unknown in this field that there is space for debates on appropriate measurements, for speculations on the role of the pervasive retroactivation pathways, for suggestions about the existence of further unexplored computational strategies, and for a final round on language and consciousness.
Perspectives

All over the world, new educational programmes and new research centres are being created in the field covered by this book. They appear under a variety of names: "Computation and Neural Systems", "Brain Theory", "Cognitive Neurosciences", "Parallel Computation", etc. They involve collaborations between neurosciences and psychology, mathematics and physics (logic, probability theory, geometry, dynamical systems, statistical physics), computer science and engineering (information theory, signal theory, artificial intelligence, robotics), even evolution theory and philosophy. And this enumeration is not exhaustive. We do not think that a new discipline with a unified language will soon emerge from these interdisciplinary collaborations, but we do believe in multilingualism. In this sense, our book is a second-generation endeavour, because it belongs to a set of initiatives that aim to overcome a previous "early" stage. The efforts that we invested in editing this reprint volume will not have been lost if they contribute to enhancing an educated mutual respect between physicists and biologists, experimentalists and theoreticians, and if they help to deepen a sense of awe for the mysterious unfolding of relations between the diverse sciences.
Index

This book should be used as a companion volume to the conventional textbooks in the field; it is not meant to replace them. Their virtue lies in coherence and homogeneity, to which our book wishes to add a complementary dimension of historical and interdisciplinary depth. A book of this nature entails a special style of index. The index of this book is essentially a help for discovering further relations between the various reprinted articles. The number of entries is limited, and only the articles in which they are discussed are indicated, not the page numbers. The reader is invited to look through the whole list in order to discover what terms have been selected. For instance, the term "saccade" does not appear, but the corresponding studies will be found at "eye movements". In contrast, we thought it helpful to distinguish between the different uses of "entropy" concepts. It would have been counterproductive to enter all technical terms from each discipline, especially when similar things are designated by different names in neighbouring disciplines.
Chapter 1
SETTING THE STAGE
As is fitting for a beginning chapter, attempts are made here to provide historical perspectives and insights from various vantage points. The reader will thus find here maximal diversity in terms of the dates of publication, the backgrounds of the authors, and the span of problems evoked. Here also, careful consideration has been given, in our selection process, to the following qualities of exposition: clarity, accuracy of thought, forcefulness of expression, and elegance of style. The personality of some of the authors is worth mentioning, because their lives may serve as living proof of some of the purposes of this volume. Four of the authors have established epoch-making bridges between disciplines: H. Helmholtz, 19th century biologist and physicist (see picture and note in paper (67)); J. von Neumann, mathematician and physicist, pioneer of computer science; L. Brillouin, physicist and champion of early information theory; and M. Delbrück, physicist and one of the founders of modern biology. Also worth mentioning for their openness of mind, in addition to the brilliance of their writings, are three biologists: H. Barlow, neurophysiologist (record-holder for the number of texts selected in this volume; see the short biography in paper (67)); V. Braitenberg, neuroanatomist (second on this count, ex aequo with J. J. Hopfield); F. Jacob, molecular biologist; and two physicists: P. W. Anderson, noted in particular for his many contributions to condensed and disordered matter physics; and J. J. Hopfield, contemporary exemplar of a successful move from theoretical physics to theoretical biology.

The division of this chapter into four sections is particularly arbitrary, for most of the texts are involved with the description of our domain, i.e. neurocomputation and its boundaries, and all lay the ground for the four subsequent chapters. Nevertheless, here are a few ideas that guided our ordering.
Section 1a starts with a consideration of higher brain functions, to which return shall be made occasionally in the course of the book, particularly in the final Section 5d. Definitions of levels of study, and warnings against the dangers of level confusion, provide a common theme. Section 1b offers perspectives on the rich historical relations between physics and biology, and sheds light on some interesting differences between physical laws and biological explanations. Section 1c puts emphasis on mathematical (logical, statistical) and epistemological notions. Finally, Section 1d provides an adequate transition to the subsequent chapters of the book, introducing some of the fundamental issues in perception.
1a. Forewords; Introductory Warnings
If the reader absorbs the contents of the first three texts in this section, he (henceforth, "he" is a short word for "he/she") will be equipped to skip 90% of the sterile recurrent controversies on the mind-body, the mind-brain and the mind-matter problems, and to avoid most of the false prophets and doctrinaires. Thus, he will be prepared to face the real debates, for which there is no universal doctrinal panacea. One can hardly but be impressed by the high standard of serenity and modesty already reached by Helmholtz, the 19th century pioneer (a serenity that is perhaps only equaled in this century by the writings of Donald Hebb). The fourth text, written by a philosopher, Patricia Churchland, in collaboration with two biologists involved in theoretical and experimental neurosciences, introduces many concepts that are developed in all subsequent chapters. Apart from this survey of computational neurosciences and a thorough discussion of the computer-brain issue, the authors also review and criticise David Marr's influential three-level theory. The distinction they present between "natural kinds" and "non-natural kinds" is indeed a point that needs to be stressed, especially because it is not so familiar to physicists. For many problems of perception (such as colour vision, pattern recognition, pain, etc.) and learning (generalisation, etc.), it is essential to distinguish between the properties of the physical stimuli and the biases of the perceiving apparatus. This distinction between objective and subjective qualities is a theme that runs through the whole book, and that has many facets. A related question is: "How much is the brain externally (vs. internally) driven?" Many views on conscious-unconscious processes, reduction-construction, analysis-synthesis, are evoked in this section. Perhaps these discussions will help the reader to overcome barriers, and to discover that similar scientific tensions exist within physics and within biology.

Historical and general perspectives are further elaborated in the three sections that follow, thus completing the setting of the stage for the whole volume. Some noticeable returns to Helmholtz's insightful remarks on unconscious processes, and to the problem of consciousness, occur in papers (13), (22), (27) and (44) and in Chapter 5, Section 5d. It was tempting to extract some felicitous sentences from the four texts of this section, and bring them together in order to define an "enlightened" epistemology about the matter-life-mind trilogy, but we have resisted the temptation.
Reprinted with permission from Annu. Rev. Neurosci., Vol. 13, pp. 15-24, 1990 © 1990 Annual Reviews Inc.
THE MECHANICAL MIND
Horace Barlow
Physiological Laboratory, Cambridge, CB2 3EG, England
Most neuroscientists accept the machine as a useful metaphor or model of the mind. It points our research in a direction that has been outstandingly successful for more than a century, namely the reductionist analysis of brain function in terms of simpler physical, chemical, and biological processes, and because we can understand all, or almost all, about machines, the metaphor encourages us to think we can discover all, or almost all, about the mind. I thought when I started writing this piece that the metaphor was sound and useful, though its uncritical acceptance made me uneasy, especially because I knew that many of my colleagues in subjects like mathematics and linguistics received it with something close to incredulity, while many others disliked it intensely. I therefore thought it would be worthwhile to examine the idea in more detail, to try to find if there was any justification for unease, incredulity, or dislike; if you read on you will find that I have been forced to the conclusion that the metaphor is misleading and potentially harmful, but that this results from prevalent ignorance and prejudice about machines rather than any gross defect of the metaphor.
IN DEFENCE OF THE ANALOGY

I think the main purpose behind calling the mind a machine is to drive out demons. We are saying, in effect, "Look, there is no more to the working of the brain than the physics and chemistry of its components, just as there is no more to a machine than the physics and chemistry of its components." This is admirable as an invocation not to waste time on mental spirits and to study the physics and chemistry instead, so what are the objections? First there are three minor ones, namely that we don't actually know all about machines, that brains are made of totally different materials, and that they have come into existence in a strikingly different manner. By considering these objections we see where to be cautious about the metaphor, but they are not fatal to it. However, there are more serious problems that do justify mistrust and dislike; first, minds do some things that no current machines do, and if one is mainly interested in these particular tasks the analogy is not much help and could be misleading; second, it is the minds of other people one interacts with, so to treat minds as machines valued mainly for their usefulness has unpleasant ethical implications.
We Do Not Understand All About Machines

The person who understands most about a machine is its designer, and no designer of a complex machine would claim that everything about it was perfectly understood. Engineers are really very different from scientists, for they use a body of knowledge to create something new, and provided the goal is reached they are not too concerned if their creation has unforeseen or unknown properties. Scientists make use of the same body of knowledge, but they must pay particular attention to anything unforeseen or unknown, since their goal is to extend the body of knowledge rather than to exploit it. In engineering, intuitive leaps and creative solutions are rightly admired when they make a machine work, but as scientists, we should admire creativity and intuition that enable us to understand what was previously unknown; we do not want to import the attitude "If it works, that's good enough" along with the concept that mind is a product of engineering. The current enthusiasm for neural networks may show that this niggle is partly justified. Here are techniques that enable machines to perform tasks that hitherto lay in the province of the mind, but whereas artificial intelligence and the previous styles of computer simulation led to a more detailed analytical knowledge of the requirements for performing the task, the network style of simulation does not; instead it uses some blind procedure, such as back-propagation, and the success of a simulation is taken as justification for the procedure, rather than for the programmer's analysis of the task requirements. To be fair, the major proponents of the network approach do not regard this as an advantage and claim the method gives insight by, for instance, showing what intermediate elements or "hidden units" enable a task to be done; but one suspects the avoidance of detailed analysis is nonetheless a factor in the popularity of the method.
The Materials Are Very Different

The objection that the materials are totally different need not be taken very seriously since, whatever the materials are, they still obey the rules of physics and chemistry. It just means more surprises for the neuroscientist, since he will come across unexpected properties and methods: birds do not use propellers or jet engines, but they do obey the same laws of aerodynamics as machines that fly.
Perhaps one should be a little more cautious when it comes to computers and the brain, for the architecture as well as the materials are so different, but surely some at least of the problems encountered when performing a task on a serial computer will carry over to the performance of the same task by the brain. As long as that is so, we can learn from the analogy.
Minds and Machines Differ in Origin

Does it matter that brains are the product of millions of years of genetic evolution combined with a few months' ontogenesis and several years of teaching and experience, whereas machines are designed by those brains and made by human hand? I can't see why it should, and it could be claimed that the explicit knowledge required for design and construction gives deep insight into the nature of the mind. But once again there are cautions. Evolution is an unprincipled and conservative designer. As a result, one rarely finds neat theoretical solutions embodied in brains, and in particular they often use parallel mechanisms for achieving a single goal. This fact tends to make life difficult for the experimentalist, because it frustrates attempts to do simple and effective controls. For instance, one might naively expect that if binocular vision enables people to judge distance, then blocking one eye would seriously impair this capacity, but motion parallax, knowledge of the normal sizes of objects, and other cues leave good distance judgment in one-eyed people. The use of alternative sensory cues for balance can cause confusion (see below), and the presence of many parallel methods has caused untold difficulty in the analysis of homing and other navigational feats in animals. It seems to be hard for us to avoid the automatic assumption that just one method is used to perform some difficult task, and this may partly result from the metaphor of the mechanical mind, for multiple alternative methods are rare in machines.
WHAT THE MIND DOES

Many of the things the mind and brain do are also done by machines, and when this is the case understanding the machine obviously helps. To understand how a man balances on one leg, one should understand the principles of servo-feedback, though to reinforce the point made above, one should also know that a man's ability to stand on one leg with his eyes shut does not prove that vision is irrelevant to the task, and the ability to remain upright after the destruction of his vestibular organs does not prove they are unimportant either. But the brain also does things that no machine does, and here we get into trouble.
Things No Computer Does

To start with, consider the fact that we constantly model the inanimate world around us, and also monitor it for changes. We know thoroughly the route from home to office, and notice when a section of road is being repaired, or when the office door is newly painted. Much of this model-making is quite automatic and unconscious and we only know it has been done when something changes; we do not consciously monitor the spectral composition of sunlight or the pitch of the front door bell, but it's pretty certain we would notice if they changed. Robots are of course beginning to record and model their environments in order to find their way around in it, but compared with us they are extraordinarily backward, and so far they have little to teach us. But it would be premature to say that, for this reason, the metaphor is wrong or misleading, because the need for modeling has only recently arisen. It is likely that the principles will soon be better understood, and when this happens they may give us new insight into the brain's methods of building useful models of the environment.
Modeling People

More serious problems arise when one considers the human brain's propensity to model people. This starts at a very early age, and parents quickly realize that they are not the only ones trying to run the household; a baby quickly becomes pretty expert at parent-control. It seems to me that this really is a most un-machinelike process, and likely to remain so. One reason is that machines are designed to do something useful for their owners, whereas babies are not. No doubt one could incorporate a few tricks in a computer that would enable it to get the better of its user—in fact many of them appear to do this effortlessly, without deliberate design; but in such cases it is clear upon reflection that they are not really getting the better of us, they are simply failing to give the desired service. To improve the man-machine interface, an engineer might program a computer to learn about the user's behavior, just as a baby learns about its parents' behavior; but there the analogy ends, for the engineer's purpose would always be to diminish the conflict of wills, while that is not the baby's purpose at all. A machine is by definition intended to be useful, and therefore it must be designed so that its user's will is unopposed as far as possible. Here, then, we have reached a point where the machine is not a good metaphor for the mind, simply because no machine has been designed to do what the mind does. I am not saying there is a theoretical reason that they should not be so designed, only that engineers are not likely to
explore the means of doing something that it would be useless or counterproductive to do, and still less likely actually to do it. We, as neuroscientists, can explore the problems of one brain out-guessing another, but we shall not at present get much help in doing so by regarding the mind as a machine, simply because that is not at present the sort of thing machines do. Notice that this argument is only valid for the present; comparative and competitive interactions between computers can be modeled, and this may be very instructive for understanding human interactions. Hence the implications of the metaphor are not static, and it may mean something very different to future generations that have more understanding of the complexity of purely mechanistic interactions. All the same, for the present the metaphor is, at best, unhelpful on such problems.
TAKING THE METAPHOR TOO SERIOUSLY

Perhaps this discussion has brought us to the reason for the unease and dislike aroused by the mind-machine metaphor. If we took it seriously, would we not regard other people's minds as machines, and hence be interested in them primarily for their utility to us? Machines do not and should not oppose our wills, and nor should other people, according to the metaphor. This would all be less worrying if the newspapers were not full of the deeds and misdeeds of individuals who behave as though the metaphor were their gospel, and such an attitude is not entirely unfamiliar even in academic circles. Of course the metaphor is not intended to be a moral exhortation saying: "Treat other people's minds as machines that could be useful to you." But when we use it aren't we in effect saying: "It's alright to treat minds as machines, because actually they are"? Thus the metaphor may not be entirely innocuous, since it might influence how people think about minds and consequently how they treat other people. To follow up this thought we need to see how people actually use the idea of the mind.
Distinguishing Minds from Brains

So far, encouraged by the mind-machine metaphor, I have used the words "mind" and "brain" more or less interchangeably, but now we see that a distinction might be useful. The brain is what the metaphor properly applies to, while the mind is something different: it is the concept we use to describe the source of other people's, and our own, behavior. Since the concept of a particular person's mind is fashioned out of the observed behavior of that individual, it is a model one's own brain makes of the other individual's brain. Thus minds are the brain's models of itself and other brains, and the important thing is that the vast majority of people attribute behavior to such mind-models. As neuroscientists we believe it is the brain that controls behavior, but this is the belief and terminology of a small minority of experts; others attribute a person's behavior to his mind and care nothing for the beautiful nerve cells that we dedicate our lives to. That may be a pity, but in justification of the majority's attitude, pause to think how many of the facts you learn from this volume will alter the way you treat your colleagues, bring up your children, talk to the janitor, or vote at the next election. Your beliefs about other people's minds, on the other hand, clearly do influence your way of life, and it is because the metaphor may modify your attitude to minds that it is potentially obnoxious. One can be thoroughly mechanistic in believing that the brain controls behavior strictly within the laws of physics and chemistry without this distinction between mind and brain becoming a mockery. In most cases one has knowledge of only a minute fraction of the physical and chemical factors that are actually (we mechanistically believe) controlling the brain's output, but this does not stop us from making quite good and reliable judgments about the actions our own and other minds will initiate. There is nothing unusual in a model having such predictive power in spite of its use of incomplete data; in fact, good models stringently select the data they represent, both in the case of one's mental models of the physical environment with the people in it, and in the case of accurate scientific theories such as thermodynamics. The predictive power of a model depends on its correct identification of the dominant controlling factors and their influence, not upon its completeness. An incomplete model is often more generally useful than a more accurate one, as for example with Newtonian laws of motion.
Minds Take Over the Control of Behavior

Figure 1 is an attempt to persuade neuroscientists that minds are actually more important than brains, and to show how this comes about. In stage 1, the mechanistic brain is in sole command, but brain A observes other brains, B for example, also controlling their own behavior. Occasionally they interact, or their behaviors conflict, so brain A builds a model of the way brain B controls what B says and does, based on what A has seen and heard. This model of brain B inside the brain of A is B's mind, shown in step 2; brain A uses his idea of B's mind to predict what B will say and do, and this should be beneficial to A when living in the same environment as B. B of course has done the same and now has its internal model of A's brain, A's mind.
[Figure 1: schematic of Brain A and Brain B, each producing behavior, receiving observations, and interacting.]

Stage 1. Brain A alone controls its own behavior, but observes B also controlling its own behavior. Brain B does likewise.

Stage 2. Brain A has made B's mind, an internal model of B's brain based on what A has observed B doing. This helps A predict B's behavior. Brain B has done the same.

Stage 3. Brain A creates a model of itself, A's mind; the behavioral interactions between A and B can now be modeled within A's brain by the interactions of A's mind and B's mind. Minds now control social behavior.

Figure 1. Minds control behavior. We attribute other people's actions to their minds, which it therefore seems appropriate to regard as the models our own brains make of the source of other people's and our own behavior. This diagram shows three stages in the development within the brain of minds that take control of its social behavior, leaving the brain in charge of the unconscious, automatic actions we regard as mindless.
At this stage brain A can predict B's solo behavior, but this will not enable him to predict the outcome of interactions between A and B, for these cannot be modeled, since there is nothing inside A's brain suitable for B's mind to interact with. Brain A needs to make a model of the way its own brain generates its own behavior; this is A's mind, and the interactions between A's mind and B's mind allow brain A to model the interactions between A and B. By this time the two minds in brain A have become dominant in controlling much of the behavior of A, especially the verbal and other exchanges with other individuals. On the other hand, nonverbal and nonsocial behavior may still be generated predominantly by brain A, uninfluenced by the minds it has created within itself—these are, as we would say, mindless actions. Of course all the time similar processes will have been going on in brain B, so A's models will not have it all their own way when it comes to making accurate predictions. Furthermore, if these mind-models are realistic, they should include minds within themselves; B's mind in brain A should, for example, have a little A's mind inside itself, for by this stage brain B will have its own version of A's mind, as shown in the right half of the figure. This infinite regress can presumably be cut short after a few cycles without seriously affecting the predictive power of the models, but it makes the brain into something more like a hall of mirrors than any currently understood machine. Surely, however, this hall of mirrors accurately portrays the fact that two people's minds interact with each other in an extraordinarily complex and intimate manner. At the very least, the interactions that this figure attempts to portray will have to be taken into account in any mechanistic accounts of higher human behavior, and I do not see how this can be done without giving the concept of the mind a very dominant role.
Why Minds are More Interesting

Now that we have a distinction between mind and brain, we can see why most people are much more interested in one than the other. The mathematician, linguist, historian, or literary critic is not concerned with the nuts and bolts of brain mechanisms, but with mind—the abstract model that appears to be responsible for the works that he studies and perhaps produces himself. It is the same with cars, domestic appliances, word processors, and so on: except for the expert, people simply do not care how they work as long as they do the job expected. Of course when minds fail to work as expected we may call in the brain specialist, but until then it is minds we deal with. I think this distinction between the mechanistic brain and the mind-models it makes has interesting implications for our understanding of pain, pleasure, ethics, and consciousness itself. As I have argued elsewhere, the part of the brain's functioning of which we are consciously aware seems to be confined to the inputs, outputs, and interactions of the minds of Figure 1. Just as ordinary people are more interested in minds than brains, so should be philosophers, for it is the source of people's actions rather than their mechanism that is their primary concern. Minds are the conceptual sources of this behavior, and although minds are the product of a mechanistic brain, their actions and interactions are not going to be understood in terms of physics and chemistry alone, even if they are ultimately determined by physics and chemistry alone. Notice that within this entirely mechanistic framework one can ask interesting questions about minds: How is the stability of social interaction affected by how good a model of B's brain A has in the form of B's mind? Are inaccurate models unethical? Or might it be unethical to have too good a model? How about the same questions applied to A's mind, the model of his own brain? How far will taught precepts affect the form of these models? Do we also have mind-models of deities? Surely these are the sort of questions appropriate for theoretical analysis by philosophers and others, and the mechanistic nature of the brain hardly affects them. It is worth adding that such ideas might also form the basis for experiments. Thus, to neuroscientists, the machine-like aspects of the brain are immediately important, because they firmly direct our attention to the physics and chemistry of the brain. But the metaphor should not be taken to imply that there is no more to the brain than can be described in terms of physics and chemistry, and the concept of mind is certain to be important in any satisfactory account of the way the brain controls social behavior. For this reason, philosophers and ordinary people are properly more concerned with minds than brains.
CONCLUSIONS

To conclude, for most people in most circumstances the physical and chemical causation of the brain's output is not as important as the observable behavior that actually occurs; minds are what we attribute this behavior to, and our common-sense understanding of them will almost always give more useful predictions than knowledge of the physical and chemical processes that underlie them. For neuroscientists trying to refine such mechanistic accounts, the metaphor is appropriate and encouraging, but the dislike it arouses in others seems fully justified by its implication that other people's minds are mere machines and can be treated as such. What is needed to prevent the metaphor of the mechanical mind sanctioning unbridled egocentric behavior is to get rid of the prejudice that machines are essentially simple and deterministic, and to gain an appreciation of the complexity and difficulties in predicting behavior produced by two or more minds interacting in the manner shown in Figure 1. There is no reason to doubt that the brain is entirely mechanical, but it is a wonderful mechanism that can generate and use the concept of mind.

ACKNOWLEDGEMENTS
I would like to acknowledge considerable help toward clarifying these ideas from Ian Glynn, Graeme Mitchison, Kathy Mullen, and Miranda Weston-Smith.
Reprinted with permission from On the Sensations of Tone, pp. 366-371, 1954 © 1954 Dover Publications Inc.
ESTHETIC ANALYSIS OF WORKS OF ART
The esthetic analysis of complete musical works of art, and the comprehension of the reasons of their beauty, encounter apparently invincible obstacles at almost every point. But in the field of elementary musical art we have now gained so much insight into its internal connection that we are able to bring the results of our investigations to bear on the views which have been formed and in modern times nearly universally accepted respecting the cause and character of artistic beauty in general. It is, in fact, not difficult to discover a close connection and agreement between them; nay, there are probably fewer examples more suitable than the theory of musical scales and harmony, to illustrate the darkest and most difficult points of general esthetics. Hence I feel that I should not be justified in passing over these considerations, more especially as they are closely connected with the theory of sensual perception, and hence with physiology in general. No doubt is now entertained that beauty is subject to laws and rules dependent on the nature of human intelligence. The difficulty consists in the fact that these laws and rules, on whose fulfilment beauty depends and by which it must be judged, are not consciously present to the mind, either of the artist who creates the work, or the observer who contemplates it. Art works with design, but the work of art ought to have the appearance of being undesigned, and must be judged on that ground. Art creates as imagination pictures, regularly without conscious law, designedly without conscious aim. A work, known and acknowledged as the product of mere intelligence, will never be accepted as a work of art, however perfect be its adaptation to its end. Whenever we see that conscious reflection has acted in the arrangement of the whole, we find it poor.
Man fühlt die Absicht, und man wird verstimmt.
(We feel the purpose, and it jars upon us.)
And yet we require every work of art to be reasonable, and we shew this by subjecting it to a critical examination, and by seeking to enhance our enjoyment and our interest in it by tracing out the suitability, connection, and equilibrium of all its separate parts. The more we succeed in making the harmony and beauty of all its peculiarities clear and distinct, the richer we find it, and we even regard as the principal characteristic of a great work of art that deeper thought, reiterated observation, and continued reflection shew us more and more clearly the reasonableness of all its individual parts. Our endeavour to comprehend the beauty of such a work by critical examination, in which we partly succeed, shews that we assume a certain adaptation to reason in works of art, which may possibly rise to a conscious understanding, although such understanding is neither necessary for the invention nor for the enjoyment of the beautiful. For what is esthetically beautiful is recognised by the immediate judgment of a cultivated taste, which declares
it pleasing or displeasing, without any comparison whatever with law or conception. But that we do not accept delight in the beautiful as something individual, but rather hold it to be in regular accordance with the nature of mind in general, appears by our expecting and requiring from every other healthy human intellect the same homage that we ourselves pay to what we call beautiful. At most we allow that national or individual peculiarities of taste incline to this or that artistic ideal, and are most easily moved by it, precisely in the same way that a certain amount of education and practice in the contemplation of fine works of art is undeniably necessary for penetration into their deeper meaning. The principal difficulty in pursuing this object is to understand how regularity can be apprehended by intuition without being consciously felt to exist. And this unconsciousness of regularity is not a mere accident in the effect of the beautiful on our mind, which may indifferently exist or not; it is, on the contrary, most clearly, prominently, and essentially important. For through apprehending everywhere traces of regularity, connection, and order, without being able to grasp the law and plan of the whole, there arises in our mind a feeling that the work of art which we are contemplating is the product of a design which far exceeds anything we can conceive at the moment, and which hence partakes of the character of the illimitable. Remembering the poet's words:

Du gleichst dem Geist, den du begreifst,
(Thou'rt like the spirit thou conceivest),
we feel that those intellectual powers which were at work in the artist, are far above our conscious mental action, and that were it even possible at all, infinite time, meditation, and labour would have been necessary to attain by conscious thought that degree of order, connection, and equilibrium of all parts and all internal relations, which the artist has accomplished under the sole guidance of tact and taste, and which we have in turn to appreciate and comprehend by our own tact and taste, long before we begin a critical analysis of the work. It is clear that all high appreciation of the artist and his work reposes essentially on this feeling. In the first we honour a genius, a spark of divine creative fire, which far transcends the limits of our intelligent and conscious forecast. And yet the artist is a man as we are, in whom work the same mental powers as in ourselves, only in their own peculiar direction, purer, brighter, steadier; and by the greater or less readiness and completeness with which we grasp the artist's language we measure our own share of those powers which produced the wonder. Herein is manifestly the cause of that moral elevation and feeling of ecstatic satisfaction which is called forth by thorough absorption in genuine and lofty works of art. We learn from them to feel that even in the obscure depths of a healthy and harmoniously developed human mind, which are at least for the present inaccessible to analysis by conscious thought, there slumbers a germ of order that is capable of rich intellectual cultivation, and we learn to recognise and admire in the work of art, though draughted in unimportant material, the picture of a similar arrangement of the universe, governed by law and reason in all its parts. The contemplation of a real work of art awakens our confidence in the originally healthy nature of the human mind, when uncribbed, unharassed, unobscured, and unfalsified. But for all this it is an essential condition that the whole extent of the regularity and design of a work of art should not be apprehended consciously. It is precisely from that part of its regular subjection to reason, which escapes our conscious apprehension, that a work of art exalts and delights us, and that the chief effects of the artistically beautiful proceed, not from the part which we are able fully to analyse. If we now apply these considerations to the system of musical tones and harmony, we see of course that these are objects belonging to an entirely subordinate
and elementary domain, but nevertheless they, too, are slowly matured inventions of the artistic taste of musicians, and consequently they, too, must be governed by the general rules of artistic beauty. Precisely because we are here still treading the lower walks of art, and are not dealing with the expression of deep psychological problems, we are able to discover a comparatively simple and transparent solution of that fundamental enigma of esthetics. The whole of the last part of this book has explained how musicians gradually discovered the relationships between tones and chords, and how the invention of harmonic music rendered these relationships closer, and clearer, and richer. We have been able to deduce the whole system of rules which constitute Thorough Bass, from an endeavour to introduce a clearly sensible connection into the series of tones which form a piece of music. A feeling for the melodic relationship of consecutive tones, was first developed, commencing with Octave and Fifth and advancing to the Third. We have taken pains to prove that this feeling of relationship was founded on the perception of identical partial tones in the corresponding compound tones. Now these partial tones are of course present in the sensations excited in our auditory apparatus, and yet they are not generally the subject of conscious perception as independent sensations. The conscious perception of everyday life is limited to the apprehension of the tone compounded of these partials, as a whole, just as we apprehend the taste of a very compound dish as a whole, without clearly feeling how much of it is due to the salt, or the pepper, or other spices and condiments. A critical examination of our auditory sensations as such was required before we could discover the existence of upper partial tones. Hence the real reason of the melodic relationship of two tones (with the exception of a few more or less clearly expressed conjectures, as, for example, by Rameau and d'Alembert) remained so long undiscovered, or at least was not in any respect clearly and definitely formulated. I believe that I have been able to furnish the required explanation, and hence clearly to exhibit the whole connection of the phenomena. The esthetic problem is thus referred to the common property of all sensual perceptions, namely, the apprehension of compound aggregates of sensations as sensible symbols of simple external objects, without analysing them. In our usual observations on external nature our attention is so thoroughly engaged by external objects that we are entirely unpractised in taking for the subjects of conscious observation, any properties of our sensations themselves, which we do not already know as the sensible expression of some individual external object or event. After musicians had long been content with the melodic relationship of tones, they began in the middle ages to make use of harmonic relationship as shewn in consonance. The effects of various combinations of tones also depend partly on the identity or difference of two of their different partial tones, but they likewise partly depend on their combinational tones. Whereas, however, in melodic relationship the equality of the upper partial tones can only be perceived by remembering the preceding compound tone, in harmonic relationship it is determined by immediate sensation, by the presence or absence of beats.
Hence in harmonic combinations of tone, tonal relationship is felt with that greater liveliness due to a present sensation as compared with the recollection of a past sensation. The wealth of clearly perceptible relations grows with the number of tones combined. Beats are easy to recognise as such when they occur slowly; but those which characterise dissonances are, almost without exception, very rapid, and are partly covered by sustained tones which do not beat, so that a careful comparison of slower and quicker beats is necessary to gain the conviction that the essence of dissonance consists precisely in rapid beats. Slow beats do not create the feeling of dissonance, which does not arise till the rapidity of the beats confuses the ear and makes it unable to distinguish them. In this case also the ear feels the difference between the undisturbed combination of sound in the case of two consonant tones, and the disturbed rough combination resulting from a dissonance. But, as
a general rule, the hearer is then perfectly unconscious of the cause to which the disturbance and roughness are due. The development of harmony gave rise to a much richer opening out of musical art than was previously possible, because the far clearer characterisation of related combinations of tones by means of chords and chordal sequences, allowed of the use of much more distant relationships than were previously available, by modulating into different keys. In this way the means of expression greatly increased as well as the rapidity of the melodic and harmonic transitions which could now be introduced without destroying the musical connection. As the independent significance of chords came to be appreciated in the fifteenth and sixteenth centuries, a feeling arose for the relationship of chords to one another and to the tonic chord, in accordance with the same law which had long ago unconsciously regulated the relationship of compound tones. The relationship of compound tones depended on the identity of two or more partial tones, that of chords on the identity of two or more notes. For the musician, of course, the law of the relationship of chords and keys is much more intelligible than that of compound tones. He readily hears the identical tones, or sees them in the notes before him. But the unprejudiced and uninstructed hearer is as little conscious of the reason of the connection of a clear and agreeable series of fluent chords, as he is of the reason of a well-connected melody. He is startled by a false cadence and feels its unexpectedness, but is not at all necessarily conscious of the reason of its unexpectedness. Then, again, we have seen that the reason why a chord in music appears to be the chord of a determinate root, depends as before upon the analysis of a compound tone into its partial tones, that is, as before upon those elements of a sensation which cannot readily become subjects of conscious perception. This relation between chords is of great importance, both in the relation of the tonic chord to the tonic tone, and in the sequence of chords. The recognition of these resemblances between compound tones and between chords, reminds us of other exactly analogous circumstances which we must have often experienced. We recognise the resemblance between the faces of two near relations, without being at all able to say in what the resemblance consists, especially when age and sex are different, and the coarser outlines of the features consequently present striking differences. And yet notwithstanding these differences—notwithstanding that we are unable to fix upon a single point in the two countenances which is absolutely alike—the resemblance is often so extraordinarily striking and convincing, that we have not a moment's doubt about it. Precisely the same thing occurs in recognising the relationship between two compound tones. Again, we are often able to assert with perfect certainty, that a passage not previously heard is due to a particular author or composer whose other works we know. Occasionally, but by no means always, individual mannerisms in verbal or musical phrases determine our judgment, but as a rule we are mostly unable to fix upon the exact points of resemblance between the new piece and the known works of the author or composer. The analogy of these different cases may be even carried farther.
When a father and daughter are strikingly alike in some well-marked feature, as the nose or forehead, we observe it at once, and think no more about it. But if the resemblance is so enigmatically concealed that we cannot detect it, we are fascinated, and cannot help continuing to compare their countenances. And if a painter drew two such heads having, say, a somewhat different expression of character combined with a predominant and striking, though indefinable, resemblance, we should undoubtedly value it as one of the principal beauties of his painting. Our admiration would certainly not be due merely to his technical skill; we should rather look upon his painting as evidencing an unusually delicate feeling for the
significance of the human countenance, and find in this the artistic justification of his work. Now the case is similar for musical intervals. The resemblance of an Octave to its root is so great and striking that the dullest ear perceives it; the Octave seems to be almost a pure repetition of the root, as it, in fact, merely repeats a part of the compound tone of its root, without adding anything new. Hence the esthetical effect of an Octave is that of a perfectly simple, but little attractive interval. The most attractive of the intervals, melodically and harmonically, are clearly the Thirds and Sixths,—the intervals which lie at the very boundary of those that the ear can grasp. The major Third and the major Sixth cannot be properly appreciated unless the first five partial tones are audible. These are present in good musical qualities of tone. The minor Third and the minor Sixth are for the most part justifiable only as inversions of the former intervals. The more complicated intervals in the scale cease to have any direct or easily intelligible relationship. They have no longer the charm of the Thirds. Moreover, it is by no means a merely external indifferent regularity which the employment of diatonic scales, founded on the relationship of compound tones, has introduced into the tonal material of music, as, for instance, rhythm introduced some such external arrangement into the words of poetry. I have shewn, on the contrary, in Chapter XIV., that this construction of the scale furnished a means of measuring the intervals of their tones, so that the equality of two intervals lying in different sections of the scale would be recognised by immediate sensation. Thus the melodic step of a Fifth is always characterised by having the second partial tone of the second note identical with the third of the first. This produces a definiteness and certainty in the measurement of intervals for our sensation, such as might be looked for in vain in the system of colours, otherwise so similar, or in the estimation of mere differences of intensity in our various sensual perceptions. Upon this reposes also the characteristic resemblance between the relations of the musical scale and of space, a resemblance which appears to me of vital importance for the peculiar effects of music. It is an essential character of space that at every position within it like bodies can be placed, and like motions can occur. Everything that is possible to happen in one part of space is equally possible in every other part of space and is perceived by us in precisely the same way. This is the case also with the musical scale. Every melodic phrase, every chord, which can be executed at any pitch, can be also executed at any other pitch in such a way that we immediately perceive the characteristic marks of their similarity. On the other hand, also, different voices, executing the same or different melodic phrases, can move at the same time within the compass of the scale, like two bodies in space, and, provided they are consonant in the accented parts of bars, without creating any musical disturbances. Such a close analogy consequently exists in all essential relations between the musical scale and space, that even alteration of pitch has a readily recognised and unmistakable resemblance to motion in space, and is often metaphorically termed the ascending or descending motion or progression of a part.
Hence, again, it becomes possible for motion in music to imitate the peculiar characteristics of motive forces in space, that is, to form an image of the various impulses and forces which lie at the root of motion. And on this, as I believe, essentially depends the power of music to picture emotion. It is not my intention to deny that music in its initial state and simplest forms may have been originally an artistic imitation of the instinctive modulations of the voice that correspond to various conditions of the feelings. But I cannot think that this is opposed to the above explanation; for a great part of the natural means of vocal expression may be reduced to such facts as the following: its rhythm and accentuation are an immediate expression of the rapidity or force of the corresponding psychical motives—all effort drives the voice up—a desire to make a pleasant impression on another mind leads to selecting a softer, pleasanter quality of
tone—and so forth. An endeavour to imitate the involuntary modulations of the voice and make its recitation richer and more expressive, may therefore very possibly have led our ancestors to the discovery of the first means of musical expression, just as the imitation of weeping, shouting, or sobbing, and other musical delineations may play a part in even cultivated music, (as in operas), although such modifications of the voice are not confined to the action of free mental motives, but embrace really mechanical and even involuntary muscular contractions. But it is quite clear that every completely developed melody goes far beyond an imitation of nature, even if we include the cases of the most varied alteration of voice under the influence of passion. Nay, the very fact that music introduces progression by fixed degrees both in rhythm and in the scale, renders even an approximatively correct representation of nature simply impossible, for most of the passionate affections of the voice are characterised by a gliding transition in pitch. The imitation of nature is thus rendered as imperfect as the imitation of a picture by embroidery on a canvas with separate little squares for each shade of colour. Music, too, departed still further from nature when it introduced the greater compass, the mobility, and the strange qualities of tone belonging to musical instruments, by which the field of attainable musical effects has become so much wider than it was or could be when the human voice alone was employed. Hence though it is probably correct to say that mankind, in historical development, first learned the means of musical expression from the human voice, it can hardly be denied that these same means of expressing melodic progression act, in artistically developed music, without the slightest reference to the application made of them in the modulations of the human voice, and have a more general significance than any that can be attributed to innate instinctive cries. That this is the case appears above all in the modern development of instrumental music, which possesses an effective power and artistic justification that need not be gainsaid, although we may not yet be able to explain it in all its details.
Here I close my work. It appears to me that I have carried it as far as the physiological properties of the sensation of hearing exercise a direct influence on the construction of a musical system, that is, as far as the work especially belongs to natural philosophy. For even if I could not avoid mixing up esthetic problems with physical, the former were comparatively simple, and the latter much more complicated. This relation would necessarily become inverted if I attempted to proceed further into the esthetics of music, and to enter on the theory of rhythm, forms of composition, and means of musical expression. In all these fields the properties of sensual perception would of course have an influence at times, but only in a very subordinate degree. The real difficulty would lie in the development of the psychical motives which here assert themselves. Certainly this is the point where the more interesting part of musical esthetics begins, the aim being to explain the wonders of great works of art, and to learn the utterances and actions of the various affections of the mind. But, however alluring such an aim may be, I prefer leaving others to carry out such investigations, in which I should feel myself too much of an amateur, while I myself remain on the safe ground of natural philosophy, in which I am at home.
Reprinted with permission from Science, Vol. 177, pp. 393-396, 4 Aug 1972. © 1972 The American Association for the Advancement of Science
More Is Different

Broken symmetry and the nature of the hierarchical structure of science.

P. W. Anderson

The author is a member of the technical staff of the Bell Telephone Laboratories, Murray Hill, New Jersey 07974, and visiting professor of theoretical physics at Cavendish Laboratory, Cambridge, England. This article is an expanded version of a Regents' Lecture given in 1967 at the University of California, La Jolla.

The reductionist hypothesis may still be a topic for controversy among philosophers, but among the great majority of active scientists I think it is accepted without question. The workings of our minds and bodies, and of all the animate or inanimate matter of which we have any detailed knowledge, are assumed to be controlled by the same set of fundamental laws, which except under certain extreme conditions we feel we know pretty well.

It seems inevitable to go on uncritically to what appears at first sight to be an obvious corollary of reductionism: that if everything obeys the same fundamental laws, then the only scientists who are studying anything really fundamental are those who are working on those laws. In practice, that amounts to some astrophysicists, some elementary particle physicists, some logicians and other mathematicians, and few others. This point of view, which it is the main purpose of this article to oppose, is expressed in a rather well-known passage by Weisskopf (1):

Looking at the development of science in the Twentieth Century one can distinguish two trends, which I will call "intensive" and "extensive" research, lacking a better terminology. In short: intensive research goes for the fundamental laws, extensive research goes for the explanation of phenomena in terms of known fundamental laws. As always, distinctions of this kind are not unambiguous, but they are clear in most cases. Solid state physics, plasma physics, and perhaps also biology are extensive. High energy physics and a good part of nuclear physics are intensive. There is always much less intensive research going on than extensive. Once new fundamental laws are discovered, a large and ever increasing activity begins in order to apply the discoveries to hitherto unexplained phenomena. Thus, there are two dimensions to basic research. The frontier of science extends all along a long line from the newest and most modern intensive research, over the extensive research recently spawned by the intensive research of yesterday, to the broad and well developed web of extensive research activities based on intensive research of past decades.

The effectiveness of this message may be indicated by the fact that I heard it quoted recently by a leader in the field of materials science, who urged the participants at a meeting dedicated to "fundamental problems in condensed matter physics" to accept that there were few or no such problems and that nothing was left but extensive science, which he seemed to equate with device engineering.

The main fallacy in this kind of thinking is that the reductionist hypothesis does not by any means imply a "constructionist" one: The ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe. In fact, the more the elementary particle physicists tell us about the nature of the fundamental laws, the less relevance they seem to have to the very real problems of the rest of science, much less to those of society.

The constructionist hypothesis breaks down when confronted with the twin difficulties of scale and complexity. The behavior of large and complex aggregates of elementary particles, it turns out, is not to be understood in terms of a simple extrapolation of the properties of a few particles. Instead, at each level of complexity entirely new properties appear, and the understanding of the new behaviors requires research which I think is as fundamental in its nature as any other. That is, it seems to me that one may array the sciences roughly linearly in a hierarchy, according to the idea: The elementary entities of science X obey the laws of science Y.

X                                      Y
solid state or many-body physics       elementary particle physics
chemistry                              many-body physics
molecular biology                      chemistry
cell biology                           molecular biology
psychology                             physiology
social sciences                        psychology

But this hierarchy does not imply that science X is "just applied Y." At each stage entirely new laws, concepts, and generalizations are necessary, requiring inspiration and creativity to just as great a degree as in the previous one. Psychology is not applied biology, nor is biology applied chemistry.

In my own field of many-body physics, we are, perhaps, closer to our fundamental, intensive underpinnings than in any other science in which non-trivial complexities occur, and as a result we have begun to formulate a general theory of just how this shift from quantitative to qualitative differentiation takes place. This formulation, called the theory of "broken symmetry," may be of help in making more generally clear the breakdown of the constructionist converse of reductionism. I will give an elementary and incomplete explanation of these ideas, and then go on to some more general speculative comments about analogies at
other levels and about similar phenomena. Before beginning this I wish to sort out two possible sources of misunderstanding. First, when I speak of scale change causing fundamental change I do not mean the rather well-understood idea that phenomena at a new scale may obey actually different fundamental laws—as, for example, general relativity is required on the cosmological scale and quantum mechanics on the atomic. I think it will be accepted that all ordinary matter obeys simple electrodynamics and quantum theory, and that really covers most of what I shall discuss. (As I said, we must all start with reductionism, which I fully accept.) A second source of confusion may be the fact that the concept of broken symmetry has been borrowed by the elementary particle physicists, but their use of the term is strictly an analogy, whether a deep or a specious one remaining to be understood. Let me then start my discussion with an example on the simplest possible level, a natural one for me because I worked with it when I was a graduate student: the ammonia molecule. At that time everyone knew about ammonia and used it to calibrate his theory or his apparatus, and I was no exception. The chemists will tell you that ammonia "is" a triangular pyramid
with the nitrogen negatively charged and the hydrogens positively charged, so that it has an electric dipole moment (p), negative toward the apex of the pyramid. Now this seemed very strange to me, because I was just being taught that nothing has an electric dipole moment. The professor was really proving that no nucleus has a dipole moment, because he was teaching nuclear physics, but as his arguments were based on the symmetry of space and time they should have been correct in general. I soon learned that, in fact, they were correct (or perhaps it would be more accurate to say not incorrect) because he had been careful to say that no stationary state of a system (that is, one which does not change in time) has an electric dipole moment. If ammonia starts out from the above unsymmetrical state, it will not stay in it very long. By means of quantum mechanical tunneling, the nitrogen can
leak through the triangle of hydrogens to the other side, turning the pyramid inside out, and, in fact, it can do so very rapidly. This is the so-called "inversion," which occurs at a frequency of about 3 × 10^10 per second. A truly stationary state can only be an equal superposition of the unsymmetrical pyramid and its inverse. That mixture does not have a dipole moment. (I warn the reader again that I am greatly oversimplifying and refer him to the textbooks for details.)
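A compact way to see why the equal superposition carries no dipole moment (a standard two-state sketch supplied here for orientation; the notation is not Anderson's): write |L> and |R> for the two pyramid orientations, coupled by a tunneling matrix element Delta. In LaTeX notation,

    H = \begin{pmatrix} E_0 & -\Delta \\ -\Delta & E_0 \end{pmatrix}, \qquad
    |\pm\rangle = \tfrac{1}{\sqrt{2}}\bigl(|L\rangle \pm |R\rangle\bigr), \qquad
    E_\pm = E_0 \mp \Delta .

The stationary states are the symmetric and antisymmetric combinations, split in energy by 2\Delta (for ammonia, 2\Delta/h is the inversion frequency of order 10^10 per second quoted above). Since the dipole operator satisfies \langle L|d|L\rangle = -\langle R|d|R\rangle and \langle L|d|R\rangle \approx 0, both stationary states give \langle\pm|d|\pm\rangle = 0: the mixture has no dipole moment, exactly as stated.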
I will not go through the proof, but the result is that the state of the system, if it is to be stationary, must always have the same symmetry as the laws of motion which govern it. A reason may be put very simply: In quantum mechanics there is always a way, unless symmetry forbids, to get from one state to another. Thus, if we start from any one unsymmetrical state, the system will make transitions to others, so only by adding up all the possible unsymmetrical states in a symmetrical way can we get a stationary state. The symmetry involved in the case of ammonia is parity, the equivalence of left- and right-handed ways of looking at things. (The elementary particle experimentalists' discovery of certain violations of parity is not relevant to this question; those effects are too weak to affect ordinary matter.) Having seen how the ammonia molecule satisfies our theorem that there is no dipole moment, we may look into other cases and, in particular, study progressively bigger systems to see whether the state and the symmetry are always related. There are other similar pyramidal molecules, made of heavier atoms. Hydrogen phosphide, PH3, which is twice as heavy as ammonia, inverts, but at one-tenth the ammonia frequency. Phosphorus trifluoride, PF3, in which the much heavier fluorine is substituted for hydrogen, is not observed to invert at a measurable rate, although theoretically one can be sure that a state prepared in one orientation would invert in a reasonable time.
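As a rough orientation for why the inversion rate falls so steeply with the mass of the inverting atoms (a textbook WKB estimate added here; it is not part of Anderson's text): the tunneling splitting behaves as

    \Delta \;\propto\; \exp\!\left(-\frac{1}{\hbar}\int \sqrt{2m\,\bigl(V(x)-E\bigr)}\; dx\right),

so the exponent grows like the square root of the moving mass m and with the height and width of the barrier V. A modest increase in mass or barrier therefore costs orders of magnitude in inversion frequency, which is consistent with PH3 inverting far more slowly than ammonia and with no inversion at all being observed for PF3.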
We may then go on to more complicated molecules, such as sugar, with about 40 atoms. For these it no longer makes any sense to expect the molecule to invert itself. Every sugar molecule made by a living organism is spiral in the same sense, and they never invert, either by quantum mechanical tunneling or even under thermal agitation at normal temperatures. At this point we must forget about the possibility of inversion and ignore the parity symmetry:
the symmetry laws have been, not repealed, but broken. If, on the other hand, we synthesize our sugar molecules by a chemical reaction more or less in thermal equilibrium, we will find that there are not, on the average, more left- than right-handed ones or vice versa. In the absence of anything more complicated than a collection of free molecules, the symmetry laws are never broken, on the average. We needed living matter to produce an actual unsymmetry in the populations. In really large, but still inanimate, aggregates of atoms, quite a different kind of broken symmetry can occur, again leading to a net dipole moment or to a net optical rotating power, or both. Many crystals have a net dipole moment in each elementary unit cell (pyroelectricity), and in some this moment can be reversed by an electric field (ferroelectricity). This asymmetry is a spontaneous effect of the crystal's seeking its lowest energy state. Of course, the state with the opposite moment also exists and has, by symmetry, just the same energy, but the system is so large that no thermal or quantum mechanical force can cause a conversion of one to the other in a finite time compared to, say, the age of the universe. There are at least three inferences to be drawn from this. One is that symmetry is of great importance in physics. By symmetry we mean the existence of different viewpoints from which the system appears the same. It is only slightly overstating the case to say that physics is the study of symmetry. The first demonstration of the power of this idea may have been by Newton, who may have asked himself the question: What if the matter here in my hand obeys the same laws as that up in the sky—that is, what if space and matter are homogeneous and isotropic? The second inference is that the internal structure of a piece of matter need not be symmetrical even if the total state of it is. I would challenge you to start from the fundamental laws of quantum mechanics and predict the ammonia inversion and its easily observable properties without going through the stage of using the unsymmetrical pyramidal structure, even though no "state" ever has that structure. It is fascinating that it was not until a couple of decades ago (2) that nuclear physicists stopped thinking of the nucleus as a featureless, symmetrical little ball and realized that while it really never has a dipole moment, it can become football-
shaped or plate-shaped. This has observable consequences in the reactions and excitation spectra that are studied in nuclear physics, even though it is much more difficult to demonstrate directly than the ammonia inversion. In my opinion, whether or not one calls this intensive research, it is as fundamental in nature as many things one might so label. But it needed no new knowledge of fundamental laws and would have been extremely difficult to derive synthetically from those laws; it was simply an inspiration, based, to be sure, on everyday intuition, which suddenly fitted everything together. The basic reason why this result would have been difficult to derive is an important one for our further thinking. If the nucleus is sufficiently small there is no real way to define its shape rigorously: Three or four or ten particles whirling about each other do not define a rotating "plate" or "football." It is only as the nucleus is considered to be a many-body system—in what is often called the N → ∞ limit—that such behavior is rigorously definable. We say to ourselves: A macroscopic body of that shape would have such-and-such a spectrum of rotational and vibrational excitations, completely different in nature from those which would characterize a featureless system. When we see such a spectrum, even not so separated, and somewhat imperfect, we recognize that the nucleus is, after all, not macroscopic; it is merely approaching macroscopic behavior. Starting with the fundamental laws and a computer, we would have to do two impossible things—solve a problem with infinitely many bodies, and then apply the result to a finite system—before we synthesized this behavior. A third insight is that the state of a really big system does not at all have to have the symmetry of the laws which govern it; in fact, it usually has less symmetry. The outstanding example of this is the crystal: Built from a substrate of atoms and space according to laws which express the perfect homogeneity of space, the crystal suddenly and unpredictably displays an entirely new and very beautiful symmetry. The general rule, however, even in the case of the crystal, is that the large system is less symmetrical than the underlying structure would suggest: Symmetrical as it is, a crystal is less symmetrical than perfect homogeneity. Perhaps in the case of crystals this appears to be merely an exercise in confusion. The regularity of crystals
could be deduced semiempirically in the mid-19th century without any complicated reasoning at all. But sometimes, as in the case of superconductivity, the new symmetry—now called broken symmetry because the original symmetry is no longer evident—may be of an entirely unexpected kind and extremely difficult to visualize. In the case of superconductivity, 30 years elapsed between the time when physicists were in possession of every fundamental law necessary for explaining it and the time when it was actually done. The phenomenon of superconductivity is the most spectacular example of the broken symmetries which ordinary macroscopic bodies undergo, but it is of course not the only one. Antiferromagnets, ferroelectrics, liquid crystals, and matter in many other states obey a certain rather general scheme of rules and ideas, which some many-body theorists refer to under the general heading of broken symmetry. I shall not further discuss the history, but give a bibliography at the end of this article (3).
I do not mean to give the impression that all is settled. For instance, I think there are still fascinating questions of principle about glasses and other amorphous phases, which may reveal even more complex types of behavior. Nevertheless, the role of this type of broken symmetry in the properties of inert but macroscopic material bodies is now understood, at least in principle. In this case we can see how the whole becomes not only more than but very different from the sum of its parts. The next order of business logically is to ask whether an even more complete destruction of the fundamental symmetries of space and time is possible and whether new phenomena then arise, intrinsically different from the "simple" phase transition representing a condensation into a less symmetric state.
The essential idea is that in the so-called N → ∞ limit of large systems (on our own, macroscopic scale) it is not only convenient but essential to realize that matter will undergo mathematically sharp, singular "phase transitions" to states in which the microscopic symmetries, and even the microscopic equations of motion, are in a sense violated. The symmetry leaves behind as its expression only certain characteristic behaviors, for instance, long-wavelength vibrations, of which the familiar example is sound waves; or the unusual macroscopic conduction phenomena of the superconductor; or, in a very deep analogy, the very rigidity of crystal lattices, and thus of most solid matter. There is, of course, no question of the system's really violating, as opposed to breaking, the symmetry of space and time, but because its parts find it energetically more favorable to maintain certain fixed relationships with each other, the symmetry allows only the body as a whole to respond to external forces.

This leads to a "rigidity," which is also an apt description of superconductivity and superfluidity in spite of their apparent "fluid" behavior. [In the former case, London noted this aspect very early (4).] Actually, for a hypothetical gaseous but intelligent citizen of Jupiter or of a hydrogen cloud somewhere in the galactic center, the properties of ordinary crystals might well be a more baffling and intriguing puzzle than those of superfluid helium.

We have already excluded the apparently unsymmetric cases of liquids, gases, and glasses. (In any real sense they are more symmetric.) It seems to me that the next stage is to consider the system which is regular but contains information. That is, it is regular in space in some sense so that it can be "read out," but it contains elements which can be varied from one "cell" to the next. An obvious example is DNA; in everyday life, a line of type or a movie film have the same structure. This type of "information-bearing crystallinity" seems to be essential to life. Whether the development of life requires any further breaking of symmetry is by no means clear.

Keeping on with the attempt to characterize types of broken symmetry which occur in living things, I find that at least one further phenomenon seems to be identifiable and either universal or remarkably common, namely, ordering (regularity or periodicity) in the time dimension. A number of theories of life processes have appeared in which regular pulsing in time plays an important role: theories of development, of growth and growth limitation, and of the memory. Temporal regularity is very commonly observed in living objects. It plays at least two kinds of roles. First, most methods of extracting energy from the environment in order to set up a continuing, quasi-stable process involve time-periodic machines, such as oscillators and generators, and the processes of life work in the same way. Second, temporal regularity is a means of handling information, similar to information-bearing spatial regularity. Human spoken language is an example, and it
is noteworthy that all computing machines use temporal pulsing. A possible third role is suggested in some of the theories mentioned above: the use of phase relationships of temporal pulses to handle information and control the growth and development of cells and organisms (5).
In some sense, structure—functional structure in a teleological sense, as opposed to mere crystalline shape—must also be considered a stage, possibly intermediate between crystallinity and information strings, in the hierarchy of broken symmetries.

To pile speculation on speculation, I would say that the next stage could be hierarchy or specialization of function, or both. At some point we have to stop talking about decreasing symmetry and start calling it increasing complication. Thus, with increasing complication at each stage, we go on up the hierarchy of the sciences. We expect to encounter fascinating and, I believe, very fundamental questions at each stage in fitting together less complicated pieces into the more complicated system and understanding the basically new types of behavior which can result.

There may well be no useful parallel to be drawn between the way in which complexity appears in the simplest cases of many-body theory and chemistry and the way it appears in the truly complex cultural and biological ones, except perhaps to say that, in general, the relationship between the system and its parts is intellectually a one-way street. Synthesis is expected to be all but impossible; analysis, on the other hand, may be not only possible but fruitful in all kinds of ways: Without an understanding of the broken symmetry in superconductivity, for instance, Josephson would probably not have discovered his effect. (Another name for the Josephson effect is "macroscopic quantum-interference phenomena": interference effects observed between macroscopic wave functions of electrons in superconductors, or of helium atoms in superfluid liquid helium. These phenomena have already enormously extended the accuracy of electromagnetic measurements, and can be expected to play a great role in future computers, among other possibilities, so that in the long run they may lead to some of the major technological achievements of this decade (5).) For another example, biology has certainly taken on a whole new aspect from the reduction of genetics to biochemistry and biophysics, which will have untold consequences. So it is not true, as a recent article would have it (7), that we each should "cultivate our own valley, and not attempt to build roads over the mountain ranges ... between the sciences." Rather, we should recognize that such roads, while often the quickest shortcut to another part of our own science, are not visible from the viewpoint of one science alone.

The arrogance of the particle physicist and his intensive research may be behind us (the discoverer of the positron said "the rest is chemistry"), but we have yet to recover from that of some molecular biologists, who seem determined to try to reduce everything about the human organism to "only" chemistry, from the common cold and all mental disease to the religious instinct. Surely there are more levels of organization between human ethology and DNA than there are between DNA and quantum electrodynamics, and each level can require a whole new conceptual structure.

In closing, I offer two examples from economics of what I hope to have said. Marx said that quantitative differences become qualitative ones, but a dialogue in Paris in the 1920's sums it up even more clearly:

FITZGERALD: The rich are different from us.
HEMINGWAY: Yes, they have more money.
References
1. V. F. Weisskopf, in Brookhaven Nat. Lab. Publ. 888T360 (1965). Also see Nuovo Cimento Suppl. Ser. I 4, 465 (1966); Phys. Today 20 (No. 5), 23 (1967).
2. A. Bohr and B. R. Mottelson, Kgl. Dan. Vidensk. Selsk. Mat. Fys. Medd. 27, 16 (1953).
3. Broken symmetry and phase transitions: L. D. Landau, Phys. Z. Sowjetunion 11, 26, 545 (1937). Broken symmetry and collective motion, general: J. Goldstone, A. Salam, S. Weinberg, Phys. Rev. 127, 965 (1962); P. W. Anderson, Concepts in Solids (Benjamin, New York, 1963), pp. 175-182; B. D. Josephson, thesis, Trinity College, Cambridge University (1962). Special cases: antiferromagnetism, P. W. Anderson, Phys. Rev. 86, 694 (1952); Y. Nambu, ibid. 117, 648 (1960).
4. F. London, Superfluids (Wiley, New York, 1950), vol. 1.
5. M. H. Cohen, J. Theor. Biol. 31, 101.
7. A. B. Pippard, Reconciling Physics with Reality (Cambridge Univ. Press, London, 1972).
Reprinted with permission from Computational Neuroscience, pp. 46-55, 1990. © 1990 MIT Press
The expression "Computational neuroscience" reflects the possibility of generating theories of brain function in terms of the information-processing properties of structures that make up nervous systems. It implies that we ought to be able to exploit the conceptual and technical resources of computational research to help find explanations of how neural structures achieve their effects, what functions are executed by neural structures, and the nature of representation by states of the nervous system.
What Is Computational Neuroscience? Patricia S. Churchland, Christof Koch, and Terrence J. Sejnowski
The expression also connotes the potential for theoretical progress in cooperative projects undertaken by neurobiologists and computer scientists. This collaborative possibility is crucial, for it appears that neither a purely bottom-up strategy nor a purely top-down strategy for explaining how the brain works is likely to be successful. With only marginal caricature, one can take the purely bottom-up strategy as recommending that higher-level functions can be neither addressed nor understood until all the fine-grained properties of each neuron and each synapse are understood. But if, as is evident, some properties are network effects or system effects, and in that sense are emergent properties, they will need to be addressed by techniques appropriate for higher levels and described by theoretical categories suitable to that level. Assuming there are system properties or network properties that are not accessible at the single-unit level, then knowing all the fine-grained detail would still not suffice to explain how the brain works. A purely top-down strategy is typified, again with minimal caricature, by its dismissal of the organization and structure of the nervous system as essentially irrelevant to determining the nature of cognition (Pylyshyn 1984; Fodor 1975). Advocates of this strategy prefer instead to find computational models that honor only (or at least primarily) psychological and computational constraints. One major reason for eyeing skeptically the purely top-down strategy is that computational space is consummately vast, and on their own, psychological and engineering constraints do not begin to narrow the search
space down to manageable proportions. Unless we go into the black box, we are unlikely to get very far in understanding the actual nature of fundamental cognitive capacities, such as learning, perceiving, orienting and moving in space-time, and planning. There is an additional reason it may be inefficient to try to determine how the brain accomplishes some task, such as binocular depth perception, by taking a purely engineering approach: nervous systems are products of evolution. The brain's solutions to such problems may be radically different from what a clever engineer might design, at least for the reason that evolutionary changes are made within the context of a design and architecture that already is in place. Evolution cannot start from scratch, even when the optimal design would require that course. As Jacob (1982) has remarked, evolution is a tinkerer, and it fashions its modifications out of available materials, limited by earlier decisions. Moreover, any given capacity, such as binocular depth perception, is part of a much larger package subserving sensorimotor control and survival in general. From an engineering point of view, a design for an isolated stereoscopic device may look like a wonderfully smart design, but in fact it may not integrate at all well with the wider system and may be incompatible with the general design of the nervous system. For these and other reasons (Churchland 1986), neurobiological constraints have come to be recognized as essential to the project.

Although the name "computational neuroscience" may have emerged recently, the central motivation for the enterprise is by no means new. Even so, a great deal has happened between the publication of Cybernetics by Wiener in 1948 and the Computational Neuroscience meeting in Monterey in the spring of 1987. First, there has been a spectacular growth in anatomical and physiological information regarding nervous systems, and in techniques for extracting information. So remarkable and munificent have been the discoveries, that it has begun to seem that the time is ripe for generating empirically adequate computational theories at a variety of levels. No less dramatically, technological achievements in designing fast, powerful, and relatively inexpensive computing machines have made it possible to undertake simulation and modeling projects that were hitherto only pipe dreams. Finally, disappointment in conventional AI strategies (Good Old Fashioned AI, or GOFAI, as Haugeland [1985] calls it) for modeling cognition independently of neurobiological constraints has provoked theorists to reconsider what neuroscientists have said all along: the neuronal organization matters. Connectionists have begun to develop alternative strategies to GOFAI that are yielding models that are surprisingly powerful as well as more biologically plausible. (For example, Lehky and Sejnowski (1988) describe a neural network model that computes the shape of an object from its gray level intensities array; see also papers in Rumelhart and McClelland 1986 and Zipser and Andersen 1988.)

Is the Brain a Computer?

If we seek theories to explain in computational terms the function of some part of the nervous system—a network of neurons, or an individual neuron, or perhaps a system of networks—then that structure is seen as doing computation and hence as being a computer. But is this a justifiable perspective? If by "computer" we mean "serial digital computer," then the answer is No. For the analogy between the brain and a serial digital computer is an exceedingly poor one in most salient respects, and the failures of similarity between digital computers and nervous systems are striking. On the other hand, if we embrace a broader conception of computation (see below) then the answer is a definite Yes.
Of the dissimilarities between serial digital computers and nervous systems, perhaps the most important is this: nervous systems are not general purpose machines that are programmed up to be specialized machines; rather, they evolved to perform certain kinds of tasks. Unlike manufactured computers, nervous systems have plasticity—they grow, develop, learn and change. Nervous systems have a profoundly different architecture from serial digital computers. In particular, they have parallel structure, and nervous systems appear to use a system of information storage that is very different from that employed in serial digital computers. In conventional silicon chips one node is connected, on average, to three to four other nodes, while by contrast one cortical neuron may receive input from hundreds or thousands of neurons and in turn project to thousands of other neurons in cortex. It is also important in this context to note that events in nervous systems happen in the millisecond (10^-3 second) range, whereas events in silicon systems happen in the nanosecond (10^-9 second) range. The individual com-
ponents making up the brain and those making up serial digital machines work on vastly different time scales. This makes it quite obvious that the parallel architecture is not a trivial but an absolutely critical aspect of computational design in nervous systems. Finally, much of the computing in nervous systems undoubtedly is not manipulation of symbols according to rules explicit in the system, in contrast to conventional AI programs run on a serial digital computer. (Rumelhart and McClelland 1986)
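As a worked aside on the numbers just cited (the ratio is supplied here; it is not in the original text): the two characteristic times differ by

    10^{-3}\ \mathrm{s} \;/\; 10^{-9}\ \mathrm{s} \;=\; 10^{6},

so a silicon component can switch roughly a million times in the interval a neuron needs for one millisecond-scale event. Whatever speed nervous systems achieve on perceptual tasks must therefore come from the breadth of their parallelism, which is the force of the point just made.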
These dissimilarities do not imply that brains are not computers, but only that brains are not serial digital computers. Identifying computers with serial digital computers is neither justified nor edifying, and a more insightful strategy is to use a suitably broad notion of computation that takes the conventional digital computer as only a special instance, not as the defining archetype. (For a different view, see Pylyshyn 1984.)

It is useful, therefore, to class certain functions in the brain as computational because nervous systems represent and they respond on the basis of the representations. Nervous systems represent the external world, the body they inhabit, and, in some instances, parts of the nervous system itself. Nervous system effects can be seen as computations because at some levels the explanations of state transitions refer to abstract properties and the relations between abstract properties. That is, the explanations are not simple mechanical explanations that cite causal interactions between physically specified items: for example, the explanation of how a molecule is transported down an axon. Rather, the explanation describes the states in terms of the information transformed, represented, and stored.

The distinction between physical causation and computation also arises in other areas of biology. For example, the sequence of base pairs in DNA codes a wide variety of structures and functions, including the sequences of amino acids in proteins, regulatory signals, and developmental programs. We are still a long way from the complete account of how the information contained in the DNA guides regulation and development, but the activation and deactivation of different genes at different stages of development in different cells can be considered a very complex computational system whose goal is to produce a living creature. In a most general sense, we can consider a physical system as a computational system just in case there is an appropriate (revealing) mapping between some algorithm and associated physical variables. More exactly, a physical system computes a function f(x) when there is (1) a mapping between the system's physical inputs and x, (2) a mapping between the system's physical outputs and y, such that (3) f(x) = y. This kind of arrangement is easily understood in a digital computer, though it is by no means confined to a machine of that configuration. A machine is taken to compute, for example, (PLUS 2,3) to give 5 as the output, by dint of the fact that its physical regularities are set up in such a way as to honor the abstract regularities in the domain of numbers. But that sort of property (physical states of a device mappable onto information-bearing states) can be achieved in many different kinds of architectural arrangements. Notice that although two pieces of wood sliding over each other may initially not be regarded as a computer, two pieces of wood with the proper representations is a slide rule, and a slide rule is a mechanical computer whose physical structure is set up so that certain mathematical relations are preserved. Given this basic account of computation, it is also useful to hypothesize what computations are performed by neurobiological components. For example, it appears that the neurons in the parietal cortex of primates compute head co-ordinates on the basis of retinal co-ordinates (Andersen and Mountcastle 1983; Zipser and Andersen 1987, 1988), and that neurons in area V1 of the visual cortex compute depth of a stimulus on the basis of binocular disparity (Poggio et al. 1985).
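A minimal sketch, in Python, of the mapping account of computation just given; the encode/decode functions, the scale constant, and the toy "physics" are invented for illustration and are not from the chapter. It shows the slide rule as a computer in exactly the sense defined above: the only physical regularity is the addition of lengths, and it is the mapping between lengths and numbers that makes that regularity honor multiplication in the represented domain.

    import math

    # Physical side: positions (lengths, in cm) along two log-scaled sticks.
    # Sliding one stick against the other simply adds the two lengths.
    def slide(length_a_cm, length_b_cm):
        return length_a_cm + length_b_cm        # the only "physics" in the device

    # Interpretation side: the mapping between physical lengths and numbers.
    SCALE_CM_PER_DECADE = 10.0                  # an arbitrary, illustrative choice
    def encode(x):                              # number -> physical length
        return SCALE_CM_PER_DECADE * math.log10(x)
    def decode(length_cm):                      # physical length -> number
        return 10 ** (length_cm / SCALE_CM_PER_DECADE)

    # Under this mapping the device computes f(x, y) = x * y: the inputs are
    # mapped onto lengths, the physical regularity runs, and the resulting
    # length is mapped back onto a number.
    print(decode(slide(encode(2.0), encode(3.0))))   # approximately 6.0

The same three-part scheme (input mapping, physical regularity, output mapping) is what licenses saying that parietal neurons compute head coordinates from retinal coordinates.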
A point that is perhaps only implicit in the foregoing characterization of computation should now be emphasized. A physical system is considered a computer when its states can be taken as representing states of some other system: that is, so long as someone sees an interpretation of its states in terms of a given algorithm. Thus a central feature of this characterization is that whether something is a computer has an interest-relative component, in the sense that it depends on whether someone has an interest in the device's abstract properties and in interpreting its states as representing states of something else. Consequently, a computer is not a natural kind, in the way that, for example, an electron or a protein or a mammal is a natural kind. For categories that do delimit natural kinds, experimentation is relevant to determining whether a given item really belongs to the category, and there are generalizations and laws (natural laws) about the items in the categories and theories embodying the laws. Non-natural kinds differ in all these respects, and typically have an interest-relative dimension. For example, 'bee' is a natural kind, but 'gem', 'dirt', and 'weed' are not. Stones are considered gems depending on whether some social group puts special value on them; plants are considered weeds, depending on whether a given gardener happens to like them in his garden. Some gardeners cultivate Baby's Breath as a desirable plant; other gardeners fight it as a weed. There is no experiment that will settle for us whether Baby's Breath really is a weed or not, because there is no fact of the matter, only social or idiosyncratic conventions. Nor are there interest-independent generalizations about all weeds—there is nothing common to plants called weeds save that some gardener has an interest in keeping them out of his garden. Similarly, there is no intrinsic property common to all computers, just the interest-relative property that someone sees a value in interpreting a system's states as representing states of some other system, and the properties of the system support such an interpretation. (For more on natural kinds, see P. M. Churchland 1985.) Desk-top von Neumann machines have become prototypical computers because they are so common, just as dandelions are prototypical weeds, but these prototypes should not be mistaken for the category itself.

It may be suggested as a criticism of this very general characterization of computation that it is too general. For in this very wide sense, even a sieve or a threshing machine could be considered a computer, since they sort their inputs into types, and in principle one could specify the sorting algorithm. While this observation is correct, it is not so much a criticism but an apt appreciation of the breadth of the notion of computation. Conceivably, sieves and threshing machines could be construed as computers, if anyone has reason to care about the specific algorithm reflected in their input-output functions, though it is hard to see what those reasons might be. Still, an appropriately shaped rock can function as a sundial, and though this is a very simple computer, we do have reason to care about the temporal states that its shadow-casting states can be interpreted as representing.

Nonetheless, there is perhaps a correct intuition behind the criticism that is this. Finding a device sufficiently interesting to warrant the description 'computer' probably also entails that its input-output function is rather complex and unobvious, so that discovering what the function is reveals something important and perhaps unexpected about the real nature of the device and how it works. Thus finding out what is computed by a sieve is probably not very interesting, and will not teach us much we did not already know. In contrast, finding out what is computed by the cerebellum will teach us a lot about the nature of that tissue, and how it works.

Their simplicity notwithstanding, simple computational arrangements can lead to interesting insights. For example, assuming that we can landscape a terrain appropriately, a marble rolling down a hill into a valley can find (compute) the local minimum of a nonconvex two-dimensional cost function just as well as the steepest descent algorithm working on a digital computer, and probably faster. There is an analogy between this simple computer and Hopfield networks, parallel networks that store information at local minima of energy functions similar to these landscapes (Hopfield 1982). Global optimization problems can be solved by adding noise to the system, effectively "heating up" the marble so that it can jump out of a local minimum (Hinton and Sejnowski 1983). Applied to a problem in vision, for example, states of a Boltzmann machine correspond to a three-dimensional interpretation of a two-dimensional image, and states with the lowest energy correspond to the best interpretation (Kienker et al. 1986; Sejnowski et al. 1986). Further, this analogy between a ball rolling down to find a local minimum of a landscape and computation of minima by analog networks has been the basis for a proposal that most problems in early vision are carried out by analog networks minimizing convex and nonconvex cost functions (Poggio and Koch 1985). The input is given by currents injected into the nodes of a resistive network and the final solution is recovered by measuring the voltage at each node (Koch et al. 1986). In other words, Kirchhoff's voltage and current laws are exploited directly for the computation, in contrast to standard VLSI circuits. If we do not know the exact mapping between a problem and the physical domain, however, it may seem rather strange to classify a marble rolling down a hill or a current flowing in a wire as computing.
28
different initial conditions can be achieved very simply in the marble-on-the-landscape computer, which demonstrates that its flexibility lies in a different domain. Ease of programrnability addresses only one practical aspect of the prototypical computer, and although this aspect is of great importance for many purposes, it is irrelevant to the fundamental theoretical question of what it is to be a computer.
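To make the marble analogy above concrete, here is a minimal sketch (not from the original chapter) of plain steepest descent versus descent with added noise that is gradually reduced, in the spirit of the "heated marble" of Hinton and Sejnowski (1983). The landscape function, the cooling schedule, and all parameter values are illustrative assumptions, not anything specified by the authors.

```python
import math
import random

def landscape(x, y):
    # An illustrative nonconvex "terrain": many local valleys inside a global bowl.
    return 0.1 * (x**2 + y**2) + math.sin(3 * x) * math.cos(3 * y)

def grad(x, y, eps=1e-5):
    # Numerical gradient of the landscape (the local downhill direction).
    gx = (landscape(x + eps, y) - landscape(x - eps, y)) / (2 * eps)
    gy = (landscape(x, y + eps) - landscape(x, y - eps)) / (2 * eps)
    return gx, gy

def roll(x, y, noise=0.0, steps=2000, step_size=0.01):
    # Steepest descent; with noise > 0 the "marble" is shaken and can escape
    # shallow local minima.  The shaking is annealed toward zero over time.
    for t in range(steps):
        gx, gy = grad(x, y)
        temp = noise * (1.0 - t / steps)          # cooling schedule
        x -= step_size * gx + random.gauss(0.0, temp)
        y -= step_size * gy + random.gauss(0.0, temp)
    return x, y, landscape(x, y)

random.seed(0)
start = (2.0, 2.0)
print("plain descent :", roll(*start, noise=0.0))
print("noisy descent :", roll(*start, noise=0.5))
```

Printing the final landscape heights lets one compare where each marble settles; with a noise schedule of this kind the shaken marble can escape shallow valleys that trap plain descent, which is the intuition behind using such dynamics for global optimization. This toy makes no claim about the vision problems discussed above.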
Given this preamble on the nature of computation, the question now is whether it is appropriate to describe various structures in nervous systems as computing. The summary answer is that it certainly is. The reason is that we need a description of what various structures are doing that specifies the algorithm and the abstract properties that it maps. Ions flowing along concentration and electrical gradients compute just as surely as do electrons and holes in an MOS circuit. In fact, both obey rather similar equations. What we do not yet know in most cases of interest in neurobiology is the relationship between the problem domain, for instance computing binocular depth, and the appropriate biophysical variables, for instance average firing frequency, timing of individual action potentials, and the occurrence of bursts.

On this point there is of course a major contrast between manufactured and biological computers. Since we construct digital computers ourselves, we build the appropriate relationship into their design. Consequently, we tend to take this mapping for granted in computers generally, both manufactured and evolved. But for structures in the nervous system, these relationships have to be empirically discovered. In the case of biological computers this may turn out to be very difficult, since we typically do not know what is being computed by a structure, and intuitive folk ideas may be very misleading.

The relationship between electronic computers and the neurobiological computers that can be simulated on electronic computers may be misunderstood. Although nervous systems appear to be very different from serial digital computers, once we discover what their computational principles are, the computations can be simulated on serial digital computers. This is a purely formal point that can be put another way: so long as there is a systematic way (a nonmagical way) that a network performs its tasks, then in principle that way can be captured by an algorithm that formally specifies the relations between input and output. Since it is an algorithm, it can run on computers with a variety of different architectures, though not necessarily with comparable speed, elegance, or equivalent procedures. In principle, any algorithm can be run on a Turing machine, which is the dominant formal model of computation, not to be confused with an actual piece of electronic hardware. Apart from telling us that these algorithms can in principle be implemented on computers, the observation about Turing machines tells us little else, since the structure of the algorithm and the actual operations needed to carry out these computations in the nervous system remain to be discovered. What it does tell us is that the brain does not perform any magical operations when it computes depth or moves an arm.

It is important to emphasize that different architectures may be input-output equivalent, but nevertheless be radically different in procedures and in the speed of arriving at results. The matter of time is especially worth raising in a biological context, since for organisms making a living in a world "red in tooth and claw," timing is of the essence. The organism's nervous system cannot afford the luxury of taking several days or even several minutes to decide whether and where to flee from a predator. If serial digital computers running GOFAI programs to perform tasks such as motor control and visual recognition had to compete in the actual space-time world with actual fauna—even humble fauna—they would not stand a chance. There is enormous evolutionary pressure to come up with fast solutions to problems, and this is a constraint of the first importance as we try to understand the computational principles that govern a brain. Biologically speaking, a quick and dirty solution is better than a slow and perfect one.

Levels

Marr's Three Levels

Discussions concerning computational theories and models designed to explain some aspect of nervous system function invariably make reference to the notion of levels. A simple framework outlining a conception of levels was articulated by Marr and Poggio (1976) and Marr (1982), and provided an important and influential conceptual starting point for thinking about levels in the context of computation by nervous structures. Marr's ideas drew upon the conception of levels in computer science, and accordingly he characterized three levels:
(1) the computational level of abstract problem analysis, wherein the task (e.g., determining structure from motion) is decomposed into its main constituents; (2) the level of the algorithm, which specified a formal procedure by which, for a given input, the correct output would be given, and the task could thereby be performed; (3) the level of physical implementation of the computation.

A central element in Marr's view was that a higher level was largely independent of the levels below it, and hence computational problems of the highest level could be analyzed independently of understanding the algorithm that executes the computation. For the same reason the algorithmic problem of the second level was thought to be solvable independently of an understanding of the physical implementation. Thus his preferred strategy was "top-down" rather than "bottom-up." At least this was the official doctrine, though in practice, downward glances figured significantly in the attempts to find problem analyses and algorithm solutions. Ironically, given his advocacy of the top-down strategy, Marr's work was itself highly influenced by neurobiological considerations, and implementational facts constrained his choice of problem and nurtured his computational and algorithmic insights. Publicly, advocacy of the top-down strategy did carry the implication, dismaying for some and comforting for others, that in solving questions of brain function the neurobiological facts could be more or less ignored, since they were, after all, just at the implementation level.

Two very different issues tended to become conflated in the doctrine of independence. One concerns whether, as a matter of discovery, one can figure out the algorithm and the problem analysis independently of facts about implementation. The other concerns whether, as a matter of formal theory, a given algorithm that is already known to perform a task in a given machine (e.g., the brain) can be implemented in some other machine that has a distinct architecture. Answers to these two questions may well diverge, and, as we argue, the answer to the first is probably No, while the answer to the second is Yes. So far as the latter is concerned, what computational theory tells us is merely that formal procedures can be run on different machines; in that sense and that sense alone, the algorithm is independent of the implementation. The formal point is straightforward: since an algorithm is formal, no specific physical parameters (e.g., "vacuum tubes," "Ca++") are part of the algorithm. That said, it is important to see that the purely formal point cannot speak to the issue of how best to discover the algorithm in fact used by a given machine, nor how best to arrive at the neurobiologically adequate task analysis (P. M. Churchland 1982). Certainly it cannot tell us that the discovery of the algorithms used by nervous structures will be independent of a detailed understanding of the nervous system. Moreover, it does not tell us that any implementation is as good as any other. And it had better not, since as discussed above, different implementations display enormous differences in speed, size, efficiency, elegance, etc. The formal independence of algorithm from architecture is something we can exploit to build other machines once we know how the brain works, but it is no guide to discovery if we do not yet know how the brain works.
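As a toy illustration of the formal point just made — not an example from the chapter — the sketch below fixes one computational-level specification (produce a sorted copy of a list) and gives two input-output equivalent procedures at the algorithmic level; which physical machine runs them is a separate, implementation-level question. The function names are invented for the illustration.

```python
def spec(xs):
    """Computational level: the function to be computed (a sorted copy of xs)."""
    return sorted(xs)

def insertion_sort(xs):
    # Algorithm 1: simple, but running time grows roughly with the square of the input size.
    out = []
    for x in xs:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out

def merge_sort(xs):
    # Algorithm 2: same input-output function, very different procedure and speed.
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

data = [5, 3, 8, 1, 9, 2]
assert insertion_sort(data) == merge_sort(data) == spec(data)
```

Both procedures satisfy the same task analysis, yet discovering which one (if either) a given machine actually uses requires examining the machine — the distinction drawn above between formal equivalence and discovery.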
The issue concerning independence of levels in top-down strategies marks a substantial conceptual difference between Marr and the current generation of connectionists and computational neuroscientists. In contrast to the doctrine of independence of computation from implementation, current research suggests that considerations of implementation play a vital role in the kinds of algorithms that are devised and the kind of computational insights available to the scientist. Knowledge of brain architecture, so far from being irrelevant to the project, can be the essential basis and invaluable catalyst for devising likely and powerful algorithms—algorithms that have a reasonable shot at explaining how in fact the neurons do the job. Consequently, if we consider the problem of discovering the computational principles of the nervous system, it is clear that analyzing the task and devising the algorithm are not independent of the neurobiological implementation. Without benefit of neurobiological constraints, we will undoubtedly find ourselves exploring some region of computational space utterly remote from the region inhabited by biological machines. Such exploration could be fun for its own sake, and may lead to technological innovations, but given the vastness of computational space, it stands a negligible chance of helping explain how nervous systems work. Levels are not independent, and there is much more coupling and intercalation than was previously appreciated. This interdependence should not, however, be considered a disadvantage, but an advantage, because it means we can avail ourselves of guidance from all levels.
Three Levels of Analysis: How Adequate?

In the previous section we provisionally acquiesced in a background presumption of Marr's framework. That presumption treated computation monolithically, as a single kind of level of analysis. In the same vein, the framework treats implementation and task description/decomposition each as a single level of analysis. The presumption should not pass unchallenged, however, for on a closer look, there are difficulties, and these difficulties bear on how we conceive of levels. The difficulties center around the observation that when we measure Marr's three levels of analysis against levels of organization in the nervous system, the fit is poor and confusing (Churchland and Sejnowski 1988). To begin with, there is organized structure at different scales: molecules, membranes, synapses, neurons, nuclei, circuits, networks, layers, maps, and systems (figure 1). At each structurally specified stratum we can raise the computational question: what does that organization of elements do? What does it contribute to the wider, computational organization of the brain? In addition, there are physiological levels: ion movement, channel configurations, EPSPs, IPSPs, action potentials, evoked potential waves, behavior, and perhaps other intervening levels that we have yet to learn about and that involve effects at higher anatomical levels such as circuits or systems.

The range of structural organization implies, therefore, that there are many levels of implementation, and that each has its companion task description. But if there is a ramifying of task descriptions to match the ramified structural organization, this diversity will probably be reflected in the ramification of algorithms that characterize how a task is accomplished. This in turn means that the notion of the algorithmic level is as oversimplified as the notion of the implementation level.

Note also that the same level can be viewed computationally (in terms of functional role) or implementationally (in terms of the substrate on which the function runs), depending on what questions you ask, and on whether you look up or down. For example, an action potential, from the point of view of communication between neurons, might be considered an implementation, since the neurons really only care about the presence of a binary event. However, from a lower level—the point of view of ionic distribution—the action potential is computational, since it is the result of integrating many sources of information, and this information is carried by the train of action potentials. In the next section we shall explore in more detail the question of levels in the nervous system, and how this bears upon the project of discovering computational models to explain nervous system function.

Philosophical Observations on Levels of Organization

Which really are the levels relevant to explanation in the nervous system is an empirical, not an a priori, question. The answers to the question will slowly emerge as neuroscience and computational theories co-evolve, as theory and experiment converge on robust hypotheses. As things stand, there is a variety of ways of addressing the question of levels of organization.

Levels of Organization in the Brain

[Figure 1 — a schematic of structural levels of organization, labeled from molecules and synapses through networks and maps up to systems, on spatial scales running from roughly 1 mm for networks to about 10 cm for systems.]

Figure 1. Structural levels of organization in the nervous system. The spatial scale at which anatomical organization can be identified varies over many orders of magnitude, from molecular dimensions to that of the entire central nervous system. The schematic diagrams on the right illustrate (bottom) a typical chemical synapse, (middle) a proposed circuit for generating oriented receptive fields in visual cortex, and (top) part of the motor system that controls the production of speech sounds.
The range of spatial scales over which the nervous system has been explored covers more than eight orders of magnitude, from molecular dimensions measured in angstroms to fiber tracts that span centimeters. Organizational principles emerge at each spatial level that are directly relevant to the function of the nervous system (figure 1). A few of these principles will be summarized here to serve as concrete examples.

Sensory information tends to be organized in spatial maps. The image of the world on the retina is mapped onto subcortical structures such as the superior colliculus and lateral geniculate nucleus, and from there to a proliferation of more than twenty maps in visual cortex. The generality of this principle throughout the somatosensory, auditory, and visual systems was noted by Adrian (1953), who suggested that even the olfactory system may be organized in this way. Often these maps are arranged in sheets, as in the superior colliculus where maps of the visual, auditory, and somatic spaces are stacked in adjacent layers such that stimuli from corresponding points in space lie above each other.

Topographic maps and laminae are special cases of a more general principle, which is the exploitation of geometry in the design of information processing systems (Mead 1987; Yeshurun and Schwartz, this volume). Spatial proximity may be an efficient way for biological systems to get together the information needed to solve rapidly difficult computational problems. For example, it is often important to compute the differences between similar features of a stimulus at nearby locations in space. By maintaining neighborhood relationships the total length of the connections needed to bring together the signals is minimized.

Most of our information about the representation of sensory information is based on recording from single neurons. This technique is revealing but it is also confining insofar as it biases us toward thinking about the cellular level rather than the subcellular or circuit levels. As we learn more about the complexity of subcellular structures, such as voltage-dependent processing in dendrites and spines (see the contributions of Koch and Poggio, Shepherd, Perkel, and Rall to this volume), our view of the computational capability of the neuron may change. We are especially in need of techniques that would allow us to monitor populations of neurons (Llinas 1985). Simulations of information processing in neural networks may aid us in understanding this level (Sejnowski and Lehky 1988; Zipser and Andersen 1988).

Temporal Aspects of Levels

Nervous systems are dynamic, and physiological observations at each structural level can also be arranged in a hierarchy of time scales. These scales reach from tens to hundreds of microseconds in the case of gating of single ionic channels to days or weeks for biophysical and biochemical events underlying memory, such as long-term potentiation. Action potentials and synaptic potentials of single neurons can be measured on a time scale of milliseconds using intracellular single-electrode recording. Recently, optical techniques have been used to record the activity in populations of neurons in cerebral cortex with a time resolution of about 100 msec and a spatial resolution of about 100 μm. The measurement of blood flow with PET scanning has been used to observe the average activity throughout the brain, averaged over minutes with 5 mm resolution. Even longer-term changes of behavior can be monitored and correlated with changes in the biochemical state of the nervous system. The range of time scales that can be studied varies over 12 orders of magnitude, from microseconds to days.

The Status of Levels

Each of the structural levels—molecules, membranes, synapses, neurons, nuclei, circuits, layers, columns, maps, and systems—is separable conceptually but not detachable physically. What is picked out as a level is actually a boundary imposed on the structure we study, using such techniques as are available to us, in order to try to understand the phenomena at hand. In the brain, they are all part of one, integrated, unified biological machine. That is, the function of a neuron depends on the synapses that bring it information, and, in turn, the neuron processes information by virtue of its interaction with other neurons in local circuits, which themselves play a particular role by virtue of their place in the overall geometry of the brain. This is a cautionary, if obvious, observation, generated by the appreciation that in a sense the postulation of levels is artificial from the point of view of the functioning brain, even though it may be scientifically indispensable if we are to make progress. And until we understand how the brain works, it is always an open question whether we have put our boundaries in the most revealing place.
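The wiring-minimization point made above in the discussion of spatial maps — that keeping neighboring features physically adjacent shortens the connections needed to compare them — can be checked with a small back-of-the-envelope simulation. Nothing below comes from the chapter; the one-dimensional layout, the neighbor-comparison task, and the numbers are illustrative assumptions.

```python
import random

def total_wire(placement):
    # Each feature i must be compared with its spatial neighbor i + 1;
    # wire length is the distance between the units that hold them.
    return sum(abs(placement[i] - placement[i + 1]) for i in range(len(placement) - 1))

n = 100
topographic = list(range(n))      # neighboring features stored in neighboring units
scrambled = topographic[:]        # same units and features, random assignment
random.seed(1)
random.shuffle(scrambled)

print("topographic map wiring:", total_wire(topographic))   # n - 1 short wires
print("scrambled map wiring  :", total_wire(scrambled))     # typically around n**2 / 3
```

The topographic layout needs an order of magnitude less total wire for the same set of local comparisons, which is the geometric economy the chapter attributes to maps and laminae.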
Concluding Remarks

A scientific field is defined primarily by its problem space and its successful large-scale theories. Until there are such theories in computational neuroscience, the field is defined mostly by the problems it seeks to solve, and the general methodology and specific techniques it hopes will yield successful theories. Accordingly, within the framework outlined here, we can say quite simply that the ultimate aim of computational neuroscience is to explain how the brain works. Since nervous systems have many levels of organization, theories appropriate to these levels will have to be generated, and the theories should integrate, level by level, into a coherent whole.

No single neural model can be expected to span all the levels cited, and an essential feature at one level of organization may be an insignificant detail at another. For example, the identity of a neurotransmitter is an essential feature in studying receptor binding, less important when studying circuits, and of marginal significance at the systems level. The multiplicity of levels of organization is a feature not only of neuroscience, but also of physics and chemistry, where explanations of phenomena on distinct levels of organization are more developed. Assuming a comprehensive "theory of the brain" does emerge, it will involve establishing a successive and overlapping chain of reasoning from the lowest levels to the highest, encompassing the various spatial, temporal, structural, and computational levels. Only then will we have achieved our ultimate aim.

It is rare for a model to jump over many levels, and the most successful models typically link only neighboring levels. For example, the Hodgkin-Huxley ionic model of the initiation and propagation of action potentials related macroscopic current measurements to the microscopic kinetics of ion channels, although such channels were only hypothetical at that time. This example also demonstrates that it may be necessary to make assumptions beyond the data available at one level in order to reach a better understanding at that level. It is only with the advent of single-channel patch clamp recording techniques, developed decades after the Hodgkin-Huxley model, that their assumptions regarding the nature of microscopic ion channels could be verified.

As knowledge accumulates at the cellular and molecular levels, it is tempting to incorporate all that is known into a model that aims to reproduce as much of the nervous system as possible from the bottom up. The problem with this approach is that a genuinely perfect model, faithful in every detail, is likely to be as incomprehensible as the nervous system itself. Although such a model may mimic the brain, it is debatable whether any true understanding has been achieved. However, as noted above, the exclusively top-down approach has its own special problems.

At this stage in our understanding of the brain, it may be fruitful to concentrate on models that suggest new and promising lines of experimentation, at all levels of organization. In this spirit, a model should be considered a provisional framework for organizing possible ways of thinking about the nervous system. The model may not be able to generate a full range of predictions owing to incompleteness; some assumptions may be unrealistic simplifications, and some details may even be demonstrably wrong. Nevertheless, as long as the model captures some useful kernel of truth that leads to new ways of thinking and to productive experiments, the model will have served its purpose, suggesting an improved model even as it is itself disproved. A corollary to this is that a very elaborate and sophisticated model that does not translate well into an experimental program is a sterile exercise when compared to something rougher that leads to productive research directions.

The convergence of research in computer science, neuroscience, psychophysics, and cognitive psychology makes these truly exciting times. The hope that some of the major mysteries of mind-brain function will finally be explained has become less pie-in-the-sky romanticism and more a palpable project that can be actively and experimentally pursued. Even so, the brain is awesomely complex; experiments are typically very difficult to do; there are ethical barriers to studying the normal human brain, and there may be nontrivial individual differences at many levels of organization. The answers may well elude us for some time to come.
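As an aside on the Hodgkin-Huxley model mentioned above — a model that links the channel level to the whole-cell level — the sketch below integrates the standard Hodgkin-Huxley equations for a single patch of membrane. It is a minimal, illustrative implementation (simple Euler integration, textbook parameter values, singularities in the rate functions ignored), not code from the chapter.

```python
import math

# Standard Hodgkin-Huxley parameters (squid axon; voltage in mV, time in ms).
C_m = 1.0                              # membrane capacitance, uF/cm^2
g_Na, g_K, g_L = 120.0, 36.0, 0.3      # maximal conductances, mS/cm^2
E_Na, E_K, E_L = 50.0, -77.0, -54.4    # reversal potentials, mV

# Voltage-dependent opening/closing rates for the gating variables m, h, n
# (removable singularities at V = -40 and -55 mV are ignored in this sketch).
def a_m(V): return 0.1 * (V + 40.0) / (1.0 - math.exp(-(V + 40.0) / 10.0))
def b_m(V): return 4.0 * math.exp(-(V + 65.0) / 18.0)
def a_h(V): return 0.07 * math.exp(-(V + 65.0) / 20.0)
def b_h(V): return 1.0 / (1.0 + math.exp(-(V + 35.0) / 10.0))
def a_n(V): return 0.01 * (V + 55.0) / (1.0 - math.exp(-(V + 55.0) / 10.0))
def b_n(V): return 0.125 * math.exp(-(V + 65.0) / 80.0)

def simulate(I_ext=10.0, t_max=50.0, dt=0.01):
    V, m, h, n = -65.0, 0.05, 0.6, 0.32      # approximate resting state
    spikes, above, t = 0, False, 0.0
    while t < t_max:
        # Microscopic level: gating-variable kinetics.
        m += dt * (a_m(V) * (1 - m) - b_m(V) * m)
        h += dt * (a_h(V) * (1 - h) - b_h(V) * h)
        n += dt * (a_n(V) * (1 - n) - b_n(V) * n)
        # Macroscopic level: total membrane current and voltage.
        I_Na = g_Na * m**3 * h * (V - E_Na)
        I_K = g_K * n**4 * (V - E_K)
        I_L = g_L * (V - E_L)
        V += dt * (I_ext - I_Na - I_K - I_L) / C_m
        if V > 0.0 and not above:            # count upward zero-crossings as spikes
            spikes += 1
        above = V > 0.0
        t += dt
    return spikes

print("action potentials in 50 ms at 10 uA/cm^2:", simulate())
```

With these standard parameter values the patch fires repetitively under a sustained current injection — the sort of macroscopic behavior that Hodgkin and Huxley recovered from hypothesized microscopic gating variables, illustrating a model that links neighboring levels.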
References

Adrian, E. D. (1953) The mechanism of olfactory stimulation in the mammal. Adv. Sci. (London) 9: 417-420.

Andersen, R. A., and V. B. Mountcastle (1983) The influence of the angle of gaze upon the excitability of the light-sensitive neurons of the posterior parietal cortex. J. Neurosci. 3: 532-548.

Churchland, P. M. (1982) Is "thinker" a natural kind? Dialogue 21: 223-238.

Churchland, P. M. (1985) Conceptual progress and word/world relations: In search of the essence of natural kinds. Can. J. Phil. 15: 1-17.

Churchland, P. S. (1986) Neurophilosophy: Toward a Unified Science of the Mind-Brain. Cambridge, MIT Press.

Churchland, P. S., and T. J. Sejnowski (1988) Neural representations and neural computations. In: Biological Computation, ed. L. Nadel. Cambridge, MIT Press.

Fodor, J. A. (1975) The Language of Thought. New York, Crowell. (Paperback edition, 1979, Cambridge, Harvard University Press.)

Haugeland, J. (1985) Artificial Intelligence: The Very Idea. Cambridge, MIT Press.

Hinton, G. E., and T. J. Sejnowski (1983) Optimal perceptual inference. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

Hopfield, J. J. (1982) Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA 79: 2554-2558.

Jacob, F. (1982) The Possible and the Actual. Seattle, University of Washington Press.

Kienker, P. K., T. J. Sejnowski, G. E. Hinton, and L. E. Schumacher (1986) Separating figure from ground with a parallel network. Perception 15: 197-216.

Koch, C., J. Marroquin, and A. Yuille (1986) Analog neuronal networks in early vision. Proc. Natl. Acad. Sci. USA 83: 4263-4267.

Lehky, S., and T. J. Sejnowski (1988) Network model of shape-from-shading: Neural function arises from both receptive and projective fields. Nature 333: 452-454.

Llinas, R. R. (1985) Electrotonic transmission in the mammalian central nervous system. In: Gap Junctions, eds. M. V. L. Bennett and D. C. Spray. Cold Spring Harbor Laboratory.

Marr, D. (1982) Vision. San Francisco, Freeman.

Marr, D., and T. Poggio (1976) Cooperative computation of stereo disparity. Science 194: 283-287.

Mead, C. (1987) Analog VLSI and Neural Systems. Reading, Mass., Addison-Wesley.

Poggio, G. F., B. C. Motter, S. Squatrito, and Y. Trotter (1985) Responses of neurons in visual cortex (V1 and V2) of the alert macaque to dynamic random-dot stereograms. Vision Res. 25: 397-406.

Poggio, T., and C. Koch (1985) Ill-posed problems in early vision: from computational theory to analogue networks. Proc. R. Soc. Lond. B 226: 303-323.

Pylyshyn, Z. W. (1984) Computation and Cognition: Toward a Foundation for Cognitive Science. Cambridge, MIT Press.

Rumelhart, D. E., and J. L. McClelland (1986) Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MIT Press.

Sejnowski, T. J., P. K. Kienker, and G. E. Hinton (1986) Learning symmetry groups with hidden units: Beyond the perceptron. Physica 22D: 260-275.

Zipser, D., and R. Andersen (1988) A back-propagation network that simulates response properties of a subset of posterior parietal neurons. Nature 331: 679-684.
1b
Physics, Biology, Computation
These four authors, two physicists and two biologists, contribute to shedding light on the main themes of this volume from a diversity of angles. The definition of levels of study and the description of different "attitudes" towards science are elaborated in a clear and elegant fashion. The notion of computation, as a link between physics and biology, is further discussed. So this section can be viewed as an unfolding of many of the ideas introduced by Anderson (paper (3)), and by Churchland et al (paper (4)). It seemed to us essential to show how these ideas have appeared historically, as a result of deep and brilliant reflections from the best scientific minds (sometimes misled in ways still instructive for us), and as a progressive adjustment and refinement of past guesses. The first paper, by Leon Brillouin, presents an educated synthesis, dating from the middle of the 20th century. It includes views on the past revolutions in physics (thermodynamics, statistical physics, quantum mechanics) and it attempts to make predictions of future scientific progress on the life-matter issues, at the time of the construction of information theory, and before the triumphs of molecular biology. It is quite interesting and fruitful to reexamine these forecasts from our vantage point, half a century later, in order to assess our own chances at guessing right about the mind-body problems. The second paper, by John Hopfield, provides a "reasonable", "modern" account of these issues. It is a lucid attempt at pointing out what is a mystery and what is trivial, and at focusing on promising lines of research. It also contains a pedagogical introduction to neural networks, used for content-addressable memories or for optimisation tasks, and thus leads naturally to Chapter 3. The third paper, by François Jacob, is a brilliant survey of biological evolution, with a now famous distinction between the engineer and the tinkerer approaches to design. It offers views that are coherent and complementary to those presented earlier in the papers by Anderson (paper (3)), Churchland
et al (paper (4)), and Brillouin (paper (5)). Increasing complexity, greater selection among the possible, and the increased role of history led to a necessity for an analysis of complex objects at all levels. The two short texts by Valentino Braitenberg are separated by a lapse of fifteen years. They reach almost poetic value, through accuracy and condensation of expression. They are also sharp and even occasionally polemical in their definition of adequate strategies to understand brain functions. The plea in favour of "interpretative neuroanatomy" will be illustrated in subsequent reprints by the author and in other articles of Chapters 2, 4 and 5.
Reprinted from Am. Sci., Vol. 37, pp. 554-568, 1949. © 1949 American Scientist.
LIFE, THERMODYNAMICS, AND CYBERNETICS
By L. BRILLOUIN
Cruft Laboratory, Harvard University*
How is it possible to understand life, when the whole world is ruled by such a law as the second principle of thermodynamics, which points toward death and annihilation? This question has been asked by many scientists, and, in particular, by the Swiss physicist, C. E. Guye, in a very interesting book. The problem was discussed at the College de France in 1938, when physicists, chemists, and biologists met together and had difficulty in adjusting their different points of view. We could not reach complete agreement, and at the close of the discussions there were three well defined groups of opinion:

(A) Our present knowledge of physics and chemistry is practically complete, and these physical and chemical laws will soon enable us to explain life, without the intervention of any special "life principle."

(B) We know a great deal about physics and chemistry, but it is presumptuous to pretend that we know all about them. We hope that, among the things yet to be discovered, some new laws and principles will be found that will give us an interpretation of life. We admit that life obeys all the laws of physics and chemistry at present known to us, but we definitely feel that something more is needed before we can understand life. Whether it be called a "life principle" or otherwise is immaterial.

(C) Life cannot be understood without reference to a "life principle." The behavior of living organisms is completely different from that of inert matter. Our principles of thermodynamics, and especially the second one, apply only to dead and inert objects; life is an exception to the second principle, and the new principle of life will have to explain conditions contrary to the second law of thermodynamics.

Another discussion of the same problems, held at Harvard in 1946, led to similar conclusions and revealed the same differences of opinion. In summarizing these three points of view, I have of course introduced some oversimplifications. Recalling the discussions, I am certain that opinions A and B were very clearly expressed. As for opinion C, possibly no one dared to state it as clearly as I have here, but it was surely in the minds of a few scientists, and some of the points introduced in the discussion lead logically to this opinion. For instance, consider a living organism; it has special properties which enable it to resist destruction, to heal its wounds, and to cure occasional sickness. This is very strange behavior, and nothing similar can be observed about inert matter. Is such behavior an exception to the second principle? It appears so, at least superficially, and we must be prepared to
* Now Director of Electronic Education, International Business Machines Corporation, New York, N.Y.
1. L'évolution physico-chimique (Paris, E. Chiron, 1922).
accept a "life principle" that would allow for some exceptions to the second principle. When life ceases and death occurs, the "life principle" stops working, and the second principle regains its full power, implying demolition of the living structure. There is no more healing, no more resistance to sickness; the destruction of the former organism goes on unchecked and is completed in a very short time. Thus the conclusion, or question: What about life and the second principle? Is there not, in living organisms, some power that prevents the action of the second principle?

The Attitude of the Scientist
T h e three groups as defined in the preceding section may be seen to correspond to general attitudes of scientists towards research: (A) strictly conservative, biased against any change, and interested only in new development and application of well established methods or principles; (B) progressive, open-minded, ready to accept new ideas and discoveries; ( C ) revolutionary, or rather, metaphysical, with a tendency to wishful thinking, or indulging in theories lacking solid experimental basis. I n the discussion just reviewed, most non-specialists rallied into group B. T h i s is easy to understand. Physicists of the present century had to acquire a certain feeling for the unknown, and always to be very cautious against over-confidence. Prominent scientists of the previous generation, about 1900, would all be classed in group A. Common opinion about that time was that everything was known and that coming generations of scientists could only improve on the accuracy of experiments and measure one or two more decimals on the physical constants. T h e n some new laws were discovered: quanta, relativity, and radioactivity. T o cite more specific examples, the Swiss physicist Ritz was bold enough to write, at the end of the nineteenth century, that the laws of mechanics could not explain optical spectra. T h i r t y years passed before the first quantum mechanical explanation was achieved. T h e n , about 1922, after the first brilliant successes of quantum mechanics, things came to a standstill while experimental material was accumulating. Some scientists (Class A) still believed that it was just a question of solving certain very complicated mathematical problems, and that the explanation would be obtained from principles already known. O n the contrary, however, we had to discover wave mechanics, spinning electrons, and the whole structure of the present physical theories. Now, to speak frankly, we seem to have reached another dead-end. Present methods of quantum mechanics appear not to be able to explain the properties of fundamental particles, and attempts at such explanations look decidedly artificial. M a n y scientists again believe that a new idea is needed, a new type of mathematical correlation, before we can go one step further. All this serves to prove that every physicist must be prepared for many new discoveries in his own domain. Class A corresponds to cautiousness. Before abandoning the safe ground of well established ideas, says the
556 cautious scientist, it must be proved that these ideas do not check with experiments. Such was the case with the Michelson-Morley experiment. Nevertheless, the same group of people were extremely reluctant to adopt relativity. Attitude B seems to be more constructive, and corresponds to the trend of scientific research through past centuries; attitude C, despite its exaggeration, is far from being untenable. W e have watched many cases of new discoveries leading to limitations of certain previous "laws." After all, a scientific law is not a "decree" from some supernatural power; it simply represents a systematization of a large number of experimental results. A s a consequence, the scientific law has only a limited validity. I t extends over the whole domain of experimentation, and maybe slightly beyond. B u t we must be prepared for some strange modifications when our knowledge is expanded much farther than this. M a n y historical examples could be introduced to support this opinion. Classical mechanics, for instance, was one of the best-established theories, yet it had to be modified to account for the behavior of very fast particles (relativity), atomic structure, or cosmogony. F a r from being foolish, attitude C is essentially an exaggeration of B; and any scholar taking attitude B must be prepared to accept some aspects of group C, if he feels it necessary and if these views rest upon a sound foundation. T o return to the specific problem of life and thermodynamics, we find it discussed along a very personal and original line in a small book published by the famous physicist E . Schrodinger. His discussion is very interesting and there are many points worth quoting. Some of them will be examined later on. I n our previous classification, Schrodinger without hesitation joins group B: 2
We cannot expect [he states] that the "laws of physics" derived from it [from the second principle and its statistical interpretation] suffice straightaway to explain the behavior of living matter. ... We must be prepared to find a new type of physical law prevailing in it. Or are we to term it a non-physical, not to say a super-physical law?

The reasons for such an attitude are very convincingly explained by Schrodinger, and no attempt will be made to summarize them here. Those who undertake to read the book will find plenty of material for reflection and discussion. Let us simply state at this point that there is a problem about "life and the second principle." The answer is not obvious, and we shall now attempt to discuss that problem systematically.
The Second Principle of Thermodynamics, Its Successes and Its Shortcomings
Nobody can doubt the validity of the second principle, no more than he can the validity of the fundamental laws of mechanics. However, the question is to specify its domain of applicability and the !
2. E. Schrodinger, What is Life? (London, Cambridge University Press, and New York, The Macmillan Company, 1945).
3. Ibid., p. 80.
557 chapters of science or the type of problems for which it works safely. We shall put special emphasis on all cases where the second principle remains silent and gives no answer. I t is a typical feature of this principle that it has to be stated as an inequality. Some quantity, called "entropy," cannot decrease (under certain conditions to be specified later); but we can never state whether "entropy" simply stays constant, or increases, or how fast it will increase. Hence, the answer obtained from the second principle is very often evasive, and keeps a sibyllic character. W e do not know of any experiment telling against the second principle, but we can easily find many cases where it is useless and remains dumb. L e t us therefore try to specify these limitations and shortcomings, since it Is on this boundary that life plays. Both principles of tkervwdynamics apply only to an isolated system, which is contained in an enclosure through which no heat can be transferred, no work can be done, and no matter nor radiation can be exchanged. T h e first principle states that the total energy of the system remains constant. T h e second principle refers to another quantity called "entropy," S, that may only increase, or at least remain constant, but can never decrease. Another way to explain the situation is to say that the total amount of energy is conserved, but not its "quality." E n e r g y may be found in a high-grade quality, which can be transformed into mechanical or electrical work (think of the energy of compressed air in a tank, or of a charged electric battery); but there are also low-grade energies, like heat. T h e second principle is often referred to as a principle of energy degradation. T h e increase in entropy means a decrease in quality for the total energy stored in the isolated system. Consider a certain chemical system (a battery, for instance) and measure its entropy, then seal it and leave it for some time. W h e n you break the seal, you may again measure the entropy, and you will find it increased. I f your battery were charged to capacity before sealing, it will have lost some of its charge and will not be able to do the same amount of work after having been stored away for some time. T h e change may be small, or there may be no change; but certainly the battery cannot increase its charge during storage, unless some additional chemical reaction takes place inside and makes up for the energy and entropy balance. O n the other hand, life feeds upon high-grade energy or "negative entropy." A decrease in highgrade energy is tantamount to a loss of food for living organisms. Or we can also say that living organisms automatically destroy first1
5
4
The fundamental definition must always start with an isolated system, whose energy, total mass, and volume remain constant. Then, step by step, other problems may be discussed. A body at "constant temperature" is nothing but a body enclosed in a large thermostat, that is, in a big, closed, and isolated tank, whose energy content is so large that any heat developed in the body under experience cannot possibly change the average temperature of the tank. A similar experimental device, with a closed tank containing a large amount of an ideal gas, leads to the idea of a body maintained at constant pressure and constant temperature. These are secondary concepts derived from the original one. Schrodinger. p. 12, 5
40
5S8 quality energy, and thus contribute to the different mechanisms of the second principle. I f there are some living cells in the enclosure, they will be able for some time to feed upon the reserves available, but sooner or later this will come to an end and death then becomes inevitable. T h e second principle means death by confinement, and it will be necessary to discuss these terms. Life is constantly menaced by this sentence to death. T h e only way to avoid it is to prevent confinement. Confinement implies the existence of perfect walls, which are necessary in order to build an ideal enclosure. B u t there are some very important questions about the problem of the existence of perfect walls. D o we really know any way to build a wall that could not let any radiation in or out? T h i s is theoretically almost impossible; practically, however, it can be done and is easily accomplished in physical or chemical laboratories. There is, it is true, a limitation to the possible application of the second principle, when it comes to highly penetrating radiation, such as ultra-hard rays or cosmic rays; but this does not seem to have any direct connection with the problem of life and need not be discussed here. Time and the second principle. T h e second principle is a death sentence, but it contains no time lijnit, and this is one of the very strange points about it. T h e principle states that in a closed system, S will increase and high-grade energy must decrease; but it does not say how fast. W e have even had to Include the possibility that nothing at all might happen, and that S would simply remain constant. T h e second principle is an arrow pointing to a direction along a oneway road, with no upper or lower speed limit. A chemical reaction may flash in a split second or lag on for thousands of centuries. Although time is not a factor in the second principle, there is, however, a very definite connection between that principle and the definition of time. One of the most important features about time is its irreversibility. T i m e flows on, never comes back. W h e n the physicist is confronted with this fact he is greatly disturbed. A l l the laws of physics, in their elementary form, are reversible; that is, they contain the time but not its sign, and positive or negative times have the same function. A l l these elementary physical laws might just as well work backward. It is only when phenomena related to the second principle (friction, diffusion, energy transferred) are considered that the irreversibility of time comes in. T h e second principle, as we have noted above, postulates that time flows always in the same direction and cannot turn back. T u r n the time back, and your isolated system, where entropy ( S ) previously was increasing, would now show a decrease of entropy. T h i s is impossible: evolution follows a one-way street, where travel in the reverse direction is strictly forbidden. T h i s fundamental observation was made by the founders of thermodynamics, as, for instance, by Lord K e l v i n In the following paragraphs:
41
559 // Nature Could Run Backward* If, then, the motion of every particle of matter in the universe were precisely reversed at any instant, the course of nature would be simply reversed forever after. The bursting bubble of foam at the foot of a waterfall would reunite and descend into the water; the thermal motions would reconcentrate their enerey and throw the mass up the fall in drops re-forming into a close column of ascending water. Heat which had been generated by the friction of solids and dissipated by conduction, and radiation with absorption, would come again to the place of contact and throw the moving body back against the force to which it had previously yielded. Boulders would recover from the mud the materials required to rebuild them into rheir previous jagged forms, and would become reunited to the mountain peak from which they had formerly broken away. And if, also, the materialistic hypothesis of life were true, living creatures would grow backward, with conscious knowledge of the fuiure but with no memory of the past, and would become again, unborn. But the real phenomena of life infinitely transcend human science, and speculation regarding consequences of their imagined reversal is utterly unprofitable. Far otherwise, however, is it in respect to the reversal of the motions of matter uninfluenced by life, a very elementary consideration of which leads to the full explanation of the theory of dissipation of energy.
This brilliant statement indicates definitely that L o r d K e l v i n would also be classed in our group B, on the basis of his belief that there is in life something that transcends our present knowledge. Various illustrations have been given of the vivid description presented by L o r d Kelvin. Movie-goers have had many opportunities to watch a waterfall climbing up the hill or a diver jumping back on the springboard; but, as a rule, cameramen have been afraid of showing life going backward, and such reels would certainly not be authorized by censors! I n any event, it is a very strange coincidence that life and the second principle should represent the two most important examples of the impossibility of time's running backward. This reveals the intimate relation between both problems, a question that will be discussed i n a later section. Statistical interpretation of the second principle. T h e natural tendency of entropy to increase is interpreted now as corresponding to the evolution from improbable toward most probable structures. T h e brilliant theory developed by L . Boltzmann, F . W . Gibbs, and J . C . M a x w e l l explains entropy as a physical substitute for "probability" and throws a great deal of light upon all thermodynamical processes. This side of the question has been very clearly discussed and explained in Schrodinger's book and will not be repeated here. Let us, however, stress a point of special interest. W i t h the statistical theory, entropy acquires a precise mathematical definition as the logarithm of probability. I t can be computed theoretically, when a physical model is given, and the theoretical value compared with experiment. When this has been found to work correctly, the same physical model can be used to investigate problems outside the reach of classical thermodynamics and especially problems involving time. T h e questions raised ;
6
William Thomson (Lord Kelvin), in the Proceedings of the Royal Society of Edinburgh 8: 325-331, 1874. Quoted in The Autobiography of Science, F. R. Moulton and J. J. Schifferes, editors (New York, 1945), p. 468.
42
S60 in the preceding section can now be answered; the rate of diffusion for gas mixtures, the thermal conductivity of gases, the velocity of chemical reactions can be computed. I n this respect, great progress has been made, and in a number of cases it can be determined how fast entropy will actually increase. I t is expected that convenient models will eventually be found for all of the most important problems; but this is not yet the case, and we must distinguish between those physical or chemical experiments for which a detailed application of statistical thermodynamics has been worked out, and other problems for which a model has not yet been found and for which we therefore have to rely on classical thermodynamics without the help of statistics. I n the first group, a detailed model enables one to answer the most incautious questions; in the second group, questions involving time cannot be discussed. Distinction between two classes of experiments where entropy remains constant. T h e entropy of a closed system, as noted, must increase, or at least remain constant. W h e n entropy increases, the system is undergoing an irreversible transformation; when the system undergoes a reversible transformation, its total entropy remains constant. Such is the case for reversible cycles discussed in textbooks on thermodynamics, for reversible chemical reactions, etc. However, there is another case where no entropy change is observed, a case that is usually ignored, about which we do not find a word of discussion in textbooks, simply because scientists are at a loss to explain it properly. T h i s is the case of systems in unstable equilibrium. A few examples may serve to clarify the problem much better than any definition. I n a private kitchen, there is a leak in the gas range. A mixture of air and gas develops (unstable equilibrium), but nothing happens until a naughty little boy comes in, strikes a match, and blows up the roof. Instead of gas, you may substitute coal, oil, or any sort of fuel; all our fuel reserves are i n a state of unstable equilibrium. A stone hangs along the slope of a mountain and stays there for years, until rains and brooklets carry the soil away, and the rock finally rolls downhill. Substitute waterfalls, water reservoirs, and you have all our reserves of "white fuel." U r a n i u m remained stable and quiet for thousands of centuries; then came some scientists, who built a pile and a bomb and, like the naughty boy in the kitchen, blew up a whole city. Such things would not be permitted if the second principle were an active principle and not a passive one. Such events could not take place in a world where this principle was strictly enforced. All this makes one thing clear. A l l our so-called power reserves are due to systems in unstable equilibrium. T h e y are really reserves of negative entropy—structures wherein, by some sort of miracle, the normal and legitimate increase of entropy does not take place, until man, acting like a catalytic agent, comes and starts the reaction. Very little is known about these systems of unstable equilibrium. N o explanation is given. T h e scientist simply mumbles a few embar-
43
561 rassed words about "obstacles" hindering the reaction, or "potential energy walls" separating systems that should react but do not. T h e r e is a hint in these vague attempts at explanation, and, when properly developed, they should constitute a practical theory. Some very interesting attempts at an interpretation of catalysis, on the basis of quantum mechanics, have aroused great interest in scientific circles. But the core of the problem remains. How is it possible for such tremendous negative entropy reserves to stay untouched? W h a t is the mechanism of negative catalysis, which maintains and preserves these stores of energy? T h a t such problems have a stupendous importance for mankind, it is hardly necessary to emphasize. I n a world where oil simply waits for prospectors to come, we already watch a wild struggle for fuel. How would it be if oil burned away by itself, unattended, and did not wait passively for the drillers? Life and Its Relations
with the Second
Principle
We have raised some definite questions about the significance of the second principle, and in the last section have noted certain aspects of particular importance. Let us now discuss these, point by point, in connection with the problem of life maintenance and the mechanism of life. Closed systems. M a n y textbooks, even the best of them, are none too cautious when they describe the increase of entropy. It is customary to find statements like this one: "The entropy of the universe is constantly increasing." This, in my opinion, is very much beyond the limits of human knowledge. Is the universe bounded or infinite? W h a t are the properties of the boundary? Do we know whether It is tight, or may It be leaking? Do entropy and energy leak out or in? Needless to say, none of these questions can be answered. W e know that the universe is expanding although we understand very little of how and why. Expansion means a moving boundary (If a n y ) , and a moving boundary is a leaking boundary; neither energy nor entropy can remain constant within. Hence, it is better not to speak about the "entropy of the universe." I n the last section we emphasized the limitations of physical laws, and the fact that they can be safely applied only within certain limits and for certain orders of magnitude. T h e whole universe is too big for thermodynamics and certainly exceeds considerably the reasonable order of magnitude for which its principles may apply. This is also proved by the fact that the Theory of Relativity and all the cosmological theories that followed always involve a broad revision and drastic modification of the laws of thermodynamics, before an attempt can be made to apply them to the universe as a whole. T h e only thing that we can reasonably discuss Is the entropy of a conceivable closed structure. Instead of the very mysterious universe, let us speak of our home, the earth. Here we stand on familiar ground. T h e earth is not a closed system. I t is constantly receiving energy and negative entropy from outside—radiant heat from the sun, gravitational energy from
44
562 sun and moon (provoking sea tides), cosmic radiation from unknown origin, and so on. T h e r e is also a certain amount of outward leak, since the earth itself radiates energy and entropy. How does the balance stand? Is it positive or negative? I t is very doubtful whether any scientist can answer this question, much less such a question relative to the universe as a whole. T h e earth is not a closed system, and life feeds upon energy and negative entropy leaking into the earth system. Sun heat and rain make crops (remember April showers and M a y flowers), crops provide food, and the cycle reads: first, creation of unstable equilibriums (fuels, food, waterfalls, etc.); then, use of these reserves by all living creatu res. Life acts as a catalytic agent to help destroy unstable equilibrium, but it is a very peculiar kind of catalytic agent, since it profits by the operation. W h e n black platinum provokes a chemical reaction, it does not seem to care, and does not profit by it. Living creatures care about food, and by using it they maintain their own unstable equilibrium. T h i s is a point that will be considered later. T h e conclusion of the present section is this: that the sentence to "death by confinement" is avoided by living in a world that is not a confined and closed system. The role of time. W e have already emphasized the silence of the second principle. T h e direction of any"reaction is given, but the velocity of the reaction remains unknown. I t may be zero (unstable equilibrium), it may remain small, or it may become very great. Catalytic agents usually increase the velocity of chemical reactions; however, some cases of "anticatalysis" or "negative catalysis" have been discovered, and these involve a slowing down of some important reactions (e.g., oxidation). Life and living organisms represent a most important type of catalysis. It is suggested that a systematic study of positive and negative catalysts might prove very useful, and would in fact be absolutely necessary before any real understanding of life could be attained. T h e statistical interpretation of entropy and quantum mechanics are undoubtedly the tools with which a theory of catalysis should be built. Some pioneer work has already been done and has proved extremely valuable, but most of it is restricted, for the moment, to the most elementary types of chemical reactions. T h e work on theoretical chemistry should be pushed ahead with great energy. Such an investigation will, sooner or later, lead us to a better understanding of the mechanisms of "unstable equilibrium." New negative catalysts m a y even make it possible to stabilize some systems that otherwise would undergo spontaneous disintegration, and to preserve new types of energies and negative entropies, just as we now know how to preserve food. W e have already emphasized the role of living organisms as catalytic agents, a feature that has long been recognized. E v e r y biochemist now
thinks of ferments and yeasts as peculiar living catalysts, which help release some obstacle and start a reaction in a system in unstable equilibrium. Just as catalysts are working within the limits of the second principle, so living organisms are too. It should be noted, however, that catalytic action in itself is something which is not under the jurisdiction of the second principle. Catalysis involves the velocity of chemical reactions, a feature upon which the second principle remains silent. Hence, in this first respect, life is found to operate along the border of the second principle.

However, there is a second point about life that seems to be much more important. Disregard the very difficult problem of birth and reproduction. Consider an adult specimen, be it a plant or an animal or man. This adult individual is a most extraordinary example of a chemical system in unstable equilibrium. The system is unstable, undoubtedly, since it represents a very elaborate organization, a most improbable structure (hence a system with very low entropy, according to the statistical interpretation of entropy). This instability is further shown when death occurs. Then, suddenly, the whole structure is left to itself, deprived of the mysterious power that held it together; within a very short time the organism falls to pieces, rots, and goes (we have the wording of the scriptures) back to the dust whence it came.

Accordingly, a living organism is a chemical system in unstable equilibrium maintained by some strange "power of life," which manifests itself as a sort of negative catalyst. So long as life goes on, the organism maintains its unstable structure and escapes disintegration. It slows down to a considerable extent (exactly, for a lifetime) the normal and usual procedure of decomposition. Hence, a new aspect of life. Biochemists usually look at living beings as possible catalysts. But this same living creature is himself an unstable system, held together by some sort of internal anticatalyst! After all, a poison is nothing but an active catalyst, and a good drug represents an anticatalyst for the final inevitable reaction: death.

N. Wiener, in his Cybernetics, takes a similar view when he compares enzymes or living animals to Maxwell demons, and writes: "It may well be that enzymes are metastable Maxwell demons, decreasing entropy. . . . We may well regard living organisms, such as Man himself, in this light. Certainly the enzyme and the living organism are alike metastable: the stable state of an enzyme is to be deconditioned, and the stable state of a living organism is to be dead. All catalysts are ultimately poisoned: they change rates of reaction, but not true equilibrium. Nevertheless, catalysts and Man alike have sufficiently definite states of metastability to deserve the recognition of these states as relatively permanent conditions." [Footnote: Norbert Wiener, Cybernetics, or Control and Communication in the Animal and the Machine (New York: John Wiley and Sons, 1948), p. 72.]
Living Organisms and Dead Structures
In a discussion at Harvard (1946), P. W. Bridgman stated a fundamental difficulty regarding the possibility of applying the laws of thermodynamics to any system containing living organisms. How can we compute or even evaluate the entropy of a living being? In order to compute the entropy of a system, it is necessary to be able to create or to destroy it in a reversible way. We can think of no reversible process by which a living organism can be created or killed: both birth and death are irreversible processes. There is absolutely no way to define the change of entropy that takes place in an organism at the moment of its death. We might think of some procedure by which to measure the entropy of a dead organism, albeit it may be very much beyond our present experimental skill, but this does not tell us anything about the entropy the organism had just before it died. This difficulty is fundamental; it does not make sense to speak of a quantity for which there is no operational scheme that could be used for its measurement. The entropy content of a living organism is a completely meaningless notion.

In the discussion of all experiments involving living organisms, biologists always avoid the difficulty by assuming that the entropy of the living objects remains practically constant during the operation. This assumption is supported by experimental results, but it is a bold hypothesis and impossible to verify. To a certain extent, a living cell can be compared to a flame: here is matter going in and out, and being burned. The entropy of a flame cannot be defined, since it is not a system in equilibrium. In the case of a living cell, we may know the entropy of its food and measure the entropy of its wastes. If the cell is apparently maintained in good health and not showing any visible change, it may be assumed that its entropy remains practically constant. All experimental measures show that the entropy of the refuse is larger than that of the food. The transformation operated by the living system corresponds to an increase of entropy, and this is presented as a verification of the second principle of thermodynamics. But we may have some day to reckon with the underlying assumption of constant entropy for the living organism.

There are many strange features in the behavior of living organisms, as compared with dead structures. The evolution of species, as well as the evolution of individuals, is an irreversible process. The fact that evolution has been progressing from the simplest to the most complex structures is very difficult to understand, and appears almost as a contradiction to the law of degradation represented by the second principle. The answer is, of course, that degradation applies only to the whole of an isolated system, and not to one isolated constituent of the system. Nevertheless, it is hard to reconcile these two opposite directions of evolution. Many other facts remain very mysterious: reproduction, maintenance of the living individual and of the species, free will, etc.
A most instructive comparison is presented by Schrodinger (Schrodinger, pp. 3 and 78) when he points to similarities and differences between a living organism, such as a cell, and one of the most elaborate structures of inanimate matter, a crystal. Both examples represent highly organized structures containing a very large number of atoms. But the crystal contains only a few types of atoms, whereas the cell may contain a much greater variety of chemical constituents. The crystal is always more stable at very low temperatures, and especially at absolute zero. The cellular organization is stable only within a given range of temperatures. From the point of view of thermodynamics this involves a very different type of organization.

When distorted by some stress, the crystal may to a certain extent repair its own structure and move its atoms to new positions of equilibrium, but this property of self-repair is extremely limited. A similar property, but exalted to stupendous proportions, characterizes living organisms. The living organism heals its own wounds, cures its sicknesses, and may rebuild large portions of its structure when they have been destroyed by some accident. This is the most striking and unexpected behavior. Think of your own car, the day you had a flat tire, and imagine having simply to wait and smoke your cigar while the hole patched itself and the tire pumped itself to the proper pressure, and you could go on. This sounds incredible. It is, however, the way nature works when you "chip off" while shaving in the morning. There is no inert matter possessing a similar property of repair. That is why so many scientists (class B) think that our present laws of physics and chemistry do not suffice to explain such strange phenomena, and that something more is needed, some very important law of nature that has escaped our investigations up to now, but may soon be discovered. Schrodinger, after asking whether the new law required to explain the behavior of living matter (see page 556) might not be of a superphysical nature, adds: "No, I do not think that. For the new principle that is involved is a genuinely physical one. It is, in my opinion, nothing else than the principle of quantum theory over again." This is a possibility, but it is far from certain, and Schrodinger's explanations are too clever to be completely convincing.
There are other remarkable properties characterizing the ways of living creatures. For instance, let us recall the paradox of Maxwell's demon, that submicroscopical being, standing by a trapdoor and opening it only for fast molecules, thereby selecting molecules with highest energy and temperature. Such an action is unthinkable, on the submicroscopical scale, as contrary to the second principle. How does it
[Footnote: The property of self-repairing has been achieved in some special devices. A self-sealing tank with a regulated pressure control is an example. Such a property, however, is not realized in most physical structures and requires a special control device, which is a product of human ingenuity, not of nature.]
[Footnote: Schrodinger, p. 81.]
[Footnote: Wiener discusses very carefully the problem of the Maxwell demon (Cybernetics, pp. 71-72). One remark should be added. In order to choose the fast molecules, the demon should be able to see them; but he is in an enclosure in equilibrium at constant temperature, where the radiation must be that of the black body, and it is impossible to see anything in the interior of a black body. The demon simply does not see the particles, unless we equip him with a torchlight, and a torchlight is obviously a source of radiation not at equilibrium. It pours negative entropy into the system. Under these circumstances, the demon can certainly extract some fraction of this negative entropy by using his gate at convenient times. Once we equip the demon with a torchlight, we may also add some photoelectric cells and design an automatic system to do the work, as suggested by Wiener. The demon need not be a living organism, and intelligence is not necessary either. The preceding remarks seem to have been generally ignored, although Wiener says: the demon can only act on information received, and this information represents a negative entropy.]
[Footnote: Wiener, Chap. III, p. 76.]
become feasible on a large scale: Man opens the window when the weather is hot and closes it on cold days! Of course, the answer is that the earth's atmosphere is not in equilibrium and not at constant temperature. Here again we come back to the unstable conditions created by sunshine and other similar causes, and the fact that the earth is not a closed, isolated system. The very strange fact remains that conditions forbidden on a small scale are permitted on a large one, that large systems can maintain unstable equilibrium for large time-intervals, and that life is playing upon all these exceptional conditions on the fringe of the second principle.

Entropy and Intelligence

One of the most interesting parts in Wiener's Cybernetics is the discussion on "Time series, information, and communication," in which he specifies that a certain "amount of information is the negative of the quantity usually defined as entropy in similar situations." This is a very remarkable point of view, and it opens the way for some important generalizations of the notion of entropy. Wiener introduces a precise mathematical definition of this new negative entropy for a certain number of problems of communication, and discusses the question of time prediction: when we possess a certain number of data about the behavior of a system in the past, how much can we predict of the behavior of that system in the future? In addition to these brilliant considerations, Wiener definitely indicates the need for an extension of the notion of entropy. "Information represents negative entropy"; but if we adopt this point of view, how can we avoid its extension to all types of intelligence? We certainly must be prepared to discuss the extension of entropy to scientific knowledge, technical know-how, and all forms of intelligent thinking. Some examples may illustrate this new problem.

Take an issue of the New York Times, the book on Cybernetics, and an equal weight of scrap paper. Do they have the same entropy? According to the usual physical definition, the answer is "yes." But for an intelligent reader, the amount of information contained in these three bunches of paper is very different. If "information means negative entropy," as suggested by Wiener, how are we going to measure this new contribution to entropy? Wiener suggests some practical and numerical definitions that may apply to the simplest possible problems of this kind. This represents an entirely new field for investigation and a most revolutionary idea.
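Wiener's identification of information with negative entropy can be given a crude numerical face using the standard formula H = -Σ p_i log2 p_i for the average information per symbol. The Python fragment below is a minimal editorial sketch, not part of Brillouin's text; the two sample strings are arbitrary, chosen only to contrast ordinary prose with maximally repetitive "scrap".

    from collections import Counter
    from math import log2

    def entropy_per_char(text: str) -> float:
        # Average information per character: H = sum(p_i * log2(1/p_i)).
        counts = Counter(text)
        total = len(text)
        return sum((n / total) * log2(total / n) for n in counts.values())

    prose = "sun heat and rain make crops, crops provide food"
    scrap = "a" * len(prose)
    print(round(entropy_per_char(prose), 2))   # about 3.9 bits per character
    print(round(entropy_per_char(scrap), 2))   # 0.0 bits per character

The measure captures only statistical structure, not meaning, which is exactly the gap Brillouin goes on to discuss.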
Many similar examples can be found. Compare a rocky hill, a pyramid, and a dam with its hydroelectric power station. The amount of "know-how" is completely different, and should also correspond to a difference in "generalized entropy," although the physical entropy of these three structures may be about the same. Take a modern large-scale computing machinery, and compare its entropy with that of its constituents before the assembling. Can it reasonably be assumed that they are equal? Instead of the "mechanical brain," think now of the living human brain. Do you imagine its (generalized) entropy to be the same as that for the sum of its chemical constituents? It seems that a careful investigation of these problems, along the directions initiated by Wiener, may lead to some important contributions to the study of life itself. Intelligence is a product of life, and a better understanding of the power of thinking may result in a new point of discussion concerning this highly significant problem.

Let us try to answer some of the questions stated above, and compare the "value" of equal weights of paper: scrap paper, New York Times, Cybernetics. To an illiterate person they have the same value. An average English-reading individual will probably prefer the New York Times, and a mathematician will certainly value the book on Cybernetics much above anything else. "Value" means "generalized negative entropy," if our present point of view be accepted.

The preceding discussion might discourage the reader and lead to the conclusion that such definitions are impossible to obtain. This hasty conclusion, however, does not seem actually to be correct. An example may explain the difficulty and show what is really needed. Let us try to compare two beams of light, of different colors. The human eye, or an ultraviolet photo-cell, or an infrared receiving cell will give completely different answers. Nevertheless, the entropy of each beam of light can be exactly defined, correctly computed, and measured experimentally. The corresponding definitions took a long time to discover, and retained the attention of most distinguished physicists (e.g., Boltzmann, Planck). But this difficult problem was finally settled, and a careful distinction was drawn between the intrinsic properties of radiation and the behavior of the specific receiving set used for experimental measurements. Each receiver is defined by its "absorption spectrum," which characterizes the way it reacts to incident radiations. Similarly, it does not seem impossible to discover some criterion by which a definition of generalized entropy could be applied to "information," and to distinguish it from the special sensitivity of the observer. The problem is certainly harder than in the case of light. Light depends only upon one parameter (wave length), whereas a certain number of independent variables may be required for the definition of the "information value," but the distinction between an absolute intrinsic value of information and the absorption spectrum of the receiver is indispensable. Scientific information represents certainly a sort of negative entropy for Wiener, who knows how to use it for prediction, and
may be of no value whatsoever to a non-scientist. Their respective absorption spectra are completely different. Similar extensions of the notion of entropy are needed in the field of biology, with new definitions of entropy and of some sort of absorption spectrum. Many important investigations have been conducted by biologists during recent years, and they can be summarized as "new classifications of energies." For inert matter, it suffices to know energy and entropy. For living organisms, we have to introduce the "food value" of products. Calories contained in coal and calories in wheat and meat do not have the same function. Food value must itself be considered separately for different categories of living organisms. Cellulose is a food for some animals, but others cannot use it. When it comes to vitamins or hormones, new properties of chemical compounds are observed, which cannot be reduced to energy or entropy. All these data remain rather vague, but they all seem to point toward the need for a new leading idea (call it principle or law) in addition to current thermodynamics, before these new classifications can be understood and typical properties of living organisms can be logically connected together.

Biology is still in the empirical stage and waits for a master idea, before it can enter the constructive stage with a few fundamental laws and a beginning of logical structure. In addition to the old and classical concept of physical entropy, some bold new extensions and broad generalizations are needed before we can reliably apply similar notions to the fundamental problems of life and of intelligence. Such a discussion should lead to a reasonable answer to the definition of entropy of living organisms and solve the paradox of Bridgman (pages 563-564).

A recent example from the physical sciences may explain the situation. During the nineteenth century, physicists were desperately attempting to discover some mechanical models to explain the laws of electromagnetism and the properties of light. Maxwell reversed the discussion and offered an electromagnetic theory of light, which was soon followed by an electromagnetic interpretation of the mechanical properties of matter. We have been looking, up to now, for a physicochemical interpretation of life. It may well happen that the discovery of new laws and of some new principles in biology could result in a broad redefinition of our present laws of physics and chemistry, and produce a complete change in point of view. In any event, two problems seem to be of major importance for the moment: a better understanding of catalysis, since life certainly rests upon a certain number of mechanisms of negative catalysis; and a broad extension of the notion of entropy, as suggested by Wiener, until it can apply to living organisms and answer the fundamental question of P. W. Bridgman.
Reprinted with permission from The Lesson of Quantum Theory, pp. 295-314, 1986 © 1986 Elsevier Science Publishers, BV.
Physics, Biological Computation and Complementarity
John J. Hopfield
California Institute of Technology, Pasadena, California, USA
Contents
1. The domain of physics in biology
2. Logical, physical and biological computers
3. Neural computation
4. Classical neurodynamics
5. Beyond neurodynamics: complementarity
Summary
Postscript
References
Discussion
1. The domain of physics in biology

Biology as we know it lies within a restricted domain of physics. The laws of elementary particle physics and cosmology and the history of the universe serve merely to determine the nature of a planetary environment. The dynamical equations of quantum mechanics and quantum electrodynamics (and their classical equivalents when appropriate) are the essential elemental laws of physics which lead to biology. Some physicists make claims that "we shall never understand life until we understand the origins of the elementary particles". But the real mysteries of biology lie in the way in which these dynamical laws of physics, and the substrate of electrons, photons and nuclei on which they operate, produce the complex set of counter-intuitive phenomena labeled with the term biology. Biology is a problem in dynamics—an organism functions by irreversibly preying on the available free energy which it finds in its environment in order to maintain its dynamic state. Driven (or non-equilibrium) physical systems of simple components already show complex and almost unpredictable behaviors. Turbulence in fluid flow, deterministic chaos and fractal forms in snowflakes are a few of the complex phenomena which arise from the same simple physical laws and substrates that rule biology. Some of the basic unsolved problems of biology, such as the generation of
complex forms in large systems, can already be seen in these simple systems. Our understanding of such problems in physics is far from complete. The physics of large systems in equilibrium is unified and simplified through our understanding of the derived or secondary laws of statistical mechanics and thermodynamics. There is no comparably general theory of strongly non-equilibrium systems. The major conceptual problems in biology have the additional complication that the biological matter is itself very complex due to a long and selective evolutionary history.

Biology is not a quantum-mechanical problem. Because the masses of nuclei are large compared to the masses of electrons, the adiabatic separation of electron and nuclear motion in the Schrodinger equation is usually adequate. The electrons can then be effectively removed from the problem, leaving a problem of only nuclear motion with effective interactions of some complexity between nuclei. This remaining problem of nuclear motion is partly in the classical regime, and partly a quantum-mechanical problem. The hydrogen stretching vibrations are of large quantum energy compared with kT at room temperature, and are rigid in the molecular dynamics. The rotational quantum numbers are large at room temperature, and the rotational motion is thus essentially classical, as is much of the translational motion. The molecular dynamics of liquid water at room temperature can be described in classical terms as the motion of rigid molecules having a complex set of two- and three-body forces between them. Accurate descriptions of the viscosity, rotational relaxation and dynamic neutron diffraction of liquid water have been obtained from the computed dynamics of such a model of water. Non-equilibrium problems such as the turbulence of water are in essence problems of classical physics, as is biology. This is not to deny that there are intrinsic limitations in the knowledge with which we can know positions and momenta of nuclei. In this regard, the classical approximation ignores a quantum-mechanical limitation. But the essential mysteries, phenomena and complexity of biology are not a problem of Planck's constant. They are a problem of the large size of Avogadro's number combined with the non-equilibrium nature of the system. There do not seem to be any important larger quantum coherent aspects to living matter, contrary to the romantic hopes of some physicists of the 1930s.
2. Logical, physical and biological computers

Biology can be seen as a hierarchy of computations and computational devices. The translation of DNA into protein structure is a kind of computation. The construction of a complex organism from the instructions in DNA is also the following of an algorithm. The recognition of a familiar object is a neural computation. These diverse computations share some common characteristics because they share an evolutionary background. The species which exist in biology today are those which have survived the competition with other organisms and the cataclysms of weather and geology. At the other end of the size scale, the proteins which exist within a given species have survived a competition with different molecules which might have performed the same functions, and with the alternative of simply not having this
function performed at all. The fundamental survival and competition problem for the organism or for the enzyme molecule is to develop an algorithm for predicting the future, or more accurately, to develop a behavior such that physical processes (or actions) taken now will be likely to be appropriate to the future environment in which the organism or molecule finds itself, and promote the survival or reproduction of the organism. The organism which survives best will be that which most clearly "sees" the consequences of its possible present actions. The prediction of the future from present information is a kind of computation. To understand complex aspects of biology beyond the descriptive level, it becomes necessary to think about the computational aspects—what is computation and how does biological computation differ from that which we think about in conventional computers. Before delving into neurobiological computation, we will illustrate some of the issues.

Computation has three conceptual elements: an input, an output and a "device", which reads the input and produces the output. The particular output, or range of outputs, is the consequence both of the particular input and the nature of the computing "device". The classic conceptual device of computational theory is the Turing machine. This machine has several internal states. It can read an input-output tape, shift the tape, write output on the tape and change the internal state of the machine. Universal computation can be performed by such machines. Turing machines are intrinsically digital (or logical) machines, having a small number of reading and writing symbols and a finite number of internal states. When computation is a physical process, as in a real computer or in biology, each of these elements has a physical manifestation.

The description of computers as logical devices has been thoroughly developed in the past 50 years. A computer is also a physical dynamical system, which follows the laws and limitations of physics. The object of a computer designer is to develop a real dynamic system which behaves as similarly as possible to some given logical design. Physical systems have noise and imperfections not envisioned in the simple logical view of computer function. As a result, considerable effort in hardware design is invested in a problem of no logical concern, namely trying to get reliable computation in a noisy and fault-filled world. Biology compounds this problem by adding a unique view of the nature of errors and of logic itself. We will examine these points by an example from protein synthesis.

A growing cell is constantly producing more proteins [see, for example, Watson (1976)]. In this process, the information in a strand of messenger RNA is used as an instruction for making a protein. The input tape consists of a single strand of RNA. The output is a protein, a linear polymer of amino acids which then folds into a functional three-dimensional structure. The mRNA is a polymer made up of four kinds of units, A, U, G and C. The protein is a polymer made up of twenty different kinds of amino acids (glycine, alanine, tyrosine, ...), for example:

    AUG GGU CCA AAG AGC CUG UGG  UGA      mRNA
    Met Gly Pro Lys Ser Leu Trp  stop     protein
Protein synthesis is carried out at a polymolecular assembly of proteins and RNA called a ribosome, which is the "Turing machine" for the process. Many different
molecules participate in the protein synthesis on the ribosome, including tRNA, GTP and a host of co-factors. The ribosome reads the mRNA input tape and inserts the appropriate amino acids into the protein in sequence. The logical operation performed might be described as the following instruction set (a code sketch of the loop follows the list):

(1) Read the next three bases on the mRNA molecule "tape".
(2) Look up the corresponding amino acid in the genetic code "dictionary".
(3) Add that amino acid to the protein.
(4) Shift the input tape by three bases.
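The instruction set maps almost line for line onto a few lines of code. The Python fragment below is an editorial sketch, not part of Hopfield's article; the codon dictionary is deliberately truncated to the codons appearing in the example sequence, whereas the real genetic code has 64 entries.

    # Minimal sketch of the ribosome viewed as a Turing-machine-like reader.
    CODON_TABLE = {
        "AUG": "Met", "GGU": "Gly", "CCA": "Pro", "AAG": "Lys",
        "AGC": "Ser", "CUG": "Leu", "UGG": "Trp", "UGA": "STOP",
    }

    def translate(mrna: str) -> list[str]:
        protein = []
        position = 0
        while position + 3 <= len(mrna):
            codon = mrna[position:position + 3]      # (1) read three bases
            amino_acid = CODON_TABLE[codon]          # (2) look up the "dictionary"
            if amino_acid == "STOP":                 # "stop" instruction
                break
            protein.append(amino_acid)               # (3) add to the protein
            position += 3                            # (4) shift the tape
        return protein

    print(translate("AUGGGUCCAAAGAGCCUGUGGUGA"))
    # ['Met', 'Gly', 'Pro', 'Lys', 'Ser', 'Leu', 'Trp']

Run on the example sequence, it returns the seven amino acids listed above and halts at the UGA codon.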
The program contains a possible "stop" instruction in the code dictionary. The signal for starting is more complex. The logical operation of reading a particular next codon in the mRNA is actually carried out in a chemical reaction which knows nothing about logic. The reaction might schematically be written:

    peptide-n + correct amino acid → peptide-(n+1).

Correct refers to the amino acid in correspondence with the next mRNA triplet. There is, however, also a competing reaction

    peptide-n + incorrect amino acid → incorrect peptide-(n+1).

Within the logical view of computation this competing reaction simply "doesn't happen". From the point of view of chemistry, the competing reaction can happen, but has a different energy barrier for taking place. The reaction is rather similar to the one desired, and any enzymatic system which allows the correct reaction to take place must also allow the incorrect reaction, albeit with rather slower rates. The equilibrium constants for adding the correct amino acid and for adding the incorrect one are essentially identical. Equilibrium processes—physical processes which take place sufficiently slowly—result in useless proteins being formed because a particular incorrect amino acid is as likely to be added as a correct one. The logic which we would like the biological system to display—correct amino acid only—is not possible without errors, and is possible with low but finite errors only by deliberately running the system out of equilibrium. In non-equilibrium processes, the choice between products can be made on the basis of rates. The biological process must be designed to make use of kinetics to obtain accurate logical calculation. The qualitative description of biochemistry—that reactions take place because molecules fit together, and other reactions do not because the corresponding enzymatic binding does not happen—consists of logical statements, and by being only logical overlooks the essential non-equilibrium element to real physical computation.

Ordinary computers also make use of dissipation to obtain their accuracy. The course of a computation might be described as a motion in computer state space, or in a physical phase space. The initial data define a starting point in that space, and the computer is supposed to follow an appropriate and determinate path to the solution, represented by another point in that space. This is illustrated by the solid line with arrows in fig. 1. The effect of noise and imperfections is to cause the flow to deviate from the desired direction, and an accumulation of noise or statistical imperfections will result in a wrong answer.
Fig. 1. The dynamical trajectory of a computing machine in "computer state space", from the initial information to the correct result. In a noiseless, perfect machine the desired path would be directly followed. Noise and imperfections cause increasing divergence from this path, while restoration returns the trajectory toward the ideal one.
The way to avoid this problem is to introduce a physical process which squeezes the system back down onto the proper track in state space. If we can keep this compression going on, it will result in finding the correct answer in spite of noise. Compressing a bunch of trajectories in phase space down into a smaller volume is easily done by a dissipative process. This squeezing down is essential to avoiding errors in a system which has the physical possibility of making them. The reason that protein synthesis must be run out of equilibrium is exactly this necessity for compacting the occupied state space in a dissipative fashion to avoid errors. (In electronic digital computers, the electrical power necessary to do a computation is in principle bounded by this aspect, though in fact the power dissipation is far larger for reasons which have nothing to do with fundamental physics.) In digital computers, this idea is called restoration (von Neumann 1952, Mead and Conway 1980), and the fact that a nominal digital "one" (really a particular voltage level) will recover to the appropriate level after a transient perturbation is given to the circuit is an illustration of the presence of restoration. In thinking about the computations which are performed in neurobiology, we must expect that the system will show such restoration in an obvious fashion, or else the system would be unable to compute. Restoration—which gives a robustness against noise and computer imperfections—is a universal necessity to physical computers, while an irrelevancy in logical computers.
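Restoration is easy to mimic in a toy setting. The fragment below is an editorial sketch (the noise amplitude, threshold and number of stages are arbitrary choices): a nominal logic level is pushed through many noisy stages, once with a thresholding step after each stage and once without, and only the restored signal stays pinned at its nominal value.

    import random

    def noisy_stage(level: float, noise: float = 0.15) -> float:
        # Each stage adds bounded analogue noise to the signal.
        return level + random.uniform(-noise, noise)

    def restore(level: float) -> float:
        # Restoration: snap the signal back to the nearest nominal logic level.
        return 1.0 if level > 0.5 else 0.0

    restored = 1.0
    for _ in range(1000):
        restored = restore(noisy_stage(restored))   # stays exactly at 1.0

    drifting = 1.0
    for _ in range(1000):
        drifting = noisy_stage(drifting)            # random walk away from 1.0

    print(restored, round(drifting, 2))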
300 is about 1 error in 100 in difficult cases (Pauling 1957, Hopfield and Y a m a n e 1980). T h e protein synthesis system also has a proofreading system (Hopfield 1974), which consumes more free energy and adds another layer of restoration, resulting in error levels on the scale of 1/3000, a drastic level from ihe point of view of electronic machines, but apparently good enough for biology. T h e one additional oddity about biological computers is thai they do nol perform as well as possible. Streptomycin resistant mutants of bacteria proofread belter (Yates 1979) and are more accurate than the normal wild-lype bacteria, but are an artifact of an evolutionary history with streptomycin present in the growth environment. In the absence of streptomycin, the bacterium reverts to the wild type—which is less accurate. T h e same general kind of less-than-best accuracy has been also demonstrated in D N A synthesis in T 4 bacleriophage by Muzyczka et al. (1972). I n biology, being as accurate as possible in computations may be a loosing proposition! Biology is not interested in perfect logic. The possibility of making progress through random accidents—called creative thought when ihey occur in neurobiology, or evolution and special ion when they occur in molecular biology—seems to be an essential part of biological computation.
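The error figures quoted above can be bracketed by a one-line calculation. This is an editorial sketch, not a model from the article: it simply assumes that an idealised proofreading step re-applies the same 1-in-100 discrimination a second, independent time.

    f = 1.0 / 100     # intrinsic misreading probability ("1 error in 100")
    print(f)          # 0.01   : simple recognition alone
    print(f ** 2)     # 0.0001 : idealised second, independent pass
    # The observed figure of about 1/3000 lies between these two limits.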
3. Neural computation

We turn next to computation in neurobiology. The object is to understand how a set of neurons makes decisions, generates actions, generalizes and learns and profits from past experiences. The emphasis here must be on our understanding. To merely know the input-output relation of a set of neurons by exhaustive study is not satisfying. Worse, a clump of neurons with 100 input neurons and 100 output neurons might easily require more than 10^40 bits of information to characterize the input-output relation, making exhaustive study impossible. It would be equally unsatisfying to know the neural hardware in sufficient detail to be able to simulate the hardware on a monstrous digital computer and predict correctly the behavior of a neural system. This would correspond to being able to simulate the behavior of a classical gas of complex molecules, without the conceptual understandings brought to such gases by statistical mechanics and thermodynamics.
Perhaps the largest computational burden placed on our brains is involved in visual perception. The end result of this computation is our decisions about what we have seen. (It is appropriate to emphasize decisions, for decisions are the essence of computation. Strictly linear systems do not truly compute, although they can be very useful elements in a computational system.) Given a flash exposure to a typical visual scene, we note the presence of a few familiar objects, some rough characteristics of each, such as color, general size, etc. The immense supply of almost non-meaningful information—more than 10^9 bits of information were processed by the retinal cells which begin this calculation—is compressed into significant perceptual information, of which there are probably only a few thousand bits. In a digital machine such a computation would be done by making a very large number of sequential decisions, but somehow the essence of biological decisions seems to be rather more holistic, collective, or Gestalt. We want to understand how such decisions are made.
Aspects of neural computations in higher animals which are particularly puzzling from a physics viewpoint include:

(1) The system makes very effective use of its computational resources, whether measured by speed of calculation or volume or energy considerations.
(2) In such systems, the computations done by the system are very resistant to damage to the neural system (fail soft).
(3) Emergent properties such as self-awareness seem to be present.
(4) In higher animals, neurobiology manages to function without a determined circuit diagram.

There are two reasons to think that progress might be made in understanding neural function. First, the system is large, the connectivity between neurons is large, the behavior is somewhat insensitive to the destruction or misfunction of components, and the calculation seems to have a somewhat holistic character. This suggests that collective effects might be involved, and that a search for collective effects in neural networks [Little (1974); Little and Shaw (1978); Hopfield (1982)] and neural computation might be fruitful. (Note that this is not the way that most chip designers work—they do not make use of collective effects, and would attempt to suppress them if they ever were to be noticed.) Second, the system cannot be as complex as it might appear. It is true that a general module of neurons with 100 inputs and 100 outputs could require 10^40 bits to specify its behavior, and would require such a number if all we think of is to list them. But a general module would require also 10^40 bits of information to describe how to build it. A simple module, which can be described in perhaps 10000 bits, cannot produce a general input-output relation. It must produce a very special kind of input-output relation, which can be only apparently complex, not truly complex. (In this same fashion, the random-number generators which are used in computers appear to provide highly random numbers, but in fact generate very special sequences because the generators can be described by short programs.)
4. Classical neurodynamics

We must understand what computational or circuit facilities a nervous system has at its disposal in order to see what is a mystery and what is trivial. The anatomy of a "typical" cortical neuron is sketched in fig. 2. The morphological and functional diversity of such cells is very large. We will briefly review the electrophysiology of such cells, which is covered in detail in textbooks (see, for example, Kandel and Schwartz 1981). A small electrode can be inserted into a cell body, and the potential difference between the inside and the outside of the cell can be studied as a function of time. A typical result of such a recording is a baseline potential of about -90 millivolts, on which is superimposed a set of more or less stereotyped voltage "spikes" called action potentials, rising to a potential of about +50 millivolts, and being of about 1 millisecond duration. An action potential propagates from a cell body down an axon by an active regenerative process, and can thus propagate long distances without attenuation.
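Later modelling work often caricatures the behaviour just described with a leaky integrate-and-fire unit: the membrane potential relaxes toward rest, integrates its input, and emits a stereotyped spike whenever a threshold is crossed. The sketch below is an editorial illustration of that caricature, not a model taken from Hopfield's article; all numerical constants are arbitrary.

    # Leaky integrate-and-fire caricature of a membrane potential (editorial sketch).
    REST, THRESHOLD, SPIKE_PEAK = -90.0, -50.0, 50.0   # millivolts
    TAU, DT = 10.0, 0.1                                # milliseconds

    def simulate(input_current: float, steps: int = 1000) -> list[float]:
        v = REST
        trace = []
        for _ in range(steps):
            # Relax toward rest while integrating the injected input.
            v += DT * ((REST - v) / TAU + input_current)
            if v >= THRESHOLD:          # threshold crossing: stereotyped spike
                trace.append(SPIKE_PEAK)
                v = REST                # reset toward rest after the event
            else:
                trace.append(v)
        return trace

    spikes = sum(1 for v in simulate(input_current=5.0) if v == SPIKE_PEAK)
    print("spikes in 100 ms:", spikes)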
    p(A|BI) > p(A|I).     (3)
Since p(A|I) = p(AB|I) + p(AB̄|I), marginalizing over B (from the sum and product rules), and since p(AB|I) can be decomposed using the product rule, further inequalities can be derived. Popper now supposes that the statement "B supports A to the degree z" corresponds to the Bayesian assignment

    p(A|BI) = z.     (4)
It is then easy to conjure up a contradiction of the type "I am likely to drink tea. I am unlikely to drink coffee." (Both type (4).) "But, given a choice, I am more likely to drink coffee than tea." (Type (3).) Popper makes the point with dice. Popper concludes that the Bayesian view is inconsistent. But this is semantic confusion: the concept of support is different in (3) and (4). Suppose
(3) is taken as the definition of what it means for B to support A; this accords well with intuition. That done, the word means something else in (4), for no inequality is at hand. Let us illuminate this by showing that "supports" in (4) may not coincide with intuition. Suppose that A, B and I are such that

    0.99 = p(A|BI) < p(A|I).     (5)
There is no problem in arranging this: let, for example, A = "there will be a traffic jam in central London today", B = "there are no road works in central London at present", and I = "it is a working day". Then B is antagonistic to A; but according to Popper's qualitative statement of (5), it supports it to a degree of 0.99, i.e. very strongly. The lesson is that assignment of probability is distinct from comparisons of probabilities. Only probabilities, not their differences (or differences of their logarithms), satisfy Cox's axioms.

In summary, I believe the confusion prevailing over probability in philosophy is due to two factors. First, the philosopher's disposition is to ask "What is probability?", while the scientist seeks solutions to specific physical problems, asking instead "How can probability help me?" The general is always illustrated by the specific. Second, it could be that probability has had its day in philosophy. Philosophy bore the torch of Western learning and enquiry for centuries; but, as more became known, specialised areas of knowledge branched off from it. Science itself is the outstanding example; until relatively recently physics was known as Natural Philosophy. But with the underpinning of Cox, recognition of the dominant role of the Principle of Maximum Entropy, and the beginnings (in quantum statistical mechanics) of an operator-valued theory of probability, the day of the amateur - in the best sense, for philosophers are often eminent in several branches of their discipline - may be at an end.
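To make the traffic-jam instance of (5) wholly concrete, here are invented numbers (an editorial addition, not from the paper) realising it: conditioning on B lowers the probability of A slightly, yet p(A|BI) = 0.99 remains high, which a reading based on (4) alone would call strong support.

    # Invented numbers illustrating equations (3)-(5).
    p_B = 0.5                      # p(B|I)
    p_A_given_B = 0.99             # p(A|BI)
    p_A_given_notB = 0.999         # p(A|~B I)

    # Sum and product rules: p(A|I) = p(A|BI) p(B|I) + p(A|~B I) p(~B|I)
    p_A = p_A_given_B * p_B + p_A_given_notB * (1.0 - p_B)

    print(f"p(A|BI) = {p_A_given_B:.3f}")
    print(f"p(A|I)  = {p_A:.4f}")
    print("B supports A in the sense of (3)?", p_A_given_B > p_A)   # False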
3. Modern philosophy of science
This section critically surveys much 20th century philosophy's view of scientific methodology. For the basis of this material I am indebted to my former colleague at the University of Sydney, David Stove, now retired from its Department of Traditional and Modern Philosophy. David holds to the relevance of Carnap's distinction, but he is a defender of induction, and a formidable critic of deductivist philosophy of science.

Philosophy of science is best described as that which scientific endeavours have in common, but which non-scientific studies do not necessarily share. We should not suppose, though, that there is anything magic about science: it is simply a sustained application of common sense. And since common sense is consistent reasoning (in an appropriately chosen space), we find ourselves staring straight at Bayesian inductive probability. Bayes notwithstanding, modern philosophy of science has grown into a major pathology associated with the names Popper, Lakatos, Kuhn and Feyerabend. Describing it is my aim here, and as a preliminary arming I present a tentative flowchart of how science is done.
[Figure 2: A Broad Methodology For Science. A two-loop flowchart: a theoretical loop (EVOLVE A THEORY) feeds an experimental loop (EXPERIMENTAL TESTING); if the theory is disfavoured the flow returns to evolving the theory, and if it is favoured the flow proceeds to REDUCE NO. OF UNSPECIFIED PARAMETERS and continues round.]
In its basic form, the model is given in Figure 2. There is no endpoint, and so no "final answer". Instead, one continually refines theory and practice. The vexed debate over realism is circumvented by defining scientific truth as the asymptote towards which this process (in practice) converges. Laws of Nature are unknown and never change (assuming, reasonably, that they exist), but our approximations to them improve as we learn more. Since the loops always contain at least one inductive step, the whole process is inductive. This is no more than it should be: one can be almost certain that the Sun will continue to rise in the east, based on past observations (and the celestial mechanics constructed to explain them), but certainty is absent.

The model can be fleshed out to varying degrees, and that which I have found most illustrative in physical science is displayed in Figure 3. It is here that philosophy can help scientists, although they generally pursue such a strategy implicitly. A new intuitive leap can throw up a theory at any time. The resulting flowchart is welded to the old by applying the process to the union of the two theories. Unifying demonstrates directly that science is not dogmatically reductionistic: reductionism is simply a convenient way of implementing the strategy of Figure 3. The intuitive leap corresponds to a widening of the region of hypothesis space under consideration, a process for which there is as yet no theory, even in model problems. Meanwhile, the hotchpotch of guess, conjecture and imagination called intuition is precisely what distinguishes great scientists from the rest.

The inductive view of science goes at least as far back as the 12th century scholar Roger Bacon, and thence forward to Elizabeth I's courtier Francis Bacon. (Of course, William of Ockham's famous razor principle "Essentia non sunt multiplicanda praeter necessitatem" - entities should not be multiplied beyond necessity - is essentially Bayesian.) The distinguished British empirical philosopher David Hume (1711-1776) argued against induction, and Stove traces today's movement to this source [8].
[Figure 3: A Detailed Methodology For Physical Science. A flowchart elaborating Figure 2: an INTUITIVE LEAP BASED ON PRIOR INFORMATION and OCKHAM'S RAZOR leads to FORMULATE NEW THEORY; the theory is checked for internal consistency and formally against prior information, with REJECT THEORY if it is strongly disfavoured; otherwise DEDUCE A TESTABLE CONSEQUENCE, then EXPERIMENTAL TESTING, INFORMAL ANALYSIS OF RESULTS and TIGHTEN PROTOCOL AND CONTINUE, followed by FORMAL STATISTICAL ANALYSIS OF RESULTS (with a RECHECK of OTHER GROUNDS IN THE THEORY'S FAVOUR when the evidence is moderate); depending on the outcome, AFFIRM THEORY AS CURRENT MODEL, SEEK A THEORY WITH FEWER ARBITRARY PARAMETERS AND THE SAME PREDICTIONS OF PAST OBSERVATIONS, AWAIT IMPROVEMENTS TO EXPERIMENTAL TECHNOLOGY, WITHHOLD JUDGEMENT, or SEEK FURTHER TESTABLE CONSEQUENCES. Arrows are marked as deductive processes, well-defined inductive processes, or weakly defined inductive processes.]

More recently, Pierre Duhem (1861-1916) argued that theory and experiment never meet face-to-face, because in real science a prohibitive number of auxiliary assumptions are involved in reaching the interface [9]. Today this is called the Quine-Duhem thesis. On the inductive picture, extra assumptions are readily incorporated by setting up the prior distribution for their parameters, calculating the joint posterior distribution of these and the desired quantities from the data by using Bayes' theorem, and then marginalizing over the extra parameters to take them out.
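Written out schematically (an editorial gloss; the symbols θ for the quantities of interest, ν for the auxiliary-assumption parameters and D for the data are not used elsewhere in the paper), the prescription just described is

    p(θ, ν | D I)  ∝  p(D | θ, ν, I) p(θ, ν | I),        p(θ | D I)  =  ∫ p(θ, ν | D I) dν,

the integral (or sum) over ν being the marginalization that removes the auxiliary assumptions from the final inference.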
Karl Popper opened the modern era with Logik der Forschung in 1934. It is far more about philosophy of science than probability theory. Curiously though, Popper is regarded more by his profession, at least outside England, as a celebrity than a philosopher's philosopher. Summarising him is not easy: as with most cults, many meanings can be read into it. This is due to such contradictions as his acceptance of probability but rejection of induction; David Stove, in the first part of Popper and After: Four Modern Irrationalists [8], exposes the devices by which Popper (among others) lays his smokescreen. Popper's primary tenets can nevertheless be discerned. One is that all observations are "theory-laden". In Popper's own words, "sense-data, untheoretical items of observation, simply do not exist" [10]. It is difficult for Bayesians to express the depth of their disagreement with this. Data are data, be they a distraction (noise) or tracks in bubble chambers photographed at enormous ingenuity and cost. Whatever, they are incorporated into theory using Bayes' theorem. Of course, theories suggest which data to seek, but that is not at all the same thing; once found, data can be used to update the probability of any hypothesis whatsoever.

Popper also insists that science is deductive rather than inductive. Partly this is a terminological disparity, referring not to the overall process but to a single stage: deduction of the consequences of a hypothesis prior to testing. (Popper's scheme is often described as hypothetico-deductive.) But Popper does reject induction; we have seen already his rejection of the inductive view of probability in favour of other interpretations. Indeed, Popper has asserted that no theory ever becomes more probable when evidence in its favour is discovered, and that every scientific theory not only begins by being infinitely improbable, but always remains so [11]. The first of these statements directly denies Bayes' theorem. Underlying the second is the idea, seldom recognised, of the space in which probabilities are defined. This contains an infinity of competing theories, and before looking at their distinctive features we must assign each equal prior probability 1/∞, or zero. It seems that Popper is correct. But Bayesians recognise this as the problem of non-normalisable or improper priors, in hypothesis space.
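In symbols (an editorial gloss, with T1 and T2 standing for two rival theories and D for the data), Bayes' theorem gives

    p(T1 | D I) / p(T2 | D I)  =  [ p(D | T1 I) / p(D | T2 I) ] × [ p(T1 | I) / p(T2 | I) ],

so the two vanishing priors enter only through their ratio, which may be set to a finite value such as unity.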
The resolution is the same: though the prior is non-normalisable, Bayes' theorem gives for the posterior ratio a well-defined limit of 0/0 which may perfectly well be normalisable [5]. Bayesians can open The Logic of Scientific Discovery, and the first volume of its massive Postscript [12], expounding Popper's post-frequentist "propensity" view of probability (much closer to Bayesian, though never made plain), almost at random and illuminate the problems exposed.

The idea by which Popper is best known, and one of which most students are aware, is the doctrine of falsifiability. A hypothesis is only scientific if it is capable of being proved false by observation. This is an important idea, but it is baldly deductivist in restricting the concept to falsity but not truth; for in deductive logic a single counter-example can falsify a theory but no number of examples can prove it. In real science though, theories are not proved false (or true) with certainty. Instead, data are incorporated via Bayes' theorem into the posterior probability, which may approach zero or one. As it gets sufficiently close (a matter of taste), the theory is rejected or adopted. So the criterion is not falsifiability, but testability: that one can conceive of data which alter the probability of the hypothesis. Equivalently, the hypothesis must not be equally disposed to every datum. Stove traces
Popper's dictum of falsifiability to his early distaste, in Vienna, for dogmatic claims that ideas such as Marx's and Freud's were "irrefutable" [11,13]. The word game began here, though Marx at least understood the stakes long before. He wrote to Engels in 1857, concerning a historical event: "One can always get out of [making an ass of oneself] with a little dialectic. I have, of course, so worded my proposition as to be right either way" [14]. Popper does refer to testability although, deprived of inductive logic, he fails to nail down the idea. (Any statement pertaining to a theory, but which is not testable, is part of that theory's interpretation.) He also refers to degrees of falsifiability, and attempts to relate them to probability. This again is word-play: the ease with which a hypothesis is tested is a matter for technologists, not theoreticians.

It was the successor to Popper's London Chair, Imre Lakatos (1922-1974), who pointed out afresh that in real science, theories are never disproved (or proved) with certainty. Lakatos clearly reached this conclusion through contrasting the natural sciences with mathematics, in whose history he had worked [15]. But, following Popper in renouncing induction, Lakatos was left with no framework to hang his observation on. His own attempt to build one, a doctrine of research programmes [16], confuses philosophy with history of science.

Like many ideas given birth in eastern Europe, deductivism has become popular in America. Let us therefore examine the work of Thomas Kuhn (1922- ), another avowed anti-inductivist and the author of the hugely influential work The Structure of Scientific Revolutions [17]. Like Lakatos, Kuhn is a first-rate historian of science, who has written on the Copernican revolution in astronomy, and the black-body controversy which gave birth to the earliest quantum hypothesis. The Structure of Scientific Revolutions presents a cyclic view of how science evolves, beginning with a mass of unordered observations and competing theories, going into a quiescent stage after the triumph of one theory over the rest, followed by gradual breakdown into chaos again under the accumulation of anomalies from more stringent testing. Kuhn has bequeathed us one of today's fashionable words, paradigm, to describe the model prevailing during the quiescent stage. The history of science abounds with examples of this process; trouble again arises when it is combined with anti-inductivism as a philosophy. Close to the end of The Structure of Scientific Revolutions, Kuhn clearly echoes Popper's assertion that every theory is infinitely improbable, when he says that "we may have to relinquish the notion that changes of paradigm carry scientists . . . closer and closer to the truth" [18]. In other words, Kuhn believes that theories come and go as arbitrarily as fashions in clothing.
For anti-inductivists, the closer fit to observation of relativistic mechanics than Newton's counts for nothing. This singular ideology is deflated by applying it to progressively simpler problems: it can hardly be no more true that the Moon is made of rock than of green cheese. There is no doubting, though, that Popper, Lakatos and Kuhn all appear respectful to science. Consider finally the ideas of Paul Feyerabend (1924- ) [19,20]. Feyerabend describes his approach as "epistemological anarchism", and his slogan is "Anything Goes". Again, this derives clearly from the idea that all theories are equally invalid, though it is also an accurate distillation of the subjective Bayesian stance. And Popper's oldest criticism holds of it: it is not falsifiable!
Feyerabend is on record as stating that normal science is a fairy tale, and that equal time should be given to "astrology, acupuncture and witchcraft" [21] (though I do not know what unctions he seeks when ill). He is fond of categorising science with "religion, prostitution and so on" [19]. Feyerabend believes that science is just one of many internally consistent views of the world, and that the consequent choice between them should be made on social grounds. But while many systems are internally consistent, only one plugs consistently into the world of observations, and to reason systematically about that world we must use it: science. Ethical and social considerations may dictate which areas to study, but that is a different matter. Feyerabend's ideas have been brought forth by "the sleep of reason", and while they could probably only flourish in a society disillusioned with science (through its perceived misuse), they represent the logical culmination of the rejection of induction. For that is what Popper, Lakatos, Kuhn and Feyerabend have in common; and despite much mutual repudiation, it far outweighs their differences. The ideas of these four comprise a major stream in contemporary philosophy of science. It is an odd fact that scientists often quote these philosophers favourably [22]. Science magazine recently lauded Feyerabend's views as "a breath of fresh air" [21]. The explanation is undoubtedly a benign ignorance. Being prepared to revise hypotheses in the light of fresh information (an attitude politicians might heed) makes scientists easy prey. What, by contrast, could alter Feyerabend's opinion?

4. Conclusion
The objective Bayesian view is as capable of resolving problems concerning inductive logic in philosophy as it is in science. Difficulties are not conceptual, but merely technical: the huge spaces used in real problems, and the determination of prior probabilities in a wide variety of contexts [1]. In particular, the Bayesian view, applied to scientific methodology, produces a coherent, inductive philosophy of science. Non-inductive philosophies of science invariably lead to absurdities.

References and Notes

[1] E. T. Jaynes. 1983. Papers on Probability, Statistics and Statistical Physics, ed. R. D. Rosenkrantz. Synthese series, vol. 158. Reidel (Dordrecht).
[2] R. T. Cox. 1946. Am. J. Phys. 14, 1.
[3] K. R. Popper. 1959. The Logic of Scientific Discovery. Hutchinson (London). Translation of: Logik der Forschung (Springer, Vienna, 1934).
[4] R. Carnap. 1950. Logical Foundations of Probability. University of Chicago Press.
[5] E. T. Jaynes. 1968. IEEE Transactions on Systems Science and Cybernetics SSC-4, p. 227. Reprinted as Chapter 7 of [1].
[6] A. C. Michalos. 1971. The Popper-Carnap Controversy. Martinus Nijhoff (The Hague).
[7] K. R. Popper. 1954. Brit. J. Philos. Sci. 5, 143. Reprinted in reference [3], revised edition 1980, Appendix ix.
[8] D. C. Stove. 1982. Popper and After: Four Modern Irrationalists. Pergamon (Oxford).
[9] P. Duhem. 1954. The Aim and Structure of Physical Theory. Second edition, Princeton University Press. (Translation of second French edition, 1914.)
[10] K. R. Popper. 1968. In: Problems in the Philosophy of Science, eds. I. Lakatos & A. Musgrave, p. 163. North-Holland (Amsterdam).
[11] See: D. C. Stove. June 1985. Encounter, pp. 65-74.
[12] K. R. Popper. 1982-3. Postscript to The Logic of Scientific Discovery. Volume I: Realism and the Aim of Science (1983); Volume II: The Open Universe: An Argument for Indeterminism (1982); Volume III: Quantum Theory and the Schism in Physics (1982). Ed. W. W. Bartley III. Hutchinson (London). Note the title of Volume III; the schism occasioned by quantum theory is between past and present, not in physics as Popper asserts.
[13] D. C. Stove. 1982. "How Popper's Philosophy Began". Philosophy 57, 381. The source is an autobiographical detail by Popper in the summary paper "Science: Conjectures and Refutations", in: Conjectures and Refutations, Routledge & Kegan Paul (London), 1963.
[14] K. Marx. 1983. Collected Works, 40, 152. Lawrence & Wishart (London).
[15] I. Lakatos. 1963-4. Various papers, collected as: Proofs and Refutations: The Logic of Mathematical Discovery. Eds. J. Worrall & E. Zahar. Cambridge University Press, 1976.
[16] I. Lakatos. 1978. Philosophical Papers, Vol. I: The Methodology of Scientific Research Programmes. Eds. J. Worrall & G. Currie. Cambridge University Press.
[17] T. S. Kuhn. 1962. The Structure of Scientific Revolutions. University of Chicago Press. (Second edition, enlarged, 1970.)
[18] Reference [17], second edition, p. 170.
[19] P. K. Feyerabend. 1975. Against Method: Outline of an Anarchistic Theory of Knowledge. New Left Books (London).
[20] P. K. Feyerabend. 1987. Farewell to Reason. Verso (London).
[21] P. K. Feyerabend. 1979. Quoted in: Science 206, 534.
[22] T. Theocharis & M. Psimopoulos. 1987. Nature 329, 595; and 331, 384 (1988).
Reprinted with permission from Cognition as Intuitive Statistics, pp. 147-162, 1987. © 1987 Lawrence Erlbaum Associates, Inc.
Two Revolutions — Cognitive and Probabilistic; Is the Mind a Bayesian?

IS THE MIND A BAYESIAN?

After the inference revolution, a new question arose: Are the long sought-after laws of thought the laws of probability theory? And, to answer the new question, a new kind of problem was posed. These problems were constructed so that they could be answered by calculating probabilities, means, variances, or correlations. Restructuring was seldom necessary for the solution of such problems, and so this aspect of the theory of thought was held in abeyance. Consequently, the new vocabulary for understanding human reasoning was the vocabulary of the statistician; the new elements of thinking were numbers (probabilities), and the process of thinking itself was explained by statistical operations such as calculating likelihood ratios. The theoretical questions asked by experimenters, the problems posed to the subjects, and the explanations sought all reflected the fascination with probability theory and statistics. The new vocabulary made the study of inductive thinking into one of the fastest growing areas of the new psychology of thinking, revitalized by the cognitive revolution. Our concentration here is on the link between inductive thinking and probability theory to the exclusion of other issues such as deductive thinking (e.g. Johnson-Laird, 1983; Wason & Johnson-Laird, 1972) and computer models of problem solving (e.g. Dörner, 1983; Newell & Simon, 1972; Simon, 1979). In this section we deal with the question whether the mind is a Bayesian statistician.
Conservatism

Urns and balls have long been the stock-in-trade of the probabilists. For instance, when Laplace (1774/1878-1912) proved Bayes' theorem, he used the well-known urn filled with white and black balls as his illustration. In the 1960s, the urn-and-balls problems made their way into the laboratories of experimental psychologists, although often in a more contemporary terminology, as bookbag-and-poker chips problems (e.g. Edwards, 1966; Edwards, Lindman & Phillips, 1965). Consider for instance a typical problem posed to the subjects:

Imagine yourself in the following experiment. Two urns are filled with a large number of poker chips. The first urn contains 70% red chips and 30% blue. The second contains 70% blue chips and 30% red. The experimenter flips a fair coin to select one of the two urns, so the prior probability for
each urn is .50. He then draws a succession of chips from the selected urn. Suppose that the sample contains eight red and four blue chips. What is your revised probability that the selected urn is the predominantly red one? (Peterson & Beach, 1967, p. 32).
If your answer is around .75, you agree with most of the subjects. Compare the new bookbag-and-poker chips problems with those of the Würzburg school and Gestalt psychology. In Bühler's and Duncker's problems, calculation was seldom a useful tool; and if it was, it was not sufficient to find a solution. Now the answer asked for is a number, and calculation rather than restructuring is sufficient to go from the problem to the answer. How is that calculation done? Bayes' theorem gives an answer. Bayes' theorem is an elementary consequence of the definition of conditional probability given a mutually exclusive and exhaustive set of hypotheses. In the foregoing problem, there are two hypotheses: H₁, that the selected urn is the predominantly red one; and H₂, that it is the predominantly blue one. The answer asked for is the posterior probability p(H₁|D) that H₁ is true given the data D, that is, eight red and four blue chips. According to Bayes' theorem, the posterior probability is

p(H₁|D) = p(H₁) p(D|H₁) / p(D)        (5.1)

with p(D) = p(H₁) p(D|H₁) + p(H₂) p(D|H₂).
As described in chapter 1, p(H₁) and p(H₂) are the prior probabilities of the hypotheses H₁ and H₂, respectively, and p(D|H₁) and p(D|H₂) are called the likelihoods of the data D if H₁ and H₂, respectively, are true. Thus, Bayes' theorem gives a rule for revising a prior probability (or base rate, as it is often called) into a posterior probability after new data has been observed. The posterior probability will be greater than the prior if the ratio p(D|H₁)/p(D) exceeds 1, which is the case if the data has a high probability if the hypothesis H₁ is true. For the present problem, the likelihood of getting x = 8 red chips in a sample of n = 12 if H₁ holds is given by

p(D|H₁) = C(n, x) p₁^x (1 - p₁)^(n-x)        (5.2)

In words, p₁^x is the probability of drawing a sequence of x red chips (p₁ is the percentage of red chips in the urn H₁), and (1 - p₁)^(n-x) is the probability of drawing a sequence of (n - x) blue chips. Therefore, the product p₁^x (1 - p₁)^(n-x) is the probability of a sequence of x red and (n - x) blue chips. The number C(n, x) denotes the number of different orderings in which x red and (n - x) blue chips can occur. Thus, we can calculate the likelihoods:
p(D|H₁) = C(12, 8) × .7^8 × .3^4 = .231
p(D|H₂) = C(12, 8) × .3^8 × .7^4 = .008.

Inserting the likelihoods and the prior probabilities into Bayes' theorem, we calculate the posterior probability:

p(H₁|D) = .5 × .231 / (.5 × .231 + .5 × .008) = .967
Thus, after having observed eight red chips in a sample of twelve, we should revise the prior probability from .50 to around .97. Compared with this, the average subject was more conservative, revising from .50 to only .75. This "cautious" revision of the prior probability was called conservatism and was the main finding of that early research. Incidentally, the calculations for this simple bookbag-and-poker chip problem illustrate the complexity of the mental calculations that the reasoning mind is supposed to perform. By the late 1960s, it was concluded that conservatism was a persistent phenomenon, although some variables, such as the sequential order of the data and the introduction of incentives, influenced the degree of conservatism (Peterson & Beach, 1967; Peterson & DuCharme, 1967; Phillips & Edwards, 1966). In Edwards' (1968) words, "It takes anywhere from two to five observations to do one observation's worth of work in inducing a subject to change his opinions" (p. 18). The question whether the mind reasons like a Bayesian seemed to have found a consistent answer: The mind is a quasi-Bayesian, that is, a very conservative one. The mind mistrusts new data and gives greater weight to the prior probabilities.
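The arithmetic above is simple enough to check mechanically. A minimal Python sketch (the function name and default values are ours, added only for illustration):

    from math import comb

    def posterior_red(x, n, p_red=0.7, p_blue=0.3, prior=0.5):
        # Equations (5.1) and (5.2): binomial likelihoods combined with the prior.
        like_red = comb(n, x) * p_red**x * (1 - p_red)**(n - x)
        like_blue = comb(n, x) * p_blue**x * (1 - p_blue)**(n - x)
        return prior * like_red / (prior * like_red + (1 - prior) * like_blue)

    print(posterior_red(8, 12))   # ~0.967, against the subjects' median of about .75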
How to Explain Conservatism?
Peterson and Beach (1967) proposed that the mind systematically miscalculates equation (5.2), but not (5.1). That is, the mind first miscalculates the likelihoods, but then correctly uses Bayes' theorem, although with the wrong likelihoods. Edwards (1968) proposed the opposite explanation—that the mind correctly calculates equation (5.2) but miscalculates (5.1). In other words, the likelihoods are correct, but the mind misaggregates them with the prior probabilities instead of combining them according to Bayes' theorem. There were other interesting attempts to explain the phenomenon, such as the idea that the mind considers as relevant data the ratio of red to blue chips rather than their difference. Bayes' theorem, however, implies that only the difference counts: eight red and four blue chips (a difference of four) in a sample of twelve give the same result as five red and one blue chips in a sample of six (see Manz, 1970).

The important point here is to see where the explanatory concepts come from rather than to speculate which explanation might be correct. The psychological explanations now come from the vocabulary of probability theory and, in particular, from Bayes' theorem, like the research question and the problems posed. First, the facts that require explanation are the deviations from Bayesian reasoning: Had the subjects followed Bayes' theorem, no explanation would have been necessary. Bayes' theorem has become the frame of reference. Second, although human reasoning does deviate from Bayesian reasoning, the explanation is still sought in the vocabulary of Bayes' theorem, such as "miscalculating" likelihoods or "misaggregation." Like textbook problems in probability theory, these explanations ignore the content and context of the specific problem: it is the mathematical structure that counts. It is ironic that the whole phenomenon of conservatism disappeared when in the early 1970s Daniel Kahneman and Amos Tversky posed Bayesian problems with a content different from bookbags and poker chips. Subjects no longer seemed to reason conservatively about the new problems; indeed, they even seemed to neglect the prior probabilities.
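The claim noted above that only the difference between red and blue draws matters, and not their ratio, can also be verified numerically. A brief sketch under the same 70/30 assumptions (names ours):

    def posterior_red(x, n, p=0.7, prior=0.5):
        # The binomial coefficients cancel, so only the likelihood ratio is needed.
        lr = (p / (1 - p))**x * ((1 - p) / p)**(n - x)
        return prior * lr / (prior * lr + (1 - prior))

    print(posterior_red(8, 12))    # ~0.967 (difference of 4)
    print(posterior_red(5, 6))     # ~0.967 (difference of 4: same posterior)
    print(posterior_red(16, 24))   # ~0.999 (same 2:1 ratio, but a larger difference)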
Base Rate Neglect: The Kahneman and Tversky Program

There is an important difference between the kind of problems posed by Edwards, on one hand, and Kahneman and Tversky, on the other. Edwards' problems were pure applications of probability theory, whereas Kahneman and Tversky's problems, while still of the textbook sort, approximated real-life situations. The latter thus became simultaneously more interesting and, as we shall see, more ambiguous in their interpretation. Kahneman and Tversky (1973) started with the unquestioned belief that Bayes' theorem would give a single correct answer to the problems they investigated. Their problems were typically presented in questionnaires and distributed in classrooms and other group settings; subjects were typically high school students and college undergraduates. One of the best known problems is the Engineer-Lawyer Problem. Kahneman and Tversky (1973) told the following cover story to a group of students:

A panel of psychologists have interviewed and administered personality tests to 30 engineers and 70 lawyers, all successful in their respective fields. On the basis of this information, thumbnail descriptions of the 30 engineers and 70 lawyers have been written. You will find on your forms five descriptions, chosen at random from the 100 available descriptions. For each description, please indicate your probability that the person described is an engineer, on a scale from 0 to 100. The same task has been performed by a panel of experts, who were highly accurate in assigning probabilities to the various descriptions. You will be paid a bonus to the extent that your estimates come close to those of the expert panel. (p. 241)
A second group of students received the same instruction with inverted base rates (prior probabilities), namely, 70 engineers and 30 lawyers. All subjects were given the same personality descriptions. One of these thumbnail descriptions follows:

Jack is a 45-year-old man. He is married and has four children. He is generally conservative, careful, and ambitious. He shows no interest in political and social issues and spends most of his free time on his many hobbies which include home carpentry, sailing, and mathematical puzzles. The probability that Jack is one of the 30 engineers in the sample of 100 is ___ %. (p. 241)
Although the likelihoods—p(description|Engineer) and p(description|Lawyer)—are not specified here, the experimental situation is set up in such a way that it is still possible for Kahneman and Tversky to use Bayes' theorem to compute the posterior probabilities: They calculate the ratios of the odds in both groups, so that the likelihoods cancel out.¹

Bayes' theorem predicts that the posterior probabilities should be different for the two groups. Kahneman and Tversky found, however, that the mean responses in the two groups were for the most part the same and concluded that base rates were largely ignored.²
¹The likelihoods are cancelled out in the following way: The "prior odds" p(Engineer)/p(Lawyer) are 30/70 in the first and 70/30 in the second group. The likelihood ratios p(description|Engineer)/p(description|Lawyer) are assumed to be the same in both groups. If we use the symbols Ω₁ and Ω₂ for the prior odds in the two groups, respectively, and L for the likelihood ratio, then the ratio of the posterior odds Q₁ and Q₂ in the two groups is Q₁/Q₂ = Ω₁L/(Ω₂L) = Ω₁/Ω₂ = (30/70)/(70/30) ≈ .18. This means that the posterior odds p(Engineer|description)/p(Lawyer|description) in the first group should be only 18% of those in the second, or equivalently, the odds in the second should be more than five times as high as in the first (see Kahneman & Tversky, 1973).

²In fact, however, Kahneman and Tversky (1973) found a statistically significant difference between the two groups (p < .01). Nevertheless, they subsequently ignored this and concluded that the manipulation of the information about base rates had only a "minimal effect" and therefore that base rates were "largely ignored." Of course, they were quite right to forget about their null hypothesis test, since null hypothesis testing in this situation, with two competing predictions, is useless. Both Bayes' theorem and the "neglect of base rates" hypothesis specify predictions, and it is nonsensical to identify one with a null hypothesis. In fact, Kahneman and Tversky came to their conclusion by eyeballing (the results lie much closer to the predictions of the "base rate neglect" hypothesis than to those of Bayes' theorem); alternatively one might have adopted a symmetric hypothesis testing procedure such as Neyman and Pearson's. This example, like that of cognitive algebra in chapter 3, shows that null hypothesis testing not only may be useless, but may actually be misleading. In contrast to Anderson, Kahneman and Tversky realized this, but it is still a question why they performed the null hypothesis testing ritual in the first place.
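The cancellation described in footnote 1 can be written out in a couple of lines; the variable names are ours, and the likelihood ratio itself never needs to be known:

    prior_odds_group1 = 30 / 70      # told 30 engineers, 70 lawyers
    prior_odds_group2 = 70 / 30      # told 70 engineers, 30 lawyers
    # posterior odds = prior odds x likelihood ratio; the (unknown) likelihood
    # ratio is the same for both groups and cancels in the comparison below.
    print(prior_odds_group1 / prior_odds_group2)   # ~0.18: group 1's posterior odds
                                                   # should be about 18% of group 2's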
The explanation was that the subjects had arrived at the probability judgments by judging only the similarity between the description and the stereotype of an engineer, or, in other words, the degree to which the description is representative of the stereotype. This strategy was called the representativeness heuristic. The five personality descriptions included the following nondiagnostic sketch, which is particularly interesting because it was constructed to be totally uninformative with respect to the question whether the person is an engineer or a lawyer: "Dick is a 30-year-old man. He is married with no children. A man of high ability and high motivation, he promises to be quite successful in his field. He is well liked by his colleagues." (Kahneman & Tversky, 1973, p. 242). The median probability judgments for Dick being an engineer were 50% in each group, and the authors concluded that the subjects had ignored base rates even when the description was totally uninformative. The representativeness heuristic again reveals the dream of universal laws of thought—if not Bayes' theorem, then at least such universal heuristics as representativeness. This search for a few simple laws plays down the importance of the task for structuring thought. For instance, Ginosar and Trope (1980) repeated the Engineer-Lawyer study, but gave only one of the five personality descriptions (including the nondiagnostic sketch) to each subject. They concluded that in this situation the nondiagnostic sketch was assessed by the base rate. It seems that one particular feature of Kahneman and Tversky's task, namely the presentation of diagnostic descriptions before the nondiagnostic sketch, directed the reasoning about the nondiagnostic sketch. A second feature of Kahneman and Tversky's task was that no subject was given different base rates. But Fischhoff, Slovic, and Lichtenstein (1979) report that base rates have more impact if they are varied in the problem posed to a subject. The general structure of the task, and not only some general "law of thought," appears to determine performance.
Representativeness

What is the explanation for the new phenomenon, the so-called neglect of base rates? Kahneman and Tversky (1973; Tversky & Kahneman, 1982a) offer two major explanations, the representativeness heuristic and causal versus incidental base rates. The first explanation is derived from the Engineer-Lawyer Problem and similar problems. Kahneman and Tversky claim that their subjects arrived at their answer by a heuristic, or short-hand, strategy rather than by Bayes' theorem. This heuristic is called representativeness and is identified with the similarity between the description and the stereotype of an engineer. In their early writings, they
seemed to imply that the base rate fallacy necessarily results from the use of the representativeness heuristic. But soon it became clear that other problems, such as the Cab Problem (see below), could not be dealt with by a representativeness heuristic. Thus, representativeness might be only one condition for the neglect of base rates. As Kahneman and Tversky (1972) put it, "Representativeness, like perceptual similarity, is easier to assess than to characterize. In both cases, no general definition is available, yet there are many situations where people agree which of two stimuli is more similar to a standard" (p. 431). The term representativeness was used in the early writings with more than one meaning, as the authors themselves later realized (Tversky & Kahneman, 1982b). We shall therefore confine our discussion to the use of the term representativeness in the context of Bayesian-type problems of probability revision such as the Engineer-Lawyer Problem. Kahneman and Tversky (1973) distinguish between formal rules like Bayes', which specify how we should determine probabilities, and nonformal heuristics like "representativeness," which describe how we actually determine probabilities. Bayes' theorem is correct and is about probability; heuristics are mostly misleading and not influenced by probability. But as we shall show, this distinction is illusory. In fact, the representativeness heuristic boils down to computing probabilities using only likelihoods, without prior probabilities. It is therefore just as formal and potentially quantitative as Bayes' theorem itself, though not normative. We submit that this formal nature of a purportedly informal, qualitative heuristic shows the extent to which probability theory has permeated contemporary theory construction in this area of psychology. In the following section we show that the concept of a representativeness heuristic can be reduced to the concept of likelihood in Bayes' theory.
The Skeleton of a Heuristic

How can we show that representativeness is a redescription of base rate neglect rather than an explanation? It seems to us that in most, if not all, cases, representativeness or similarity is synonymous with likelihood. If this is true, the explanation will turn out to be merely a redescription of the phenomenon in Bayesian terms. What basis do we have for this claim? Let us consult Tversky and Kahneman (1982b), who dissect their original use of the term representativeness into two different senses: judgments by representativeness, the heuristic used for inference and prediction; and judgments of representativeness, the finding that people judge and expect samples to be highly representative of their parent
populations. It is the first meaning we are concerned with. They differentiate four basic cases of judgment by representativeness. For each, they characterize representativeness as a directional relation between a model or population and an instance or sample. This means that a sample is more or less representative of a particular population, but not vice versa. This general characterization is in accordance with that of a likelihood, which is a directional relation, p(D|H), between a sample D and a population or hypothesis H. Although Kahneman and Tversky believe that their subjects use representativeness as an alternative to probabilistic thinking, they themselves characterize at least the first three kinds of judgments by representativeness in terms of distributions (Tversky & Kahneman, 1982b, p. 87). Let us look at the four basic kinds in which the concept of representativeness is invoked. In the first case, representativeness is defined as a relation between a class H and the value D of a variable defined in this class. The given example speaks of (more or less) representative values of the income of college professors. The authors state that representativeness is determined here mainly by what the subject knows about the frequency distribution, and that a value D will be most representative if it is close to the mean. In this first case, representativeness is identical with the likelihood of an observed value D given a subjective distribution H, that is, of a likelihood p(D|H), where H is a one-dimensional distribution and D is a single observation. In the second and third cases, representativeness is defined as a relation between a class and an element or subset, respectively. Since Kahneman and Tversky see the second case as a special case of the third, we shall deal only with the latter. To use one of the authors' examples, students of astronomy are less representative of the entire student body than are students of psychology. Kahneman and Tversky contrast this to the first case by pointing out that we now deal with more than one attribute in the population and with more than one element in the sample. Thus, the first can be regarded as the unidimensional version of the third, as the authors themselves state. Therefore, representativeness is for case 3 identical with the notion of a likelihood p(D|H), where D is a sample rather than a single event and H is an unknown, subjective multidimensional distribution rather than a unidimensional one. In the last case, representativeness is defined as a relation between a causal system and a possible consequence. Here, it is no longer a class but a system that may produce a consequence D. For example, let H be pneumonia and D be high fever, which is frequently associated with pneumonia. This case seems to be equivalent to the unidimensional case 1, except that the subjective distribution of effects which specifies the likelihoods is now interpreted as "caused" by a system H. This specific
interpretation does not, however, change the formal identity of case 4 with case 1 and can therefore be subsumed under case 1.³ To summarize, representativeness seems to be in all cases reducible to likelihood. Thus, to say that the subject uses a representativeness heuristic for probability revision seems to be equivalent to saying that the subject uses the likelihood in Bayes' theorem, but not the prior probability. Our conclusion, therefore, is that Bayes' theorem provides the vocabulary for the explanation of the phenomenon, as it also did for the phenomenon itself. This clarifies our point that the explanation offered is but a redescription of the phenomenon. The phenomenon that is called "neglect of base rates" could just as well be called "neglect of base rates and nonneglect of likelihoods." The explanation, judgment by representativeness, can now be rephrased as "use of likelihoods and neglect of base rates." The explanation is a redescription of the phenomenon. Although Bayes' theorem has been dismissed by Kahneman and Tversky as a fundamental law of human thinking, the new candidate, representativeness, has inherited major attributes, since it stems from the same framework, as we have shown. These attributes are typical of formal and calculational approaches to human thinking: it is held that the representativeness heuristic operates independently of the content and the context of the problem; the heuristic has nothing to say about the process of information search; and it emphasizes rationality rather than passion.
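The point that judgment by representativeness is itself a formal rule, namely Bayes' theorem with the prior left out, can be made concrete. The likelihood values below are hypothetical and chosen only to illustrate the contrast:

    def bayes(prior, like_engineer, like_lawyer):
        return prior * like_engineer / (prior * like_engineer
                                        + (1 - prior) * like_lawyer)

    def representativeness_only(like_engineer, like_lawyer):
        # likelihoods only; the base rate plays no role
        return like_engineer / (like_engineer + like_lawyer)

    like_e, like_l = 0.8, 0.2        # hypothetical likelihoods for one description
    for prior in (0.30, 0.70):       # the two base-rate groups
        print(prior, round(bayes(prior, like_e, like_l), 2),
              round(representativeness_only(like_e, like_l), 2))
    # Bayes: 0.63 and 0.90, so the answer moves with the base rate;
    # representativeness-only: 0.80 in both groups.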
Independence of Human Thinking from Content. The assumption is that human thinking should be directed by the formal structure of the problem, not by the content. Along with their conviction that Bayes' theorem should be used in all problems that seem to have a Bayesian structure, independent of the actual content, Kahneman and Tversky (1972) have the parallel conviction that people use a representativeness heuristic for all contents: "Although our experimental examples were confined to well-defined sampling processes (where objective probability is readily computable), we conjecture that the same heuristic plays an important role in the evaluation of uncertainty in essentially unique situations where no 'correct' answer is available" (p. 451). Examples that follow are the judgment of the probability that a particular 12-year-old boy will become a scientist, that a company will go out of business, and that a politician will be elected for office. However, in the recent past it has become more and more evident that if the structure of the problem
³The problem with Tversky and Kahneman's (1982b) distinction into four kinds of judgments by representativeness is that their own formal structure (p. 87) does not make sense in the light of all the examples they give. Here we have concentrated on those examples which fit into their own formal structure.
posed is held constant, but the content is changed, responses differ. That content is crucial for the laws of human reasoning has been found in problems concerning both inductive and deductive thinking (e.g., see Einhorn, 1980; Evans, 1984; Griggs, 1983; Pollard, 1982; Wason, 1983). For instance, in their research on deductive thinking and the understanding of negations, Wason and Johnson-Laird (1972) also started with the assumption that the content they invented to illustrate a structure, say a syllogism or a double negation, would not affect the operation of the laws of thought: "For some considerable time we cherished the illusion that this was the way to proceed, and that only the structural characteristics of the problem mattered. Only gradually did we realize first that there was no existing formal calculus which correctly modelled our subjects' inferences" (pp. 244f.). Kahneman and Tversky (1982) also made a retreat, realizing that content is crucial and that, consequently, human reasoning cannot be described by content-independent formal rules. However, what has not been clarified is that judgments by representativeness—because they are in essence judgments by likelihood—are as content independent as formal Bayesian reasoning.

Independence of Human Thinking from Context. To claim that human thinking is directed by a general-purpose heuristic such as representativeness implies that the context of the problem is not very important for a theory of thinking. For instance, the wording of the problem and the particular example given are contextual variables. As we pointed out earlier, the proponents of the Würzburg and the Gestalt school concluded from their experiments that these contextual variables give an impulse or a direction to the flow of thought. Recently, the role of these "irrelevant" variables has again come to be appreciated (e.g. Berkeley & Humphreys, 1982; Crocker, 1981; Einhorn, 1980). For instance, consider the additional information in the Engineer-Lawyer Problem that a panel of experts were "highly accurate" in the same task and that "you will be paid a bonus to the extent that your estimates come close to those of the expert panel." It seems that Kahneman and Tversky added this paragraph to increase the motivation of their students to fill out their questionnaires more carefully. In fact, the impact may be far greater. The subjects may understand from the success of the experts that the personality descriptions are highly informative if only one knows how to read them. Thus they might conclude that there is only one strategy to win the bonus, namely, to concentrate on the description and to forget about the base rates. Whether this explanation of the base rate neglect in terms of contextual variables rather than general heuristics is true or not can be easily checked by repeating the experiment with different information. The general point, however, is that the representativeness heuristic, like Bayes' theorem, seduces one into believing that thinking is directed by a
few universal laws of thought. But it is becoming increasingly evident that thinking, like perception (see e.g. Birnbaum, 1982), is directed by the specific context in which a problem is presented.

The Blind Spot: Information Search. Statistical hypothesis testing does not start until the variables and numbers needed for the formulas are available. It is not concerned with the preceding measurement process, with the questions, what is the relevant information we shall look for? and how shall we measure the variables which we consider relevant? Like statistical theories, the representativeness heuristic does not deal with the process of information search. We have shown that it stands in the framework of statistical problems, where thinking means mental work on prepackaged information. When the subject enters the laboratory, most if not all of the process of information search has already been done by the experimenter. In contrast, the older meaning of heuristic tool in Duncker's theory was confined to search for new information and new functional relationships. Heuristic tools were "looking around," "inspection," and "selection," and the outcome of heuristic processes were "restructurings" or "judgments of relevance," as recently reemphasized by Evans (1984).

Passionless Irrationality. The belief that Bayes' theorem is the manifestation of rationality has its mirror image in the link between representativeness and irrationality. Since the representativeness heuristic was assumed to be used in almost every content and context, it seemed to provide a cognitive explanation for many a human error, even those usually explained by people's emotions and passions. From base rate fallacy to self-serving attributions to ethnocentric prejudices—such phenomena were now understood as the result of a relatively passionless mind, handicapped by its ignorance of the laws of probability (e.g. Nisbett & Ross, 1980). The parsimonious heuristic user is seen as "governed by a consistent misperception of the world rather than by opportunistic wishful thinking. Given some editorial prodding, he may be willing to regard his statistical intuitions with proper suspicion and replace impression formation by computation whenever possible" (Tversky & Kahneman, 1971, p. 110).
Causal versus Incidental Base Rates

Kahneman and Tversky posed a second type of problem to their subjects, also of a Bayesian structure, but one in which the likelihood could not be easily interpreted as similarity. Therefore, the supposed base rate neglect could not be explained by representativeness. One of the best known examples is the Cab Problem. Since the publication of the first results around 1972, different versions have appeared; the present one is from Tversky and Kahneman (1980).
A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:
(i) 85% of the cabs in the city are Green and 15% are Blue.
(ii) A witness identified the cab as a Blue cab. The court tested his ability to identify cabs under the appropriate visibility conditions. When presented with a sample of cabs (half of which were Blue and half of which were Green) the witness made correct identifications in 80% of the cases and erred in 20% of the cases.
Question: What is the probability that the cab involved in the accident was Blue rather than Green? (p. 162)

This problem has the same formal Bayesian structure as the urn-and-pokerchip problems, only the content has changed. The predominantly blue urn became the Blue cab company, and the chips drawn became the witness report. In contrast to the pokerchip problem, the likelihoods are already numerically given (.80 and .20). From Bayes' theorem, Tversky and Kahneman calculated a probability p(B|"B") = .41 that the cab was Blue (B) given the witness report "Blue" ("B"). If the subjects were "conservative" Bayesians, they should answer about .25-.30, in between the prior probability (base rate) for Blue (.15) and the Bayesian solution. Tversky and Kahneman (1980), however, report that several hundred subjects gave a modal and median response of .80. Since this median answer coincides with the credibility of the witness (the likelihood p("B"|B)), they conclude that their subjects ignored the relevant base rates. The same conclusion was drawn in several studies (see Tversky & Kahneman, 1982a), for undergraduates as well as for professionals. For example, Casscells, Schoenberger and Grayboys (1978, cited in Tversky & Kahneman, 1982a) posed the following problem to 60 students and staff at Harvard Medical School: "If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person's symptoms or signs?" (p. 154). If one assumes that the test correctly diagnoses this disease in everyone who has it, one can calculate the posterior probability by Bayes' theorem, which is 2%. Of the 60 students and staff members, 11 gave this answer, whereas the average answer (mean) was 56% and the most common answer (mode) was 95%! This range amply illustrates the large interindividual differences often found in such studies. The conclusion drawn was that "even highly educated respondents often fail to appreciate the significance of outcome base rate in relatively simple formal problems" (Tversky & Kahneman, 1982a, p. 154).
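Both figures cited above follow from the same two-hypothesis form of Bayes' theorem; a minimal sketch (the function and argument names are ours):

    def posterior(prior, p_evidence_if_true, p_evidence_if_false):
        return (prior * p_evidence_if_true /
                (prior * p_evidence_if_true + (1 - prior) * p_evidence_if_false))

    # Cab Problem: 15% Blue cabs, witness correct 80% of the time
    print(posterior(0.15, 0.80, 0.20))    # ~0.41

    # Medical test: prevalence 1/1000, perfect sensitivity, 5% false positive rate
    print(posterior(0.001, 1.00, 0.05))   # ~0.02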
What is the explanation of this phenomenon? First, it is necessary to be clear about what the phenomenon itself is. The phenomenon observed is the mean, median, or modal probability judgment. The phenomenon is not the neglect of base rates. Like "conservatism," the latter is already an interpretation of the observed judgments in terms of Bayes' theorem. Nevertheless, Tversky and Kahneman (e.g. 1980, pp. 61ff) talk about "the phenomenon of base-rate neglect." This phrasing is quite interesting, suggesting that Bayes' theorem has become such a ubiquitous background to theorizing that the observed fact is immediately interpreted in terms of Bayes' theorem. This interpretation now appears as the fact itself; any observed median probability can be immediately located on a scale with the base rate and the likelihood on the extremes, where the observation appears as either neglect of base rates or overweighting of base rates (conservatism). Thus, the fascination with probability theory has already penetrated the facts observed.
The "Explanation" Analyzed. What is the explanation for the phenomenon called the neglect of base rates? Since the reliability of the witness can hardly be called a similarity, Kahneman and Tversky cannot explain this supposed neglect of base rates by a representativeness heuristic. Instead they offer a second explanation: "We propose that the phenomenon of base-rate neglect largely depends on whether or not the evidence is given a causal interpretation" (Tversky & Kahneman, 1980). What does this mean? For illustration, the authors give the following variation of the Cab Problem, where information (i) in the previous Cab Problem is replaced by the following information: (i') "Although the two companies are roughly equal in size, 85% of cab accidents in the city involve Green cabs, and 15% involve Blue cabs" (p. 63). The difference concerns the question, base rate of what? In the original instructions, base rates of Blue cabs in the city were specified, whereas in the new instruction base rates of Blue cabs that were involved in accidents were specified. With this new base rate, the median answer of subjects was .60, as compared with .80 previously. Intraindividual variability again was large. Since this median answer lay in between the reliability of the witness (.80) and Bayes' solution (.41), the authors concluded that the base rate was no longer ignored. The explanation given is that there are two different types of base rates, causal and incidental. The "city" base rate in (i) is called incidental because the greater base rate of Green cabs in the city "does not justify a causal inference that makes any particular Green cab more likely to be involved in an accident than any particular Blue cab." The "accident" base rate in (i') is "causal because the difference in rates of accidents
between companies of equal size readily elicits the inference that the drivers of the Green cabs are more reckless and/or less competent than the drivers of the Blue cabs" (Tversky & Kahneman, 1982a, pp. 157-158). But is there also a causal link between a driver's competence and the criminality of fleeing the scene of the accident? A lack of a causal link would imply that the "accident" base rate is also "incidental." One subject's incidental rate might be another's causal rate. We shall return to this question shortly. What does it mean to explain the neglect of base rates by saying that the base rate was incidental? An urn analogy may be helpful here. Imagine there are Blue and Green urns, each filled with a mixture of witness reports "B" or "G." There are different possible experiments, two of which are called "urns in the city" and "urns in accidents." In each experiment, first one urn is randomly drawn from all urns, and then a witness report is randomly drawn from that urn. The experiments differ only in that different populations of Green and Blue urns are used. Imagine you are the subject in the accident experiment. Your task is to guess the color of the urn from which the witness report has been drawn. We tell you that a report B was drawn, and we tell you the proportion of reports B and G in the Blue and Green urns, respectively. We do not give you the base rates of Blue and Green urns in the accident experiment, but, by mistake, we inform you about the base rates in the city experiment. The latter do not refer to your experiment. This is exactly the meaning of incidental base rates, and you may be well advised to ignore such base rates. We now repeat the accident experiment with another subject, correct our mistake, and give the subject information about the appropriate accident base rate. This is the situation to which Kahneman and Tversky refer as causal base rates. Thus, it becomes clear that the distinction between causal and incidental base rates boils down to whether the relevant base rates for one experiment or another are given. This simple analysis of the nature of the new explanation has an interesting consequence: The phenomenon disappears, and the subjects may even be rehabilitated as good Bayesians.

The Phenomenon Disappears. Tversky and Kahneman (1980) still believe that a posterior probability of .41 is "the correct answer" to the original version of the Cab Problem, where the city base rates were given. However, as we have argued, they did not specify the relevant base rate of "cabs involved in hit-and-run accidents," but only those of a different "experiment." In this situation, a Bayesian might either use the base rates of the city experiment as the unknown prior probabilities and hope that both happen to be equal, or, for lack of better knowledge, fall back on the principle of indifference and assume that as many Green drivers as Blue
drivers may commit hit-and-run crimes. In the latter case, a Bayesian would calculate a posterior probability of .80, which, surprisingly, is identical to the median answer given by the subjects. The conclusion from this would be that the subjects in fact behaved like Bayesians who were forced to apply the principle of indifference. The phenomenon therewith disappears. The point is, however, that there is no way to decide about the correct answer. Let us turn to the revised Cab Problem, where the base rates of cab accidents were given. Kahneman and Tversky now insert the accident base rate instead of, as before, the city base rate into the formula and again calculate .41 as the correct answer (since the numbers of the two base rates have been simply exchanged). Note again that the posterior probability asked for is concerned with the hypothesis that a Blue cab was in a hit-and-run situation, not in just any accident or in the city. The relevant hit-and-run prior probability is again unspecified. Why do the authors estimate the missing prior probability using the base rates of accidents rather than of cabs in the city, or by applying the principle of indifference? There is no discussion of this choice. As we pointed out earlier, this choice means inferring lack of honesty from lack of competence. This may be true, or it may not be. For instance, drivers who constantly have accidents may be so well adapted as not to panic and drive off, whereas the driver who never had an accident may be nervous and excited and prone to speed away. In these cases, it would be unreasonable to insert the accident base rates into the formula. But we do not know. The point is not the plausibility of such alternative causal accounts, but rather that the choice of which base rate to use must be defended—otherwise we risk mixing up experiments, as in the urn analogy. Consequently, there is no unique way to calculate the posterior probability in this experiment, since the prior probability is unspecified. Kahneman and Tversky's calculation of .41 is only one alternative. Therefore, the phenomenon disappears again in this second study, since one cannot conclude that subjects neglect base rates or that they neglect them to a lesser degree, if the base rate itself is not specified. Finally, if we assume that the subjects should apply the principle of indifference, then the median judgment of .60 even appears as an instance of conservatism.
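How strongly the "correct answer" depends on the unspecified prior can be seen by varying it; a brief sketch (the intermediate value of .30 is ours, added only to show the gradient):

    def posterior_blue(prior_blue, hit_rate=0.80, false_rate=0.20):
        # p(Blue | witness says "Blue") for a given prior p(Blue)
        return prior_blue * hit_rate / (prior_blue * hit_rate
                                        + (1 - prior_blue) * false_rate)

    for prior in (0.15, 0.30, 0.50):
        print(prior, round(posterior_blue(prior), 2))
    # 0.15 -> 0.41 (the base rate Kahneman and Tversky insert)
    # 0.30 -> 0.63
    # 0.50 -> 0.80 (principle of indifference; the subjects' median answer)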
The general lesson to be learned from this is twofold. First, the explanation of the neglect of base rates by incidental versus causal base rates can be reduced to an elementary rephrasing of Bayes' theorem: To revise a prior probability into a posterior probability and to assume that this has a single correct answer implies that the relevant base rate is specified. If not, as in "incidental" base rates, it is nonsensical to talk about a neglect of the base rate. Second, the more the urns in the example take on a specific content, the more ambiguity arises about which base rate is the relevant one and therefore the more doubtful it is that there exists a single correct Bayesian answer. We shall return to this fundamental issue in the next section.

To summarize, this new psychology of thinking is saturated with probability theory; Bayes' theorem gives the framework for the question posed, the problems constructed and even the vocabulary for the phenomenon "observed." Prior probabilities and likelihoods are the new elements of human reasoning. From this perspective, if the mind is not a Bayesian, there are only two possible alternatives: the neglect of base rates, or the neglect of the likelihoods. The former is the result presented by Kahneman and Tversky; the latter has been concluded from the earlier bookbag-and-poker chips studies. In fact, Kahneman and Tversky's explanations, although they appear informal, can also be reduced to these two new elements of the mind. To explain the neglect of base rates by saying that the subject uses a representativeness heuristic is in essence saying that the subject uses the likelihood in Bayes' formula, but not the base rate. Therefore, the supposed explanation is simply a redescription of the supposed phenomenon. The second explanation, causal versus incidental base rates, also boils down to Bayes' formula: To explain the neglect of base rates by saying that the base rates specified in the problem were incidental rather than causal is in essence saying that the relevant prior probabilities were not specified in the first place. We have repeatedly pointed out how the descriptive and explanatory conclusions hinge on Bayes' formula as the normative background. We have also repeatedly shown how this entire perspective depends on the assumption that Bayes' theorem is normative. Not only the explanations, but the descriptions of the phenomena themselves are colored by this assumption, as we have seen. However, it is only by neglecting content, context, and information search that this normative assumption is made tenable. In sum, Bayes' theorem has become the framework for reasoning about reasoning.
1d Some Perceptual Facts and Issues
The two texts of this section are so well conceived and clearly written that they stand by themselves as good reading. Furthermore, they provide here an appropriate conclusion to the introductory chapter, and a smooth transition to the next. Both texts address fundamental issues concerning the nature of the human mind that have long puzzled philosophers, and both bring neatly argued and convincingly simple — even though still incomplete — answers to these major questions. In article (13), a chapter from his posthumous book, Mind from Matter? (a title that might well have been chosen as a subtitle for this anthology), Max Delbrück presents the remarkable phenomena of colour, position and size constancy in human perception. He shows that our perceptual apparatus constructs a useful model of the external world, so that primary sense impressions do not have access to consciousness. He stresses an essential distinction between phylogenetic and ontogenetic learning, and ends with a brilliant conclusion about science and perception. Article (14) is a pictorial, highly informative, introduction to brain structures and functions. Blakemore argues about the functional significance of the spatial arrangements of activity in the brain. Why have ordered maps? The existence of isomorphic and non-isomorphic maps is well documented and used as an explanatory clue. Many experimental facts are described; thus this review may serve as a precious outline for a large number of papers in the subsequent chapters. The conclusion deals, in a sober and sound fashion, with recurrent issues concerning the relations between science and language. During his progression through this volume, the reader will often find useful a return to the clear introductions provided by these two articles.
Reprinted with permission from Mind from Matter?, pp. 109-119, 1986. © 1986 Blackwell Scientific Publications, Inc.
Perception
One might naively imagine that visual perception amounts to the conscious mind looking at the image on the retina. This is obviously not so; between retina and consciousness there are many processing steps that progressively boil down the information provided by the pattern of neuronal excitation in the retina. We will now move to yet more abstract levels of the perception process, which illustrate some general principles of adaptive brain evolution. These levels reflect the capability of the human cerebral cortex to filter and process the visual input to give "objective," observer-independent information about an object being observed. This capability was not worked out by neurophysiologists with their electrodes but "psyched out" by perceptual psychologists. It is manifest in numerous perceptual constancy phenomena, of which we will discuss three examples. In each of these phenomena the perceptual apparatus of the cortex extracts objectified information from the visual input.
1. The first example is the constancy of the perceived color of an object, irrespective of the color of the illuminating light. We perceive that an object has the same color, whether we see it in the bluish light of morning, in the reddish light of evening, or in the yellow light of a fire or of an incandescent lamp. (One of the rare situations in which the perceptual apparatus for color vision is grossly fooled is that of a scene illuminated by a monochromatic light source, such as the yellow sodium lamps used for lighting streets and highways. Monochromatic light is, of course, highly unnatural and was not present in the environment while the human
125
110
perceptual apparatus evolved.) T h u s , u n d e r natural conditions, the real color of a n object is perceived regardless of the color of the illuminating light. T h i s is possible because what w e actually perceive as the color of an object is its property of absorption a n d reflection of the r e d , green a n d blue components of polychromatic light, in relation to other objects in the scene. T h i s abstraction of the color of an object from its relative absorbance a n d reflectance of the spectral components of the illuminating light is performed preconsciously, by the intuitive use of the concepts of "white" (meaning light i n w h i c h all spectral colors are equally represented) a n d "complementary color" ( m e a n i n g a member of the color pairs r e d and green, or blue and yellow, w h o s e mixture is perceived as white light). In 1925, E w a l d H e r i n g p r o p o s e d that to assess the color of various objects in a scene, the perceptual apparatus s u r v e y s the whole field of vision a n d defines one object as w h i t e , that is, as reflecting equally all colors of the visible spectrum. T h e light reflected from all other objects is then interpreted relative to the spectral composition of the light reflected by the white object. To make that interpretation, the perceptual apparatus can be thought of as a d d i n g p h a n t o m light of a color complementary to that of the illuminating light so that the object defined as white is actually perceived as white. For instance, if the illuminating light is predominantly red, the perceptual apparatus adds phantom green light to it, w h i c h makes a white object look white rather than red. T h e addition of phantom light of complementary color to the actual light source then provides not only for the perception of the white object as white but also for the reasonably accurate perception of the real colors of ail other objects.
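A minimal sketch may make Hering's proposal concrete. The code below is one reading of the idea, not an algorithm given in the text: the object taken to be white defines the spectral composition of the illuminant, and the colors of all other objects are re-expressed relative to it (adding "phantom light" of the complementary color then corresponds simply to dividing by the illuminant, channel by channel).

```python
# Sketch of white-reference normalization (one reading of Hering's 1925 proposal).
# The scene values are hypothetical R, G, B intensities reaching the eye.
def perceived_colors(scene_rgb, white_index):
    illuminant = scene_rgb[white_index]  # light reflected by the object defined as white
    return [tuple(c / max(ref, 1e-9) for c, ref in zip(rgb, illuminant))
            for rgb in scene_rgb]

# Reddish evening light: every raw signal reaching the eye is biased towards the R channel.
scene = [(0.9, 0.5, 0.4),    # object 0: defined as white, so it reflects the illuminant as-is
         (0.45, 0.25, 0.2),  # object 1: a mid grey under the same light
         (0.9, 0.1, 0.05)]   # object 2: a red object
print(perceived_colors(scene, white_index=0))
# object 0 -> (1.0, 1.0, 1.0); object 1 -> (0.5, 0.5, 0.5), i.e. grey; object 2 -> strongly red,
# even though the raw signals from all three objects are dominated by red light.
```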
2. The second example of a perceptual constancy phenomenon is the invariance of the perceived position of an object during voluntary head or eye movements. As a person shifts the gaze or turns the head, the image of the objects in the visual surround moves on the retina. This movement is not, however, perceived as a motion of the objects; they are correctly perceived as being stationary. This perceptual compensation for the motion of the image is so completely automatic that it is not even consciously registered as a motion of the head or the eyes. It is not registered because the change in position of the image of the objects on the retina is filtered out by the perceptual apparatus. A simple experiment reveals how this filtering process works. If you close one eye and jiggle the other with your fingers, stationary objects are perceived as jiggling. Since in this experiment your eye is moved passively (rather than actively by contraction of your head or eye muscles), the result shows that the normally perceived constancy of spatial position during a voluntary movement is the result of your movement being taken into account by your perceptual apparatus. Another experiment reveals that what is taken into account here is not the actual occurrence of the voluntary movement itself, but the command to the muscles to perform it. In this experiment the muscle that would carry out the movement is temporarily paralyzed by injection of a drug. Under these conditions, where the intention to move the eye but not its motion can occur, command of an eye movement causes the stationary image of an object to be incorrectly perceived as a movement of the object. Here the perceptual apparatus proceeds as if the eye had been moved voluntarily and adjusts the fixed position of the image to compensate for the intended movement. To inform the perceptual apparatus that a command for a head or eye movement has been issued, a duplicate of the nerve impulse pattern directed from the brain to the motor neurons that command contraction of the appropriate muscles is simultaneously sent to the relevant cerebral neurons. It is this "efference copy" of the command that allows for compensation of the movement of the retinal image. The idea of the role of such an efference copy in providing for the brain a quantitative expectation of the change in sensory input resulting from the animal's own movements was first proposed by Erich von Holst in the early 1950s. There is now neurophysiological evidence that eye movement commands do influence neurons in the visual cortex in a manner consistent with von Holst's efference copy proposal.
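A schematic way to state the efference-copy idea (this is a summary sketch, not a model taken from the text) is that the perceived motion of the world is the retinal image motion minus the image motion expected from the issued motor command; the three experiments above then correspond to three settings of the inputs.

```python
# Sketch of the efference-copy comparison. Angles in degrees of visual angle; positive = rightward.
def perceived_world_motion(retinal_shift, efference_copy):
    expected_shift = -efference_copy   # a rightward eye movement shifts the retinal image leftward
    return retinal_shift - expected_shift

# Voluntary eye movement: command issued and image moves; they cancel, so the world appears stationary.
print(perceived_world_motion(retinal_shift=-5.0, efference_copy=5.0))  # 0.0
# Passive jiggling of the eyeball: the image moves but no command was issued, so the world appears to move.
print(perceived_world_motion(retinal_shift=-5.0, efference_copy=0.0))  # -5.0
# Paralysed eye muscles: a command is issued but the image stays fixed, so illusory motion is perceived.
print(perceived_world_motion(retinal_shift=0.0, efference_copy=5.0))   # 5.0
```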
[Figure: Block diagram illustrating von Holst's hypothesis that the cerebral command for an eye movement is added, in the form of an efference copy, to the eye's report of perceived motion. According to this hypothesis, this operation determines whether the perceived motion is interpreted as a result of the percipient's own movement, or of the movement of the perceived object or visual surround. After Hassenstein, 1971.]

3. The third example of a perceptual constancy phenomenon is the invariance of the perceived size of an object regardless of the distance at which it is seen. When an object is moved toward the eyes, the size of its retinal image increases, yet the object is not incorrectly perceived as increasing in size. The perceptual apparatus accomplishes this compensation by evaluating distance information according to the principles of perspective. A variety of clues are available to the perceptual apparatus regarding the distance of the object. For instance, as the object approaches, the curvature of the eye lens increases (by action of the ciliary muscles) to keep the retinal image of the object in focus, and if both eyes are to stay trained on the approaching object, their optical axes must increasingly converge (by action of the two sets of extraocular muscles). Either of these clues can be provided by an "efference copy" of the command to the eye muscles. But if both eyes remain trained on a stationary distant object, the images cast on their retinas by the approaching object will become increasingly disparate (parallax). This third clue can be provided by nerve cells in the visual cortex that are especially dedicated to computing the binocular disparity of retinal images, to subserve the stereoscopic depth perception made possible by the evolution of frontal eyes. On the basis of these clues about the closeness of approach, the perceptual apparatus adjusts the perceived size of the retinal image to yield the correct view that an object of constant size is coming closer.
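The arithmetic behind size constancy can be sketched in a few lines (an assumed, simplified pinhole geometry, not a calculation from the text): the visual angle subtended by an object of fixed size grows as it approaches, but scaling that angle by the estimated distance recovers a constant perceived size.

```python
import math

# Toy size-constancy calculation with simplified pinhole geometry (illustrative only).
def retinal_angle(object_size_m, distance_m):
    """Visual angle subtended by the object, in degrees."""
    return math.degrees(2.0 * math.atan(object_size_m / (2.0 * distance_m)))

def perceived_size(angle_deg, estimated_distance_m):
    """Invert the projection using the distance estimated from accommodation,
    convergence and binocular disparity."""
    return 2.0 * estimated_distance_m * math.tan(math.radians(angle_deg) / 2.0)

person_height = 1.8
for d in (10.0, 5.0, 2.0):
    angle = retinal_angle(person_height, d)   # grows steeply as the person approaches
    print(d, round(angle, 1), round(perceived_size(angle, d), 2))
# The visual angle grows from about 10 to about 49 degrees, yet the recovered size stays 1.8 m.
```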
It is to be noted that all the processes mentioned here as being responsible for perceptual constancy phenomena involve preconscious operations. Hence it might be said that sensations as such do not have access to consciousness. The processes by which percepts are abstracted from the sensory input cannot be introspected by the percipient. This point is often overlooked when physicists discuss the nature of reality, since they tend to equate sensation in the sensory organs with what is presented to the consciousness.
The conscious mind has no access to raw data; it obtains only a highly processed portion of the input. From the evolutionary viewpoint, such processing is enormously adaptive, since it allows the mind to cope with the real world. For instance, Donald McKay suggests that voluntary movements of the eyes are really preconscious questions about the world: if upon moving the eyes the images of objects move, the perceptual apparatus infers that the objects are in reality stationary. All these processes serve to abstract information free of the vagaries of the sensory organs and thus allow us to construct an objective world from our sensations.

The abstraction of constancy of object color, position, and size from the retinal image represents only a very low level of the entire process of perception. The next higher level is that of constancy of abstraction of form. That is, when an object is seen under changing conditions, its perceived form remains the same even though entirely different sets of sensory receptors are stimulated at different times. The variability of the object's aspects simply does not reach the consciousness. What is abstracted is the form of the object, irrespective of the particular part of the sensory system that provided the information about the object. This capacity for abstraction of form from diverse bits of information, such as the illumination and the angle from which an object is seen, is what certain psychologists call Gestalt perception, namely the ability to see an object as a whole. This capacity is a precondition to forming the category of object, which, as we shall soon see, is one of the early concepts that the developing intellect of the infant produces.

[Figure: Gestalt perception is an active contribution of our perceptive apparatus to the interpretation of these figures. In each case, structures are seen that are not actually present. They are called subjective contours. When the contours are examined closely, they disappear. One subjective contour even appears to pass under another that intersects with it (b). Optical illusions demonstrate that subjective contours have the same functional effect as real contours, as indicated by the Ponzo illusion shown in (d). Although both vertical lines are the same length, the effect of the subjective triangle is to make the line on the left appear longer. After Kanizsa, 1976.]

How is it that the mind perceives an object as a whole, and the world as only one real world, in view of the fact that the brain consists of two hemispheres? Or, to put this question in another way, how do the two hemispheres give rise to a single mind? From our previous discussion, we know that the right half of the visual field projects to the left visual cortex, and the left half of the visual field to the right cortex. A similar crossed pathway also obtains for the projection of the acoustic field to the auditory cortex and for the projections from the bilateral motor areas of the cortex, which issue their commands to the musculature on the opposite side of the body. The existence of two halves of the brain is not in itself surprising, since the human body is, on the whole, bilaterally symmetric. However, the two halves of the body must communicate, because they behave in a coordinated manner. It might be thought that this coordination or integration of the two halves would take place in an organ that is unique rather than bilaterally symmetric, reflecting the unity of the mind. Rene Descartes took this line of reasoning early in the seventeenth century. In the course of his anatomical explorations of the brain, Descartes discovered the pineal body - a single organ lying near the center of the brain - and designated it as the seat of the mind. Modern investigations have shown the inadequacy of the theory of the unity of the mind in general and of its seat in the pineal body in particular. As for the pineal body, it appears to function, not as the seat of the mind, but as a component of the biological clock that controls the daily rhythms of physiology and behavior.
No, in our quest for the mind, we must focus on the cerebral cortex, which is the organ of consciousness and of language. Do consciousness and language refer to the same function? Certainly not, for otherwise we would never be at a loss for words to express our thoughts. But can we be conscious of something that would be impossible for us to verbalize? Here the work on lateralization of various cortical capabilities has revolutionized our insights. Normally the two halves of the cortex are so intimately integrated that investigation of their individual function is very difficult. Some years ago, however, surgeons introduced an operation for patients with severe epilepsy. In this operation the corpus callosum, a massive strand of nerve fibers connecting the two hemispheres, is severed. Superficial observation of such patients after their operation indicated, very surprisingly, that they seemed to have suffered no loss at all of their normal perceptions, their motor activities, or their speech. But more refined studies by Roger Sperry of "split brain" patients revealed a fantastic situation: visual, auditory, or tactile inputs can be so designed as to reach only one of the hemispheres, and when this is done the other hemisphere literally does not know about it. This can be tested by asking the patient to identify - say, by touch with the right hand, controlled by the left hemisphere - an object seen in the left half of the visual field, and thus with visual input leading to the right hemisphere. The patient cannot do it, even though the verbal instruction is given to both halves of the brain. Even more strikingly, it turns out that the right half is incapable of verbalizing what it "knows," even though this knowledge is clearly present, since the right half is able to use it for solving complex mental tasks. How is this astonishing result to be explained? As we saw in chapter 6, the cortical areas dedicated to the production of speech are present on only one side (usually the left), rather than symmetrically distributed over both hemispheres. When the corpus callosum is cut, the left hemisphere, which contains the speech production centers, has no idea of what is being presented to the right. Nevertheless, while the right hemisphere cannot produce speech, it is capable of a great deal of mental processing on its own. The cerebral dichotomy goes so far that the patient may show an emotional response, say by a smile, when seeing a picture with the right brain, but when asked why he or she smiles, the verbalizing left brain can only admit ignorance. These great discoveries show that we have two minds under one roof, two minds normally so well integrated that their separation is inapparent: they talk to each other via the corpus callosum, and then talk to the outside world with one voice, controlled by the left mind.
[Figure: Lateralization of function in the cerebral hemispheres of the human brain. The left hemisphere (in the majority of individuals) is specialized for language comprehension, speech, and computational abilities. The right hemisphere is specialized for spatial constructions and nonverbal ideation, and also possesses simple verbal abilities. These lateralized functions can best be demonstrated in individuals in whom the neural connections (via the corpus callosum) between the two cerebral hemispheres have been severed in the surgical treatment of severe epilepsy. After Sperry, 1968.]
Does the right hemisphere, the nonspeaking hemisphere, have consciousness? It certainly has a mind, in the sense that it can hear and understand speech and rationally answer questions, not by speech, but by solving problems. Whether one wants to call this consciousness is a matter of terminology, and terminology in this area is not settled at present. In any case, the unity of the mind in normal persons is evidently the result of an interhemispheric consensus mediated by the corpus callosum. We will now strive to take a broader view of human perception and consider the relevance of these neuropsychological findings for epistemology, that is to say, for the philosopher's quest for understanding how we come to know what we know.
Until quite recently, it was one of the basic misconceptions of philosophers that the mind deals with primary sense impressions and that the individual learns to make the abstractions that form the basis of perception. For instance, according to the empiricism of the early eighteenth century, as formulated mainly by David Hume and the French encyclopedists, the mind at birth is a clean slate on which there is gradually sketched a representation of the real world, built on cumulative experience. This representation is orderly, or structured, because, thanks to the principle of inductive reasoning, we can recognize regular features of our experience and infer causal connections between events that habitually occur together. This viewpoint rejects as a logical absurdity the possibility of innate or a priori knowledge of the world, that is, knowledge possessed prior to having experienced the world, which was a central feature of the seventeenth century philosophy of rationalism advocated by Descartes. In the latter part of the eighteenth century, however, Immanuel Kant demonstrated that empiricist philosophy and its rejection of the possibility of a priori knowledge is grounded on an inadequate understanding of the mind and its relation to reality. Kant pointed out that sensory impressions can become experience, that is, gain meaning, only after they are interpreted in terms of the a priori categories - such as time, space and object - that we bring to, rather than derive from, experience. Tacit resort to propositions whose validity is similarly accepted a priori, such as "Some A are B; therefore all A are B" (induction) or "The occurrence of a set of conditions A is both necessary and sufficient for the occurrence of B" (causation by A of B), allows the mind to construct reality from that experience. Kant referred to these a priori categories and propositions of cognition as "transcendental," because they transcend experience and were thought by him to be beyond the scope of scientific inquiry. But is it not strange that if, as Kant alleges, we bring such categories as time, space, and object, as well as the notion of causality, to sensation a priori, that they happen to fit the real world so well? Considering all the bizarre ideas we might have had prior to experience, it seems nothing short of miraculous that our a priori ideas happen to be those that fill the bill. The way to resolve this dilemma opened when Charles Darwin put forward the theory of natural selection in the mid-nineteenth century. But few philosophers or scientists seemed to have noticed this until Konrad Lorenz drew attention to it in the 1940s. Lorenz pointed out that the empiricist argument that knowledge about the world can enter the mind only through experience is valid if we consider only the ontogenetic development of man, from fertilized human egg to adult person. But once we also take into account the phylogenetic development of the human brain through evolutionary history, it becomes clear that persons can know something of the world innately, prior to and independent of their own experience. After all, there is no biological reason why such knowledge cannot be passed on from generation to generation via the ensemble of genes that determines the structure and function of our brain. For that genetic ensemble came into being through the process of natural selection operating on our remote ancestors. According to Lorenz, "experience has as little to do with matching of the a priori with reality as does the matching of the fin structure of a fish with the properties of water." In other words, the Kantian notion of a priori knowledge is not implausible at all. Rather, Kant's claims of the "a-priori-ness" of such categories as space, time, and object, as well as of causality, as transcendental components of cognition almost hit the nail on the head. These ideas are indeed a priori for the individual, but they did not fall from heaven; they are matters of evolutionary adaptation, designed for survival in the real world.

It appears therefore that two kinds of learning are involved in our dealing with the world. One is phylogenetic learning, in the sense that during evolution we have evolved very sophisticated machinery for perceiving and making inferences about a real world, of which the preconscious neurophysiological abstraction processes acting on visual input, the perceptual constancy phenomena associated with vision, and the interhemispheric consensus of our two minds reached via the corpus callosum are but a few examples. They show that, collectively and across history, the human species has learned to deal with signals coming from the outside world by constructing a model of it. In other words, whereas in the light of modern understanding of evolutionary processes we can say that the individual approaches perception a priori, this is by no means true when we consider the history of mankind as a whole. What is a priori for individuals is a posteriori for the species. The second kind of learning involved in dealing with the world is ontogenetic learning, namely the lifelong acquisition of cultural, linguistic, and scientific knowledge. Thus we see the world through multiple pairs of glasses: some of them are inherited as part of our physiological apparatus, others acquired from direct experiences as we proceed through life. In a sense, the discoveries of science help us to see what the world is like without some of these pairs of glasses. As Konrad Lorenz has put it, every step of knowledge means taking off a pair of glasses - but we could never dispense with all of them.
REFERENCES

Campenhausen, C. von. 1981. Die Sinne des Menschen. Stuttgart: Thieme.
Geschwind, N. 1978. Specialization of the human brain. Scientific American 241(3): 180-199.
Hassenstein, B. 1971. Information and Control in the Living Organism. London: Chapman and Hall.
Hubel, D. H., and T. N. Wiesel. 1970. The period of susceptibility to the physiological effects of unilateral eye closure in kittens. Journal of Physiology 206: 419-436.
Kanizsa, G. 1976. Subjective contours. Scientific American 234(4): 48-52.
Lorenz, K. 1941. Kants Lehre vom Apriorischen im Lichte gegenwärtiger Biologie (Kant's doctrine of the a priori in the light of contemporary biology). Blätter für Deutsche Philosophie 15: 94-125. In General Systems, ed. Bertalanffy and Rapoport, vol. 7, pp. 23-35. Ann Arbor: Society for General Systems Research, 1962.
Lorenz, K. 1959. Gestaltwahrnehmung als Quelle wissenschaftlicher Erkenntnis (Gestalt perception as a source of scientific knowledge). Zeitschrift für Experimentelle und Angewandte Psychologie 6: 118-165. In Studies in Animal and Human Behaviour, vol. 2, pp. 281-322. Cambridge: Harvard University Press, 1971. Also in General Systems, ed. Bertalanffy and Rapoport, vol. 7, pp. 37-56. Ann Arbor: Society for General Systems Research, 1962.
McKay, D. M. 1971. Voluntary eye movements are questions. Bibliotheca Ophthalmologica 82: 369-376.
Purves, D., and J. W. Lichtman. 1985. Principles of Neural Development. Sunderland, Mass.: Sinauer Associates.
Sperry, R. W. 1968. Mental unity following surgical disconnection of the cerebral hemispheres. The Harvey Lectures 62: 293-323.
Sperry, R. W. 1982. Some effects of disconnecting the cerebral hemispheres. Science 217: 1223-1226.
Wiesel, T. N., and D. H. Hubel. 1963. Single cell responses in striate cortex of kittens deprived in one eye. Journal of Neurophysiology 26: 1003-1017.
Reprinted with permission from Images and Understanding, pp. 257-283, 1990 © 1990 Cambridge University Press
Understanding images in the brain

COLIN BLAKEMORE

The homunculus fallacy

If you have never read Ludwig Wittgenstein's great treatise Philosophical Investigations, I have just a single word of advice. Don't! Or, to be less facetious, do it with an open mind and a stiff drink. It is an extraordinary and puzzling book - a mixture of brilliant insight and frustrating conundrums. Wittgenstein's enduring concern was for the nature of language and what it tells us about the minds of language-users. His arguments led him to one conclusion that seems trivial, maybe just plain wrong, but which has spawned worrying criticism of our use of everyday words to describe how the brain works. Wittgenstein wrote: 'Only of a living human being and what resembles (behaves like) a living human being can one say: it has sensations; it sees; is blind; hears; is conscious or unconscious' (Wittgenstein).

Anthony Kenny, in his spirited defence of The Legacy of Wittgenstein, laments what he sees as the frequent rejection of this dictum by psychologists of perception, by neurophysiologists and by those who are interested in machine intelligence. He coined the term 'homunculus fallacy' to describe such false ascription of predicates that are valid only for whole human beings when talking about bits of animals or people (especially their brains), or about electronic circuits. What, exactly, is the problem, you might ask? After all, no-one would suggest that a one-legged man cannot hear (in the sense that we use that word to describe the auditory perceptions of an intact human being). And is it really a gross conceptual blunder to talk of a robot with an artificial eye as seeing? How complete must a human being be and how human-like must the behaviour of a machine be in order for the predicates of sensation and consciousness to be valid? To clarify the issue we need an example, and the one that Anthony Kenny develops is the question of how the brain perceives the visual world (if you will, for the moment, forgive me for deliberately committing the homunculus fallacy!). Kenny thinks that present-day visual neurophysiologists and psychologists have still not escaped from a conceptual tangle first recognized by Rene Descartes.
In his description of the formation of the retinal image, in La Dioptrique of 1637, and his subsequent suggestion that images of a sort are formed in the brain, Descartes both invented and acknowledged the major problem. 'The things we look at', he wrote, 'form quite perfect images in the back of our eyes.' Furthermore, such an image is 'not only produced in the back of the eye but also transmitted to the brain'. Descartes did not, of course, know about nerve impulses or synaptic transmission, and he followed ancient Aristotelian views about the importance of fluids or 'spirits' in the body. The clear cerebrospinal fluid, which fills the cavities or ventricles of the brain, had long been thought of as a pure and perfect distillation of blood and air. It was called animal spirit by the medieval philosophers, who believed it to be the source of the anima or soul. Descartes imagined that the myriad fibres in the optic nerves transmitted a coherent copy of the retinal image, in the form of a pattern of vibrations, on to the surface of the ventricles of the brain. Now, the traditional medieval Cell Doctrine had suggested that the ventricular system of the brain is divided into three cells, serving different aspects of mental function, from sensation (in the first cell) through thought and reason (middle cell) to memory or movement (last cell). Descartes' most famous addition to this scheme was his attribution of special status to the pineal gland, a tiny organ at the back of the ventricles (the true function of which is still obscure). He thought that the pineal gland secreted the animal spirit and acted as an interface between the soul and the mere physical machinery of the brain. In Les Passions de l'ame, Descartes imagined that the two pictures of any particular shape in the outside world (one from each eye), formed on the lining of the ventricles, set up corresponding patterns of motion in the fluid, which 'radiate towards the little gland which is surrounded by these spirits. Thus the two images in the brain form only one image on the gland, which, acting directly on the soul, causes it to see the shape.' This single picture on the pineal gland not only fused the images of the two eyes into a single view of the world but even conveniently re-inverted it! But then comes the problem of the homunculus fallacy. How can simply having an image in the brain constitute seeing that image? Descartes had earlier acknowledged this problem for the picture on the walls of the ventricles. 'The picture to some extent resembles the objects from which it originates... but we must not think that it is by means of this resemblance that the image makes us aware of the objects - as if we had another pair of eyes inside the brain to see it.' What was different about the image on the pineal gland was, presumably, its direct communion with the soul, which was, in effect, just such an internal pair of eyes - indeed an internal perceiver, thinker, chooser and knower.
Fig. 17.1. Rene Descartes was not the first to observe the inverted image formed on the retina of the eye, but Descartes' description in La Dioptrique of 1637 was the most complete at that time. This diagram shows the technique that Descartes used to reveal this image. He cut a window in the outer coat of the back of an ox eye and covered it with paper. He then saw the tiny inverted image formed on the paper screen.
Fig. 17.2. During the medieval period, under the influence of the Church, very little experimental observation was carried out. Theories of brain function rested largely on the writings of the Greeks and the dissections of Galen of Pergamon (129-199 AD). One dominant theme, which persisted until the 18th century, was that mental processes take place in the fluid ('animal spirit') that fills the chambers or ventricles of the brain. There are three such chambers, the first being divided into a pair of lateral ventricles. According to the 'Cell Doctrine' of the Early Christian Fathers, this first pair received messages from the sense organs; the middle one was responsible for judgement, thought and reason; and the last was the seat of memory and (sometimes) movement. This illustration of the Cell Doctrine comes from the 1506 edition of Philosophia pauperum of Albertus Magnus (1206-80), published as Philosophia naturalis by M. Furter, Basle. The three ventricles are shown as simple circles, each divided in two.
That, of course, is the essence of Cartesian Dualism. The body, including the brain, is a machine; a delicate, complex, beautifully interconnected machine - but merely a machine. In the Traite de l'homme of 1664 Descartes asks us to imagine that

all the functions that I attribute to this machine such as... waking and sleeping; the reception of light, sounds, odours..., the impression of ideas,... the retention of those ideas in memory;... appetites and passions; and finally the movements of all the external members... occur naturally in this machine solely by the disposition of its organs, no less than the movements of a clock.
Fig. 17.3. In this famous diagram from the Traite de l'homme of 1664, Descartes got the anatomy very wrong but made a number of conceptual advances. He suggested that nerve fibres from each eye project into the brain in a coherent bundle, thus reproducing an array of activity (2, 4, 6) on the wall of the ventricles corresponding to the retinal image (1, 3, 5). He also proposed that combination of signals from the two eyes might occur to form a binocular image (a, b, c) on the pineal body (H), which Descartes saw as the site of communication between the spiritual soul and the machinery of the brain. In fact we now know that a binocular 'map' is indeed set up, not in the pineal, but in the primary visual cortex at the back of the cerebral hemispheres.
But, to Descartes, the essentially human features of existence and action - consciousness, perception, decision and will - were the domain of the soul, a mysterious, silent agent that watched the pictures in the brain and poked its invisible fingers into the clockwork from time to time to guide the mere machine. If we resist the intellectual surrender of Dualism, is it possible to steer a course around the rocks of the homunculus fallacy on our way to a material explanation of behaviour and even of perception and consciousness?
Order and disorder in brains and machines

Computers can do very clever things without forming pictures in their chips. In the most general sense of the word computer, the brain is a computer - in the immortal words of Marvin Minsky, it is a 'meat machine'. Then could the brain not work perfectly well without having to depend on Cartesian inner images? In June 1986, a steering committee for a System Development Foundation Symposium on Computational Neuroscience composed a 'formulation of dialectics' in which two opposed positions were put forward in response to the question 'Is physical locality an essential part of neural computation?':
Position 1 ...: The anatomical structure of the brain has no more to do with its function than the shape of the cabinet of a Vax (a powerful computer), or the location of its pc boards. Brain function is determined by the logical and dynamic connection properties of its neurons. The actual physical structure, location, architecture, and geometry is irrelevant compared to its logical, connectionist aspects.

Position 2: Significant recent work in brain research has been related to the discovery and elucidation of detailed forms of somatotopic mapping, laminar specialization in cortex, and columnar architectures representing sensory sub-modalities. These forms of functional architecture may represent a major mode of brain function: the formatting of sensory data in a manner which simplifies its further processing. One of the major differences in computational style of brain versus Vax may well be the indifference of the Vax to its geometry and the exquisite attention paid by the brain to its geometry.

Whether or not the physical, spatial arrangement of activity in the brain has functional significance is the issue that I want to address. Undoubtedly spatial order is a dominant feature of the brain, which is far from the random network that some theoreticians would have liked it to be. The handful of jelly that fills the human head is in fact a tightly packed conglomerate of separate clumps of nerve cells wrapped in huge bundles of axons (nerve fibres). The stem of the brain, the cerebellum, the cerebral hemispheres - they are all distinct structures with their own characteristic appearances in microscopic sections, all laid out and interconnected in a highly ordered fashion. Even the cerebral cortex, the wrinkled mantle of grey matter that swaddles the cerebral hemispheres, is, in fact, a vast patchwork of distinct functional areas. In 1672, in Oxford, Thomas Willis first suggested, on the basis of anatomical evidence as well as the effects of brain damage, that mental functions take place in the cerebral cortex rather than in the fluid of the ventricles. But the first suggestion that specific functions are localized in different regions of the cerebral cortex came from the bogus science of phrenology, which claimed that various mental functions are performed in 'organs', distributed over the brain, whose development is reflected in the size of bumps on the skull. Franz Joseph Gall, the Austrian physician who founded phrenology at the start of the 19th century, was an expert and respected anatomist, who must at least take credit for drawing attention to the distinctive folds and creases in the cortex. Even the attempt to correlate individual differences in mental faculties with variations in the structure of the brain was remarkably original and far-sighted.
Fig. 17.4. This illustration, from the 1664 edition of Cerebri Anatome of Thomas Willis (J. Martyn and J. Allestry, London), is perhaps the most famous 17th century diagram of the human brain. The original drawing was made by Christopher Wren, who worked for Willis in Oxford. The brain is shown from below, with the brainstem surrounded by the cerebellum (B). Many of the cranial nerves, including the optic nerve, are seen emerging from the brainstem. The blood supply of the base of the brain is very clearly illustrated, as are the great cerebral hemispheres (A), with their many folds and grooves. Willis was the first to attribute mental functions to the cerebral hemispheres.
Phrenology was scientific nonsense not because of a defect in its underlying logic but because of its dependence on anecdotal observations and inadequate samples. Ironically, although the methods and conclusions of phrenology were soon discredited and ridiculed by the scientific community, the concept of functional localization subsequently became the bedrock of brain research.
Fig. 17.5. This is one of the many diagrams illustrating the bogus theory of phrenology that were popular for more than a hundred years after the distinguished anatomist Franz Joseph Gall (1758-1828) first suggested that mental faculties were localized in distinct brain organs and that character could be diagnosed by feeling the bumps of the skull.
In the 1860s, neurologists, especially John Hughlings Jackson in London and Pierre-Paul Broca in Paris, started to chart the surface of the cerebral hemispheres, using as their evidence the effects on behaviour, perception, language, movement and personality of localized damage to the brain. Broca described an area in the frontal lobes, usually on the left side, that now bears his name and which is essential for normal speech. Damage here (through stroke or injury) robs people of their ability to say more than a word or two, but not of their understanding of the spoken or written word. In the context of images in the brain, Hughlings Jackson's findings were even more significant.
He described the way in which epileptic seizures sometimes start with involuntary twitching in one particular group of muscles, often those of the thumb, the corner of the mouth or one of the toes, before spreading to other parts of the body. By relating the starting point of these 'Jacksonian fits' to small regions of degeneration and irritation, discovered post mortem in the brain, Jackson concluded that a strip of cortex, running down the middle of each cerebral hemisphere just behind Broca's area, is responsible for activation of the muscles on the opposite side of the body. What is more, the motor strip on each side is arranged topographically, the muscles of the toes and leg being controlled by the top of the strip, with the torso, arm, hand and face represented successively along the strip. This motor area of the cortex has now been explored in animals as well as humans, by a variety of experimental techniques, and they all confirm Hughlings Jackson's opinion that there is a kind of image of the body muscles in the brain. Since Hughlings Jackson's time, the concept of functional sub-division and topographic representation has become a sine qua non of brain research. The task of charting the brain is far from complete, but the successes of the past make one confident that each part of the brain (and especially the cerebral cortex) is likely to be organized in a spatially ordered fashion. Just as in the decoding of a cipher, the translation of Linear B or the reading of hieroglyphics, all that we need to recognize the order in the brain is a set of rules - rules that relate the activity of nerves to events in the outside world or in the animal's body. The most fully explored spatial representations are those in the regions of cerebral cortex reserved for the three major senses - vision, hearing and bodily sensation (touch, pain, etc.). In Chapter 1, Horace Barlow has already described these sensory areas (see Fig. 1.2) and it is important to add that each primary sensory area is now known to be surrounded by a number of additional areas, each probably containing its own distinctive form of spatial arrangement of information-processing (but more of this later). Faced with such overwhelming evidence for topographic patterns of activity in the brain, it is hardly surprising that neurophysiologists and neuroanatomists have come to speak of the brain having maps, which are thought to play an essential part in the representation and interpretation of the world by the brain, just as the maps of an atlas do for the reader of them. The biologist J.Z. Young writes of the brain having a language of a pictographic kind: 'What goes on in the brain must provide a faithful representation of events outside, and the arrangement of the cells in it provides a detailed model of the world. It communicates meanings by topographical analogies'. But is there a danger in the metaphorical use of such terms as 'language', 'grammar' and 'map' to describe the properties of the brain? Peter Hacker, the Oxford Wittgenstein scholar, sees it as an extreme example of the 'homunculus fallacy'.
Fig. 17.6. Hughlings Jackson's idea that there is a motor map in the human cerebral hemispheres received strong support from the observations of the neurologists W. Penfield and T. Rasmussen in the 1940s. They described the effects of local electrical stimulation of the surface of the cortex in conscious human patients undergoing neurosurgical operations under local anaesthetic. Here is Penfield and Rasmussen's diagram of the motor 'homunculus'. Imagine that the brain has been cut in half, in a vertical plane lying approximately through the two ears, and we are looking from behind at a cross-section through the motor cortex on the right side of the brain. The labels show the various parts of the body that are caused to move by electrical stimulation at different points along the motor strip, from the tongue and face on the side of the hemisphere around to the toes on the part of the strip that lies buried in the cleft that runs along the middle of the hemispheres. The cartoon drawings indicate the relative areas of motor cortex devoted to different parts of the body. Notice the way in which the jaws and hands are vastly over-represented.
He finds it a 'startling, indeed amazing idea' that scientists should suggest that there are maps in the brain. He writes:

there are no representing maps without conventions of representation. There are no conventions of representation without the use, by intelligent, symbol-employing creatures, of the representation. And to use a representation correctly one must know the conventions of representation, understand them, be able to explain them, recognize mistakes and correct or acknowledge them when they are pointed out. Whether a certain array of lines is or is not a map is not an intrinsic feature of the lines... but a conventional one (that is, the actual employment, by a person, of a convention of mapping)... the modern neurophysiologist... comes perilously close to saying that when a person sees an object there is a map, a representation of the object, not on the pineal gland, but on the visual cortex. But now he must explain who or what sees or reads the map.

Frankly, I think that this concern is misplaced. I cannot believe that any neurophysiologist believes that there is a ghostly cartographer browsing through the cerebral atlas. Nor do I think that the employment of common language words (such as map, representation, code, information and even language) is a conceptual blunder of the kind that Peter Hacker imagines. Such metaphorical imagery is a mixture of empirical description, poetic licence and inadequate vocabulary. Yet an important question remains. If the existence of spatially ordered neural images is useful to the brain, what benefit does that order bestow? Indeed, is it possible that the presence of order (convenient though it may be to the neuroscientist) is no more than an epiphenomenon - perhaps an inevitable but useless consequence of the fact that nerve fibres, growing from the sense organs to their targets in the brain, tend to preserve the same spatial pattern as that of the receptors in the sense organ - rather like the individual glass fibres in a coherent fibre optic bundle?
Isomorphic maps

A tendency of nerve fibres to maintain spatial order as they grow would, of itself, generate isomorphic distributions in the brain - patterns topographically directly related to the array of receptor cells in the sense organ. Many cerebral maps are of this kind. Perhaps the map of the visual field on the primary visual cortex (described by Horace Barlow in Chapter 1) is simply a consequence of the fact that there is an orderly input of nerve fibres maintaining the neighbourly relations of the cells in the retina (though, as Barlow points out, the partial crossing-over of fibres from the two eyes as they enter the brain is a striking, and functionally appropriate, departure from strict isomorphism).
In the same terms, the somatic sensory area has on it a 'picture' of the opposite half of the body because sensory nerve fibres entering the spinal cord form an array corresponding to their origin on the body surface. Even the 'tonotopic' map in the primary auditory cortex (in which nerve cells are distributed in an orderly sequence according to the tone frequency to which they respond best) is isomorphic. The sound receptor cells are distributed along a membrane in the cochlea (the sense organ in the inner ear) and different frequencies of sound set up different patterns of vibration of this membrane, the point of maximum vibration (and thus maximum nerve activity) shifting along the membrane as the tone changes in pitch from high to low. So, as long as nerve fibres in the auditory pathway maintain the same general pattern as that of the array of receptor cells in the cochlea, a tonotopic map is inevitable. It is well known that these cortical maps have spatial distortions, which appear as if they might be functionally important. For instance, the representation of the most acute, central fovea of the retina occupies a hugely disproportionate fraction of the visual cortex; and the most sensitive areas of the body surface - the fingers, toes and lips in humans, the snout in a pig - have expanded representations in the somatic sensory cortex. At first blush, it is tempting to think of these enlarged sub-maps as being like the large-scale inserts of major cities in a road atlas - a functional device to provide greater detail about things of most importance. In fact, though, even these apparently significant peculiarities of cortical maps might also be a simple consequence of patterned nerve growth. The fovea of the retina, the fingers, toes and so on all have a much denser innervation of sensory nerve fibres than the rest of the retina and skin. Hence, over-representation of those parts is an inevitable consequence of uniform topographic distribution of nerve fibres into the brain.
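The point can be illustrated with a toy allocation rule (the fibre counts below are invented for the sketch, not measured values): if each incoming fibre claims roughly the same amount of cortical territory, then densely innervated regions such as the fovea are automatically over-represented, with no special wiring rule needed.

```python
# Toy illustration of cortical magnification arising from uniform fibre-to-cortex allocation.
# Fibre counts are hypothetical, chosen only to make the point.
def cortical_shares(fibres_per_region):
    total = sum(fibres_per_region.values())
    return {region: n / total for region, n in fibres_per_region.items()}

retina = {"fovea": 30_000, "near periphery": 15_000, "far periphery": 5_000}
print(cortical_shares(retina))
# -> the fovea gets 60% of the map even though it covers only a tiny fraction of the retina.
```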
One of the most remarkable isomorphic maps is found in the somatic sensory area of the cortex of a mouse, or indeed any rodent that has large whiskers on its face, which it uses to explore its environment. The mouse has five rows of such whiskers and the sensory nerve that innervates them is much larger than the optic nerve from the eye. It is hard to imagine the perceptual world of a mouse but the sensations from its whiskers must surely play a major role in it. No surprise, then, to discover that a huge proportion of the somatic sensory area on the side of the cerebral hemispheres of a mouse (Fig. 17.7) is devoted to signals from the whiskers. But the isomorphism goes much further than the existence of a map of the body. Tom Woolsey and Hendrik Van der Loos looked at microscopic sections of this whisker area, cut parallel to the flat surface of the cortex. The sections through the fourth cortical layer (where most of the incoming sensory nerve fibres terminate) have the remarkable appearance shown in Fig. 17.7 - a pattern of rings, each ring consisting of densely packed cortical nerve cell bodies surrounding a bunch of incoming fibres. They called these structures 'barrels' because their three-dimensional appearance is like rows of barrels. There are the same number of barrels as there are whiskers, and the barrels are distributed in a pattern just like that of the whiskers on the face. The obvious conclusion that each barrel of cells is a cortical 'organ' (a barrel organ?) corresponding to a single whisker has been amply confirmed by electrophysiologists, who have recorded activity from cortical cells in this area and have shown that the cells of each barrel respond when one particular whisker is touched. Figure 17.8 is a further, elegant demonstration of this barrel/whisker isomorphism, performed by Margaret Kossut and Peter Hand at the University of Pennsylvania. It is a computer-processed image of a cross-section (not a surface-parallel section) through the somatic sensory cortex of the left hemisphere of a rat (which also has barrels). The anaesthetized animal was given an injection of a radioactively-labelled analogue of glucose (2-deoxyglucose), which is taken up by any highly active nerve cells in the brain. After the injection, just one whisker on the opposite side of the face was vibrated gently, and the computer image (in which the colour scale indicates the level of radioactive labelling) shows one strongly labelled barrel in the middle of the cortex. In a trivial sense, these brain maps do now have readers - the anatomists and physiologists who have discovered them. But the question remains: does the brain itself make use of its maps in a functional sense? Or are they all accidents of nerve growth? In the case of the whisker barrels there is good evidence that the map is not an inherent property of that bit of cortex but is imposed on it by the ingrowing sensory fibres. Van der Loos and Woolsey have shown that the cortical cells do not become assembled into barrels until about five days after birth in the mouse; if, before this age, one row of whiskers is plucked out, the corresponding row of barrels fails to form and the neighbouring barrels are larger than normal. This strongly suggests that the topographic order in the brain is determined by the pattern of incoming fibres. If all maps in the brain were of this isomorphic form, it would be difficult to argue for their functional value. But if we were to find that the brain contained new forms of mapping, topologically related to some important aspect of the sensory stimulus, but not simply reflecting an anatomical pattern of ingrowing nerve fibres, that would surely indicate that maps have meaning.
Fig. 17.7. The somatic sensory area in the cortex of a mouse occupies a central strip down each side of the cerebral hemispheres. Each strip receives sensory information from the skin and other tissues of the opposite side of the body. The pattern of projection forms a crude, distorted map of the half-body, with the muzzle representation vastly magnified. The upper diagram shows the right side of the mouse brain, with the general arrangement of this body map. There is a huge area devoted to the whiskers. Below is a photomicrograph (kindly supplied by Hendrik Van der Loos), taken at low magnification, of a single thin section through the cortex in the whisker area, cut parallel to the surface of the cortex. This particular section has glanced through the fourth layer, where the incoming sensory fibres arrive. It is stained to show the individual cortical nerve cells as small blue-green dots and they clearly form a distinct pattern of rings or 'barrels', with the same spatial arrangement as the array of whiskers on the muzzle.
Fig. 17.8. Here we see a computer-processed image of a microscopic section cut in a vertical plane through the whisker area of the left hemisphere of a rat. The 1 mm scale bar indicates the size of this piece of cortex. The rat had received a small injection of a radioactive substance similar to glucose, which is taken up by active nerve cells. During the uptake period, one whisker on the right side of the face was gently stroked to stimulate its sensory nerve endings. The colour scale indicates the levels of radioactivity in the brain. The red spot in the middle layers of the cortex is presumably an individual 'barrel' of nerve cells activated by the stimulated whisker. (Computer image kindly supplied by Margaret Kossut and Peter Hand.)
Non-isomorphic maps

There are representations in the brain that are topographic in relation to the outside world, or to some aspect of the sensory stimulus, but which are not isomorphic, in the sense that they do not have the same spatial arrangement as the distribution of receptors. Horace Barlow, in Chapter 1 of this volume and elsewhere, has already discussed the principles behind (and advantages of) such forms of mapping. A few examples will make clear what I mean. The first example - one already shown in Fig. 1.6 - is the way in which the orientations of the edges of shapes in the visual world are represented by a local system of sub-mapping within the visual cortex.
150
the primary visual cortex are selectively sensitive to the angles of lines and edges in the visual field: some cells respond best (produce the highest rate of impulses) when shown a vertical line, others a horizontal line and so on, for every possible angle of line. Moreover, these nerve cells are clustered together in 'columns' or 'slabs', so that if a microelectrode is put into the cortex perpendicular to the surface and recordings are taken from several cells in succession, then all of them, from the surface down to the bottom of the grey matter, respond best to roughly the same orientation. But if an electrode moves obliquely through the cortex, the preferred orientation shifts, usually in very small steps, from one cluster or column of cells to the next. It seems possible, then, that the primary visual cortex is 'decomposing' images into the orientations of their component contours and classifying shapes by the angles of their edges.

For such a representation to be complete, in every part of the visual field, there are certain formal requirements for the form of mapping. Obviously, all orientations must be represented for each position on the visual field, otherwise the animal or human being would be blind to certain edges at certain points in space. This is achieved by superb efficiency of representation. First, there is sufficient sloppiness or scatter in the simple isomorphic map of the visual field to ensure that each point in space is represented over a patch of visual cortex about 2 mm in diameter. In turn, the orientation columns are laid down in such a way that a full sequence of all orientations is represented across about 0.75-1.00 mm. This means that there is a 'safety factor' of about 2 in the mapping[11]. Any particular edge in the visual field causes activity in at least two different columns of orientation-selective cells and the chance is very low that any particular orientation will be missed completely.
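The 'safety factor' arithmetic can be made concrete with a toy calculation. The following sketch (Python; the 1-D layout, the matching tolerance and the exact column spacing are illustrative assumptions, while the 2 mm patch and the roughly 1 mm orientation cycle are the figures quoted above) counts how many appropriately tuned columns fall inside the patch activated by a given point and edge; the answer comes out at about two.

import numpy as np

# A minimal sketch of the 'safety factor' argument: columns every 50 micrometres,
# preferred orientation advancing ~10 degrees per column (so a full 180-degree
# cycle spans about 0.9 mm), and a 2 mm cortical patch activated by each point
# in the visual field.  All values are illustrative.
column_spacing_mm = 0.05
deg_per_column = 10.0
patch_diameter_mm = 2.0

n_columns = 400                                               # a 20 mm strip of 'cortex'
positions = np.arange(n_columns) * column_spacing_mm
preferred = (np.arange(n_columns) * deg_per_column) % 180.0   # cyclic, 0-180 degrees

def columns_responding(point_mm, edge_deg, tol_deg=5.0):
    """Count columns inside the activated patch whose preference matches the edge."""
    in_patch = np.abs(positions - point_mm) <= patch_diameter_mm / 2
    diff = np.abs((preferred - edge_deg + 90.0) % 180.0 - 90.0)   # circular difference
    return int(np.sum(in_patch & (diff <= tol_deg)))

counts = [columns_responding(p, o)
          for p in np.linspace(5.0, 15.0, 21)
          for o in range(0, 180, 15)]
print(min(counts), max(counts))   # no combination is missed; most are caught at least twice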
Now, orientation is a one-dimensional, cyclical variable - from vertical to oblique to horizontal to the other oblique, back to vertical and so on. On the other hand the visual field is two-dimensional. To represent both properties smoothly across a single two-dimensional surface (the cortex) poses a topological problem. Figure 17.9 shows the way in which it seems to be solved in the primary visual cortex of the cat. We are looking down on the surface of a small patch of cortex explored by the neurophysiologist Klaus Albus[12]; each short line indicates the preferred orientation of the column of nerve cells lying beneath that point. (Some of these orientations were obtained empirically, by recording with a microelectrode at that point; the rest were filled in by inspired guesswork.) The pattern has a number of interesting features; first, the preferred orientation usually, but not always, shifts in small steps from one column to all its neighbours; second, each complete cycle of orientations occupies 1 mm or less across the cortex;
Fig. 17.9. This diagram, based on experiments by Klaus Albus[12], represents a small part (about 1 mm × 0.5 mm) of the surface of the primary visual cortex of a cat. Each short line shows the angle at which an edge must appear in the appropriate part of the visual field to stimulate the tiny column of nerve cells lying below that point on the surface. The lines with dots in the middle are actual experimental results, showing the preferred orientation of the first nerve cell recorded with a microelectrode introduced into the cortex at that point. The other orientations were added on the basis of the experimental observation that, on average, the preferred orientation of adjacent columns of neurons changes by about 10° for a movement of 50 µm across the cortical surface. This piece of visual cortex receives information from a single patch of visual field. Thus each edge of any object appearing in this region would set up activity in a number of columns of cells 'tuned' to the appropriate angle.
third, there is an element of randomness, manifest in occasional reversals in sequences of orientation and in sudden large jumps. Compare this pattern with Figure 17.10, a graphic exercise by the artist Bridget Riley. She has apparently set herself the same task as that of the map of orientation in the visual cortex, namely representing steady progressive shifts of angle of lines on a two-dimensional surface. And she arrives at much the same solution.

My other example of a non-isomorphic map comes from the auditory system. We can use our ears not only to distinguish tones (and hence more complex sounds, including speech) but also to judge the position in space of a sound source - surely an important skill to an animal attacking or under attack. Now, the pitch of a sound, as I have already described, is reflected in the position of maximum vibration in the organ of hearing, and hence the place at which nerve cells are most active in the isomorphic tone map of the primary auditory cortex. But distinguishing the position of a sound source is a much more subtle affair.
Fig. 17.10. 'Study for Static' (1966). Pencil on paper by Bridget Riley.
It depends on a comparison of differences in the intensity and time of arrival of the sound wave at the two ears (for the horizontal position) or of the relative intensities of different components of complex sounds reflected from the crevices of the outer ear (for vertical as well as horizontal position). No simple isomorphism between sound receptors and nerve cells in the brain could create a map of auditory space, yet such maps do exist in the brains of birds and mammals. On the back of the midbrain of mammals (the top of the brainstem, underneath the cerebral hemispheres) there are two pairs of little bumps, the superior and inferior colliculi, both of which have input from the ears. The inferior colliculus has a beautiful isomorphic map of tone frequency (like that in the primary auditory cortex); each individual nerve cell responds preferentially to a particular pitch but is almost insensitive to the position of sound in space. The situation is just the opposite in the deeper layers of the superior colliculus. Here the nerve cells respond best to complex 'natural' sounds rather than pure tones, and most of them need the sound source to be placed at a particular position in space to give their best response. Moreover, the best position shifts progressively from cell to cell, forming a 2-dimensional map of auditory space across this brain structure[13].
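The horizontal cue mentioned above, the difference in time of arrival of a sound at the two ears, can be put in rough numbers. A minimal sketch using the classical spherical-head approximation (Woodworth's formula; the head radius and speed of sound are assumed values, not taken from this chapter) gives interaural delays of a few hundred microseconds at most:

import math

def interaural_time_difference(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Woodworth's spherical-head approximation for a distant source:
    ITD = (r / c) * (theta + sin(theta)), with theta the azimuth in radians."""
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))

# A source 30 degrees off the midline reaches the far ear only a fraction of a
# millisecond after the near ear, the kind of difference the brainstem compares.
for azimuth in (0, 15, 30, 60, 90):
    print(f"{azimuth:3d} deg -> ITD = {interaural_time_difference(azimuth) * 1e6:6.1f} microseconds")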
It turns out that the superior colliculus is primarily a visual centre, involved in the control of movements of the eyes, head and body. The incoming visual fibres are distributed in an isomorphic map of the visual field, which is neatly in register with the non-isomorphic map of auditory space. And both of these sensory maps are aligned with a motor map, such that activation of one group of nerve cells in this structure (whether through the occurrence of a sound or a visual stimulus at a particular point in space) causes the eyes to move across to look in the direction of that stimulus. A perfect little neuronal machine, whose actions depend on those two superimposed sensory maps, sharing the same motor output.

The construction of a non-isomorphic map of auditory space requires much more than the simple ordered growth of bundles of nerve fibres. In fact, Andrew King and his colleagues at Oxford[14] have evidence that the map of auditory space in the superior colliculus of the ferret is not purely innately determined, but is partly dependent on interplay between auditory and visual signals early in life, leading to the establishment of an auditory map aligned with the visual one. This kind of construction of a spatial representation that bears a relationship to the world but not to any simple feature of the sense organ seems to me strong evidence that maps, as such, are useful to the brain.
Why have maps?

The operations performed by a nerve cell are determined not only by its inherent electrical and chemical properties but also by the connections that it receives, just like any component in an electronic circuit.
Fig. 17.11. In the deep layers of the superior colliculus (a visual centre in the brainstem) of mammals there are nerve cells that respond to sounds at particular positions in space, and they are distributed to form a map of auditory space aligned with the visual map formed by input from the eyes. Here we see a ferret viewed from above. The inset diagram represents the surface of the midbrain, consisting of the inferior colliculi (IC), which have purely auditory input, and the superior colliculi (SC). The numbered dots show the positions at which nerve cells were recorded in the deep layers. Each cell responded best to sound presented at the position on the opposite side of space, shown in the diagram on the left. Each curve (enclosing a striped area) represents the relative strength of response of the nerve cell for sounds presented at different positions around the animal's head. (Based on results of A.J. King and M.E. Hutchings[13].)
What value could there be in having these unitary nerve cells arranged in topographic order? Perhaps it has something to do with the fact that most of the connections in the brain are short, local fibres, such that cells are either excited or inhibited by their close neighbours. The nature of any computation accomplished by such local connections will obviously depend on the relationship between the properties of neighbouring nerve cells.
Consider the kind of processing that is known to go on in the primary visual cortex. Individual nerve cells, in a single column, respond to a line of a particular orientation. This very property of orientation selectivity seems to depend, at least in part, on short-range inhibitory connections from cells in neighbouring columns (which respond best to other orientations). Administration of a drug that blocks inhibition in the cortex makes cortical cells partly or completely lose their selectivity for the angle of a line[15]. Some cortical cells respond best to an edge of a particular angle that moves across a region of the visual field and they may achieve this property by receiving excitatory connections from a number of nearby cells in the same column, which all prefer lines of a particular angle, but each at a slightly different position in the field.

Of course, interconnected elements in the brain, just as in a computer, need not necessarily lie close to each other. One could imagine a brain in which all the cells are redistributed randomly, but maintaining all the same connections. Would it not work just as well? In at least one respect it would not. Nerve fibres conduct impulses at very slow speeds (between about 0.5 and 100 m/s), compared with the speed of transmission in electronic circuitry. So, the inevitable lengthening of nerve fibres introduced by scrambling a brain would severely slow down the operations of the brain and would introduce differences and errors of timing between the various inputs to each cell, which might seriously interfere with the process it performs.

Another advantage of topography is that it simplifies the problem of getting some other input into register with the sensory array. Imagine a visual area in the cortex that receives not only an input of fibres carrying the visual messages from the eyes forming a map of the visual field but also a much coarser array of 'activating' fibres that adjust the excitability of nerve cells so as to turn visual attention on or off selectively in different parts of the visual field. (Actually this is not wild speculation: almost certainly there are such activating inputs.) As long as there is a map, any other spatially distributed input to the area will have its influence topologically patterned in relation to the coordinates of the map. If there were a non-isomorphic map in which all red objects (whatever their position in the visual field) were represented in one sub-division of the map (or all birds, all mice, all human faces, etc.), then a set of 'activating' fibres distributed topographically across such a map could confine attention to one sub-division of experience.
Whatever the mechanism for storing memories, that too might depend on a system of 'this-is-worth-remembering fibres' distributed across the maps of sensory experience.

Finally there is the overriding problem of economy in the specification of instructions for building the brain. The chromosomes of a human being,
which have to be sufficient to specify the structure of the entire body, may contain as few as 50,000 functional genes. Some of the genetic instructions for building the brain may, however, be very simple, such as 'make all nerve cells inhibit their immediate neighbours on either side', or 'link together nearby cells that tend to fire off at the same time'. The theoretical modellers of brain function have shown that such simple rules for local interaction replicate certain properties of real nerve cells (such as selectivity for elongated lines at particular orientations) but only if the input to the network is topographically arranged. Indeed it is a very attractive notion that the basic circuitry of each local 'module' of cortex might be identical, or at least very similar, in every cortical region. The very different functions of different cortical areas may depend mainly on the nature of the inputs received and their topographical distributions. If this is true, the intrinsic circuitry of the whole cerebral cortex may be specified by a surprisingly small number of genes.
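The two example 'rules' just quoted amount to lateral inhibition and Hebbian strengthening of co-active neighbours. A minimal sketch of both, applied to a 1-D topographic input (all parameters are illustrative assumptions, not a model of any real circuit), shows how such cheap local instructions already yield useful selectivity, in this case enhancement of an edge, provided the input is topographically arranged:

import numpy as np

def lateral_inhibition(activity, strength=0.4):
    """Rule 1: every cell inhibits its immediate neighbours on either side."""
    padded = np.pad(activity, 1, mode="edge")
    return np.clip(activity - strength * (padded[:-2] + padded[2:]), 0.0, None)

def hebbian_update(weights, activity, lr=0.01, radius=1):
    """Rule 2: strengthen links between nearby cells that fire at the same time."""
    n = len(activity)
    for i in range(n):
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            if i != j:
                weights[i, j] += lr * activity[i] * activity[j]
    return weights

# Toy topographic input: a step edge (a bright region next to a dark region).
rng = np.random.default_rng(0)
x = np.concatenate([np.ones(10), np.zeros(10)]) + 0.05 * rng.standard_normal(20)
y = lateral_inhibition(x)
W = hebbian_update(np.zeros((20, 20)), y)
print(np.round(y, 2))   # the cell just inside the bright region, next to the edge, stays most active
print(np.unravel_index(W.argmax(), W.shape))   # the strongest learned link joins the neighbours at the edge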
Meaning from maps?

Horace Barlow[10] has already argued that the value of maps may be to bring together new associations of activity that can reveal interesting properties of the original image. The idea that the existence of a map in a brain structure allows local circuitry to discover useful relationships between the activity of neighbouring nerve cells may resolve a recent puzzling finding. The somatic sensory area of the cerebral cortex of a fruit bat, the flying fox, has a map of the body surface, like that in other mammals, but with one surprising difference. Figure 17.12 shows, on the left, the typical arrangement, in the somatic sensory cortex of the rat. Both hindlimbs and forelimbs are represented at the front of the strip of cortex, while the animal's back is represented behind. The map in the bat (shown on the right) has the representations of the head, hind limb and back disposed much as in other mammals, but the forelimb (wing) representation lies at the back of the strip. Mike Calford and his colleagues[16], who discovered this odd pattern, pointed out that the disposition of the parts of the body in the map mimics the posture of this animal, which habitually hangs with its wings folded over its back. They inferred that the very existence of this topographic arrangement may have meaning for the animal in terms of its body posture. But how could this be without a Cartesian inner eye to look at it? An alternative possibility is that the arrangement allows local interactions between signals from parts of the body surface that usually lie close to each other and are therefore likely to be touched simultaneously or sequentially. Imagine, for instance, a branch brushing against the bat as it hangs in a tree.
Fig. 17.12. There are certain fundamental similarities in the nature of the map of the body in the somatic sensory cortex in virtually all mammals so far studied. The somatic cortex lies in the middle of the cerebral hemisphere, with the head represented at the bottom of the strip (see Fig. 17.7). The diagram on the left shows the typical arrangement of the body map on the left side of the hemispheres of a rat. The representations of the forelimbs and the hindlimbs point forwards. In the fruit bat, however, Mike Calford and his colleagues[16] found that there is a small but significant variation on this general theme. As shown on the right, the representation of the forelimb (the wing) points backwards, close to the representation of the animal's back. In other words, the map reflects the habitual posture of the bat as it roosts, hanging with its wings folded over its back.
It is likely to touch both the fur of the animal's back and its wing successively. Perhaps local interconnection between the closely spaced regions of cortex representing these two regions of skin allows some nerve cells to detect and respond selectively to such a stimulus.

Multiple sensory maps

I have already mentioned that each primary sensory area is surrounded by additional areas, in which the same basic sensory information may be re-cast on different spatial coordinates. For instance, in the auditory region of the cortex of the moustached bat
(which uses echo-location to guide its flight and hunting), the primary area has a simple (isomorphic) map of tone frequency, with a much enlarged representation (an auditory 'fovea') for frequencies around the strongest harmonic component of the bat's own echolocating cries. This area is flanked by others in which important functional characteristics of the sound are mapped out. One consists of nerve cells that respond only when the locating cry is followed by a delayed echo, and they are arranged in a map in which the required delay (in other words the required distance of a target) varies across the cortex. Another area contains cells that respond preferentially to the first harmonic of the emitted cry plus another pitch, slightly shifted from either the second or the third harmonic. These cells would respond best as the bat approaches a target which reflects an echo that has all its harmonic components Doppler-shifted to a slightly higher frequency (like the siren of a police car as it approaches you). In this area there are two maps, each with one axis corresponding to a small range of frequency around one of the higher harmonics. This arrangement results in the velocity of approach being represented systematically along each sub-area[17].

The monkey's cerebral cortex is now known to be filled with a profusion of sensory areas. Figure 17.13 gives a 1985 version of the cerebral atlas of the rhesus monkey[19] (though new areas are being discovered all the time). The many separate visual areas contain complete or partial representations of the visual field but the assumption is that each of them is serving a special purpose in the interpretation of the visual image. Semir Zeki at University College London has evidence that one of them, the fourth visual area (V4), is especially interested in the colour of visual stimuli[18], and another one (V5 or 'MT'), described by Tony Movshon in Chapter 8, is concerned with the analysis of movement in the visual scene.
Interestingly, the number of such additional cortical areas appears to have increased up the evolutionary scale. A monkey may have as many as twenty extra visual areas, whereas a mouse probably has only two. Indeed, the ballooning expansion of the cerebral hemispheres that has occurred during mammalian evolution may have been achieved largely by the addition of more and more sensory areas, each with its own map of a particular sensory property or properties. The conventional wisdom of comparative psychology is that the intelligence of animals (i.e. the richness and sophistication of their behavioural repertoires) is correlated with the size of their cerebral hemispheres. Could it be that each additional sensory area gives a quantal increment of understanding of the world? It is surely significant that visual areas occupy more than half of the entire surface of the cerebral hemispheres of a monkey.
Fig. 17.13. One of the most exciting discoveries of the last few years is that there are many different sensory and motor areas in the cerebral cortex. In the monkey brain, illustrated here, a large number of visual areas fill up more than half of the entire surface of the cerebral hemispheres. The small diagram at the top left is a view of the right side of the cerebral hemispheres of a monkey, showing the major folds and creases, with colours representing the various areas visible on the surface. The rest of this diagram is an imaginary unfolded view of the entire cortical surface, revealing many other areas that are buried out of sight in the clefts of the cortex. Motor areas appear grey, somatic sensory areas blue and auditory areas green. All the other colours, shades of red and yellow, represent individual areas concerned with vision. Some of these are thought to be specialized for the analysis of particular aspects of the visual scene - motion in the area labelled MT, form and colour in V4[18]. (Kindly supplied by David Van Essen[19].)
For we now know that, although vision feels as if it is a simple and effortless process, it is extraordinarily complicated in computational terms. Writing computer programs to do things that we think of as being intellectually demanding, such as playing chess or doing advanced calculus, has proved relatively simple compared with the monstrous problem of making a machine that can see.
The ultimate enigma of perception

After all of this, how do we stand with Wittgenstein? There are topographically arranged sensory areas in the brain. Some of them have spatial coordinates that bear no simple relationship to the pattern of sensory receptors. In these terms, the brain does contain images of the outside world and it seems likely that these images are of functional value in the brain's task of analysing sensory signals. With deference to Wittgenstein, is it really misleading to call such things maps?

We are making slow but steady progress towards an account of how the structure and organization of the brain allow the owner of that brain to respond appropriately to the visual world. One day we might even be able to build a machine that simulates vision by having the same kind of structure as the brain. But will it see? Will it have conscious awareness of its visual world? According to Wittgenstein, it is appropriate to use the language of the mental only when talking of people or of other animals (or even, presumably, machines) whose behaviour is indistinguishable from that of human beings. As far as the private world of conscious perception is concerned, this argument is flawless. Indeed one could go further: how can one be truly confident about the nature of the perceptual experiences of any creature but oneself? The nature of perceptual experience is rightly a question for philosophy (see Chapter 22 by Nelson Goodman). There is every reason for philosophers to be interested in such classical conundrums as 'when I say that I see red and you say that you see red, are we having the same experience?'; and 'is "pain" a property of noxious stimuli or a creation of the conscious mind?' But the mechanisms by which a creature understands the world around it (i.e. responds appropriately to things and events) is surely a matter for empirical investigation. The brain is a machine that accomplishes behaviour. The challenge faced by brain research is to provide a plausible account, backed by experimental evidence, of the workings of this machine.

It should be no surprise to find that science - especially a branch of science that is rapidly throwing up new discoveries - runs out of words. Everyday language is an invention to deal with everyday events. It works well as a means of communication between people who share a common view of the world and of their own conscious experiences and motives. It is, in its simplest form, at the level of individual words, a convention of categorical descriptions. Therefore language, even everyday language, has grown and changed as new views about the world have developed and as the need for new categories has arisen. The thing I am sitting/kneeling on, as I write this, is neither a chair nor a stool. It is one of those new-fangled 'contraptions for sitting/kneeling on', which is supposed to be good for your back. I need a new word for it but I carry on calling it a chair, until a better noun comes along.
Science is constantly generating new categories and new notions, beyond previous experience. Sometimes it invents neologisms for them (supernova, protoplasm, quark, molecule, entropy, isotope, gene); sometimes it sticks to everyday words, beating them into new meanings (Black Hole, mass, nucleus, power, cell, current, inertia). Is it really a greater conceptual confusion for brain researchers to call the distribution of activity in the visual cortex a 'map' than for me to call the thing that I am sitting/kneeling on a 'chair' (or to call the black birds that swim on the Swan River in Western Australia 'swans')?

One of the most profoundly enigmatic aspects of science is that it often has to use everyday language to formulate questions and concepts concerning a world beyond everyday experience. It is true that some areas of intellectual effort (such as mathematics, logic and music) have devised new systems of notation because everyday language has proved inadequate as a medium of communication of questions and ideas in these disciplines. But most areas of science stumble along, using ordinary language to 'bootstrap' themselves up to new concepts. Nowhere is the problem of language greater than in brain research, but the difficulty is not so much a deep conceptual confusion as an inadequacy of vocabulary and notation. Ultimately, it should be the objective of brain research to provide an account of the function of the nervous system that even includes the mechanism of consciousness. But that is, I fear, a long way off (not least because the everyday language of consciousness gives little clue as to how to formulate questions about its mechanism). In the meantime, it would be a major achievement to give an empirical description of the mechanisms in the brain that allow animals and people to respond as if they understand the visual world.
Notes to Chapter 17

1. Wittgenstein, L. (1953), Philosophical Investigations. Basil Blackwell, Oxford.
2. Kenny, A. (1984), The Legacy of Wittgenstein. Basil Blackwell, Oxford.
3. Descartes, R. (1637), La Dioptrique (see Philosophical Writings, transl. and ed. E. Anscombe & P.T. Geach (1954), London, Nelson).
4. Descartes, R. (1649), Les Passions de l'Ame (see The Philosophical Works of Descartes I, transl. & ed. E.S. Haldane & G.T.R. Ross (1911), Cambridge University Press, Cambridge).
5. The minutes of the (June 1986) meeting of the steering committee for the SDF Symposium on Computational Neuroscience were prepared and circulated by Professor Eric Schwartz of the New York University Medical Center in September 1986.
6. Young, J.Z. (1978), Programs of the Brain. Oxford University Press, Oxford.
7. Hacker, P. (1987). Languages, brains and minds, in Mindwaves, ed. C. Blakemore & S. Greenfield. Basil Blackwell, Oxford.
8. Woolsey, T.A. & Van der Loos, H. (1970). The structural organization of layer IV in the somatosensory region (SI) of mouse cerebral cortex. The description of a cortical field composed of discrete cytoarchitectonic units. Brain Res., 17, 205-42.
9. Kossut, M. & Hand, P. (1984). Early development of changes in representation of vibrissae following neonatal denervation of surrounding vibrissae receptors; a 2-deoxyglucose study in the rat. Neuroscience Letters, 46, 7-12.
10. Barlow, H.B. (1981). Critical limiting factors in the design of the eye and visual cortex. The Ferrier lecture 1980. Proc. Roy. Soc. B 212, 1-34. Barlow, H.B. (1986). Why have multiple cortical areas? Vision Research, 26, 81-90.
11. Hubel, D.H. & Wiesel, T.N. (1977). Functional architecture of macaque monkey visual cortex. Proc. Roy. Soc. B 198, 1-59.
12. Albus, K. (1975). A quantitative study of the projection area of the central and the paracentral visual field in area 17 of the cat. II. The spatial organization of the orientation domain. Experimental Brain Research, 24, 181-202.
13. Harris, L.R., Blakemore, C. & Donaghy, M.J. (1980). Integration of visual and auditory space in the mammalian superior colliculus. Nature, 288, 56-9. King, A.J. & Hutchings, M.E. (1987). Spatial response properties of acoustically responsive neurons in the superior colliculus of the ferret: a map of auditory space. J. Neurophysiol., 57, 596-624.
14. King, A.J., Hutchings, M.E., Moore, D.R. & Blakemore, C. (1987). Developmental plasticity in the representations of visual and auditory space. Nature, 332, 73-6.
15. Rose, D. & Blakemore, C. (1974). Effects of bicuculline on functions of inhibition in visual cortex. Nature, 249, 375-7. Sillito, A.M. (1984). Functional considerations of the operation of GABAergic inhibitory processes in the visual cortex. In The Cerebral Cortex, vol. 2A, ed. A. Peters & E.G. Jones. Plenum Publishing Corporation, New York, pp. 91-117.
16. Calford, M.B., Graydon, M.L., Huerta, M.F., Kaas, J.H. & Pettigrew, J.D. (1985). A variant of the mammalian somatotopic map in a bat. Nature, 313, 477-9.
17. Suga, N., Kujirai, K. & O'Neill, W.E. (1981). How biosonar information is represented in the bat cerebral cortex. In Neuronal Mechanisms of Hearing, ed. J. Syka & L. Aitkin. Plenum Publishing Corporation, New York.
18. Zeki, S. (1983). Colour coding in the cerebral cortex: the reaction of cells in monkey visual cortex to wavelengths and colours. Neuroscience, 9, 741-65.
19. Van Essen, D.C. (1985). Functional organization of primate visual cortex. In Cerebral Cortex, vol. 3, ed. A. Peters & E.G. Jones. Plenum Publishing Corporation, New York.
Chapter 2
BIOLOGICAL CONCEPTS AND METHODS; COMPUTATIONAL GOALS AND MEANS
This chapter introduces notions that a physicist, or any other outsider, is advised to learn about, before entering the field of biocomputation, if he wishes to avoid looking like a Boeotian or a tourist. In contrast with the previous chapter, where half of the authors' names would be familiar to the average physicist, the authors in this chapter may well be unknown to him. But the newcomer should soon recognise in the late Canadian psychologist Hebb a major figure. Together with the two American psychologists Miller and Shepard, he contributed to the demise of behaviourism that had dominated psychology in the North American continent, and to the advent of the so-called "cognitive revolution" in the fifties and the sixties. The notion of mental representations, the suggestion that they might be embodied in the brain as cell assemblies, and the psychophysical experiments on mental images, constitute some of the main foundations of cognitive sciences. These foundations are evoked in section 2a, along with some recent investigations of their neurophysiological substrate. Section 2b is devoted to information theory as well as to an evaluation of its strong impact on psychology and brain theory. Quantitative estimates of sensory information flow and of the span of short-term memory, and emphasis put on the attention bottleneck and on the need for redundancy reduction, belong to this scientific tradition originating in the forties. Miller's "Magical number 7-plus-or-minus-2" and Barlow's "Cardinal cells"
are so often quoted and misquoted, that it is both a surprise and a pleasure to rediscover their seminal and masterful papers. The theoretical tension between Hebb's cell assembly and Barlow's sparse coding is still a major issue in neurocomputation. "Reading the structure of brains", as well as inferring the "how" and "why" of brain areas from neuroanatomy, is a challenging exercise evoked in section 2c. It requires qualities of "esprit de finesse", unfamiliar to physicists. But this synthetic approach is essential to brain theory.

In brief, the three articles of section 2d are mainly united by their differences. They illustrate quasi-orthogonal viewpoints. The immunologist Jerne presents the old debate between the concepts of selection and instruction — a tension pervasive in biological thought — with a sharpness that has been too often missing in the follow-up literature. The psychobiologist Julesz, creator of the random-dot stereograms, reviews early vision, pre-attentive and attentive. The neuropsychologist Weiskrantz is representative of the ancient branch of neuropsychiatry, that has been drawing precious information from lesions of the human brain.
2a Mental Representations
The mental representation of a percept or a concept in the brain is a central theme of cognitive science which will recur time and again throughout this volume. The three texts of this section report on three complementary fronts of research: psychology, psychophysics and neuroscience. The first paper (15) by Hebb focuses on imagery, i.e. the production or processing of mental images without sensory stimuli, which is among all thought processes the one that appears to be most easily accessible to scientific investigations. This paper has been chosen because it was written shortly before the landmark experiments of Shepard and Metzler (article (16)); it illustrates beautifully how far psychology was able to go, using a mixture of scattered observations and educated intuitions. Some twenty years earlier (in 1949), Donald Hebb had written a book, The Organisation of Behaviour, that has since remained a classic, and that introduced two essential ideas for the field of neurocomputation: the notion that a percept or a concept is represented in the brain by an assembly of cells firing together, and the so-called Hebb learning rule for synaptic modification. These prophetic paragraphs have been so often reproduced elsewhere that we did not feel it useful to do it again here. Article (15) offers a distinction between sensation and perception, and it emphasises the importance of the motor component in perception. Moreover, this article presents an additional value, in that it tries to relate Hebb's early intuitive ideas about cell assemblies to the then recent (in the sixties) neurophysiological findings of Hubel and Wiesel. The reader will discover many intriguing reports and speculations about phantom limbs, hallucinations, eidetic and hypnagogic imagery, and will also enjoy the free style of a great scientist.

In contrast, the letter from Shepard and Metzler (article (16)) may appear as a dry technical report on reaction time measurements. But its method and findings played a key role in proving beyond all doubt that scientific investigation of mental phenomena was possible, and this paper is at the origin of the "mental rotation paradigm" in cognitive psychology.
166
T h e basic finding is that the time needed to recognise that two pictures are rotated aspects of the same object, is proportional to the angle of rotation. W h a t may come as a surprise is the fact that this recognition time does not depend on the orientation of the axis of rotation in space. Intuitively one could expect that mental detection of a rotation within the picture plane should be easier. T h e third paper (17), written about twenty years later, reports on electrode recordings from the motor cortex of monkeys which were trained to perform a specific task. Somehow, it bridges the gap between Hebb's intuitions about cell assemblies and Shepard-Metzler's psychophysical measurements on mental rotations. It shows that the cognitive operation of mental rotation may be represented by a varying activity pattern of a population of directionally tuned neurons. T h e weighted sum of their preferred directions defines the "neural population vector", which rotates through the shortest angle from the initial to the final position in the task.
Reprinted with permission from Psychological Review, Vol. 75, No. 6, pp. 466-477, 1968. © 1968 American Psychological Association.
CONCERNING IMAGERY[1]

D. O. HEBB
McGill University, Montreal, Canada

An attempt is made to analyze imagery in physiological terms. It is proposed (a) that eye movement has an organizing function, (b) that 1st-order cell assemblies are the basis of vivid specific imagery, and (c) that higher-order assemblies are the basis of less specific imagery and nonrepresentational conceptual processes. Eidetic images, hallucinations, and hypnagogic imagery are compared with the memory image, and certain peculiarities of the memory image are discussed.

This paper concerns the content and mechanisms of imagery. The topic has received only sporadic attention, partly because of the positivistic temper of modern psychology and partly, one may suppose, because of the difficulties of dealing with thought processes in general. I propose to see what sort of analytical treatment can be made of the image and, equally, of its relation to sensation, perception, and thought. The occasion for such treatment is mainly my interest in thought—one can hardly turn round in this area without bumping into the image—but also the recent work on the place of imagery in paired-associate learning (Paivio, in press) and the convincing demonstrations of eidetic imagery made by Haber and his colleagues (Haber & Haber, 1964; Leask, Haber, & Haber[2]). I have also in mind the hallucinatory activity reported in conditions of monotony, perceptual isolation, and loss of sleep (Bexton, Heron, & Scott, 1954; Melvill Jones, cited by Hebb, 1960, p. 741; Malmo & Surwillo, 1960; Morris, Williams, & Lubin, 1960; Mosely, 1953).

[1] Preparation of this paper was supported by the Defence Research Board of Canada, Grant No. 9401-11.
[2] J. Leask, R. N. Haber, and R. B. Haber. Eidetic imagery in children: II. Longitudinal and experimental results, in preparation.
THE PLACE OF IMAGERY IN OBJECTIVE PSYCHOLOGY
Let me first dispose of what seems to be a misconception, that reporting imagery, or describing it, is necessarily introspective. The point has been made elsewhere (Hebb, 1966) but I repeat it here for those not addicted to introductory textbooks. An excellent example to begin with is the phantom limb, which is clearly a case of somesthetic imagery. After an arm or leg has been amputated there is, apparently in every case (Simmel, 1956), a hallucinatory awareness of the part that has been cut off. In some 10-15% of the cases the patient also reports pain, the fingers or toes being curled up with cramp. Is this a report of introspection? The argument might be: The pain is in the right hand, but the patient has no right hand; so the pain is really in his mind; so he is describing his mental processes, which is introspection: "looking inward." But the argument is faulty. We are still dealing with a mechanism of response to the environment, though the mechanism (because a part is missing) is now functioning abnormally. Figure 1 represents a right hand connected schematically with brain and speech organs, before amputation. When the fingers are burnt or cramped the subject (S) says "Ouch" or "My hand hurts." This is a normal mode of
response to the environment, involving (a) sensory input, (b) excitation of the central processes of perception and consciousness, and (c) motor output determined by the central activity. It is obvious that in such reactivity—when I burn my fingers and say "Ouch"—no question of looking inward arises. My verbal response is no more dependent on introspection than a dog's yelp when his tail is trod on.

The same conclusion holds after an amputation. No excitation can originate in the missing hand, but the same excitation in principle can arise higher in the pathway by spontaneous firing of the neurons at level X in Figure 1. If S now reports pain in his imaginary or imagined hand we are not dealing with any different mechanism, in brain function, than when a normal S reports pain. Report of "sensation" from a phantom limb is not introspective report.

The ordinary memory image can be understood in much the same way. The central processes here may be excited associatively (i.e., the cell assemblies are excited by other assemblies instead of spontaneously firing afferent neurons), but in both cases we are dealing with a short circuiting of a sensory-perceptual-motor pathway. The S on holiday, seeing the ocean for the first time, remarks on the size of the breakers; reminded of the scene later he may say, "I can still see those waves." Though there is now no sensory input, the same central process, more or less, is exciting the same motor response—more or less. (What the differences may be we will consider later.) It is the same outward-looking mechanism that is operative, not introspection. At least, it is not introspection in the sense of a special inward-looking mechanism of self-knowledge.
Fig. 1. To illustrate the relation between normal sensation and the phantom limb.
Anyone may define the term to suit himself, and may use it when reports of private events such as endogenous pain and imagery are in question. My point is that such report does not transcend the rules of objective psychology, in which mental processes are examined by inference and not by direct observation. The primary basis of inference is the relation of overt behavior to present and past stimulation, but there is also a basis of inference about oneself from the appearance of the external world: I may, for example, conclude that I am color-blind if surfaces that others call green and red look alike to me. I also make inferences about the functioning of my visual system when I observe positive and negative afterimages though my eyes are shut.

It is important to say also, with regard to a report of imagery, that one is not describing the image but the apparent object. This becomes clear if one observes the apparent locus of what one is describing. One does not perceive one's perceptions, nor describe them; one describes the object that is perceived, from which one may draw
inferences about the nature of the perceptual process. In the case of imagery, one knows that the apparent object does not exist, and so it is natural to think that it must be the image that one perceives and describes, but this is unwarranted. The mechanism of imagery is an aberrant mechanism of exteroception, not a form of looking inward to observe the operations of the mind. So understood, the description of an imagined object has a legitimate place in the data of objective psychology.

WORKING DEFINITIONS

In what follows it will be necessary to distinguish between sensation and perception, without supposing that there is a sharp separation between them. The distinction is based primarily on physiological considerations but the psychological evidence is in agreement. Sensation is defined here as the activity of receptors and the resulting activity of the afferent pathway up to and including the cortical sensory area; perception as the central (cortico-diencephalic) activity that is directly excited by sensation as defined. For the purposes of this analysis, then, sensation is a linear input to sensory cortex, perception the reentrant or reverberatory activity of cell-assemblies lying in association cortex and related structures.

The term perception itself has two meanings in ordinary usage. Which of the two is intended is usually clear from the context, but when necessary I distinguish perceiving, as the process of arriving at a "perception," from a percept, the end product, the brain process that is the cognition or awareness of the object perceived. Except with very familiar objects, perceiving is not a one-stage, single-shot affair. It usually involves (a) a sensory event; (b) a motor output, the adjustment of eye, head, or hand to see, hear, or feel better; (c) the resulting feedback; (d) further motor output, further feedback, and so on. As we will see later, this is not a trivial point but must affect our understanding both of percept and of image.

Physiologically there is a discontinuity in the mode of operation of the afferent pathway to the sensory cortical area and the structures that lie beyond. The afferent transmission is highly reliable, whereas cortico-cortical transmission, the higher activity that includes perception as defined, occurs only in favorable circumstances. An evoked potential in sensory cortex is obtainable in coma or under anesthesia, but any transmission past this point is not sufficient to break up the synchronous EEG activity. Thus "anesthesia," meaning literally a lack of sensation, is a misnomer; we are dealing instead with a failure of transmission at a higher level.
As Teuber (1960) has pointed out, perception cannot be identified with an activity of sensory cortex, so the physiological basis of a distinction between sensation and perception is clear. Sensory systems are organized with fibers in parallel, providing for lateral summation and hence reliability of transmission at each synaptic junction. The divergent course of fibers from sensory cortex onward lacks this feature, and transmission here requires supporting facilitation from the brainstem arousal system, which is absent under anesthesia. The selectivity of response to sensory stimulation even in the normal conscious state strongly indicates that supporting facilitation is also needed from the concurrent cortical activity; except when there is a sharp increase of arousal, due to pain stimulation or certain unfamiliar events, we "notice," perceive, or respond to only those events in the
normal environment that are related to what we are thinking about at the moment. Finally, another relevance of the distinction between sensation and perception from a psychological point of view is the fact that different sensations or sensory patterns can give rise to the same perception, as in the perceptual constancies; and the fact that the same sensory pattern can give rise to quite different perceptions, as in the ambiguous figure (even with fixation of gaze). In this latter case, the only explanation that has been given is that different cell assemblies are excited by the input at different times.
THE PATTERN OF ACTIVITY

Both the ordinary memory image and the eidetic image arise from perception. As we will see, this does not mean that the memory image is identical with perception (though eidetic imagery may be), but it does have implications that have not been recognized. The percept of any but the simplest object cannot be regarded as a static pattern of activity isomorphic with the perceived object but must be a sequentially organized or temporal pattern. The same statement, it seems, applies to the memory image. This has been well established for the image of printed verbal material (Woodworth, 1938, p. 42, citing Binet and Fernald). The S with good visual imagery may be asked to form an image of a familiar word of medium length ("establish" or "material" would be suitable). When he has done so, he is asked to read off the letters backward; or if S is one who reports that when he has memorized verse he can see the words on the page, he may be asked to recall a particular stanza and then to read the last words of each line going from bottom to top.

With the printed word before him, spelling the word backward can be done nearly as quickly as forward, but this is not true of the image, and the S who tries such a task for the first time is apt to be surprised at what he finds. There is a sequential left-to-right organization of the parts within the apparently unitary presentation, corresponding to the order of presentation in perception as one reads English from left to right and from top to bottom.
Something of the same sort applies with imagery of nonliteral material, though now the order of "seeing" or reporting is less rigid. If the reader will form an image of some familiar object such as a car or a rowboat he will find that its different parts are not clear all at once but successively, as if his gaze in looking at an actual car shifted from fender to trunk to windshield to rear door to windshield, and so on. This freedom in seeing any part at will may make one feel that all is simultaneously given: that the figure of speech of an image, a picture "before the mind's eye," in the old phrase, does not misrepresent the actual situation. But Binet (1903) drew attention to a surprising incompleteness in certain cases of imagery, which suggests a different conception. Let us consider the question more closely.
First, consider the actual mechanics of perceiving a complex visual object that is not completely strange but not so familiar that it can be fully perceived at a glance. Figure 2 represents a slightly off-beat squirrel or chipmunk. The eye movements made in perceiving it must vary, but assume that there are four points of fixation, A, B, C, and D. After fixating these points, perhaps repeatedly, the object is perceived with clarity: one percept is arrived at. But how are the separate visual impressions integrated?
Fig. 2. To illustrate the role of eye movement in perception and imagery. (A, B, C, D, fixation points.)

We must take account of the fact that these four part-perceptions are all made in central vision, more or less on top of each other, though they are separated in time by eye movements. Each of the four is an excitation of a small group of cell-assemblies, which I will call for the moment Activity A, Activity B, and so on. These activities must take place in the same tissues, more or less intertwined. Activity A is separated from a following Activity B by an eye movement to the right and slightly downward; if Activity D occurs next, it is preceded by an eye movement downward and to the left; and so on. These movements are mechanically necessary in scanning the object, but they may have a further role. In other words, the motor process may have an organizing function in the percept itself and in imagery. The image is a reinstatement of the perceptual activity, but consider the result if all four of the separate part-perceptions were reinstated at the same time. The effect must be the same as if, in perception, one saw four copies of Figure 2 superimposed to make Points A, B, C, and D coincide, in a mishmash of lines. Instead, Activities A, B, C, and D must be reinstated one at a time, the transition from one to the next mediated by a motor activity corresponding to the appropriate eye movement.

When looking at the actual object each part-perception is accompanied by three motor excitations (assuming these four fixation points) produced by peripheral stimulation. One of them becomes liminal and the result is eye movement followed by another part-perception. If the image is a reinstatement of the perceptual process it should include the eye movements (and in fact usually does); and if we can assume that the motor activity, implicit or overt, plays an active part we have an explanation of the way in which the part-images are integrated sequentially. In short, a part-image does not excite another directly, but excites the motor system, which in turn excites the next part-image. That there is an essential motor component in both perception and imagery was proposed earlier (Hebb, 1949, pp. 34-37) with some informal supporting observations that as far as I can discover are still valid. It is easy to form a clear image of a triangle or a circle when eye movement is made freely (not necessarily following the contours of the imagined figure), harder to do with fixation of gaze while imagining the eye movement, but impossible if one attempts to imagine the figure as being seen with fixation on one point. Though such informal evidence cannot carry great weight, it does agree with the idea that the motor accompaniments of imagery are not adventitious but essential.
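Hebb's proposal can be caricatured as a small state machine: a part-image does not call up the next part-image directly, but calls up an eye-movement command, and it is that command which brings up the next part-image. In the toy sketch below, the first two transitions are the ones described for Figure 2; the remaining movements and the cycle order are illustrative assumptions.

# part-percept -> (eye movement that follows it, part-percept that the movement brings up)
imagery_chain = {
    "A": ("right and slightly downward", "B"),   # as described in the text
    "B": ("downward and to the left", "D"),      # as described in the text
    "D": ("upward and to the left", "C"),        # assumed for illustration
    "C": ("to the right and upward", "A"),       # assumed for illustration
}

def reinstate_image(start="A", steps=4):
    """Reinstate the part-images one at a time, each transition mediated by a movement."""
    percept = start
    trace = [percept]
    for _ in range(steps):
        movement, percept = imagery_chain[percept]
        trace.append(f"--({movement})--> {percept}")
    return " ".join(trace)

print(reinstate_image())
# A --(right and slightly downward)--> B --(downward and to the left)--> D ...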
ABSTRACTION AND HIGHER-ORDER ASSEMBLIES
One of the classical problems of imagery is the generality of the image. Another is its relation to abstract thought. A hypothetical clarification of such questions emerges from a consideration of the relation of secondary and higher-order assemblies to primary ones. There is a classical view going back to Berkeley that an image must be of a specific object or situation and cannot have generalized reference, but Woodworth (1938, p. 43) cites an early paper by Koffka to the contrary, and it seems that the view is more a consequence of theory (regarding the image as reinstated sensation) than of observation. But how can an image be general, or abstract? Again, Binet (1903, p. 124) reports an opposition between thought and imagery. The image, a representative process, is by definition an element in thought: How can we understand such an opposition?

The present status of the theory of cell assemblies is paradoxical, since it has a way of leading to experiments that both support and disprove it. An impressive confirmation from the phenomena of stabilized images (Pritchard, Heron, & Hebb, 1960) is matched by quite definite evidence, from the same set of experiments, that the theory is unsatisfactory as it stands (Hebb, 1963). Some of the difficulty for the treatment of perception becomes less with the proposal of Good (1965) that an assembly must consist of a number of subassemblies that enter momentarily now into one assembly, now into another. The assembly itself need no longer be thought of as all-or-nothing in its activity. Fading, for example, may be a function of the density of subassemblies active in a given region, and a strong stimulation may excite all the subassemblies that are available for a
given assembly while a weaker stimulation excites fewer of them. A subassembly, conceivably, might be as small as one of Lorente de No's closed loops consisting of only two or three neurons.

It is, however, another aspect of the theory that concerns us now. This is the varying degree of directness of relation between a sensory stimulus and an assembly activity. The old idea that an image must always be of a specific object was the result of thinking (a) that an image is the reinstatement of a sensory-central process, and (b) that the central part of the process corresponds exactly to the sensory stimulation. The epochal work of Hubel and Wiesel (epochal certainly for understanding perception) shows that this may be true for some components of the central process but is not true for others. A "simple cell" in the cortex responds to a specific retinal stimulation, its receptive field permitting little variation; but "complex cells" respond to stimulation in any part of their larger receptive fields, upwards of half a degree of visual angle in extent (Hubel & Wiesel, 1968). A subassembly made up of simple cells, or controlled by them, will thus be representative of a very specific sensory event, but one made up of complex cells will incorporate in itself some degree of generalization or abstraction. Assembly activities accordingly may be more or less specific as perceptual or imaginal events.

The superordinate assembly (Hebb, 1949) takes the process of generalization or abstraction further. The primary or first-order assembly is one that is directly excited by sensory stimulation. The second-order assembly is made up of neurons and subassemblies that are excited, farther on in transmission, by a particular group of primary assemblies; the third-
third-order made up of those excited by second-order assemblies. The theoretical idea is very similar to what Hubel and Wiesel have demonstrated experimentally for simple, complex, and hypercomplex cells: simple cells being those on which a number of retinal cells converge, complex cells those on which simple cells converge, and hypercomplex those on which complex cells converge. From this it may be concluded that the first-order assembly is predominantly composed of or fired by simple cells.
An artificial example to make this specific: Let us say an infant has already developed assemblies for lines of different slope in his visual field. He is now exposed visually to a triangular object fastened to the side of his crib, so he sees it over and over again from a particular angle. Looking at it excites three primary assemblies, corresponding to the three sides. As these are excited together, a secondary assembly gradually develops, whose activity is perception of the object as a whole—but in that orientation only. If now he has a triangular block to play with, and sees it again and again from various angles, he will develop several secondary assemblies, for the perception of the triangle in its different orientations. Finally, taking this to its logical conclusion, when these various secondary assemblies are active together or in close sequence, a tertiary assembly is developed, whose activity is perception of the triangle as a triangle, regardless of its orientation.
Another example: The baby repeatedly exposed to the sight of mother's hand in a number of positions would develop subassembly and assembly activities corresponding to perceptions of parts of the hand, and then the whole hand, as seen in these varied orientations. As the hand is seen in motion, these assemblies would be made active in close sequence. Their combined effects, at a higher level in transmission, would be the basis for forming a higher-order assembly whose activity would be the perception of a hand irrespective of posture.
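[Editors' note: for readers who want a concrete, if crude, picture of the hierarchy sketched in the two examples above, the following toy Python fragment is an editorial illustration only, not Hebb's own formalism; the unit names are invented. A second-order unit here stands for a triangle in one particular orientation and fires when its three line assemblies are co-active, while a third-order unit abstracts over orientation.]

first_order = {"line_0deg", "line_60deg", "line_120deg",     # sides of the upright view
               "line_30deg", "line_90deg", "line_150deg"}    # sides of a rotated view

second_order = {
    "triangle_upright": {"line_0deg", "line_60deg", "line_120deg"},
    "triangle_rotated": {"line_30deg", "line_90deg", "line_150deg"},
}

def active_second_order(stimulated):
    # A second-order assembly fires when all of its first-order assemblies are co-active.
    return {name: sides <= stimulated for name, sides in second_order.items()}

def triangle_as_such(second_order_state):
    # The third-order assembly fires when ANY orientation-specific assembly fires.
    return any(second_order_state.values())

seen = {"line_30deg", "line_90deg", "line_150deg"}   # the rotated view of the block
assert seen <= first_order                           # the stimulus excites first-order assemblies
state = active_second_order(seen)
print(state)                     # only 'triangle_rotated' is active
print(triangle_as_such(state))   # True: recognized as a triangle in spite of the rotation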
How realistic is this proposal of complex processes developing further complex processes, in brain function? A heavy demand is made on the brain in the large number of neurons needed for what seems a simple perception. Two comments are in order. As Lashley (1950) observed, the same neuron may enter into different organizations, and many more "ideas" (considered as temporary organizations of neuron groups) are possible than the total number of neurons in the brain; an ideational element may be a phase in a constant flow of changed groupings of neurons. The second point is that there are limits on the process of elaboration. A secondary assembly may be the limit of capacity of the rat brain as far as triangles are concerned, for all the complexity of that small brain. The tertiary assembly, it seems, calls for a bigger brain. The rat is doubtfully capable of perceiving a triangle as a whole, even when it has a fixed orientation, and is not capable of recognizing a triangle when it is rotated from the position in which it was trained to respond to it. The young chimpanzee, however, or the 2-year-old child, recognizes the rotated figure easily (Gellerman, 1933).
Although with present knowledge such proposals must be made in very general terms, they are not unreasonably complex in the light of the anatomical and physiological evidence that is available, and they do offer an approach to the otherwise mysterious abstractions and generalizations of thought. Human thought consists of
abstraction piled on abstraction, of generalizations themselves based on generalizations, and if we are to accept the notion that thought is an activity of the brain we must explore the speculative possibilities of how this may occur. The actual perception of an object, following this line of thought, involves both primary and higher-order assemblies. The object is perceived both as a specific thing in a specific place with specific properties, and as generalized and abstracted from—but not all of this simultaneously. In imagery, only part of this activity may be reinstated. First-order assemblies, directly excited by sensation, must be an essential feature of perception, but need not be active when the excitation comes from other cortical processes. A memory image, that is, may consist only of second- and higher-order assemblies, without the first-order ones that would give it the completeness and vividness of perception.

EIDETIC IMAGERY

We are now in a position to consider a hypothesis of the nature of eidetic imagery. The eidetic image has been regarded with skepticism, I think, because as described it seems to have incompatible characteristics of both afterimage and memory image. Its occurrence only after stimulation, its transience, and its vividness in detail make it seem like an afterimage; but its apparent independence of eye movement, its failure to move as the eyes move, and the possibility of "looking at" its different parts, to see them with equal clarity, mean that it cannot be an afterimage. To the skeptic it sounds like an image that has got stuck to the viewing surface, which is unlikely to say the least. Part of the difficulty of understanding disappears if we first assume, with Allport (1928),
that the eidetic image is in the same class with the memory image, and if we then recognize that eye movement has a positive integrating role in memory images. Now the scanning of the viewing surface becomes intelligible: As the eidetiker changes fixation from one point to another the motor activity helps to reinstate the corresponding part-percept. It remains then to account for the detailed vividness of the eidetic image, and my hypothesis proposes, in short, that the eidetic image includes the activity of first-order cell assemblies that are characteristic of perception but absent from the memory image. The idea finds support in the observation of Leask, Haber, and Haber (see Footnote 2) that the eidetic image may be strictly monocular when formed with one eye open, disappearing when that eye is closed and the other opened. It was proposed above that the first-order assembly is composed of (or controlled by) first-order cells, most of which have monocular function (Hubel & Wiesel, 1968). Since the eidetic image occurs only for a brief period following stimulation, one thinks first of the hypothesis as meaning that there is some after-discharge in the first-order assemblies. But this would imply the continued activity of all of them at the same time, whereas—as we have seen—the activity must be sequential and motor-linked. The hypothesis instead must be that the eidetiker has first-order visual assemblies which for some reason remain more excitable, for a brief period following stimulation, than those of other Ss. It is possible, as Siipola and Hayden (1965) and Freides and Hayden (1966) have suggested, that the difference may be due to some slight brain damage. Other perceptual anomalies suggest, in turn, that one effect of brain damage may be to
impair the action of inhibitory neurons in the cortex: neurons whose function is to "turn off" one perceptual (or imaginal) process when it is replaced by another (Hebb, 1960, p. 743). This inhibitory function, however—assuming it exists—is not entirely absent in the Ss of Leask et al., in view of the ease with which the eidetic imagery could be prevented or disrupted. The disruptive effect of new stimulation (as S looks away from the viewing surface) is intelligible if the subassembly components of the first-order assemblies are now excited in new patterns. Leask et al. also report that "thinking of something else" interferes with the formation of the image, and that the same effect results from verbalization in the attempt to memorize the picture's content. The theoretical implication is that higher-order assembly activity tends to interfere with some lower-order activity, if not by direct inhibition then possibly because the higher-order assembly utilizes some of the components of the lower-order one and thus breaks up its organization.

The central fact in this area is Haber's brilliant experimental analysis of eidetic imagery. The present speculation does nothing to extend his results, though it may help to reduce skepticism (for example, in showing how S's eye movement may actively help in retrieving detail). Instead, his work serves here to provide solid experimental data whose import extends to a wider field in which trustworthy data are sparse indeed.
HALLUCINATION, HYPNAGOGIC IMAGE, AND MEMORY IMAGE
I wish therefore to conclude by taking account of hallucination and hypnagogic imagery, and relating them and the memory image to thought (or, properly, to other forms of thought).
The term hallucination is used here to include any spontaneous imagery that might be taken for a perception, even if S knows that he is not perceiving. A phantom leg, for example, is so convincing that the patient may not realize that the leg has been amputated; at this point it meets the criterion of hallucination in the narrower sense, since the patient is deceived, but it continues to have the same convincing character after the patient is informed of his loss and can see the stump of the limb. The nature of the process has not changed; if it was hallucination before, it is hallucination now. Similarly, the imagery of some Ss in perceptual isolation (Bexton, Heron, & Scott, 1954) was such that they would have thought they were looking at moving pictures if they had not known they were wearing the occluding goggles. This must be hallucination also.

It was proposed above that the memory image lacks vivid detail because it is aroused centrally instead of sensorily. Hallucinations have a central origin also, but their vividness is not inconsistent with the above conclusion. If the cause of hallucinations is spontaneous firing by cortical neurons, the spontaneous firing may occur in first-order as well as in higher-order assemblies. In its vividness, and in its implication of activity by first-order assemblies, the hallucination is like the eidetic image though it is at the opposite pole in its relation to sensory stimulation, since it seems to depend on a failure of sensation. In normal waking hours there is a constant modulating influence of sensory input upon cortical activities, helping both to excite cortical neurons and to determine the organization of their firing. When this influence is defective for any reason—pathological processes, or habituation resulting from monotony
475 or "sensory deprivation"—there is still cortical activity. Neurons fire spontaneously if not excited from without. T h e activity may be unorganized, and in the isolation experiments 5 s in fact were in a lethargic state much of the time, unable even to daydream effectively. B u t when by chance the spontaneous cortical firing falls into a "meaningful" pattern—when the active neurons include enough of those constituting a cell assembly to make the assembly active and so excite other assemblies in an organized pattern—S may find himself with bizarre thoughts or, if first-order assemblies are among those activated, with vivid detailed imagery. T h e hypnagogic image, like the eidetic image and unlike hallucination, is an aftereffect of stimulation but there may be a gap of hours, instead of seconds, between stimulation and the appearance of the imaginal activity. It is characteristic of the period before sleep, but on rare occasions may happen at other times. K . S. Lashley once said that after long hours at the microscope watching paramecia he found himself, as he left the laboratory, walking waist-deep through a flowing tide of paramecia (somewhat larger than life-size!) F o r myself, true hypnagogic imagery is of the same kind though it occurs only before sleep, and is quite different from the ordinary slight distortions of visual imagery at the onset of dreaming and sleep. I t depends on prolonged experience of an unaccustomed kind. A day in the woods or a day-long car trip after a sedentary winter sometimes has an extraordinarily vivid aftereffect. As I go to bed and shut my eyes—but not till then, though it may be hours since the conclusion of the special visual stimulation—a path through the bush or a winding highway begins to flow past me and continues to do so till
sleep intervenes. The scenes have a convincing realism, except in one respect. Fine detail is missing. I see bushes with leaves on them, for example, but the individual leaf or bush becomes amorphous as soon as I try to see that one clearly, at the same time that its surroundings in peripheral vision remain fully evident. The phenomenon must be very much like the eidetic image, except in its time properties and its lack of fine detail.
The memory image does not share the peculiarities of these other forms of imagery, but it may still be more peculiar than is generally recognized. We have already seen that it lacks detail, due apparently to an inefficiency of associative mechanisms of arousal. We must now observe that the memory image is typically incomplete in gross respects as well. It frequently lacks even major parts of the object or scene that is imagined—though if one looks for them they show up at once and so, unless the question is made explicit, one may have the impression that the whole was present all along. Binet's 14-year-old daughter had the advantage of not being psychologically trained and not realizing how improbable her reports would sound. Asked to consider the laundress, she reported seeing only the lady's head; if she saw anything else it was very imperfect and did not include the laundress's clothing or what she was doing. For a crystalline lens, she saw not the lens but the eye of her pet dog, with little of the head or the rest of the animal; and for a handle-bar, all the front part of her bicycle but missing the seat and the rear wheel (Binet, 1903, p. 126). To think of the memory image as the reinstatement of a single unified perceptual process makes such reports fantastic, but they are not at all fantastic when the image is regarded
as a serial reconstruction that may terminate before the whole perceptual process has been reinstated.

Incomplete imagery has a special relevance for ideas of the "self" (a mixture of fantasy and realism discussed elsewhere: Hebb, 1960). It is comprehensible of course that one can, with deliberate intention, imagine what one would look like from another point in the room—that is, one can have imagery of oneself as seen from an external point—but a less complete imagery may occur unintended and without recognition. Memory of floating in water commonly includes some visual imagery of water lapping about a face (if one recalls floating face up) or of wet hair about the back of the head (if face down). A long time ago I could introspect with ease and did so freely. Becoming aware that there were theoretical difficulties about introspection, I began to look at the process critically. Eventually I discovered to my astonishment that it included some imagery of a pair of eyes with the upper part of the face (my eyes and face) somehow embedded in the back of a head (my head) looking forward into the sort of gray cavern Ryle (1949) has talked about. Unfortunately this seemed so ridiculous that I rapidly lost my ability to introspect and now can no longer report on the imagery in detail. But such fantasy in one form or another may be a source of the common conviction that one's mental processes are open to inspection. The imagery is fleeting and unobtrusive and not likely to be reported even to oneself, being so inconsistent with one's ideas of what imagery is and how it works, but it may nonetheless be a significant determinant of thought.

The theoretical analysis earlier in this paper, in terms of lower- and higher-order assemblies, implies a
continuum from the very vivid imagery of hallucination through the less vivid memory image to the completely abstract conceptual activity that has nothing representational about it. (This includes of course auditory—especially verbal—imagery as well as somesthetic imagery, and it must be wrong to make a dichotomy between visual imagery and thought, or to identify abstract ideas with verbal processes.) The ordinary course of thought involves an interaction of sensory input with the central processes—one looks at the problem situation directly, if it is available, makes sketches, talks to oneself—and the activity of the lower-order assemblies, in imagery, may have the same "semi-sensory" function of modulating the concurrent activity of higher-order assemblies. The relative efficacy of concrete nouns (names of imaginable objects) as stimulus-words in paired-associate learning (cf. e.g., Paivio, 1969), together with the fact that pictures of such objects are still better, suggests something of the sort. Once it has had its effect on higher activity the image may cease; it would be reportable only when it is persistent, tending to interrupt the ongoing thought process, or reinstated later without reinstatement of the whole thought process of which it was part. In this way it is possible to understand how bizarre imagery, of the kind involved in my introspection, might occur without being recognized, or how visual imagery might form an essential part of the cognitive map (Tolman, 1948) of a man driving a car through familiar territory, even for the man who believes that visual imagery plays no part in his planning. The difference between those who have little imagery and those who have much may be not a difference of the mechanism of thinking, but a difference in the retrievability of the image.
REFERENCES

ALLPORT, G. W. The eidetic image and the after-image. American Journal of Psychology, 1928, 40, 418-425.
BEXTON, W. H., HERON, W., & SCOTT, T. H. Effects of decreased variation in the sensory environment. Canadian Journal of Psychology, 1954, 8, 70-76.
BINET, A. L'étude expérimentale de l'intelligence. Paris: Schleicher, 1903.
FREIDES, D., & HAYDEN, S. D. Monocular testing: A methodological note on eidetic imagery. Perceptual and Motor Skills, 1966, 23, 88.
GELLERMAN, L. W. Form discrimination in chimpanzees and two-year-old children: I. Form (triangularity) per se. Journal of Genetic Psychology, 1933, 42, 3-27.
GOOD, I. J. Speculations concerning the first ultraintelligent machine. Advances in Computers, 1965, 6, 31-88.
HABER, R. N., & HABER, R. B. Eidetic imagery: I. Frequency. Perceptual and Motor Skills, 1964, 19, 131-138.
HEBB, D. O. Organization of behavior. New York: Wiley, 1949.
HEBB, D. O. The American revolution. American Psychologist, 1960, 15, 735-745.
HEBB, D. O. The semi-autonomous process, its nature and nurture. American Psychologist, 1963, 18, 16-27.
HEBB, D. O. A textbook of psychology. (2nd ed.) Philadelphia: Saunders, 1966.
HUBEL, D. H., & WIESEL, T. N. Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 1968, 195, 215-243.
LASHLEY, K. S. In search of the engram. In Symposia of the Society of Experimental Biology. No. 4. Cambridge: Cambridge University Press, 1950.
MALMO, R. B., & SURWILLO, W. W. Sleep deprivation: Changes in performance and physiological indicants of activation. Psychological Monographs, 1960, 74(15, Whole No. 502).
MORRIS, G. O., WILLIAMS, H. L., & LUBIN, A. Misperception and disorientation during sleep deprivation. A.M.A. Archives of General Psychiatry, 1960, 2, 247-254.
MOSELEY, A. L. Hypnagogic hallucinations in relation to accidents. American Psychologist, 1953, 8, 407.
PAIVIO, A. Mental imagery in associative learning and memory. Psychological Review, 1969, in press.
PRITCHARD, R. M., HERON, W., & HEBB, D. O. Visual perception approached by the method of stabilized images. Canadian Journal of Psychology, 1960, 14, 67-77.
RYLE, G. The concept of mind. New York: Barnes & Noble, 1949.
SIIPOLA, E. M., & HAYDEN, S. D. Exploring eidetic imagery among the retarded. Perceptual and Motor Skills, 1965, 21, 275-286.
SIMMEL, M. L. Phantoms in patients with leprosy and in elderly digital amputees. American Journal of Psychology, 1956, 69, 529-545.
TEUBER, H. L. Perception. In J. Field, H. W. Magoun, & V. E. Hall (Eds.), Handbook of Physiology: Neurophysiology. Vol. 3. Washington, D.C.: American Physiological Society, 1960.
TOLMAN, E. C. Cognitive maps in rats and men. Psychological Review, 1948, 55, 189-208.
WOODWORTH, R. S. Experimental psychology. New York: Holt, 1938.

(Early publication received July 10, 1968)
Reprinted with permission from Science, Vol. 171, pp. 701-703, 1971. © 1971 The American Association for the Advancement of Science
Mental Rotation of Three-Dimensional Objects

Abstract. The time required to recognize that two perspective drawings portray objects of the same three-dimensional shape is found to be (i) a linearly increasing function of the angular difference in the portrayed orientations of the two objects and (ii) no shorter for differences corresponding simply to a rigid rotation of one of the two-dimensional drawings in its own picture plane than for differences corresponding to a rotation of the three-dimensional object in depth.

Human subjects are often able to determine that two two-dimensional pictures portray objects of the same three-dimensional shape even though the objects are depicted in very different orientations. The experiment reported here was designed to measure the time that subjects require to determine such identity of shape as a function of the angular difference in the portrayed orientations of the two three-dimensional objects. This angular difference was produced either by a rigid rotation of one of two identical pictures in its own picture plane or by a much more complex, nonrigid transformation, of one of the pictures, that corresponds to a (rigid) rotation of the three-dimensional object in depth.

This reaction time is found (i) to increase linearly with the angular difference in portrayed orientation and (ii) to be no longer for a rotation in depth than for a rotation merely in the picture plane. These findings appear to place rather severe constraints on possible explanations of how subjects go about determining identity of shape of differently oriented objects. They are, however, consistent with an explanation suggested by the subjects themselves. Although introspective reports must be interpreted with caution, all subjects claimed (i) that to make the required comparison they first had to imagine one object as rotated into the same orientation as the other and that they could carry out this "mental rotation" at no greater than a certain limiting rate; and (ii) that, since they perceived the two-dimensional pictures as objects
in three-dimensional space, they could imagine the rotation around whichever axis was required with equal ease.

In the experiment each of eight adult subjects was presented with 1600 pairs of perspective line drawings. For each pair the subject was asked to pull a right-hand lever as soon as he determined that the two drawings portrayed objects that were congruent with respect to three-dimensional shape and to pull a left-hand lever as soon as he determined that the two drawings depicted objects of different three-dimensional shapes. According to a random sequence, in half of the pairs (the "same" pairs) the two objects could be rotated into congruence with each other (as in Fig. 1, A and B), and in the other half (the "different" pairs) the two objects differed by a reflection as well as a rotation and could not be rotated into congruence (as in Fig. 1C). The choice of objects that were mirror images or "isomers" of each other for the "different" pairs was intended to prevent subjects from discovering some distinctive feature possessed by only one of the two objects and thereby reaching a decision of noncongruence without actually having to carry out any mental rotation. As a further precaution, the ten different three-dimensional objects depicted in the various perspective drawings were chosen to be relatively unfamiliar and meaningless in overall three-dimensional shape.
Each object consisted of ten solid cubes attached face-to-face to form a rigid armlike structure with exactly three right-angled "elbows" (see Fig. 1). The set of all ten shapes included two subsets of five: within either subset, no shape could be transformed into itself or any other by any reflection or rotation (short of 360°). However, each shape in either subset was the mirror image of one shape in the other subset, as required for the construction of the "different" pairs.

For each of the ten objects, 18 different perspective projections—corresponding to one complete turn around the vertical axis by 20° steps—were generated by digital computer and associated graphical output (1). Seven of the 18 perspective views of each object were then selected so as (i) to avoid any views in which some part of the object was wholly occluded by another part and yet (ii) to permit the construction of two pairs that differed in orientation by each possible angle, in 20° steps, from 0° to 180°. These 70 line drawings were then reproduced by photo-offset process and were attached to cards in pairs for presentation to the subjects.

Fig. 1. Examples of pairs of perspective line drawings presented to the subjects. (A) A "same" pair, which differs by an 80° rotation in the picture plane; (B) a "same" pair, which differs by an 80° rotation in depth; and (C) a "different" pair, which cannot be brought into congruence by any rotation.

Half of the "same" pairs (the "depth" pairs) represented two objects that differed by some multiple of a 20° rotation about a vertical axis (Fig. 1B). For each of these pairs, copies of two appropriately different perspective views were simply attached to the cards in the orientation in which they were originally generated. The other half of the "same" pairs (the "picture-plane" pairs) represented two objects that differed by some multiple of a 20° rotation in the plane of the drawings themselves (Fig. 1A). For each of these, one of the seven perspective views was selected for each object and two copies of this picture were attached to the card in appropriately different orientations.

Altogether, the 1600 pairs presented to each subject included 800 "same" pairs, which consisted of 400 unique pairs (20 "depth" and 20 "picture-plane" pairs at each of the ten angular differences from 0° to 180°), each of which was presented twice. The remaining 800 pairs, randomly intermixed with these, consisted of 400 unique "different" pairs, each of which (again) was presented twice. Each of these "different" pairs corresponded to one "same" pair (of either the "depth" or "picture-plane" variety) in which, however, one of the three-dimensional objects had been reflected about some plane in three-dimensional space. Thus the two objects in each "different" pair differed, in general, by both a reflection and a rotation.

The 1600 pairs were grouped into blocks of not more than 200 and presented over eight to ten 1-hour sessions (depending upon the subject). Also, although it is only of incidental interest here, each such block of presentations was either "pure," in that all pairs involved rotations of the same type ("depth" or "picture-plane"), or "mixed," in that the two types of rotation were randomly intermixed within the same block.

Each trial began with a warning tone, which was followed half a second later by the presentation of a stimulus pair and the simultaneous onset of a timer. The lever-pulling response stopped the timer, recorded the subject's reaction time, and terminated the visual display. The line drawings, which averaged between 4 and 5 cm in maximum linear extent, appeared at a viewing distance of about 60 cm. They were positioned, with a center-to-center spacing that subtended a visual angle of 9°, in two circular apertures in a vertical black surface (see Fig. 1, A to C).

The subjects were instructed to respond as quickly as possible while keeping errors to a minimum. On the average only 3.2 percent of the responses were incorrect (ranging from 0.6 to 5.7 percent for individual subjects). The reaction-time data presented below include only the 96.8 percent correct responses. However, the data for the incorrect responses exhibit a similar pattern.

In Fig. 2, the overall means of the reaction times as a function of angular difference in orientation for all correct (right-hand) responses to "same" pairs are plotted separately for the pairs differing by a rotation in the picture plane (Fig. 2A) and for the pairs differing by a rotation in depth (Fig. 2B). In both cases, reaction time is a strikingly linear function of the angular difference between the two three-dimensional objects portrayed. The mean reaction times for individual subjects increased from a value of about 1 second at 0° of rotation for all subjects to values ranging from 4 to 6 seconds at 180° of rotation, depending upon the particular individual. Moreover, despite such variations in slope, the linearity of the function is clearly evident when the data are plotted separately for individual three-dimensional objects or for individual subjects. Polynomial regression lines were computed separately for each subject under each type of rotation. In all 16 cases the functions were found to have a highly significant linear component (P < .001) when tested against deviations from linearity. No significant quadratic or higher-order effects were found (P > .05, in all cases).

Fig. 2. Mean reaction times to two perspective line drawings portraying objects of the same three-dimensional shape. Times are plotted as a function of angular difference in portrayed orientation: (A) for pairs differing by a rotation in the picture plane only; and (B) for pairs differing by a rotation in depth. (The centers of the circles indicate the means and, when they extend far enough to show outside these circles, the vertical bars around each circle indicate a conservative estimate of the standard error of that mean based on the distribution of the eight component means contributed by the individual subjects.)

The angle through which different three-dimensional shapes must be rotated to achieve congruence is not, of course, defined. Therefore, a function like those plotted in Fig. 2 cannot be constructed in any straightforward manner for the "different" pairs. The overall mean reaction time for these pairs was found, however, to be 3.8 seconds—nearly a second longer than the corresponding overall means for the "same" pairs. (In the postexperimental interview, the subjects typically reported that they attempted to rotate one end of one object into congruence with the corresponding end of the other object; they discovered that the two objects were different when, after this "rotation," the two free ends still remained noncongruent.)

Not only are the two functions shown in Fig. 2 both linear but they are very similar to each other with respect to intercept and slope. Indeed, for the larger angular differences the reaction times were, if anything, somewhat shorter for rotation in depth than for rotation in the picture plane. However, since this small difference is either absent or reversed in four of the eight subjects, it is of doubtful significance. The determination of identity of shape may therefore be based, in both cases, upon a process of the same general kind. If we can describe this process as some sort of "mental rotation in three-dimensional space," then the slope of the obtained functions indicates that the average rate at which these particular objects can be thus "rotated" is roughly 60° per second.

Of course the plotted reaction times necessarily include any times taken by the subjects to decide how to process the pictures in each presented pair as well as the time taken actually to carry out the process, once it was chosen. However, even for these highly practiced subjects, the reaction times were still linear and were no more than 20 percent lower in the "pure" blocks of presentations (in which the subjects knew both the axis and the direction of the required rotation in advance of each presentation) than in the "mixed" blocks (in which the axis of rotation was unpredictable). Tentatively, this suggests that 80 percent of a typical one of these reaction times may represent some such process as "mental rotation" itself, rather than a preliminary process of preparation or search. Nevertheless, in further research now underway, we are seeking clarification of this point and others.
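[Editors' note: to make the slope-to-rate arithmetic concrete, the short Python sketch below is an editorial illustration only; the reaction times are idealized points on a straight line, not the published data. It fits the linear function described above and converts its slope into a rotation rate.]

import numpy as np

# Idealized mean reaction times (seconds) at each angular difference (degrees):
# about 1 s at 0 deg, rising linearly to 4 s at 180 deg, roughly the reported pattern.
angles = np.arange(0, 181, 20)
rt = 1.0 + angles / 60.0

slope, intercept = np.polyfit(angles, rt, 1)   # least-squares fit of RT = intercept + slope * angle
rate_deg_per_s = 1.0 / slope                   # seconds per degree inverted gives degrees per second
print(f"intercept = {intercept:.2f} s, slope = {slope:.4f} s/deg, rate = {rate_deg_per_s:.0f} deg/s")
# For these toy numbers: intercept 1.00 s, slope ~0.0167 s/deg, implied rotation rate ~60 deg/s.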
ROGER N. SHEPARD
JACQUELINE METZLER
Department of Psychology,
Stanford University,
Stanford, California 94305
References and Notes
1. Mrs. Jih-Jie Chang of the Bell Telephone Laboratories generated the 180 perspective projections for us by means of the Bell Laboratories' Stromberg-Carlson 4020 microfilm recorder and the computer program for constructing such projections developed there by A. M. Noll. See, for example, A. M. Noll, Computers Automation 14, 20 (1965).
2. We thank Mrs. Chang [see (1)], and we also thank Dr. J. D. Elashoff for her suggestion concerning the statistical analyses. Assistance in the computer graphics was provided by the Bell Telephone Laboratories. Supported by NSF grant GS-2283 to R.N.S.
9 March 1970; revised 9 September 1970
Reprinted with permission from Science, Vol. 243, pp. 234-236, 1989. © 1989 The American Association for the Advancement of Science
Mental Rotation of the Neuronal Population Vector
APOSTOLOS P. GEORGOPOULOS,* JOSEPH T. LURITO, MICHAEL PETRIDES, ANDREW B. SCHWARTZ, JOE T. MASSEY

A. P. Georgopoulos and J. T. Lurito, The Philip Bard Laboratories of Neurophysiology, Department of Neuroscience, The Johns Hopkins University School of Medicine, 725 North Wolfe Street, Baltimore, MD 21205.
M. Petrides, Department of Psychology, McGill University, 1205 Dr. Penfield Avenue, Montreal, Quebec, Canada H3A 1B1.
A. B. Schwartz, Division of Neurobiology, St. Joseph's Hospital and Medical Center, Barrow Neurological Institute, 350 West Thomas Road, Phoenix, AZ 85013.
J. T. Massey, Department of Neuroscience and Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, 725 North Wolfe Street, Baltimore, MD 21205.
*To whom correspondence should be addressed.

A rhesus monkey was trained to move its arm in a direction that was perpendicular to and counterclockwise from the direction of a target light that changed in position from trial to trial. Solution of this problem was hypothesized to involve the creation and mental rotation of an imagined movement vector from the direction of the light to the direction of the movement. This hypothesis was tested directly by recording the activity of cells in the motor cortex during performance of the task and computing the neuronal population vector in successive time intervals during the reaction time. The population vector rotated gradually counterclockwise from the direction of the light to the direction of the movement at an average rate of 732° per second. These results provide direct, neural evidence for the mental rotation hypothesis and indicate that the neuronal population vector is a useful tool for "reading out" and identifying cognitive operations of neuronal ensembles.

A fundamental problem in cognitive neuroscience is the identification and elucidation of brain events underlying cognitive operations (1). The technique of recording the activity of single cells in the brain of behaving animals (2) provides a direct tool for that purpose. Indeed, a wealth of knowledge has accumulated during the past 15 years concerning the activity of cells in several brain areas during performance by monkeys of complex tasks. A major finding of these studies has been that the activity of single cells in specific areas of the cerebral cortex changes during performance of particular tasks; these changes are thought to reflect the participation of the area under study in the cognitive function involved in the task (3). However, a direct visualization of a cognitive operation in terms of neuronal activation in the brain is lacking.

We chose as a test case for this problem the cognitive operation of mental rotation. Important work in experimental psychology during the past 20 years (4) has established the mental rotation paradigm as a standard in cognitive psychology and as a prime tool in investigating cognitive operations of the "analog" type. We adapted this procedure in a task that required movement of a handle in a direction that was at an angle with the direction of a stimulus. Under these conditions the reaction time increased with the angle, which suggests that the subject may solve this problem by a mental rotation of an imagined movement vector from the direction of the stimulus to the direction of the actual movement (5). Now, the direction of an upcoming movement in space seems to be represented in the motor cortex as the neuronal population vector (6), which is a weighted vector sum of contributions ("votes") of directionally tuned neurons: each neuron is assumed to vote in its own preferred direction with a strength that depends on how much the activity of the neuron changes for the movement under consideration. This vectorial analysis proved useful in visualizing the directionality of the population in two- and three-dimensional space during the reaction time (7) and during an instructed delay period (8).

Given the mental rotation hypothesis above and the neuronal population vector as a neural representation of the movement direction, a strong test is as follows: if a monkey performs in the above-mentioned task and the neuronal activity in the motor cortex is recorded during movement performance, would the population vector rotate in time, as the hypothesis for a mental rotation of an imagined movement vector would predict? Because the appropriate movement direction can be arrived at by either a counterclockwise or a clockwise rotation, which of these two rotations would be realized by the population vector? Of course, there is no reason that the population vector should rotate at all, and if it rotates, there is no a priori reason that it should rotate in one or the other direction; for all we know, any of these alternatives is possible.

Fig. 1. Results from a direct (left) and rotation (right) movement. (A) Task. Unfilled and filled circles indicate dim and bright light, respectively. Interrupted and continuous lines with arrows indicate stimulus (S) and movement (M) direction, respectively. (B) Neuronal population vectors calculated every 10 ms from the onset of the stimulus (S) at positions shown in (A) until after the onset of the movement (M). When the population vector lengthens, for the direct case (left) it points in the direction of the movement, whereas for the rotation case it points initially in the direction of the stimulus and then rotates counterclockwise (from 12 o'clock to 9 o'clock) and points in the direction of the movement. (C) Ten successive population vectors from (B) are shown in a spatial plot, starting from the first population vector that increased significantly in length. Notice the counterclockwise rotation of the population vector (right panel). (D) Scatter plots of the direction of the population vector as a function of time, starting from the first population vector that increased significantly in length after stimulus onset (S). For the direct case (left panel) the direction of the population vector is in the direction of the movement (~180°); for the rotation case (right panel) the direction of the population vector rotates counterclockwise from the direction of the stimulus (~90°) to the direction of the movement (~180°).
The activity of single cells in the motor cortex was recorded (9) while a rhesus monkey performed in the mental rotation task. In the beginning of a trial, a light appeared at the center of a plane in front of the animal, which moved its arm toward the light with a freely movable handle (10). After a variable period of time (0.75 to 2.25 s), the center light was turned off and turned on again, dim or bright, at one of eight positions on a circle of 2-cm radius (11). The monkey was trained to move the handle in the direction of the light when it came on dim (direct trials) or in a direction that was perpendicular (90°) to and counterclockwise from the direction of the light when it came on bright (rotation trials) (12). The movements of the animal were in the appropriate direction for both kinds of trials.

The neuronal population vector was calculated every 10 ms starting from the onset of the peripheral light (that is, at the beginning of the reaction time). The preferred direction of each cell (n = 102 cells) was determined from the cell activity in the trials in which the animal moved toward the light (direct trials). For the calculation of the population vector, peristimulus time histograms (10-ms bin width) were computed for each cell and each of the 16 combinations (classes) used [eight positions and two conditions (direct or rotation), see (11) above] with counts of fractional interspike intervals as a measure of the intensity of cell discharge. A square root transformation was applied to these counts to stabilize the variance (13). For a given time bin, each cell made a vectorial contribution in the direction of the cell's preferred direction and of magnitude equal to the change in cell activity from that observed during 0.5 s preceding the onset of the peripheral stimulus (control rate, that is, while the monkey was holding the handle at the center of the plane). The population vector P for the j-th class and k-th time bin is

P_{jk} = \sum_{i=1}^{N} w_{ijk} C_i

where C_i is the preferred direction of the i-th cell and w_{ijk} is a weighting function, w_{ijk} = d_{ijk} - a_{ij}, where d_{ijk} is the square-root-transformed (13) discharge rate of the i-th cell for the j-th class and k-th time bin, and a_{ij} is the similarly transformed control rate of the i-th cell for the j-th class.

Figure 1 illustrates the results obtained when the movement direction was the same (toward 9 o'clock) but the stimulus was either at 9 o'clock (direct trials, left panel) or at 12 o'clock (rotation trials, right panel). In the direct trials the population vector pointed in the direction of the movement (which coincided with the direction of the stimulus) (Fig. 1, left). However, in the rotation trials the population vector rotated in time counterclockwise from the direction of the stimulus to the direction of the movement (Fig. 1, right). Another example is shown in Fig. 2 and illustrated in the cover photograph. The working space is outlined in blue. The time axis is the white line directed upwards. The population vector is shown in green, as it rotates during the reaction time from the stimulus direction (between 1 and 2 o'clock) to the movement direction (between 10 and 11 o'clock). The population vector was calculated with a 20-ms bin sliding every 2 ms. The red lines are projections of the population vector onto the working space.

The rotation of the population vector was a linear function of time with an average slope (for the eight positions of the light used) of 732 ± 456°/s (mean ± SD). The population vector began to change in length 125 ± 28 ms (mean ± SD, n = 8) after the stimulus onset. At this point its direction was close to the direction of the stimulus; the average angle between the direction of the population vector and that of the stimulus was 17° counterclockwise (the average absolute angle was 29°). The population vector stabilized in direction at 225 ± 50 ms after stimulus onset. At this point its direction was close to the direction of the movement; the average angle between the direction of the population vector and that of the movement was 0.5° clockwise (the average absolute angle was 8°). Finally, the movement began 260 ± 30 ms after stimulus onset, that is, 35 ms after the direction of the population vector became relatively stable; this difference was statistically significant (P < 0.02, paired t test).

These results support the hypothesis that the directional transformation required by the task was achieved by a counterclockwise rotation of an imagined movement vector. This process was reflected in the gradual change of activity of motor cortical cells, which led to the gradual rotation of the vectorial distribution of the neuronal ensemble and the population vector. The average slope of the rotation of the population vector (732°/s, see above) was comparable to but higher than that observed when human subjects performed a similar task (~400°/s) (5) and that observed in a task that involved mental rotation of two-dimensional images (~400°/s) (14). It is likely that all three experiments involved a process of mental rotation which, in the present case, was reflected in the motor cortical recordings of this study and identified by using the population vector analysis. Of course, other brain areas are probably involved in such complicated transformations; for example, recent experiments with measurements of regional cerebral blood flow (15) suggested that frontal and parietal areas seem to be involved in the mental rotation task of Shepard and Metzler (16), whereas frontal and central areas seem to be involved in a line orientation task (15); in both of these tasks there was a greater increase in blood flow in the right than in the left hemisphere.

The rotation of the neuronal population vector is of particular interest because there was no a priori reason for it to rotate at all. It is also interesting that the population vector rotated consistently in the counterclockwise direction: this suggests that the spatial-motor transformation imposed by the task was solved by a rotation through the shortest angular distance. Given that the mental rotation is time consuming, this solution was behaviorally meaningful, for it minimized both the time for the animal to get the reward and the computational effort, which would have been longer if the rotation had been through 270° clockwise (17).
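[Editors' note: as an aside, the following minimal Python sketch shows the population-vector computation in the form defined above; the preferred directions and discharge rates are invented for illustration and are not the recorded data.]

import numpy as np

def population_vector(rates, control_rates, preferred_dirs_deg):
    # Weighted vector sum of directionally tuned cells for one class and one time bin.
    w = np.asarray(rates) - np.asarray(control_rates)   # w_ijk = d_ijk - a_ij
    theta = np.deg2rad(preferred_dirs_deg)              # preferred directions C_i
    px = np.sum(w * np.cos(theta))
    py = np.sum(w * np.sin(theta))
    return np.rad2deg(np.arctan2(py, px)) % 360.0, np.hypot(px, py)

# Eight hypothetical cells with evenly spaced preferred directions; cells tuned near
# 90 degrees fire above their control rate, so the population vector should point near 90 degrees.
prefs = np.arange(0.0, 360.0, 45.0)
control = np.full(8, 2.0)
rates = control + np.exp(-0.5 * ((prefs - 90.0) / 60.0) ** 2)   # toy tuning curve peaking at 90 deg
direction, length = population_vector(rates, control, prefs)
print(direction, length)   # direction comes out close to 90 degrees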
Fig. 2. Rotation of the population vector for a different set of rotation trials. The stimulus and movement directions are indicated by the interrupted and continuous lines at the top. The population vector in the two-dimensional space is shown for successive time frames beginning 90 ms after stimulus onset. Notice its rotation counterclockwise from the direction of the stimulus to the direction of the movement.
Finally, these results were obtained from one animal; because cognitive problems could be solved in different ways by different subjects, it is important that techniques for reading out brain operations be sensitive enough to be applied to single subjects. Indeed, the findings of our study indicate that the population vector is a sensitive tool by which an insight can be gained into the brain processes underlying cognitive operations in space.
References and Notes
8. A. P. Georgopoulos, M. D. Crutcher, A. B. Schwartz, Exp. Brain Res., in press.
9. The electrical signs of activity of individual cells in the arm area of the motor cortex contralateral to the performing arm were recorded extracellularly [A. P. Georgopoulos, J. F. Kalaska, R. Caminiti, J. T. Massey, J. Neurosci. 2, 1527 (1982)]. All surgical operations for the preparation of the animal for electrophysiological recordings were performed under general pentobarbital anesthesia. Behavioral control and data collection and analysis were performed with a laboratory minicomputer.
10. The apparatus is described in [A. P. Georgopoulos and J. T. Massey, Exp. Brain Res. 65, 361 (1987)]. Briefly, it consisted of a 25 cm by 25 cm planar working surface made of frosted plexiglass onto which a HeNe laser beam was back-projected with a system of mirrors and two galvanometers. The monkey (5 kg) sat comfortably on a primate chair and grasped a freely movable, articulated handle at its distal end, next to a 10-mm diameter transparent plexiglass circle within which the animal captured the center light.
11. The eight positions were equally spaced on the circle, that is, at angular intervals of 45°, and were the same throughout the experiment. The brightness condition (dim or bright) and the position of the light were varied. The resulting 16 brightness-position combinations were randomized. Eight repetitions of these 16 combinations were presented in a randomized block design.
12. The term "counterclockwise" is simply descriptive; no counterclockwise or clockwise directions were indicated to the animal. The direction in which the animal was required to move can be described equivalently as either 90° counterclockwise or 270° clockwise. The animal received a liquid reward when its movement exceeded 3 cm and stayed within ±25° of the direction required. The average direction of the actual movement trajectories was within ±5° of the direction required. Performance was over 70% correct trials.
13. The square root transformation was used as a variance-stabilizing transformation for counts [G. W. Snedecor and W. G. Cochran, Statistical Methods (Iowa State Univ. Press, Ames, Iowa, ed. 7, 1980), pp. 288-290]. Although the results obtained without this transformation were similar, the transformation is more appropriate because of the small size of the time bins (10 ms) and, therefore, the small number of counts.
14. L. A. Cooper, Cognit. Psychol. 7, 20 (1975).
15. G. Deutsch, W. T. Bourbon, A. C. Papanicolaou, H. M. Eisenberg, Neuropsychologia 26, 445 (1988).
2b Information Theory and Perception
Perhaps an introduction to this section could be dispensed with, because the first text, (18) by Gregory, provides an almost ideal stage-setting for the four others. Nevertheless, here follow a few side remarks. Computation is information-handling, the processing of information. The construction of the mathematical theory of information, with which the name of Claude Shannon is prominently associated, gave an adequate tool for engineering communication systems. It also had considerable impact on psychophysics and the cognitive sciences, and raised more hopes and expectations than materialised in subsequent developments. Gregory points out some difficulties of applying information theory to brain studies, yet conveys a feeling of expectation for a further revival. Essential notions such as information content, information entropy, channel capacity (measured in bit/s), span or repertoire (measured as a dimensionless number, or in bits), redundancy-reducing codes, etc., have been used to quantify the perceptual and motor capacities of biological brains, i.e. their performance at processing the huge flow of sensory data. The following four papers will help the reader to assess the contributions and the limitations of information theory. The early paper of Attneave (paper (19)) expresses the excitement of the fifties with this new scientific tool. Its insistence on "perception as economical description", an early formulation of the "redundancy reduction" principle, prepares the way for two subsequent papers, (21) and (22) by Barlow, as well as for article (38) in Chapter 3. In this section are reproduced two major classics, papers (20) and (21), which we felt necessary to include in this selection because they express essential, and occasionally subtle, ideas with unrivalled clarity. The article of George Miller (20), inspired by the channel capacity concept, attempts to discover and quantify various information bottlenecks in our perceptual apparatus (such as the span of absolute judgement and the span of immediate memory) and to characterise the ways by which we manage to overcome
these innate limitations (such as chunking by learning). The first text of Barlow (paper (21)) sets itself a programme so ambitious that success would have meant solving almost all the fundamental mysteries still present today in the field of neurocomputation. The "single neuron doctrine" is directly inspired by the information-theoretic principles of redundancy reduction and economy of representation. It involves a coding scheme based on decorrelation processes, requiring at each stage fewer and fewer neurons, leading finally to highly selective "cardinal cells". These issues will be taken up again in section 3d. The reader is invited to ponder the differences between this line of thought and the one expounded in the previous section ("cell assemblies"). At this point, the reader should begin to draw much benefit from the cross-lights between the various sections of this volume, and also to acquire a sense of historical depth provided by the chronology of the articles. The second text of Barlow (paper (22)), written about twenty years later, presents several useful additional remarks and suggestions concerning the interplay between perception and learning, the versatility of learning in higher mammals, synaptic mechanisms, neural coding, and correlation/decorrelation mechanisms. The reader is invited to examine the compatibility of the requirements on neural coding, as expounded in papers (21) and (22): Is the impulse frequency of a neuron coding for subjective certainty, or for surprise? Finally, he will come to appreciate the antiquity of the definition of perception as unconscious inductive inference, and to construe his own insightful analogies between perceptual processes, thought processes and scientific processes.
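To fix ideas about the quantities listed above, here is a minimal sketch in Python; the probabilities and the signalling rate are made up for illustration and are not taken from any of the reprinted papers. Entropy measures the average information per symbol, redundancy the fraction of the repertoire left unused, and "surprise" (self-information) the bits carried by a single improbable event.

import math

def entropy_bits(probs):
    # Shannon entropy H = -sum p*log2(p), in bits per symbol.
    return -sum(p * math.log2(p) for p in probs if p > 0)

def surprise_bits(p):
    # Self-information of a single event: the rarer the event, the more bits it carries.
    return -math.log2(p)

probs = [0.5, 0.25, 0.125, 0.125]      # an illustrative four-symbol source
H = entropy_bits(probs)                 # 1.75 bits per symbol
H_max = math.log2(len(probs))           # 2.0 bits if the symbols were equiprobable
redundancy = 1.0 - H / H_max            # 0.125 of the repertoire is "wasted"

rate = 10                               # assumed symbols per second
print(H, H_max, redundancy, rate * H)   # rate * H is the information rate in bit/s
print(surprise_bits(0.5), surprise_bits(0.125))   # 1 bit versus 3 bits of surprise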
Reprinted with permission from Odd Perceptions, pp. 187-194, 1986. © 1986 Routledge
WHATEVER HAPPENED TO INFORMATION THEORY?
The notion of information bridges the external world of physical events and our perception and understanding, for information of events and things is separate from us in space and time. How to think about and to measure information are still controversial matters, and there are snags, or at least limitations, in all theories and methods. The lead has come from engineers, but until recently at least their concern has been with communication-transmission systems, rather than with brains or minds. With developments in artificial intelligence this is now changing, and we may look for further leads from the engineers who have to grapple with these difficult problems. The issues are technical, and we cannot altogether escape technical detail here. Twenty or thirty years ago the great hope for quantifying perceptual decision processes, and finding out what goes on in skills, was the mathematical theory of information, which was developed for engineering communication systems and most elegantly worked out and presented by Claude Shannon and Warren Weaver of the Bell Telephone Laboratories in 1949. This promised to be able to measure amounts of information, rather as there are measures for costing energy, for estimating efficiencies and charging for messages conveyed by telegraph and telephone lines. It also promised to quantify the information-handling of the nervous system and so to give basic insights for a functional philosophy of perception and behaviour. Some of this came true, but we hardly hear of information
theory now in accounts of perception - though it is extremely important for communication and control engineering. Why isn't it as useful for physiology and psychology? Has this something to do with the subjective nature of perception? Does this prevent objective measurements? Conveying information by gestures and by speech is pre-human in origin, and indeed essential for all intelligent life. By writing, and much later by printing, information has been transmitted over space and stored in time with incalculable benefit, but attempts to transmit information very fast over distances beyond the range of speech are quite recent, and the means for costing or charging for information transmission have been devised more recently still. There is, however, a considerable history before Shannon and Weaver. Suggestions for telegraphs, using magnetism or electricity, go back to Roger Bacon's suggestion in 1267 that a 'certain sympathetic needle might be used for distant communication'. This was the lodestone, or rather a needle magnetized by rubbing with a lodestone, as used in the mariner's compass. Then there is a seventeenth-century description of a telephone working with wires by the inventive genius Robert Hooke, in the Preface to his Micrographia of 1664. After a discussion of optical instruments that 'must watch the irregularities of the Senses, but . . . not go before them or prevent their information', Hooke wrote:

'Tis not impossible to hear a whisper a furlong's distance, it having already been done; and perhaps the nature of the thing would not make it more impossible, though that furlong be ten times multiplied. And though some famous Authors have affirm'd it impossible to hear through the thinnest plate of Muscovy-glass; yet I know a way, by which it is easy enough to hear one speak through a wall a yard thick. It has not yet been thoroughly examin'd, how far Otocousticons may be improved, nor what other ways there may be of quickening our hearing, or conveying sound through other bodies than the Air: for that this is not the only medium, I can assure the Reader, that I have, by the help of a distended wire, propagated the sound to a very considerable distance in an instant, or with as seemingly quick a motion as that of light, at least, incomparably swifter than that, which at the same time was propagated through the Air; and this not only in a straight line, direct, but in one bended in many angles.

Just what was this seventeenth-century otocousticon telephone? Was it a 'string telephone' - a wire joining distant diaphragms? If so, how, exactly, did it work round corners? Or can it have been an electric telephone? Surely this was beyond even Hooke's ingenuity at that time. There were several suggestions for electric telegraphs a
century later, following Stephen Gray's discovery in 1729 that electricity could be carried on insulated wires. By the 1780s there began to appear working single-wire telegraph systems using sounders with a code. Samuel Morse (1791-1872) turned from painting to building telegraph transmitting and receiving equipment, and to inventing his code. This was clever, because it used the fewest dots and dashes for the most frequently used letters - which showed how the code must be optimized for efficient transmission of information. It was first demonstrated on Saturday 2 September 1837, at Washington Square in New York, over 1700 feet of wire. But wire was expensive and information was in demand, and rapid rates of transmission of messages soon became economically vital. This need became all too clear with the first, incredibly expensive, transatlantic cable. This supreme triumph of Victorian engineering, as it turned out to be, was very nearly a disaster, as fundamental principles were not available or understood. It first operated in 1858, but through misuse it broke down after only a few hundred messages had been transmitted and received. The next cable was successfully laid in 1865 by Brunel's huge ship the Great Eastern, which also recovered and repaired the earlier cable. At first messages could only be sent at the uneconomic rate of eight words per minute - not because of the human limitations of the senders and receivers but because of the inductive inertia (or low bandwidth) of the cable. The signal strength was low, which led to the invention of relays, amplifiers and repeaters which solved the problems for the long-distance telegraph cables as they do in the nervous system; but at first the brute-force method of applying very high voltages to the cable was tried - and damaged it. Separating concepts of information from power and appreciating their statistical nature were the essential steps for the understanding and designing of information systems. The best scientists of the time - Michael Faraday, Charles Wheatstone, William Cooke, Lord Kelvin and, as a young man whose great inventive achievements came later, Thomas Alva Edison - contributed to this extraordinarily important technical development. Apart from the practical importance of rapid international communication, it focused attention on information as a concept, and it led to notions of channels and coding and so on that generated insights into all kinds of communication, including language, which deeply influenced philosophy as well as providing suggestions for how information is handled by the nervous system. The notion of a minimum necessary bandwidth (frequency range) for transmitting signals at a required rate was appreciated by Lord Kelvin, from his work on the transatlantic cable; but it was not formulated until 1924 by H. Nyquist in America and K. Küpfmüller in Germany, who independently stated the law that was developed to
its general form by R. V. L. Hartley in 1928 - that the transmission of a 'given quantity of information' is limited by the product of bandwidth and time. Hartley went further, to define information as the successive selection of signs, or words, from a given list or ensemble of possibilities. For this definition he rejected meaning as 'subjective' (although we usually think of the meanings of messages as what matters), for it is signals, not meanings, that are transmitted. But the word 'information' remained in use though it no longer referred to meanings, which can be confusing. Here we get more technical. Hartley showed that a message of N distinguishable signs (such as letters or dots and dashes) selected from a repertoire or ensemble of S signs has S^N possibilities, and that the 'quantity of information' is most reasonably and usefully defined in logarithmic units, to make information measures additive. Hartley quantified information thus:

H = N log₂ S
An essential notion for quantifying information is that the less likely the symbol or event, the more information it conveys. Information rate is not defined simply in terms of the number of symbols that can be transmitted, but also in terms of the probability (or the surprise value) of their occurrence. To modify a favourite example of Bertrand Russell's, 'Dog bites man' does not convey much information - but '"Man bites dog" is news'. The second costs more to transmit, for to be sure such an unlikely message is correct we must know with confidence that it is free of error, and this requires expensive reliability - which is difficult to achieve in engineering, or for the nervous system. Since the simplest choice is yes or no (or on or off, for a switch) the information unit of a bit (binary choice) is useful. Norbert Wiener and Claude Shannon developed the Hartley approach by examining the statistical characteristics of signals, including the values of waves for analogue signals. They reinterpreted Hartley's Law, to define the average information of long sequences of n symbols as

H = -Σᵢ pᵢ log pᵢ
(The minus sign makes H positive, since it involves logarithms of the pᵢ, which are fractional.) It is well worth playing the dictionary game of finding a word with the 'Twenty questions' technique of asking, for example, 'Is it before K?' and, if it is, 'Is it before E?', and so on. Any word in a dictionary can usually be located within twelve such binary decisions, which is quite remarkable. This is surprising because we are not used to thinking in terms of powers of two. The number of binary choices
191 enabling selection of say a word, from a set of words, increases as follows (values rounded off): Binary decisions required
t 1 4 8 ifi 3* 64 128 256
Size of dictionary from which any word can be found (no. of headwords)
2 4 [6 (1.6X10) 256 (2.6 x c o ) 65000 (6.5x10*) 4300000000 (4.3 X I o ) 1g 000 000 000 000 000 000(1.9X10") 340 000 000 000000000000000000000000000000 (3.4X]o > 160000000000000000000000000000000000000 (1.6x10") 000000000000 000 000000000000000000000000 1
2
9
3a
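The growth shown in the table can be checked directly. The sketch below is an editorial illustration (the word list is invented): it locates a word in a sorted list by repeated halving and counts the yes/no questions asked, which comes to about log₂ of the dictionary size.

```python
import math

def binary_questions(sorted_words, target):
    """Locate `target` in a sorted list by halving, counting yes/no questions."""
    lo, hi = 0, len(sorted_words)
    questions = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        questions += 1                      # "Is it before sorted_words[mid]?"
        if target < sorted_words[mid]:
            hi = mid
        else:
            lo = mid
    return questions

# A toy "dictionary" of 65,536 invented headwords (word00000 ... word65535).
dictionary = [f"word{i:05d}" for i in range(2**16)]
print(binary_questions(dictionary, "word40000"))   # -> 16
print(math.ceil(math.log2(len(dictionary))))       # -> 16, matching the table
```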
Surely the power of this strategy has implications for how we recall memories. It is not possible, however, to use the amazingly efficient 'Twenty questions' technique except for items which are ordered in some way, such as words arranged alphabetically in dictionaries. Memories may be filed for our almost instant recall in some orderly way that is still beyond computer technology. The greatest quantity of information that can be transmitted through a channel, with bandwidth w over time t, in the presence of disturbing random noise, was shown by Claude Shannon to be w t log₂(1 + P/W) (bits), where P and W are mean signal and mean noise powers respectively. This represents a definite limit which no channel (including the visual channel) can exceed. If, however, the coding of the signals is non-optimal, the information rate may be very much lower, so we need to know the coding to assess physiological efficiency. There is something odd about information as described by Shannon's theory, which is now universally accepted as the best account, for information is quite different from anything in the natural sciences. Unlike normal causes, it depends not only on what has been, and what is, but also on alternatives of what might be. It is often said that information theory can give no account of meaning, but the situation is not quite so bleak - because the selected alternatives may have, or be, meanings. In order to apply information theory rigorously it is necessary to know the number of alternatives from which selections are made. Unfortunately we seldom if ever know just what these are for humans, so information theory can seldom be rigorously applied outside purely engineering situations where we have full knowledge of the system and especially its range of alternatives.
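A small numerical illustration of the capacity limit just quoted (an editorial addition; the bandwidth, time and signal-to-noise figures are made up):

```python
import math

def channel_capacity_bits(bandwidth_hz, duration_s, signal_power, noise_power):
    """Shannon's limit: w * t * log2(1 + P/W) bits in time t over bandwidth w."""
    return bandwidth_hz * duration_s * math.log2(1.0 + signal_power / noise_power)

# Hypothetical numbers: a 3 kHz telephone-like channel for one second,
# with signal power 100 times the noise power.
print(round(channel_capacity_bits(3000.0, 1.0, 100.0, 1.0)))  # about 19,975 bits
```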
A law for human response time was, however, established at Cambridge by Edmund Hick in 1952, which showed that in situations where there are clearly defined choices, such as pressing response keys to lights, the mean decision time, t, increases with the number of possibilities, n, by

t = K log₂(n + 1)
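A brief numerical illustration of the law just stated (an editorial addition; the constant K is an assumed value):

```python
import math

def hick_decision_time(n_alternatives, k_seconds=0.15):
    """Mean choice time t = K * log2(n + 1); K is an assumed constant here."""
    return k_seconds * math.log2(n_alternatives + 1)

for n in (1, 3, 7, 9):
    print(n, round(hick_decision_time(n), 3))
# Each doubling of (n + 1) adds the same fixed increment K to the choice time.
```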
The decision or choice time (the disjunctive reaction time minus the simple reaction time) is very nearly proportional to the number of bits per stimulus (where a 'bit' is a binary choice). As a matter of history, I was the only subject apart from Edmund Hick for this experiment, which went on one hour a day for months. Hick did not complete it himself, so Hick's Law is based on my (very) nervous system. The relation was later found by R. Hyman (1953), who showed that with each added possibility the choice time increases by just over a tenth of a second; and that this same increase in choice time occurs no matter how the information is increased - by changing the relative frequencies of the alternatives, their sequential dependencies (introducing redundancy, so that some tended to follow others in predictable sequences), or by increasing the number of possibilities. The maximum rate of information (bit rate) of even the most skilled human operator looks surprisingly small: about 22 bits per second for an expert pianist; speech does not exceed about 26 bits per second, and silent reading may reach 44 bits per second. This seems low in engineering terms, where bit rates of thousands per second may be achieved, but we have to remember that far more is going on - with more choices to be made - in the richness of the brain's circuits. At present, information-handling computer systems are not comparable to us because they cannot do anything like as much as we can. This takes us to the Big Snag with information theory when it is applied to us. Unlike engineering systems, organisms are not strictly limited to the set of choices allowed by the experimenter's situation or task. Thus, ideally, doing Hick's experiment I should only have been able to respond to the ten little lights, which were pinned in a random arrangement on cork mats, each lit with a particular key from the row under one's fingers. But when less than all ten were being used, to restrict the value of n possibilities, I was not, strictly speaking, blind to the others. And I could respond to a knock on the door, or to Edmund Hick asking me why I was not doing better, or saying that it was lunch time. One was not deaf or blind to all except the allowed alternative choices of the experimental condition, in spite of the long practice with each number of lights in use - so the total bit rate must have been far greater than measured. Although Hick's Law works beautifully in some conditions (interestingly when there is not too much practice), it cannot truly quantify the information we handle because we cannot
set or assess the number of alternatives, n, we may choose from. The trouble is that, unlike a computer system, a human being's range of alternatives cannot be strictly limited by the situation he is in, or by an experimenter, or by his most concentrated attention. In a much quoted paper, bearing the splendid title 'The magical number seven plus or minus two', the American psychologist George Miller (1956) suggested that there is an absolute limit of around seven similar items for the immediate span of apprehension. Thus we can estimate at a glance, without counting, up to about seven dots spaced randomly. But the effective information in a perceptual span (the 'specious present') can be greatly increased by 'chunking' bits into larger units. This is a form of coding of data, requiring decoding of course. The most powerful coding appears to be language. Coding can set the number of bits per chunk; but the number of chunks that can be retained in immediate memory is limited to around seven. Presumably Chinese ideogram characters are efficient chunks, each conveying a lot of information. Much of learning is chunking bits of information into large units which can be stored in memory and recalled as a unit, rather like a rich ideogram. Are perceived objects memory-chunked bits of information? Again, it is commonly said that information theory has nothing to say about meaning, but this is not entirely true. Donald MacKay (1960a) produced an interesting way of looking at this. He suggested that meaning was related to conditional readiness, as the meaning of a sentence (or of a perceived event) changes the pattern of possibilities for future action. He gave a working definition of meaning as the 'selective function on the range of the recipient's states of conditional readiness for goal-directed activity; so the meaning of a message to you is its selective function on the range of your states of conditional readiness'. Defined in this way, meaning is clearly a relationship between message and recipient, rather than a unique property of the message alone. So there is a 'subjective' side to the engineer's information. Can this be measured? It might be, if we knew the selective function of a message. Then we could apply information theory to quantify meaning, at least in some cases. MacKay went on in a later paper (1960b), 'What is a question?', to suggest that states of readiness for organisms are large numbers of conditional probabilities. Asking a question is a means of changing the conditional probabilities of the questioner's states of readiness. This notion can be expressed in computer-programming terms. If there remains a 'problem of meaning' the problem lies somewhere beyond the adequacy of this notion of selections from, or changes of, states of readiness. These can be described in computer terms, and be implemented by computers with no special problems. Surely discussions of meaning could well start at this point.
It is sometimes said that true statements have more meaning than false statements. And philosophers frequently deny meaning to logical (and even to contingent) impossibilities. Thus '2 + 2 = 5', 'She is a dark-haired blonde', might be said to be meaningless. Such logical impossibilities are internal inconsistencies of the state of readiness, such that they are program-stoppers, preventing changes in states of readiness. It is an interesting question how general program-stoppers must be before we call them 'logical errors'. Is it possible that here lies the distinction between 'logical' and 'contingent'?
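MacKay's suggestion that this notion "can be expressed in computer-programming terms" can be made concrete. The sketch below is an editorial illustration, not MacKay's own formalism: a state of conditional readiness is modelled as a table of action probabilities, the meaning of a message is the selective change it works on that table, and a "program-stopper" is a message that leaves no admissible action at all.

```python
def receive(readiness, message_effect):
    """Update conditional readiness (action -> probability) by a message's
    selective function, then renormalise.  A 'program-stopper' is a message
    whose effect leaves no admissible action at all."""
    updated = {a: p * message_effect.get(a, 1.0) for a, p in readiness.items()}
    total = sum(updated.values())
    if total == 0:
        raise ValueError("program-stopper: message leaves no state of readiness")
    return {a: p / total for a, p in updated.items()}

# Hypothetical states of readiness before and after the message "it is lunch time".
readiness = {"keep working": 0.6, "go to lunch": 0.3, "leave the building": 0.1}
print(receive(readiness, {"keep working": 0.2, "go to lunch": 3.0}))
```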
Reaction times are of continuing interest and an acknowledged source of useful data for teasing out processes of perception and decision-taking. But, given that reaction times depend on the repertoire of possibilities, which cannot be established - which is what destroyed the effective use of information theory - how can reaction-time experiments work? Why do they not suffer from the same deep trouble? It is sad that information theory is virtually dead in current thinking on perception. Perhaps, like poor mistaken Juliet, it deserves awakening from cryptic sleep: but if so - wherefore art thou, Romeo?
REFERENCES
Hick, W.E. (1952) 'On the rate of gain of information', Q. J. Exp. Psychol. 4, 11-26.
Hyman, R. (1953) 'Stimulus information as a determinant of reaction time', J. Experimental Psychol. 45, 188-96.
MacKay, D.M. (1960a) 'Meaning and mechanisms', reprinted as chapter 3 in Information, Mechanism and Meaning (1969), Cambridge, Mass., MIT Press.
MacKay, D.M. (1960b) 'What is a question?', reprinted as chapter 4 in Information, Mechanism and Meaning (1969), Cambridge, Mass., MIT Press.
MacKay, D.M. (1969) Information, Mechanism and Meaning, Cambridge, Mass., MIT Press.
Miller, G.A. (1956) 'The magical number seven plus or minus two: some limits to our capacity for processing information', Psychol. Rev. 63, 81-97.
Reprinted with permission from Psychological Review, Vol. 61, No. 3, pp. 183-193, 1954 © 1954 American Psychological Association
SOME INFORMATIONAL ASPECTS OF VISUAL PERCEPTION
FRED ATTNEAVE
Perceptual and Motor Skills Research Laboratory, Human Resources Research Center¹

The ideas of information theory are at present stimulating many different areas of psychological inquiry. In providing techniques for quantifying situations which have hitherto been difficult or impossible to quantify, they suggest new and more precise ways of conceptualizing these situations (see Miller [12] for a general discussion and bibliography). Events ordered in time are particularly amenable to informational analysis; thus language sequences are being extensively studied, and other sequences, such as those of music, plainly invite research. In this paper I shall indicate some of the ways in which the concepts and techniques of information theory may clarify our understanding of visual perception. When we begin to consider perception as an information-handling process, it quickly becomes clear that much of the information received by any higher organism is redundant. Sensory events are highly interdependent in both space and time: if we know at a given moment the states of a limited number of receptors (i.e., whether they are firing or not firing), we can make better-than-chance inferences with respect to the prior and subsequent states of these receptors, and also with respect to the present, prior, and subsequent states of other receptors. The preceding statement, taken in its broadest
implications, is precisely equivalent to an assertion that the world as we know it is lawful. In the present discussion, however, we shall restrict our attention to special types of lawfulness which may exist in space at a fixed time, and which seem particularly relevant to processes of visual perception.
THE NATURE OF REDUNDANCY IN VISUAL STIMULATION: A DEMONSTRATION

Consider the very simple situation presented in Fig. 1. With a modicum of effort, the reader may be able to see this as an ink bottle on the corner of a desk. Let us suppose that the background is a uniformly white wall, that the desk is a uniform brown, and that the bottle is completely black. The visual stimulation from these objects is highly redundant in the sense that portions of the field are highly predictable from other portions. In order to demonstrate this fact and its perceptual significance, we may employ a variant of the "guessing game" technique with which Shannon (17) has studied the
¹ The experimental work for this study was performed as part of the United States Air Force Human Resources Research and Development Program. The opinions and conclusions contained in this report are those of the author. They are not to be construed as reflecting the views or indorsement of the Department of the Air Force.
FIG. 1. Illustration of redundant visual stimulation.
redundancy of printed English. We may divide the picture into arbitrarily small elements which we "transmit" to a subject (S) in a cumulative sequence, having him guess at the color of each successive element until he is correct. This method of analysis resembles the scanning process used in television and facsimile systems, and accomplishes the like purpose of transforming two spatial dimensions into a single sequence in time. We are in no way supposing or assuming, however, that perception normally involves any such scanning process. If the picture is divided into 50 rows and 80 columns, as indicated, our S will guess at each of 4,000 cells as many times as necessary to determine which of the three colors it has. If his error score is significantly less than chance [2/3 × 4,000 + 1/2(2/3 × 4,000) = 4,000], it is evident that the picture is to some degree redundant. Actually, he may be expected to guess his way through Fig. 1 with only 15 or 20 errors. It is fairly apparent that the technique described, in its present form, is limited in applicability to simple and somewhat contrived situations. With suitable modification it may have general usefulness as a research tool, but it is introduced into the present paper for demonstrational purposes only. Let us follow a hypothetical subject through this procedure in some detail, noting carefully the places where he is most likely to make errors, since these are the places in which information is concentrated. To begin, we give him an 80 × 50 sheet of graph paper, telling him that he is to guess whether each cell is white, black, or brown, starting in the lower left corner and proceeding across the first row, then across the second, and so on to the last cell in the upper right corner. Whenever he makes an error, he is allowed to guess a second and, if necessary, a third time until he is correct. He keeps a record of the cells he
has been over by filling in black and brown ones with pencil marks of appropriate color, leaving white ones blank. After a few errors at the beginning of the first row, he will discover that the next cell is "always" white, and predict accordingly. This prediction will be correct as far as Column 20, but on 21 it will be wrong. After a few more errors he will learn that "brown" is his best prediction, as in fact it is to the end of the row. Chances are good that the subject will assume the second row to be exactly like the first, in which case he will guess it with no errors; otherwise he may make an error or two at the beginning, or at the edge of the "table," as before. He is almost certain to be entirely correct on Row 3, and on subsequent rows through 20. On Row 21, however, it is equally certain that he will erroneously predict a transition from white to brown on Column 21, where the corner of the table is passed. Our subject's behavior to this point demonstrates two principles which may be discussed before we follow him through the remainder of his predictions. It is evident that redundant visual stimulation results from either (a) an area of homogeneous color ("color" is used in the broad sense here, and includes brightness), or (b) a contour of homogeneous direction or slope. In other words, information is concentrated along contours (i.e., regions where color changes abruptly), and is further concentrated at those points on a contour at which its direction changes most rapidly (i.e., at angles or peaks of curvature).²
² Our "scanning" procedure introduces a certain artifact here, in that a particular subject will make errors at a linear contour only the first few times he crosses it. It is fairly obvious that if the starting point of the sequence and the direction of scan were varied randomly over a large number of subjects, summated errors would be distributed evenly along such a straight contour.
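To make the demonstration concrete, here is a minimal simulation - an editorial sketch, not Attneave's actual procedure. The scene layout and the prediction rule (repeat the colour of the cell just seen) are simplifying assumptions; the point is only that the error count stays far below chance because the field is redundant.

```python
# Editorial stand-in for Fig. 1: a 50 x 80 field of 'w' (white wall),
# 'b' (brown desk) and 'k' (black bottle).
ROWS, COLS = 50, 80

def colour(r, c):
    """Assumed layout: desk fills columns 20-79 of rows 0-19,
    the bottle stands on the desk corner, wall everywhere else."""
    if 20 <= r < 36 and 20 <= c < 30:
        return 'k'                       # bottle
    if r < 20 and c >= 20:
        return 'b'                       # desk top
    return 'w'                           # wall

errors = 0
for r in range(ROWS):
    for c in range(COLS):
        if c > 0:
            guess = colour(r, c - 1)     # predict: same colour as the cell just seen
        elif r > 0:
            guess = colour(r - 1, 0)     # row start: same colour as the cell below
        else:
            guess = 'w'
        if guess != colour(r, c):
            errors += 1

print(errors)  # a few dozen, against roughly 2,700 expected for random first guesses
```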
Evidence from other and entirely different situations supports both of these inferences. The concentration of information in contours is illustrated by the remarkably similar appearance of objects alike in contour and different otherwise. The "same" triangle, for example, may be either white on black or green on white. Even more impressive is the familiar fact that an artist's sketch, in which lines are substituted for sharp color gradients, may constitute a readily identifiable representation of a person or thing. An experiment relevant to the second principle, i.e., that information is further concentrated at points where a contour changes direction most rapidly, may be summarized briefly.³ Eighty Ss were instructed to draw, for each of 16 outline shapes, a pattern of 10 dots which would resemble the shape as closely as possible, and then to indicate on the original outline the exact places which the dots represented. A good sample of the results is shown in Fig. 2: radial bars indicate the relative frequency with which dots were placed on each of the segments into which the contour was divided for scoring purposes. It is clear that Ss show a great deal of agreement in their abstractions of points best representing the shape, and most of these points are taken from regions where the contour is most different from a straight line. This conclusion is verified by detailed comparisons of dot frequencies with measured curvatures on both the figure shown and others.

FIG. 2. Subjects attempted to approximate the closed figure shown above with a pattern of 10 dots. Radiating bars indicate the relative frequency with which various portions of the outline were represented by dots chosen.
Common objects may be represented with great economy, and fairly striking fidelity, by copying the points at which their contours change direction maximally, and then connecting these points appropriately with a straightedge. Figure 3 was drawn by applying this technique, as mechanically as possible, to a real sleeping cat. The informational content of a drawing like this may be considered to consist of two components: one describing the positions of the points, the other indicating which points are connected with which others. The first of these components will almost always contain more information than the second, but its exact share will depend upon the precision with which positions are designated, and will further vary from object to object.
³ This study has been previously published only in the form of a mimeographed note: "The Relative Importance of Parts of a Contour," Research Note P&MS 51-8, Human Resources Research Center, November 1951.
FIG. 3. Drawing made by abstracting 38 points of maximum curvature from the contours of a sleeping cat, and connecting these points appropriately with a straightedge.
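The construction of Fig. 3 can be imitated mechanically. The following sketch is an editorial illustration under simplifying assumptions (the contour is given as an ordered list of points, and "curvature" is taken as the turning angle at each point): it selects the k points where the contour turns most sharply, which would then be joined with a straightedge.

```python
import math

def turning_angle(prev_pt, pt, next_pt):
    """Absolute change of direction at pt (a crude discrete 'curvature')."""
    a1 = math.atan2(pt[1] - prev_pt[1], pt[0] - prev_pt[0])
    a2 = math.atan2(next_pt[1] - pt[1], next_pt[0] - pt[0])
    d = abs(a2 - a1)
    return min(d, 2 * math.pi - d)

def salient_points(contour, k):
    """Indices of the k contour points of maximum turning angle, in contour order."""
    n = len(contour)
    scored = sorted(range(n),
                    key=lambda i: turning_angle(contour[i - 1], contour[i],
                                                contour[(i + 1) % n]),
                    reverse=True)
    return sorted(scored[:k])

# A toy closed contour; joining the returned points with straight segments
# gives the economical, Fig. 3 style version of the shape.
contour = [(0, 0), (2, 0), (4, 0), (5, 2), (4, 4), (2, 5), (0, 4), (-1, 2)]
print(salient_points(contour, 4))
```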
Let us now return to the hypothetical subject whom we left between the corner of the table and the ink bottle in Fig. 1. His errors will follow the principles we have just been discussing until he reaches the serrated shoulders of the bottle. (A straight 45° line would be represented in this way because of the grain of the coordinate system, but we shall consider that the bottle is actually serrated, as it is from the subject's point of view.) On the left shoulder there are 13 right angles, but these angles contain considerably less than 13 times the information of an angle in isolation like the corner of the table. This is true because they fall into a pattern which is repetitive, or redundant in the everyday sense of the term. They will cease to evoke errors as soon as S perceives their regularity and extrapolates it. This extrapolation, precisely like S's previous extrapolations of color and slope, will have validity only over a limited range and will itself lead to error on Row 38, Column 48. At about the same time that he discovers the regularity of the stair-step pattern (or perhaps a little before), our S will also perceive that the ink bottle is symmetrical, i.e., that the right contour is predictable from the left one by means of a simple reversal. As a result he is very unlikely to make any further errors on the right side above Row 32 or 33. Symmetry, then, constitutes another form of redundancy.⁴

⁴ The reader may be comforted to know that six subjects have actually been run on the task described. Their errors, which ranged in number from 13 to 26, were distributed as suggested above, with a single interesting exception: 4 of the 6 Ss assumed on Row 1 that the brown area would be located symmetrically within the field, and guessed "white" on Column 61. By the use of Shannon's formulas (17) it was estimated that the field contains between 34 (lower limit) and 156 (upper limit) bits of information, in contrast to a possible maximum of 6,340 bits. The redundancy is thus calculated to be between 97.5 and 99.5 per cent.
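The figures in the footnote above can be checked with a line of arithmetic (an editorial note): 4,000 three-valued cells carry at most 4,000 × log₂ 3 ≈ 6,340 bits, and an estimated content of 34 to 156 bits then corresponds to a redundancy between roughly 97.5 and 99.5 per cent.

```python
import math

cells, colours = 4000, 3
max_bits = cells * math.log2(colours)            # about 6,340 bits
for estimated_bits in (34, 156):
    redundancy = 1 - estimated_bits / max_bits
    print(round(max_bits), estimated_bits, f"{redundancy:.1%}")
```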
It should be fairly evident by now that many of the gestalt principles of perceptual organization pertain essentially to information distribution. The good gestalt is a figure with some high degree of internal redundancy. That the grouping laws of similarity, good continuation, and common fate all refer to conditions which reduce uncertainty is clear enough after the preceding discussion, and we shall presently see that proximity may be conceptualized in a like manner. It is not surprising that the perceptual machinery should "group" those portions of its input which share the same information: any system handling redundant information in an efficient manner would necessarily do something of the sort. Musatti (20) came very close to the present point when he suggested that a single principle of homogeneity might subsume Wertheimer's laws as special cases. All of our hypothetical S's extrapolations have involved some variety of homogeneity (or invariance), either of color, of slope, or of pattern. The kinds of extrapolation that have been discussed certainly do not exhaust the repertory of the human observer. For example, if the brightness of a surface were changed at a constant rate along some spatial extent, an observer could probably extrapolate this change with a fair degree of accuracy (given an appropriate response medium, such as choosing from a set of Munsell color patches). Likewise, we may reasonably suppose that a contour, the direction of which changes at a constant rate (i.e., the arc of a circle), could be extrapolated. Any sort of physical invariance whatsoever constitutes a source of redundancy for an organism capable of abstracting the invariance and utilizing it appropriately, but we actually know very little about the limits of the human perceptual machinery with respect to such abilities. A group of
psychophysical studies determining the accuracy with which observers are able to extrapolate certain discrete and continuous functions of varying complexity must be carried out before we can usefully discuss any but the simplest cases.⁵ A troublesome question arises in this connection: where does perception leave off and inductive reasoning begin? The abstraction of simple homogeneities from a visual field does not appear to be different, in its formal aspects, from the induction of a highly general scientific law from a mass of experimental data. Certain subjective differences are obvious enough: thus reasoning seems to involve conscious effort, whereas perception seems to involve a set of processes whereby information is predigested before it ever reaches awareness. When extrapolations are required of a subject in an experimental situation, however, it is difficult or impossible for the experimenter to be certain whether the subject is responding on an "intuitive" or a "deliberative" basis. I do not know any general solution to this problem, and can only suggest that a limited control may be exercised by way of the establishment of a desired set in the subject.
THE ABSTRACTION OF STATISTICAL PARAMETERS
Although Fig. 1 presents a situation much simpler, or more redundant, than the visual situations which ordinarily confront us, the reader need merely look around the room in which he is sitting to find that the principles illustrated apply to the real world. Further, it may be argued on neurological grounds that the human brain could not possibly utilize all the information provided by states of stimulation which were not highly redundant. According to Polyak's (14) estimate, the retina contains not less than four million cones. At any given instant each of these cones may be in either of two states: firing or not firing. Thus the retina as a whole might be in any one of about 2^4,000,000, or 10^1,200,000, states, each representing a different configuration of visual stimulation. Now, if by some unspecified mechanism each of these states were to evoke a different unitary response, and if a unitary response consists merely of the firing of a single unique neuron, then 10^1,200,000 such response-neurons would be required. The fantastic magnitude of this figure becomes somewhat apparent when one calculates that only about 10^54 neurons could be packed into a cubic light year. The fact that the number of patterns of response-neurons might plausibly equal the number of retinal configurations simplifies matters only if there are certain one-to-one connections between cones and response-neurons, in which case the response is to some degree merely a copy of the stimulus.
⁵ There is, however, a great deal more that can be said about the simplest cases. Vernier acuity demonstrates that, under optimal conditions, error of extrapolation may be less than the "minimum separable." It has been found by Salomon (16) that the error made in "aiming" a line at a point some constant distance from its end is a decreasing, negatively accelerated function of the line's length. This may be taken to mean that increasing the length of a line adds information about its extension, but at a decreasing rate, somewhat as increasing the length of a passage of English text adds decreasing increments of information about the next letter (13, 17). Dr. Karl Zener, under whose direction the Salomon study was done, is at present conducting a program of related psychophysical experiments which may answer some of the questions raised above.
We may nevertheless ask: how would an observer respond to a situation in which the retinal receptors were stimulated quite independently of one another? This situation would be in practice very difficult to achieve (even more difficult than its diametric opposite, the
Ganzfeld), particularly if we demanded that the stimulation at a given moment (which might be supposed to have a duration of about 100 msec. [see Attneave and McReynolds, 1]) be entirely independent of the stimulation at any other moment. In an effort to get some notion of what such a random field would be like, Fig. 4 was constructed. Each of the 140² = 19,600 small cells of the figure was either filled or not filled according to the value of a number obtained from a conversion of Snedecor's (18) table of random numbers from decimal to binary.⁶ If the figure is viewed from a distance such that the angle subtended by a cell is of the order of the "minimum separable" (about 1'), it illustrates roughly how a small portion of the random field suggested above might look at some particular instant.

FIG. 4. A "random field" consisting of 19,600 cells. The state of each cell (black vs. white) was determined independently with a p of .50.

⁶ This laborious task was carried out by Airmen 1/C W. H. Price and E. F. Chiburis. Unfortunately, a slight distortion of the relative sizes of black and white cells was introduced in the photographic copying process. The figure was constructed not only for demonstration purposes, but also to serve as a source of random patterns for experimental use. It may also be used wherever a table of random binary numbers is needed, facilitating, for example, the selection of random "draws" from a binomial distribution.

Perhaps the most striking thing about the figure is the subjective impression of homogeneity that it gives: the left half of the figure seems, at least in a general way, very much like the right half. This is remarkable because we have previously associated homogeneity with redundancy, and Fig. 4 was constructed to be completely nonredundant. Now, in psychological terms, it is fairly clear that the characteristic with respect to which the figure appears homogeneous is what Gibson (6) would call its texture. In physical terms, two invariant factors may be specified: (a) the probability (.50) that any cell will be black rather than white, and (b) the size of cells. Both of these factors probably contribute to perceived texture, which is undoubtedly a multidimensional variable, though the latter may be somewhat the more important. If the figure is viewed from a sufficient distance, these two parameters become identifiable with (a) the central tendency, and (b) the dispersion, of a continuous brightness distribution in two dimensions.
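The construction of Fig. 4 is easy to reproduce; the sketch below is an editorial addition that uses a pseudo-random generator in place of Snedecor's printed table of random numbers.

```python
import random

def random_field(rows, cols, p_black=0.5, seed=0):
    """Each cell is set black independently with probability p_black."""
    rng = random.Random(seed)
    return [[rng.random() < p_black for _ in range(cols)] for _ in range(rows)]

field = random_field(140, 140)                       # 140 x 140 = 19,600 cells
print(sum(cell for row in field for cell in row))    # roughly half are black
```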
It appears, then, that when some portion of the visual field contains a quantity of information grossly in excess of the observer's perceptual capacity, he treats those components of information which do not have redundant representation somewhat as a statistician treats "error variance," averaging out particulars and abstracting certain statistical homogeneities. Such an averaging process was involved in drawing the cat for Fig. 3. It was said earlier that the points of the drawing corresponded to places of maximum curvature on the contour of the cat, but this was not strictly correct; if the principle had been followed rigidly, it would have been necessary to represent the ends of individual hairs by points.
In observing a cat, however, one does not ordinarily perceive its hairs as individual entities; instead one perceives that the cat is furry. Furriness is a kind of texture; the statistical parameters which characterize it presumably involve averages of shape and direction, as well as size, of elements. The perceived contour of a cat (e.g., the contour from which the points of Fig. 3 were taken) is the resultant of an orthogonal averaging process in which texture is eliminated or smoothed out almost entirely, somewhat as if a photograph of the object were blurred and then printed on high-contrast paper (cf. Rashevsky, 15, and Culbertson, 5). The sense in which a surface of a particular texture may be said to provide redundant stimulation has perhaps been adequately indicated. This sort of redundancy might be demonstrated by the guessing-game technique, with a suitable modification in the level of prediction required, i.e., by increasing the unit area to be predicted and requiring the subject to select from a multidimensional array of samples the texture (i.e., the statistical parameters) which he believes the next unit will have. In view of Gibson's (6) convincing argument that a physical edge, or contour, is as likely to be represented in vision by an abrupt texture change as by an abrupt color change, I have considered it important to show how texture may be substituted for color without materially altering the principles derived from Fig. 1.

PERCEPTION AS ECONOMICAL DESCRIPTION
It is sometimes said that the objective of science is to describe nature economically. We have reason to believe, however, that some such process of parsimonious description has its beginnings on a fairly naive perceptual level, in
scientists and their fellow organisms alike; thus the difficulty, mentioned earlier, of distinguishing between perception and inductive reasoning. It appears likely that a major function of the perceptual machinery is to strip away some of the redundancy of stimulation, to describe or encode incoming information in a form more economical than that in which it impinges on the receptors. If this point of view is sound, we should be able to generate plausible hypotheses as to the nature of specific perceptual processes by considering rational operations which one might deliberately employ to reduce redundancy. The approach suggested, as it applies to the perception of a static visual field, is equivalent to that of a communications engineer who wishes to design a system for transmitting pictures of real things over a practically noise-free channel with the utmost economy of channel time and band width, but in a manner designed to meet standards such as human observers are likely to have. Some of the reduction principles which he might usefully employ in such a system are listed below. It will be found that these principles serve to summarize and integrate ideas which have been developed somewhat informally in the foregoing sections, as well as to introduce new considerations. The principles may be grouped according to the forms of redundancy with which they are concerned: thus 1-4 deal with varieties of continuous regularity; 5 and 6 with discontinuous regularity, or recurrence; 7-9 with proximity; and 10 with situations involving interaction.

1. An area of homogeneous color may be described by specifying the color and the boundaries of the area over which it is homogeneous. (It is assumed that limits of error tolerance on relevant dimensions have been agreed upon, e.g., that there is some definite number of
colors from which the receiving mechanism may be directed to choose.)

2. Likewise, an area of homogeneous texture may be described by specifying the statistical parameters which characterize the texture and the boundaries of the area over which these parameters are relatively invariant. Thus, if Fig. 4 represented a part of the upholstery of a sofa, it would probably be satisfactory simply to instruct the receiving mechanism to reproduce the texture by filling in cells of a certain size from any table of random numbers. It is true that this process would result in the complete loss of 19,600 bits of information; the essential point is that we are dealing here with a class of stimuli from which such a huge information loss is perceptually tolerable.

3. An area over which either color or texture varies according to some regular function may be described by specifying the function and the boundaries of the area over which it obtains (cf. Gibson's [6] texture gradient). This principle actually implies both 1 and 2 as special cases.

4. Likewise, if some segment of an area boundary (i.e., contour) either maintains a constant direction or varies according to some other regular function, it may be described by specifying the function and the loci of its limiting points. Figure 3 illustrates a special case of this principle.

5. If two or more identical stimulus patterns (these might be either successive portions of a contour, or separate and discrete objects) appear at different places in the same field, all may be described by describing one and specifying the positions of the others and the fact that they are identical (cf. similarity as a grouping law).

6. If two or more patterns are similar but not identical, it may be economical to proceed as in 5, in addition specifying either (a) how subsequent patterns
differ from the first, or else (b) how each pattern differs from some skeleton pattern which includes the communalities of the group (cf. the "schema-with-correction" idea discussed by Woodworth [20]; also Hebb's [7] treatment of perceptual schemata).

7. When the spatial loci of a number of points are to be described in some arbitrary order, and the points are arranged in clusters or proximity groups (as in Fig. 5), it may be economical to describe the points of each group with respect to some local origin (O' or O''), transmitting as a separate component the positions of the local origins with respect either to each other or to some arbitrary origin, whichever is required. Since the points occupy a smaller range of alternative coordinates on the local axes than on arbitrary axes, less information is required for their specification. If the amount of information thus saved is greater than the amount needed to specify the positions of the local origins, a net saving will result. What is redundant in the present case is the approximate location of points in a cluster: this component is isolated out when a local origin is described (cf. the concepts of "within" and "between" variance).
:
!
i *
M t •
r'
F I G . 5. A functional aspect of proximitygrouping is illustrated. Tbe loci of clustered points may be described with choices from a smaller set of numbers if local origins are used.
203
The local origin principle may also be used in conjunction with some regular scanning procedure if the order in which the points are to be specified is not predetermined (but see also 9, below). The relevance of these considerations to proximity as a perceptual grouping law is evident.

8. The preceding principle may be generalized to apply to dimensions other than spatial ones; e.g., brightness, coarseness of texture, etc. A "local origin" on such a continuum would appear to have essentially the characteristics of Helson's (8) adaptation-level, in terms of which constancy phenomena and a variety of other psychophysical findings may be accounted for. This generalized principle is closely similar to 6 above, the chief difference being that 6 is applicable to combinations of discrete variables, or to situations of ambiguous dimensional organization.

9. If the loci of a number of points are to be described, and the order in which they are taken is immaterial, they may be arranged in a sequence such that the distances between adjacent points are minimized, and transmitted with each point serving as origin for the one following it. This procedure will result in some saving if the points are clustered, as in Fig. 5, but it is most clearly applicable when the points are "strung-out" in some obvious sequence. In the latter case, a further economy may be achieved by the use of special coordinates such as distance from a line passing through the two preceding points (or from an arc through the three preceding points, etc.; cf. 4 above).

10. Certain areas and objects may be described in a relatively simple way, by procedures of the sort suggested above, if they are first subjected to some systematic distortion or transformation. Consider the case of a complex,
symmetrical, two-dimensional pattern viewed from an angle such that its retinal or photographic image is not symmetrical. It will be economical to transmit a description of the pattern as if it were in the frontal plane, and thus symmetrical (eliminating the redundancy of symmetry by means of 6a), together with a description of the transformation which relates the frontal aspect described to the oblique aspect in which the pattern is viewed (cf. Gibson's [6] discussion of perspective transformations; also the "Thompsonian coordinates" of D'Arcy Thompson [19]). Koffka (10) and other gestalt psychologists have held that many objects have some "preferred" aspect, and that this aspect has the characteristics of a "good gestalt." The present principle supports this view on functional grounds, since the perceptual transformation of a figure to an aspect in which similarities among parts are maximized may be interpreted as the initial step in an efficient information-digesting process. It should be clearly recognized, however, that an over-all economy is achieved only if the amount of information required to describe the transformation is less than the amount of information saved by virtue of the transformation; thus a transformation must be relatively simple to be considered useful, at least by the present criterion.

Let me indicate briefly how these considerations may be integrated with others of a more general nature. Interdependencies among sensory events may exist either in space or in time, or they may cut across both space and time. In studying the redundancy of spoken English (11), for example, one is dealing with interdependencies which may be considered purely temporal. The present discussion has been restricted, quite arbitrarily, to relationships in space: to forms of redundancy and information distribution which may obtain in the
visual field at a particular instant, and which a computer of conceivable complexity might evaluate from a photograph. The extension of the visual field in time, which I propose to discuss in a subsequent paper, introduces new varieties of redundancy involving the temporal continuation or recurrence of spatial configurations which may be nonredundant at any instant considered in isolation. Any individual learns a great deal, over his life span, about what-goes-with-what. Thus, if an ear is disclosed in a situation like that illustrated by Fig. 1, the observer can predict that a mouth, nose, eyes, etc. are also present, and approximately where they are. This sort of redundancy is spatiotemporal in its basis; predictions are not possible merely on the basis of the present visual field, but depend also upon previous fields which have contained faces. Principle 6 above suggests the approach to economical description which might be extended to such cases. Further, as Brunswik (2, 3, 4) has pointed out in some detail, ecological principles of very broad generality may be derived from experience.⁷ For example, the frequency with which an observer has encountered symmetrical objects in his past may certainly affect the point at which, in predicting successive cells of Fig. 1, he "assumes" that the ink bottle is symmetrical. Likewise in terms of economical encoding: each of the varieties of spatial redundancy suggested above will itself occur with some determinate frequency over any given set of fields (e.g., the
set of pictures which a computer-transmitter might have been required to handle over some period of past operation), and a knowledge of this and related frequencies may be used in determining the optimal assignments of actual code symbols. As a result of factors such as these, spatial and spatiotemporal redundancy (or entropy) are difficult to separate empirically, but the distinction remains a conceptually convenient one. The foregoing reduction principles make no pretense to exhaustiveness. It should be emphasized that there are as many kinds of redundancy in the visual field as there are kinds of regularity or lawfulness; an attempt to consider them all would be somewhat presumptuous on one hand, and almost certainly irrelevant to perceptual processes on the other. It may further be admitted that the principles which have been given are themselves highly redundant in the sense that they could be stated much more economically on a higher level of abstraction. This logical redundancy is not inadvertent, however: if one were faced with the engineering problem suggested earlier, he would undoubtedly find it necessary to break the problem down in some manner such as the foregoing, and to design a multiplicity of mechanisms to perform operations of the sort indicated (some principles, e.g., 6, would require further breakdown for this purpose). Likewise, the principles are frankly intended to suggest operations which the perceptual machinery may actually perform, and accordingly the types of measurement which are likely to prove appropriate in the quantitative psychophysical study of complex perceptual processes.

⁷ Brunswik (2), Hebb (7), and the Ames group at Princeton (9) have advanced views concerning the role of experience in perception which have much in common with one another and with my own position in the matter. It appears to me, however, that they have in general tended to underestimate (as the gestalt psychologists have somewhat overestimated) the importance of lawful relationships which may exist within the static and isolated visual field.
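Attneave's closing remark about assigning code symbols according to their frequencies is, in modern terms, the idea behind entropy coding. The following editorial illustration, with a made-up frequency table, computes the entropy that an optimal assignment could approach, against the cost of a fixed-length code.

```python
import math

def entropy_bits(frequencies):
    """Average bits per symbol, H = -sum p_i log2 p_i, for observed frequencies."""
    total = sum(frequencies.values())
    return -sum((f / total) * math.log2(f / total) for f in frequencies.values())

# Hypothetical counts of how often each kind of regularity occurred in past fields.
counts = {"homogeneous colour": 60, "straight contour": 25,
          "repetition": 10, "symmetry": 5}
print(round(entropy_bits(counts), 2), "bits/symbol, versus",
      math.log2(len(counts)), "for a fixed-length code")
```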
REFERENCES

1. Attneave, F., & McReynolds, P. W. A visual beat phenomenon. Amer. J. Psychol., 1950, 63, 107-110.
2. Brunswik, E. Systematic and representative design of psychological experiments: with results in physical and social perception. Berkeley: Univer. of California Press, 1947.
3. Brunswik, E. The conceptual framework of psychology. Chicago: Univer. of Chicago Press, 1952.
4. Brunswik, E., & Kamiya, J. Ecological cue-validity of "proximity" and of other gestalt factors. Amer. J. Psychol., 1953, 66, 20-32.
5. Culbertson, J. T. Consciousness and behavior. Dubuque: Wm. C. Brown, 1950.
6. Gibson, J. J. The perception of the visual world. Boston: Houghton Mifflin, 1950.
7. Hebb, D. O. Organization of behavior. New York: Wiley, 1949.
8. Helson, H. Adaptation-level as frame of reference for prediction of psychophysical data. Amer. J. Psychol., 1947, 60, 1-29.
9. Ittelson, W. H. The Ames demonstrations in perception. Princeton: Princeton Univer. Press, 1952.
10. Koffka, K. Principles of gestalt psychology. New York: Harcourt, Brace, 1935.
11. Licklider, J. C. R., & Miller, G. A. The perception of speech. In S. S. Stevens (Ed.), Handbook of experimental psychology. New York: Wiley, 1951. Pp. 1040-1074.
12. Miller, G. A. What is information measurement? Amer. Psychologist, 1953, 8, 3-11.
13. Newman, E. B., & Gerstman, L. S. A new method for analyzing printed English. J. exp. Psychol., 1952, 44, 114-125.
14. Polyak, S. L. The retina. Chicago: Univer. of Chicago Press, 1941.
15. Rashevsky, N. Mathematical biophysics. Chicago: Univer. of Chicago Press, 1948.
16. Salomon, A. D. Visual field factors in the perception of direction. Amer. J. Psychol., 1947, 60, 68-88.
17. Shannon, C. E. Prediction and entropy of printed English. Bell Syst. tech. J., 1951, 30, 50-64.
18. Snedecor, G. W. Statistical methods. Ames: Iowa State Coll. Press, 1946.
19. Thompson, D. W. Growth and form. New York: Macmillan, 1942.
20. Woodworth, R. S. Experimental psychology. New York: Holt, 1938.

(Received June 26, 1953)
Reprinted with permission from Psychological Review, Vol. 63, No. 2, pp. 81-97, 1956 © 1956 American Psychological Association
THE MAGICAL NUMBER SEVEN, PLUS OR MINUS TWO: SOME LIMITS ON OUR CAPACITY FOR PROCESSING INFORMATION¹

GEORGE A. MILLER
Harvard University

My problem is that I have been persecuted by an integer. For seven years this number has followed me around, has intruded in my most private data, and has assaulted me from the pages of our most public journals. This number assumes a variety of disguises, being sometimes a little larger and sometimes a little smaller than usual, but never changing so much as to be unrecognizable. The persistence with which this number plagues me is far more than a random accident. There is, to quote a famous senator, a design behind it, some pattern governing its appearances. Either there really is something unusual about the number or else I am suffering from delusions of persecution.

I shall begin my case history by telling you about some experiments that tested how accurately people can assign numbers to the magnitudes of various aspects of a stimulus. In the traditional language of psychology these would be called experiments in absolute judgment. Historical accident, however, has decreed that they should have another name. We now call them experiments on the capacity of people to transmit information. Since these experiments would not have been done without the appearance of information theory on the psychological scene, and since the results are analyzed in terms of the concepts of information theory, I shall have to preface my discussion with a few remarks about this theory.

¹ This paper was first read as an Invited Address before the Eastern Psychological Association in Philadelphia on April 15, 1955. Preparation of the paper was supported by the Harvard Psycho-Acoustic Laboratory under Contract N5ori-76 between Harvard University and the Office of Naval Research, U. S. Navy (Project NR 142-201, Report PNR-174). Reproduction for any purpose of the U. S. Government is permitted.

INFORMATION MEASUREMENT
The "amount of information" is exactly the same concept that we have talked about for years under the name of "variance." The equations are different, but if we bold tight to the idea that anything that increases the variance also increases the amount of information we cannot go far astray. The advantages of this new way of talking about variance are simple enough. Variance is always stated in terms of the unit of measurement— inches, pounds, volts, etc.—whereas the amount of information is a dimensionless quantity. Since the information in a discrete statistical distribution does not depend upon the unit of measurement, we can extend the concept to situations where we have no metric and we would not ordinarily think of using
the variance. And it also enables us to compare results obtained in quite different experimental situations where it would be meaningless to compare variances based on different metrics. So there are some good reasons for adopting the newer concept.

The similarity of variance and amount of information might be explained this way: When we have a large variance, we are very ignorant about what is going to happen. If we are very ignorant, then when we make the observation it gives us a lot of information. On the other hand, if the variance is very small, we know in advance how our observation must come out, so we get little information from making the observation.

If you will now imagine a communication system, you will realize that there is a great deal of variability about what goes into the system and also a great deal of variability about what comes out. The input and the output can therefore be described in terms of their variance (or their information). If it is a good communication system, however, there must be some systematic relation between what goes in and what comes out. That is to say, the output will depend upon the input, or will be correlated with the input. If we measure this correlation, then we can say how much of the output variance is attributable to the input and how much is due to random fluctuations or "noise" introduced by the system during transmission. So we see that the measure of transmitted information is simply a measure of the input-output correlation.

There are two simple rules to follow. Whenever I refer to "amount of information," you will understand "variance." And whenever I refer to "amount of transmitted information," you will understand "covariance" or "correlation."

The situation can be described graphically by two partially overlapping circles.
Then the left circle can be taken to represent the variance of the input, the right circle the variance of the output, and the overlap the covariance of input and output. I shall speak of the left circle as the amount of input information, the right circle as the amount of output information, and the overlap as the amount of transmitted information.

In the experiments on absolute judgment, the observer is considered to be a communication channel. Then the left circle would represent the amount of information in the stimuli, the right circle the amount of information in his responses, and the overlap the stimulus-response correlation as measured by the amount of transmitted information. The experimental problem is to increase the amount of input information and to measure the amount of transmitted information. If the observer's absolute judgments are quite accurate, then nearly all of the input information will be transmitted and will be recoverable from his responses. If he makes errors, then the transmitted information may be considerably less than the input. We expect that, as we increase the amount of input information, the observer will begin to make more and more errors; we can test the limits of accuracy of his absolute judgments. If the human observer is a reasonable kind of communication system, then when we increase the amount of input information the transmitted information will increase at first and will eventually level off at some asymptotic value. This asymptotic value we take to be the channel capacity of the observer: it represents the greatest amount of information that he can give us about the stimulus on the basis of an absolute judgment.

The channel capacity is the upper limit on the extent to which the observer can match his responses to the stimuli we give him. Now just a brief word about the bit
and we can begin to look at some data. One bit of information is the amount of information that we need to make a decision between two equally likely alternatives. If we must decide whether a man is less than six feet tall or more than six feet tall and if we know that the chances are 50-50, then we need one bit of information. Notice that this unit of information does not refer in any way to the unit of length that we use—feet, inches, centimeters, etc. However you measure the man's height, we still need just one bit of information.

Two bits of information enable us to decide among four equally likely alternatives. Three bits of information enable us to decide among eight equally likely alternatives. Four bits of information decide among 16 alternatives, five among 32, and so on. That is to say, if there are 32 equally likely alternatives, we must make five successive binary decisions, worth one bit each, before we know which alternative is correct. So the general rule is simple: every time the number of alternatives is increased by a factor of two, one bit of information is added.

There are two ways we might increase the amount of input information. We could increase the rate at which we give information to the observer, so that the amount of information per unit time would increase. Or we could ignore the time variable completely and increase the amount of input information by increasing the number of alternative stimuli. In the absolute judgment experiment we are interested in the second alternative. We give the observer as much time as he wants to make his response; we simply increase the number of alternative stimuli among which he must discriminate and look to see where confusions begin to occur. Confusions will appear near the point that we are calling his "channel capacity."
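[Editorial sketch, not part of Miller's text: the "amount of transmitted information" described above is the mutual information between stimulus and response, and it can be estimated directly from a stimulus-response confusion matrix. The Python fragment below is a minimal illustration under that reading; the count matrix is invented for demonstration.]

```python
import numpy as np

def transmitted_information(counts):
    """Mutual information (in bits) between stimulus and response,
    estimated from a stimulus-by-response matrix of judgment counts."""
    p = counts / counts.sum()                  # joint probabilities
    ps = p.sum(axis=1, keepdims=True)          # stimulus marginals
    pr = p.sum(axis=0, keepdims=True)          # response marginals
    nz = p > 0                                 # avoid log(0)
    return float((p[nz] * np.log2(p[nz] / (ps @ pr)[nz])).sum())

# Hypothetical confusion matrix for four equally likely tones.
counts = np.array([[20,  5,  0,  0],
                   [ 4, 18,  3,  0],
                   [ 0,  4, 17,  4],
                   [ 0,  0,  5, 20]], dtype=float)

print(np.log2(4))                        # input information: 2 bits per stimulus
print(transmitted_information(counts))   # transmitted information: somewhat less than 2 bits
```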
ABSOLUTE JUDGMENTS OF UNIDIMENSIONAL STIMULI

Now let us consider what happens when we make absolute judgments of tones. Pollack (17) asked listeners to identify tones by assigning numerals to them. The tones were different with respect to frequency, and covered the range from 100 to 8000 cps in equal logarithmic steps. A tone was sounded and the listener responded by giving a numeral. After the listener had made his response he was told the correct identification of the tone.

When only two or three tones were used the listeners never confused them. With four different tones confusions were quite rare, but with five or more tones confusions were frequent. With fourteen different tones the listeners made many mistakes.

These data are plotted in Fig. 1. Along the bottom is the amount of input information, in bits per stimulus. As the number of alternative tones was increased from 2 to 14, the input information increased from 1 to 3.8 bits. On the ordinate is plotted the amount of
FIG. 1. Data from Pollack (17, 18) on the amount of information that is transmitted by listeners who make absolute judgments of auditory pitch. As the amount of input information is increased by increasing from 2 to 14 the number of different pitches to be judged, the amount of transmitted information approaches as its upper limit a channel capacity of about 2.5 bits per judgment.
transmitted information. The amount of transmitted information behaves in much the way we would expect a communication channel to behave; the transmitted information increases linearly up to about 2 bits and then bends off toward an asymptote at about 2.5 bits. This value, 2.5 bits, therefore, is what we are calling the channel capacity of the listener for absolute judgments of pitch.

So now we have the number 2.5 bits. What does it mean? First, note that 2.5 bits corresponds to about six equally likely alternatives. The result means that we cannot pick more than six different pitches that the listener will never confuse. Or, stated slightly differently, no matter how many alternative tones we ask him to judge, the best we can expect him to do is to assign them to about six different classes without error. Or, again, if we know that there were N alternative stimuli, then his judgment enables us to narrow down the particular stimulus to one out of N/6.
FIG. 2. Data from Garner (7) on the channel capacity for absolute judgments of auditory loudness.

Most people are surprised that the number is as small as six. Of course, there is evidence that a musically sophisticated person with absolute pitch can identify accurately any one of 50 or 60 different pitches. Fortunately, I do not have time to discuss these remarkable exceptions. I say it is fortunate because I do not know how to explain their superior performance. So I shall stick to the more pedestrian fact that most of us can identify about one out of only five or six pitches before we begin to get confused.

It is interesting to consider that psychologists have been using seven-point rating scales for a long time, on the intuitive basis that trying to rate into finer categories does not really add much to the usefulness of the ratings. Pollack's results indicate that, at least for pitches, this intuition is fairly sound.

Next you can ask how reproducible this result is. Does it depend on the spacing of the tones or the various conditions of judgment? Pollack varied these conditions in a number of ways. The range of frequencies can be changed by a factor of about 20 without changing the amount of information transmitted more than a small percentage. Different groupings of the pitches decreased the transmission, but the loss was small. For example, if you can discriminate five high-pitched tones in one series and five low-pitched tones in another series, it is reasonable to expect that you could combine all ten into a single series and still tell them all apart without error. When you try it, however, it does not work. The channel capacity for pitch seems to be about six and that is the best you can do.

While we are on tones, let us look next at Garner's (7) work on loudness. Garner's data for loudness are summarized in Fig. 2. Garner went to some trouble to get the best possible spacing of his tones over the intensity range from 15 to 110 db. He used 4, 5, 6, 7, 10, and 20 different stimulus intensities. The results shown in Fig. 2 take into account the differences among subjects and the sequential influence of the immediately preceding judgment. Again we find that there seems to be a limit.
The channel capacity for absolute judgments of loudness is 2.3 bits, or about five perfectly discriminable alternatives. Since these two studies were done in different laboratories with slightly different techniques and methods of analysis, we are not in a good position to argue whether five loudnesses is significantly different from six pitches. Probably the difference is in the right direction, and absolute judgments of pitch are slightly more accurate than absolute judgments of loudness. The important point, however, is that the two answers are of the same order of magnitude.

The experiment has also been done for taste intensities. In Fig. 3 are the results obtained by Beebe-Center, Rogers, and O'Connell (1) for absolute judgments of the concentration of salt solutions. The concentrations ranged from 0.3 to 34.7 gm. NaCl per 100 cc tap water in equal subjective steps. They used 3, 5, 9, and 17 different concentrations. The channel capacity is 1.9 bits, which is about four distinct concentrations. Thus taste intensities seem a little less distinctive than auditory stimuli, but again the order of magnitude is not far off.

FIG. 3. Data from Beebe-Center, Rogers, and O'Connell (1) on the channel capacity for absolute judgments of saltiness.

On the other hand, the channel capacity for judgments of visual position seems to be significantly larger. Hake and Garner (8) asked observers to interpolate visually between two scale markers. Their results are shown in Fig. 4. They did the experiment in two ways. In one version they let the observer use any number between zero and 100 to describe the position, although they presented stimuli at only 5, 10, 20, or 50 different positions. The results with this unlimited response technique are shown by the filled circles on the graph. In the other version the observers were limited in their responses to reporting just those stimulus values that were possible. That is to say, in the second version the number of different responses that the observer could make was exactly the same as the number of different stimuli that the experimenter might present. The results with this limited response technique are shown by the open circles on the graph. The two functions are so similar that it seems fair to conclude that the number of responses available to the observer had nothing to do with the channel capacity of 3.25 bits.

FIG. 4. Data from Hake and Garner (8) on the channel capacity for absolute judgments of the position of a pointer in a linear interval.

The Hake-Garner experiment has been repeated by Coonan and Klemmer. Although they have not yet published their results, they have given me permission to say that they obtained channel capacities ranging from 3.2 bits for
very short exposures of the pointer position to 3.9 bits for longer exposures. These values are slightly higher than Hake and Garner's, so we must conclude that there are between 10 and 15 distinct positions along a linear interval. This is the largest channel capacity that has been measured for any unidimensional variable.

At the present time these four experiments on absolute judgments of simple, unidimensional stimuli are all that have appeared in the psychological journals. However, a great deal of work on other stimulus variables has not yet appeared in the journals. For example, Eriksen and Hake (6) have found that the channel capacity for judging the sizes of squares is 2.2 bits, or about five categories, under a wide range of experimental conditions. In a separate experiment Eriksen (5) found 2.8 bits for size, 3.1 bits for hue, and 2.3 bits for brightness. Geldard has measured the channel capacity for the skin by placing vibrators on the chest region. A good observer can identify about four intensities, about five durations, and about seven locations.

One of the most active groups in this area has been the Air Force Operational Applications Laboratory. Pollack has been kind enough to furnish me with the results of their measurements for several aspects of visual displays. They made measurements for area and for the curvature, length, and direction of lines. In one set of experiments they used a very short exposure of the stimulus and then they repeated the measurements with a 5-second exposure. For area they got 2.6 bits with the short exposure and 2.7 bits with the long exposure. For the length of a line they got about 2.6 bits with the short exposure and about 3.0 bits with the long exposure. Direction, or angle of inclination, gave 2.8 bits for the short exposure and 3.3 bits for the long exposure. Curvature was apparently harder to judge. When the length of the arc was constant, the result at the short exposure duration was 2.2 bits, but when the length of the chord was constant, the result was only 1.6 bits. This last value is the lowest that anyone has measured to date. I should add, however, that these values are apt to be slightly too low because the data from all subjects were pooled before the transmitted information was computed.

Now let us see where we are. First, the channel capacity does seem to be a valid notion for describing human observers. Second, the channel capacities measured for these unidimensional variables range from 1.6 bits for curvature to 3.9 bits for positions in an interval. Although there is no question that the differences among the variables are real and meaningful, the more impressive fact to me is their considerable similarity. If I take the best estimates I can get of the channel capacities for all the stimulus variables I have mentioned, the mean is 2.6 bits and the standard deviation is only 0.6 bit. In terms of distinguishable alternatives, this mean corresponds to about 6.5 categories, one standard deviation includes from 4 to 10 categories, and the total range is from 3 to 15 categories. Considering the wide variety of different variables that have been studied, I find this to be a remarkably narrow range.

There seems to be some limitation built into us either by learning or by the design of our nervous systems, a limit that keeps our channel capacities in this general range. On the basis of the present evidence it seems safe to say that we possess a finite and rather small capacity for making such unidimensional judgments and that this capacity does not vary a great deal from one simple sensory attribute to another.
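[Editorial sketch, not part of Miller's text: the summary statistics just quoted can be recomputed from the capacities cited in this section. The list below is an editorial tally of those values rather than Miller's own worksheet; converting bits to equally likely categories uses n = 2^bits.]

```python
import statistics

# Channel capacities (bits) for unidimensional judgments cited above:
# pitch, loudness, taste, pointer position, positions (long exposure),
# squares, size, hue, brightness, area, line length, direction, curvature (arc, chord).
capacities = [2.5, 2.3, 1.9, 3.25, 3.9, 2.2, 2.8, 3.1, 2.3, 2.6, 3.0, 3.3, 2.2, 1.6]

mean = statistics.mean(capacities)
sd = statistics.stdev(capacities)
print(f"mean = {mean:.1f} bits, sd = {sd:.1f} bits")
print(f"mean categories ~ {2 ** mean:.1f}")
print(f"one-sd range    ~ {2 ** (mean - sd):.0f} to {2 ** (mean + sd):.0f} categories")
```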
ABSOLUTE JUDGMENTS OF MULTIDIMENSIONAL STIMULI

You may have noticed that I have been careful to say that this magical number seven applies to one-dimensional judgments. Everyday experience teaches us that we can identify accurately any one of several hundred faces, any one of several thousand words, any one of several thousand objects, etc. The story certainly would not be complete if we stopped at this point. We must have some understanding of why the one-dimensional variables we judge in the laboratory give results so far out of line with what we do constantly in our behavior outside the laboratory. A possible explanation lies in the number of independently variable attributes of the stimuli that are being judged. Objects, faces, words, and the like differ from one another in many ways, whereas the simple stimuli we have considered thus far differ from one another in only one respect.

Fortunately, there are a few data on what happens when we make absolute judgments of stimuli that differ from one another in several ways. Let us look first at the results Klemmer and Frick (13) have reported for the absolute judgment of the position of a dot in a square. In Fig. 5 we see their results.
FIG. 5. Data from Klemmer and Frick (13) on the channel capacity for absolute judgments of the position of a dot in a square.
Now the channel capacity seems to have increased to 4.6 bits, which means that people can identify accurately any one of 24 positions in the square.

The position of a dot in a square is clearly a two-dimensional proposition. Both its horizontal and its vertical position must be identified. Thus it seems natural to compare the 4.6-bit capacity for a square with the 3.25-bit capacity for the position of a point in an interval. The point in the square requires two judgments of the interval type. If we have a capacity of 3.25 bits for estimating intervals and we do this twice, we should get 6.5 bits as our capacity for locating points in a square. Adding the second independent dimension gives us an increase from 3.25 to 4.6, but it falls short of the perfect addition that would give 6.5 bits.

Another example is provided by Beebe-Center, Rogers, and O'Connell. When they asked people to identify both the saltiness and the sweetness of solutions containing various concentrations of salt and sucrose, they found that the channel capacity was 2.3 bits. Since the capacity for salt alone was 1.9, we might expect about 3.8 bits if the two aspects of the compound stimuli were judged independently. As with spatial locations, the second dimension adds a little to the capacity but not as much as it conceivably might.

A third example is provided by Pollack (18), who asked listeners to judge both the loudness and the pitch of pure tones. Since pitch gives 2.5 bits and loudness gives 2.3 bits, we might hope to get as much as 4.8 bits for pitch and loudness together. Pollack obtained 3.1 bits, which again indicates that the second dimension augments the channel capacity but not so much as it might.

A fourth example can be drawn from the work of Halsey and Chapanis (9) on confusions among colors of equal
luminance. Although they did not analyze their results in informational terms, they estimate that there are about 11 to 15 identifiable colors, or, in our terms, about 3.6 bits. Since these colors varied in both hue and saturation, it is probably correct to regard this as a two-dimensional judgment. If we compare this with Eriksen's 3.1 bits for hue (which is a questionable comparison to draw), we again have something less than perfect addition when a second dimension is added.

It is still a long way, however, from these two-dimensional examples to the multidimensional stimuli provided by faces, words, etc. To fill this gap we have only one experiment, an auditory study done by Pollack and Ficks (19). They managed to get six different acoustic variables that they could change: frequency, intensity, rate of interruption, on-time fraction, total duration, and spatial location. Each one of these six variables could assume any one of five different values, so altogether there were 5⁶, or 15,625 different tones that they could present. The listeners made a separate rating for each one of these six dimensions. Under these conditions the transmitted information was 7.2 bits, which corresponds to about 150 different categories that could be absolutely identified without error. Now we are beginning to get up into the range that ordinary experience would lead us to expect.

Suppose that we plot these data, fragmentary as they are, and make a guess about how the channel capacity changes with the dimensionality of the stimuli. The result is given in Fig. 6. In a moment of considerable daring I sketched the dotted line to indicate roughly the trend that the data seemed to be taking. Clearly, the addition of independently variable attributes to the stimulus increases the channel capacity, but at a
FIG. 6. The general form of the relation between channel capacity and the number of independently variable attributes of the stimuli.
decreasing rate. It is interesting to note that the channel capacity is increased even when the several variables are not independent. Eriksen (5) reports that, when size, brightness, and hue all vary together in perfect correlation, the transmitted information is 4.1 bits as compared with an average of about 2.7 bits when these attributes are varied one at a time. By confounding three attributes, Eriksen increased the dimensionality of the input without increasing the amount of input information; the result was an increase in channel capacity of about the amount that the dotted function in Fig. 6 would lead us to expect.

The point seems to be that, as we add more variables to the display, we increase the total capacity, but we decrease the accuracy for any particular variable. In other words, we can make relatively crude judgments of several things simultaneously.

We might argue that in the course of evolution those organisms were most successful that were responsive to the widest range of stimulus energies in their environment. In order to survive in a constantly fluctuating world, it was better to have a little information about a lot of things than to have a lot of information about a small segment of the
environment. If a compromise was necessary, the one we seem to have made is clearly the more adaptive.

Pollack and Ficks's results are very strongly suggestive of an argument that linguists and phoneticians have been making for some time (11). According to the linguistic analysis of the sounds of human speech, there are about eight or ten dimensions—the linguists call them distinctive features—that distinguish one phoneme from another. These distinctive features are usually binary, or at most ternary, in nature. For example, a binary distinction is made between vowels and consonants, a binary decision is made between oral and nasal consonants, a ternary decision is made among front, middle, and back phonemes, etc. This approach gives us quite a different picture of speech perception than we might otherwise obtain from our studies of the speech spectrum and of the ear's ability to discriminate relative differences among pure tones. I am personally much interested in this new approach (15), and I regret that there is not time to discuss it here.

It was probably with this linguistic theory in mind that Pollack and Ficks conducted a test on a set of tonal stimuli that varied in eight dimensions, but required only a binary decision on each dimension. With these tones they measured the transmitted information at 6.9 bits, or about 120 recognizable kinds of sounds. It is an intriguing question, as yet unexplored, whether one can go on adding dimensions indefinitely in this way. In human speech there is clearly a limit to the number of dimensions that we use. In this instance, however, it is not known whether the limit is imposed by the nature of the perceptual machinery that must recognize the sounds or by the nature of the speech machinery that must produce them. Somebody will have to do the experiment to find out. There is a limit, however, at about eight or nine distinctive features in every language that has been studied, and so when we talk we must resort to still another trick for increasing our channel capacity. Language uses sequences of phonemes, so we make several judgments successively when we listen to words and sentences. That is to say, we use both simultaneous and successive discriminations in order to expand the rather rigid limits imposed by the inaccuracy of our absolute judgments of simple magnitudes.

These multidimensional judgments are strongly reminiscent of the abstraction experiment of Külpe (14). As you may remember, Külpe showed that observers report more accurately on an attribute for which they are set than on attributes for which they are not set. For example, Chapman (4) used three different attributes and compared the results obtained when the observers were instructed before the tachistoscopic presentation with the results obtained when they were not told until after the presentation which one of the three attributes was to be reported. When the instruction was given in advance, the judgments were more accurate. When the instruction was given afterwards, the subjects presumably had to judge all three attributes in order to report on any one of them and the accuracy was correspondingly lower. This is in complete accord with the results we have just been considering, where the accuracy of judgment on each attribute decreased as more dimensions were added. The point is probably obvious, but I shall make it anyhow, that the abstraction experiments did not demonstrate that people can judge only one attribute at a time. They merely showed what seems quite reasonable, that people are less accurate if they must judge more than one attribute simultaneously.
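[Editorial sketch, not part of Miller's text: the "less than perfect addition" observed for two-dimensional judgments a few paragraphs above can be tabulated directly. The pairings below are editorial; the capacity for sweetness alone is assumed equal to that for saltiness, which is what Miller's 3.8-bit estimate implies.]

```python
# (capacities of the single dimensions in bits, observed combined capacity in bits)
cases = {
    "point in a square (interval x interval)": ((3.25, 3.25), 4.6),
    "saltiness + sweetness":                   ((1.9, 1.9), 2.3),
    "pitch + loudness":                        ((2.5, 2.3), 3.1),
}

for name, (singles, observed) in cases.items():
    predicted = sum(singles)   # "perfect addition" of independent dimensions
    print(f"{name}: predicted {predicted:.2f} bits, observed {observed:.2f} bits")
```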
SUBITIZING

I cannot leave this general area without mentioning, however briefly, the experiments conducted at Mount Holyoke College on the discrimination of number (12). In experiments by Kaufman, Lord, Reese, and Volkmann random patterns of dots were flashed on a screen for 1/5 of a second. Anywhere from 1 to more than 200 dots could appear in the pattern. The subject's task was to report how many dots there were.

The first point to note is that on patterns containing up to five or six dots the subjects simply did not make errors. The performance on these small numbers of dots was so different from the performance with more dots that it was given a special name. Below seven the subjects were said to subitize; above seven they were said to estimate. This is, as you will recognize, what we once optimistically called "the span of attention."

This discontinuity at seven is, of course, suggestive. Is this the same basic process that limits our unidimensional judgments to about seven categories? The generalization is tempting, but not sound in my opinion. The data on number estimates have not been analyzed in informational terms; but on the basis of the published data I would guess that the subjects transmitted something more than four bits of information about the number of dots. Using the same arguments as before, we would conclude that there are about 20 or 30 distinguishable categories of numerousness. This is considerably more information than we would expect to get from a unidimensional display. It is, as a matter of fact, very much like a two-dimensional display. Although the dimensionality of the random dot patterns is not entirely clear, these results are in the same range as Klemmer and Frick's for their two-dimensional display of dots in a square. Perhaps the two dimensions of numerousness are area and density. When the subject can subitize, area and density may not be the significant variables, but when the subject must estimate perhaps they are significant. In any event, the comparison is not so simple as it might seem at first thought.

This is one of the ways in which the magical number seven has persecuted me. Here we have two closely related kinds of experiments, both of which point to the significance of the number seven as a limit on our capacities. And yet when we examine the matter more closely, there seems to be a reasonable suspicion that it is nothing more than a coincidence.

THE SPAN OF IMMEDIATE MEMORY

Let me summarize the situation in this way. There is a clear and definite limit to the accuracy with which we can identify absolutely the magnitude of a unidimensional stimulus variable. I would propose to call this limit the span of absolute judgment, and I maintain that for unidimensional judgments this span is usually somewhere in the neighborhood of seven.

We are not completely at the mercy of this limited span, however, because we have a variety of techniques for getting around it and increasing the accuracy of our judgments. The three most important of these devices are (a) to make relative rather than absolute judgments; or, if that is not possible, (b) to increase the number of dimensions along which the stimuli can differ; or (c) to arrange the task in such a way that we make a sequence of several absolute judgments in a row.

The study of relative judgments is one of the oldest topics in experimental psychology, and I will not pause to review it now. The second device, increasing the dimensionality, we have just considered. It seems that by adding
more dimensions and requiring crude, binary, yes-no judgments on each attribute we can extend the span of absolute judgment from seven to at least 150. Judging from our everyday behavior, the limit is probably in the thousands, if indeed there is a limit. In my opinion, we cannot go on compounding dimensions indefinitely. I suspect that there is also a span of perceptual dimensionality and that this span is somewhere in the neighborhood of ten, but I must add at once that there is no objective evidence to support this suspicion. This is a question sadly needing experimental exploration.

Concerning the third device, the use of successive judgments, I have quite a bit to say because this device introduces memory as the handmaiden of discrimination. And, since mnemonic processes are at least as complex as are perceptual processes, we can anticipate that their interactions will not be easily disentangled.

Suppose that we start by simply extending slightly the experimental procedure that we have been using. Up to this point we have presented a single stimulus and asked the observer to name it immediately thereafter. We can extend this procedure by requiring the observer to withhold his response until we have given him several stimuli in succession. At the end of the sequence of stimuli he then makes his response. We still have the same sort of input-output situation that is required for the measurement of transmitted information. But now we have passed from an experiment on absolute judgment to what is traditionally called an experiment on immediate memory.

Before we look at any data on this topic I feel I must give you a word of warning to help you avoid some obvious associations that can be confusing. Everybody knows that there is a finite span of immediate memory and that for
a lot of different kinds of test materials this span is about seven items in length. I have just shown you that there is a span of absolute judgment that can distinguish about seven categories and that there is a span of attention that will encompass about six objects at a glance. What is more natural than to think that all three of these spans are different aspects of a single underlying process? And that is a fundamental mistake, as I shall be at some pains to demonstrate. This mistake is one of the malicious persecutions that the magical number seven has subjected me to.

My mistake went something like this. We have seen that the invariant feature in the span of absolute judgment is the amount of information that the observer can transmit. There is a real operational similarity between the absolute judgment experiment and the immediate memory experiment. If immediate memory is like absolute judgment, then it should follow that the invariant feature in the span of immediate memory is also the amount of information that an observer can retain. If the amount of information in the span of immediate memory is a constant, then the span should be short when the individual items contain a lot of information and the span should be long when the items contain little information. For example, decimal digits are worth 3.3 bits apiece. We can recall about seven of them, for a total of 23 bits of information. Isolated English words are worth about 10 bits apiece. If the total amount of information is to remain constant at 23 bits, then we should be able to remember only two or three words chosen at random. In this way I generated a theory about how the span of immediate memory should vary as a function of the amount of information per item in the test materials.

The measurements of memory span in the literature are suggestive on this
question, but not definitive. And so it was necessary to do the experiment to see. Hayes (10) tried it out with five different kinds of test materials: binary digits, decimal digits, letters of the alphabet, letters plus decimal digits, and with 1,000 monosyllabic words. The lists were read aloud at the rate of one item per second and the subjects had as much time as they needed to give their responses. A procedure described by Woodworth (20) was used to score the responses.

The results are shown by the filled circles in Fig. 7. Here the dotted line indicates what the span should have been if the amount of information in the span were constant. The solid curves represent the data. Hayes repeated the experiment using test vocabularies of different sizes but all containing only English monosyllables (open circles in Fig. 7). This more homogeneous test material did not change the picture significantly. With binary items the span is about nine and, although it drops to about five with monosyllabic English words, the difference is far less than the hypothesis of constant information would require.

FIG. 7. Data from Hayes (10) on the span of immediate memory plotted as a function of the amount of information per item in the test materials.

FIG. 8. Data from Pollack (16) on the amount of information retained after one presentation plotted as a function of the amount of information per item in the test materials.
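[Editorial sketch, not part of Miller's text: the dotted "constant information" line of Fig. 7 can be reproduced from the 23-bit figure derived earlier (seven decimal digits at about 3.3 bits apiece). The vocabulary sizes below are editorial stand-ins for Hayes's materials.]

```python
import math

TOTAL_BITS = 7 * math.log2(10)   # about 23 bits: seven decimal digits

materials = {
    "binary digits":      2,
    "decimal digits":     10,
    "letters":            26,
    "monosyllabic words": 1000,
}

for name, vocabulary_size in materials.items():
    bits_per_item = math.log2(vocabulary_size)
    predicted_span = TOTAL_BITS / bits_per_item   # constant-information hypothesis
    print(f"{name}: {bits_per_item:.1f} bits/item, predicted span {predicted_span:.1f} items")

# Observed spans run only from about nine (binary digits) down to about five
# (monosyllabic words), far flatter than these predictions; the span appears
# to track chunks rather than bits.
```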
There is nothing wrong with Hayes's experiment, because Pollack (16) repeated it much more elaborately and got essentially the same result. Pollack took pains to measure the amount of information transmitted and did not rely on the traditional procedure for scoring the responses. His results are plotted in Fig. 8. Here it is clear that the amount of information transmitted is not a constant, but increases almost linearly as the amount of information per item in the input is increased.

And so the outcome is perfectly clear. In spite of the coincidence that the magical number seven appears in both places, the span of absolute judgment and the span of immediate memory are quite different kinds of limitations that are imposed on our ability to process information. Absolute judgment is limited by the amount of information. Immediate memory is limited by the number of items. In order to capture this distinction in somewhat picturesque terms, I have fallen into the custom of distinguishing between bits of information and chunks of information. Then I can say that the number of bits of information is constant for absolute judgment and the number of chunks of information
is constant for immediate memory. The span of immediate memory seems to be almost independent of the number of bits per chunk, at least over the range that has been examined to date.

The contrast of the terms bit and chunk also serves to highlight the fact that we are not very definite about what constitutes a chunk of information. For example, the memory span of five words that Hayes obtained when each word was drawn at random from a set of 1000 English monosyllables might just as appropriately have been called a memory span of 15 phonemes, since each word had about three phonemes in it. Intuitively, it is clear that the subjects were recalling five words, not 15 phonemes, but the logical distinction is not immediately apparent. We are dealing here with a process of organizing or grouping the input into familiar units or chunks, and a great deal of learning has gone into the formation of these familiar units.

RECODING

In order to speak more precisely, therefore, we must recognize the importance of grouping or organizing the input sequence into units or chunks. Since the memory span is a fixed number of chunks, we can increase the number of bits of information that it contains simply by building larger and larger chunks, each chunk containing more information than before.

A man just beginning to learn radiotelegraphic code hears each dit and dah as a separate chunk. Soon he is able to organize these sounds into letters and then he can deal with the letters as chunks. Then the letters organize themselves as words, which are still larger chunks, and he begins to hear whole phrases. I do not mean that each step is a discrete process, or that plateaus must appear in his learning curve, for surely the levels of organization are
achieved at different rates and overlap each other during the learning process. I am simply pointing to the obvious fact that the dits and dahs are organized by learning into patterns and that as these larger chunks emerge the amount of message that the operator can remember increases correspondingly. In the terms I am proposing to use, the operator learns to increase the bits per chunk.

In the jargon of communication theory, this process would be called recoding. The input is given in a code that contains many chunks with few bits per chunk. The operator recodes the input into another code that contains fewer chunks with more bits per chunk. There are many ways to do this recoding, but probably the simplest is to group the input events, apply a new name to the group, and then remember the new name rather than the original input events.

Since I am convinced that this process is a very general and important one for psychology, I want to tell you about a demonstration experiment that should make perfectly explicit what I am talking about. This experiment was conducted by Sidney Smith and was reported by him before the Eastern Psychological Association in 1954.

Begin with the observed fact that people can repeat back eight decimal digits, but only nine binary digits. Since there is a large discrepancy in the amount of information recalled in these two cases, we suspect at once that a recoding procedure could be used to increase the span of immediate memory for binary digits. In Table 1 a method for grouping and renaming is illustrated. Along the top is a sequence of 18 binary digits, far more than any subject was able to recall after a single presentation. In the next line these same binary digits are grouped by pairs. Four possible pairs can occur: 00 is renamed 0, 01 is renamed 1, 10 is renamed 2, and 11 is
TABLE 1
WAYS OF RECODING SEQUENCES OF BINARY DIGITS

Binary Digits (Bits)   1 0 1 0 0 0 1 0 0 1 1 1 0 0 1 1 1 0

2:1  Chunks     10    10    00    10    01    11    00    11    10
     Recoding    2     2     0     2     1     3     0     3     2

3:1  Chunks     101    000    100    111    001    110
     Recoding     5      0      4      7      1      6

4:1  Chunks     1010    0010    0111    0011    10
     Recoding     10       2       7       3

5:1  Chunks     10100    01001    11001    110
     Recoding      20        9       25

renamed 3. That is to say, we recode from a base-two arithmetic to a base-four arithmetic. In the recoded sequence there are now just nine digits to remember, and this is almost within the span of immediate memory. In the next line the same sequence of binary digits is regrouped into chunks of three. There are eight possible sequences of three, so we give each sequence a new name between 0 and 7. Now we have recoded from a sequence of 18 binary digits into a sequence of 6 octal digits, and this is well within the span of immediate memory. In the last two lines the binary digits are grouped by fours and by fives and are given decimal-digit names from 0 to 15 and from 0 to 31.

It is reasonably obvious that this kind of recoding increases the bits per chunk, and packages the binary sequence into a form that can be retained within the span of immediate memory. So Smith assembled 20 subjects and measured their spans for binary and octal digits. The spans were 9 for binaries and 7 for octals. Then he gave each recoding scheme to five of the subjects. They studied the recoding until they said they understood it—for about 5 or 10 minutes. Then he tested their span for binary digits again while they tried to use the recoding schemes they had studied.
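[Editorial sketch, not part of Miller's text: the grouping-and-renaming scheme of Table 1 is easy to mechanize. The fragment below regroups the same 18-bit sequence into chunks of 2, 3, 4, and 5 bits and renames each complete chunk with its numerical value.]

```python
def recode(bits, k):
    """Group a binary string into chunks of k bits and rename each complete
    chunk with its base-2**k value, as in Table 1 (a trailing partial chunk
    is left unnamed)."""
    chunks = [bits[i:i + k] for i in range(0, len(bits), k)]
    names = [int(c, 2) for c in chunks if len(c) == k]
    return chunks, names

sequence = "101000100111001110"   # the 18 binary digits of Table 1
for k in (2, 3, 4, 5):
    _, names = recode(sequence, k)
    print(f"{k}:1 recoding ->", names)

# 2:1 recoding -> [2, 2, 0, 2, 1, 3, 0, 3, 2]
# 3:1 recoding -> [5, 0, 4, 7, 1, 6]
# 4:1 recoding -> [10, 2, 7, 3]
# 5:1 recoding -> [20, 9, 25]
```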
The recoding schemes increased their span for binary digits in every case. But the increase was not as large as we had expected on the basis of their span for octal digits. Since the discrepancy increased as the recoding ratio increased, we reasoned that the few minutes the subjects had spent learning the recoding schemes had not been sufficient. Apparently the translation from one code to the other must be almost automatic or the subject will lose part of the next group while he is trying to remember the translation of the last group.

Since the 4:1 and 5:1 ratios require considerable study, Smith decided to imitate Ebbinghaus and do the experiment on himself. With Germanic patience he drilled himself on each recoding successively, and obtained the results shown in Fig. 9. Here the data follow along rather nicely with the results you would predict on the basis of his span for octal digits. He could remember 12 octal digits. With the 2:1 recoding, these 12 chunks were worth 24 binary digits. With the 3:1 recoding they were worth 36 binary digits. With the 4:1 and 5:1 recodings, they were worth about 40 binary digits.

It is a little dramatic to watch a person get 40 binary digits in a row and then repeat them back without error. However, if you think of this merely as
FIG. 9. The span of immediate memory for binary digits is plotted as a function of the recoding procedure used. The predicted function is obtained by multiplying the span for octals by 2, 3, and 3.3 for recoding into base 4, base 8, and base 10, respectively.
a mnemonic trick for extending the memory span, you will miss the more important point that is implicit in nearly all such mnemonic devices. The point is that recoding is an extremely powerful weapon for increasing the amount of information that we can deal with. In one form or another we use recoding constantly in our daily behavior.

In my opinion the most customary kind of recoding that we do all the time is to translate into a verbal code. When there is a story or an argument or an idea that we want to remember, we usually try to rephrase it "in our own words." When we witness some event we want to remember, we make a verbal description of the event and then remember our verbalization. Upon recall we recreate by secondary elaboration the details that seem consistent with the particular verbal recoding we happen to have made. The well-known experiment by Carmichael, Hogan, and Walter (3) on the influence that names have on the recall of visual figures is one demonstration of the process. The inaccuracy of the testimony of eyewitnesses is well known in legal psychology, but the distortions of testimony are not random—they follow naturally from the particular recoding that the witness used, and the particular recoding he used depends upon his whole life history. Our language is tremendously useful for repackaging material into a few chunks rich in information. I suspect that imagery is a form of recoding, too, but images seem much harder to get at operationally and to study experimentally than the more symbolic kinds of recoding.

It seems probable that even memorization can be studied in these terms. The process of memorizing may be simply the formation of chunks, or groups of items that go together, until there are few enough chunks so that we can recall all the items. The work by Bousfield and Cohen (2) on the occurrence of clustering in the recall of words is especially interesting in this respect.

SUMMARY
I have come to the end of the data that I wanted to present, so I would like now to make some summarizing remarks.

First, the span of absolute judgment and the span of immediate memory impose severe limitations on the amount of information that we are able to receive, process, and remember. By organizing the stimulus input simultaneously into several dimensions and successively into a sequence of chunks, we manage to break (or at least stretch) this informational bottleneck.

Second, the process of recoding is a very important one in human psychology and deserves much more explicit attention than it has received. In particular, the kind of linguistic recoding that people do seems to me to be the very lifeblood of the thought processes. Recoding procedures are a constant concern to clinicians, social psychologists,
linguists, and anthropologists and yet, probably because recoding is less accessible to experimental manipulation than nonsense syllables or T mazes, the traditional experimental psychologist has contributed little or nothing to their analysis. Nevertheless, experimental techniques can be used, methods of recoding can be specified, behavioral indicants can be found. And I anticipate that we will find a very orderly set of relations describing what now seems an uncharted wilderness of individual differences.

Third, the concepts and measures provided by the theory of information provide a quantitative way of getting at some of these questions. The theory provides us with a yardstick for calibrating our stimulus materials and for measuring the performance of our subjects. In the interests of communication I have suppressed the technical details of information measurement and have tried to express the ideas in more familiar terms; I hope this paraphrase will not lead you to think they are not useful in research. Informational concepts have already proved valuable in the study of discrimination and of language; they promise a great deal in the study of learning and memory; and it has even been proposed that they can be useful in the study of concept formation. A lot of questions that seemed fruitless twenty or thirty years ago may now be worth another look. In fact, I feel that my story here must stop just as it begins to get really interesting.

And finally, what about the magical number seven? What about the seven wonders of the world, the seven seas, the seven deadly sins, the seven daughters of Atlas in the Pleiades, the seven ages of man, the seven levels of hell, the seven primary colors, the seven notes of the musical scale, and the seven days of the week? What about the seven-point rating scale, the seven categories for absolute judgment, the seven objects in the span of attention, and the seven digits in the span of immediate memory? For the present I propose to withhold judgment. Perhaps there is something deep and profound behind all these sevens, something just calling out for us to discover it. But I suspect that it is only a pernicious, Pythagorean coincidence.
REFERENCES

1. BEEBE-CENTER, J. G., ROGERS, M. S., & O'CONNELL, D. N. Transmission of information about sucrose and saline solutions through the sense of taste. J. Psychol., 1955, 39, 157-160.
2. BOUSFIELD, W. A., & COHEN, B. H. The occurrence of clustering in the recall of randomly arranged words of different frequencies-of-usage. J. gen. Psychol., 1955, 52, 83-95.
3. CARMICHAEL, L., HOGAN, H. P., & WALTER, A. A. An experimental study of the effect of language on the reproduction of visually perceived form. J. exp. Psychol., 1932, 15, 73-86.
4. CHAPMAN, D. W. Relative effects of determinate and indeterminate Aufgaben. Amer. J. Psychol., 1932, 44, 163-174.
5. ERIKSEN, C. W. Multidimensional stimulus differences and accuracy of discrimination. USAF, WADC Tech. Rep., 1954, No. 54-165.
6. ERIKSEN, C. W., & HAKE, H. W. Absolute judgments as a function of the stimulus range and the number of stimulus and response categories. J. exp. Psychol., 1955, 49, 323-332.
7. GARNER, W. R. An informational analysis of absolute judgments of loudness. J. exp. Psychol., 1953, 46, 373-380.
8. HAKE, H. W., & GARNER, W. R. The effect of presenting various numbers of discrete steps on scale reading accuracy. J. exp. Psychol., 1951, 42, 358-366.
9. HALSEY, R. M., & CHAPANIS, A. Chromaticity-confusion contours in a complex viewing situation. J. Opt. Soc. Amer., 1954, 44, 442-454.
10. HAYES, J. R. M. Memory span for several vocabularies as a function of vocabulary size. In Quarterly Progress Report. Cambridge, Mass.: Acoustics Laboratory, Massachusetts Institute of Technology, Jan.-June, 1952.
11. JAKOBSON, R., FANT, C. G. M., & HALLE, M. Preliminaries to speech analysis. Cambridge, Mass.: Acoustics Laboratory, Massachusetts Institute of Technology, 1952. (Tech. Rep. No. 13.)
12. KAUFMAN, E. L., LORD, M. W., REESE, T. W., & VOLKMANN, J. The discrimination of visual number. Amer. J. Psychol., 1949, 62, 498-525.
13. KLEMMER, E. T., & FRICK, F. C. Assimilation of information from dot and matrix patterns. J. exp. Psychol., 1953, 45, 15-19.
14. KÜLPE, O. Versuche über Abstraktion. Ber. ü. d. I. Kongr. f. exper. Psychol., 1904, 56-63.
15. MILLER, G. A., & NICELY, P. E. An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Amer., 1955, 27, 338-352.
16. POLLACK, I. The assimilation of sequentially encoded information. Amer. J. Psychol., 1953, 66, 421-435.
17. POLLACK, I. The information of elementary auditory displays. J. Acoust. Soc. Amer., 1952, 24, 745-749.
18. POLLACK, I. The information of elementary auditory displays. II. J. Acoust. Soc. Amer., 1953, 25, 765-769.
19. POLLACK, I., & FICKS, L. Information of elementary multi-dimensional auditory displays. J. Acoust. Soc. Amer., 1954, 26, 155-158.
20. WOODWORTH, R. S. Experimental psychology. New York: Holt, 1938.

(Received May 4, 1955)
Reprinted from Perception, Vol. 1, pp. 371-394, 1972 © 1972 Pion Ltd.
Single units and sensation: A neuron doctrine for perceptual psychology?

H B Barlow
Department of Physiology-Anatomy, University of California, Berkeley, California 94720
Received 6 December 1972
Abstract. The problem discussed is the relationship between the firing of single neurons in sensory pathways and subjectively experienced sensations. The conclusions are formulated as the following five dogmas: 1. To understand nervous function one needs to look at interactions at a cellular level, rather than either a more macroscopic or microscopic level, because behaviour depends upon the organized pattern of these intercellular interactions. 2. The sensory system is organized to achieve as complete a representation of the sensory stimulus as possible with the minimum number of active neurons. 3. Trigger features of sensory neurons are matched to redundant patterns of stimulation by experience as well as by developmental processes. 4. Perception corresponds to the activity of a small selection from the very numerous high-level neurons, each of which corresponds to a pattern of external events of the order of complexity of the events symbolized by a word. 5. High impulse frequency in such neurons corresponds to high certainty that the trigger feature is present. The development of the concepts leading up to these speculative dogmas, their experimental basis, and some of their limitations are discussed.

1 Introduction
In this article I shall discuss the difficult but challenging problem of the relation between our subjective perceptions and the activity of the nerve cells in our brains. Results obtained by recording from single neurons in sensory pathways have aroused a lot of interest and obviously tell us something important about how we sense the world around us; but what exactly have we been told? In order to probe this question, ideas that fit current knowledge as well as possible must be formulated, and they must be stated clearly enough to be tested to see if they are right or wrong; this is what I have tried to do. The central proposition is that our perceptions are caused by the activity of a rather small number of neurons selected from a very large population of predominantly silent cells. The activity of each single cell is thus an important perceptual event and it is thought to be related quite simply to our subjective experience. The subtlety and sensitivity of perception results from the mechanisms determining when a single cell becomes active, rather than from complex combinatorial rules of usage of nerve cells. In order to avoid vagueness, I have formulated this notion in five definite propositions, or dogmas, and the reader who wishes to see the trend of this article can glance ahead (to page 380). Some of the dogmas will be readily accepted by most people who hope to find a scientific basis for human thought processes, but I felt they required statement and discussion in spite of their widespread tacit acceptance. Others are more original, will be challenged by many, and have the nature of extrapolations from the current trend of results rather than conclusions reasonably based upon them. Before these dogmas are stated the developments that have led to them will be briefly reviewed. The literature is extensive, and much of it will have been incorporated into the reader's common knowledge. My aim, therefore, is to pick out the conceptual turning points in order to show the direction we are headed.
After stating the dogmas, criticisms and alternatives will be discussed in an attempt both to justify them and to clarify them further.

2 Recording from single neurons
2.1 Peripheral nerves
In the twenties and thirties methods were developed for amplifying and recording the weak transient electrical potentials associated with the activity of nerve fibres, and Adrian and his colleagues used these methods to record the all-or-none impulses of single nerve fibres connecting the sense organs to the brain (Adrian, 1926a, 1926b; Adrian and Zotterman, 1926a, 1926b; Adrian, 1928). They showed, for example, that each fibre coming from the skin responded to a particular type of stimulus, such as pressure, temperature change, or damage, applied to a specific region or receptive field. The frequency of the impulses depended upon the intensity of the stimulus, but it was clear that the character of the sensation (touch, heat, or pain) depended upon the fibre carrying the message, not the nature of the message, since this consisted of trains of similar impulses in all fibres. Nerves had long been recognized as the link between physical stimulus and sensation, so these results provided physiological flesh and blood to the skeleton that anatomical studies had revealed a long time earlier. Most of the results confirmed another ancient idea, namely Müller's doctrine of specific nerve energies: the specificity of different sensations stems from the responsiveness of different nerve fibres to different types of stimulus. The chemical senses proved to be a little different (Pfaffman, 1941, 1955; Ganchrow and Erickson, 1970), but in spite of the fact that they did not quite fall in line, the concept that resulted from two decades of recording from peripheral fibres and following their connections in the brain was of a simple mapping from sense organs to sensorium, so that a copy of physical events at the body surface was presented to the brain (Bard, 1938; Marshall et al., 1941; Adrian, 1941, 1947). Some modification was recognized to occur, for sensory nerves usually adapt to a constant stimulus, and therefore signal sudden changes of stimulus energy better than sustained levels. Neighbouring receptive fields and modalities were also known to overlap, but when the activity of neurons at higher levels in sensory pathways was recorded it became obvious that something was happening more complex and significant than could be fitted into the concept of simple mapping with overlap and adaptation.
2.2 Sensory neurons of the retina
Starting with Granit (Granit and Svaetichin, 1939; Granit, 1947) and Hartline (1938; 1940a, 1940b) in the retina, and Galambos and Davis (Galambos and Davis, 1943; Galambos, 1944; Galambos and Davis, 1948) at the periphery of the pathway for hearing, a generation of physiologists has studied sensory neurons in the central nervous system; all this obviously cannot be reviewed here, but we shall concentrate on the results that expanded the conceptual frame built on the earlier work. Previously it was possible for physiologists to be satisfied with describing how the sense organs and their nerves present a picture of the external world to the brain, and they were happy to leave it to the psychologists to discuss what happened next; but these next things started to happen around the physiologist's micro-electrodes, and he has to join the discussion. The realization that physiological experiments can answer questions of psychological interest first dawned on me personally when I was working on the frog's retina. A vigorous discharge can be evoked from retinal ganglion cells by stimulating the appropriate region of the retina, the ganglion cell's 'receptive field' (Hartline, 1940a); but if the surrounding region is simultaneously stimulated the response of the cell is diminished or completely abolished (Barlow, 1953). This phenomenon is
called lateral inhibition, or peripheral suppression, and such a physiological mechanism had already been postulated in order to account for simultaneous brightness contrast and Mach bands (Mach, 1886; Fry, 1948). Thus the physiological experiment was really providing evidence in support of a psychological hypothesis. The invasion of psychological territory did not stop at this point. If one explores the responsiveness of single ganglion cells in the frog's retina using hand-held targets, one finds that one particular type of ganglion cell is most effectively driven by something like a black disc subtending a degree or so moved rapidly to and fro within the unit's receptive field. This causes a vigorous discharge which can be maintained without much decrement as long as the movement is continued. Now, if the stimulus which is optimal for this class of cells is presented to intact frogs, the behavioural response is often dramatic: they turn towards the target and make repeated feeding responses consisting of a jump and snap. The selectivity of the retinal neurons, and the frog's reaction when they are selectively stimulated, suggest that they are 'bug detectors' (Barlow, 1953) performing a primitive but vitally important form of recognition. This result makes one suddenly realize that a large part of the sensory machinery involved in a frog's feeding responses may actually reside in the retina rather than in mysterious 'centres' that would be too difficult to understand by physiological methods. The essential lock-like property resides in each member of a whole class of neurons, and allows the cell to discharge only to the appropriate key pattern of sensory stimulation. Lettvin et al. (1959) suggested that there were five different classes of cell in the frog, and Levick, Hill and I (Barlow et al., 1964) found an even larger number of categories in the rabbit. We called these key patterns 'trigger features', and Maturana et al. (1960) emphasized another important aspect of the behaviour of these ganglion cells: a cell continues to respond to the same trigger feature in spite of changes in light intensity over many decades. The properties of the retina are such that a ganglion cell can, figuratively speaking, reach out and determine that something specific is happening in front of the eye. Light is the agent by which it does this, but it is the detailed pattern of the light that carries the information, and the overall level of illumination prevailing at the time is almost totally disregarded. It is true that Ingle (1968, 1971), Grüsser and Grüsser-Cornehls (1968), and Ewert (1970) have shown that it is too simple to suppose that feeding automatically and inevitably follows the activation of a certain class of retinal ganglion cells by their trigger features; higher coordinating mechanisms are also involved. Just as light is only an intermediate agent allowing a retinal ganglion cell to detect its trigger feature, so these optic nerve impulses must doubtless be regarded as intermediate agents enabling the higher centres to perform their tasks. We shall proceed to discuss these problems, but we have gained two important concepts from the frog's retina: it transmits a map, not of the light intensities at each point of the image, but of the trigger features in the world before the eye, and its main function is not to transduce different luminance levels into different impulse frequencies, but to continue responding invariantly to the same external patterns despite changes of average luminance.
2.3 Sensory neurons of the cerebral cortex
The function of the visual area of the mammalian cerebral cortex is obviously more relevant to the problem of our own subjective perceptions than is the frog's retina, and Hubel and Wiesel (1959) early discovered examples of selectivity for pattern in the responsiveness of cells in the visual cortex of cats. They found that a light or dark line, or a dark-light border, was required to evoke a vigorous response even in the simplest first-order cells. Furthermore the stimulus had to be at a rather precise
orientation and position in the visual field and in addition it usually had to be moving, often in a specific direction. Hubel and Wiesel (1962) also made a distinction between these cells and other classes with more elaborate stimulus requirements, which they believed corresponded to cells at later stages of information processing. They called these 'complex' and 'hypercomplex' units, and showed that they had properties suggesting that the input to each was from the simpler category of cells. The fascination of this analysis depends to a large extent upon successfully following the way units become selective for more and more complex properties at each stage. Some doubts have been cast on their hierarchical scheme (Stone, 1972), but it certainly gave new insight into how higher levels of categorization are developed from lower levels. As well as the hierarchical concept, this work provided evidence for a new type of invariance. In the cat, as in the frog, the retina is mainly responsible for ensuring that the message sent to the brain is not much perturbed by changes in ambient illumination. In the cortex Hubel and Wiesel (1962) found that some of the higher level neurons responded to the same trigger feature over a considerable range of positions. The modality specificity of peripheral neurons indicates how one can, for instance, detect warmth at any point on the body surface, and we now see that the organized pattern specificity of a set of cortical neurons can in the same way produce positional invariance for pattern perception. This was previously one of the great puzzles, and, although we certainly do not understand how recognition is invariant for position, size, and perspective transformations, at least a start has been made. Later experiments have shown that the primary neurons of the visual cortex are more specific in one respect than Hubel and Wiesel originally thought. They showed that most neurons are fed by inputs from both eyes, and they emphasized that the dominance of ipsi- or contra-lateral eye varied from cell to cell. Now it can be shown that a binocular stimulus often has to be very precisely positioned in both eyes in order to evoke the most vigorous response (Barlow et al., 1967; Pettigrew et al., 1968), and a more important variable than dominance emerges from the exact relative positions in the two eyes. Consider what must happen when the eyes are converged on some point in front of the cat and appropriate visual stimuli are presented; it is easy to position this stimulus correctly for either eye by itself, but, if it is to be correctly positioned for both, it will have to be at some specific distance from the cat. When the precise positioning for different units is studied, it is found that this specific distance for optimal response varies in different units in the same cortex, and among units serving the same region of visual space. Conversely, the selection of units which are activated provides the cat with some information about the distances of the various stimulus objects. In uncovering this aspect of the pattern selectivity of sensory neurons we again get the sense that a central neuron is reaching out to discover something important about what is happening in the real objective world. One even wonders if the line and edge detectors of Hubel and Wiesel may not have,
as their main function, the linking together of information about the same object in the two retinal images in order to determine the object's most important coordinate: its distance from the animal. At all events, as in the case of the frog's bug-detector, the importance of the information abstracted from the retinal images gives some insight into the purpose or direction of the physiological mechanisms. Something is known about these first steps of information processing in the visual cortex; what about the later stages? Results suggesting greater and greater specificity of response requirements have been obtained, and a nice example is the unit described by Gross et al. (1972) in the infero-temporal cortex of macaques; this responded best to stimulation by a figure with many of the specific characteristics of a monkey's hand, and the requirements of one such unit are well documented.
Work in this area is not easy to repeat, for one can readily see that it is largely a matter of chance to find a trigger feature of this order of complexity. Also, the possibility that cells may retain to adulthood the modifiable properties of immature cells that will be described later makes the prospect of investigating the sensory association areas an intimidating one. Cortical neurons receive selective excitatory and inhibitory inputs from other neurons and thereby possess selective responsiveness for some characteristics and invariances for changes in other characteristics. This seems to have the potentiality of being a powerful information processing system.

3 Single units and psychophysics
The neurophysiological discoveries outlined above of course made a deep impression on those investigating sensation psychophysically, but although there are many superficial points of contact it has not proved easy to link sensations securely to specific patterns of neurophysiological activity. The topics I have chosen are again ones which seem to have implications about how we conceptualize this neuropsychic relationship.
3.1 Lateral inhibition and simultaneous contrast
The relation between lateral inhibition in the retina and simultaneous contrast has already been mentioned, but there is a large gap between the physiological level and the subjective effects shown in textbook illustrations, and it is too big to be bridged by a single simple statement. It is quite easy to show that frog and cat retinal ganglion cells demonstrate relevant effects, since their antagonistic surrounds (Barlow, 1953; Kuffler, 1953) make their responses depend upon contrast rather than absolute luminance. Hence on-centre cells respond to spots we would call white, off-centre cells to spots we would call black, even when the so-called black spot has a higher luminance than the white spot (Barlow et al., 1957). But subjective contrast effects also hold for conditions where one cannot make such easy comparisons, for instance at the centre of an area which is much too large to fill the centre of a retinal receptive field. Of course, one can postulate some 'filling in' process (Yarbus, 1965), but the necessity of introducing ad hoc assumptions makes many explanations of subjective effects in terms of single units unconvincing. The concept that enables one to escape this difficulty is to concentrate on the informational flow rather than on the direct subjective-physiological comparison. Information discarded in a peripheral stage of processing cannot be accurately added back centrally, and in the present case it helps to talk about 'attenuating low spatial frequencies' instead of 'signalling spatial contrast'. To say that some of the low-frequency attenuation of the whole visual system is performed by the opposed centre-surround organisation of the retinal ganglion cell (Campbell and Green, 1965; Enroth-Cugell and Robson, 1966) is more accurate than to say that all simultaneous contrast effects originate there.
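A minimal numerical sketch (an editorial illustration, not part of the original article) may make the 'attenuating low spatial frequencies' rephrasing concrete: it models the opposed centre-surround organisation as a difference of two Gaussians and evaluates its amplitude response at a few spatial frequencies. The space constants and surround weight are purely illustrative.

import math

def dog_response(freq, sigma_c=0.05, sigma_s=0.25, surround_weight=0.9):
    """Amplitude response of a centre-minus-surround (difference-of-Gaussians)
    receptive field at spatial frequency `freq` (cycles per degree).
    The Fourier transform of a unit-area Gaussian of width sigma is
    exp(-2 * pi**2 * sigma**2 * freq**2), so subtracting a broader surround
    from a narrow centre removes the response to uniform illumination.
    All parameter values are illustrative, not measured."""
    centre = math.exp(-2 * math.pi**2 * sigma_c**2 * freq**2)
    surround = surround_weight * math.exp(-2 * math.pi**2 * sigma_s**2 * freq**2)
    return abs(centre - surround)

for freq in (0.0, 0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"{freq:4.1f} cycles/deg -> relative response {dog_response(freq):.3f}")
# The response is small at zero frequency (uniform light), peaks at
# intermediate frequencies, and falls again at frequencies finer than the
# centre: the band-pass behaviour described in the text.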
3.2 Colour
In the field of colour vision De Valois has looked for relationships between various psychophysically measurable aspects of colour and the properties of single unit responses recorded at the level of the lateral geniculate nucleus. The main results provided a startling confirmation of Hering's long-standing hypothesis about the reciprocal organisation of colour systems (Svaetichin and MacNichol, 1958; De Valois, 1960; Hurvich and Jameson, 1960; Wagner et al., 1960), but the details are important. He has been able to establish neuro-psychic parallels using what may be called the 'lower envelope' or 'most sensitive neuron' principle. A monkey's ability to discriminate hue and saturation (De Valois et al., 1966, 1967) is very close
to what one would expect if the monkey only pays attention to the most sensitive of the optic nerve fibres conveying information about these qualities of the stimulus. Thus the psychophysical performance follows the lower envelope of the performance of individual fibres. It is particularly interesting to see that a continuous psychophysical function, hue discrimination as a function of wavelength, is served by a different type of neuron in different ranges; over the long wavelength range the red-green opponent system was much more sensitive to wavelength shift, whereas the blue-yellow system was more sensitive at short wavelengths. This result again fits in with the concept that neurophysiology and sensation are best linked by looking at the flow of information rather than simpler measures of neuronal activity. For instance it might be suggested that sensation follows the average neural activity, and it would be easy to justify this on the neurophysiological grounds that post-synaptic potentials are usually additive. However, this oversimple suggestion is proved false by the fact that psychophysical hue discrimination does not follow the average response of the red-green and blue-yellow systems, but instead follows the lower envelope. Now when two noisy channels are both conveying information about a signal, the channel with the highest signal/noise ratio dominates the situation; the low signal/noise ratio channel can be used to improve performance slightly, but it is a very small contribution except where its signal/noise ratio is nearly as high as that of the more sensitive channel. Thus the 'most-sensitive neuron' principle again fits the concept that, to link neurophysiological activity and sensation, one should look at the flow of information.
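The dominance of the better channel can be made concrete with a short sketch (an editorial illustration, not part of the original article). For two independent channels carrying the same signal in additive Gaussian noise, the best inverse-variance weighted combination has a squared signal/noise ratio equal to the sum of the two squared ratios; the figures used are purely illustrative.

import math

def combined_snr(snr_a, snr_b):
    """Optimal (inverse-variance weighted) combination of two independent
    channels carrying the same signal in additive Gaussian noise: the
    squared signal/noise ratios add."""
    return math.sqrt(snr_a**2 + snr_b**2)

sensitive = 10.0
for weak in (1.0, 3.0, 5.0, 9.0, 10.0):
    total = combined_snr(sensitive, weak)
    gain = 100.0 * (total / sensitive - 1.0)
    print(f"weak-channel SNR {weak:4.1f} -> combined SNR {total:5.2f} "
          f"(improvement over the sensitive channel alone: {gain:4.1f}%)")
# A channel with one tenth of the signal/noise ratio adds only about half a
# per cent, so performance effectively follows the 'lower envelope' set by
# the most sensitive channel.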
3.3 Touch
Another example is given by the work of Mountcastle and his colleagues (Talbot et al., 1968), in which they studied the responses at a number of levels to vibratory stimuli applied to the glabrous skin of the hand. First they recorded from cutaneous afferents in the monkey, then the cortical responses in the same species, finally they made psychophysical measures of sensory responses in humans to the same stimuli. As with the work on colour, they established that the sensory response depends simply upon the category of nerve fibre with the lowest threshold. The fact that the subjective sensation in both the colour and touch systems seems to follow the lower envelope of the responses of the various types of sensory neuron may give an important clue to the way in which these neurons represent sensations. It is as if the screen on which sensations appear is completely blank until a sensory pathway is activated, but when this happens a point lights up and becomes instantly visible. This is not what one would expect if there were a lot of ongoing activity in all pathways, or if the magnitude of the signal was a linear function of intensity, nor is it what one would expect if sensation depended in a complex combinatorial way upon the activity of many units. Rather it suggests the concept that the magnitude of the signal directly represents the signal/noise ratio, for then the insignificant signals will automatically be small, and the neuron firing most will automatically be the most sensitive. This concept receives some support in the next section and is taken up in the fifth dogma and its discussion.
3.4 Adaptation after-effects
The fact that one is almost unaware of the constant pressure applied to the skin by the chair one is sitting on presumably results, at least in part, from the rapid decline in frequency of the volley of sensory impulses initiated by contact (Adrian, 1928). Central neurons that respond to specific patterns of sensory input also give a decreased response when the pattern is sustained or repeatedly presented, though there have actually been surprisingly few investigations of this effect. These
adaptation, habituation, or fatigue effects lead to plausible explanations for many well-known sensory illusions. For example the rate of discharge in the directionally selective neurons of the rabbit retina declines if a stimulus is continuously moved through the receptive field in the preferred direction, and following cessation of movement the maintained discharge is found to be suppressed (Barlow and Hill, 1963). The resulting imbalance between neurons signalling opposite directions seems to provide a ready explanation of the apparent reversed movement of stationary objects following prolonged inspection of moving objects (the so-called 'waterfall effect'), and provides another example of an ancient psychophysical hypothesis (Wohlgemuth, 1911) being confirmed neurophysiologically. One must bear in mind that these neural effects were described in the rabbit's retina, whereas in the human, as in the cat and monkey, neurons are probably not directionally selective until the level of the visual cortex (Barlow and Brindley, 1963), but the same type of explanation may well apply to neurons at this level. It has been suggested that one can make inverse inferences from the existence of an after-effect to the presence of neurons with particular selective responses. This is no place to argue whether the after-effects of adaptation to gratings imply a Fourier-type analysis (Blakemore and Campbell, 1969), or whether they can be satisfactorily accounted for by families of different-sized neurons with conventional Hubel-Wiesel-type receptive fields, but there is certainly room for argument, and this makes selective adaptation a difficult tool to use to discover later stages of information processing. Instead, I think the importance of sensory adaptational effects, and of the corresponding neurophysiological phenomena, lies in the support both these phenomena lend to the concept put forward at the end of the last section. If sensory messages are to be given a prominence proportional to their informational value, mechanisms must exist for reducing the magnitude of representation of patterns which are constantly present, and this is presumably the underlying rationale for adaptive effects.

3.5 Noisiness or reliability of single units
It used to be commonly held that nerve cells were unreliable elements, much perturbed by metabolic or other changes and perhaps also by random disturbances of more fundamental origin (McCulloch, 1959; Burns, 1968). The fairly high degree of reliability that the nervous system achieves as a whole was explained by the supposed redundancy of neural circuits and appropriate rules for averaging and combining them. Developments in the study of human vision at the absolute threshold and of the absolute sensitivity of retinal ganglion cells in the cat now indicate that nerve cells are not intrinsically unreliable and that noise often originates externally. Signal detection theory has familiarized psychologists with the problem of detecting signals in the presence of noise (Tanner and Swets, 1954; Green and Swets, 1966), and I think the assumed prevalence of internally generated noise was a major reason why this was thought to be an important new approach. But psychophysical studies have actually shown that the senses and the brain can operate with astonishing intrinsic reliability. Noise may always be present, but to an amazing extent it originates outside the nervous system. This was originally implied by the results of Hecht et al.
(1942) on the absolute threshold of vision; they showed that about 100 quanta at the cornea, leading to 10 or less absorptions in the retina, were sufficient to give a sensation of light. But their most revolutionary finding was that the frequency-of-seeing curve, describing the breadth of the threshold zone, is mainly accounted for by quantum fluctuations, not internal sloppiness or random variations of the threshold criterion as had previously been thought. That is not to say that
'Intrinsic retinal noise' or 'dark light' is non-existent or unimportant, for it is probably the main factor determining how many quanta are required for reliable detection (Barlow, 1956). It now appears probable that this originates in the photoreceptors and, in some subjects at least, is low enough to allow the conscious detection of the sensation caused by absorption of a single quantum (Sakitt, 1972); similar sensations occur in the absence of light stimuli, but at a lower frequency. In addition, the subjects can apparently discriminate between the sensory messages resulting from 2, 3, 4, etc. quantal absorptions, each being detected progressively more clearly and reliably. This psychophysical work shows that the human brain, acting as a whole, can distinguish between the disturbances caused by small numbers of quantal absorptions. These must of course originate from single molecular events in single cells, but possibly the disturbance is thereafter diffused through many cells and abstracted in some way from a redundant neural representation. It therefore becomes very interesting to go into the neurophysiology and find how the absorption of a few quanta is signalled. A sensitive example of a retinal ganglion cell of the cat, with its associated bipolar cells, receptors, amacrine and horizontal cells, will give a readily detectable discharge of impulses to as few as 2 or 3 quanta of light absorbed in the retina (Barlow et al., 1971). Such a stimulus will give rise to an average of 5 to 10 extra impulses. Thus a single quantal absorption causes as many as 3 extra impulses, two quanta cause about 6 impulses, and so on. The addition of 3 impulses to the maintained discharge is detectable on average, though, like the absorption of a single quantum in the human, it cannot be reliably detected on a single trial. There is of course some intrinsic noise, as shown by the maintained discharge, but its level is extraordinarily low when one considers that a single ganglion cell is connected to more than 100 rods containing a total of some 10¹⁰ molecules of rhodopsin, each poised ready to signal the absorption of a quantum. The important point is that quantitative knowledge of the noise level and reliability of single retinal ganglion cells enables one to see that the performance of the whole visual system can be attributed to a single cell: averaging is not necessary.
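The quantum-fluctuation account of the frequency-of-seeing curve discussed above can be sketched in a few lines (an editorial illustration, not part of the original article). If a flash yields on average m absorbed quanta and seeing requires at least k absorptions, the probability of seeing is the upper tail of a Poisson distribution, and the breadth of the curve is fixed by k rather than by a variable internal threshold. The value k = 6 and the flash strengths are illustrative only.

import math

def prob_seeing(mean_absorbed, threshold_quanta=6):
    """Probability that a flash is seen when quantal absorptions are Poisson
    with the given mean and at least `threshold_quanta` absorptions are
    needed; the threshold of 6 is an assumed, illustrative value."""
    p_below = sum(math.exp(-mean_absorbed) * mean_absorbed**n / math.factorial(n)
                  for n in range(threshold_quanta))
    return 1.0 - p_below

for mean_absorbed in (2, 4, 6, 8, 10, 12):
    print(f"mean absorptions {mean_absorbed:2d} -> "
          f"probability of seeing {prob_seeing(mean_absorbed):.2f}")
# The width of the transition from 'rarely seen' to 'almost always seen' is
# set by Poisson fluctuations in the number of absorbed quanta, not by a
# sloppy internal threshold.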
Individual nerve cells were formerly thought to be unreliable, idiosyncratic, and incapable of performing complex tasks without acting in concert and thus overcoming their individual errors. This was quite wrong, and we now realise their apparently erratic behaviour was caused by our ignorance, not the neuron's incompetence. Thus we gain support from this neuropsychical comparison for the concept of a neuron as a reliable element capable of performing a responsible role in our mental life, though we need not of course go to the other extreme and assume that mental errors are never caused by malfunctioning, ill-educated, or noisy neurons.

4 Modifiability of cortical neurons
The most recent conceptual change about the neural basis of our sensations has arisen from a reinspection of the origin of the selective responsiveness of cortical neurons.

4.1 Evidence for modifiability
Hubel and Wiesel (1963) at first thought they had shown that the whole of the elaborate organization responsible for the selectivity of neurons in the primary visual cortex was developed solely under genetic control. They reported that they found cortical neurons with normal adult-type specificity of responsiveness in young kittens which had not opened their eyes, or which had been deprived of visual experience by suture of their eyelids. In later investigations (Wiesel and Hubel, 1963, 1965; Hubel and Wiesel, 1965) they found that abnormal visual experience, such as unilateral
eye-suture, or prevention of simultaneous usage of the eyes by alternating occlusion or surgically induced strabismus, caused the development of an abnormal population of cortical cells. In accordance with their earlier findings they attributed this to a disruption of the preformed organization, and they discovered the very important fact that abnormal experience only modifies the cortex if it occurs during a particular 'sensitive' period of about 3 to 12 weeks in cats (Hubel and Wiesel, 1970). Recent developments have extended these seminal findings, but they lead to somewhat different conclusions about the relative importance of experience and genetic factors in determining the selectivity of cortical neurons. First it was shown that kittens brought up with the two eyes exposed to different stimuli, one to vertical stripes, the other to horizontal, had a corresponding orientation selectivity of the receptive fields connected to each eye (Hirsch and Spinelli, 1970, 1971). This was confirmed in kittens exposed only to vertically or horizontally striped environments; these had no neurons sensitive to horizontally or vertically oriented stimuli respectively (Blakemore and Cooper, 1970). Evidence has been obtained that cats raised with a vertical displacement of the images in one eye induced by prisms also have abnormal vertical disparities of the pairs of receptive fields of cortical neurons connected to both eyes (Shlaer, 1971). Again, the cortex of a kitten exposed only to bright dots, with no contours or edges, contained units of an abnormal type responding well to small spots of light and showing little of the customary preference for lines (Pettigrew and Freeman, forthcoming). Furthermore it appears that a very brief period of exposure, as little as an hour, can have very pronounced effects on the subsequent selectivity of neurons in the visual cortex (Blakemore and Mitchell, 1973). Such results could still possibly have been explained by disruption of the innately-determined highly-specific connections that were originally thought to underlie response specificity, but a reexamination of the properties of cortical neurons of kittens with no visual experience shows that they do not actually have fully-developed adult-type specificity (Barlow and Pettigrew, 1971). This is certainly the case with regard to disparity selectivity and, although there is directional preference and may be some weak orientation selectivity, they are not as narrowly selective as adult cells (Pettigrew, forthcoming). The anatomy of the developing cortex shows that only a small fraction of the normal complement of synapses is present before the critical period, and it is hard to believe that the cells could have adult properties (Cragg, 1972). It will take more work to determine the limits within which the pattern selectivity of cortical neurons can be modified, but the results already make it impossible to believe Hubel and Wiesel's original claim that many cells of the visually inexperienced kitten have the full adult-type selectivity.

4.2 Type of modification caused
It is instructive to look at the way in which experience modifies selectivity. In all cases the cortex of animals whose visual experience has been modified lacks neurons selectively responsive to patterns of excitation which a normal animal receives, but which have been excluded by the experimental modification.
Thus unilateral lid suture led to a cortex with very few neurons excitable from the lid-sutured eye; likewise, alternating occlusion or strabismus, which decreases the probability of simultaneous excitation of corresponding neurons in the two eyes, decreased the proportion of neurons responding to both eyes. The same is true of the kittens reared in striped environments, or with a vertically deviating prism over one eye, or in an environment with point sources but no lines; in all these cases the rule holds that neurons are found for patterns of excitation that occur in the modified environment, but normally occurring types of selectivity are rare or absent if the patterns they would respond to have not been experienced in the modified environment.
This rule seems to amount to a striking confirmation of the speculation (Barlow, 1960) that a prime function of sensory centres is to code efficiently the patterns of excitation that occur, thus developing a less redundant representation of the environment. Previous examples of redundancy-reducing codes could be explained as genetically determined features of neural connectivity, but the above discoveries are definite examples of a modified code developed in response to a modified environment. If on this page we have begun the correct story for simple cells of area 17, one can see that a book has been opened with regard to the properties of cells higher in the hierarchy, which are presumably themselves experience dependent and are fed by information from these experience-dependent neurons at the lower cortical levels. Even a small degree of modifiability would be extraordinarily significant in a hierarchically organized system, just as, in evolution, weak selection pressure is effective over many generations.

5 Current concept of the single neuron
The cumulative effect of all the changes I have tried to outline above has been to make us realise that each single neuron can perform a much more complex and subtle task than had previously been thought. Neurons do not loosely and unreliably map the luminous intensities of the visual image onto our sensorium, but instead they detect pattern elements, discriminate the depth of objects, ignore irrelevant causes of variation, and are arranged in an intriguing hierarchy. Furthermore, there is evidence that they give prominence to what is informationally important, can respond with great reliability, and can have their pattern selectivity permanently modified by early visual experience. This amounts to a revolution in our outlook. It is now quite inappropriate to regard unit activity as a noisy indication of more basic and reliable processes involved in mental operations; instead, we must regard single neurons as the prime movers of these mechanisms. Thinking is brought about by neurons, and we should not use phrases like 'unit activity reflects, reveals, or monitors thought processes', because the activities of neurons, quite simply, are thought processes. This revolution stemmed from physiological work and makes us realize that the activity of each single neuron may play a significant role in perception. I think that more clearly stated hypotheses are now needed about these roles in order to allow our psychological knowledge and intuitions about our perceptions to help us plan future experiments.

6 Five propositions
The following five brief statements are intended to define which aspect of the brain's activity is important for understanding its main function, to suggest the way that single neurons represent what is going on around us, and to say how this is related to our subjective experience. The statements are dogmatic and incautious because it is important that they should be clear and testable.
6.1 First dogma
A description of that activity of a single nerve cell which is transmitted to and influences other nerve cells, and of a nerve cell's response to such influences from other cells, is a complete enough description for functional understanding of the nervous system. There is nothing else 'looking at' or controlling this activity, which must therefore provide a basis for understanding how the brain controls behaviour.
6.2 Second dogma
At progressively higher levels in sensory pathways information about the physical stimulus is carried by progressively fewer active neurons. The sensory system is
organized to achieve as complete a representation as possible with the minimum number of active neurons.
6.3 Third dogma
Trigger features of neurons are matched to the redundant features of sensory stimulation in order to achieve greater completeness and economy of representation. This selective responsiveness is determined by the sensory stimulation to which neurons have been exposed, as well as by genetic factors operating during development.
6.4 Fourth dogma
Just as physical stimuli directly cause receptors to initiate neural activity, so the active high-level neurons directly and simply cause the elements of our perception.
6.5 Fifth dogma
The frequency of neural impulses codes subjective certainty: a high impulse frequency in a given neuron corresponds to a high degree of confidence that the cause of the percept is present in the external world.

7 First dogma: Significant level of description
This dogma asserts that a picture of how the brain works, and in particular how it processes and represents sensory information, can be built up from knowledge of the interactions of individual cells. At the moment single-unit electrical recording is the only tool with temporal and spatial resolution adequate to locate the effect of a particular sensory stimulus in a particular cell. Other tools (biochemical, electron microscopy, etc.) can obviously provide essential information about these interactions, but the dogma may be criticized more fundamentally; it may be suggested that the whole problem should be approached at a different level. One could attack from either side, suggesting either that one should look at grosser signs of nervous activity, such as the weak extracellular potentials that result from the activity of many neurons, or that one should approach the problem at a more microscopic level, studying synaptic and molecular changes. Interest in evoked potentials and electroencephalography has waned partly because their study led to slow progress compared with single-unit recording, but also because the rationale for their use was undermined. A prime reason for attending to these macroscopic manifestations of nervous activity was the belief that individual cells were too unreliable to be worthy of attention singly, and hence it was better to look at a sign of activity that resulted from many of them. Here, it was thought, may be a property of a group of cells analogous to temperature or pressure as a property of a collection of molecules that individually behave randomly. The demonstration that single nerve cells have diverse and highly specific responsiveness to sensory stimuli, and are astonishingly reliable, showed the fallacy of this analogy. The search for a molar property of a mass of working nerve cells is certainly not worthless. Physiologists, and all biologists for that matter, tend to be emotionally divided into globalists and atomists. The globalists are amazed at the perfection of functioning of the whole animal, and they observe that the atomists' analytical investigations of living matter always leave unexplained many of the most remarkable attributes of the intact animal. As a result the globalist can play a crucially important role in pointing out where the atomists' explanations are incomplete. Now the brain does much more interesting things than produce weak extracellular potentials: it controls behaviour, and this is surely the global product that, at our present state of understanding, really does appear greater than the sum of its parts. It would be no use looking at single neurons if it will be forever impossible to explain overall behaviour in terms of the actions and interactions of these subunits;
if that were so, the globalists' despair would be justified. On the other hand it is precisely because rapid progress has been made that this article is being written; it no longer seems completely unrealistic to attempt to understand perception at the atomic single-unit level. The second criticism, that one should approach the problem at a more microscopic level, is really only answerable by saying, "Go ahead and do it", for undoubtedly there is much to be learned at a synaptic and molecular level. But the important question here is whether lack of this knowledge will impede a major advance in our conception of how the brain works. The dogma asserts that it is the intercellular actions and interactions that possess the elaborate organization responsible for behaviour; hence it asserts that knowledge at a more-microscopic intracellular level is not a prerequisite for understanding such organization.

8 Second dogma: The economical representation of sensory messages
The main task in this section is to discern the principles that underlie the changes in characteristic responsiveness of single units at successive levels in sensory pathways. The aim is to understand how sensory information is represented or 'displayed'. The successive levels to be considered will be peripheral photoreceptors and cutaneous afferents; retinal ganglion cells of the cat, frog, or rabbit, the latter of which seem to exemplify a more complex type of processing; and the visual cortex of cats. Obviously these are not an ideal series for comparisons and extrapolations, but they are the best we can do. The discussion initially revolves around three issues: changes in the degree of specificity and generality of the stimuli to which the cells respond; changes in the number of parallel categories of selectively sensitive cells that carry the information; and changes in the number of the cells that one may expect to be activated by normal visual scenes. What emerges is that, at the higher levels, fewer and fewer cells are active, but each represents a more and more specific happening in the sensory environment.

8.1 Specificity and generality of responsiveness
The pattern specificity of sensory neurons is the aspect that is most widely emphasized; it was spectacular to discover single neurons in the retina responding to movement of the image in a specific direction, cortical neurons responding only to slits of light at a particular orientation, and a unit in the infero-temporal cortex that responds best to a monkey's hand. But the invariance of the response to changes in the stimulus is equally remarkable. A retinal unit continues to respond to direction of motion in spite of many decades of change in input luminance or contrast, in fact in spite of reversal of contrast (Barlow, 1969a). At the cortical level a complex cell insists that a stimulus is appropriately oriented, but will respond in spite of wide variations of position (Hubel and Wiesel, 1962). And the monkey-paw unit similarly retains its pattern specificity over a large part of the visual field (Gross et al., 1972). In talking about these properties of sensory neurons actual examples are perhaps more informative than the words specific and general. A single receptor containing a red-sensitive pigment is specific in the sense that long-wavelength light must be present at a particular part of the image in order to excite it, and it is general in the sense that all images with this property will excite it.
In contrast to this type of specificity and generality, the high-level neurons are no longer limited to purely local attributes of the image. They are selective for pattern, which requires that a considerable region of the image is taken into account. But there are other aspects of their specific selectivity that also need to be considered.
8.2 Number of selective categories
At the level of receptors there are a small number of different sensory modalities picking up, in parallel, information from different positions. This is the case both for the half dozen types of cutaneous sensation, and for the smaller number of retinal receptor types responding to the visual image. At the level of ganglion cells the number of sub-modalities or selective categories has greatly increased. Consider the rabbit, where there are the following (Barlow et al., 1964; Levick, 1967): two concentric types (on- and off-centre); four on-off type directionally selective (for movements up, down, antero-posterior, and postero-anterior); three directions for slow, on-type, directionally selective; one type sensitive to fast movement; one type sensitive to 'uniformity'; and, confined to the visual streak, two types of orientation-selective neurons, neurons selective for slow-moving small objects, and neurons selective for edges. This makes a total of 15 different selective categories. In addition there must be units signalling colour, since the rabbit shows behavioural evidence for it, but these have not yet been found in the retina. Now move up to the simple cells in area 17 of cat cortex. These vary in position, orientation, disparity, and size, as well as being selective for light bars, dark bars, or edges. The evidence is not sufficient to say how many distinct selective categories these form, but for each of the first three variables the resolution of a single neuron is good, in the sense that small departures from the preferred position, orientation, or disparity cause large decreases of response amplitude (Bishop, 1970). These variables already define four dimensions, and we have not yet considered size specificity, velocity specificity, nor the additional complexities of light, dark, or edge detectors, and of course colour. There are certainly several orders of magnitude more neurons in the primary projection area than there are input fibres, or resolvable points in the visual field, and it is abundantly clear that the number of selective categories has increased enormously. Activity of a particular neuron signifies much more than the presence of light at a particular locus in the visual field; its activity signifies a great deal about the nature of the pattern of light at that locus. The fact that many parallel communication channels are used in the nervous system has been widely recognised, but here we see an enormous expansion of the number of parallel paths, and this occurs without much redundant reduplication of channels, for each neuron seems to have a different specific responsiveness. It is as if, at high levels, the size of the alphabet available for representing a sensory message was enormously increased. Perhaps it would be better to say that, if the activity of a low-level neuron is like the occurrence of a letter, that of a high-level neuron is like the occurrence of a word, a meaningful combination of letters. But to understand this better we must look at the third aspect of the way sensory messages are represented at different levels, namely the proportion of neurons that are usually active. If the pattern of activity caused by a visual scene has, on average, K neurons active out of the total of N neurons, then we have seen above that N increases at high levels; can one say anything about how K changes?
8.3 Number of active cells
If one considers the retinal cones under typical photopic conditions, the vast majority must be partially active. They may be nearer the depolarized than the hyperpolarized limit of their dynamic range, but the majority will be somewhere well within it. For the retinal ganglion cells of a cat the situation is a little different; while a few units, those corresponding to the brightest and dimmest parts of the scene, will be vigorously active, the majority, corresponding to the parts of the scene near the mean luminance, will be discharging at rates close to their maintained discharge level, which in its turn is near the low-frequency end of their dynamic range. Thus there will be a lot of units with low degrees of activity and a few which
are vigorously active. Recoding in the retina changes the distribution of activity so that low impulse frequencies are common, high impulse frequencies rare. Now consider the rabbit, with its more elaborate retinal processing, and greater richness of pattern-feature signalling neurons. It is characteristic of the more specific of these neurons that they have a very low maintained discharge, and are extremely hard to excite until their exact trigger feature has been found. One flashes lights, waves wands, and jiggles 'noise figures' in the appropriate part of the visual field for many minutes, maybe hours, before finding the right combination for excitation. It is reasonably certain that the right combination does not occur often in the natural environment either, and therefore these units must spend only a small fraction of the time in an active state. Low impulse frequencies are even commoner, high impulse frequencies even rarer, than in cat retina. For the cat cortex this trend is carried further, and one can see another aspect emerging. If one takes a small region of the visual field, it either does contain a bright bar, dark bar, or edge, or, much more likely, it does not. Thus, like the rabbit units, the cells with these specific responsivities must be only infrequently active. But in addition, on the rare occasions when one of the appropriate trigger features is present, it is one of a set which tend to be mutually exclusive: a bright bar cannot be a dark bar, and it can have only one orientation and disparity. The stimulus selects which cell to activate from a range of many possible cells, and it is pretty well impossible to activate simultaneously more than a small fraction of this number. The picture developing is that at low levels visual information is carried by the pattern of joint activity of many elements, whereas at the upper levels of the hierarchy a relatively small proportion are active, and each of these says a lot when it is active. But, although we clearly see that the proportion active, K/N, decreases, we cannot tell whether it decreases as rapidly as N increases, and thus we still do not know how K itself changes. The second dogma goes beyond the evidence, but it attempts to make sense out of it. It asserts that the overall direction or aim of information processing in higher sensory centres is to represent the input as completely as possible by activity in as few neurons as possible (Barlow, 1961, 1969b). In other words, not only the proportion but also the actual number of active neurons, K, is reduced, while as much information as possible about the input is preserved. By how much can one reasonably expect K to be reduced? One requires the concepts of channel capacity and redundancy from information theory (Shannon and Weaver, 1949; Woodward, 1953) to make a rough estimate. Some reduction can be accomplished without any loss of information simply by the increase of N. K/N is the probability of a fibre being active, and, if it is the same for all neurons, the information capacity of a set of N neurons, each either active or not active, is -K log₂(K/N) - (N - K) log₂((N - K)/N). If K/N is small, the second term contributes little; the capacity then is, approximately, the number of active neurons times the information provided by each active neuron, and this increases directly as the negative logarithm of the probability of it being active, -log₂(K/N).
Hence the number active can be reduced as N increases without loss of information capacity, but by itself this does not allow K to be reduced very much: for instance, if we suppose that ¼ of the 2 × 10⁶ optic nerve fibres are active and that there are 10⁹ cortical neurons receiving this information, then one finds that about 1.5 × 10⁵ cortical neurons must, on average, be active in order to have the same information capacity as the 5 × 10⁵ active optic nerve fibres. But this applies only to capacity, and a substantial reduction in K is possible on the basis of another principle.
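The capacity comparison above can be checked with a short sketch (an editorial illustration, not part of the original article). It evaluates -K log₂(K/N) - (N - K) log₂((N - K)/N) for a sparsely active population and then finds, by bisection, how many active neurons a larger population needs to match that capacity; the population sizes are of the same order as those quoted in the text.

import math

def capacity_bits(n_neurons, k_active):
    """Capacity of N neurons, each active (probability K/N) or silent:
    -K*log2(K/N) - (N-K)*log2((N-K)/N)."""
    p = k_active / n_neurons
    return -n_neurons * (p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def active_needed(n_neurons, target_bits):
    """Smallest number of active neurons giving at least `target_bits`,
    found by bisection (capacity grows with K for K < N/2)."""
    lo, hi = 1.0, n_neurons / 2.0
    while hi - lo > 1.0:
        mid = (lo + hi) / 2.0
        if capacity_bits(n_neurons, mid) < target_bits:
            lo = mid
        else:
            hi = mid
    return int(round(hi))

# Illustrative figures of the same order as those quoted in the text.
optic_fibres, optic_active = 2_000_000, 500_000
bits = capacity_bits(optic_fibres, optic_active)
print(f"capacity of the active optic nerve pattern: {bits:.2e} bits")
print(f"cortical neurons (out of 1e9) needed for the same capacity: "
      f"{active_needed(1_000_000_000, bits):,}")
# Increasing N lets the number of active cells fall only a few-fold; the much
# larger reductions discussed next depend on exploiting redundancy.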
Visual information is enormously redundant, and it has been suggested previously that sensory coding is largely concerned with exploiting this redundancy to obtain
more economical representation of the information. If the argument is correct, the number of active neurons can be reduced, but it is very difficult even to guess how big a reduction in K such recoding can achieve; if one factor of reduction is achieved up to the cortex, and a further factor in visual areas I, II, and III, one might end up with about 1000 active fibres carrying the information provided by 5 × 10⁵ active optic nerve fibres; though the reductions might be substantially greater or less, this is the order of magnitude of the reduction contemplated.
According to dogma, these 1000 active neurons represent the visual scene, but it is obvious that each neuron must convey an enormously larger share of the picture than, say, one point out of the quarter million points of a television picture. Perhaps a better analogy is to recall the 1000 words that a picture is proverbially worth; apparently an active neuron says something of the order of complexity of a word. It seems to me not unreasonable to suppose that a single visual scene can be represented quite completely by about 1000 of such entities, bearing in mind that each one is selected from a vast vocabulary and will in addition carry some positional information.

9 Third dogma: Selectivity adapted to environment
9.1 Evolutionary adaptation
Some economies of the type indicated above can be achieved by exploiting forms of redundancy which are present in all normal environments. Levels of sensory stimulation do not range at random over the whole scale of possible values, and it makes sense to regard adaptation of peripheral receptors as a measure to achieve economy by signalling changes from the mean instead of absolute values. Similarly in most situations neighbouring points on a sensory surface are more likely to be similar than distant points, and it thus makes sense to regard contrast enhancement by lateral inhibition as another economy measure. The argument can be carried on to cover the redundancy-reducing value of movement, edge, or disparity detectors (Barlow, 1969b), but, if these are genetically-determined redundancy-reducing codes, they must be fixed once and for all during development, and they could only work for redundant properties of all sensory environments. The hypothesis becomes more interesting when one considers the possible mechanisms for achieving economy by exploiting the redundancy of particular sensory scenes, for this requires storage of information and plasticity of the neural structures involved.
9.2 Reversible adaptation
The neural changes of dark and light adaptation may be regarded as a simple example of reversible plasticity achieving this end. The luminance corresponding to zero impulses is affected by the past history of illumination and by the surrounding luminances in such a way that the majority of fibres are responding at low frequencies. But, even though this involves definite changes in the synaptic transfer properties of retinal neurons, the statistical characteristic of visual images that enables this to achieve economy is always the same, namely the fact that the distribution of luminances is grouped around local and temporal mean values, so that small deviations from the mean are commoner than large deviations (Barlow, 1969a). Hence the most commonly occurring luminances require fewest impulses.
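The economy of signalling deviations from an adapted mean can be illustrated with a toy sketch (an editorial illustration, not part of the original article). A synthetic sequence clusters around a slowly drifting mean, and the average magnitude that must be signalled is compared for an absolute code and for a code reporting deviations from a running estimate of the mean; all numbers, including the adaptation rate, are invented.

import random

random.seed(0)

# Toy 'luminance' sequence: samples cluster around a slowly drifting local
# mean, as the text assumes for natural scenes.  All numbers are invented.
mean, samples = 100.0, []
for _ in range(5000):
    mean += random.gauss(0.0, 0.2)                # slow drift of the ambient level
    samples.append(mean + random.gauss(0.0, 2.0))  # small local deviations

def mean_magnitude(values):
    """Crude cost model: the impulses needed grow with the magnitude signalled."""
    return sum(abs(v) for v in values) / len(values)

# 'Adapted' code: signal the deviation from a running estimate of the mean.
deviations, estimate = [], samples[0]
for v in samples:
    deviations.append(v - estimate)
    estimate += 0.1 * (v - estimate)              # slow adaptation toward the mean

print(f"average magnitude, absolute code : {mean_magnitude(samples):7.2f}")
print(f"average magnitude, deviation code: {mean_magnitude(deviations):7.2f}")
# Signalling deviations from an adapted mean makes small, cheap signals the
# common case, which is the economy the text attributes to light adaptation.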
9.3 Permanent adaptation
The effects permanently impressed on the visual system during the sensitive period are the first example of plasticity for a particular type of redundancy. The distribution of orientational selectivity of primary neurons is biased in favour of the orientations the individual experienced during this critical time. If the analogy of a neuron's signal resembling the utterance of a word is recalled, this result suggests that the kitten's cortex only develops words for what it has seen. This could be brought
about by either selection or modification: are the dictionary words there, only the ones experienced becoming permanently connected; or do the cells themselves determine that a frequently experienced pattern, such as lines of a particular range of orientations, are events for which words are desirable? The evidence favours modification, and the idea to which it leads of the successive hierarchical construction of a dictionary of meaningful neurons has enormous appeal. For the present we can only justify the third dogma by saying the evidence suggests such a dictionary may be built up, though we are far from being able to look into its pages by physiological methods. In the next section we turn to the subjective view of this dictionary.

10 Fourth dogma: Origin of perceptions
10.1 Personal perception
In order to delimit more accurately what this dogma does and does not say it may be useful to define and separate three mysteries of perception. The first is the personal, subjective, aspect of my experience of, say, the red pencil with a blue eraser in my hand. There does not seem to be anything that could be said about the activity of nerve cells accompanying this experience that would in any way 'explain' the aspect of it that is mysterious, personal, and subjective. I think this part of the experience is something that one must be content to leave on one side for the moment, but it is important that this part of subjective experience almost always accompanies electrical stimulation of a peripheral sensory nerve, and usually accompanies electrical stimulation of the sensory areas of the brain, for this implies that the full subjective experience, including this mysterious personal element, accompanies the neural events of sensation, however these are caused. This fact strongly suggests that it is no waste of time to look into these neural events: beauty is a mysterious attribute of a work of art, but that does not imply that you cannot create a beautiful painting by non-mysterious material means.
10.2 Conscious perception
The second mystery is that we are not consciously aware of much that goes on in our brains, so the inverse of the fourth dogma is certainly not true: not every cortical neuron's activity has a simple perceptual correlate. Even at high mental levels much neural business is conducted without conscious awareness, and my own belief is that the conscious part is confined to experiences one communicates to other people, or experiences one is contemplating communicating to other people. This immediately introduces a social element into individual consciousness, for communication is impossible without a channel being open to a recipient. However, for present purposes we need only point out that interesting aspects of consciousness of this sort are by no means incompatible with the fourth dogma. An element of perception can possess a simple neural cause without it necessarily being the case that all simple neural events cause perception. There is therefore plenty of room for social, historical, or moral influences on perception, because these can influence the selection of the neural events that enter conscious perception.
10.3 Validity of perceptions
The third mystery about perceptions is why they are generally 'true': why are they so extraordinarily useful in guiding our actions and helping us to make decisions? This is the aspect that the second and third dogmas help one to understand. The economical and fairly complete representation of visual scenes by a reasonably small number of active neurons makes it much easier to visualize how they can be used for these purposes. The key point is that the active neurons carry the bulk of the information, and the vast number of inactive ones need not be taken into consideration. The difficulty of detecting among our sense impressions the entities
we use for rational thought has always been baffling: 'water', 'men', 'sheep', and even the simple letter 'A' represent particular logical functions of activity among the sensory neurons, but the number of possible logical functions is so vast that we are mystified how particular ones are realized, or why particular ones are selected for realization. The representation suggested by the second and third dogmas would allow relatively simple logical combinations to have properties approaching those required for the literal symbols of Boole (1854), the subjects of our conceptions. By using such symbols together with operational signs he founded mathematical logic, but the title of his major work, "The Laws of Thought", clearly states his claim that his inquiries had "probable intimations concerning the nature and constitution of the human mind". It is gratifying to approach closer to an intuitively plausible neural realization of what he symbolized. The notion that what we sense is a point by point representation of the physical signals impinging on our body has been rejected for psychological and philosophical reasons (see Boring, 1942), and more recent physiological evidence clearly supports this rejection. But this same evidence suggests that it should be replaced, not by return to a subjectively constructed phenomenology (Dreyfus, 1972), nor by the notion that we sense the world in terms of rigidly preordained 'structures', but by the deeper and more adaptive ideas of dogmas two and three; our sensorium is presented with a fairly small number of communications, each representing the occurrence of a group of external events having a word-like order of complexity, and, like words, having the special property that they lead to an economical representation of these physical events.
11 Fifth dogma: Signalling subjective certainty
There is one way in which the properties of neurons do not match up to the way Boole used symbols, for he insisted on their binary nature, or 'duality'. This is the property Aristotle called the principle of contradiction, and without it Boole's symbolic representation of logic would have been impossible. In contrast to this duality the response of a sensory nerve cell to its trigger feature consists of a volley of impulses lasting 1/10 to 1 second, during which time the neuron can discharge any number of impulses between zero and nearly 1000. Therefore the response is graded, and it is not legitimate to consider it as a Boolean binary variable. The essential notion expressed in the fifth dogma is that a neuron stands for an idealization of reality whose complement can be formulated as a null hypothesis, and it is this that has the required Boolean logical property of duality. The idealizations should not be thought of as Kantian or Platonic, but rather as abstractions that model reality in the manner suggested by Craik (1943). The ideal populations of a statistician come even closer, for the parameters of such distributions model reality, and they are used to calculate whether or not a particular sample belongs within it. The process of idealizing the complement of the trigger feature will be clarified by a simple example. Suppose we have a sensory neuron whose trigger feature is a simple physical event, such as the increase of light intensity at a specific position in the visual field.
We are examining the suggestion that the graded responses this neuron gives to visual stimuli represent some function of the degree of certainty that the light did in fact increase, estimated from the physical events accessible to the neuron. For simplicity assume that a record is available of the total numbers of quanta absorbed in the receptive field of the sensory neuron during successive periods of about 1/10 second duration up to and including the period about which it is to signal centrally. It is in principle possible to calculate the probability of occurrence of the observed number of quantal absorptions on the hypothesis that there was no change in the light intensity, and this is the test a statistician would apply to determine whether or not
the trigger feature was present. On this view impulse frequency signals some function of the significance level at a test of this sort, low probabilities corresponding to high impulse frequencies. Notice that the trigger feature is 'an increase of light', the idealization is 'there was no increase', and this idealization is based on observation of what has recently happened and therefore incorporates a model of the recent past. The responses actually obtained to varying intensities of incremental stimulus fit quite well into this scheme (Barlow, 1969a), as do some less obvious features. The low-frequency maintained discharge could well represent the results, of low significance, obtained by testing the null hypothesis when no stimulus has been applied. Although individual fibres would not reach significance, information about quantal absorptions would be retained, and changes insignificant singly could be combined centrally to reach significance. In the above example a high value of P and low impulse frequency would result from a shadow falling on the receptive field. The detection of this shadow might be of great survival significance for the animal, but the on-centre unit's trigger feature, null hypothesis, and statistical tests would be a poor way of detecting and signalling this important event. A different type of unit is required whose null hypothesis should be "There has been no decrease in the quantal absorption rate"; these would fire when the hypothesis is disproved by the quantal absorption rate dropping below the normal range of variation. Obviously these are the off-centre units, and it seems that the existence of complementary 'on' and 'off' systems fits the notion quite well. When there are a large number of neurons with trigger features that cannot coexist, as in area 17, these correspond to a large number of mutually exclusive hypotheses to be tested. The fifth dogma clearly requires more development and testing, but it provides a possible answer to the question "What variable corresponds to impulse frequency in a high level sensory neuron?" Furthermore the answer ties it to a rather definitely felt subjective quantity, the sense of certainty.
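The statistical picture behind the fifth dogma can be sketched numerically. The fragment below is my illustration, not the author's: a unit counts quantal absorptions in a short period, computes the probability of that count under the null hypothesis of no increase in intensity, and fires at a rate that grows with -log P. The Poisson null model, the gain constant, and the ceiling near 1000 impulses per second are assumptions that follow the text only loosely.

    import math

    def poisson_tail(k, mean):
        """P(X >= k) under a Poisson null hypothesis with the given mean count."""
        term, cdf = math.exp(-mean), 0.0
        for i in range(k):          # sum P(X = 0) ... P(X = k - 1)
            cdf += term
            term *= mean / (i + 1)
        return max(1.0 - cdf, 1e-300)

    def firing_rate(observed_count, background_mean, gain=20.0, max_rate=1000.0):
        """Impulse frequency proportional to -log of the null-hypothesis probability."""
        p = poisson_tail(observed_count, background_mean)   # null: no increase in intensity
        return min(gain * -math.log10(p), max_rate)

    # An on-centre unit adapted to a mean of 5 absorptions per counting period:
    for count in (3, 5, 8, 15, 30):
        print(count, round(firing_rate(count, 5.0), 1))     # common counts give few impulses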
12 Criticisms and alternatives
Single-unit recording hints at this probabilistic, adaptive, many-levelled hierarchy for processing and displaying sensory information, but can we believe that what we perceive is the activity of a relatively small selection of upper-level units of this hierarchy? This is certainly a big jump beyond the present physiological evidence. We do not know how perspective transformations are disregarded, enabling us to perceive the same object irrespective of our angle of view, nor do we understand the mechanisms underlying size constancy, yet these mechanisms must intervene between the highest neurons we know about and quite simple perceptions. I think we have seen enough of what can be achieved in a few stages of neural image processing to believe that a few more stages could reach the point where a single neuron embodies, by virtue of its peripheral connections and their properties, an elementary percept, but let us examine an alternative and the evidence adduced in its support.
12.1 Combinatorial or holographic representations
The key suggestion about the organization of sensory processing that the second dogma asserts is that the information is carried by progressively fewer active neurons at progressively higher levels in the hierarchy. The brain receives complex patterns of activity in nerve fibres from the sensory receptors, and it generates complex patterns of outgoing commands to the muscles. It could be held that the patterns are equally complex at all the intervening stages as well, and this would mean that the significance of a single unit's activity would be virtually undecipherable without knowing what was going on in a host of other units. Certainly one would make
little progress in understanding a computer's operation by following the status of a single bit in its central processor, so this criticism is partly met by pointing to the success that has been achieved in the visual system by looking at the activity of units singly, one at a time. But we should also look critically at the main evidence advanced in favour of the combinatorial or holographic scheme.
12.2 Mass action and resistance to damage
The main argument that has been levelled against the view that individual cells play an important role in perception, and in favour of a holographic representation, is the reported fact that large parts of the cortex can be damaged with only minor resultant changes in behaviour or learning (Lashley, 1929, 1950). This led to Lashley's doctrine of 'cerebral mass action', but repetition of the original experiments and refinements in methods of testing, some of it by Lashley himself, have considerably weakened the original evidence in its favour (Zangwill, 1961). However, it certainly is remarkable that a mechanism with as much interdependence between its parts as the brain can function at all after it has been extensively damaged. A computer would not usually survive brain surgery or gunshot wounds, and it is therefore worthwhile considering the implications of the fact that the cortex is relatively immune to quite extensive injury. The whole of a visual scene can be reconstructed from a small part of a hologram, with only slight loss of resolution and degradation of signal/noise ratio (Gabor et al., 1971), so it has been claimed that the cortex must operate by some analogous principle in order to account for its resistance to damage. What is not widely appreciated is the fact that holography differs from ordinary image-recording photography not only in principle, but also in the materials used, for it requires photographic emulsions with resolutions of the order of the wavelength of light (Gabor et al., 1971). With such materials a good quality 35 mm picture could easily be reproduced and repeated in every 1 mm² of the plate, and a plate containing such a reduplicated image would have to be pulverized into tiny pieces to prevent reconstructibility of the original from every fragment. The mass-action-like resistance to damage of the hologram is partly due to the enormous informational capacity of the materials that are required; immunity to damage is easy to achieve when such high redundancy is permissible, and this argument carries little weight in favour of holographic views of nervous operation.
Codes can be given error-correcting properties much less wastefully, and the argument can be turned around to favour the representation hypothesized in dogmas two and three. Because the few active cells have a fixed significance, and because the inactive ones are thought to carry so little information that they can be neglected, the only result of removing part of the cortex would be to eliminate some of the active neurons, and hence some of the perceptual entities, when a given scene is viewed. The 'meaning' of the active units in the undamaged cortex would remain the same, and might provide a sufficient basis for decision and action. This is very different from the situation where a neuron's activity has totally different significance depending upon the pattern of activity of which it forms a part, for if any of this pattern was in a damaged region the significance of activity in the undamaged part would be altered. Hence damage immunity is really an argument for neurons having an invariant meaning not dependent upon the activity of other neurons. It should also be pointed out that very limited replication of 'percept neurons' would give considerable damage immunity: if a given neuron is replicated half a dozen times in different cortical regions there is a good chance of at least one of them surviving an extensive cortical ablation. This sixfold redundancy is enormously less than the holographic scheme possesses, and it can be concluded that the mass
action argument rebounds against the extensive combinatorial usage of neurons and actually favours the hypothesis of this article.
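The sixfold-replication point is easy to quantify. Assuming, purely for illustration, that the copies of a 'percept neuron' are placed so that each is lost independently and that an ablation destroys a fraction f of the cortex, the chance that all k copies are lost is roughly f to the power k:

    # Illustrative arithmetic, not from the paper: survival probability of at
    # least one of k = 6 replicated 'percept neurons' after an ablation that
    # destroys a fraction f of the cortex.
    k = 6
    for f in (0.3, 0.5, 0.7):
        print(f, round(1 - f**k, 4))   # e.g. even a 70% ablation leaves ~0.88 chance of survival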
12.3 Pontifical cells
Sherrington (1941) introduced the notion of "one ultimate pontifical nerve-cell, ... the climax of the whole system of integration" and immediately rejected the idea in favour of the concept of mind as a "million-fold democracy whose each unit is a cell". Those who like the notion of perception as a cooperative or emergent property of many cells dismiss the suggestion that the activity of a single neuron can be an important element of perception by saying that, carried to its logical conclusion, it implies there must be a single 'pontifical cell' corresponding to each and every recognizable object or scene. First, notice that the current proposal does not say that each distinct perception corresponds to a different neuron being active, if perception is taken to mean the whole of what is perceived at any one moment; it says there is a simple correspondence between the elements of perception and unit activity. Thus the whole of subjective experience at any one time must correspond to a specific combination of active cells, and the 'pontifical cell' should be replaced by a number of 'cardinal cells'. Among the many cardinals only a few speak at once; each makes a complicated statement, but not, of course, as complicated as that of the pontiff if he were to express the whole of perception in one utterance. Two important difficulties arise from the notion of pontifical cells: first, if a separate neuron is needed for each of our perceptions, there are not enough to account for their almost incredible variety; second, the activity of a single isolated element would not convey anything of a perception's great richness, the connection between one perception and others. The 'grandmother cell' might respond to all views of grandmother's face, but how would that indicate that it shares features in common with other human faces, and that, on a particular occasion, it occurs in a specific position surrounded by other recognizable objects? Our perceptions simply do not have the property of being isolated unique events as one would expect if each corresponded to the firing of a unique neuron. Instead, they overlap with each other, sharing parts which continue unchanged from one moment to another, or recur at later moments in different contexts. I think the 'cardinal cell' representation surmounts these problems without any difficulty; if a critic can say how many different perceptions we are capable of, and how rich a network of relatedness exists between these perceptions, then one might be able to estimate how many cardinals' voices were required to represent these perceptions. But there is a misleading feature of the ecclesiastical analogy. Most organizational hierarchies are pyramids: there are many members of the church, fewer priests, only a select number of cardinals, and a single pope. The hierarchy of sensory neurons is very different. It is true that there are more retinal receptors than ganglion cells, but the number of cortical neurons in area 17 is certainly orders of magnitude greater than the number of incoming fibres. The numbers at succeeding levels may be somewhat fewer, but a high proportion of the nerve cells in the brain must be capable of being influenced by vision, so if the hierarchical organization is pyramidal it is inverted rather than erect, divergent rather than convergent.
If one uses the term 'cardinal cell', one must be sure to remember that the college of these cardinals outnumbers the church members and must include a substantial fraction of the 10¹⁰ cells of the human brain.
Afterthoughts
It is sufficiently obvious that these propositions are incomplete, that there are aspects of the sensory problem left untouched, and that the dogmas go considerably beyond the evidence. I have said, in essence, that the cells of our brain are each capable of
more than had previously been supposed, and that what their activities represent may be more simply related to the elements of our conscious perceptions than had previously been thought. But clever neurons are not enough. The simplest computer program with its recursive routines and branch points has more subtlety than the simple hierarchy of clever neurons that I have here proposed as the substrate of perception.
I think one can actually point to the main element that is lacking. We have seen that some properties of the environment can be represented, or modelled, in a system of the type proposed; I feel that a corresponding model is also needed for our own motor actions and their consequences. Such motor and sensory models could then interact and play exploratory games with each other, providing an internal model for the attempts of our ever-inquisitive perceptions to grasp the world around us. A higher-level language than that of neuronal firing might be required to describe and conceptualize such games, but its elements would have to be reducible to, or constructible from, the interactions of neurons. The five dogmas do not impede developments in this direction. My claim for them is that they are a simple set of hypotheses on an interesting topic, that they are compatible with currently known facts, and that, if any are disproved, then knowledge in this field will be substantially advanced.
Acknowledgements. This essay was started many years ago when Gerald Westheimer suggested to me that, if a single-neuron dogma of the power and generality of 'DNA codes protein' could be found, it might speed progress of neuropsychology as much as Crick and Watson speeded up molecular biology. Since then I have been helped by the discussion of these ideas with a group of neurophysiologists and psychologists organized in Berkeley by M. F. Land, and by many useful suggestions from B. Sakitt. I think the single neuron revolution is having a powerful effect in sensory psychology, but I still wish it could be expressed in a single dogma.
References
Adrian, E. D., 1926a, "The impulses produced by sensory nerve-endings", Pt. 1, J. Physiol., 61, 49-71.
Adrian, E. D., 1926b, "The impulses produced by sensory nerve-endings, Pt. 4, Impulses from pain receptors", J. Physiol., 62, 33-51.
Adrian, E. D., 1928, The Basis of Sensation (Christophers, London; also Hafner, New York, 1964).
Adrian, E. D., 1941, "Afferent discharges to the cerebral cortex from peripheral sense organs", J. Physiol., 100, 159-191.
Adrian, E. D., 1947, The Physical Background of Perception (Clarendon Press, Oxford).
Adrian, E. D., Zotterman, Y., 1926a, "The impulses produced by sensory nerve-endings, Pt. 2, The response of a single end-organ", J. Physiol., 61, 151-171.
Adrian, E. D., Zotterman, Y., 1926b, "The impulses produced by sensory nerve-endings, Pt. 3, Impulses set up by touch and pressure", J. Physiol., 61, 465-493.
Bard, P., 1938, "Studies on the cortical representation of somatic sensitivity", Harvey Lectures 1938 (Academic Press, New York), pp. 143-169.
Barlow, H. B., 1953, "Summation and inhibition in the frog's retina", J. Physiol., 119, 69-88.
Barlow, H. B., 1956, "Retinal noise and absolute threshold", J. Opt. Soc. Amer., 46, 634-639.
Barlow, H. B., 1960, "The coding of sensory messages", in Current Problems in Animal Behaviour, Eds. W. H. Thorpe, O. L. Zangwill (Cambridge University Press, Cambridge), pp. 331-360.
Barlow, H. B., 1961, "Possible principles underlying the transformations of sensory messages", in Sensory Communication, Ed. W. A. Rosenblith (MIT Press, Cambridge, Mass., and John Wiley, New York), pp. 217-234.
Barlow, H. B., 1969a, "Pattern recognition and the responses of sensory neurons", Ann. N. Y. Acad. Sci., 156, 872-881.
Barlow, H. B., 1969b, "Trigger features, adaptation, and economy of impulses", in Information Processing in the Nervous System, Ed. K. N. Leibovic (Springer-Verlag, New York), pp. 209-226.
Barlow, H. B., Blakemore, C., Pettigrew, J. D., 1967, "The neural mechanism of binocular depth discrimination", J. Physiol., 193, 327-342.
Barlow, H. B., Brindley, G. S., 1963, "Interocular transfer of movement after-effects during pressure blinding of the stimulated eye", Nature, 200, 1346-1347.
Barlow, H. B., FitzHugh, R., Kuffler, S. W., 1957, "Change of organization in the receptive fields of the cat's retina during dark adaptation", J. Physiol., 137, 338-354.
Barlow, H. B., Hill, R. M., 1963, "Evidence for a physiological explanation of the waterfall phenomenon and figural after-effects", Nature, 200, 1345-1347.
Barlow, H. B., Hill, R. M., Levick, W. R., 1964, "Retinal ganglion cells responding selectively to direction and speed of image motion in the rabbit", J. Physiol., 173, 377-407.
Barlow, H. B., Levick, W. R., Yoon, M., 1971, "Responses to single quanta of light in retinal ganglion cells of the cat", Vision Research, 11, Suppl. 3, 87-102.
Barlow, H. B., Pettigrew, J. D., 1971, "Lack of specificity of neurones in the visual cortex of young kittens", J. Physiol., 218, 98-100.
Bishop, P. O., 1970, "Beginning of form vision and binocular depth discrimination in cortex", in The Neurosciences: Second Study Program, Ed. F. O. Schmitt (Rockefeller University Press, New York), pp. 471-485.
Blakemore, C., Campbell, F. W., 1969, "On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images", J. Physiol., 203, 237-260.
Blakemore, C., Cooper, G. F., 1970, "Development of the brain depends on the visual environment", Nature, 228, 477-478.
Blakemore, C., Mitchell, D. E., 1973, "Environmental modification of the visual cortex and the neural basis of learning and memory", Nature, 241, 467-468.
Boole, G., 1854, An Investigation of the Laws of Thought (Dover Publications reprint, New York).
Boring, E. G., 1942, Sensation and Perception in the History of Experimental Psychology (Appleton-Century-Crofts, New York).
Burns, B., 1968, The Uncertain Nervous System (Edward Arnold, London).
Campbell, F. W., Green, D. G., 1965, "Optical and retinal factors affecting visual resolution", J. Physiol., 181, 576-593.
Cragg, B. G., 1972, "The development of synapses in cat visual cortex", Investigative Ophthalmology, 11, 377-385.
Craik, K. J. W., 1943, The Nature of Explanation (Cambridge University Press, Cambridge).
De Valois, R. L., 1960, "Color vision mechanisms in the monkey", J. Gen. Physiol., 43, Suppl., 115-128.
De Valois, R. L., Abramov, I., Jacobs, G. H., 1966, "Analysis of response patterns of LGN cells", J. Opt. Soc. Am., 56, 966-977.
De Valois, R. L., Abramov, I., Mead, W. R., 1967, "Single cell analysis of wavelength discrimination at the lateral geniculate nucleus in the macaque", J. Neurophysiol., 30, 415-433.
Dreyfus, H. L., 1972, What Computers Can't Do (Harper and Row, New York).
Enroth-Cugell, C., Robson, J. G., 1966, "The contrast sensitivity of retinal ganglion cells of the cat", J. Physiol., 187, 517-552.
Ewert, J. P., 1970, "Neural mechanisms of prey-catching and avoidance behavior in the toad (Bufo bufo L.)", Brain, Behav. Evol., 3, 36-56.
Fry, G. A., 1948, "Mechanisms subserving simultaneous brightness contrast", Am. J. Optom., 25, 162-178.
Gabor, D., Kock, W. E., Stroke, G. W., 1971, "Holography", Science, 173, 11-23.
Galambos, R., 1944, "Inhibition of activity in single auditory nerve fibers by acoustic stimulation", J. Neurophysiol., 7, 287-303.
Galambos, R., Davis, H., 1943, "The response of single auditory-nerve fibres to acoustic stimulation", J. Neurophysiol., 7, 287-303.
Galambos, R., Davis, H., 1948, "Action potentials from single auditory-nerve fibres?", Science, 108, 513.
Ganchrow, J. R., Erickson, R. P., 1970, "Neural correlates of gustatory intensity and quality", J. Neurophysiol., 33, 768-783.
Granit, R., 1947, The Sensory Mechanisms of the Retina (Oxford University Press, Oxford).
Granit, R., Svaetichin, G., 1939, "Principles and technique of the electrophysiological analysis of colour reception with the aid of microelectrodes", Upsala Läkaref. Förh., 65, 161-177.
Green, D. M., Swets, J. A., 1966, Signal Detection Theory and Psychophysics (John Wiley, New York).
Gross, C. G., Rocha-Miranda, C. E., Bender, D. B., 1972, "Visual properties of neurons in inferotemporal cortex of the macaque", J. Neurophysiol., 35, 96-111.
Grüsser, O.-J., Grüsser-Cornehls, U., 1968, "Neurophysiologische Grundlagen visueller angeborener Auslösemechanismen beim Frosch", Zeitschrift für vergleichende Physiologie, 59, 1-24.
Hartline, H. K., 1938, "The response of single optic nerve fibres of the vertebrate eye to illumination of the retina", Am. J. Physiol., 121, 400-415.
Hartline, H. K., 1940a, "The receptive fields of optic nerve fibers", Am. J. Physiol., 130, 690-699.
Hartline, H. K., 1940b, "The effects of spatial summation in the retina on the excitation of the fibers of the optic nerve", Am. J. Physiol., 130, 700-711.
Hecht, S., Shlaer, S., Pirenne, M., 1942, "Energy, quanta, and vision", J. Gen. Physiol., 25, 819-840.
Hirsch, H. V. B., Spinelli, D. N., 1970, "Visual experience modifies distribution of horizontally and vertically oriented receptive fields in cats", Science, 168, 869-871.
Hirsch, H. V. B., Spinelli, D. N., 1971, "Modification of the distribution of receptive field orientation in cats by selective visual exposure during development", Exp. Brain Res., 13, 509-537.
Hubel, D. H., Wiesel, T. N., 1959, "Receptive fields of single neurones in the cat's striate cortex", J. Physiol., 148, 574-591.
Hubel, D. H., Wiesel, T. N., 1962, "Receptive fields, binocular interaction, and functional architecture in the cat's visual cortex", J. Physiol., 160, 106-154.
Hubel, D. H., Wiesel, T. N., 1963, "Receptive fields of cells in striate cortex of very young, visually inexperienced kittens", J. Neurophysiol., 26, 994-1002.
Hubel, D. H., Wiesel, T. N., 1965, "Binocular interaction in striate cortex of kittens reared with artificial squint", J. Neurophysiol., 28, 1041-1059.
Hubel, D. H., Wiesel, T. N., 1970, "The period of susceptibility to the physiological effects of unilateral eye closure in kittens", J. Physiol., 206, 419-436.
Hurvich, L. M., Jameson, D., 1960, "Perceived color, induction effects, and opponent-response mechanisms", J. Gen. Physiol., 43, Suppl., 66-80.
Ingle, D., 1968, "Visual release of prey-catching behaviour in frogs and toads", Brain, Behaviour and Evolution, 1, 500-518.
Ingle, D., 1971, "Prey-catching behaviour of anurans toward moving and stationary objects", Vision Research, Suppl. No. 3, 447-456.
Kuffler, S. W., 1953, "Discharge patterns and functional organization of mammalian retina", J. Neurophysiol., 16, 37-68.
Lashley, K. S., 1929, Brain Mechanisms and Intelligence: A Quantitative Study of Injuries to the Brain (University of Chicago Press, Chicago).
Lashley, K. S., 1950, "In search of the engram", in Physiological Mechanisms in Animal Behaviour, Symposium of the Society for Experimental Biology, Eds. J. F. Danielli, R. Brown (Cambridge University Press, Cambridge).
Lettvin, J. Y., Maturana, H. R., McCulloch, W. S., Pitts, W. H., 1959, "What the frog's eye tells the frog's brain", Proc. Inst. Rad. Eng., 47, 1940-1951.
Levick, W. R., 1967, "Receptive fields and trigger features of ganglion cells in the visual streak of the rabbit's retina", J. Physiol., 188, 285-307.
Mach, E., 1886, The Analysis of Sensations, and the Relation of the Physical to the Psychical. Translation of the first edition (1886), revised from the fifth German edition by S. Waterlow, Ed. C. M. Williams (Open Court, Chicago and London, 1914; also Dover Publications, New York, 1959).
Marshall, W. H., Woolsey, C. N., Bard, P., 1941, "Observations on cortical somatic sensory mechanisms of cat and monkey", J. Neurophysiol., 4, 1-24.
Maturana, H. R., Lettvin, J. Y., McCulloch, W. S., Pitts, W. H., 1960, "Anatomy and physiology of vision in the frog (Rana pipiens)", J. Gen. Physiol., 43, Suppl. No. 2, Mechanisms of Vision, 129-171.
McCulloch, W. S., 1959, "Agatha Tyche: of nervous nets—the lucky reckoners", in Mechanisation of Thought Processes: Proceedings of a Symposium Held at the National Physical Laboratory, Vol. 2 (HMSO, London), pp. 611-634.
Pettigrew, J. D. (forthcoming), "The effect of visual experience on the development of stimulus specificity by kitten cortical neurons".
Pettigrew, J. D., Freeman, R. (forthcoming), "Visual experience without lines: Effect on developing cortical neurones".
Pettigrew, J. D., Nikara, T., Bishop, P. O., 1968, "Binocular interaction on single units in cat striate cortex: simultaneous stimulation by single moving slit with receptive fields in correspondence", Exp. Brain Res., 6, 391-410.
Pfaffman, C., 1941, "Gustatory afferent impulses", J. Cell. Comp. Physiol., 17, 243-258.
Pfaffman, C., 1955, "Gustatory nerve impulses in rat, cat, and rabbit", J. Neurophysiol., 18, 429-440.
Sakitt, B., 1972, "Counting every quantum", J. Physiol., 223, 131-150.
Shannon, C. E., Weaver, W., 1949, The Mathematical Theory of Communication (University of Illinois Press, Urbana).
Sherrington, C. S., 1941, Man on His Nature (Cambridge University Press, Cambridge).
Shlaer, R., 1971, "Shift in binocular disparity causes compensatory change in the cortical structure of kittens", Science, 173, 638-641.
Stone, J., 1972, "Morphology and physiology of the geniculo-cortical synapse in the cat: The question of parallel input to the striate cortex", Invest. Ophthal., 11, 338-346.
Svaetichin, G., MacNichol, E. F., Jr., 1958, "Retinal mechanisms for chromatic and achromatic vision", Ann. N. Y. Acad. Sci., 74, 385-404.
Talbot, W. H., Darian-Smith, I., Kornhuber, H. H., Mountcastle, V. B., 1968, "The sense of flutter-vibration: Comparison of human capacity with response patterns of mechanoreceptive afferents from the monkey hand", J. Neurophysiol., 31, 301-334.
Tanner, W. P., Jr., Swets, J. A., 1954, "A decision-making theory of visual detection", Psychol. Review, 61, 401-409.
Wagner, H. G., MacNichol, E. F., Jr., Wolbarsht, M. L., 1960, "The response properties of single ganglion cells in the goldfish retina", J. Gen. Physiol., 43, Suppl., 115-128.
Wiesel, T. N., Hubel, D. H., 1963, "Single cell responses in striate cortex of kittens deprived of vision in one eye", J. Neurophysiol., 26, 1003-1017.
Wiesel, T. N., Hubel, D. H., 1965, "Comparison of the effects of unilateral and bilateral eye closure on cortical unit responses in kittens", J. Neurophysiol., 28, 1029-1040.
Wohlgemuth, A., 1911, "On the after-effect of seen movement", Brit. J. Psychol. Monograph Suppl., 1, 1-117.
Woodward, P. M., 1953, Probability and Information Theory with Applications to Radar (Pergamon Press, Oxford).
Yarbus, A. L., 1965, Eye Movements and Vision, translated from the Russian by Basil Haigh (Plenum Press, New York).
Zangwill, O. L., 1961, "Lashley's concept of cerebral mass action", in Current Problems in Animal Behaviour, Eds. W. H. Thorpe, O. L. Zangwill (Cambridge University Press, Cambridge).
Reprinted with permission from Vision Research, Vol. 30, No. 11, pp. 1561-1571, 1990. © 1990 Pergamon Press
CONDITIONS FOR VERSATILE LEARNING, HELMHOLTZ'S UNCONSCIOUS INFERENCE, AND THE TASK OF PERCEPTION
HORACE BARLOW
Physiological Laboratory, Cambridge, CB2 3EG, U.K.
(Received 10 August 1989; in revised form 1 March 1990)
Abstract—It is a mistake to consider perception and learning separately because what one learns is strongly constrained by what one perceives, and what one perceives depends on what one has experienced. I shall propose the hypothesis that perception is the computation of a representation that enables us to make reliable and versatile inferences about associations occurring in the world around us—that is, perception prepares the ground for learning. The statistical problem in learning is to determine whether a compound event such as "C followed by U" is a random co-occurrence or a significant association, for if it is the former it would be a mistake to pay any particular attention to C, whereas if it is the latter C is a conditional stimulus for U and a useful predictor for it. Now you cannot decide whether the association is random or not without knowledge of the prior probabilities of C and U; hence on my hypothesis when you perceive an object or event the representation must not only signal "it's there" or "it's happened", but must also make evident (or rapidly accessible) the prior probability of what has been signalled. Furthermore it must do this for all the objects or events that can act as conditional stimuli, and this implies that the representative elements should be statistically independent (or approximately so) in the normal environment. Forms of coding that would do this, and the relationship with Helmholtz's unconscious inference, will be discussed. These considerations imply that the task performed in perception has been overlooked both by learning theorists and by connectionists working on associative and adaptive networks. Coding for independence may be particularly important in understanding the developmental processes during the sensitive period; it may be the operation that leads ontogenetically-timed, activity-dependent connections to imprint appropriate codes if the animal has experience, but inappropriate codes without experience.
Coding; Conditioning; Cortex; Representation; Sensitive period; Inference; Helmholtz; Prior probability; Perception
THE RELATION OF PERCEPTION TO LEARNING
I decided to talk on this topic with some trepidation because Gerald knows his Helmholtz so much better than I do and does not, I suspect, trust a non-German speaker to get him right. However Helmholtz expressed himself with unrivalled clarity and the ideas I shall propose are directly descended from his well-argued proposal that percepts represent unconscious inferences, so I cannot avoid bringing him in and must risk Gerald's criticisms.
My argument is, briefly, that to understand perception one must view it as a prologue to learning. Acquiring new knowledge of the world is among the most important things our brains do for us, and for most people at this meeting it is probably the most interesting thing it does. So I shall try out on you the idea that perception is the process of preparing a representation of the current sensory scene in a form that enables subsequent learning mechanisms to be versatile and reliable.
I shall assume that learning is based on what we perceive, and that the cerebral cortex is where the representation we perceive is computed, even though neither assumption is 100% certain: McCormick, Lavond, Clark, Kettner, Rising and Thompson (1981) and Yeo, Hardiman and Glickstein (1985) have shown that conditioning of the nictitating membrane response in the rabbit occurs in the cerebellum, and I am sure that many other forms of learning can occur without the learner consciously perceiving the sensory stimulus that is learnt. Nevertheless our perceptions certainly provide much of the information from which we learn, and the cerebral cortex must create the representations used for this purpose. This is a sufficient basis for my argument, though one should be aware that other types of representation and learning do occur.
Most people are familiar with the idea that representations vary in their completeness and accuracy, for these determine the results of tests of resolution, Weber fraction, and so forth. And the idea of a transformed representation, such as that provided by the coefficients of a Fourier transform, is now almost too familiar. I want to examine a completely different question, namely "what properties should a representation have in order to make it suitable for use by subsequent learning mechanisms?" It is often taken for granted that any complete representation would do, but this is not the case and I shall start by considering what information is needed simply to establish that an association exists. A model of efficient learning should allow access to all this information, but associative network models based on Hebbian synapses can only access some of it. Because of this they do not account for the versatility of learning, and it seems to me that this is the aspect that is most remarkable in higher mammals. We have an astonishing store of knowledge about the associative structure of the world around us, and can recognise changes very readily; I think Helmholtz's unconscious inference results from automatic access and use of this store, and it is this that makes perception so effective for learning. These ideas also suggest a new hypothesis about the puzzling defects that result from deprivation of experience during the sensitive period.
THREE REQUIREMENTS FOR RELIABLY DETECTING PREDICTIVE ASSOCIATIONS
The formation of a conditioned reflex may be taken as a paradigm for the detection of a predictive association. To do this reliably the brain must determine that the conditional stimulus C precedes the unconditional stimulus U significantly more often than would be expected from the overall probabilities with which the two events have occurred in the past. This obviously requires knowledge of the occurrences of the sequence (U following C) and some means of estimating how often this happens, but it also requires knowledge of the past occurrences of U and C and estimates of the rates at which they have occurred. One might question whether these three requirements must really be met, but I think it can be seen at once that predictive associations derived from less complete information would
be less reliable, and that an animal using an inefficient method would be at a disadvantage compared with one that made the correct computation; it would either detect fewer of the associations that were genuinely present, or it would attach importance to accidental associations, or it would make more errors of both kinds. Detecting predictive associations can bring enormous advantages, so it must be a very competitive business; of course no brain does the computation perfectly, but when considering the methods brains may use it is sensible to have in mind the requirements for the correct operation. Now consider the implications of these requirements: the site at which learning takes place must be influenced by the number of times C and U have each occurred previously, and also by the frequency of their joint occurrence in the correct temporal relation. Suppose these numbers are used to form estimates of the probabilities P(C), P(U) and P(U.C), where (U.C) symbolises a compound event, namely the joint occurrence of U and C in the correct temporal relationship: then the coincidence is significant and not random if N x P(U.C) significantly exceeds N x P(U) x P(C), N being the number of possible occurrences of (U.C) in the period under consideration. Questions naturally arise about the time scales over which these counts and estimates are made, for learning can occur in seconds, or may require years. It is probable that an efficient system would need to make estimates in parallel over several different times, and this is certainly a topic we need to know more about, but the logic of inferring that C predicts U requires some form of probability estimate so let us concentrate on this aspect.
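A minimal sketch of the criterion just stated, using invented counts: estimate P(C) and P(U) from their separate occurrence counts, form the expected number of chance coincidences N x P(U) x P(C), and treat the observed number of joint occurrences as significant when it is improbably large under a Poisson approximation. The function names, the Poisson assumption, and the threshold are mine, not the paper's.

    import math

    def poisson_tail(k, mean):
        """P(X >= k) for Poisson(mean); the chance that k or more coincidences are accidental."""
        term, cdf = math.exp(-mean), 0.0
        for i in range(k):
            cdf += term
            term *= mean / (i + 1)
        return max(1.0 - cdf, 1e-300)

    def association_is_significant(n_c, n_u, n_joint, n_periods, alpha=0.001):
        """Decide whether 'C then U' occurs more often than chance.
        n_c, n_u: counts of C and of U; n_joint: count of the compound event (U.C);
        n_periods: number of opportunities N.  alpha is an arbitrary criterion."""
        p_c, p_u = n_c / n_periods, n_u / n_periods
        expected = n_periods * p_c * p_u          # N x P(U) x P(C)
        return poisson_tail(n_joint, expected) < alpha

    # 10,000 periods; C occurred on 200 of them, U on 300; they coincided 40 times.
    print(association_is_significant(200, 300, 40, 10_000))   # expected ~6 by chance, so True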
Figure 1 shows in outline how Hebb (1949) suggested that these requirements might be met by a nerve cell. He postulated that joint pre- and post-synaptic activity strengthens synapses, and the main merit of this suggestion is that it identifies the synapse between the input for the conditional stimulus and the output neuron as the site where the conjunction of U and C produces lasting effects. This is widely accepted, even though we do not yet know just what these lasting effects are, nor even whether they affect the pre-synaptic terminal, post-synaptic mechanisms, or both. The unconditional stimulus U and its frequency of occurrence could also produce lasting effects at this synapse because it is assumed that U by itself fires the post-synaptic neuron, and that the membrane at the site of the modifiable synapse is depolarised when this occurs; again we do not know what these lasting effects are. The occurrence of C is obviously signalled at the pre-synaptic terminal, and again it could produce lasting effects there, or on post-synaptic mechanisms. But it is worth introducing immediately a rather different possibility for the way that P(C) is computed and signalled. Sensory messages often show habituation: they decrease in strength when a stimulus is repeated many times at short intervals. It is tempting to regard this as the means by which the prior probability of C is taken into account, strong signals with many impulses being given for rare events and weak signals with few impulses when the event signalled has happened frequently in the recent past. Although habituation occurs over time scales of seconds or minutes in the examples familiar from neurophysiological recording, one cannot exclude the possibility that much slower forms also occur, for they would be hard to observe in such experiments. One knows that sensory stimuli that have become familiar over much longer times are ineffective as conditional stimuli in learning experiments, which implies that P(C) can be estimated over these longer times, so it is tempting to suppose that this is caused by much slower habituation mechanisms that we do not yet know about physiologically. This would fit well the notion I shall develop shortly that the provision of estimates of P(C) for the elements of the representation is an important part of perception.
Fig. 1. Diagram showing Hebb's proposal that associations are detected and stored at the junction between synapses from afferents carrying information about the conditional stimulus on to a post-synaptic neuron. All the required information is available at this site, and his proposal is now widely accepted. (Labels in the figure: Unconditional stimulus U, which always fires the post-synaptic neuron; Conditional stimulus C.) Rules required for weight changes:
  Pre-synaptic   Post-synaptic   Weight change
  Active         Active          Increase
  Active         Inactive        Decrease
  Inactive       Active          Decrease
  Inactive       Inactive        No change
Hebb only postulated an increase in synaptic efficacy with joint pre- and post-synaptic activity, as specified in the top line of the table at the bottom of Fig. 1, but a decrease of the transmission across the synapse when C or U often occur separately is needed if the mechanism is to identify predictive stimuli correctly. It is also needed to model the extinction of a conditional response when reinforcement is withheld, or when reinforcement occurs too often without the conditional stimulus.
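The rules in the table above translate directly into a toy weight-update procedure. This is only a paraphrase of the table with arbitrary step sizes; it is not a claim about the actual synaptic mechanism.

    # Sketch of the extended Hebbian rule of Fig. 1: strengthen on joint activity,
    # weaken when either side is active alone, leave unchanged when both are silent.
    def update_weight(w, pre_active, post_active, step=0.01):
        if pre_active and post_active:
            return w + step        # C and U coincided: evidence for an association
        if pre_active or post_active:
            return w - step        # C or U occurred alone: evidence against it
        return w                   # neither occurred: no change

    w = 0.0
    history = [(True, True)] * 30 + [(True, False)] * 10 + [(False, True)] * 5
    for pre, post in history:
        w = update_weight(w, pre, post)
    print(round(w, 3))             # net strengthening only because coincidences dominate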
The most satisfactory development of the Hebb-type model is that of Sutton and Barto (1981), which includes suggested mechanisms for ensuring that C and U have the appropriate temporal relationship. Furthermore that paper showed the connection between this type of model, the learning theory of Rescorla and Wagner (1972), and the Widrow and Hoff (1960), L.M.S., or delta rule of adaptive networks. Here I want to go off on a different tack and consider how to make more versatile models based on the spirit of the Hebbian principle, for I think both learning theorists and those working on adaptive or associative networks have failed to consider some of the essential features that make this possible.
MAKING THE MODEL MORE VERSATILE
In Fig. 1 it was assumed that the conditional signal C was already known and the problem was simply to find whether or not it predicted the unconditional stimulus U. The natural way of extending it would be by adding conditional inputs in parallel, as shown in Fig. 2; the post-synaptic neuron might then be conditioned to respond to many alternative stimuli such as "bell", "whistle", "tuning fork", "G#", "flashing light", "foot-pinch" etc. But for learning in an advanced mammal it is totally unrealistic to assume prior knowledge of all possible conditioning stimuli, and it must be the versatility of our learning, based in my view on our perceptual capacities, that makes us pre-eminent in this way. We must therefore seek other ways of extending the model that will enable a large number of previously unknown conditional stimuli to be used, bearing in mind the requirement for estimates of P(C) for all such stimuli if learning is to be done efficiently.
Fig. 2. Possible extension of Hebb's scheme to make learning more versatile. One difficulty is to provide afferents of word-like specificity; another is that all these possible conditional stimuli would have to be known in advance in order to be wired in.
As well as the assumption that all possible conditional stimuli are known in advance there are other things wrong with Fig. 2. We have labelled the inputs to a neuron with words, but the fact that an appropriate word exists does not make it reasonable to postulate that the dog's brain has an input line with appropriate selectivity. Our facility with words has tricked us into making this step, but it isn't a simple one, and labelling the lines as in Fig. 2 just evades the problem. The model is inadequate in yet other ways, for it would not show any initial generalisation of conditioning from one stimulus to others sharing similar qualities, nor is there any obvious mechanism for subsequent narrowing of the class of effective stimuli. But although this extension of Fig. 1 does not provide what we are looking for, it does illustrate the enormous gulf that exists between simple cellular models of learning and the real thing. Perhaps it also points to the nub of the problem, namely how to create a representation whose elements would give the same versatility as is provided by labelling the input lines of Fig. 2 with words.
George Boole (1854) thought that his logical functions composed with logical variables satisfactorily formalised the relation between a word and the set of sensations that it symbolised. It might be reasonable to suppose that the inputs to the nerve cell of Fig. 2 correspond to individual items of sensation, so to approach the versatility of word-labelling we need to find a learning system that can use logical functions of its input lines; such logical functions could then be analogous to words. Of course no-one has ever claimed that there is a word for every possible logical function of a set of sensations, so we need not make the impossible demand that our learning model be capable of using every possible logical function of its input lines as a conditional stimulus, but it should be
able to use a reasonable number of them: for instance it should be able to use as a conditional stimulus some at least of the possible conjunctions of inputs. This is precisely what is claimed for associative nets, so we must take them seriously as candidates for achieving versatility.
Fig. 3. An associative net (from Longuet-Higgins, Willshaw & Buneman, 1970). The input lines (A1-A8) run horizontally; the output lines (B1-B8) run vertically. Filled and open circles represent synapses that have or have not been turned on. Four associations have been recorded: A1, A2, A3 with B4, B6, B7; A2, A5, A8 with B1, B5, B7; A2, A4, A6 with B2, B3, B6; and A1, A3, A7 with B3, B4, B8. This is an efficient way of storing associations between input and output vectors, provided the probability of any input line being on is not too high, but it is not clear how the prior probabilities of input vectors could be taken account of, and this is necessary for efficient association formation.
ASSOCIATIVE NETWORKS
Figure 3 shows a neural network which associates inputs on the horizontal A lines with outputs on the vertical B lines. In discussing the problem of constructing "... an associative information store which can learn to associate very many pairs of conditional and unconditional stimuli ..." Longuet-Higgins, Willshaw and Buneman (1970) claimed that this provided an "entirely satisfactory" solution. But there are snags. As we have seen, to do the job of detecting associations properly one should have access to the prior probabilities of the possible conditional stimuli, for without this one cannot tell whether an association is random or genuine; hence in this case the mechanism should have access to the prior probabilities of all the input vectors it can use, not just the probabilities of their components. It is perfectly reasonable to suppose that each synapse should have access to the prior probability of firing of its pre-synaptic input, but it is impossible to obtain the
probability of a vector from the probabilities of its components unless these are statistically independent. The inescapable conclusion is that, if associations with conjunctions of inputs are to be discovered efficiently, these inputs must be statistically independent of each other under normal conditions of use. Note that independence does not ensure that the prior probabilities of vector inputs are made use of; it merely prevents this being impossible.
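For readers who want to see the storage and recall steps of Fig. 3 spelled out, here is a small sketch of a Willshaw-style net with binary synapses. The sizes and example vectors are invented, and the recall threshold (every active input line must have a switched-on synapse) follows the usual description of such nets rather than anything stated in this paper.

    import numpy as np

    n_in, n_out = 8, 8
    W = np.zeros((n_in, n_out), dtype=bool)      # binary synapses, all initially off

    def store(a, b):
        """Switch on every synapse at the crossing of an active input and an active output line."""
        global W
        W = W | (a[:, None] & b[None, :])

    def recall(a):
        """An output line fires if every currently active input line has a switched-on synapse to it."""
        drive = (W & a[:, None]).sum(axis=0)     # active inputs connected to each output
        return drive >= int(a.sum())

    A = np.array([1, 1, 1, 0, 0, 0, 0, 0], dtype=bool)   # e.g. lines A1-A3
    B = np.array([0, 0, 0, 1, 0, 1, 1, 0], dtype=bool)   # e.g. lines B4, B6, B7
    store(A, B)
    print(recall(A).astype(int))   # recovers B, provided not too many patterns are stored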
Note also that although ignorance of prior probabilities makes efficient learning impossible, this does not imply that these networks cannot learn at all. The proofs that nets and perceptrons converge on the right solutions (e.g. Rosenblatt, 1959; Minsky & Papert, 1969; Longuet-Higgins et al., 1970; Anderson, Silverstein, Ritz & Jones, 1977) are not cast in doubt, but because they cannot have access to prior probabilities they will be slow and error-prone compared with efficient methods of association detection that make use of this additional information.
I think we need to separate the mechanism that makes P(C) available for all usable conditional stimuli from the learning mechanism that changes connectivity when there is evidence for a genuine association between C and U. The first mechanism corresponds roughly to perception, and I think it is this that is responsible for the versatile behaviour of higher mammals. The second corresponds to the simplest forms of learning, and on this view there is nothing very surprising in the suggestion that this is equally good in all subhuman vertebrates (Macphail, 1982). If this distinction has anything to it the name Rosenblatt (1959) chose for his associative network, the perceptron, is singularly unfortunate, for the insight it gives does not apply to perception but to learning.
To summarise, learning an association is a definable statistical task and it is possible to specify what is required to do it efficiently. The difficulty that arises is to make accessible the prior probabilities of all input patterns that can act as conditional stimuli; this seems to require that the elements of the representation should be independent of each other in the normal environment, and this is not a problem that the current generation of associative networks tackle. Without it I think they do less than half the job; they may model learning, but they do not begin to model perception. There are, however, some other early ideas that are relevant to this problem.
SHOUTING FOR ATTENTION
Thirty-one years ago Oliver Selfridge (1959) proposed a learning model called Pandemonium (Fig. 4). It had computational demons tuned to detect features of a stimulus, each of which shrieked with an intensity dependent on how closely the actual stimulus resembled the feature to which it was tuned; a set of cognitive demons combined these shrieks with weights adjusted according to the resemblance of the set of shrieks to a paradigm such as a letter of the alphabet, and a decision demon then selected the loudest shrieking cognitive demon as the recognised letter. I think the key element here is that the demons shriek with a loudness that is supposed to indicate directly the importance of their message for the next stage of processing; they not only signal, they also attract attention. Since I am arguing that the prior probability of the current scene is one factor that determines its importance for association formation we need a Probabilistic Pandemonium in which the shrieks signal definite attributes of the stimulus as in Selfridge's model, but their loudness has a probabilistic interpretation; they signal how unexpected the occurrence of an attribute is on the evidence given by past history and the current presence of other attributes. When an unconditional stimulus U arrives, it is the demons which have just shrieked loudest that should be searched for possible conditional stimuli, for their low prior probability means that the expected number of co-occurrences with U is also low and there is therefore likely to be a genuine new causal factor behind the coincidence.
Another idea relevant to this line of thought is the novelty filter, as described, for instance, by Kohonen (1984). Such a device learns the set of usual images entering its input, and is able after a time to suppress those that have previously occurred. This idea has an initial appeal, but one does not want to suppress completely the non-novel inputs; what is needed is a filter that makes novelty salient, but which continues to transmit an image that is usable for ordinary purposes. Selfridge's model used fixed feature filters, and there is plenty of scope for ontogenetically determined structure in the connections of sensory pathways in addition to the plastic component for which there is evidence, as described below (p. 1568). But the point of interest here is the kind of outputs a model of perception should have in order to enable
Fig. 4. Selfridge's Pandemonium (1959), as depicted by Lindsay and Norman (1977). This scheme for pattern recognition has many interesting adaptive aspects, but the feature of interest here is that each demon shrieks with loudness determined by the fit of the data to the pattern it represents, so they signal the importance of their messages as well as the presence of the pattern. In a probabilistic pandemonium the shrieks would be proportional to -log P, where P is the probability of occurrence of the feature the demon detects.
associative nets to be more versatile and efficient at learning; this does not depend on the preformed or plastic nature of the connections that precede the output.
Probabilistic Pandemonium
Figure 5 is a somewhat fanciful diagram illustrating the complexity of Perception compared with Learning. The perceptual outputs which act as inputs to the associative process are logically binary, that is they signal the presence of a particular feature of the input when they fire, but let them also be capable of a graded discharge in which the number of impulses depends upon the probability P of the feature being present, based upon the past history of excitation and also upon the other features that are present in the current input. Furthermore let us assume that this graded response approximates to -log P, so the perceptual output neuron shrieks loudly when its feature is unexpectedly present, softly when it is present but in circumstances such that this is not surprising. Finally let us suppose that each of these perceptual output elements fires independently of the others as long as the system is in an environment to which it had adapted. I'll say a little more about how this might be achieved later, but first
Fig. 5. Diagram illustrating the complexity of perception and the simplicity of learning. The task of providing a representation whose elements are independent, and of signalling -log P for all of them, is much more difficult than the simple associative task of learning. Note, however, that the two are probably not completely separable, for the results of the associative process are likely to influence the demons of the probabilistic pandemonium through feedback.
I'll say a little more about how this might be achieved later, but first look at how desirable the output of such a perceptual representation would be to act as the input to the next, associative, stage.

The impulse frequency on any active perceptual output line gives -log P, where P is the probability of the event or happening signalled by that line; in other words it signals how unexpected the feature or event is, given the past history of its occurrence and the other sensory events that are occurring. As we have seen, this is just what is needed in order to assess whether the co-occurrence of this event with another event U is random or not. But this representation does more than provide this for each single output line: because the outputs are statistically independent, the probability of a combination of them is the product of their individual probabilities, so the prior probability of all the outputs together or of any subset of them is simply the sum of the number of impulses each is firing. This idea of a probabilistic pandemonium is obviously informal and preliminary and is based on the fact that likelihoods can be multiplied or their logs added. The use of likelihoods to decide statistical problems goes back to Fisher (1925), and Kullback (1978) relates likelihoods to modern information theory and develops the methods further. If one considers a plausible physiological realisation, the requirement that the demons shriek with loudness proportional to -log P would not be too hard to approximate, for any mechanism of adaptation or habituation discounts frequently repeated events and thus leads to something like the desired "unexpectedness" signal. The requirement for independence is harder.
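As a purely illustrative check on the arithmetic of this scheme (the feature names and prior probabilities below are invented, and the code is a sketch rather than a model of any neural mechanism), one can verify that demons whose loudness is -log P support exactly this additivity: for independent features, the summed loudness of any subset equals the surprisal of their joint occurrence.

import math

# Hypothetical, statistically independent feature demons and their priors
# (all numbers invented for illustration).
priors = {"vertical_edge": 0.5, "red_patch": 0.1, "looming": 0.01}

def shriek(p):
    # loudness ~ surprisal: -log P (natural log; any base would do)
    return -math.log(p)

loudness = {f: shriek(p) for f, p in priors.items()}

# For independent features, the prior of a combination is the product of the
# individual priors, so its surprisal is simply the sum of the loudnesses.
subset = ["red_patch", "looming"]
joint_prior = math.prod(priors[f] for f in subset)
summed_loudness = sum(loudness[f] for f in subset)

assert abs(shriek(joint_prior) - summed_loudness) < 1e-12
print(loudness)
print("joint surprisal of", subset, "=", round(summed_loudness, 4))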
INDEPENDENCE

Independence of the output signals implies that the presence of other outputs does not affect the significance (i.e. the prior probability) of any given one; positive or negative correlations among the sensory inputs thus have to be taken into account. If this approximate independence of the representative elements is achieved, then the proposed representation can show how unexpected a signal is, given the other sensory stimuli that are present. Two methods of approximating independence can be suggested.
Decorrelation

One method is to change the coordinate system used to represent sensory variables (Barlow, 1989; Barlow & Foldiak, 1989). It is a plausible process physiologically because it could be achieved by having mutually inhibitory connections between the outputs whose strength increases when these outputs are correlated, along the lines earlier suggested by Wilson (1975); thus it would do what ordinary lateral inhibition does, but instead of being fixed, the strengths of all the inhibitory interconnections would be increased until the outputs were no longer correlated. Instead of such a negative feedback process, decorrelation could result from regulated positive feedback between outputs, the strength of which diminishes when the feedback connections help to fire a neuron (anti-Hebbian positive feedback). Adaptation to patterns, and contingent adaptation as in the McCollough effect, are thought to result from such mechanisms.
Decorrelating networks of this sort could only handle small subsets of the sensory input, for it would be too vast a job to handle much of it at once. In addition, note that it only handles pairwise correlations, and would be insensitive to triples or larger groups of inputs that might be associated.
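A minimal numerical sketch of the kind of anti-Hebbian scheme just described may make the idea concrete. It is not the model of Barlow and Foldiak (1989); the two-channel input, learning rate and number of trials are arbitrary. Mutually inhibitory lateral weights grow in proportion to the momentary product of the outputs, and the inhibition settles at roughly the strength that leaves the outputs uncorrelated.

import numpy as np

rng = np.random.default_rng(0)

# Toy input: two correlated channels (invented covariance, for illustration only)
n_samples = 20000
mix = np.array([[1.0, 0.8], [0.8, 1.0]])
x = rng.standard_normal((n_samples, 2)) @ np.linalg.cholesky(mix).T

W = np.zeros((2, 2))     # lateral inhibitory weights, zero diagonal
eta = 0.005

for t in range(n_samples):
    y = np.linalg.solve(np.eye(2) + W, x[t])   # settle the recurrent inhibition: y = x - W y
    dW = eta * np.outer(y, y)                  # anti-Hebbian: inhibition grows while outputs covary
    np.fill_diagonal(dW, 0.0)
    W += dW

y_all = x @ np.linalg.inv(np.eye(2) + W).T
print("input correlation :", round(np.corrcoef(x.T)[0, 1], 3))
print("output correlation:", round(np.corrcoef(y_all.T)[0, 1], 3))

The printed output correlation ends up close to zero: the lateral inhibition has absorbed the correlation that was present in the input, which is all that this toy version is meant to show.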
Minimum entropy coding

This goes about the task of obtaining a set of independent representative elements by imposing two constraints on the code: first it must be reversible, and second the entropy calculated from the probabilities of the representative elements must be as low as possible (Barlow, 1989; Barlow, Kaushal & Mitchison, 1989). This entropy is always greater than the true entropy of the output calculated from the probabilities of all the output states, unless the representative elements are completely independent of each other; hence by finding a reversible code that reduces the entropy calculated from the representative elements one can diminish the mutual dependencies between these elements. It is a much more general method than decorrelation, but although it has the flavour of a problem suitable for a synthetic neural net it is hard to imagine a real neural mechanism for generating these codes.
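The following toy calculation (distribution and recoding invented for illustration; this is not the search procedure of Barlow, Kaushal and Mitchison) shows the property the argument relies on: for a reversible code the summed entropy of the individual elements can never fall below the true joint entropy, and it reaches that bound only when the elements are independent, so lowering the summed entropy removes dependence.

import math

def h(probs):
    # Shannon entropy in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy joint distribution over two correlated binary inputs (a, b); numbers invented.
p = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

def summed_element_entropy(code):
    # Sum of the entropies of the individual output elements of a reversible code.
    total = 0.0
    n_out = len(code(0, 0))
    for i in range(n_out):
        marg = {}
        for (a, b), pr in p.items():
            v = code(a, b)[i]
            marg[v] = marg.get(v, 0.0) + pr
        total += h(marg.values())
    return total

identity = lambda a, b: (a, b)       # leaves the correlated inputs as they are
xor_code = lambda a, b: (a, a ^ b)   # reversible: (a, c) -> (a, a ^ c)

true_H = h(p.values())
print("true joint entropy       :", round(true_H, 3), "bits")
print("summed entropy, identity :", round(summed_element_entropy(identity), 3))
print("summed entropy, XOR code :", round(summed_element_entropy(xor_code), 3))
# The summed entropy never falls below the true entropy; the XOR code reaches
# the bound here because its two output elements happen to be independent.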
Multi-stage recoding

The above recoding methods are done in a single stage, though they could be repeated in the sort of way that the organisation of visual areas in the cortex suggests (Zeki, 1978; Barlow, 1981). By multi-stage recoding one might obtain
representations which decorrelated associations between orientation and colour, or between stereo and motion depth cues; illusions to be mentioned below suggest that such mechanisms are present.

At first one might think that such recoding mechanisms would make it impossible to keep track of the prior probability of newly generated pattern elements: how would one know the prior probability of an element that received inputs, some inhibitory and some excitatory, from motion, colour and disparity selective units at earlier levels? This is not a serious problem, because once the pattern selectivity of an element has been established its prior probability can be determined afresh, simply by waiting to see how often it fires. Admittedly this would be inefficient, because it would not make use of experience ante-dating the establishment of the pattern selectivity, but it would avoid the need for working out the probability of what might be a very complicated logical function from the probabilities of its components. Thus it seems quite possible that the repeated application of a principle as simple as decorrelation would lead to a representation that was a good approximation to that postulated in the Probabilistic Pandemonium.

Figure 5 shows some of the tasks perception must achieve in order to give associative networks the versatility that has been claimed for them. Of course the diagram only poses the problem, but I hope it suggests to neuroscientists and psychophysicists the important role perception plays in giving higher mammals, especially humans, their intellectual pre-eminence, and I hope it reminds connectionists how much preprocessing of sensory input is required before their adaptive networks will function efficiently.

THE ROLE OF PLASTICITY AND THE SENSITIVE PERIOD

I think there is good experimental evidence for two types of plasticity in the physiological mechanisms of perception. First there is the rapid adaptation or habituation that I have so far talked about, and which we think tends to make the representative elements uncorrelated in the recently experienced environment (Barlow, 1989; Barlow & Foldiak, 1989). The illusions which provide the psychophysical evidence for such a process require adaptation times of the order of a minute or so to produce quite marked effects, and following such exposures the illusion usually vanishes in a few minutes with normal use of the eyes. It is true that very long-lasting and powerful adaptation can produce effects persisting for days or more, but one cannot be sure this does not involve other mechanisms.

The other form of plasticity is that, long known to ophthalmologists, whose physiological basis was revealed by Hubel and Wiesel (1970) in their celebrated experiments on visual deprivation in kittens. In contrast with the first form of plasticity this requires a longer period to induce, it occurs mainly during a restricted sensitive period early in life, and the results persist indefinitely. Note in particular that plasticity in the sensitive period increases sensitivity to the inducing experience, whereas the other process actively desensitises the system to the adapting stimulus. Foldiak (1989, 1990) has developed a network of elements that receive inputs through simple Hebbian synapses and interconnect with each other through anti-Hebbian synapses. Possibly the rapid anti-Hebbian decorrelating mechanism and the slower Hebbian mechanism of the sensitive period may work synergistically in the following manner. During the sensitive period pathways that are active become permanently connected, but the decorrelating mechanism (if it works the same during the sensitive period as in adult life) influences which pathways are active and can thereby determine the permanent pattern of connections that is established. To caricature the suggested process, decorrelation ensures that different commonly occurring patterns of sensory stimulation each stimulate different cortical neurons, because if they failed to do so commonly occurring patterns would cause correlated outputs from two or more neurons. The Hebbian mechanism then ensures that the pattern of connections made to an activated neuron comes from the inputs that successfully activated it; thus the decorrelation mechanism helps to determine the pattern of connectivity that is permanently laid down. If the animal is deprived of experience during the sensitive period the Hebbian process presumably still operates, but there are no correlations in the spontaneous maintained activity of the inputs resulting from patterns in the outside world, so there is nothing for the decorrelating mechanism to work on. The consequence will be a somewhat disordered pattern of connectivity guided only by the ontogenetic mechanisms, without any adaptation to the
statistical characteristics of the normal sensory input. This seems a very intriguing possibility that might reconcile the conflicting views about mechanisms, consequences, and purpose of the sensitive period (Movshon & Van Sluyters, 1981), and it could turn out to be a model for the influence of experience on connectivity elsewhere in the brain.
Table 1

                   Inductive inference           Perception
Major premiss      All men are mortal            Stimulation of the temporal retina always results
                                                 from luminous objects in the nasal field
Minor premiss      Caius is a man                The temporal retina is being stimulated
Conclusion         Therefore Caius is mortal     Therefore there is a luminous object in the
                                                 nasal field

UNCONSCIOUS INFERENCE

The link between the suggestions I have made and Helmholtz's views about unconscious inference or induction will be obvious to anyone familiar with his writings, but one cannot take it for granted, even with this audience, that everyone has read the Treatise on physiological optics (Helmholtz, 1925) from cover to cover, and Vol. III is the part most likely to have been skipped. Warren and Warren (1968) have collected together Helmholtz's main writings on perception, and I think it is surprising that his views about unconscious inference are so rarely quoted or discussed. In his own writings one can perhaps detect a note of disappointment that the philosophers, whom he always took seriously even when he disagreed with them, seem to have dismissed the idea of unconscious inductive inference for the apparently trivial reason that induction is a process necessarily conducted with the conscious use of words. Helmholtz argued, using examples from a wide field, that our percepts have a status analogous to the conclusions that are drawn by the process of inductive inference, the sole difference lying in the use of words to express major premiss, minor premiss, and conclusion in the latter case. I think his argument is correct, and it is a major inspiration for the views developed here. Thanks to Fisher (1925) and his followers the logic of induction is now understood very much better than in Helmholtz's day, and I have tried to use this understanding to draw conclusions about the operations that must necessarily occur in perception if it is to do what Helmholtz said it did, so let us examine what he said more closely. A very simple example of the analogy he draws is given in Table 1, which shows how straightforward syllogisms following the acceptance of the major premisses lead to the conclusion that Caius is mortal, or that there is a luminous object in the nasal visual field. Helmholtz used this example to explain why one refers the excitation to the nasal visual field even
when it is actually caused by mechanical pressure on the temporal retina. However to "see" retinal stimulation in the position in the visual field that normally causes such excitation seems such a straightforward phenomenon that one is initially unwilling to attach much importance to it, let alone to call it an inference. But when you realise that a few days wearing inverting glasses changes the "always" condition in the major premiss, and correspondingly changes the position to which the excitation is referred, then it becomes difficult to call it anything else. Stratton (1897) published his account of the effects of wearing inverting glasses just after Helmholtz had died and long after he had formulated his arguments, but I do not believe he made any clear reference to Helmholtz's theory, even though it is hard to conceive any more dramatic verification of its predictions. Of course it is still a mystery how the major premiss is changed as a result of experience, and how most of the complex inferential structure of perception is preserved, becoming adapted to the new conditions simply by this change in the accepted major premiss; perhaps we can dimly foresee a day when the hallowed subject of logic will be recognised as an idealisation of physiological processes that have evolved to serve a useful purpose. The extent to which our perceptions depend upon normal experience as a reference point does not need emphasising to this audience, but it comes as a continual surprise (even to me) to realise how the principle applies to minute associative details, as proved by a host of illusions which I can do no more than mention. Motion and tilt after-effects; the McCollough effect; micropsia with accommodative effort induced by minus lens or drugs; ditto with the convergence effort induced by prisms; the "toytown" effect induced by an unnaturally large
range of disparities in a stereo scene; the reverse apparent motion when stereo parallax is present but the motion parallax expected from a movement is absent. All these illusions and many more can be explained as inferences that become false through the failure of a previously valid perceptual major premiss of the type shown in Table 1. They show what an astonishingly deep knowledge of the normal patterns of associated activation our visual system possesses and automatically uses. The decorrelation model for approximating independence of the representative elements of perception would provide a means of storing this information in the form of the "anti-Hebbian" coefficients of interaction required to decorrelate. At the moment we know that cortical neurons show pronounced adaptation or habituation to patterned stimuli, but we cannot be sure that this interpretation of pattern adaptation is correct. The attempt to prove or disprove the hypothesis might bring some new life to cortical neurophysiology for it might bring out a common feature of cortical processing: the detection of new causal factors in the environment from the new associations they cause among cortical afferents.

CONCLUSION
I think we have neglected the important role that perception must play in providing a representation that promotes the efficient learning of predictive associations. Ignoring this necessary preprocessing of the input is a serious defect both in current learning theory and in work on adaptive networks. But the argument I've given here probably does not go far enough, for it still leaves perception and learning as two processes almost as separate as they seem to be in current thinking. In reality it is pretty certain that what we learn influences what we perceive; in other words the demons of the probabilistic pandemonium are not only sensitive to the statistical structure of the input, but must also be influenced by fear of flogging and hopes of bribery administered on the basis of the results they have delivered. A composite, multi-stage process, exploiting direct instruction as well as the statistical structure of the input, might begin to model the astonishingly versatile and useful representation that real perception gives us.
REFERENCES

Anderson, J. A., Silverstein, J. W., Ritz, S. A. & Jones, R. S. (1977). Distinctive features, categorical perception, and probability learning: Some applications of a neural model. Psychological Review, 84, 413-451.
Barlow, H. B. (1981). Critical limiting factors in the design of the eye and visual cortex (the Ferrier Lecture 1980). Proceedings of the Royal Society, London, B 212, 1-34.
Barlow, H. B. (1989). A theory about the functional role and synaptic mechanisms of visual after-effects. In Blakemore, C. (Ed.), Vision: Coding and efficiency. Cambridge: Cambridge University Press.
Barlow, H. B. & Foldiak, P. F. (1989). Adaptation and decorrelation in the cortex. In Durbin, R., Miall, C. & Mitchison, G. J. (Eds.), The computing neuron. Mass.: Addison-Wesley.
Barlow, H. B., Kaushal, T. P. & Mitchison, G. J. (1989). Finding minimum entropy codes. Neural Computation, 1, 406-416.
Boole, G. (1854). An investigation of the laws of thought. New York: Dover Publications reprint.
Fisher, R. A. (1925). Statistical methods for research workers. Edinburgh: Oliver & Boyd.
Foldiak, P. F. (1989). Adaptive network for optimal linear feature extraction. International Joint Conference on Neural Networks 1989, Washington, D.C., Vol. I, pp. 401-405.
Foldiak, P. F. (1990). Forming sparse representations by local anti-Hebbian learning. Biological Cybernetics (in press).
Hebb, D. O. (1949). The organisation of behaviour. New York: Wiley.
Helmholtz, H. von (1925). Treatise on physiological optics. Translated from the 3rd German edition (1910), Southall, J. P. C. (Ed.). Washington: Optical Society of America.
Hubel, D. H. & Wiesel, T. (1970). The period of susceptibility to the physiological effects of unilateral eye closure in kittens. Journal of Physiology, London, 206, 419-436.
Kohonen, T. (1984). Self-organisation and associative memory. Berlin: Springer.
Kullback, S. (1978). Information theory and statistics. Gloucester, Mass.: Smith.
Lindsay, P. H. & Norman, D. A. (1977). Human information processing: An introduction to psychology (2nd edn, p. 260). New York: Academic Press.
Longuet-Higgins, H. C., Willshaw, D. J. & Buneman, O. P. (1970). Theories of associative recall. Quarterly Review of Biophysics, 3, 223-244.
McCormick, D. A., Lavond, D. G., Clark, G. A., Kettner, R. E., Rising, C. E. & Thompson, R. F. (1981). The engram found? Role of the cerebellum in classical conditioning of the nictitating membrane and eyelid response. Bulletin of the Psychonomic Society, 18, 103-105.
Macphail, E. (1982). Brain and intelligence in vertebrates. Oxford: Oxford University Press.
Minsky, M. & Papert, S. (1969). Perceptrons: An introduction to computational geometry. Cambridge, Mass.: MIT Press.
Movshon, J. A. & Van Sluyters, R. C. (1981). Visual neural development. Annual Review of Psychology, 32, 477-522.
Rescorla, R. A. & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and non-reinforcement. In Black, A. H. & Prokasy, W. F. (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
Rosenblatt, F. (1959). Two theorems of statistical separability in the perceptron. In Proceedings of a symposium on the mechanisation of thought processes (pp. 421-456). London: Her Majesty's Stationery Office.
Selfridge, O. G. (1959). Pandemonium: A paradigm for learning. In Proceedings of a symposium on the mechanisation of thought processes (pp. 3-16). London: Her Majesty's Stationery Office.
Stratton, G. (1897). Vision without inversion of the retinal image. Psychological Review, 4, 341-360 and 463-481.
Sutton, R. S. & Barto, A. G. (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, 88, 135-170.
Warren, R. M. & Warren, R. P. (1968). Helmholtz on perception: Its physiology and development. New York: Wiley.
Widrow, B. & Hoff, M. E. (1960). Adaptive switching circuits. Institute of Radio Engineers, Western Electronic Show and Convention, Convention Record, 1960, Part 4, pp. 96-104.
Wilson, H. R. (1975). A synaptic model for spatial frequency adaptation. Journal of Theoretical Biology, 50, 327-352.
Yeo, C. H., Hardiman, M. J. & Glickstein, M. (1985). Classical conditioning of the nictitating membrane response of the rabbit (3 papers). Experimental Brain Research, 60, 87-98; 99-113; 114-125.
Zeki, S. (1978). Functional specialisation in the visual cortex of the rhesus monkey. Nature, London, 274, 423-428.
Neuroanatomy
The two articles in this section illustrate an approach to biocomputation that lays greater emphasis on neuroanatomical data than is done in the other sections of this chapter. Their common strategy stems from the tenets that evolution has incorporated knowledge about the world into brains, and that observation of brain structure might thus reveal a good deal about its functional goal. The caveat is that historical and developmental constraints may prevent optimisation of adaptation. The article of Braitenberg (23) starts with an admirably compact definition of a brain, and surveys the lessons that can be drawn from anatomical observations about several invertebrate (fly's eye) and vertebrate (cerebral cortex and cerebellum) structures. The reader will notice that the author seems to favour Hebb's cell assemblies and Hopfield's associative memory networks in comparison to Barlow's single unit doctrine, in the case of the mammalian cerebral cortex. His ideas on the cerebellum receive further elaboration in paper (50) of Chapter 4. This brief survey of various brain structures should be demonstrative enough to convince the reader of the diversity that exists in biological nervous systems, and thus of the danger of sweeping statements about biological neural networks. Both articles in this section can be viewed as building on the introduction provided in Chapter 1 by the paper by Blakemore (paper (14)). Perhaps, for the part of the article of Mitchison (24) that concerns stripes and blobs, the reader will find it useful to give an early glance at the survey of Livingstone and Hubel (paper (43)) in Chapter 4. Mitchison attempts to derive, from a principle of economy in neuronal wiring, answers to questions like: Why have multiple cortical areas? Why have separation of grey and white matter? Why have structures within areas? The two texts in this section are exercises in educated guesswork that we found as enjoyable to read as detective novels.
Reprinted with permission from Network, Vol. 1, pp. 1-11, 1990. © 1990 IOP Publishing Ltd.
Reading the structure of brains

Valentino Braitenberg
Max-Planck-Institute for Biological Cybernetics, Spemannstrasse 38, 7400 Tübingen, Federal Republic of Germany

Received 23 August 1989

Abstract. It is a fashionable philosophical tenet to consider Darwinian evolution as a process which incorporates knowledge into brains. We ask ourselves: can this knowledge about the world be recognised in the structure of brains? The present article gives a partial answer to this. Mechanisms of information handling and storage may well be related to the impressive major cortices of the vertebrate brain, the cerebral and the cerebellar cortices. The structure of the first fits the idea of an associative memory while the second strongly suggests computation of movement in terms of velocities. In some insect brains the mechanisms of visual perception can be related to detailed neuroanatomical structure, and one such network incorporates knowledge about the optics of a camera-type eye. Another one provides the wiring that would be expected in a set of velocity detectors using the principle of cross-correlation of neighbouring inputs. Knowledge acquired during a lifetime is also laid down in brains but the search for the 'engram' in the structure of brains has not yet been very successful.
1. Introduction

A brain is a complex spatiotemporal affair perhaps describable as the occurrence of 10^20 action potentials in 10^10 neurons in the course of a human life. The structure of a brain as studied in neuroanatomy is a projection of this complex affair onto the spatial coordinates, with loss of the temporal aspects, except for the very low resolution analysis which the study of ageing or of comparative anatomy may provide. How much of the original spatiotemporal complexity can be recovered in this projection? In other words, how can we make physiological sense out of the histological information? There is information from three sources embodied in brain structure: (i) inborn knowledge from evolution, (ii) engrams acquired during a lifetime and (iii) leftovers from the phase of construction during embryogenesis. The first two are legitimate sources of information about the complex spatiotemporal solution of the problem of coping with the environment, while the third is related to this problem only in a very devious way and may introduce misleading clues. I will neglect embryological considerations and act as if every part of the adult brain could eventually be understood in terms of the computation for which it is responsible. I will ask three questions.

First question. Can we relate brain structures to specific mechanisms or abstract schemes of computation?
Second question. Can we recognise inborn ideas (in the ethologists' sense of this term) in some aspects of the anatomy of the nervous system?
Third question. Can we see anatomical traces of acquired knowledge or ideas?
2. The cerebral cortex

Let us consider first the most impressive, quantitatively prevailing organ in the (higher) vertebrate brain, the cerebral cortex (Braitenberg 1978a, b). A quantitative analysis of the mouse cortex, which has occupied us for the past few years, has revealed a few remarkable properties which can most probably be extended to the cortex of larger animals as well. In Golgi preparations it is easy to distinguish a number of different types of neurons in the cerebral cortex (figure 1) but on the basis of characteristics revealed by electron microscopy the variety can probably be reduced to three fundamental types, although each of these types may be subject to extensive morphological variation: pyramidal cells, smooth stellate cells and Martinotti cells. The important finding is that most (perhaps 80%) of the synapses in the cerebral cortex connect neurons of one type only, the pyramidal cells to each other. These synapses, judging from their microscopical appearance and from some indirect physiological evidence, are excitatory. The skeleton of the cortex (figure 2) is thus given by a vast system of elements of one kind richly connected by fibres mediating positive feedback. This is a situation which is not very compatible with the picture that many have in mind, i.e. hierarchies of neurons fed at the basis by sensory input, structured by systems of logical relations akin to
Figure 1. Three different types of neurons in the cerebral cortex: pyramidal cell (about 80% of all cortical neurons), with excitatory axon; stellate cell, almost certainly inhibitory; Martinotti cell with ascending axon, probably inhibitory. The thin axon can easily be distinguished, in all three types, from the other cell processes, the dendrites. It is loosely ramified in the pyramidal cell, very densely ramified in the stellate cell and ascending in the Martinotti cell.
Figure 2. The skeleton cortex containing only pyramidal cells. M is the motor region where axons leave the cortex to reach other parts of the brain. S is a sensory region, with tactile, auditory or visual sensory fibres terminating in a middle layer of the cortex. O is the olfactory region, where the sensory input enters mainly the uppermost layers of the cortex. There are short-range (SR) and long-range (LR) connections between pyramidal cells. The fibres from region B are seen leaving the cortex and re-entering again in region C. (After Braitenberg 1977.)
expressions of the calculus of propositions and culminating in individual neurons which represent by their activity situations of arbitrary complexity in the sensory spaces. If this perceptron-like picture were taken literally, one would expect a more unidirectional array of fibres in the cerebral cortex oriented along the direction defined by the sensory and motor areas. Also, one would rather expect a near equal frequency of excitatory and inhibitory synapses corresponding to the near equal use of negated and non-negated terms in the logical expressions. Another fact in cortical neuroanatomy which makes the connectivity of cortical neurons very different from that of elements in conventional computers, and thus does not fit the logical theory of nerve nets very well, is the vast divergence (and consequently convergence) embodied in this system. Each cortical pyramidal cell in the mouse participates actively and passively in five to ten thousand synapses, and these are distributed among almost as many different pyramidal cells. Multiple connections between any pair of pyramidal cells, which would reduce the convergence factor, must be quite rare. We may ask which theoretical idea, among those proposed for the function of the cortex, fits the realistic picture most closely. We conclude that it is the idea of the cortex as an associative memory in which temporary correlation of activity in sets of neurons leads to their being connected by strong excitatory contacts so that each such set, a so-called cell assembly, represents within the brain a situation or a thing of the external world characterised by a certain internal consistency. The special bonus which the idea of these cell assemblies (Hebb 1949) boasts is the property of pattern completion intrinsic in the excitatory connections. Any sufficiently large part of a cell assembly may ignite the whole assembly and it is important to realise that the part that triggers the recall does not have to be specified in advance in this sort of memory, setting it apart from other types of associative memory. For those who do not have direct experience of neuroanatomy, it is important to stress another aspect of the cerebral cortical network. It is quite obvious to anybody who has had the opportunity to look at many individuals of one sort of neuron, e.g. pyramidal cells (figure 1), that their morphological characteristics are only given in outline, the
details being filled in by largely random processes of growth. The apical dendrite has a tendency to distribute itself in a roughly conical region on top of the neural cell body, and the basal dendrites in a spherical region centred around it, but no two pyramidal cells have quite superposable dendritic arbors. The same goes of course for their axonal ramifications. If the genetic specification of the shape of neurons is in terms of some kind of geometrical statistics, then it would be wise to cast their neuroanatomical description in the same language in order to render the essential information. This may not be so in all parts of the vertebrate brain and is certainly not so in some well known cases of insect brains (see a later section of this article) but most likely applies to a large extent to the cerebral cortex where only the sensory input fibres as well as some types of stellate cells are likely to be somewhat better defined in their target neurons. This essential randomness argues very strongly against the idea of a wiring as precise as that of electronic circuits, as some continue to suspect in the cerebral cortex, and provides a counter argument against the very pessimistic note which the impossibility of unravelling precise circuitry for an enormous number of neurons would imply.
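The pattern-completion property claimed above for cell assemblies can be illustrated with a toy Hopfield-style associative net (the two stored patterns below are invented, and this is of course a caricature of cortical circuitry, not Braitenberg's quantitative picture): units connected by Hebbian outer-product weights recover a whole stored pattern from a degraded fragment of it.

import numpy as np

# Two invented binary patterns standing in for "cell assemblies".
patterns = np.array([
    [ 1,  1,  1,  1, -1, -1, -1, -1],
    [ 1, -1,  1, -1,  1, -1,  1, -1],
])

# Hebbian (outer-product) connections between co-active units; no self-connections.
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0.0)

# A degraded cue: pattern 0 with two of its units flipped.
cue = patterns[0].copy()
cue[[4, 5]] *= -1

state = cue.copy()
for _ in range(3):                       # a few synchronous updates
    state = np.where(W @ state >= 0, 1, -1)

print("cue       :", cue)
print("completed :", state)
print("matches stored pattern 0:", bool(np.array_equal(state, patterns[0])))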
3. The cerebellum

Another piece of nerve tissue whose structural analysis leads directly to the suggestion of a mechanism is the cerebellar cortex (Braitenberg 1961, 1967a, 1987). This network is radically different from that in the cerebral cortex. First of all, in the cerebellum we find no evidence of loops of excitatory connections either in anatomy or in the rather well worked out synaptic physiology. The only intrinsic excitatory elements within the molecular layer of the cerebellar cortex are the granular cell-parallel fibre neurons which relay their signals exclusively to inhibitory interneurons and to the inhibitory output cells of the cerebellar cortex, the Purkinje cells. Whatever signals are fed into the excitatory neurons come back with a reversed sign. Thus positive feedback does not seem to be prominent there, unless we want to construct complicated schemes of double or quadruple reversal of signs. Consequently, if memory plays a role in the cerebellum, which it probably does, it is not likely to be of the associative memory kind just described. Another striking characteristic of the cerebellar cortex is its essentially one-dimensional layout (figure 3). Although the macroscopical relations are a bit more involved due to the three-dimensional curvature of the cerebellar cortical sheet, within any small region transmission of signals in the parallel fibres (the elements which numerically vastly outnumber all the others) happens in one direction only, that direction being defined by the transversal axis of the cortical sheet. The cerebellar cortex in its synaptic layout is not a truly two-dimensional structure, but the sum of a very large number of one-dimensional strips corresponding to what the physiologists call 'beams of parallel fibres'. This idea is even more convincing when one realises that the connections at right angles to these strips are inhibitory and therefore may be thought of as insulating neighbouring strips from each other by preventing the signals from spreading from one strip to another. There is more in the system of parallel fibres to make it quite unique (figure 4). The connectivity of most pieces of the nervous system, especially of other sheet-like structures called cortices, seems to be defined only in terms of neighbourhood relations, with the distance of the elements and consequently the length of the connecting fibres often being greatly distorted by the macroscopical bending and stretching of these structures.
Figure 3. Surface view of the cerebellar cortex (very schematic). The horizontal lines are the 'parallel fibres', contacting the elongated dendritic trees of Purkinje cells (open) and stellate cells (hatched shading). The black dots are the places where the input reaches the parallel fibres in order to be propagated in both directions (arrows). Purkinje cells are the output elements. The stellate cells (including the so-called basket cells) inhibit Purkinje cells on both sides of the beam of parallel fibres on which they are located. The inhibitory regions are shown for one stellate cell on the right (broken circles).
However, in the cerebellum the folds of the cortex seem to respect the system of intrinsic coordinates of the tissue, being always parallel to the direction of the parallel fibres. It is as if distortion of the lengths of the parallel fibres in different sublayers of the molecular layer (such as would be produced by bends in the other direction) is carefully avoided. The most likely explanation for such metric invariance, as opposed to the merely topological invariance prevalent in other neuroanatomical structures, is that the times of signal transmission and reception in different points of the parallel-fibre beam play a crucial role in the cerebellum. This idea led to a reinterpretation of the role of the cerebellum in movement and posture which is still subject to experimental and physiological analysis.
4. Visual movement detectors in the fly

The third example which I have in mind for the detection of mechanisms of computation in the nerve tissue is taken from the realm of the much more schematic insect nervous system. Vision in the fly has been analysed in behavioural experiments to an unusual level of precision by Reichardt (1970) and his school. At the basis of the impressive system which these studies reveal are the postulated individual elementary movement detectors situated between neighbouring individual channels of the compound eye (figure 5). They transform the parallel input of temporal fluctuation in the raster of input points into a representation of local movement all over the visual field. The localisation of these movement detectors in the tissue has not yet been precisely defined by electrophysiological means. However, neuroanatomy provides some very strong clues. At one level of the visual system of the fly (in the first ganglion, the lamina ganglionaris) the picture of the environment, cut up into small overlapping and inverted subregions by the optics of the compound eye, is reconstituted in its original order. We will come back to this point later, when talking about inborn concepts. Here another point is important (figure 6) (Braitenberg and Hauser-Holschuh 1972). Signals from each point in visual space are relayed through synapses of exactly the same sort and number to two different second-order neurons which in turn relay them to the same column, but at two different levels, of the next visual ganglion (I neglect the existence of other second-order neurons for clarity). It is not clear at first sight why the nervous system does not in this case apply the usual trick for the reproduction of the same signal in different places, namely the branching of axons. If two different neurons are being used in this case, we must suppose that they have different transmission properties; one may be excitatory and the other inhibitory or the time constants of their responses may be different.
Figure 5. Simplified movement detector according to Reichardt (1970) comprising sensors (S), low-pass filter (F) and multiplying unit (M). For a given pattern moving from right to left, the integral of the output of the multiplying unit is a measure of the velocity.
Figure 6. Schematic view of the fly's eye. The upper part of the drawing shows how the geometrical optics of the lenses is compensated by the fibre pattern between the compound eye and the brain. The lines of sight corresponding to the three light-sensitive elements underneath each lens (in reality there are seven) are parallel to the optical axes of the corresponding lens (the central one) or of neighbouring lenses (the other two). Note that a distant object in a certain position (a, b, c, d, e) is therefore seen by three (in reality seven) lenses. The fibres corresponding to each of these positions are brought together in one of the compartments a to e of the visual brain. The lower part of the picture shows how the input to each of these compartments is relayed to two different elements L1 and L2 which, because of their different shape, may be hypothetically identified with the different filters in the input to Reichardt's movement detectors (figure 5).
The latter idea is indeed corroborated by the finding that the two second-order neurons, L1 and L2, are of different thicknesses and, moreover, of a thickness varying in a systematic way in different parts of the ganglion. It is very appealing to think that these fibres are the filters interposed between the input and the movement detectors, since in a scheme proposed by Reichardt (1970) each input line feeds different movement detectors, one through one sort of filter and one through the other, the difference being mainly in the time constants. This interpretation is all the more appealing since some experiments seem to show that the effectiveness of the movement detectors in guiding the fly's behaviour is different in different parts of the visual field, and the quantitative variation of the size of the L1 and L2 fibres in different parts of the ganglion may be related to this. Of course, if it turns out that the two fibres produce signals of different signs, excitatory and inhibitory, again we will have a different model of the movement detector to fit the picture (Barlow and Levick 1965), but this interpretation would not provide an explanation for the gradients of size found in anatomy which make such good sense if we suppose the fibres to be filters with different time constants.
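For readers who want to see the scheme of figure 5 in operation, here is a minimal simulation of a correlation-type movement detector (the constants, the grating stimulus and the first-order filter standing in for the delay are all invented for illustration; real L1/L2 dynamics are of course more complicated). Each half-detector multiplies one receptor's low-pass-filtered signal with the neighbouring receptor's unfiltered signal, and the mirror half is subtracted; the time-averaged output changes sign with the direction of motion and, over the speeds tried here, grows with speed.

import numpy as np

dt = 1e-3
t = np.arange(0.0, 2.0, dt)

def luminance(x, time, v):
    # moving sinusoidal grating, one cycle per unit distance
    return np.sin(2 * np.pi * (x - v * time))

def lowpass(signal, tau):
    # first-order low-pass filter: the delay element of the detector
    out = np.zeros_like(signal)
    y, a = 0.0, dt / tau
    for i, s in enumerate(signal):
        y += a * (s - y)
        out[i] = y
    return out

def reichardt(v, dx=0.1, tau=0.05):
    s1 = luminance(0.0, t, v)      # receptor 1
    s2 = luminance(dx, t, v)       # neighbouring receptor
    # opponent pair of multiplier units, mirror half subtracted
    r = lowpass(s1, tau) * s2 - lowpass(s2, tau) * s1
    return r.mean()

for v in (-2.0, -0.5, 0.5, 2.0):
    print(f"velocity {v:+.1f}   mean detector output {reichardt(v):+.4f}")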
5. Band-pass filters for spatial frequency

In some nerve nets the size and shape of the neuronal arborisations can immediately be interpreted in terms of known concepts of information technology. Take as an example the varying sizes of the dendritic trees of pyramidal cells contained in the visual cortex. Their diameters vary by a factor of about ten, the smaller cells being completely contained within the confines of the dendritic trees of the larger cells and all the cells very strongly overlapping with their neighbours (figure 7). Very likely, many of these neurons receive the 'specific input', the set of about one million fibres projecting point-to-point the image of the visual environment onto the plane of the cortex. Neglecting some claims that have been made over the years of much more complicated properties of the dendritic membrane, we may assume that each neuron receives as input some sort of average of the individual input points within its compass. Thus the smallest neurons represent the picture with its original fine grain, while the sets of progressively larger ones, up to the largest, constitute low-pass filters for spatial frequency, each with a bandwidth inversely proportional to the size of its elements. Thus, if we assume that sets of neurons of different sizes add their output separately, then we have a system capable of something akin to Fourier analysis. It is easy to convince oneself, studying one's own visual perception, that we are well able to use different sets of such filters when scrutinising the visual scene, both consciously and unconsciously, depending on the level of detail at which the picture makes most sense. Thus spatial frequency is in a way an inborn concept, at the same level that contours and the slope of contours are inborn concepts if we want to interpret the results of Hubel and Wiesel (1977) in these terms.
Figure 7. Pyramidal cells of greatly varying size, some so large that only a small part of their dendritic arbor is shown, collect the same input in the visual area of the cortex. They may be understood as sets of different band-pass filters which provide different views of the visual input. (From Ramon y Cajal 1911.)
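The filtering role assigned above to dendritic trees of different sizes can be imitated very crudely: treat each 'neuron' as a moving average of the input over a window the size of its dendritic field. The one-dimensional 'image', the window sizes and the frequencies below are invented; the point is only that the large-field average keeps the coarse component, the small-field average keeps both, and their difference behaves like a band-pass filter.

import numpy as np

x = np.linspace(0, 1, 2000)
image = np.sin(2 * np.pi * 4 * x) + 0.5 * np.sin(2 * np.pi * 64 * x)   # coarse + fine component

def dendritic_average(img, width):
    # a "neuron" whose dendritic field spans `width` samples simply averages its inputs
    kernel = np.ones(width) / width
    return np.convolve(img, kernel, mode="same")

small = dendritic_average(image, 9)     # small pyramidal cell: keeps the fine grain
large = dendritic_average(image, 121)   # large pyramidal cell: coarse grain only
difference = small - large              # difference of the two scales acts as a band-pass filter

def amplitude_at(signal, cycles):
    # crude measure of how much of a given spatial frequency survives
    ref = np.exp(-2j * np.pi * cycles * x)
    return abs((signal * ref).mean())

for name, s in (("image", image), ("small cell", small), ("large cell", large), ("difference", difference)):
    print(f"{name:10s}  4 cycles: {amplitude_at(s, 4):.3f}   64 cycles: {amplitude_at(s, 64):.3f}")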
6. Bilateral symmetry: an inborn category of perception

Another example evident in visual perception is more impressive since it shows the discrepancy between the simplicity of some anatomical structures and the high-order analysis which it performs. It is well known that we have a very acute perception of bilaterally symmetric forms, even if they are embodied in considerable visual noise (Barlow and Reeves 1978). The detection of such symmetrical patterns (figure 8) is best when the fixation is on the axis of symmetry so that the two symmetrical halves of the picture are projected onto symmetrical positions of the right and left brain. With the fixation to one side, acuteness of the perception falls off very rapidly. Apparently there are fibre systems in the brain which are able to compare and establish the identity or non-identity of symmetrical points of the visual field. Such fibre systems are of course well known; the 200 million fibres of the corpus callosum and several other commissural bundles may well serve this purpose. (The fact that the primary visual area does not participate in the commissural system, except for a small part of it, is only an apparent counter argument since the perception of form is ascribed more to the secondary and tertiary cortical representations of the visual field which have abundant callosal connections.) The philosophical counterpart of this very simple wiring scheme is surprisingly complex. In a natural environment, bilaterally symmetrical shapes in the visual field mostly mean one thing: another animal facing the observer, with friendly or unfriendly intention as the case may be. This is in any case a situation which justifies a
special detector, be it at the expense of 200 million long fibres reaching from one side of the brain to the other. In a way, the corpus callosum is an inborn idea of a sociological or at least interactive behavioural nature.
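A toy version of the comparison such commissural fibres might perform (random-dot images and densities invented for illustration): correlate every point with its partner mirrored about the vertical midline. Symmetric patterns give a high value even when diluted with noise dots, whereas unstructured patterns do not.

import numpy as np

rng = np.random.default_rng(1)

def mirror_correlation(img):
    # compare each point with its partner mirrored about the vertical midline,
    # as fibres crossing the midline might do
    return np.corrcoef(img.ravel(), img[:, ::-1].ravel())[0, 1]

n = 64
half = rng.random((n, n // 2)) < 0.2                     # random dots in the left half
symmetric = np.hstack([half, half[:, ::-1]]).astype(float)
noise = rng.random((n, n)) < 0.1                         # dilute the symmetry with noise dots
noisy_symmetric = np.clip(symmetric + noise, 0, 1)
random_img = (rng.random((n, n)) < 0.2).astype(float)

for name, img in (("symmetric", symmetric), ("noisy symmetric", noisy_symmetric), ("random", random_img)):
    print(f"{name:16s} mirror correlation {mirror_correlation(img):+.2f}")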
7. Geometrical optics embedded in a nerve net

Another example of inborn knowledge is interesting because it provides a counterexample to the common opinion (to which I myself also subscribe when talking about the cerebral cortex) that most of the wiring in the brain is statistical. An argument which is often adduced in support of this idea is that genetic information cannot possibly be sufficient to specify with absolute precision the origin and termination of each fibre in a piece of nerve tissue. In smaller brains, for example in insects, the absolute order of fibres is nevertheless occasionally obtained (Braitenberg 1967b). We have already mentioned the optics of the compound eye of the fly with its 6000 individual lenses which project 6000 partial images of the optical environment onto as many small retinas. The optics being of the photographic-camera kind, these partial images are inverted and they are partly overlapping (figure 6). The fibres that re-establish the original order of the points in visual space find their target neurons in the ganglion with absolute precision according to a scheme which can be quite easily computed on the basis of the geometrical optics of the individual lens camera ('ommatidium') and of the divergence angle between the axes of neighbouring ommatidia. Obviously the first thing to do for this system of fibres is to rotate the little bundle of fibres emerging from each small retina ('retinula') by 180° in order to compensate for the rotation of the image due to the lens and to fit the orientation of the small images to the overall orientation of the image in the eye, which, being made of separate radially arranged channels, is non-inverting. And this is indeed the case. Thus it is no exaggeration to say that the system of these bundles contains the knowledge of the geometrical optics of the camera eye, a piece of physics already incorporated into the building of the insect eye while in the pupal stage, even before it has ever been used. I know that some people object to the use of the term 'knowledge' in this connection but, stripping it of ideological and introspective connotations, it is appropriate.
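The fibre-sorting rule described above can be written down explicitly for a simplified one-dimensional eye with three receptors per ommatidium, as in the drawing of figure 6 (the real eye has seven receptors and a two-dimensional hexagonal lattice). The sketch below only encodes the geometry: because the lens inverts its small image, receptor r of ommatidium i views direction i - r, and sending each fibre to the lamina cartridge for the direction it views automatically performs the 180-degree rotation of each bundle.

def target_cartridge(ommatidium, receptor_offset):
    # receptor_offset is -1, 0 or +1 under the lens; inversion by the lens means
    # the fibre's line of sight is ommatidium - receptor_offset
    return ommatidium - receptor_offset

n_ommatidia = 10
cartridges = {}
for i in range(n_ommatidia):
    for r in (-1, 0, +1):
        cartridges.setdefault(target_cartridge(i, r), []).append((i, r))

# Away from the edges, each cartridge collects exactly three fibres, one from each of
# three neighbouring ommatidia, and all of them view the same direction.
for d in range(1, n_ommatidia - 1):
    assert len(cartridges[d]) == 3
print("cartridge 5 receives fibres from (ommatidium, receptor):", cartridges[5])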
8. Acquired concepts

I think it is safe to say that nobody has ever seen a specific memory trace ('engram') in the brain. On the other hand, it is very likely that many of the details which we observe in electron microscopy or even at the level of ordinary light microscopy have developed on the basis of experience. The trouble with finding an engram is that, from all we know about the function of the higher nervous centres, it would consist of distributed changes submerged in a near infinity of details which perhaps represent changes connected with other engrams. To find the trace of an event in perception, we would have to know first exactly how the event is represented in the brain. The very brutal changes in the environment ('deprivation experiments') which sometimes do lead to visible alteration of the normal anatomy are not really comparable to the events that are remembered in ordinary life. Clearly, only such changes in the anatomy that continue throughout life are good candidates for the anatomical counterpart of engrams. It is difficult to separate the
processes of embryology which set up the brain from the later changes which subserve memory in those animals, like mice, rats, cats and humans, where the embryology is not completed at the time of birth. In some other animals, like guinea pigs, however, birth happens late in embryology and
the animal immediately after birth faces the environment with a completely
developed brain. In such animals we may look for ongoing changes, possibly related to memory. Schüz (1981) found in guinea pigs that the number of vesicles increases considerably in some synapses after birth and so does the size of dendritic spines. Both are good candidates for suspected anatomical changes underlying memory.
References

Barlow H B and Levick W R 1965 The mechanism of directionally selective units in rabbit retina J. Physiol. 178 447-506
Barlow H B and Reeves B C 1978 The versatility and absolute efficiency of detecting mirror symmetry in random dot displays Vision Res. 19 783-93
Braitenberg V 1961 Functional interpretation of cerebellar histology Nature 190 539-40
Braitenberg V 1967a Is the cerebellar cortex a biological clock in the millisecond range? The Cerebellum (Progress in Brain Research 25) (Amsterdam: Elsevier)
Braitenberg V 1967b Patterns of projection in the visual system of the fly. I. Retina-lamina projections Exp. Brain Res. 3 271-98
Braitenberg V 1977 On the Texture of Brains: Neuroanatomy for the Cybernetically Minded (Berlin: Springer)
Braitenberg V 1978a Cortical architectonics: general and areal Architectonics of the Cerebral Cortex ed M A B Brazier and H Petsche (New York: Raven) pp 443-65
Braitenberg V 1978b Cell assemblies in the cerebral cortex Lecture Notes in Biomathematics vol 21, ed R Heim and G Palm (Berlin: Springer) pp 171-88
Braitenberg V 1987 The cerebellum and the physics of movement: some speculations Cerebellum and Neuronal Plasticity ed M Glickstein, Ch Yeo and J Stein (New York: Plenum) pp 193-207
Braitenberg V and Hauser-Holschuh H 1972 Patterns of projection in the visual system of the fly. II. Quantitative aspects of second-order neurons in relation to models of movement perception Exp. Brain Res. 16 184-209
Hebb D O 1949 The Organisation of Behavior (New York: Wiley)
Hubel D H and Wiesel T N 1977 Functional architecture of macaque monkey visual cortex Proc. R. Soc. B 198 1-59
Ramon y Cajal S 1911 Histologie du système nerveux de l'homme et des vertébrés (1972, translated by L Azoulay; Madrid: Instituto Ramon y Cajal)
Reichardt W 1970 The insect eye as a model for analysis of uptake, transduction and processing of optical data in the nervous system Physikertagung (Salzburg, 1969) (Stuttgart: Teubner) Plenarvorträge 34
Schüz A 1981 Prenatal maturation and postnatal changes in the guinea pig cortex: a histological study of a natural deprivation experiment. II. Postnatal changes (Pränatale Reifung und postnatale Veränderungen im Cortex des Meerschweinchens: Auswertung eines natürlichen Deprivationsexperiments. II. Postnatale Veränderungen) J. Hirnforsch. 22 113-27
Reprinted with permission from Trends in Neurosciences, Vol. 15, No. 4, pp. 122-126, 1992. © 1992 Elsevier Science Publishers Ltd (UK).
Axonal trees and cortical architecture

Graeme Mitchison

Graeme Mitchison is at the Physiological Laboratory, Cambridge CB2 3EG, UK.

In modern computer design considerable care is taken to arrange components in such a way that wiring is kept to a minimum. Certain features of cortical structure - the mappings, stripes and blobs within areas, and areas themselves - are somewhat reminiscent of the layout of computer components, and suggest that the cortex may also be organized so as to economize on neuronal 'wiring'. One important difference between the brain and a computer is that the wiring in the brain takes the form of elaborate branched structures, namely axonal trees. In this article, it is argued that an assessment of the efficiency of cortical wiring must take account of the branching rules of these trees.

A large part of the mammalian brain is 'wiring'. In the mouse cortex, about 30% of the volume of grey matter is taken up by axons, and dendrites occupy an approximately equal volume; similar proportions are found in cat visual cortex. It would be reasonable to expect that the cortex has been organized so as to keep this wiring to a minimum, since a wasteful arrangement of neural processes could significantly increase the volume of the cortex, and hence of the whole brain, and this would presumably carry a considerable selective penalty. Both axons and dendrites could be regarded as 'wiring', but axons are much longer and thinner than dendrites and have more elaborate patterns of connectivity. It seems reasonable, therefore, to think of them as the basic wiring of the cortex. Dendrites, which are densely studded with synapses, can be regarded for the present purposes as extensions of the cell body, designed to receive as many synapses as possible without occupying too much volume.

How can it be decided whether the cortex is efficiently wired up by its axons? One approach would be to examine different spatial arrangements of neurons in the cortex that keep all the connections of each neuron unchanged (rearranging and stretching the axons if necessary). This would show whether the present configuration comes close to achieving the lowest possible wiring volume. However, this is an impossible program because so many configurations could be obtained in this way, but at least it is possible to ask how altering certain key features of cortical anatomy affects the wiring volume.
Scrambling the maps within areas would clearly be very deleterious. Even if this was done only within areas, the effect of shuffling connections randomly, so that they would have to extend over an entire area rather than being grouped in a neighbourhood of a few hundred microns, would clearly greatly increase the wiring volume. It is more rewarding to ask what would happen if the order within areas was preserved but the areas were merged into one structure. As we shall see, this suggests some guiding principles for wiring design. It may also be more closely related to the process whereby new cortical areas evolve.

Cortical areas

Suppose that two cortical areas, assumed for simplicity to be the same size, were merged into a single composite area. The cells from the two areas could be interleaved, as far as possible without destroying their original order, in such a way as to produce a sheet of the same thickness as the original areas (Fig. 1). The intermixing of the two types of
neuron means that an axon has to spread further to make the same pattern of connections. Assuming that the volume of dendrites, cell bodies and other components remains unchanged, the total volume of cortex will increase by about 16% (see Box 1). If more than two areas were combined in this way the increase in cortical volume would be greater. For instance, if ten areas were lumped together the total volume would increase by a factor of two; for 100 areas this becomes a factor of ten.
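The ratios just quoted follow from the two volume equations set out in Box 1 below; as a numerical check (not part of the original article), the short sketch that follows assumes A = 0.3 and B = 0.7, as in the box, and solves the quadratic for t/s.

import math

A, B = 0.3, 0.7   # axonal and non-axonal fractions of one area's volume (s^2 * lambda = A + B = 1)

def merged_over_separate(n):
    # With x = t/s, the merged-area equation t^2*lambda = n(A*t/s + B) becomes
    # x^2 = n*(A*x + B); solve it and report (volume of merged area)/(n separate areas).
    x = (n * A + math.sqrt((n * A) ** 2 + 4 * n * B)) / 2
    return x * x / n

for n in (2, 10, 100):
    print(f"n = {n:3d}   merged / separate = {merged_over_separate(n):.2f}")

The printed values (about 1.16, 2.1 and 10.4) reproduce the table in Box 1.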
Box 1. Relative volumes of different cortical layout? Suppose an area ha; linear dimensions s (in a surface view of the cortex), and thickness X. iMis the volume of axons and & the volume of other elements (such as dendrites, cell bodies and blood vessels), then equating volumes gives 14
2
s \ =
A+B
If n such areas (assumed equal in size) are merged lo give one area of linear dimension t. Ihe volume of each component wiN be increased by a factor n (assuming that axon arbors from different areas are simply superposed, rather than being more efficiently combined in some manner; see section on agonal branching patterns). Furthermore, the axons have lo extend further to reach their targets, by a scale factor t/s. Thus, the volume of the merged area Is given by
Against this increase of volume a possible gain can be set. Suppose neurons in one area send axons to another area, so that each neuron has arbors in both areas (fig. 2A). If these connections preserve the topography of the areas, then they can be merged in such a way that the two arbors have the same centre, so that no connecting axon is needed (Fig. 2B). This clearly reduces the wiring volume, and the question is whether this decrease could r*-n(f+s) balance or even outweigh the gain due to the greater spread of arbors within the combined area. Given n, and assuming axons constitute 30% of the A semi-empirical answer lo this can be given by cortex and that A = 0.3 and 8 = 0.7. these two noting that the total volume of the white matter, equations can be solved for r/s. The ratio of the volume which contains the association fibres (the connecof the merged area to that of the rr separate areas, which tions between areas), is considerably less than the is (f/s> /n, can then be calculated. This is given below for some values of n. expected increase due to merging areas. In humans, for example, where estimates suggest there may be at least 100 separate cortical areas", the volume volume of merged areas would increase by a factor of more than ten on volume of n separate areas merging, and the white matter occupies a volume 2 that is approximately equal to that of the grey 1.16 10 2.1 matter' In the mouse, where 26 cortical areas have 100 10.4 been identified' , merging would increase the volume by a factor of about 3.5; yet the white matter occupies only about 13% of the volume of the grey matter . So, for both species, the associBy using empirical estimates for relative volumes ation fibres take up much less volume than would be of white and grey matter the question of how the occupied by the expanded connections in a merged connections in the white matter are arranged has structure. been side-stepped. Are the areas laid out on the cortex in such a way that the volume of association fibres is about as low as it could be. given the pattern of connections between areas? This is an intriguing question, which the rapidly amassing anatomical data on primate cortex may soon allow us to answer J
Axonal branching patterns
The preceding calculations suggest that dividing the cortex into areas allows large savings in wiring volume. However, there is a loophole in the argument, for it ignores the possibility that two axonal arbors might be combined in some fashion that makes the volume of the combined arbor less than that of the two original arbors. It will usually be possible, for instance, to make an arbor with a larger number of short branches (Fig. 2C). Whether this will have a volume smaller than the two original arbors depends on the way the cross-sectional area of axons varies.
Fig. 1. Two cortical areas, I and II (in cross-section), are merged (I+II) by intercalating neurons from the two areas in an ordered fashion. The axon of a neuron has to extend further in the merged area than in the separate areas to make the same set of connections.
If the cross-sectional area is constant for all orders of branching, then it turns out that an efficiently wired arbor has a volume proportional to the square root of the number of synapses it makes. This means that two arbors of equal size could be combined to make a single one of √2 times the volume instead of twice the volume. A simple
calculation shows that, using this rule, the result of combining 100 areas would be to increase the volume by a factor of about six instead of ten. This is still comfortably in excess of the factor of two that results from including the volume of white matter.

It is clear, however, that axon diameters do not remain constant, but decrease at branches. In the case of the dendrites of pyramidal cells the cross-sectional area appears to be conserved at branches; that is, the sum of the areas of the two daughter branches equals that of the parent axon. It is not easy to determine if the same law applies to pyramidal cell axons, because they are so much thinner than dendrites. There is the suggestive observation that the number of microtubules in an axon is conserved at a branch, though this does not necessarily imply that the total area is conserved. Suppose the assumption is made that the number of synapses made by an axon increases in some fashion as the cross-sectional area of that axon increases. Although there is little direct evidence for this, it is consistent with the general behaviour of axons; for example, the larger magnocellular axons from the geniculate make many more synapses than the thin parvocellular ones. If it is also assumed that area is preserved at axon branches, the hypothesis can be proposed that the number of synapses supported by an axon is proportional to its cross-sectional area. If correct, this hypothesis would imply that the commonly observed tendency for groups of axons to run nearly parallel over some distance (Fig. 3A) is not so neglectful of wiring economy as it might at first appear to be. Suppose two axons make synapses in the same region (Fig. 3B). If they are replaced by a single axon that makes all their connections (Fig. 3C), the cross-sectional area of this axon would be the sum of the areas of the two original axons, which implies that its volume would roughly equal that of the two original axons. This argument can be extended to show that, given our hypothesis, superposing arbors is an efficient way of constructing a combined arbor, provided that the original arbors are themselves efficiently constructed. In this case, therefore, the assumptions made in the preceding section can be justified.

Fig. 2. (A) Diagrammatic surface view of cortex showing the axon of a neuron ramifying in two areas (square outlines labelled I and II), the two arbors being connected by an association fibre. In (B) the two areas are merged (I+II) in such a way as to preserve topographic order. The two arbors can be combined (the arbor originally from area II is shown dotted), and the association fibre dispensed with. Two ways of combining the arbors are shown in (C): by superposition, and by a more finely divided tree.
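[Editorial note: the parallel-axon argument of Fig. 3 can be made concrete with a small illustrative calculation, which is our addition and not part of the article; the numbers are arbitrary. Under the stated hypothesis that the number of synapses an axon supports is proportional to its cross-sectional area, and that area is conserved when axons are combined, replacing two parallel axons by one thicker axon saves essentially no volume.]

    # Editorial sketch of the Fig. 3 argument; the specific numbers are arbitrary.
    # Volume of a cylindrical axon segment = cross-sectional area * path length.
    def axon_volume(area, length):
        return area * length

    length = 1.0                 # both axons run over the same stretch of cortex
    area_1, area_2 = 0.2, 0.3    # arbitrary cross-sectional areas (synapses ~ area)

    separate = axon_volume(area_1, length) + axon_volume(area_2, length)
    # A single replacement axon must support the combined synapses, so under the
    # hypothesis it needs the combined cross-sectional area.
    combined = axon_volume(area_1 + area_2, length)

    print(separate, combined)    # 0.5 and 0.5: no volume is saved by merging the axons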
Stripes and blobs
Within certain cortical areas there is a finer level of structure: the stripes, blobs and other kinds of patches. In the primary visual cortex, neurons often make clustered connections (see Fig. 4A), where the distance between the clusters is approximately the period of ocular dominance stripes (Fig. 4B). This suggests that neurons make connections predominantly to others of the same ocular dominance type. A similar phenomenon is observed with blobs. Therefore, the stripe or blob systems have some of the properties of areas; namely, they allow neurons to form more localized connections at the cost of making a few long connections, which, in the case of stripes and blobs, are the long axons connecting the clusters.
Fig. 3. (A) An axon entering the visual cortex. Many of its branches run approximately parallel and make synapses in overlapping or neighbouring regions. This might seem uneconomical, because a single axon could supply the same region. However, with conservation of cross-sectional area the volume of the two axons depicted in (B) would be approximately equal to that of the single axon in (C), which replaced them. Numbers refer to cortical layers.
Abbreviation: WM, white matter. [Part (A) taken from Ref. 18.]
The wiring volume of stripes or blobs can be analysed by the same method used for areas. However, there is a significant difference, because the long connections between stripes run through the grey matter rather than occupying the white matter (as in the case of the association fibres). If the long fibres share the same space as the arbors of other neurons, they spread these arbors apart, thereby increasing their wiring volume. How large this effect is depends on the way axonal size changes at branches. If axon diameter is
fairly constant, it can be shown that, even when the axons mix with the arbors, a stripe pattern gives a lower wiring volume than a uniform mixture of neuron types. In fact, there is an optimum width to the stripes, which would lead to each neuron having four or five clusters on average. However, if cross-sectional area is conserved, so that the long axons are thick relative to the clusters they supply, then the decrease in volume achieved by making more compact arbors within stripes is almost exactly cancelled by the volume of the long axons. This would not be true if the long axons were segregated from the arbors, within, for example, a separate layer of the grey matter. The stria of Gennari, which give the striate cortex its name, consist in part of myelinated axons running long distances, and may provide an example of the kind of structure that can exploit the wiring advantages of a patchy organization. Given such a structure, the volume gains for making stripes would be the same as for an equivalent number of areas. In the striate cortex, two types of ocular dominance stripe and four types of orientation patch might be distinguished, which would give a gain equivalent to eight areas, that is, a factor of about 1.9 in volume.
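[Editorial note: the "factor of about 1.9" quoted for eight stripe/patch types follows from the same Box 1 relation used earlier for merged areas. The one-line check below, again an editorial addition and not part of the original article, reuses the illustrative merged_volume_ratio function defined in the sketch that follows Box 1.]

    # Eight patch types treated as eight areas, using the Box 1 model (A = 0.3, B = 0.7).
    print(round(merged_volume_ratio(8), 2))   # ~1.86, i.e. the "factor of about 1.9" in the text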
need to make specific patterns of connections, like the serial synapses made by basket cell axons along apical dendrites, or the 'corkscrew' formations of chandelier cells. As pointed out earlier, it is not necessarily inefficient for axons to run parallel over some distance, since a single axon replacing them might have to be proportionately thicker to support the larger number of connections it makes. However, there is no denying the wastefulness of some of the tricks axons get up to, like the backtracking seen in Fig. 3A. This does not mean that the idea of wiring economy should be abandoned, but rather that we should only look for robust gains and losses. If tidying up the axonal tree could halve its volume, then (assuming as usual that 30% of the cortex comprises axons) the gain in volume for the whole cortex would be about 16%. Against this, the gain in making 100 cortical areas is a factor of ten in total volume, and the gain in making a system of stripes could amount to a factor of two or so. In essence, the idea proposed here is that a patchy cortical organization allows an axon to make localized clusters of short connections, and thereby
Concluding remarks
There seems little doubt that subdividing the cortex into areas confers a considerable advantage in wiring economy, at least when contrasted with the somewhat hypothetical alternative scheme where areas are merged into larger structures with as little destruction as possible of their topographic order. It is less clear that the next level of patchiness, the stripes and blobs within areas, offers significant advantages in wiring economy. Much depends on the way the thickness of axons changes at branches. If the higher order branches are much thicker (in particular, if the cross-sectional area is preserved at branches, as happens with some pyramidal cell dendrites), then the axons that link localized arbors in different stripes may have a large enough volume to cancel the gains from making more compact arbors.
A further question arises here: do the linking axons occupy the same region of grey matter as the axonal arbors of the same, or other, neurons? This is pertinent because fibres not only occupy volume but also fill out the space that must be traversed by other axons in order to reach their targets. It is this 'filling out' that eventually weighs against the merging of areas and makes it more efficient to segregate the association fibres in the white matter. We can rephrase our question, therefore, and ask whether there is a substructure within the grey matter that serves as a kind of 'white matter' for linking stripes and blobs within areas. The stria of Gennari in primary visual cortex may furnish an example of this.

If economy of wiring is important for the cortex it would be reasonable to expect to see evidence of efficient design in individual axonal trees. In fact, axonal branches often reach their targets in a very roundabout way. This may be due in part to the
Fig. 4. (A) A neuron in the visual cortex, the axon of which shows a very marked set of clusters at a spacing that is approximately that of ocular dominance stripes. (B) A representation of ocular dominance stripes, in a surface view of the cortex, showing how the clusters may relate to the stripe system. Numbers refer to cortical layers. Abbreviation: WM, white matter.
(Part (A) taken from Ref. 25.)
Acknowledgements
I thank A. Schüz and P. Somogyi for helpful suggestions.
economizes on wiring. It is not difficult to think of other reasons why a patchy structure might be advantageous, although some of these can be discounted readily enough. For example, the gain in the propagation time of spikes obtained by having shorter connections is probably insignificant (a few hundred microseconds, with the most favourable geometry). A more attractive possibility is that patchiness might improve the efficiency of the search for synaptic targets by a growing axon, supposing that the patches contain a higher concentration of appropriate cell types. This cannot be dismissed lightly, especially as the existence of circuitous paths taken by many axons hints that making the correct connections may be a difficult task. However, the criteria arrived at in this article should help to distinguish this goal from economy of wiring, and this in turn may give some insight into the engineering problems that have constrained cortical evolution.

Selected references
1 Braitenberg, V. and Schüz, A. (1991) Anatomy of the Cortex, Springer-Verlag
2 Foh, E., Haug, H., König, U. and Rait, A. (1973) Microsc. Acta 75, 148-168
3 Cowey, A. (1979) Q. J. Exp. Psychol. 31, 1-17
4 Mitchison, G. J. and Durbin, R. M. (1986) SIAM (Soc. Ind. Appl. Math.) J. Alg. Discrete Methods 7, 571-583
5 Durbin, R. M. and Mitchison, G. J. (1990) Nature 343, 644-647
6 Nelson, M. E. and Bower, J. M. (1990) Trends Neurosci. 13, 403-408
7 Mitchison, G. J. (1991) Proc. R. Soc. London Ser. B 245, 151-158
8 Hubel, D. H. and Wiesel, T. N. (1977) Proc. R. Soc. London Ser. B 198, 1-59
9 Brodmann, K. (1909) Vergleichende Lokalisationslehre der Grosshirnrinde in ihren Prinzipien dargestellt auf Grund des Zellenbaues (Principles of comparative localization in the cerebral cortex presented as the basis of cytoarchitecture), Leipzig, Barth
10 Zeki, S. M. (1978) Nature 274, 423-428
11 Van Essen, D. C. (1985) in Cerebral Cortex (Vol. 3) (Peters, A. and Jones, E. G., eds), pp. 259-379, Plenum Press
12 Crick, F. and Asanuma, C. (1987) in Parallel Distributed Processing (Vol. 2) (McClelland, J. L. and Rumelhart, D. E., eds), pp. 333-371, MIT Press
13 Braitenberg, V. (1978) in Architectonics of the Cerebral Cortex (Brazier, M. A. B. and Petsche, H., eds), pp. 443-465, Raven Press
14 Caviness, V. S. (1975) J. Comp. Neurol. 164, 247-264
15 Felleman, D. J. and Van Essen, D. C. (1991) Cereb. Cortex 1, 1-47
16 Hillman, D. E. (1979) in The Neurosciences, Fourth Study Program (Schmitt, F. O. and Worden, F. G., eds), pp. 477-498, MIT Press
17 Weiss, P. A. and Mayr, R. (1971) Acta Neuropathol. (Suppl. 5), 198-206
18 Freund, T. F., Martin, K. A. C., Soltesz, I., Somogyi, P. and Whitteridge, D. (1989) J. Comp. Neurol. 289, 315-336
19 Livingstone, M. S. and Hubel, D. H. (1982) Proc. Natl Acad. Sci. USA 79, 6098-6101
20 Tootell, R. B. H., Silverman, M. S., De Valois, R. L. and Jacobs, G. H. (1983) Science 220, 737-739
21 Blasdel, G. G. and Salama, G. (1986) Nature 321, 579-585
22 Humphrey, A. L. and Hendrickson, A. E. (1980) Soc. Neurosci. Abstr. 6, 315
23 Horton, J. C. and Hubel, D. H. (1981) Nature 292, 762-764
24 Gilbert, C. D. and Wiesel, T. N. (1983) J. Neurosci. 3, 1116-1133
25 Martin, K. A. C. and Whitteridge, D. (1984) J. Physiol. 353, 463-504
26 Livingstone, M. S. and Hubel, D. H. (1984) J. Neurosci. 4, 2830-2835
27 Valverde, F. (1985) in Cerebral Cortex (Vol. 3) (Peters, A. and Jones, E. G., eds), pp. 207-257, Plenum Press
28 Kisvárday, Z. F., Martin, K. A. C., Friedlander, M. J. and Somogyi, P. (1987) J. Comp. Neurol. 260, 1-19
29 Somogyi, P. (1979) J. Physiol. 296, 18-19
30 Waxman, S. G. and Bennett, M. V. L. (1972) Nature 238, 217-219
2d Aspects of Biocomputation
The unity of this section is not manifest a priori; it is to be found in the diversity, i.e. in the complementarity of the approaches presented by the three authors, who are experts respectively in immunology, psychophysics and neuropsychology. Within a human body reside two conspicuously intelligent systems, with a capacity for learning: the nervous system and the immune system. The article by Niels Jerne (25) presents elegantly some similarities, and also differences, between these two smart systems. (Note in passing how a capacity for learning appears naturally as an important criterion for intelligence; the liver does a great processing job, but it does not hold a high reputation as an intelligent organ.) Jerne's section on "cellular dynamics in immunology" is a bit technical, but the following one on "instruction versus selection in learning" addresses issues that are central in biocomputation. One fact that puts brains (especially human brains) apart is that neuronal cells cease to reproduce early in life. Thus the notion of reproductive success, which is one main ingredient of Darwinian evolution theory, is missing in brain life. Intercell connection strengths change, not cell population sizes. This is one sufficient reason to take doctrines of "neural Darwinism", as advocated by some in analogy with evolution and immunology, with a grain of salt. Jerne's thoughtful analysis of the selection-versus-instruction debate should also alert the reader against statements that describe perception and learning as "resonance" between external objects and internal "prerepresentations". The whole point in the selection-versus-instruction debate, as explained by Jerne, lies in the accurate assignment of the selective level: there is always one selective level, and it need not be the most interesting one. The review of Bela Julesz (article (26)) on early vision and focal attention was specially written for physicists who are novices in brain research; thus, it provides appropriate historical perspectives, and a commented list of references. The author's invention of random-dot stereograms, used for the study of binocular vision, has opened a vast new field for quantitative
psychological investigations, and it has led to numerous fundamental and applied results. The study of stereopsis, followed by a discussion of preattentive texture discrimination and of the serial process of focal attention, provides ground for a wealth of casual side remarks that touch on almost all the issues evoked in the subsequent Chapters 4 and 5. The text of Larry Weiskrantz (paper (27)) on neuropsychology and the nature of consciousness provides a brilliant and entertaining survey of some of the confusion that continues to surround the word "consciousness". He succeeds, though, in clarifying some core components of this compound concept, with a presentation of amnesia and blindsight syndromes in humans, and of several pertinent animal observations. His "monitoring" theory of consciousness is germane to Hebb's views, as presented in the last article (68) of this volume.
Reprinted with permission from The Neurosciences: A Study Program, pp. 200-205, 1967 © 1967 Rockefeller University Press
Antibodies and Learning: Selection versus Instruction
NIELS KAJ JERNE
UNTIL LESS THAN ten years ago, there was an almost unan-
imous consensus among immunologists that antibody formation was equivalent to a learning process in which the antigen played an instructive role. The main basis for this belief was that the number of different antibody specificities, or the number of different antibody molecules that one animal can produce, is so large that it would be impossible for a cell nucleus to accommodate genes for this entire range of potentialities for protein synthesis. The number of different antigens is immense. Every species of animal, for example, must have several species-specific antigens. Against any one of these millions of antigens the immune system of one individual animal can produce a specific antibody. Therefore, the argument went, the number of different antibodies that an animal can produce
must be virtually unlimited. Furthermore, the work of Landsteiner and his school had shown that an animal produces antibodies even against artificially synthesized substances (haptens) that were made in a chemical laboratory and had never before existed in the world. The immune system of an animal could not possibly have anticipated the arrival of such antigens and must therefore have been "instructed" by the antigen itself in the formation of antibody. The instructive mechanism proposed was that the antigen, after having entered a competent cell, guides the tertiary folding of polypeptide chains into globulin molecules, thereby imposing upon those molecules a conformation complementary to a surface region on the antigen.
Instruction versus selection in antibody formation
NIELS KAJ JERNE, Paul Ehrlich Institute, University of Frankfurt, Frankfurt am Main, West Germany
In contrast to this view, a selection mechanism was proposed, based on a logical argument concerning "recognition". The precision of recognition by the immune system
can be illustrated by examining the antigenic properties of the "constant" part of the kappa light chain of human immunoglobulin. This constant portion comprises 107 amino acids (numbered 108 to 214) at the carboxyl end of the light chain, which have been found to be identical in sequence in all individual cases so far examined, except for amino acid number 191. In some individuals, this is valine; in others it is leucine. This difference is known as an allotypic difference. Those individuals that have valine at the 191 position belong to allotype Inv b, whereas those that have leucine at this position belong to allotype Inv a. This allotypic difference was detected by immunological methods involving the formation of allotype-specific antibodies. The immune system is thus capable of recognizing the replacement of one amino acid within a long sequence of amino acids of a protein molecule. It is a characteristic feature of the immune system that an animal does not normally appear to produce antibodies against its own circulating antigens, which do elicit antibody formation when injected into a different animal. How does the animal recognize that an antigen arriving in its tissues is, in fact, an antigen against which antibody should be produced and not one of its own antigens to which no response is desired?
The immune system, then, must not let itself be stimulated to produce antibodies before having recognized that the antigen with which it is confronted differs from its own antigens. In order to recognize its own antigens, which differ among themselves in innumerable ways, the animal would have to possess a large set of self-recognizing molecules, and it would not be able to decide that a given antigen is its own before this entire set had been applied (a scrutiny, moreover, that would have to be interminably repeated). It would therefore seem impossible for the animal to recognize its own antigens. The recognizing agent must recognize foreign antigens, and the obvious molecules to accomplish this task are antibody molecules. It follows that an animal cannot be stimulated to make specific antibodies, unless it has already made antibodies of this specificity before the antigen arrives. It can thus be concluded that antibody formation is a selective process and that instructive theories of antibody formation are wrong. Many immunologists were not convinced by the logical argument presented above, and only because of direct experimental evidence, accumulated during the last three or four years, have instructive theories of antibody formation finally been abandoned. Antibody molecules have been shown to consist of two identical heavy polypeptide chains and two identical light polypeptide chains. It has also been shown that these polypeptide chains are assembled on ribosomes and that the specificity of an antibody
molecule is determined by the primary structure of its polypeptide chains. This leaves no room for instructive action by the antigen. Furthermore, it has been demonstrated that certain antibody-producing cells, which turn out more than 1000 antibody molecules per second, contain no antigen. The experimental methods would have detected the antigen if more than ten molecules of antigen had been present per cell. At one ribosomic site, a light or heavy polypeptide chain cannot be synthesized in less than 15 seconds. Therefore more than 100,000 ribosomes in an antibody-producing cell must simultaneously be able to turn out specific chains in the absence of antigens. Although it is thus clear that the antigen plays a selective and amplifying role, we do not yet know by what mechanism the selective stimulus is transmitted. All we know with certainty is that, one day and later after having injected an antigen into an animal, we can find, in its spleen or lymph nodes, cells that are both multiplying and producing specific antibody. The simplest assumption would seem to be that an animal contains, among its large population of lymphocytes (the number depending on species and age), a very large number of subpopulations, each of which is capable of being stimulated by certain antigens to grow, divide, and produce antibody-secreting cells among their offspring (perhaps because on their surface the cells display the antibodies they can synthesize).
11
Cellular dynamics in immunology
This picture can be illustrated by experiments that make use of an agar plaque method for counting, in a cell suspension obtained from a mouse spleen, the number of cells secreting a certain antibody. A certain amount of such a suspension (say 10* mouse spleen cells, as well as 4 × 10* sheep red blood cells, SRC) is added to 2 milliliters of fluid 0.7 per cent agar at 45°C. The mixture is immediately poured into a petri dish, where it solidifies into a layer less than 1 mm thick. Each spleen cell is now surrounded by many SRC in a fixed position in the agar layer. If a spleen cell, during incubation of the petri dish at 37°C, secretes antibody molecules directed against an antigen of the SRC surface, the molecules diffuse into the agar and become fixed to the SRC in the immediate surroundings. Red blood cells, to the surface of which an antibody has attached, are said to be sensitized. Such sensitized cells will lyse in the presence of a serum factor called complement. By flooding the petri dish with complement after one hour's incubation, the sensitized SRC will lyse and lose their hemoglobin. Thus, around each mouse spleen cell that secretes hemolytic antibodies against sheep red blood cells, a pale plaque, visible to the naked eye, appears. Microscopic observation reveals the antibody-producing
6
lymphocyte, or plasma cell, in the precise center of each plaque. The spleen of an untreated eight-weeks-old inbred mouse contains about 80 plaque-forming cells (PFC) among a total of about 1.5 × 10 spleen cells. These 80 PFC produce antibody against sheep red blood cells, although the mice have never experienced sheep antigen. As always in immunological observations, there is a great variation among individual animals. The normal level of 80 PFC is an average. Among a group of 50 apparently identical mice, the normal level could range from, say, 10 to 500 PFC.
17
s
We now give each of a few hundred mice one injection of 4 x 10 SRC into a tail vein. Every day we sacrifice 20 mice and determine the total number of plaque-forming cells in each spleen. Twenty-four hours after the antigen injection, the number of PFC Starts to rise above the n o r mal level, proceeding exponentially to reach an average of 10 PFC per spleen at four days, after which there is a rapid decline. T
6
The following t w o observations, appear to support the assumption rhat the exponential rise in PFC between day one and day four reflects cell multiplication. If, on day two, an animal is given one microgram of colcimid per gram body weight and is killed three hours later, about 20 per cent of the PFC in its spleen is found to have been arrested in metaphase of mitosis. If, on day three, the spleen cells are suspended for 30 minutes in vitro in a medium containing tritiated thymidine, about 55 per cent of the PFC can be shown by autoradiography to have synthesized DNA, whereas the remainder have not. The rale of appearance of PFC after an intravenous dose of 4 X 10 SRC corresponds to a cell-doubling rime of seven hours. Although smaller doses of SRC evoke smaller responses, it is not possible to obtain a larger response than that elicited by 4 X 10 SRC, even i f the dose is increased to 4 X 10* or 4 X 10* SRC. Each dose within this hundtedfold range produces the same maximum response, indicating that all cells capable of being stimulated by sheep red blood cell antigen are maximally engaged. T
T
The experiments described above can be repeated with similar results i f rabbit ted blood cells are used as antigen. The antibodies produced by mice against SRC and tabbit red blood cells do not cross-react. Also, the PFC that arise after a mouse has been injected with SRC do not form plaques in agar with rabbit red blood cells, and vice versa. Furthermore, the following experiment shows that the class of cells in the mouse spleen initially stimulated by an SRC injection is different from the class of cells initially stimulated by a rabbit red blood cell injection. The number o f PFC against rabbit red blood cells appearing after a single injection of 10* rabbit red blood cells is the same.
whether or not 10* SRC are injected simultaneously. The two types of antigen clearly do not compete for rhe same target. In summing up, we can conclude that the mouse spleen possesses, among its more than 10* cells, small classes of cells that can be stimulated to grow and divide by particular antigens, and that anribody-secreting cells arise from these subclasses by cellular proliferation. We can now try to estimate the number of cells in a mouse spleen that belong to the class char can respond to SRC antigen. We have reasons to believe that these cells are nor the PFC that fotm the normal level in nonstimulared animals. First, the normal level PFC are mostly plasma cells, whereas the cells that respond to a primary antigen stimulus are probably small lymphocytes. Second, the magnitude of rhe response of individual mice appears unrelated to their normal level of PFC. We therefore do not believe that the normal PFC belong to the class of cells that can respond to a primary stimulus of SRC antigen, nor that they become the ancestors of the PFC arising after a stimulus. The exponential curve describing the appearance of PFC extrapolates below the notmal level of PFC at less than 24 hours after the sheep red blood cell stimulus, suggesting rhat the average size of the class of responding cells might be less than 80. Other experiment appear ro leave little doubt, however, that the size of the class of initially responding cells is of the order of several thousand. These experiments involve (1) the exponential decay of this class of cells after increasing exposure of nonstimulated mice to X-radiarion, and (2) the transfer of a small fraction of the cells from a normal mouse spleen ro the spleen of a mouse that has been rendered immunologically incompetent by X-irradiation. Thus, only a small fraction of the immediate descendants of the initially responding cells arc PFC, i.e., secrete an antibody that can cause lysis of sheep red blood cells. The initially responding cells and most of their immediate descendants might then display or secrete antibodies of a degree of specificity that enables the antigen to stimulate these cells to divide, but which is not serologically recognizable- This might also explain the finding that certain serologically unrelated antigens, such as different Salmonella flagellar antigens and different protein subunits of lactodehydrogenase, can induce immunological tolerance with respect to each other. In both of these cases, tolerance may be due to the removal of a class of initially responsive cells that display cross-specific antibodies not detected by serological methods. The picture that emerges for the initial stages of antibody formation is that the antigen first selects a class of initially responding cells. These are stimulated to dif-
280
J3
l,,!l
11
H
203 ferentiatc and to divide. For each division they may require a new antigenic stimulus. All daughter cells do not necessarily produce the same antibody, and the antigen preferentially stimulates those descendant cells thai produce rhe antibody best fitting the antigen. This brings me to another relevant immunological phenomenon, rhe increase in "avidity" of the antibodies produced during the lime following an antigenic stimulus, in the early days of immunology, the term avidity was introduced to express rhe degree of firmness of the bond formed between serum antibody and antigen in vitro. I f the antigen dissociated easily from the antigen-antibody complex on ddution, the antibody was said to be of low avidity. Avidity is thus a measure of the goodness of fit o f antibody toward the antigen. It was shown that the antibody present in ammal scrum is a heterogeneous population of molecules varying widely in avidity, and that the average avidity of the antibody present in an animal increases after repeated antigenic stimulation, or even with time after a single primary stimulus, *Recent studies have confirmed that antibody molecules produced by one animal shortly after a single primary antigenic stimulus have a lower avetage association constant with tespcel to the antigen than later, and that this applies to a single class of antibody (IgG) with respect to a single antigenic determinant.** This indicates that a further selection of cells producing better-fitting antibody takes place among the descendants of the cells first stimulated after the initial antigenic stimulus. Avidity does not increase as quickly after a large dose of antigen as after a small dose. As might be expected, a large dose of antigen is less selective, and in extrapolation we might relate tolerance following excessive doses of antigen to the absence of progressive selection. The immunological tolerance observed after repeated minimal doses of antigen might be caused by stimulation of the entire class of available responding cells, followed by a decay of their descendants because of the absence of antigen needed for further stimulation. Trie assumption underlying the above discussion, that the descendants of the initially stimulated cells require further antigenic stimulation in order to continue to multiply, is supported by the finding that the exponential rate at which PFC appear in the spleens of mice after one i n travenous injection of SRC decteases with decreasing antigen doses. Thus the doubling rimes of PFC per spleen after one dose of 4 X 10', 4 X 10*, 4 X 10 , and 4 X 10* SRC are 7, 9, 21, and 36 hours, respectively. The simplest explanation of this remarkable fact is that the cells require, and must therefore wait for, a new antigenic stimulus for each division. In the picture developed above, a continuing selective 5
16
s
role has thus been assigned to the antigen. Among the general population of antigen-responsive cells, a particular antigen first selects the small class of cells that can respond to the primary presence of that antigen. These cells are stimulated to grow and divide. A secondary selection by the antigen then takes place among the differentiating descendants of these cells, resulting in the production of increasingly better-fitting antibodies. Instruction versus selection in learning In accordance with this general picture of rhe selective role of antigen in antibody formation, the view that antigen acts instructively at the intracellular level has been abandoned. It may be useful, therefore, to examine in a broader biological context whether there exist situations in which instructive mechanisms operate or to which the term "instruction" is applicable, k would seem rhat an answer to this question tequires a specification of rhe organizational level at which a process is described. Thus, although the mechanism by which an antigen brings about antibody formation in the tissues of a mouse must be purely selective, the antigen does not select a mouse. When viewing the situation at the level of the entire mouse, we may still say that the antigen "instructs" the animal to produce an adequate antibody. Similar reasoning can be applied to examples from other areas of biology. For instance, in the case of the selection by si re pio my cin of streptomycin-resistant mutants among a population of bacteria, it is clear that the streptomycin molecules did not cause these mutants to arise. They were already present before the streptomycin arrived; no instructive role can therefore be assigned to streptomycin. On the level of the entire bacterial culture, however, we may still say that streptomycin instructs the transition to streptomycin resistance. Let us consider a more complicated example of Darwinian selection, A large population of brown moths spend a major part of their time sitting on a factory wall of the same color. These moths are rhe prey of certain birds. Now the wall is repainted white. One or two years later we observe that the moths sitting on the wall ate likewise white. In this case, the signal that entered into the system, t.c,, the color change, was not even received by the moths, but by the birds. The mechanism by which the color change in the moths came about was obviously selective, in that moths of lighter color were already present among the original population before the signal arrived- Again we might say. however. On rhe level of the entire system, that the signal "instructed" the population of moths to mimic the color change, A clear example of an instructive process would be the role of messenger RNA in protein synthesis. The mes-
281
204 scngcr RNA molecules arriving in their ribosomal habitat do not select already existing protein molecules and may therefore be said, at the organizational level of protein, to play an instructive role. The messenger RNA docs recognize and select, however, already available su bun its, namely, species of amino acid-charged rtansfer RNA- At this lower level, therefore, the process is a selective one. J will finally turn to the question of the analogies between the immune system and the centra] nervous system. Both systems have a history that develops during the lifetime of the individual. Each antigen that mates its appearance irreversibly changes the immune system. In the same way, the state of the central nervous system reflects the experience of the individual. The immune system appears to be learning by responding to antigens entering from rhe outside world. The central nervous sysrem also appears ro learn in response to sensory signals. Like the central nervous system, the immune system appears lo have a memory that enables it to benefit from previous experience. It produces more and better antibodies i f the antigen enters a second time, or repeatedly. The experience gathered by the immune system of an individual cannot be transferred to its progeny. As with the central nervous system, each newborn must start, so to speak, from scratch. In the remaining, specularive part of this paper, I shall try ro make the most of these analogies. Let us consider the kappa light chains of the antibodies of mice and man. Each of these light chains consists of a "constant" sequence of 107 amino acids and a "variable" sequence of 107 amino acids. The Constant part of human light chains is identical in all individuals and m all antibodies they produce. It differs, however, from ihe constant part of mouse light chains by some 40 amino arid substitutions that have obviously arisen by mutation during phylogeny. The variable part of human light chains, on the contrary, differs berween different antibody molecules of one individual, and the differences are similar in nature to the differences between mouse and man in the constant part. This is reminiscent of the old saying that ontogeny mimics phylogeny: phylogenic differences between species in the constant part of the light chain are mimicked by the ontogenic plasticity of the variable part." 30
Similarly, in the central nervous system, instincts arc fixed in one species, but each individual (particularly man) has also a plasticity in learning capacity, which mimics the total of all phylogenically developed instincts of different species. In the immune system, rhe constant part of the light chain is obviously laid down in rhe DNA of the zygote, and it is equally cleat that there is DNA in the
zygote that represents the variable part of the light chain, although, ontogcnically, this DNA may exhibit an immense plasticity. In the central nervous system, instincts are also obviously encoded in the zygote, most probably in the DNA. But if DNA acrs only through transcription into RNA and translation into protein, and if the phenotypic expression of instincts is based on particular a rrangemenrs of neuronal synapses, then DNA through RNA and protein must govern the synaptic nerwork in the central nervous system. Analogous to the utilization of the diversity of the variable part of the antibody light chain in the immune system, it would seem probable to me that, in the central nervous system, learning from experience is based on a diversity in certain parts of the DNA, or to plasticity of its translation into protein, which then controls the effective synaptic network underlying the learning process. I would, therefore, find it surprising if DNA were not involved in learning, and envisage that the production by a neuronal cell of certain ptoteins, which I might call "synap[obodies,' would permit thaE cell to enhance or depress certain of its synapses, ot to develop othets. Pursuing these analogies even further, we might now ask whether one can distinguish berween instructive and selective theories of learning in the central nervous system. Looking back into die history of biology, it appears that wherever a phenomenon resembles learning, an instructive theory was first proposed to account for the underlying mechanisms. In every case, this was later replaced by a selective theory. Thus the species were thought to have developed by learning or by adaptation of individuals to the environment until Darwin showed rhis to have been a selective process. Resistance of bacteria to antibacterial agents was thought to be acquired by adaptation, until Luria and Delbrilck showed the mechanism to be a selective one, Adaptive enzymes were shown by Monod and his school to be inducible enzymes arising through the selection of pre-existing genes.^ Finally, antibody formaEion that was thought to be based on insttuction by Ehe antigen is now found Eo result from the selection of afteady existing patterns. It ihus remains to be asked i f learning by the central nervous system might not also be a selective process; i.e., perhaps learning is not learning either. Several philosophers, of coutse, have already addressed themselves this point. John Locke held that the brain was to be likened to white paper, void of all characters, on which experience painEs with almost endless varicEy," This represents an instructive theory of learning, equivalent to considering the cells of the immune system void of all characters, upon which antigens paint with almost endless variery.
282
h
11
205 Contrary to this, the Greek Sophists, including Socrates, held a selective theory of learning. Learning, they said, is clearly impossible. For either a certain idea is already present in the brain, and then we have no need of learning it, or the idea is not already present in the brain, and then we cannot learn it cither, for even if it should happen to entet from outside, we could not recognize it. This argument is dearly analogous to the argument for a selective mechanism for antibody formation, in rhat the immune system could not recognize the antigen i f the antibody were not already present. Socrates concluded that all learning consists of being reminded of what is pre-existing in the brain. * 3
Summary In concluding this analysis, it would seem that selection refers to a mechanism in which the product under considetation is alteady ptesent in the system prior to the arrival of the signal, and is thus recognized and amplified. Each system that is capable of receiving a signal, however, is subject lo instruclion by this signal. Thus, at the level of an cntite systemj all such processes are instructive, whereas all instructive processes al a lower level imply selective mechanisms. In learning, and in all processes resembling learning, a discussion of instruction versus selection serves only to determine the organizational level of the elements upon which selective mechanisms operate. During recent years the belief that antigen plays an instructive role in antibody formation by intracellular guidance of the formation of the tertiary structure of globulin
molecules has been replaced by the idea that antibody formation is based on a selective process in which antigen selects pre-existing patterns and causes molecules representing these patterns to be produced at increased rates. The logical arguments for selection have been enforced by experimental evidence showing that the general mechanism of protein biosynthesis also applies to antibody production, rhat primary polypeptide structure determines antibody specificity, and that plasma cells can produce antibody in the absence of intracellular antigen. The antibody response appears to depend On multiplication of cells of the immune system. Attempts are being made to describe the cellular dynamics involved and to understand the nature of the antigenic stimulus. The teplacement of instructive by selective theories appears to be a general trend in the development of biology. A number of analogies arc drawn between the central nervous system and the immune system, and the question is posed whether a selective mechanism may also underlie the learning process. An analysis of this question leads to the conclusion lhat the terms instruction and selection can apply to descriptions of the same process at different levels. Each system that is capable of receiving a signal is subject ro instruction by rhis signal. Thus, at the level of an entire system, all such signals are instructive, whereas all instructive processes at some lower level imply selective mechanisms, through which products that were already present in the system prior to the arrival of the signal ate selected and amplified. In learning, as in all processes resembling learning, a discussion of instruction versus selection serves only to determine the organizational level of the elements upon which selective mechanisms operate.
283
Reprinted wilh permission from Reviews of Modem Physics, Vol. 63. No. 3, pp. 735-772. July 1991 © 199] The American Physical Society
Early vision and focal attention Bala Juiesz Laboratory end Division
of Vision Research, of Biology,
Rutgers
California
University—Kilmer
institute
of Technology,
Campus, Pasadena,
New Brunswick, California
New Jersey
08903,
91125
At the Ihirly-year anniversary of ihe introduction of the technique of computer-generated random-dot stereograms and random-dot cinemalograms into psychology, the impact of rhe technique on brain research and on the study of arlhicial inlelligence is reviewed. The main finding—thai stereoscopic depth perception (slereopsis). motion perception, and prsallenlive lealure discrimination are basically bottomup processes, which occur withoul the help of ihe lop-down processes of cognition and semantic memory—greatly simplifies the sludy of these processes of early vision and permits the linking of human perception with monkey neurophysiology. Particularly interesting arc the unexpected findings thai stcrcopsis (assumed lo be local} is a global process, while tenure discrimination (assumed to be a global process, governed by stBlisticsl is local, based on some conspicuous local features (lextonsl- !t ia shown •hat the lop-down process of "shape (depth! from shading" does not affeel stcrcopsis. and tome of the models of machine vision are evaluated. The asymmetry effect of human lenlurc discrimination is discussed, together with recent nonlinear spatial filler models and a novel extension of the '.'••' > > v f. f»r' i j r r »' - » j r -«-< r c j i M"V i ' J - ' " " N ' - v * - r J ' > <s j < - -v r -» i ' . 2 T t " ' » - v< -' > - j " * V j "i j r s
4
1
V
v
x
i-
t _t', r
>r r r
iV*
v
r
>
r ^ ai±
J J < u.i
u
\ r
I^V
r „, j •>>*
( V « • » * * + «* j r i v< r i j i ^ ^ \5 j c +* + * * * * • <S »- J ' t f t j " ' V i¬ > r * * * ** • * • > ' * ^ "i *. / f f ' i ~ -n* * " \ r J < A 1 ^ - c VA "»*> F j > " ^ -» » J ' y '"^ V J J " -iij i -i • -)i-i' >ji.i.!,ri.r T Jv*t /•> c •"• j /' j -> v f *"vt * r 1 _ r* ' - 1 - r v i>-'V v j >t t , j\ , ^ * v
l
- , i,
v
v
v
1
(s
/ V
s
,
v
>
%
/
A
v
t
< "T-'V
f - i r *
(
i > ' Y ^ J
FIG. 2. Preattenlive (parallel) I C X I U T E discrimination vs serial scrutiny by focal attention. The X's among the L's pop out effortlessly, while finding tbe T's among the L's requires an element-by-element search. From Julesz and Bergen (1983).
288
740
direct, depending on which model is used to interpret the experimental results, the "searchlight of attention" scans about 30-60 msec/item (Sternberg. 1966; Treisman and Gelade. 1980; Bergen and Julesz, 1983; Weichselgartner and Sperling. 19871. Saarinen and Julesz (1991) have recently measured directly the scanning speed of focal attention by briefly presenting (with masking) numbers at random locations, and though observers had difficulty correctly reporting the order in the sequence, they could follow and identify as many as four consecutive numbers at rates of 30 msec/item with orders of magnitude above chance. Obviously this rate depends on the visibility of the texture gradients, and some parallel mechanism seems to facilitate serial search (Krose and Julesz, 1989; Wolfe and Cave, 1990). Because during our waking stales we are bombarded by countless visual objects and patterns in a scene and can only make a few eye or limb movements in a given instance, much of the unwanted information must be filtered out centrally. Furthermore, we cannot store at a given moment more than a limited number of items (called "chunks") in memory, (usually 7±2 chunks), which is another reason why we have to filter the amount of incoming information. It appears lhal. in order to inspect objects and events thoroughly, focal attention is needed. It seems as i f there are only a very few processes (perhaps only one) of the highest level that can count as well as observe consciously and in great detail objects, patterns, and events, and i l is focal attention that does the preselection and presents the selected patterns and events item-by-item to the highest levels. Indeed, we found (Sagi and Julesz, 19851 that observers were able to find the positions of texture gradients without scrutiny—they just popped out effortlessly land did not depend on the number of elements in the arrays). However, to identify the features on ihe two sides of the texture boundaries (whose differences yielded the lexture gradients) required serial search by focal attention (and the search increased monotonically with the number of elements). The question of what can we perceive preattentively while our focal attention is engaged (e.g., our ability to identify at some specific position a character and to identify as well some characteristic feature at a texture gradient) is a strategic problem, currently being studied by Braun and Sagi (1990). Preliminary results indicate that, for instance, some crude features of position can be perceived preattentively, bul fine positional information requires scrutiny by focal attention. Besides focal attention, which scans objects about five times faster than one can do with eye movements, there is another way to increase information intake. This is by learning increasingly complex chunks. This kind of learning is a high-level process, beyond the scope of my interest; it is in the domain of cognitive psychologists to study this important mental phenomenon. Here I only explain the main idea of "chunking." If ihe reader is very briefly shown, say. five pebbles, probably their number can be guessed without error, while wilh more peb-
289
bles mistakes will be made. However if the pebbles are arranged in chunks of regular pentagons, the reader can count up to five pentagons without errors, and thus can count 25 pebbles. Whether one can extend Ihis learning of pentagons to increasingly complex chunks of pentagons is a typical question cognitive psychologists like to study. The literature on attention is vast. The previous paragraphs emphasize the single searchlight metaphor that originated with Helmholtz (1896) and was neglected afterwards by cognitive psychologists for some time, as I alluded lo. Another prominent figure of psychology, the American psychologist William James (1890) asked ". . . how many ideas or things can we attend to at once, . . . how many entirely disconnected systems or processes of conceptions can go on simultaneously; the answer is, not easily more than one, unless Ihe processes are very habitual, but then two, or even three, without very much oscillation of the attention." James's question opened up an entirely new field of inquiry called divided attention. This is characterized by the metaphor of finite resources (or limited capacity). According to this melaphor the resources can be regarded as a "fuel," and one can change Ihe amount of fuel allocated to various multiple (usually double) tasks. In -i way, the question of what resources are left for perceiving certain features when focal attention is engaged belongs to the problem of divided attention, bul in a more concrete sense than usually asked by cognitive psychologists. (The Saarinen and Julesz experiment discussed above—though conceived in the spiril of the searchlight metaphor—can support equally well the melaphor of divided attention.) The interested reader might consult [he book edited by LeDoux and Hirst (1986), which contains a critical debate between psychologists and neurophysiologists, with four articles devoted to attention. The end result is somewhat disappointing, since there is no real consensus between workers in the two disciplines. However, the articles review the many theories of attention, including the bottleneck and variable filler theories by Broadbent (19581 and his many followers. For a recent review on some outstanding findings in cortical neurophysiology of attention, sec the article by Desimone and Ungerleider (1989). C. Structuralist versus Gestalt theories Having introduced the basic concepts above, we return to the definition of early vision. Conceptually defined, "early vision" should be identical to the pure bottom-up visual processes depicted in Fig. J, without being influenced by the top-down stream of semantic information. Neurophysiologically defined, "early vision" should correspond lo the first neural processing stages in the retina and the visual cortex. Psychologically defined, "early vision" should encompass a range of perceptual phenomena that can be experienced by humans in the absence of higher cognitive and semantic cues. In the next
741 sec lion, we shall see thai with the techniques of computer-generated random-dot stereograms (RDS), random-dot cinematograms (RDC), and texture pairs with controlled statistical properties (Julesz, 1960, 1962), it is possible to show that stereoscopic depth perception, motion perception, and preattentive texture segregation can occur in humans without the mysterious cues of form and Gestalt. This in turn permits us lo link these mental phenomena to the neurophysiological findings obtained in the last three decades by probing with microelectrodes individual neurons in the early processing stages of the monkey cortex. My structuralist approach of treating early vision in a thermodynamic fashion does not mean that T regard human vision in its entirety as amenable to such treatment. In real-life situations, bottom-up and top-down processes are interwoven in intricate ways, and the slogan of the Gestalt psychologists that "Ihe whole is more Ihan Ihe sum of its parts" — a negation of the structuralist view of science—is probably true. Indeed, in Fig. 3 it is obvious that in the right upside-down image the eyes and mouth of the face have been manipulated. Because we are not familiar wilh inverted faces, the original face (left side) and the right one appear quite similar. Turning the page upside-down, so that the faces are now correctly seen, we
experience a dramatic difference; (he untouched face appears normal while the manipulated face looks grotesque! Obviously the mouth, the eyes, etc., are not simple building blocks of a perceived face; instead some global and highly complex interactions between them exist, and the concatenations of these parts into a Gestalt make the study of form recognition so frustrating at present. What 1 am suggesting is not that Gestalt phenomena be overlooked, but instead that an entire subfield of vision— early vision—be experimentally isolated and studied by the proven structuralist methods of the physicists. [For an epistemological review on the limits and merits of structuralist models in psychobiology, see Uttal (1990).] While I assure the reader that I stick to the structuralist (reductionist) paradigm throughout this review, 1 acknowledge the important contributions that Gestalt ("configuration") psychologists have made in the first part of this century, from the principles underlying figure-ground organizations to perceptual grouping. One reason why their popularity waned was lhat their belief in electric brain fields whose convergence loward minimum-energy states was not supported by results gained from single-mieroelectrode neurophysiology. However, some of the Gestalt ideas have resurfaced recently under the guise of connectionist neural networks;
FIG. 3. Demonstration of Gestalt- The upside-down pictures appear rather similar in appearance, in spite of the fact that in one picture the eyes and mouth seem 10 be inverted. When the page is turned upside down, the two faces reveal a dramatic difference as a result of Gestalt organization. From Julesz (1984) after an idea of Thompson (1980!.
the interested reader should see the brief and clear article of Rock and Palmer (1990). My own view on connectionist neural networks in early vision is given in Sec. III.D; however, I would not be surprised to see a neo-Gestalt revival explaining higher visual functions in the not-too-distant future.
II. ON THE CREATIVE PROCESS AND SCIENTIFIC BILINGUALISM

A. The manifold view of scientific interaction and conjugacy, an approach to discovery

From the Introduction it is quite clear that the study of the human cortex is so complicated that a gamut of disciplines—including sensory, perceptual and cognitive psychology, neurophysiology, neurology, neuroanatomy, embryology, neuropharmacology, mathematics, engineering, information theory, neural network theory, physics, and so on—is necessary to study its workings. The question arises of how any individual can cope with such a variety of different fields. A good answer is given by Michael Polanyi (1969), as follows: Even mature scientists know little more than the names of most branches of science. . . . The amplitude of our cultural heritage exceeds ten thousand times the carrying capacity of any human brain, and hence we must have ten thousand specialists to transmit it. To do away with the specialization of knowledge would be to produce a race of quiz winners and destroy our culture in favor of a universal dilettantism. . . . But how can anybody compare the scientific value of discoveries in, say, astronomy with those in medicine? Nobody can, but nobody needs to. All that is required is that we compare these values in closely neighboring fields of science. Judgments extending over neighborhoods will overlap and form a chain spanning the entire range of sciences.
The spread of knowledge between overlapping scientific disciplines, according to this "manifold view," occurs naturally, similarly to the cooperation between members of a beehive. The only requirement is that specialists working, say, on the brain not have too narrow specializations, so they can indeed communicate with other specialists in overlapping areas of shared knowledge. Such cooperation can result in novel applications or technical breakthroughs. For instance, in the forties, electrical engineers started to develop special low-noise amplifiers that enabled neurophysiologists to record spike potentials in individual neurons. Another example is the development of surgical techniques to reduce epileptic seizures by splitting cortical hemispheres; these techniques enabled Sperry (1982) and his collaborators to study the mental competence of each separate hemisphere. For an up-to-date review of hemispheric localization, see Gazzaniga (1989). One may well ask whether there are ways to make this slow accumulation of knowledge more directed or conscious, thus accelerating progress. This question is intimately related to the essence of scientific creativity. While there must be many ways to create a new paradigm or get a novel insight, here I shall briefly discuss "conjugacy" and particularly "scientific bilingualism" as two approaches that can be used to advance science in general and brain research in particular. The first approach, conjugacy, is the "trick" of establishing an equivalence relation between a difficult or unexplored task (operation) and a familiar one, whose solution is already known. This is depicted in Fig. 4, where the difficult task O is to transport an object from point A to B through an impenetrable obstacle (wall). A possible way to complete this task is to drill a shaft S from point A, drill a tunnel T under the wall (assuming that drilling is a routine operation), and finally drill an inverse shaft S⁻¹ to point B. One case in point is the facilitation of the operation O of multiplication (division) by introducing the logarithmic transformation S and its inverse S⁻¹ that reduces the task to the much simpler operation T of addition (subtraction). Similarly, cross-correlation O of two functions can be reduced to the simple multiplication of the Fourier transforms S of the two functions, followed by taking the inverse Fourier transform. As a matter of fact, when neurophysiologists discovered Mexican-hat-shaped (Laplacian of a Gaussian) receptive field profiles of a concentric circular kind in retinal ganglion cells of the cat (Kuffler, 1953) and elongated field profiles in some orientation in the visual cortex of the cat and the monkey (Hubel and Wiesel, 1960, 1968), it was apparent that these spatial filter responses had to be cross-correlated with the visual image (brightness distribution cast on the retina). [The
FIG. 4. The equivalence relation "conjugacy" (O = STS⁻¹), or how to solve a difficult task by transforming it to an already familiar task.
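A minimal Python sketch of this conjugacy (illustrative only; the function names and signal sizes are arbitrary assumptions) computes the circular cross-correlation of two signals both directly and by the transform-multiply-inverse-transform route of Fig. 4, and checks that the two agree:

import numpy as np

def cross_correlate_direct(f, g):
    # The "difficult" operation O, computed by brute force.
    n = len(f)
    return np.array([np.sum(f * np.roll(g, -k)) for k in range(n)])

def cross_correlate_via_fourier(f, g):
    # Conjugacy: S = Fourier transform, T = pointwise multiplication, then the inverse of S.
    F = np.fft.fft(f)                      # S
    G = np.fft.fft(g)                      # S
    product = np.conj(F) * G               # T, the familiar easy operation
    return np.real(np.fft.ifft(product))   # S inverse

rng = np.random.default_rng(0)
f, g = rng.standard_normal(64), rng.standard_normal(64)
assert np.allclose(cross_correlate_direct(f, g), cross_correlate_via_fourier(f, g))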
[Figure: spatial receptive field profiles and their MTFs, shown in space and in spatial frequency.]
[Figure: texture pair from Julesz (1975), illustrated in (a). The pair becomes . . .]
pairs become indiscriminable), and therefore no linear spatial filters can model human texture discrimination. This essential nonlinearity of human texture discrimination will be demonstrated later in Fig. 17. Before I review recent work with nonlinear spatial filters, I briefly discuss theoretical studies aimed at generating stochastic textures with identical second- and third-order statistics, which led to the concept of textons—the basic perceptual units (perceptual quarks?) of preattentive texture discrimination.

V. FROM TEXTONS TO NONLINEAR SPATIAL FILTERS

A. A brief outline of the texton theory of texture discrimination

In 1962 I asked a combined mathematical and psychological question that has kept many mathematicians and psychologists busy ever since (Julesz, 1962). Because I knew that texture pairs that differed in their first-order statistics would be effortlessly segregated (based on differences in tonal quality) and assumed that differences in second-order statistics could be distinguished from each other (based on differences in granularity), I wanted to study textures with identical Nth-order statistics but different (N+1)th-order statistics. [Here I define Nth-order statistics as the probability that the vertices of an "N-gon" (e.g., a hexagon, pentagon, etc.) thrown randomly on a texture fall on certain N colors.] I wanted to determine the highest N that still yielded texture segmentation and wanted to know what perceptual quality would accompany such discrimination. For example, would texture pairs with identical second-order statistics (hence identical first-order statistics and identical Fourier power spectra) be discriminable, and what would the perceptual difference be? Surprisingly, at that time mathematicians did not know how to create such constrained stochastic textures, but from 1962 to 1975, Slepian, Rosenblatt, Gilbert, Shepp, and Frisch were instrumental in creating iso-second-order random texture pairs whose elements in isolation appear conspicuously different, yet as textures cannot be told apart. The indiscriminable texture pairs depicted in Figs. 12 and 13 were obtained by these efforts. It seemed that iso-second-order textures were so severely constrained globally that the visual system could not tell them apart. However, in 1977 and 1978, colleagues T. Caelli, E. Gilbert, and J. Victor helped me to invent stochastic texture pairs with global constraints of identical second-order (and even identical third-order) statistics that yielded preattentive texture discrimination based on some focal conspicuous features, which I later called textons. Luckily, now that we know what textons are and their role in vision has been clarified, the reader need not take the tortuous mathematical path that led to their discovery (for details see Caelli, Julesz, and Gilbert, 1978; Julesz et al., 1978; Julesz, 1981, 1984). [It can be mathematically proven that the four-disk method of Caelli, Julesz, and Gilbert (1978) is the only one using identical disks that can generate iso-second-order texture pairs in the Euclidean plane, thus permitting a thorough search for dual elements whose aggregates might pop out.] Figure 14(a) demonstrates the first iso-second-order discriminable texture pair we found, using the four-disk method. This figure, together with 14(b) and 14(c), which were generated with the help of the "generalized four-disk method" (in which the disks are replaced one by one by specific symmetric shapes), depicts iso-second-order texture pairs that are preattentively discriminable. Discrimination is based on local features which we named "quasicollinearity," "corner," and "closure" (Caelli, Julesz, and Gilbert, 1978). Figure 14(d) shows iso-third-order textures (Julesz, Gilbert, and Victor, 1978) with the property that any triangle thrown on these textures has the same probability of its vertices falling on the same colors (however, the vertices of probing 4-gons will have different probabilities). As the reader can verify, discrimination is effortless and is obviously not due to computing differences in fourth-order statistics, but rather to elongated blobs of different aspect ratios and orientations. What these textons really are is hard to define. For instance, in Fig. 14(a), besides quasicollinearity there are also more white gaps between these elements, giving rise to antitextons. As I pointed out (Julesz, 1986) it is not only the black (white) textons whose gradients yield texture discrimination but also the white (black) spaces between them, which act as textons too. In essence, we found that texture segmentation is not governed by global (statistical) rules, but rather depends on local, nonlinear features (textons), such as color, orientation, flicker, motion, depth, elongated blobs, and collinearity, to name the most conspicuous ones that are both psychophysically and neurophysiologically accepted as being fundamental. Some less clearly defined textons are related to ends of lines or terminators, which occur in the concepts of "corner" and "closure" and are hard to define for halftone blobs. Particularly important is the realization that—contrary to common belief—texture segmentation cannot be explained by differences in power spectra. On the other hand, it became obvious that instead of searching for higher-order statistical descriptors, the visual system applies some local spatial filtering followed by some nonlinearity, and the results must be averaged again by the next spatial filter stage. This is depicted in Fig. 15, which illustrates how a Kuffler-type unit (instead of a Mexican-hat-function profile, a simpler spatial filter of 2×2-pixel center addition with a 2-pixel-wide surround annulus of subtraction, as shown in the inset) acts on the iso-third-order texture pair of Fig. 14(d), followed by a threshold-taking device (Julesz and Bergen, 1983).
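A minimal sketch of such a filter-threshold-filter scheme in Python (NumPy/SciPy); the toy texture, filter sizes, and threshold below are illustrative assumptions rather than the actual stimuli or parameters of Julesz and Bergen (1983):

import numpy as np
from scipy.ndimage import uniform_filter

def filter_rectify_filter(texture, center=2, surround=6, threshold=0.25, pool=15):
    # Stage 1: crude centre-surround linear filter (small box average minus larger box average).
    # Stage 2: threshold-taking nonlinearity.
    # Stage 3: coarse pooling (second spatial filter) of the rectified output.
    img = texture.astype(float)
    linear = uniform_filter(img, size=center) - uniform_filter(img, size=surround)
    rectified = (linear > threshold).astype(float)
    return uniform_filter(rectified, size=pool)

# Toy texture pair: left half made of single-pixel dots, right half of 3x3 blobs,
# so the two sides differ in element size while containing the same number of elements.
rng = np.random.default_rng(1)
tex = np.zeros((128, 128))
for y0, x0 in rng.integers(4, 120, size=(200, 2)):
    if x0 < 64:
        tex[y0, x0] = 1.0
    else:
        tex[y0:y0 + 3, x0:x0 + 3] = 1.0

pooled = filter_rectify_filter(tex)
print(pooled[:, :56].mean(), pooled[:, 72:].mean())  # the pooled output differs across the texture border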
FIG. 14. Preattentively discriminable iso-second-order and iso-third-order texture pairs. (a) Iso-second-order texture pair that is discriminable due to the local conspicuous feature (texton) of "quasicollinearity." (b) Iso-second-order texture pair that is discriminable due to the local conspicuous feature (texton) of "corner." (c) Iso-second-order texture pair that is discriminable due to the local conspicuous feature (texton) of "closure." From Caelli, Julesz, and Gilbert (1978). (d) Iso-third-order texture pair that is discriminable due to the local conspicuous feature (texton) of "elongated blobs of specific orientation, width, and length." From Julesz, Gilbert, and Victor (1978).
When viewing the output of this nonlinear spatial filter in Fig. 15, our visual system performs a second spatial filtering by separating the two areas of different luminance distributions (that were obtained by the threshold taking). One problem with such nonlinear simple filters is their inability to account for the asymmetry problem of human
texture segmentation. It is known (Julesz, 1981; Gurnsey and Browse, 1987; Treisman and Gormican, 1988) that very often a given texture A pops out more strongly from a background of B than B does from a background of A,
as shown in Fig. 16(a). For many years I was worried that the asymmetry problem of preattentive texture discrimination might depend on top-down processes and complex figure-ground phenomena. Therefore I am glad to report that the asymmetry effect in Fig. 16 can be explained (Williams and Julesz, 1990, 1991) by assuming that the nonlinear operation is the subjective contour phenomenon that "closes the gaps." When we discussed Fig. 6, I pointed out that subjective (also called illusory) contours are extracted in V2 and therefore belong to early visual processes. Figure 16(b) yields a similar asymmetry of texture discrimination to that of Fig. 16(a), even though the orientation of the gaps is not jittered. We shall return to this demonstration in the next section.

B. The asymmetry problem of texture segmentation and nonlinear spatial filters

In 1988, several texture segmentation algorithms were developed based on the use of linear spatial filters
FIG. 15. Demonstration of how a simple local linear filter followed by a nonlinearity (threshold-taking) can segment the iso-third-order texture pair of Fig. 14(d). From Julesz and Bergen (1983).
followed by squaring or some other nonlinear operation (Bergen and Adelson, 1988; Voorhees and Poggio, 1988). (For a definition of spatial filters, e.g., the Laplacian of a Gaussian, or Gabor filters, see Sec. II.B.) For a critique showing that linear spatial filters cannot segment textures, see Julesz and Krose (1988) and Julesz (1990c). Recently Williams and Julesz (1991) showed the nonlinear behavior of human texture discrimination as depicted in Fig. 17. Here the nondiscriminable iso-second-order texture pair, invented by Caelli, Julesz, and Gilbert (1978), is shown on the right side. (This texture pair is one of the rare cases that have iso-second-order statistics without having to rotate the texture elements randomly.) We were able to decompose this texture pair into the sum of a highly discriminable texture pair and a nondiscriminable texture pair, as shown on the left side. The fact that a discriminable texture pair becomes nondiscriminable when a nondiscriminable texture pair is linearly added shows convincingly the violation of the law of superposition for texture discrimination. More recently Fogel and Sagi (1989) and, independently, Malik and Perona (1990) developed texture segmentation algorithms based on the use of local spatial filters (oriented Gabor filters) followed by a quasilocal nonlinear operation (simple squaring in Fogel and Sagi's version and some inhibition between neighboring elements in the Malik and Perona algorithm), with a second spatial filter for final segmentation. It was most impressive that this approach emulated human texture discrimination performance as measured by Krose (1987), but still could not account for the asymmetry effects. Therefore it is of great significance that Rubenstein and Sagi (1990) extended their model by determining the variances of the local texture elements' distributions after the nonlinear stage and found these variances asymmetric, particularly when the orientation of the elements was jittered, mimicking human performance. Their model could account for the textural asymmetries reported by Gurnsey and Browse (1987) and probably will be able to handle some other asymmetries of the kind shown in Figs. 18(a) and 18(b). In Fig. 18 a typical input-output pattern of the Rubenstein and Sagi (1990) algorithm is presented as it segments a texture pair (A in B) and its dual (B in A). It is most heartening that even the textural asymmetry effects that seemed to be based on figure-ground reversals—which in turn depended on unknown top-down processes—can be successfully explained by bottom-up processes modeled by relatively simple nonlinear spatial filters. The Rubenstein and Sagi (1990) model can account for the asymmetry problem by assuming that jitter of line orientation accounts for increase in variance of their filter's output, hence increase in texture discrimination asymmetry. However, the demonstration of Figs. 16(a) and 16(b) clearly shows that in general the asymmetry problem of texture discrimination does not depend on orientational jitter. Indeed, recently Williams and Julesz (1991) extended the texton theory to include illusory con-
?6" psychology. The theory of trichromacy states that any color can be matched to a combination of three basic colors, red, green, and blue, such that the boundary between the selected color and the combination colors becomes minimum lor disappears without scrutiny). (As a matter of fact this theory, postulated by George Palmer in 1777, can be regarded as the first scientific atom theory. years before Dalton introduced atoms into chemistry.)
lours and "fill-in" phenomena between gaps and nearby elements, which could be regarded as antitextons. The filling in of the gaps by subjective contours can account for the asymmetry effect shown in Figs. 17Ial and 17(b), and the fill-in phenomenon between tenure elements can explain many other asymmetries, Antitextons together with the textons extend the theory of trichromacy, the only real scientific theory in
FIG. 16. Asymmetry of texture discrimination. (a) The perception of gapped octagons among closed octagons yields weaker discrimination than vice versa. (b) Similar to (a), but the position of the gaps is not jittered. This does not reduce the asymmetry effect. From Williams and Julesz (1991, in press).
FIG. 17. Demonstration of the nonlinearity of human texture discrimination. Adding a nondiscriminable texture pair to a highly discriminable texture pair renders the latter nondiscriminable, thus violating the law of superposition. From Williams and Julesz (1991, in press).
When I introduced textons into psychology I wished to extend trichromacy to encompass textures as well as colors. I wanted to know whether any texture could be matched to a finite (and not too large) number of textons such that the boundary between any textural array and an array containing a mixture of textons would perceptually disappear without scrutiny. It seems now that this can be achieved. The fact that the gamut of colors can be matched by just three colors is in itself amazing. The finding that the infinitely richer variety of 2D textures could be matched to a mixture of a finite number of textons is even more unexpected! In Fig. 17 it was shown that the law of superposition does not apply for texture elements. Whether the combination of textons themselves is a linear or nonlinear operation is not yet known. Howard L. Resnikoff (1987a), in his interesting monograph The Illusion of Reality, devotes an entire chapter to an early version of my texton theory and argues for the linear superposition of textons. Whether linear superposition still holds for the new texton theory, incorporating antitextons, remains to be seen. For the author, who spent much of his scientific career in search of the elusive texton, it is anticlimactic, yet most satisfying, to find that quasilocal spatial filters can extract texton gradients without having to specify complex concatenation rules between adjacent textons. (I have no doubt that in the near future such filters will mimic human preattentive texture discrimination by incorporating several perceptual operations from the formation of subjective contours to the filling-in of gaps.) The reader familiar with speech research will recognize the similarity between "phonemes" and "textons."
While phonemes were never well specified, and complex computer algorithms are now used to cope with the many rules of their various concatenations in order to segment speech, nevertheless, the crudely defined phonemes permitted the development of phonetic writing, one of the great discoveries of human civilization. Had the development of phonetic speech coincided millennia ago with the invention of supercomputers that could automatically segment speech and talk, the skill of writing might never have developed. Of course, the fact that our voice organs limit the number of phonemes to a few dozen contributed to their universal acceptance. Similarly, the main insight from the texton theory was that, of the infinite variety of 2D textures, only a limited number of textons have perceptual significance and are evaluated quasilocally in effortless texture discrimination. (I use the term "quasilocal" instead of "local," because line segments, closed loops, corners, etc., have some finite dimensions.) Even though superfast computers will soon perform automatic texture segmentation, practitioners of visual skills—painters, designers of instrument panels or advertisements, directors of movies or TV shows—can benefit from the texton theory by the enhanced ability it gives them to manipulate the viewer's eye. Indeed, some of the great artists have instinctively known how to create a strong texton gradient to capture attention or create a texton equilibrium for which time-consuming scrutiny is needed to discover the hidden images. I end this section with a mention of the work of Enns (1986), who made up arrays of little 2D perspective cubes with targets (cubes) biased to be perceived in one kind of 3D depth organization amidst cubes biased in the dual
3D organization. This depth from 3D perspective cues yields preattentive texture segmentation ("pop-out"), with the implication that, in addition to the textons of brightness, color, orientation, and aspect ratio of elongated blobs, flicker, motion, and stereopsis, even perceived depth in 2D perspective drawings might act as a texton. In my belief, some simple, quasilocal rules of 3D perspective, occlusion, transparency, etc., are probably utilized, without the need to invoke top-down processing.

C. Recent psychological and neurophysiological findings in texture discrimination

Perhaps the most important implication of the texton theory was its division of human vision into preattentive and attentive modes of action. Certain texture-gradient-like detections could be performed in parallel without scrutiny, while some other tasks that required identification needed serial search by attention. Recently, Braun and Sagi (1990) lent support to the "two-visual-system" concept by showing that while an observer's attention was loaded (by being asked to identify a letter) it was possible for the observer to carry out simultaneously the detection of texture gradients. Other recent neurophysiological studies seem to support the texton-gradient notion of our perceptual studies. Van Essen et al. (1989) studied the responses of single units in visual areas V1, V2, and MT of the macaque monkey to stationary and moving patterns. In V1 and V2 the presence of a static texture surround (e.g., an array of parallel needles) somewhat decreases the response to a central texture element (single needle) within the classical receptive field if the orientation of the needles in the surround is perpendicular to that in the center. If they are parallel, the neural response is greatly reduced. Another neurophysiological finding is that of Robert Desimone and co-workers (Moran and Desimone, 1985; Desimone and Ungerleider, 1989), who located neurons in V4 whose firing for certain trigger features changes in accordance with the focal attention of the monkey.

D. Learning effects in early vision

One of the main themes throughout this review has been the phenomenological richness of early vision. Without cognitive and semantic cues, rather complex feats of false-target elimination can be performed in stereopsis and movement perception, asymmetry of texture discrimination takes place, subjective contours are formed, and so on. Even long-term memory effects occur in early vision. We have discussed some of the hysteresis effects that accompany the cooperative phenomena of global stereopsis and motion perception. Hysteresis is one of the simplest memory effects, in which some action modifies the outcome of a later response. For instance, Fender and Julesz (1967), using binocular retinal stabilization (by giving the subjects close-fitting contact lenses with mirrors attached), showed that a RDS had to be
brought within Panum's fusional area (i.e., within 6 min arc alignment) to obtain fusion. But after fusion, the left and right images could slowly be pulled apart by as much as 120 min arc without breaking fusion. So, the fusional area depends strongly on the prior perceptual states. Another learning effect can be experienced when one first tries to fuse a RDS with large binocular disparities (Julesz, 1971 gives several demonstrations). At first, it might take minutes to achieve fusion, but even years later one can do it quite easily. This is not real perceptual (cortical) learning, but rather a procedural (cerebellar) learning. When fusing RDS with large disparities, the novice is trying large convergence movements to bring the corresponding areas of the RDS into Panum's fusional area, and it is this unconscious learning of proper vergence movements that is remembered years later (Julesz, 1986a). Some other real cortical learning phenomena of global stereopsis are also reviewed in the previous reference. Here I give only two examples of learning effects in preattentive texture discrimination: First, when one presents some indistinguishable iso-second-order texture pairs that, however, are composed of element pairs with different convex hulls, after several hundred trials they can be effortlessly discriminated (Julesz, 1984). Second, and even more interestingly, Karni and Sagi (1990) report a remarkable long-term learning effect in simple texture discrimination tasks where learning seems to be local in a retinotopic sense. What has been learned must be relearned for each different area of the visual field. Surprisingly, though learning is specific for target location, it is not specific for target orientation, but rather for background element orientation. These authors briefly presented, in an array of horizontal line segments, a few targets of adjacent line segments that were tilted from the horizontal orientation. With small tilts, it took several sessions to detect these targets correctly. This improvement was retained for the next sessions in the same retinal quadrant but was not transferable to other retinal quadrants. Changing the orientation of the targets (from left oblique to right oblique) had no effect on the learned performance. However, changing the orientation of the background array (from horizontal to vertical) obliterated learning. These plasticity effects are of great interest, and it seems that perceptual learning in early vision might be a useful tool in understanding the mysteries of human memory. I conclude this review with some perceptual phenomena that are not just bottom-up, but require semantic memory and other top-down processes for their explanation. I have emphasized throughout this review that most of the perceptual processes in global stereopsis, motion perception, and texture discrimination are essentially bottom-up; their linking to present neurophysiological results obtained in the early cortical areas of V1, V2, V3, V4, or MT is now possible. This is also in agreement with David Marr's view of computational vision's being basically bottom-up. However, in a recent short monograph,
Visual Processing: Computational, Psychophysical, and Cognitive Research, Roger Watt (1988) argues for algorithms in early vision that are under the control of high-level processes and memory. Indeed, in cognition there are many perceptual phenomena that depend on high-level processes, including semantic memory. A well-known example in cognition is the "word-superiority effect," in which the recognition of certain letters is superior when they are contained in an English word to when they are in a nonsense word or presented in isolation. This makes a good deal of sense, since recognition of letters and words is surely a high-level semantic process. However, as Naomi Weisstein and Charles Harris (1974) have shown, the same phenomenon exists in visual perception, where they call it the object-superiority effect. The detection of a line segment of certain orientation was greatly improved if the segment belonged to a line drawing that portrayed a 3D object; it deteriorated if the segment belonged to a random line drawing and was the worst if the line segment was shown in isolation.
Of course, object and form recognition are more complex, high-level processes in which semantics and Gestalt organization play a prominent role. Therefore the experiments of Gorea and Julesz (1990) are of special interest. We converted the object-superiority effect from an identification paradigm into a detection paradigm, as follows: We presented an array of oblique line segments into which three horizontal and one vertical line segments were inserted. These four nonoblique line segments were clumped either in random fashion or representing a primitive human face (two horizontal lines representing the eyes, the vertical line segment between the eyes representing the nose, and the bottom horizontal line segment portraying the mouth). Observers were not aware that occasionally a face was presented, and were only asked to detect any line segment that was not oblique. Surprisingly, observers detected the horizontal and vertical line segments significantly better when they belonged to the face than when they belonged to a random clump (or to a four-line-segment symmetric pattern) that was not a face. I have always assumed that the detection of a line segment in a texture (based on a texture (texton) gradient between adjacent orientation differences) was a simple parallel bottom-up process. And here is a case in which even such a simple perceptual task might depend on top-down processing! I say "might" because the effect is very small (though statistically significant) and only four observers were tried. Because of the importance of this experiment, I would welcome the attempt of others to repeat this study of ours with more observers and perhaps some other experimental design!
VI. CONCLUSION

A physicist reader who only glanced through this article might be surprised by the lack of explicit mathematical equations. Obviously I did not want to bore the
reader with the difficult proofs of iso-Nth-order texture generation, including their ergodicities. Furthermore, the internal structure of the cooperative computer algorithms modeling stereopsis or texture segmentation is much more complex in detailed mathematical notation than the usual differential equations of the Maxwell or Schrödinger kind. Furthermore, physicists have a knack of ignoring "dirty" problems, such as computing the shape of a puddle of spilt milk on a kitchen floor (a favorite example of George Sperling, 1978). [When they are forced to do so, to compute, say, the shape of a plasma in a magnetic bottle, they are confronted with the same difficulties as their colleagues in psychobiology.] Indeed,

(> 0) of the product of the spin and the magnetic field at each site and of the magnetisation, m. Here m may vary between 0 (no correlation) and 1 (completely correlated). The capacity increases with the correlation between patterns from α = 2 for uncorrelated patterns with m = 0 and tends to infinity as m tends to 1. The calculations use a saddle-point method, and the order parameters at the saddle point are assumed to be replica symmetric. This solution is shown to be locally stable. A local iterative learning algorithm for updating the interactions is given which will converge to a solution of given κ provided such solutions exist.
1. Introduction

There has been a lot of recent interest in McCulloch-Pitts (1943) neural networks (Hebb 1949, Little 1974, Hopfield 1982). Analytic results (Amit et al 1985a, b, 1987a, b, Kanter and Sompolinsky 1987, Mezard et al 1986, Bruce et al 1987, Gardner 1986) have been obtained for thermodynamic and dynamical quantities using particular storage prescriptions for the coupling strengths. The storage capacity for the Hopfield model for random patterns is p = 0.14N, while the pseudo-inverse (Kohonen 1984, Personnaz et al 1985, Kanter and Sompolinsky 1987) stores N linearly independent patterns. For very correlated patterns, each with magnetisation m, where 1 - m ~ ln N/N, there is a prescription (Willshaw et al 1969, Willshaw and Longuet-Higgins 1970) which stores of the order of N²/(ln N)² patterns. However, the maximum storage capacity of these networks can be larger. In the random case, the maximum number of patterns is 2N (Cover 1965, Venkatesh 1986a, b, Baldi and Venkatesh 1987) and we will show that this increases for correlated patterns.
The network is defined as follows. Ising spins, S_i = ±1, are defined on each site i, i = 1, ..., N. They are updated according to the rule

    S_i(t+1) = sgn( h_i(t) - T_i )                                          (1)

where S_i(t) is the Ising spin at time t and the internal magnetic field h_i(t) at time t and site i is given by

    h_i(t) = (1/√N) Σ_{j(≠i)} J_ij S_j(t)                                   (2a)
where J_ij is the interaction strength for the bond from site j to site i. The interactions J_ij and J_ji need not in general be equal. The field T_i is a local threshold at the site i
which is fixed in time, and the interactions J_ij are defined so that

    Σ_{j(≠i)} J_ij² = N                                                     (2b)

at each site i. The configuration {S_i} is thus a fixed point of the dynamics (1) provided the quantity

    S_i ( h_i({S_j}) - T_i )                                                (3)

is positive at each site i.
This paper follows a recent letter (Gardner 1987a) and will be concerned with the problem of choosing interaction strengths J_ij such that p = αN prescribed N-bit spin configurations or patterns,

    ξ_i^μ = ±1        μ = 1, ..., p        i = 1, ..., N,
will be stored as fixed points of the dynamics defined in (1). It will turn out, however, that the requirement that each pattern is a fixed point is not sufficient to guarantee a finite basin of attraction, and the stronger condition

    ξ_i^μ ( h_i({ξ_j^μ}) - T_i ) ≥ κ                                        (4)
where κ is a positive constant, will be imposed at each site i and for each pattern μ. Larger values of κ should imply larger basins of attraction. The quantity of interest will be the density of states or the typical fractional volume of the space of solutions for the couplings {J_ij} to (2b) and (4), and this will first be calculated. The volume vanishes above a value α_c of α which depends on the stability κ, and this determines the maximum storage capacity of the network. Secondly, a local iterative algorithm will be given which will converge to a solution of given κ provided such solutions exist. In § 2, the volume will be calculated for uncorrelated patterns, where the thresholds T_i are set equal to zero. For κ = 0, the volume vanishes as α increases towards 2, and this determines the maximum storage capacity in agreement with the known results (Cover 1965, Venkatesh 1986a, b). The upper storage capacity α_c(κ) is calculated and decreases with κ. In § 3, the calculation is repeated for patterns with a fixed magnetisation m and it is shown that the storage capacity increases with the correlation m between the patterns and, in particular, that α_c tends to infinity as m tends to 1 (for κ = 0). The network, therefore, can store more patterns if the patterns are correlated. However, correlated patterns contain less information than random patterns and the information capacity of the network will turn out to decrease slightly with m.
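As a concrete illustration (not from the paper), the following Python sketch computes the per-site, per-pattern stabilities ξ_i^μ h_i^μ under the normalisation (2b) and checks condition (4); the Hebbian example couplings are merely one convenient choice:

import numpy as np

def stabilities(J, patterns):
    # Delta_i^mu = xi_i^mu * (sum_j J_ij xi_j^mu) / sqrt(N), with zero thresholds T_i
    # and the rows of J normalised so that sum_j J_ij^2 = N, as in equation (2b).
    N = J.shape[0]
    fields = patterns @ J.T / np.sqrt(N)   # h_i^mu for every pattern mu and site i
    return patterns * fields

def satisfies_condition_4(J, patterns, kappa):
    return bool(np.all(stabilities(J, patterns) >= kappa))

# Example: Hebb (Hopfield) couplings for p random patterns, rows rescaled to obey (2b).
rng = np.random.default_rng(0)
N, p = 200, 10
xi = rng.choice([-1.0, 1.0], size=(p, N))
J = xi.T @ xi
np.fill_diagonal(J, 0.0)
J *= np.sqrt(N) / np.linalg.norm(J, axis=1, keepdims=True)
print(satisfies_condition_4(J, xi, kappa=0.0))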
The calculation of the typical fractional volume of the space of interactions {J_ij} which solve (4) is done by introducing replicas in this space while the prescribed patterns remain quenched. This is the inverse of what is done in the spin-glass problem (Edwards and Anderson 1975, Sherrington and Kirkpatrick 1975), where the interactions are quenched and the spins are allowed to vary. Since all pairs of spins are connected, the fractional volume can be obtained exactly using a saddle-point method. The integration is over order-parameter variables, and the replica-symmetric ansatz

    M^α = M        q^{αβ} = q
is assumed at the saddle point. The physical interpretation of the order parameter q is similar to that of the Edwards-Anderson order parameter in spin glasses and characterises the typical overlap between pairs of solutions for the couplings. As α increases, different solutions to (4) become more correlated and q increases. In particular, the fractional volume vanishes as q tends to its maximum value, which is 1 (by equation (2b)), and the condition q = 1 therefore determines α_c. The local stability of the replica-symmetric solution is proved in the appendix.
Since explicit solutions for the optimal J_ij are not known, it is necessary to have an algorithm for constructing solutions. In § 4 a local iterative learning algorithm will be defined which is a generalisation of perceptron learning (Rosenblatt 1962, Minsky and Papert 1969) to many threshold functions and to non-zero values of κ, necessary in order to obtain finite basins of attraction. The advantage of this kind of algorithm is that a convergence theorem exists. Provided solutions to equation (4) with fixed κ > 0 exist, the algorithms are guaranteed to converge to one such solution.
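A minimal illustrative version of such a perceptron-type rule in Python (not necessarily the exact update of § 4): whenever a site's stability falls below κ, the couplings into that site are nudged in the Hebbian direction of the offending pattern, and each row is then renormalised to satisfy (2b):

import numpy as np

def learn_couplings(patterns, kappa, eta=0.1, max_sweeps=1000, seed=0):
    # patterns: array of shape (p, N) with entries +/-1; returns (J, converged).
    p, N = patterns.shape
    rng = np.random.default_rng(seed)
    J = rng.standard_normal((N, N))
    np.fill_diagonal(J, 0.0)
    for _ in range(max_sweeps):
        J *= np.sqrt(N) / np.linalg.norm(J, axis=1, keepdims=True)  # enforce (2b)
        fields = patterns @ J.T / np.sqrt(N)
        delta = patterns * fields                                    # stabilities
        bad_mu, bad_i = np.where(delta < kappa)
        if bad_mu.size == 0:
            return J, True                                           # condition (4) holds everywhere
        for mu, i in zip(bad_mu, bad_i):
            J[i, :] += eta * patterns[mu, i] * patterns[mu, :]       # Hebbian correction at site i
            J[i, i] = 0.0                                            # keep the diagonal at zero
    return J, False

xi = np.random.default_rng(1).choice([-1.0, 1.0], size=(40, 100))   # p = 40 patterns, N = 100 spins
J, converged = learn_couplings(xi, kappa=0.5)
print(converged)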
2. Calculation of the fractional volume of interactions for uncorrelated patterns with zero local threshold
In this section, the threshold T_i in equation (1) will be set equal to zero and the ξ_i^μ will be taken to be random patterns. Since the quantity

    ∏_{i,μ} θ( ξ_i^μ (1/√N) Σ_{j(≠i)} J_ij ξ_j^μ - κ )                      (5)
is one if the patterns can be stored and zero otherwise, the fraction of phase space V_T which satisfies (2b) and (4) is given by

    V_T = [ ∫ ∏_{i≠j} dJ_ij ∏_i δ( Σ_{j(≠i)} J_ij² - N ) ∏_{i,μ} θ( ξ_i^μ (1/√N) Σ_{j(≠i)} J_ij ξ_j^μ - κ ) ] / [ ∫ ∏_{i≠j} dJ_ij ∏_i δ( Σ_{j(≠i)} J_ij² - N ) ]

for a given realisation of the random patterns {ξ_i^μ}.
The fractional volume V_T may be written V_T = ∏_i V_i, where V_i is the fractional volume in the space of interactions {J_ij} for fixed i. In the thermodynamic limit, we therefore have to study

    (1/N) ln V_T = (1/N) Σ_i ln V_i.
We now assume that this quantity is self-averaging and it is necessary only to calculate ⟨⟨ln V_i⟩⟩, the average of ln V_i over the quenched distribution of the patterns {ξ_j^μ; μ = 1, ..., p}. This is done using the replica method,
    ⟨⟨ln V_i⟩⟩ = lim_{n→0} ( ⟨⟨V_i^n⟩⟩ - 1 ) / n,

    ⟨⟨V_i^n⟩⟩ = ∫ ∏ dq^{αβ} ∏ dM^α ∏ dE^α ∏ dF^{αβ}
                × exp{ N [ α G_1(q^{αβ}, M^α) + G_2(F^{αβ}, E^α, q^{αβ}) - … ] } × …