COLOUR PERCEPTION
Mind and the physical world
Edited by
RAINER MAUSFELD Institute for Psychology, University of Kiel, Germany
DIETER HEYER Institute for Psychology, University of Halle, Germany
Oxford University Press makes no representation, express or implied, that the drug dosages in this book are correct. Readers must therefore always check the product information and clinical procedures with the most up to date published product information and data sheets provided by the manufacturers and the most recent codes of conduct and safety regulations. The authors and the publishers do not accept responsibility or legal liability for any errors in the text or for the misuse or misapplication of material in this work.
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in
Oxford New York
Auckland Bangkok Buenos Aires Cape Town Chennai Dar es Salaam Delhi Hong Kong Istanbul Karachi Kolkata Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi São Paulo Shanghai Taipei Tokyo Toronto
Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries
Published in the United States by Oxford University Press Inc., New York
© Oxford University Press, 2003
The moral rights of the authors have been asserted
Database right Oxford University Press (maker)
First published 2003
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above
You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer
A catalogue record for this title is available from the British Library
ISBN 0 19 850500 0
10 9 8 7 6 5 4 3 2 1
Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by T.J. International Ltd, Padstow
PREFACE
Colour is a topic of fascination, as well as a source from which to elicit puzzling questions that, almost inevitably, affect every thinking mind. In one sense, colours are in the mind of the beholder; in another sense, it is the world that is coloured. Colours seem to belong to the external world; nevertheless, they appear to have a different status from, say, forms (we have, within rational enquiry, greater difficulty in tying colours to the world as described by physics than we do form). Equally, colours are considered as being subjective in nature; nevertheless, they have a different status from, say, emotional feelings. Colours are Janus-faced with respect to the subjective–objective boundary that has developed in the process of epistemological enquiry and the emerging natural sciences. Colour seems to be an attribute where the physical and the mental are glued together in an intimate and enigmatic way. Not surprisingly, for more than 2500 years, colour has been the most celebrated research topic in our attempt to understand the relationship of the physical and the psychological, the objective and the subjective. Demokritos, Descartes, Berkeley, Hume, Locke, and Kant, as well as Galileo, Newton, Maxwell, and Schrödinger, to mention just a few well-known celebrities of philosophy and physics, all dealt with colour, and thereby tried to fathom out the boundary between the physical and the psychological. Passionate controversies accompanied such enquiries; the most prominent names associated with these are Goethe and Newton, as well as Helmholtz and Hering. These controversies demonstrate that the corresponding issues extend far beyond colour proper. Enquiries into colour have brought to the fore deep divergences regarding the principles and aims of rational enquiry into the nature of perception, and about the limits of naturalistic enquiry into ‘How the mind works’.
Colour, indubitably, is particularly suited as a testing ground for probing the soundness, range, and depth of our theoretical understanding of perception. A book dealing with colour will likely elicit quite different kinds of expectations about its content and intentions. This is because various domains of scientific enquiry centre around colour, as a distinctive feature that is used to classify phenomena and to formulate a cohesive body of corresponding questions. The intellectual landscape of colour-related investigations is extremely variegated, and we can encounter fundamentally different perspectives from which to approach phenomena of colour, each having its proprietary goals and intentions. A portrait of this intellectual landscape has to include:
• physics (where Newton showed that, under certain conditions, the colour of a light spot causally depends on the wavelength composition that reaches the eye);
• perceptual psychology (where the role played by colour within our perceptual system has been studied);
• neurophysiology (where the neural coding of colour is investigated, and where specific retinal receptors that give rise to a three-dimensional receptoral colour code have been identified);
• colour technology (where the possibility of assigning numerical co-ordinates to colours and of representing them, under certain conditions, by a three-dimensional vector space, provides a basis for technological applications);
• evolutionary biology (where it is conjectured that the exploitation of certain environmental properties, namely surface reflectances, by an appropriately tuned visual system provides certain evolutionary advantages);
• art history (where the peculiar power of colours to lend emotional and aesthetic qualities to the objects of our visual world has been studied, and where it has been investigated how painters solved the intricate problem of depicting aspects of light, such as ambient illumination or shadows, by using only pigments on a canvas);
• linguistics (where various aspects of how linguistic classifications operate on perceptual representations have been studied);
• philosophy (where colours have been the object of all sorts of epistemological and metaphysical enquiries, such as metaphysical attempts to justify the realism that is inherent in our common-sense discourse about colour).
Ralph Evans, one of the important figures of the field, opened one of his books with the remark that ‘color is a subject which involves so many people with such different attitudes and intentions that a book useful to all of them is quite difficult to write’ (Evans 1948, p. v). This qualification applies also to the present book. In order to avoid disappointment on the part of potential readers, it may be useful to say a few words on the scope and the intentions of this book. It focuses, as its title indicates, on fundamental issues of colour perception and thus of perceptual psychology. In perceptual psychology we have come to classify classes of perceptual phenomena according to characteristic perceptual attributes that they share, such as colour, form, or texture. Such a classification is based on the hope that such phenomena also share distinctive aspects with respect to the functioning of the perceptual system, and thus can be subsumed under a common explanatory framework. But how can we characterize a concept whose individual instances we call ‘colours’? In ordinary discourse we are not in need of a description or definition of what colours are; rather, we are simply acquainted with them (notwithstanding the fact that we regularly encounter difficulties in properly assigning names to individual instances of colour). In fact, such a definition would be entirely unintelligible to us were we not already acquainted with them. Colours are an irreducible aspect of our perceptual experience. They are a given part of the world as perceived. What, in rational enquiry, appears as a tension between objective and subjective aspects does not constitute a principal concern for our everyday way of talking about colours. Rather, ordinary discourse provides incredibly rich and subtle means of doing justice to both aspects. As in other cases of naturalistic enquiry, this situation changes once we turn to scientific enquiries into the ‘nature of colours’. 
In the process of such enquiries we modify, or even change, concepts of ordinary discourse so that they better
suit our explanatory needs. With respect to colour, the greatest conceptual change occurred when Newton discovered that the colour of a light spot depends causally on the spectral wavelength composition of the light. According to this, colours are not simply a more or less accidental aspect of the world-as-experienced, but rather are tied causally, in a lawful way, to the world as described by physics, namely the spectral wavelength composition of the light array that reaches the eye. Colours became so intimately attached to physical descriptions that they could be regarded, in a sense, as belonging to the world as described by physics. In past decades, the field of colour perception has again undergone an important shift in perspective—the result of taking advantage of functional and adaptive considerations, which have been made to flourish within the recent computational perspectives. This can be characterized as a shift from an emphasis on aspects of light, as the proximal stimulus of colour perception, to an emphasis on properties of objects. Not surprisingly, from a biological perspective, it became increasingly clear to what extent properties of colour perception are shaped, not primarily by the local wavelength composition of the light reaching the eye, but by the spectral reflectances of the objects that make up a visual scene. The physical regularities of our environment that determine properties of the incoming light array, and which are potentially exploited by the visual system with respect to ‘colour’, go far beyond local aspects of wavelength composition. The properties of the incoming light array can mirror complex properties of its causal history. 
This causal history can be determined by, for instance, the ambient illumination, various kinds of micro-illumination, or the reflectance and transmittance properties of surfaces and their micro-textures; furthermore, the complex interplay of illumination and surface characteristics can change with actions of the observer. Taking such physical regularities into account opens up entirely new ways of approaching colour perception. Previously, colour perception was studied with an emphasis on local aspects of colour coding. From a functionalist-computational perspective, in contrast, emphasis has been laid on more global properties to which the appearance of colours seems to be tied. By basing theoretical accounts of colour perception on physical aspects of the distal scene, rather than on properties of the proximal stimulus, impressive explanatory gains have been achieved. In the course of this functionalist-computational approach, colour science has entered one of its most productive phases. It came to be praised as a microcosm of cognitive science: a particularly rich and rewarding field for paradigmatically scrutinizing core issues concerning the nature of perception, and thus the nature of mental phenomena. In colour perception, more than in other fields, various levels of analysis cross-fertilize, or even amalgamate. These range from underlying neurophysiological aspects, stretching from peripheral colour coding to the way in which neural colour codes enter into higher cortical representations, to physical and ecological regularities that potentially could be exploited by the visual system, to functional and computational aspects, to evolutionary development and the genetic basis, and to comparative and developmental aspects. Although the field of colour perception, or, more properly, colour vision, has been intimately attached to neurophysiological questions during past decades, neurophysiological aspects are not in the foreground in this book. 
A proper account of the present stage of neurophysiological investigations of colour coding, a field with immense dynamics in techniques, findings, and perspectives, would require a book of its own and, indeed, can be found elsewhere (Gegenfurtner and Sharpe 2001). In this book, the primacy of conceptual,
functional, and computational aspects of colour perception is emphasized over issues of neural implementation. This reflects an attitude that is also shared by many neurophysiologists, such as Barlow (1983, p. 11), who reminded us that ‘anatomists and physiologists need to be told what the visual system does before they can set about the difficult task of finding out how it does it’. (See also Mollon et al. 2003.) Evidently, neurophysiological investigations of the neural processes underlying colour perception, and perceptual processes in general, can only be successful to the extent to which perception theory suggests the right kind of questions. But in this regard, we are still far from an agreement on what these questions are. In the field of colour perception as well as in perception theory in general, we have not yet settled upon what the relevant phenomena and facts are that a theory of (colour) perception is supposed to be able to explain and what should count as an adequate explanation. This may seem surprising in face of the fact that, one and a half centuries after Fechner and Helmholtz made colour perception a starting point for experimental psychology, the field of colour perception—‘the Queen of Psychophysics’ (Julesz 1994, p. 77)—is widely held to be the oldest and most mature domain of experimental psychology. There are undoubtedly not many areas in psychology where we have a comparable theoretical understanding—on a broad range of different levels of analysis—of what is under scrutiny. But nothing would be more misleading than to conclude that we have already arrived at an appropriate theoretical picture, and that the remaining task would be to identify the underlying neurophysiological mechanisms. 
Indeed, there is a great gulf between the immense progress that has been made in various colour-related domains of enquiry, on the one hand, and our theoretical understanding of the perceptual process itself and the role colour plays within our perceptual system, on the other. We can hardly expect a better theoretical understanding of colour perception without having a better theoretical understanding of the long-standing, deep issues that are at the core of cognitive science. One aim of this book is to put the field of colour perception into this wider context: to illustrate the impressive theoretical advances in the field, but also to locate our present theoretical understanding within the larger map of foundational issues of perception, where our present understanding still barely scratches the surface. In the process of producing the chapters of this book, we have placed great value on avoiding the usual character of research reports, such as typically found in journals. Rather, our goal was that the core ideas of the chapters should also be accessible to non-specialists, as far as scientific accuracy allows, and that the chapters should give a broad sense of the intellectual traditions in which various questions are embedded. The degree to which such intentions can succeed varies, of course, with the kind of questions addressed. While some chapters, due to their topics, meet this intention, others, which pursue more narrowly defined issues, necessarily carry a higher technical load. In general, however, we feel that the chapters convey the theoretical insights into colour perception that have been achieved, as well as the lacunae and complex unresolved issues. The book begins with a chapter that deals with physically based taxonomical aspects of colour, and supplies an entirely new and unprecedented coherent view on different colorimetric aspects. 
The chapters that follow span the entire range of issues of current theoretical interest, from elementary aspects of colour coding, such as ratio coding and adaptational schemes, to functional and computational aspects, including the role of colour
within perceptual architecture and its interrelation with other perceptual attributes. The book closes with two methodological and conceptually oriented chapters, on the role of perceptual errors in studying colour perception and, in its final chapter, with an ambitious attempt to develop a metaphysically sound realist concept of ‘colours out-there in the world’. The character of the book mirrors aspects of its history. The plan for the book originated during an academic year when its authors were at the Zentrum für interdisziplinäre Forschung (ZiF) of the University of Bielefeld. The main function of this centre is to promote dialogue between academic disciplines. During the academic year 1995–96, the ZiF offered us the opportunity to bring together a group of internationally renowned perception specialists from the fields of psychology, philosophy, artificial intelligence, biology, and neurophysiology. The aim of this endeavour was to tackle foundational issues of perception, more specifically how regularities of the physical world are mirrored within our perceptual system. The unique intellectual environment of this group, and the dialogues it elicited, had a great impact on all of us, which has extended far beyond the year at the ZiF. A dialogue between different disciplines is particularly apt for revealing tacit and implicit background assumptions and metatheoretical presumptions of its participants, as well as the world views that drive and guide investigations in different fields. Such a process seems to be almost essential in an area such as perception theory, which is still in search of an explanatorily appropriate theoretical language. Therefore, another intention of this book, beyond conveying what has been achieved, is to contribute to this process in a fruitful way, and to set wider boundaries for further thought. This goal has shaped the specific, and somewhat unusual, structure of this book.
Each chapter is preceded by a personal preface by its author, whereby the chapter is presented in the larger context and the author’s intentions are described. In addition, each chapter is followed by at least one commentary, mostly from other authors of the book, in which central ideas of the chapter are discussed critically. The prefaces and commentaries that accompany each chapter are intended to convey a lively picture of the intellectual landscape in which the investigations described in this book are located. The authors wish to express their gratitude to the ZiF for its generous support, financial and otherwise, both for the research year and for the subsequent conferences that we were able to organize in the course of the preparation of this book. The generous administrative help, commitment, and hospitality of the staff of the ZiF were invaluable in enabling us to carry out this endeavour. Finally, we would like to extend our appreciation to the staff of Oxford University Press, in particular Martin Baum and Kate Smith, for their help in all steps leading to the publication of this volume.
Rainer Mausfeld
Dieter Heyer
References
Barlow, H. B. (1983). Understanding natural vision. In Physical and biological information processing of images (ed. O. J. Braddick and A. C. Sleigh), pp. 2–14. Springer, Berlin.
Evans, R. M. (1948). An introduction to color. Wiley, New York.
Gegenfurtner, K. R. and Sharpe, L. T. (ed.) (2001). Color vision: From genes to perception. Cambridge University Press, Cambridge.
Julesz, B. (1994). Dialogues on perception. MIT Press, Cambridge, Massachusetts.
Mollon, J. (2003). Normal and defective colour vision. Oxford University Press, Oxford.
CONTENTS

List of Contributors

1 Perspectives on colour space
Jan J. Koenderink and Andrea J. van Doorn
Commentaries:
From physics to perception through colorimetry: a bridge too far? Donald I. A. MacLeod
Colorimetry fortified Paul Whittle

2 Light adaptation, contrast adaptation, and human colour vision
Michael A. Webster
Commentary:
Adaptation and the ambiguity of response measures with respect to internal structure Franz Faul

3 Contrast colours
Paul Whittle
Commentaries:
A background to colour vision Michael A. Webster
Contrast coding and what else? Hans Irtel

4 Colour and the processing of chromatic information
Michael D’Zmura
Commentary:
The processing of chromatic information Laurence T. Maloney

5 The pleistochrome: optimal opponent codes for natural colours
Donald I. A. MacLeod and T. von der Twer
Commentary:
Thinking outside the black box Michael A. Webster

6 Objectivity and subjectivity revisited: colour as a psychobiological property
Gary Hatfield
Commentary:
Why is this game still being played? Paul Whittle

7 A computational analysis of colour constancy
Donald I. A. MacLeod and Jürgen Golz
Commentary:
The importance of realistic models of surface and light in the study of human colour vision Laurence T. Maloney

8 Backgrounds and illuminants: the yin and yang of colour constancy
Richard O. Brown
Commentaries:
Colour construction Don Hoffman
Fitting linear models to data Laurence T. Maloney

9 Surface colour perception and environmental constraints
Laurence T. Maloney
Commentaries:
On the function of colour vision Gary Hatfield
Intrinsic colours—and what it is like to see them Zoltán Jakab

10 Colour constancy: developing empirical tests of computational models
David H. Brainard, James M. Kraft, and Philippe Longère
Commentaries:
Surface colour perception and its environments Laurence T. Maloney
Comparing the behaviour of machine vision algorithms and human observers Vebjørn Ekroll and Jürgen Golz

11 The illuminant estimation hypothesis and surface colour perception
Laurence T. Maloney and Joong Nam Yang
Commentary:
Surface colour appearance in nearly natural images David H. Brainard

12 The interaction of colour and motion
Donald D. Hoffman
Commentary:
The interaction of perceived colour and perceived motion? Richard Brown

13 ‘Colour’ as part of the format of different perceptual primitives: the dual coding of colour
Rainer Mausfeld
Commentaries:
Phenomenology and mechanism Don MacLeod
An internalist account of colour Don Hoffman

14 The importance of errors in perception
Alan Gilchrist

15 Avoiding errors about error
Robert Schwartz
Commentaries:
Deconstructing the concept of error? Alan Gilchrist
Talking across the divide Paul Whittle
On the veridicality of lightness perception Richard Brown

16 The place of colour in nature
Brian P. McLaughlin
Commentaries:
Asking about the nature of colour Margaret Atherton
Who dictates what is real? Paul Whittle

Author Index
Subject Index
LIST OF CONTRIBUTORS

Margaret Atherton, Department of Philosophy, University of Wisconsin-Milwaukee, Curtin Hall 629, Milwaukee, WI 53201, USA
David H. Brainard, Department of Psychology, University of Pennsylvania, 3815 Walnut Street, Philadelphia, PA 19104, USA
Richard O. Brown, The Exploratorium, 3601 Lyon Street, San Francisco, CA 94123-1099, USA
Michael D’Zmura, Department of Cognitive Sciences, University of California, Irvine, Irvine, California 92697, USA
Vebjørn Ekroll, Institute for Psychology, University of Kiel, D-24098 Kiel, Germany
Franz Faul, Institute for Psychology, University of Kiel, D-24098 Kiel, Germany
Alan Gilchrist, Rutgers University, Psychology Department, 101 Warren Street, Newark, NJ 07102, USA
Jürgen Golz, Institute for Psychology, University of Kiel, D-24098 Kiel, Germany
Gary Hatfield, Department of Philosophy, University of Pennsylvania, Philadelphia, PA 19104-6304, USA
Donald D. Hoffman, Department of Cognitive Science, University of California, Irvine, CA 92697, USA
Hans Irtel, Psychology, University of Mannheim, Schloss, EO 265, D-68131 Mannheim, Germany
Zoltán Jakab, Rutgers University, Department of Philosophy, Department of Psychology, 152 Frelinghuysen Road, Piscataway, NJ 08854-8020, USA
Jan J. Koenderink, Helmholtz Instituut, Universiteit Utrecht, Buys Ballot Laboratorium, PO Box 80 000, 3508TA Utrecht, The Netherlands
James M. Kraft, Visual Sciences Laboratory, Department of Optometry and Neuroscience, University of Manchester Institute of Science and Technology, Manchester M60 1QD, UK
Philippe Longère, 24 Chemin de la Cavalerie, Bat. A, 06130 Grasse, France
Donald I. A. MacLeod, University of California at San Diego, La Jolla, CA 92093-0109, USA
Laurence T. Maloney, Department of Psychology, Center for Neural Science, New York University, New York, USA
Rainer Mausfeld, Institute for Psychology, University of Kiel, D-24098 Kiel, Germany
Brian P. McLaughlin, Department of Philosophy, Rutgers University, 26 Nichol Avenue, New Brunswick, NJ 08901-1411, USA
Robert Schwartz, Department of Philosophy, University of Wisconsin-Milwaukee, Curtin Hall 617, Milwaukee, WI 53201, USA
Andrea J. van Doorn, Helmholtz Instituut, Universiteit Utrecht, Buys Ballot Laboratorium, PO Box 80 000, 3508TA Utrecht, The Netherlands
Tassilo von der Twer, Department of Physics, Bergische Universität Wuppertal, D-42097 Wuppertal, Germany
Michael A. Webster, Department of Psychology, University of Nevada, Reno, NV 89557, USA
Paul Whittle, Les Jonquets, 84750 Caseneuve, France
Joong Nam Yang, Mail Stop 262–2, NASA Ames Research Center, Moffett Field, CA 94035-1000, USA
CHAPTER 1
PERSPECTIVES ON COLOUR SPACE
Jan J. Koenderink and Andrea J. van Doorn

Preface
We attempt to give a bird’s-eye view of ‘colorimetry’, a field that has existed (as a science) at least since the mid-nineteenth century. The first quantitative empirical work was due to Maxwell and Helmholtz; the first comprehensive theoretical approach was due to Graßmann. Of course, these scientists based their studies on earlier work, of which that due to Newton is perhaps the most influential. It is interesting that there have been parallel developments that can be traced from Goethe and Schopenhauer to Ostwald and (perhaps) Schrödinger, with an offshoot due to Hering. Important modern developments are due to Cohen (1970s) and Schrödinger (1920s). It is interesting that these parallel developments have remained largely isolated, with, to our mind, detrimental effects on the field. Another aspect that has worked (strongly, we believe) against the development of colorimetry as a science has been its extreme anthropocentric orientation. For instance, no one has developed tetrachromatic colour vision (as many animals have) to any serious extent, no one has explored the consequences of variations in the nature of the action spectra, and so forth. Here we attempt to present a balanced perspective on the field, although we foresee that many cognoscenti will consider this essay quite beside the point, and generally in bad taste. Our perspective is not that of the professional colorimetrist (we don’t speak the proper CIE language), nor the experimental psychologist (we present no data on perception) or neurophysiologist (we don’t deal with the neural substrate at all). Our interest is more that of the interested amateur with a general background in the exact sciences. Our ideal would be to write an authoritative text on the essential structure of colorimetry of the type one reads as a student in physics curricula (for instance classical mechanics or electromagnetic theory: excellent texts abound). In our opinion no such thing exists in colorimetry. Of course, the present essay represents only a preliminary, feeble attempt.
J. J. Koenderink and A. J. van Doorn
Introduction
In this chapter we will consider the following sundry questions from the field of ‘colorimetry’:
• Why are there so many ‘colour solids’? (A perusal of the literature [3] yields examples of cubes, spheres, pyramids, double pyramids, cones, double cones, colour trees, and others.) Is any one type to be preferred?
• Why are most colour order systems based on the ‘colour circle’, that is a periodic linear sequence [36,37], whereas the spectral colours are naturally ordered as a linear, open segment [31,32]? (See Fig. 1.1.)
• Why is the most basic dichotomy recognized by the artist, that of the ‘warm’ and ‘cold’ colour families, not reflected in the colour spaces of ‘official’ colorimetry?
• Are Newton’s ‘homogeneous lights’ (the spectral colours) [31,32] to be considered especially basic? What is the position of other contenders, such as Goethe’s ‘boundary colours’ (Kantenfarben) [44,45] or Ostwald’s ‘semichromes’ (Farbenhalb) [34]? Is white to be considered a ‘confused mixture’ or the simplest colour imaginable?
• Apart from physiological considerations (fundamental response curves), the colour spaces of colorimetry are only affine, that is to say arbitrary linear deformations are to be considered irrelevant [6]. Is there any way to define a preferred (‘canonical’) basis? Does a ‘natural’ metric exist?
• Is there any principled way to mensurate the colour circle, or can this only be done by ‘eye measure’ (which places it outside the realm of colorimetry proper)?
• Are the colours of colorimetry a truthful (though limited) reflection of the physical structure of the radiation? That is, is colour vision, in the approximation of colorimetry, simply a form of low-resolution spectroscopy, or does the observer’s share go beyond this?
Apart from these key questions, we will have occasion to discuss several related, more technical questions. We assume that the reader has scant knowledge of the technical aspects of colorimetry. We provide an introduction that stresses the conceptual issues and will suppress technicalities, especially those involving extensive formal, mathematical notation. Because most of the formal structure of colorimetry is essentially of a geometrical nature, we will make up for the lack of formalism through illustrations that should yield an intuitive grasp of the structures involved.
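The remark in the list above that colour spaces are 'only affine' can be made concrete with a small numerical sketch. It is not part of the original chapter: the colour coordinates and the deformation matrix are invented for illustration, and the helper names `apply` and `distance` are hypothetical. The sketch suggests why a 'natural' metric is a genuine question: an invertible linear deformation keeps mixture relations intact but changes Euclidean distances.

```python
import math

def apply(matrix, vector):
    """Multiply a 3x3 matrix by a 3-component colour vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def distance(p, q):
    """Euclidean distance between two coordinate triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical coordinates of two colours in some arbitrarily chosen basis.
colour_a = [0.2, 0.5, 0.3]
colour_b = [0.6, 0.1, 0.4]
mixture = [0.5 * a + 0.5 * b for a, b in zip(colour_a, colour_b)]

# An arbitrary invertible linear deformation (determinant 7), assumed for illustration.
DEFORMATION = [[2.0, 1.0, 0.0],
               [0.0, 1.0, 1.0],
               [1.0, 0.0, 3.0]]

mapped_a = apply(DEFORMATION, colour_a)
mapped_b = apply(DEFORMATION, colour_b)
mapped_mixture = apply(DEFORMATION, mixture)
mean_of_mapped = [0.5 * a + 0.5 * b for a, b in zip(mapped_a, mapped_b)]

# Mixture relations survive the deformation ...
mixture_preserved = all(
    math.isclose(x, y) for x, y in zip(mapped_mixture, mean_of_mapped)
)
# ... but distances do not, so the affine structure alone singles out no metric.
distance_preserved = math.isclose(
    distance(colour_a, colour_b), distance(mapped_a, mapped_b)
)

print(mixture_preserved, distance_preserved)
```

Any other invertible matrix would serve equally well here; the choice above is arbitrary, which is exactly the point of the affine view.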
Colorimetry The world of colour When we open our eyes we see the world around us, its geometrical layout (important for navigation) and various objects of possible importance to our continued existence. The objects have geometrical (size, shape) as well as material (colour, texture) attributes. These objects are involved in many processes, that is to say, our visual world is in a continual state of flux. If we are able to assume a painter’s attitude, we may succeed (at least somewhat and for short periods) in perceiving the world as a two-dimensional array of coloured patches [39]. The ‘colours’ can be perceived (with effort) as essentially meaningless (i.e., not particularly favourable to effective optically guided behaviour), shapeless and textureless qualities, appearing in such and such a direction [12,39]. This is certainly not a natural state of affairs. In real life we deal with processes and objects, and ‘colour’ is used to label particular objects or classes of objects (red fireengines, blue-eyed blondes, ripe and unripe apples, etc.). The ‘world of colour’ is a mess that we won’t be concerned with in
Figure 1.1 The spectrum and the colour circle. The spectrum is an open, linear segment. Notice how the colours merge into black at each side. The spectrum is not complete as the purples are missing. The colour circle is complete by construction. It is a closed, continuous (thus periodic) arrangement. All hues are as colourful and bright as the printing permits (See colour Plate 1 in the centre of this book). The relation between spectrum and colour circle is a major topic of this chapter.∗
∗ Colour versions of all figures in this chapter can be found on the website: http://www.phys.uu.nl/∼wwwfm/Navigation/Resframes.html.
this paper. Instead we will consider the austere discipline of colorimetry,¹ which applies a thoroughly Procrustean method [47] to obtain something essentially simple and elegant (though perhaps not very interesting).
The physics of radiation
The physics of vision is rather simple, at least in its essential traits. The typical situation for the terrestrial observer is as follows: the observer and the objects are immersed in a transparent medium. A source of radiation (the sun) irradiates the scene. Radiometric interactions redistribute the radiation over the scene in complicated ways; moreover, the radiation interacts with the materials (surfaces of objects, the medium). Eventually we may pool all this in a single function, the spectral radiance. This function specifies, for any vantage point and any direction, the nature of the radiation that can be sampled by a directionally sensitive detector such as the human eye [13,14]. We may conceptualize the spectral radiance as a huge filing cabinet that contains photographs taken with arbitrary filters, from arbitrary points, in arbitrary directions. (Something like this is actually provided nowadays via satellites observing the Earth from orbit: nobody is actually in a position to ‘process’ all these data.) In practice, the observer samples only part of the available spectral radiance, that is to say, only from a limited number of vantage points, in a limited number of directions, with limited spatial resolution and with a limited spectral resolution and coverage of the electromagnetic spectrum. In the setting of this chapter we will only be concerned with what we will call ‘beams’. A beam is a spectral radiance in such and such a direction, i.e. the (normal) observer will most likely see a ‘patch’ in that direction ‘caused’ by the beam.
In order for the observer to actually ‘see’ the patch, the beam must be of sufficient radiance, of the right spectral composition, and of the correct geometrical configuration; the observer must have open eyes, must look in the right direction, must not be blind, and so forth. Thus (in the parlance of psychophysics) the ‘beams’ are the ‘stimuli’, whereas the ‘patches’ are the ‘percepts’. Of course percepts are only empirical facts for the perceiver (they are personal and ‘immediately given’), and in order for colorimetry to constitute a scientific endeavour we need to substitute an objective and operational alternative [7]. The physical parameters that describe a beam are its direction, its extent,² and its spectral composition. In colorimetry we typically fix direction and extent, thus we may ignore these parameters. We are then left with spectral composition, or, more precisely, the ‘spectral radiant power density’. This is a measure of radiant power, that is to say, radiant power density is measured as radiant energy per unit of time, unit of area and unit of solid angle. Instead of radiant energy, one may exploit the discrete nature of radiation and count
1 ‘Colorimetry’ is a discipline that is far removed from ‘colour science’. For one thing, it has nothing to say about colour appearances. Whereas colorimetry is a scientific discipline, ‘colour science’ is, like everything that calls itself a ‘science’ (e.g. Christian Science), not science. That doesn’t mean it has no intrinsic interest. On the contrary, if you are of the opinion that colorimetry is fairly trivial and not overly interesting, we find ourselves in agreement.
2 The technical notion is ‘throughput’ or ‘étendue’, basically the capacity of the beam to sustain rays. Informally, the radiance times the étendue denotes ‘the number of rays’ in the beam [13,14].
perspectives on colour space
‘photons’ per unit of time, unit of area and unit of solid angle. Then the definition of radiance involves only time and space and is indeed very elementary. In order to specify the spectral density we need to analyse the radiation in terms of its wavelength composition. This is usually done by means of ‘monochromators’ or ‘spectroscopes’ (Newton’s experiment with the prism is the paradigmatic case). The beam is decomposed into other beams with the special property that they only show up radiant power in limited spectral regions. The wave nature of electromagnetic radiation can be exploited by analysing the beams in terms of ‘frequency’, this is simply related to the ‘wavelength’ in vacuum3 (or, to a good approximation, air). In the ‘spectral analysis’ we measure radiant power in limited frequency or wavelength intervals. The relevant measure is spectral density, that is, the radiant power per frequency or wavelength interval, e.g. per 10 nm. Notice that one should not say that beams are composed of ‘monochromatic beams’, but only that they can be subjected to spectral decomposition.4 (A sausage can be cut into slices, but the (uncut) sausage is not composed of slices!) This is illustrated by the fact that the variance of the spectral density becomes arbitrarily large when we decrease the wavelength interval: thus the ‘monochromatic beam’ becomes less well defined the more we try to isolate it! Sunlight is essentially ‘noise’ due to the incoherent superposition of myriads of microevents in the photosphere of our sun. It has a continuous spectrum. In this case it is evident from the physics that it should not be conceived of (in the ontogenetic sense) as the superposition of many monochromatic beams, but rather as the superposition of many very short-lived pulses (much like the acoustical signal due to an applauding audience). Such pulses each have broadly distributed spectral power. 
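A minimal sketch of the bookkeeping between the three equivalent descriptions of a spectral position (vacuum wavelength, frequency, photon energy) may help here. It assumes only the standard relations λν = c and ε = hν together with the SI values of the constants; the function names are ours and purely illustrative.

```python
# Conversions between vacuum wavelength, frequency and photon energy.
# Constants are the exact SI values; function names are illustrative only.
C = 299792458.0        # speed of light in vacuum, m/s
H = 6.62607015e-34     # Planck's constant, J s
EV = 1.602176634e-19   # one electronvolt expressed in joules


def frequency_hz(wavelength_nm):
    """Frequency from vacuum wavelength, using lambda * nu = c."""
    return C / (wavelength_nm * 1e-9)


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV, using epsilon = h * nu."""
    return H * frequency_hz(wavelength_nm) / EV


if __name__ == "__main__":
    # The two ends of a visual range taken as roughly 380 to 740 nm.
    for lam in (380.0, 740.0):
        print(f"{lam:.0f} nm: {frequency_hz(lam):.3e} Hz, "
              f"{photon_energy_ev(lam):.2f} eV")
```

Evaluated at 380 nm and 740 nm, these conversions land in the frequency band of a few times 10¹⁴ Hz and at photon energies of a few electronvolts.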
Spectral decomposition is, as always, possible, but in no way reveals ‘elementary constituents’. The discrete nature of the radiation can also be exploited in spectral analysis. Here one counts photons in limited regions of photon energy. Similar considerations apply as discussed above. The photon energy is simply related to the wave frequency via multiplication by Planck’s constant. The ‘visual region’ involves wavelengths (in vacuum) of about 380–740 nm, frequencies of about 4–8 × 10¹⁴ Hz, or photon energies of about 1.5–3 eV. This is a biologically highly relevant wavelength range, for many reasons. For instance, sunlight peaks in this region, thermal radiation from the animal’s own body can be neglected, photon energies are in the range of biologically important chemical bonds, and so forth.
What colorimetry is
Historically, colorimetry evolved as a branch of photometry [26]. Thus the methodology derives essentially from early photometric practice. In photometry it was soon realized
3 The frequency ν and wavelength λ are simply related as λν = c, where c denotes the speed of light in vacuum. The wavelength in a medium of refractive index n is λ/n. The energy of a photon is ε = hν, where h denotes Planck’s constant (Wirkungsquantum).
4 As remarked in the text, whereas a sausage can certainly be cut into slices, the (uncut) sausage can hardly be said to be composed of slices. Radiation can indeed be decomposed into monochromatic beams, but that doesn’t indicate that these are more ‘elementary’ than the beam itself, nor that the beam should be ‘composed’ of monochromatic beams. Since the theories are linear, we can just as well decompose monochromatic beams into other functions as long as we have a sufficiently large collection of them.
that although observers are quite bad at estimating radiant power, they are dependable as ‘null indicators’, that is to say, they can judge reliably the equality of radiant power in simultaneously perceived patches. A look into the eyepiece of a paradigmatic photometer reveals a circular patch in a dark surround, divided into two hemifields. The observer’s task is to distinguish between a uniform patch and a bipartite one, i.e. to judge whether the division between the half-fields is noticeable. Colorimetry further pushes this paradigm to its limits. In the colorimetric paradigm, two beams are compared by shaping them into patches that appear as contiguous hemidiscs on a black (radiationless) background. The patches appear to the observer as textureless and at no particular distance or attitude (not like the surface of any object). The task is simply to distinguish between a uniform disc and a bipartite field with different colours (patches) on both sides of the division. When the division is not noticeable, one says that a ‘colorimetric equation’ between the two beams has been obtained. We will write this fact formally as A ⇔ B, therewith indicating that the beams A and B are not distinguishable. The ‘colorimetric equation’ A ⇔ B in no way implies equality of the radiant spectral power densities of the beams A and B. Here we may read that the beam appearing on the left-hand side of the equation fills the left-hand hemidisc, whereas that appearing on the right-hand side of the equation fills the right-hand hemidisc. Beams can be ‘added’ by simply superimposing them (for instance, one may take two slide projectors and project their images on the same screen). Such ‘incoherent superposition’ leads to a simple addition of the radiant spectral power densities [4]. The physics is really easy. 
It is an empirical fact that if two beams A and B cannot be distinguished in the colorimetric paradigm, then the beams A + C (here we indicate incoherent superposition of beams by the ‘+’ sign) and B + C are also indistinguishable, quite independent of the nature of the superimposed beam C. This suggests a simple complication in our notation. Let A and B be indistinguishable beams, then we have A ⇔ B. We also write this as A − B ⇔ O, where O is the ‘null beam’, i.e. not any beam at all, but the absence of radiation. That this is a reasonable equation is clear when we superimpose B, for then we obtain A ⇔ B, since the beam O + B is in no way different from beam B (adding nothing to a beam doesn’t change it). This explains plus and minus signs in colorimetric equations: to get rid of the minus sign add beams at both sides of the equation, thus obtaining a realistic situation, which is the one described by the equation (see Fig. 1.2). Colorimetry is simply the investigation of beams via the colorimetric paradigm. Its eventual goal is nothing more than to be able to predict the truth value of any colorimetric equation involving any conceivable physical beams [6] (but also nothing less!).
Figure 1.2 The convention of ‘negative’ coefficients in colorimetric equations. (Labels in the figure: beams A, B and C; A = C − B.)
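The sign convention can be made concrete with a small sketch. Everything numerical in it is an assumption made for illustration: beams are sampled at only four wavelengths, and the judgement of the observer is replaced by a hypothetical toy rule (`MATCHING`) that declares two beams indistinguishable when three weighted sums agree. The formal equation A ⇔ C − B involves a virtual beam; adding B to both sides gives the realizable comparison A + B ⇔ C.

```python
# Toy illustration of 'negative' terms in colorimetric equations.
# MATCHING is a hypothetical stand-in for an observer, not measured data.
MATCHING = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]


def superpose(a, b):
    """Incoherent superposition: spectral power densities simply add."""
    return [x + y for x, y in zip(a, b)]


def subtract(a, b):
    """Formal difference of two beams; the result may be virtual."""
    return [x - y for x, y in zip(a, b)]


def colour(beam):
    """The three weighted sums that the toy observer responds to."""
    return tuple(sum(w * s for w, s in zip(row, beam)) for row in MATCHING)


def matches(a, b):
    """Toy version of the colorimetric equation a <=> b."""
    return colour(a) == colour(b)


def is_real(beam):
    """A physical beam has no negative spectral power anywhere."""
    return all(s >= 0 for s in beam)


A = [1, 0, 2, 1]
B = [1, 3, 0, 2]
C = [3, 2, 3, 2]

virtual = subtract(C, B)                         # formal beam C - B
print(is_real(virtual))                          # the difference is not realizable
print(matches(A, virtual))                       # yet A <=> C - B holds formally
print(matches(superpose(A, B), C))               # realizable form: A + B <=> C
```

Note that A + B and C are different spectra here, so the realizable equation is a genuine match between physically distinct beams.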
Alongside our operational definition of the addition of beams we may also operationalize the multiplication of a beam with a scalar (a ‘scalar’ is simply a number). This simply changes the total radiant power in the beam but not the relative spectral radiant power density (the ‘quality’ of the radiation). There are various ways of doing this, one example is to use a ‘neutral density filter’ (‘sunglasses’) that attenuates the beam. When we start with very intense beams, then virtually all beams used in practice will be attenuated and multiplication with a scalar is operationally well defined. Since we have a null element (total darkness or no beam at all) and a linear addition and multiplication with a scalar, the space of beams is a ‘linear vector space’ [2,43]. However, not all elements of this linear space correspond to beams, e.g. when A is a beam then we can certainly not find ‘the beam −A’, simply because ‘negative radiation’ doesn’t exist. Yet we will admit such entities and call them ‘virtual’ or ‘imaginary’ beams, then any beam that can be realized will be called a ‘real’ beam. Thus the space of real beams becomes a subset of the (linear) space of beams. This formal artifice is most convenient because we have so many tools that deal with linear spaces effectively. What colorimetry is not It is often held that colorimetry involves an extreme stimulus reduction and, for that reason, does not bear on the issue of ‘colour appearances’, i.e. the descriptions observers venture regarding the colours of objects in their ken. It is indeed the case that colorimetry has nothing to say concerning colour appearances: that is both the reason for its phenomenal success and its fundamental limitation. However, it is not the case that such is due to stimulus reduction. Rather, the success (and limitations) of colorimetry are due to extreme response reduction. 
The reason why people so often stress the stimulus reduction (which certainly applies to colorimetry as conventionally practised) is related to the literature on the ‘modes of appearance’ of patches [12,17,39]. A patch can appear as a non-localized light (self-luminous), as an irradiated material surface, and so forth. Such modes of appearance depend critically on the context in which a patch appears [25]. Many conventional colour terms (grey, brown) cannot even be applied without such a context present. In the conventional colorimetric setting the patches appear as non-localized, non-material ‘film colours’ or ‘aperture colours’. In such a minimal context, one can have no greys or browns for instance—the world of colours has shrunk to minimal proportions [35]. But exactly because the colorimetric paradigm doesn’t require the observer to venture a colour description, the modes of appearance are essentially irrelevant to the paradigm. It is only required that the observer judges indistinguishability of patches: this is a severe form of task or response reduction. It is easily conceivable to perform colorimetry in a natural setting, with full context available. Suppose one presents the observer with a fully textured visual array (a landscape, say) and that one successively alternates a tiny patch (of such a size that it is nearly uniform, say one ‘pixel’ of a digital image) with a reference, fiducial patch. Then the observer’s task is to notice whether that patch appears steady, or alternating in time (toggling between two colours). Such a patch may well appear grey or brown to the observer (there’s plenty of context available), but the observer is never asked to comment upon appearances in the first place.
We stress that the fact that colorimetry works is due to a rather extreme form of response reduction and that the severe stimulus reduction typically involved in conventional colorimetry is not essential at all (though it may serve to improve precision and reliability of settings, etc.). The causal explanation is physiological:5 colorimetric equations imply equal stimulation at the level of the ‘retinal photoreceptor action spectra’. Thus colorimetric equivalence implies identical input to the brain. Consequently, colorimetry involves only the flimsiest interface between the physics and the physiology [7]. In a sense, colorimetry does not involve any neural processing at all and is almost pure physics except for the ‘accidental’ form of the cone action spectra, which is an empirical datum of physiology. The results of elementary colorimetric investigation Colorimetric equations can be obtained with a precision and repeatability that is rare in experimental psychology. The ultimate reason is simple: colorimetric equations don’t depend on neural processing. If we speak of ‘colours’ in the context of colorimetry, we don’t mean perceptual qualities in any sense, but something very specific (and, many people would say, dull and uninteresting): a colour is an equivalence class of beams that can replace each other in colorimetric equations. This is a sensible definition because of the strong empirical evidence that different beams A and B (say), such that A ⇔ B, may be substituted for each other in any colorimetric equation (for instance, A ⇔ B and A + C ⇔ D together imply B + C ⇔ D). This definition of ‘colours’ would really be trivial if colours corresponded in a 1–1 fashion to beams. However, this is very far from being the case: there are infinitely many more beams than colours. The surprising empirical fact is that very many different beams look the same. A colour corresponds to a ‘metamer’; that is, an infinite set of indiscriminable beams. 
The interest, then, is to characterize the correspondence between colours and beams; that is, the partition of the space of beams into metamers, which are mutually disjunct subspaces by construction. Thus we have ‘beams’, which are physical entities, which cause ‘patches’ to be seen, which are psychical entities (percepts). Then we have ‘colours’ in the sense of colorimetry, which are equivalence classes of beams, where ‘equivalence’ is to be understood via the colorimetric equations. The colours are psychophysical entities. Colorimetric equations, again, are based upon the indiscriminability of patches. Now ‘indiscriminability’ is operationally well defined and is to be considered fully objective because it is established by a second person (‘scientist’) who investigates the behaviour (confusion of patches) of the first person (‘subject’).6 Thus colours, like beams, are well-defined objects for the exact sciences, although beams are physical and colours psychophysical objects. 5 In this chapter we discuss only colorimetry proper. Of course there are also psychological, physiological, molecular biological, evolutionary, medical, etc. branches of colour study of great intrinsic interest. We will simply ignore them, for instance we won’t refer to the ‘cone action spectra’ and so forth. In colorimetry proper there is simply no need, doing so would complicate the issues unnecessarily. 6 The scientist may try to use all the tricks in the book to try to ascertain that the subject simply reacts like a reliable physical instrument. Thus ‘catch trials’ may be introduced, stimuli are presented in random order, and so forth.
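The notion of a colour as an equivalence class of beams can be sketched in a few lines. The sketch below is a toy under stated assumptions: beams are sampled at four wavelengths only, and a hypothetical rule `MATCHING` stands in for the observer's judgement of indiscriminability. It groups named beams into metamer classes and checks the substitution property (A ⇔ B implies A + X ⇔ B + X) on one example.

```python
from collections import defaultdict

# Hypothetical toy observer (not measured data): three weighted sums
# over a spectrum sampled at four wavelengths.
MATCHING = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]


def colour(beam):
    """Label of the equivalence class that the beam belongs to."""
    return tuple(sum(w * s for w, s in zip(row, beam)) for row in MATCHING)


def metamer_classes(named_beams):
    """Partition beams into classes of mutually indiscriminable beams."""
    classes = defaultdict(list)
    for name, beam in named_beams.items():
        classes[colour(beam)].append(name)
    return dict(classes)


def superpose(a, b):
    """Incoherent superposition of two beams."""
    return [x + y for x, y in zip(a, b)]


BEAMS = {
    "A": [2, 3, 2, 3],
    "B": [3, 2, 3, 2],   # different spectrum, same class as A
    "C": [1, 0, 2, 1],
    "D": [0, 1, 1, 2],   # different spectrum, same class as C
    "E": [1, 1, 1, 1],
}

classes = metamer_classes(BEAMS)
for label, members in classes.items():
    print(label, members)

# Substitution: metamers stay metamers when the same beam is added to both.
X = [0, 4, 1, 0]
print(colour(superpose(BEAMS["A"], X)) == colour(superpose(BEAMS["B"], X)))
```

In this toy each class contains only the few beams listed; in the real case each class is an infinite set, which is what makes the definition non-trivial.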
The correspondence between colours and beams is characterized fully through a set of empirical laws first formulated by Graßmann [15]. Several alternative formulations have been framed. Perhaps the clearest formulation is the following: first, we notice that the space of beams S forms a linear vector space. Addition is defined by incoherent superposition, multiplication with a scalar by the introduction of non-selective attenuation (e.g. ‘neutral density filters’, like sunglasses). The zero element is no radiation at all, or total darkness. The dimension of this space is infinite (intuitively every wavelength is an independent dimension). Then Graßmann’s laws boil down to the statement that the space of colours, C (henceforth called ‘colour space’) is a three-dimensional linear space, a projection of the space of beams. This completely characterizes the relation between beams and colours in a formal way.⁷ It is only necessary to find the precise projection operator, which is, in principle, a simple enough experimental task. This has been done for a number of normal observers and an average has been canonized by committee (the CIE or Commission Internationale de l’Éclairage [9], in 1931). We will use the CIE 1964 10° data for the illustrations in this chapter. In the above discussion we have schematized things a little, for the sake of initial clarity. First of all, we have omitted ‘Graßmann’s fourth law’, which has a rather separate status. This ‘law’ states a rule to calculate the ‘luminance’ of any beam and thus addresses the problem of heterochromatic photometry. The empirical status of this law is in serious doubt. However, because luminance is so important (the customers of the electricity suppliers pay for it, and it even has legal status), the law has been instated by committee [8] and is thus true by definition. Notice that this makes ‘luminance’ a purely formal entity. It doesn’t have any meaning in terms of the perceptual attributes of patches.
In this chapter we will ignore the topic of luminance altogether; in our opinion (and in full agreement with Schrödinger’s elegant treatment [40,41]) it doesn’t belong to colorimetry proper.⁸ Another fact is that the beams fail to fill the linear vector space of beams. The reason is simply that ‘there exists no negative light’. That is to say, given any beam A (say), the beam −A that would (theoretically!) yield darkness when added to A, namely A − A ⇔ O, cannot be realized as a physical (existing) beam. Any beam for which the spectrum contains negative spectral radiant power densities likewise cannot be realized. Real beams are characterized by non-negative spectral radiant power densities. That is not to say that non-realizable ‘virtual’ or ‘imaginary beams’ don’t have their uses; they often occur in colorimetric calculations. However, they have no reality as such, only when they occur in combinations that are realizable. Geometrical intuition suggests that all real beams fill an ‘∞tant’ (after ‘quadrant’ in the plane and ‘octant’ in three-dimensional space) in the linear space of beams, characterized by the non-negativeness of all coordinates. Since colour space, C, is a projection of S, it is geometrically evident that all (really existing)
7 Here we ignore the (both important and interesting) historical development completely. Instead, we try to focus immediately on the modern views on the matter.
8 Here we find ourselves in full agreement with Schrödinger [40,41]. Sadly, precious few others agree, because luminance is of such enormous utility. However, we think that one should not turn a blind eye to its very low (really rock bottom!) scientific status. It really detracts greatly from the formal elegance (combined with practical utility) of colorimetry proper. Often luminance is so intricately woven into the formal development of colorimetry that it becomes quite unclear what the scientific status of the concepts really is. It is to such formulations of colorimetry (all too frequent) that we object here.
colours are confined to a convex conical volume in the space of colours, the projection of the ∞tant. Colours outside this cone may be called ‘virtual’ or ‘imaginary’, the others ‘real’. These virtual colours are often handy in calculations, but can’t occur as such. They only occur in mixtures that are realizable. In the above we have already referred to ‘convexity’. This is an important and fundamental issue in colorimetry. Most sets occurring in colorimetry have an essentially linear structure, but fail to be linear spaces (by a narrow margin), in the sense that not every point in the linear manifold actually occurs in real life. Only some points are ‘real’, many are merely ‘virtual’ or ‘imaginary’. The simplest example (and one that may be considered paradigmatic) is the space defined by a ‘monochromatic beam’. This space is one-dimensional, for the only freedom is the radiant power of the beam, its nature being fixed (through its wavelength). Thus the space representing a monochromatic beam is a line. However, closer scrutiny reveals that only half of the line represents real beams; the other half is merely virtual. The reason is simply that the radiant power is non-negative. The set of real beams is convex: it is a half-line including the origin. When two beams, $M_1$ and $M_2$, are in the set of real beams, then so is $\alpha M_1 + (1 - \alpha) M_2$ for $0 \le \alpha \le 1$. This is obviously true because $M_1$ and $M_2$ must be (non-negative) multiples of a single beam of unit radiant power, and we can immediately calculate the multiple for the mixture (it is a function of $\alpha$, of course) and ascertain that it is again non-negative. Thus the monochromatic beams of a given fiducial wavelength form a convex subset of a one-dimensional linear space. This may well be the simplest example. Other instances of convex sets in colorimetry are the set of real beams (an ∞tant in the linear space S), the set of real colours (a convex cone in the linear space C), and so forth.
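The half-line example is easily checked numerically. In the sketch below (a toy, with a hypothetical four-sample spectrum standing in for the infinite-dimensional space S) two monochromatic beams of the same wavelength are mixed with weights α and 1 − α. For 0 ≤ α ≤ 1 the mixture stays real; outside that range it may turn virtual.

```python
from fractions import Fraction


def is_real(beam):
    """A beam is real when no spectral sample is negative."""
    return all(s >= 0 for s in beam)


def affine_mix(m1, m2, alpha):
    """Return alpha * m1 + (1 - alpha) * m2, sample by sample."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(m1, m2)]


# Hypothetical unit-power monochromatic beam: power in a single sample.
UNIT = [0, 0, 1, 0]
M1 = [3 * u for u in UNIT]   # radiant power 3
M2 = [5 * u for u in UNIT]   # radiant power 5

inside = affine_mix(M1, M2, Fraction(1, 4))   # convex: 0 <= alpha <= 1
outside = affine_mix(M1, M2, Fraction(3))     # alpha > 1: not a convex mix

print(inside, is_real(inside))
print(outside, is_real(outside))
```

The multiple of the unit beam in the mixture is 3α + 5(1 − α), which is non-negative for every α between 0 and 1 but becomes negative for α = 3.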
One important geometrical fact is that convexity is invariant under linear transformations, including projections. This is one reason why C inherits many of the structural properties from S, that is from the physics of electromagnetic beams. Notice that whereas a linear space is necessarily (except for the trivial case of zero dimension) of infinite extent, convex sets may also be finite volumes. Another important geometrical tool is the construction of a ‘convex hull’. The convex hull is the set that can be obtained by linear combination of the type αx + (1 − α)y with 0 ≤ α ≤ 1. For instance, the colour cone is the convex hull of the (non-convex!) set of colours caused by monochromatic beams, the ∞tant of real beams is the convex hull of the set of monochromatic beams, and so forth. The convex hull of a finite number of colours is an example of a convex set of finite extent. The shape of the cone of colours has to be determined empirically.9 We find that its boundary consists of two connected components: The ‘spectral cone’ the generators of 9 The structure of the spectral cone is to be considered ‘accidental’ from the perspective of colorimetry and optics. We simply have to accept the measurements. Formal investigations reveal that different structures are conceivable that would really upset the nice structure of colorimetry. Such structures would be ‘pathological’. For instance, the spectral locus might not be simply connected, or fail to be ‘convex’ (in the sense that it would not be fully on the boundary of its convex hull). The fortunate ‘accident’ is no doubt due to evolutionary pressure, like the ‘accident’ that the retina is close to the focal plane of the eye’s optics.
Figure 1.3 Some views of the spectral cone in the CIE basis. Notice the plane of purples. (See colour Plate 2 in the centre of this book.)
which represent colours due to monochromatic beams, and a planar sector, the so-called ‘plane of purples’ (to be explained later) (see Fig. 1.3). All real colours find their place in the interior (convex hull) of the spectral cone [42]. If one attenuates a beam (‘sunglasses’), one notices a decrease in ‘brightness’ whereas the ‘colour’, in some restricted sense, appears to remain invariant (we are talking about ‘perceptual attributes’ of patches here). This is the reason why one often finds it convenient to regard colours modulo their magnitudes. Thus we form equivalence classes µc, where µ > 0 and call them ‘the chromaticities of the colours’. The black colour, o, has an
Figure 1.4 The chromaticity diagram. The cone of colours is projected from the origin on the ‘chromaticity plane’, c. This figure illustrates the CIE basis and choice of chromaticity plane. The red space curve is the locus of the monochromatic beams of unit radiant power: It generates the spectral cone. (See colour Plate 3 in the centre of this book.)
undetermined chromaticity in this scheme. The chromaticities are thus half-rays at the origin of colour space. The set of chromaticities is only two-dimensional, thus we have gained some simplicity by factoring out the magnitudes. A convenient way to represent the chromaticities is to assign a ‘chromaticity plane’ and to represent each chromaticity by its representative in this plane (this is equivalent to a central projection from the origin of C on the chromaticity plane, see Fig. 1.4). The choice of chromaticity plane is essentially arbitrary, though it is necessary to take a plane that does not contain the origin, and it is convenient to take one that cuts the cone of colours into two: a finite volume (containing the origin) and an open, infinite part. Then the boundary of the cone of colours appears as a convex curve in the chromaticity plane. It consists of two parts. One part is curved and has a roughly horseshoe shape. This part represents the chromaticities of the monochromatic beams and is the ‘spectral locus’ in the chromaticity diagram. The other part is a straight line segment and contains the chromaticities of the purples. It is the intersection of the plane of purples with the chromaticity plane. All real colours have chromaticities in the interior (convex hull) of this curve. All points outside represent ‘virtual’ chromaticities. Notice that—unlike colour space—the chromaticity plane has no natural origin. The chromaticity diagram has been very important in the history of colorimetry. Most reasoning used to be done in terms of chromaticities instead of colour space proper. Chromaticity diagrams are in common use in practical, everyday colorimetry. However, one should remember that it is only an incomplete representation of colour space and may easily lead the unaware astray. One example is the law of additive colour mixture. In C this is simply vector addition, nothing to it. 
However, in the chromaticity plane the law becomes rather more involved and beginners in colorimetry are apt to make mistakes that (unfortunately) often go unnoticed.
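A typical beginner's mistake is to take the midpoint of two chromaticities as the chromaticity of the mixture. The sketch below illustrates why this fails. It assumes the chromaticity plane c₁ + c₂ + c₃ = 1 (the CIE-style choice; any other admissible plane would give different numbers) and uses made-up colour coordinates. The mixture chromaticity is the average of the two chromaticities weighted by the coordinate sums of the colours, not their plain midpoint.

```python
from fractions import Fraction


def chromaticity(c):
    """Central projection of colour c on the plane c1 + c2 + c3 = 1."""
    total = sum(c)
    if total == 0:
        raise ValueError("black has an undetermined chromaticity")
    return (Fraction(c[0], total), Fraction(c[1], total))


def mix(c1, c2):
    """Additive colour mixture in colour space: plain vector addition."""
    return tuple(a + b for a, b in zip(c1, c2))


def weighted_mean(p, wp, q, wq):
    """Average of plane points p and q with weights wp and wq."""
    return tuple((wp * a + wq * b) / (wp + wq) for a, b in zip(p, q))


# Made-up integer colour coordinates, for illustration only.
C1 = (1, 1, 2)
C2 = (6, 3, 3)

p1, p2 = chromaticity(C1), chromaticity(C2)
p_mix = chromaticity(mix(C1, C2))
midpoint = weighted_mean(p1, 1, p2, 1)
centre_of_gravity = weighted_mean(p1, sum(C1), p2, sum(C2))

print(p_mix == midpoint)            # the naive midpoint is wrong
print(p_mix == centre_of_gravity)   # the weighted rule is right
```

The weights are the coordinate sums of the two colours, which is exactly the information that the chromaticity diagram discards.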
Figure 1.5 The beam A is mixed from the primaries R, G and B: A = rR + gG + bB. After adjustment to indistinguishability the fractions (r, g, b) (pure numbers!) are the ‘coordinates’ of beam A in the basis {R, G, B}. At the setting of equality the photometer field needn’t look like A; in fact this will certainly not be the case if one of the coordinates turns out negative. This is irrelevant.
Gauging the spectrum
Starting from Graßmann’s laws, we may begin empirical colorimetric investigations. Maxwell [28,29] was the first person to practise this seriously. One proceeds as follows. First, one picks a ‘basis’ for C. This means that one selects a triple of fiducial beams called the ‘primaries’, $\{P_1, P_2, P_3\}$. The choice is essentially arbitrary, except for the condition that no equation $c_1 P_1 + c_2 P_2 + c_3 P_3 \Leftrightarrow O$ can be established (other than with all coefficients zero); the primaries should be ‘independent’. Next one takes each monochromatic beam of unit radiant power and wavelength $\lambda$, say $M_\lambda$, in turn and establishes a colorimetric equation between it and the primaries. We write this equation $M_\lambda \Leftrightarrow a_{\lambda 1} P_1 + a_{\lambda 2} P_2 + a_{\lambda 3} P_3$. The coefficients $(a_{\lambda 1}, a_{\lambda 2}, a_{\lambda 3})$ are known as the ‘trichromatic coefficients’ (see Fig. 1.5). These are pure numbers, dimensionless as the physicist would say, namely simply the fractions of the primaries that enter into the colorimetric equation. We typically sample the wavelength scale at equal intervals at N sample points (say), for clearly we cannot sample all monochromatic beams, nor is this necessary. We sample at wavelengths $\lambda_{\min} + (i - 1)\,\Delta\lambda$, say, where $\lambda_{\min} = 380$ nm and $\Delta\lambda = 10$ nm, and we stop at $\lambda_{\max} = 740$ nm, say. In this case we have $N = (740 - 380)/10 + 1 = 37$. In practice, N will be a large number, say of the order of $10^2$; it stands for the infinite dimensionality of S. Now we have a set of N triples of numbers. We collect them in the $N \times 3$ matrix A (with entries $a_{ij}$, $i = 1, \dots, N$, $j = 1, 2, 3$), the so-called ‘colour-matching matrix’. The columns of the colour-matching matrix are known as the ‘colour-matching functions’, and the rows are the trichromatic coefficients for the corresponding monochromatic beam. Notice that the colour-matching matrix is obtained row by row, the columns only appearing after the whole process has been completed. This whole procedure is known as ‘gauging the spectrum’.
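The bookkeeping of this procedure can be sketched as follows. The sampling grid (380 to 740 nm in steps of 10 nm) is the one quoted in the text; the function `toy_match` is a purely hypothetical stand-in for the observer's setting at each wavelength and returns made-up numbers, not measured trichromatic coefficients.

```python
# Gauging the spectrum: the colour-matching matrix is filled row by row,
# and the colour-matching functions appear only afterwards as its columns.
LAMBDA_MIN, LAMBDA_MAX, STEP = 380, 740, 10   # nm

wavelengths = list(range(LAMBDA_MIN, LAMBDA_MAX + 1, STEP))
N = len(wavelengths)   # (740 - 380) / 10 + 1 sample points


def toy_match(wavelength_nm):
    """Hypothetical observer setting: three made-up coefficients.

    The numbers vanish at both ends of the sampled range; they are
    NOT measured colour-matching data.
    """
    t = (wavelength_nm - LAMBDA_MIN) / (LAMBDA_MAX - LAMBDA_MIN)
    return (t * t * (1 - t), t * (1 - t), t * (1 - t) * (1 - t))


# One colorimetric equation per monochromatic beam gives one row.
A = [toy_match(lam) for lam in wavelengths]

# The three colour-matching functions are the columns of A.
colour_matching_functions = list(zip(*A))

print(N, len(A), len(A[0]))
print(len(colour_matching_functions), len(colour_matching_functions[0]))
```

With real data the only change would be to replace `toy_match` by the recorded settings of an observer, one row per wavelength.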
Given the spectrum of any beam, sampled at the N fiducial wavelengths, $s = (s_1, \dots, s_N)$ (say), we find the corresponding colour via the colour-matching matrix: $c = A^{T} s$. Here c is an element of the three-dimensional space C, whereas s is an element of the N-dimensional (notice that N stands for ‘∞’ dimensions!) space S. Thus the three components of c are simply the total radiant power when the spectrum of the beam is weighted by the corresponding colour-matching function. This is an immediate consequence of the linearity of Graßmann’s projection of S into C. The transpose of the colour-matching matrix, $A^{T}$, projects the spectrum to the space of colours, dropping N − 3 dimensions in the process. Of course, the initial choice of primaries was quite arbitrary. Any other choice would have done as well. What difference does this make? Well, the colour-matching matrix will turn out different and the trichromatic coefficients for the same beam will turn out to be different too. Even when we change only a single primary, then (in general) all three colour-matching functions will change! Thus these numbers have no invariant meaning at all. However, it is easy to show (because of linearity) that the colour spaces for different choices of primaries are all affinely equivalent, that is to say, they are all related through linear maps (deformations, represented by non-singular 3 × 3 matrices) that depend only on the change of primaries. Such maps merely ‘relabel’ the colours and establish isomorphisms between the representations. Seen through ‘affine spectacles’ there exists only a single colour space and all specific representations only appear different, like the different perspectives of a single city. Of course, this raises many questions.
For instance, since all colour-matching matrices (that is colour-matching matrices for different choices of primaries) yield essentially the same colour space except for the inessential representation, what exactly is common between them? And if all representations of C are equivalent, might not one representation still be considered ‘canonical’ in some sense? We will return to these—indeed pertinent— questions.
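The relabelling of colours under a change of primaries can be checked on a toy example. The sketch below assumes a made-up 4 × 3 colour-matching matrix `A` and an arbitrary non-singular 3 × 3 matrix `T` relating it to a second colour-matching matrix, A′ = AT; neither is real colorimetric data, and the helper names are our own. The point is only that the new colour of any beam is the old colour pushed through a fixed 3 × 3 map.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Matrix times vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# Hypothetical colour-matching matrix, N = 4 sample wavelengths (rows).
A = [
    [0.2, 0.0, 0.9],
    [0.5, 0.6, 0.3],
    [0.9, 0.8, 0.0],
    [0.4, 0.1, 0.0],
]

# Hypothetical sampled spectrum of some beam.
s = [1.0, 2.0, 0.5, 3.0]

# Colour of the beam: c = A^T s.
c = matvec(transpose(A), s)

# A second, equally valid colour-matching matrix A' = A T,
# with T an arbitrary non-singular 3 x 3 matrix (determinant 2 here).
T = [
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
]
A_new = matmul(A, T)

# Colour of the same beam in the new representation: c' = A'^T s.
c_new = matvec(transpose(A_new), s)

# Linearity gives c' = T^T c, so the 3 x 3 map alone relabels every colour.
c_relabelled = matvec(transpose(T), c)
print(c, c_new, c_relabelled)
```

Because the map from c to c′ is invertible, two beams have equal colours in one representation exactly when they do in the other, which is the sense in which the representations are equivalent.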
Warm and cold colours

Consider the colour-matching functions in some basis. We may plot them in C, for clearly each triple of trichromatic coefficients represents a point in C (see Fig. 1.6). The colour-matching functions thus represent a one-parameter family of points, parameterized by the wavelength of the monochromatic beams of unit radiant power. Such a point set will be a space curve in C. Indeed, when we plot the colour-matching functions in some arbitrary basis we obtain a smooth curve. This curve departs from o as the wavelength progresses beyond about 380 nm (the short wavelength, or ‘blue’ limit, of the visual spectrum), for shorter wavelengths are invisible (ultraviolet (UV), X-rays and γ-rays are equivalent to total darkness). Then the curve frolics around in C, only to return towards o again when the wavelength approaches 740 nm, the long wavelength or ‘red’ limit of the visual spectrum (for infrared (IR), microwaves and radio waves are also equivalent to total darkness). The curve leaves o and returns to o via different directions. We may construct the tangent directions
Figure 1.6 Two views of the spectral locus and the plane of purples. The plane of purples (p) is spanned by the two tangents at the origin (O) of the locus of equienergetic monochromatic beams (s). Notice how the curve first moves away from the plane of purples, then returns to it. The turning point is the division between the ‘warm’ and ‘cold’ spectral colours.
at the two limits of the visual spectrum. These two directions of course span a plane;¹⁰ this is the ‘plane of purples’ mentioned earlier. Empirically we find that the curve (known as the ‘spectral locus’) runs on one side of the purple plane only: for the short wavelength end of the spectrum the curve moves away from the plane of purples, then at about 537 nm it runs parallel to this plane, then, for the long wavelength end of the spectrum, it returns towards the plane of purples again¹¹ (Fig. 1.6). This clearly separates all monochromatic beams into two families: those of wavelengths 380–537 nm and those of wavelengths 537–740 nm (these wavelength indications are only approximate). This dichotomy is quite independent of the arbitrary choice of primaries and is to be considered an invariant property of the structure of C and the spectrum of the achromatic beam. The latter can only be conventional, of course, say average daylight or a flat spectrum (as here) (the difference is actually slight).

It appears that this dichotomy coincides with the basic dichotomy conventionally recognized by visual artists, that is the division of all colours (here we should really be saying: ‘(coloured) patches’) into ‘warm’ and ‘cold’ families. Colorimetry has nothing to say on why the monochromatic beams of 380–537 nm are termed ‘cold’ and those of 537–740 nm ‘warm’. However, the dichotomy with the transition at 537 nm (‘greenish’) is unlikely to be coincidental.

¹⁰ Thus the points in the plane of purples occur as additive mixtures of the limiting (far red and deep blue) beams. The problem is that in the limit we need infinite radiant power to actually see something. Thus the plane of purples is really a limiting entity, a tangent plane whose points represent colours that just fail to be ‘real’.

¹¹ One might think that ‘distance from the purple plane’ is a notion that is alien to the affine nature of colour space. However, we can formulate a fully affine definition easily. The reason is that we can consider planes parallel to the purple plane in affine geometry, and also their order. This suffices to find the farthest point.

An invariant of the colour-matching matrices

What do all colour-matching matrices (that is, for different choices of primaries) hold in common, that they may generate essentially the same colour space? The answer has to
Figure 1.7 S is the space of spectra; here it is represented as a fibre bundle, that is, we have factored it as S = F ⊕ B. Notice that though S is ∞-dimensional (or N-dimensional), here it is illustrated as 3D. The (∞ − 3)-dimensional space B is illustrated as 2D and the 3D space F is illustrated as 1D, simply to obtain an intuitive geometrical picture. Colour space C is a 3D space, quite distinct from S. Here it is illustrated as 1D to show the important fact that it has the same dimension as F. In fact F and C are isomorphic. Notice that each point of F carries a translated copy of B. This ‘fibre over the fundamental’ is the metamer defined by that fundamental. Thus we have represented the space of beams S as a stack of metamers; each metamer corresponds 1–1 with a colour. When we pick any beam, with spectrum s (say), we may project it on F via Cohen’s matrix R, thus obtaining the fundamental component f. We may also project it on the black space B, thus obtaining the metameric black component b. Notice that (in S) s = f + b. The (transpose of the) colour-matching matrix maps the beams on the colours, i.e. s on c: $c = A^{T}s = A^{T}f$. Notice that this is the same as the map of f on C, since the black component b remains causally ineffective. Thus C is the space of beams S modulo the metameric black space B.
be found in the parcellation of the space of beams into metamers. Clearly, there is only one such parcellation in terms of the geometry, although the metamers are given different coordinates by the various colour-matching matrices. The set of metamers as such must be identical for all choices of primaries because it doesn’t depend on the primaries at all: it is, once and for all, given by the structure of the visual system, that is to say, the cone action spectra. Coordinatization should not matter.

A standard way to put order in this mess is to write the space of beams as a direct sum of two subspaces, ‘fundamental space’ (F) and ‘metameric black space’ (B), thus S = F ⊕ B (see Fig. 1.7). The symbol ‘⊕’ stands for the ‘direct sum’ of the ‘mutually orthogonal’ subspaces F and B. What does this mean? It means that we will write any beam with spectrum s (say) as the sum of two components, an (often virtual) ‘fundamental beam’ with spectrum f ∈ F and an (always virtual) ‘black’ beam b ∈ B, thus s = f + b. We will do this in such a way that the fundamental component yields the colour of the beam, that is to say $c = A^{T}s = A^{T}f$, whereas the black beam yields the black colour (the origin of C, hence the name), that is to say $A^{T}b = o$. This can always be done and in a unique manner.

Formally speaking the black space is simply the null space of the transpose of the colour-matching matrix, and fundamental space is the orthogonal complement. Since the rank
of the colour-matching matrix is 3, the null space (metameric black space) has dimension N − 3, whereas fundamental space has dimension 3, that is the same dimension as colour space. Fundamental space and black space together span the space of beams.

Conceptually we have now constructed a very elegant representation: any beam can be decomposed into a fundamental beam and a black beam. The black beam is causally ineffective, in the sense that it has no influence whatsoever upon the colour. The physiological explanation is that the black component does not stimulate the retinal cones and thus is unable to modulate nervous activity: the black beam never even reaches the brain! The fundamental component is the unique causally efficient part of the beam. Any nervous modulation is only due to the fundamental component. Moreover, different fundamental components yield different colours, thus the fundamental components stand in a 1–1 relation to the colours. Fundamental space is isomorphic with colour space.

A colour corresponds to a metamer,¹² which is an infinite equivalence class of beams. In any metamer there exists one canonical representative which is the fundamental component; all other beams in the metamer are equal to the fundamental component plus some black beam. We obtain the full metamer (‘metameric suite’) when we add all possible black components (that is the metameric black space) to the fundamental component. Since fundamental space and metameric black space are mutually orthogonal, we can write the squared length of the spectrum s as $\|s\|^{2} = \|f\|^{2} + \|b\|^{2}$ (this is nothing but the Pythagorean theorem). Thus among all metameric beams the fundamental component is the ‘shortest’ one, in a sense the ‘simplest’ (thus canonical) representative. We may then conceive of S as a fibred space, a three-dimensional linear space of fibres (Fig. 1.7). Each fibre is a copy of B, attached to a specific fundamental component.
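The decomposition into fundamental and black components can be tried out numerically. The following is a minimal sketch under stated assumptions: the three colour-matching functions are invented numbers on a five-point wavelength grid, the function names are our own, and the fundamental is obtained by orthogonal projection onto the span of the colour-matching functions using a Gram–Schmidt basis (one of several equivalent ways to compute it).

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def orthonormal_basis(columns):
    """Gram-Schmidt: orthonormal vectors spanning the same space as `columns`."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            k = dot(q, v)
            v = [x - k * y for x, y in zip(v, q)]
        norm = dot(v, v) ** 0.5
        basis.append([x / norm for x in v])
    return basis

def split(spectrum, basis):
    """Return the (fundamental, black) components of `spectrum`."""
    f = [0.0] * len(spectrum)
    for q in basis:
        k = dot(q, spectrum)
        f = [x + k * y for x, y in zip(f, q)]
    b = [x - y for x, y in zip(spectrum, f)]
    return f, b

def colour(cmf_columns, spectrum):
    """c = A^T s: weight the spectrum by each colour-matching function."""
    return [dot(col, spectrum) for col in cmf_columns]

# Hypothetical colour-matching functions (the columns of A), N = 5 samples.
cmf = [
    [0.1, 0.6, 0.9, 0.3, 0.0],
    [0.0, 0.3, 0.8, 0.7, 0.1],
    [0.9, 0.4, 0.1, 0.0, 0.0],
]
basis = orthonormal_basis(cmf)  # orthonormal basis of fundamental space F

s = [1.0, 2.0, 0.5, 3.0, 1.5]   # hypothetical spectrum of one beam
t = [0.2, 0.1, 4.0, 1.0, 2.0]   # hypothetical spectrum of another beam

f_s, b_s = split(s, basis)
f_t, b_t = split(t, basis)

# A different member of the metamer of s: its fundamental plus another black beam.
metamer = [x + y for x, y in zip(f_s, b_t)]

print(colour(cmf, b_s))                       # numerically zero: black is invisible
print(colour(cmf, s), colour(cmf, metamer))   # equal colours
print(dot(s, s), dot(f_s, f_s) + dot(b_s, b_s))  # Pythagoras
```

Note that `metamer` (like the fundamental and black components themselves) may contain negative entries, so it need not be a physically realizable beam.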
The visual system sees only the fibres (a three-dimensional manifold) but cannot resolve the structure within the fibres. The fundamentals are sufficient to represent the fibres, but any point within the fibre would do as well: the visual system doesn’t notice the difference anyway. This is a rather pleasant way to put it because it shows up the fact that F is not really different from S: it is simply a low-resolution image of it. Moreover, C is isomorphic with F, that is, nothing but a linearly deformed copy of it. This is an important insight because it is often held that there is a fundamental cleft between the physical world (S) and the world of perceptions (C). This is evidently a nonsensical view since C turns out to be simply isomorphic to a subspace of the space of beams. We will return to this issue later.

Clearly, fundamental space does not depend on the choice of primaries. Thus the null spaces of (the transposes of) all colour-matching matrices must be the same: here we have the sought-for invariant of the colour-matching matrices. It is possible to compute an operator in the space of beams (an N × N matrix) from the colour-matching matrix that projects any beam on its fundamental component: Rs = f. The matrix R is ‘Cohen’s matrix R’. This is a convenient numerical invariant of the colour-matching matrices. Indeed, if we calculate Cohen’s matrix R from an arbitrary colour-matching matrix, we always obtain the same result (see Fig. 1.8).

Cohen’s matrix R enables us to calculate metameric beams of any given beam. For instance, consider beams A and B with spectra a and b. Then the beam with spectrum Ra + (b − Rb) must have the same colour as beam A because it is composed of

¹² The term ‘metamer’, which is in common use today, was coined by Ostwald [34].
Figure 1.8 Contourplot of the entries in ‘Cohen’s Matrix R’. Notice the three extrema on the diagonal, at about 440 nm, 530 nm and 590 nm: these are characteristic for the human visual system.
the fundamental of A and the black component of B. Here the beam B is quite arbitrary! Thus we have a recipe to generate beams in the metamer of a with mad abandon. One thing to take good notice of is that the fundamental and black components of any given beam are unlikely to be physically realizable (the black beam obviously never will be!) but are most likely to be virtual beams. If this is considered to be a problem, we can try to construct a realizable beam as close to the fundamental one as possible by adding a suitable black beam.¹³ However, in most cases the fact that the fundamental component is only virtual is no big deal.

¹³ There exist a variety of technical methods to handle this. The most effective are ‘linear programming’ and ‘iterative projection on convex sets’.

Canonical bases

In order to motivate the discussion below we first consider an intuitive example from ‘real space’ instead of the space of beams. Consider a linear transformation from three-dimensional space on to a plane. We suppose the three-dimensional space to be metrical, i.e. we are able to compare lengths in three orthogonal directions. Similarly, we assume the image plane to be metrical. Such a transformation is clearly a projection, not an isomorphism, for it drops one dimension. Its null space is a direction in the three-dimensional space. All points that lie on a line extended in
Figure 1.9 On the left the result of a random projection (same matrix as used in Fig. 1.10) from three-dimensional space on the plane. The subject is a human head. Clearly there are strong deformations in this projection. On the right the same projection in a canonical basis constructed automatically via the ‘singular values decomposition’. This projection is clearly free of deformations (except for the loss of the depth dimension) and is clearly to be preferred. The canonical basis is determined up to an isometry in the plane. There is no way to figure out automatically that human observers prefer a specific orientation of ‘the vertical’.
this direction will be mapped on a single point of the plane; this is the basic notion of ‘projection’. The orthogonal complements of the null space are planes in the three-dimensional space that are perpendicular (in a metric space perpendicularity is well defined) to the null space. We expect of a ‘nice’ projection that configurations in such a plane are transferred to the plane of projection without suffering any further deformation. For instance, if we take a photograph of a frontoparallel wall (using a long telephoto lens to suit our example, which gives almost ‘parallel projection’) we expect an ‘undistorted image’; we should only lose the ‘depth’. Now our general linear transformation will fail miserably in this respect: Fig. 1.9 shows an example. It will induce some arbitrary deformation (‘shear’) that yields a highly distorted image.

In order to do better we may attempt to find a basis in the plane (the image) that ‘undoes’ the deformation. There exist standard methods (‘singular values decomposition’, SVD) to find such a basis.¹⁴ Such methods are used in many fields where it is sometimes unavoidable that images are distorted because of unfortunate imaging parameters (for instance, in photographing a skyscraper we may have to tilt the camera, which will have the effect of a distorted image, the skyscraper appearing as if it were tumbling down).

¹⁴ Singular values decomposition has quickly become the premier tool of linear algebra. Standard methods are generally available that let one handle basically any matrix of practical interest robustly and in reasonable time. The method exploits the basic fact that any linear map between two ‘inner product spaces’ (linear spaces with a metric) can be simply factored. When the domain space has higher dimension than the target space, the map can be considered as the combination of a trivial map (dumping everything on a subspace) and a deformation (isomorphism). The deformation again can be considered as a simple rescaling of the axes when one picks the representations in the two spaces shrewdly. The ‘SVD’ (‘singular values decomposition’) does exactly this: it presents one with canonical bases for the spaces and a set of ‘singular values’ which are the required scaling factors.
Figure 1.10 A simple example of the SVD for a map from three- to two-dimensional space. Shown is a cloud of dots, uniformly distributed on the surface of the unit sphere in three-dimensional space. We picked a random map, given by a matrix with rows {0.2364, −0.1011, 0.6127} and {0.08872, 0.1462, 0.711}. The straight application of this map produces a cloud in the plane that fills an oblique elliptical area with axis ratio 1:4.61: the ‘image’ of the sphere is highly deformed. A straight SVD produces bases in three-space (indicated in the left figure) and the plane (right figure) and scaling factors (‘singular values’) that let us draw a truthful (not deformed) image: the cloud fills a circular area. The null space (‘black space’) is shaded in the left figure; it is one of the directions of the canonical SVD frame.
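The numbers in this caption can be checked with a short calculation. The sketch below takes the two matrix rows as quoted in the caption and, instead of calling an SVD routine, uses the closed-form eigenvalues of the 2 × 2 Gram matrix $MM^{T}$ (whose square roots are the singular values); this shortcut, like the variable names, is our own choice and works only because the map has just two rows. The null direction is obtained as the cross product of the rows.

```python
import math

# Matrix rows as quoted in the caption of Fig. 1.10.
M = [
    [0.2364, -0.1011, 0.6127],
    [0.08872, 0.1462, 0.711],
]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# 2 x 2 Gram matrix G = M M^T; its eigenvalues are the squared singular values.
g11 = dot(M[0], M[0])
g22 = dot(M[1], M[1])
g12 = dot(M[0], M[1])

# Closed-form eigenvalues of a symmetric 2 x 2 matrix.
mean = (g11 + g22) / 2
radius = math.hypot((g11 - g22) / 2, g12)
sigma_max = math.sqrt(mean + radius)
sigma_min = math.sqrt(mean - radius)

# Axis ratio of the ellipse into which the unit sphere is mapped.
aspect = sigma_max / sigma_min

# Null space of the map: the direction perpendicular to both rows.
r1, r2 = M
null_direction = [
    r1[1] * r2[2] - r1[2] * r2[1],
    r1[2] * r2[0] - r1[0] * r2[2],
    r1[0] * r2[1] - r1[1] * r2[0],
]

print(round(sigma_max, 4), round(sigma_min, 4), round(aspect, 2))
print(dot(r1, null_direction), dot(r2, null_direction))  # both numerically zero
```

The axis ratio comes out close to the 1:4.61 quoted in the caption. Dividing the two image coordinates (taken along the principal axes) by `sigma_max` and `sigma_min` respectively is what turns the ellipse back into a circle.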
We encounter a very similar situation in colorimetry. The colour-matching matrix discards the (N − 3)-dimensional metameric black space and maps the fundamental space in a 1–1 fashion on colour space C. The only part of the space of beams that we will ever ‘see’ is fundamental space, F ⊂ S: can we at least hope to see an undistorted image of it? The necessary requirements are given: in the space of beams, S, we can compare lengths in arbitrary dimensions, since we simply have spectral radiant power density to compare. Singular values decomposition immediately supplies us with a basis in which C is an undistorted image of F. We only lose the ‘depth’ (that is, the black component) but otherwise gain a truthful image of the beams; this is shown in Figs 1.10 and 1.11. The situation is much like the visual projection where we drop one dimension (the depth) out of three (breadth, height and depth); here we drop (N − 3) dimensions (black) out of N (the full spectral radiant power density). This shows that colorimetry is simply ‘low-resolution (only 3 degrees of freedom) spectroscopy’. In this respect colour vision is indeed very close to the physics of radiation; the only arbitrary factor is the structure of the null space. What gets lost (misses its way to the brain) depends on the accidental (that is, shaped through evolution of the species) cone action spectra. In Fig. 1.12 we show an ‘undistorted’ version of the spectral locus and spectral cone.

There is still some freedom left: we may apply arbitrary isometries (rotations, reflections, etc.) to F. Isometries leave all lengths and angles invariant. Similarly, the image in C will be subjected to an isometry, but all geometrical configurations will be unaffected by this. In summary, we are quite free to pick any orthonormal basis in F and this will change our C. One unique choice is to pick the achromatic axis and to let another direction lie in the plane that divides the warm–cold colour families.
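In formulas, the construction just described can be written down as follows. This is a sketch in our own notation, not taken from the text: U, Σ and V denote the factors of the (thin) singular values decomposition of the colour-matching matrix A, and hats mark quantities in the canonical representation.

```latex
% Thin SVD of the N x 3 colour-matching matrix (rank 3):
A = U \Sigma V^{T}, \qquad
U^{T} U = I_{3}, \quad V^{T} V = V V^{T} = I_{3}, \quad
\Sigma = \mathrm{diag}(\sigma_{1}, \sigma_{2}, \sigma_{3}), \ \sigma_{i} > 0 .

% The columns of U form an orthonormal basis of fundamental space F,
% so the projector on F and the fundamental component are
R = U U^{T}, \qquad f = R s .

% Canonical colour-matching matrix and canonical colour coordinates:
\hat{A} = A V \Sigma^{-1} = U, \qquad
\hat{c} = \hat{A}^{T} s = U^{T} s = U^{T} f .

% The image is undistorted: lengths in C equal lengths in F,
\| \hat{c} \|^{2} = s^{T} U U^{T} s = f^{T} f = \| f \|^{2} .

% Remaining freedom: any orthogonal 3 x 3 matrix Q gives an equally good basis,
U \mapsto U Q, \qquad \hat{c} \mapsto Q^{T} \hat{c} .
```

The last line is the isometry freedom mentioned above; fixing Q amounts to choosing which orthonormal directions in F serve as coordinate axes.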
The third orthogonal direction is then
Figure 1.11 The colour-matching functions of the canonical basis, plotted against wavelength (400–700 nm); panels: achromatic, yellow–blueness, green–redness. These functions are similar to the classical ‘colour moment’, ‘red–greenness’ and ‘yellow–blueness’. The latter ‘opponent signals’ are similar to neural encodings found in the brain. Apparently the human brain uses a representation not unlike the canonical basis, an undistorted representation of the physical structure of beams.

Figure 1.12 The spectral cone in the canonical basis: an undistorted view. Panels: achromatic/yellow–blueness plane, achromatic/green–redness plane, yellow–blueness/green–redness plane.
automatically defined. The only remaining freedom is the orientation, but this can only be settled by fiat (like left and right: which is which?). Notice that this is a representation that is unique and independent of the (fully accidental) choice of primaries. We obtain a truthful image of F as our C, that is indeed the same in any laboratory, because it is fully independent of our accidental choices except for the choice of the achromatic beam (more on that below) and the orientation (discussed above). In Fig. 1.11 we show the ‘primaries’ (two of the three are virtual colours) that result from a straightforward singular values decomposition of the CIE 1964 colour-matching matrix and subsequent rotation to the canonical frame. The achromatic beam here is the uniform, flat spectrum. The first primary is simply the fundamental of the equal energy spectrum, the other two are yellow–blue and green–red ‘opponent signals’. Notice that the zero crossings (about 490 nm for the yellow–blueness and 570 nm for the green–redness) are indeed close to those of Hering’s Gegenfarben. In Fig. 1.12 we illustrate the spectral locus in the canonical basis. Here is an undistorted view of this important configuration. We’re getting a limited, but true-enough glimpse of the space of beams. The second and third primary are indeed remarkably close to Judd’s specification of the Hering system [21], but the first one clearly differs—apparently the Hering frame is far from being orthogonal. That colorimetry is essentially just low-resolution spectroscopy shows that the discipline isn’t really to be considered part of experimental psychology. Rather, it is to be considered simply (applied) physics, or—because of the accidental status of the structure of the black space—perhaps physiology. The only arbitrary choice left is in the description of the physics of the situation. What is the ‘natural representation’ for the beams? 
This is important because it defines the metric of the space of beams, which will induce a metric in the space of colours. This is not a question of physics but one of ‘ecological optics’. As we have seen earlier, the spectral description can be done in terms of frequency (essentially photon energy, see below) or wavelength, whereas the amount of radiation can be measured as radiant power or as photon number density. The choice makes a difference to the eventual representation, but neither physics nor colorimetry proper has anything to say on the matter. The conventional choice is radiant power as a function of wavelength, but this is purely for historical reasons, dating all the way back to Newton [31,32]. A perhaps more rational representation would be photon number density as a function of photon energy. The reason is that the photoreceptors are essentially photon counters, whereas the photon energy is the major causal factor in the interaction of electromagnetic radiation with matter. Thus there are good biological reasons (that is the ‘ecological optics’) to prefer the latter representation.
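That the choice of representation makes a real difference is easily seen in a small calculation. The sketch below converts spectral radiant power per unit wavelength into photon number per unit photon energy, using E = hc/λ and the Jacobian |dλ/dE| = λ²/(hc); the function names and the flat test spectrum are our own illustrative assumptions.

```python
H = 6.62607015e-34  # Planck constant in J s (exact SI value)
C = 299792458.0     # speed of light in m/s (exact SI value)

def photon_energy_joule(wavelength_m):
    """Photon energy E = h c / lambda."""
    return H * C / wavelength_m

def photons_per_unit_energy(power_per_wavelength, wavelength_m):
    """Convert radiant power per unit wavelength (W/m) at a given wavelength
    into photon number per second per unit photon energy (1/(s J))."""
    # Photons per second per unit wavelength: power divided by energy per photon.
    photons_per_wavelength = power_per_wavelength / photon_energy_joule(wavelength_m)
    # Change of variable from wavelength to photon energy: |d(lambda)/dE|.
    jacobian = wavelength_m ** 2 / (H * C)
    return photons_per_wavelength * jacobian

# A spectrum that is flat as radiant power per unit wavelength (arbitrary level) ...
flat_level = 1.0
n_400 = photons_per_unit_energy(flat_level, 400e-9)
n_700 = photons_per_unit_energy(flat_level, 700e-9)

# ... is far from flat as photon number per unit photon energy:
# the two representations differ by a factor proportional to lambda cubed.
ratio = n_700 / n_400
print(ratio, (700 / 400) ** 3)
```

A spectrum that counts as ‘flat’ in one description is thus strongly tilted in the other, so lengths and angles in the space of beams (and hence in the canonical colour space) depend on which description one adopts.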
Additional structure: the achromatic beam

When we circumnavigate the boundary of the cone of colours (avoiding the origin) we experience a continuous change of ‘hue’. Since the path is closed, the hues form a periodic linear sequence. Since real colours in the interior of the colour cone also have ‘hues’, one wonders about the loci of constant hue in colour space. They must be surfaces that intersect the boundary of the colour cone transversely. Some topological reasoning soon reveals that somewhere in the interior must exist a singular curve of points of undetermined hue.
One may designate such colours ‘hueless’ or ‘achromatic’. It was already clear to Graßmann [15] that if we want to talk about hue we need an achromatic locus inside the cone of colours. However, it is clear that we cannot discuss these things in the context of colorimetry proper, for the simple reason that any talk about ‘hue’ oversteps the boundaries of the very response reduction that makes colorimetry a viable discipline. One way out of this dilemma is to point out some fiducial beam as ‘achromatic’ in a fully arbitrary or deus ex machina fashion. The only requirement is that the beam doesn’t map on the boundary of the colour cone. This move has the obvious advantage that it doesn’t violate the response reduction (the subject is never asked to agree on the ‘achromaticness’ of the fiducial beam). It has perhaps the drawback of arbitrariness though. However it may be, such a definition allows us to add considerable additional structure to colour space. It is almost a necessity if we desire to proceed beyond the point that we have now reached. We will follow up the consequences in this chapter. In practice, one designates a fiducial beam as ‘achromatic’ for some pertinent reason. For instance, one may pick ‘average daylight’, or a ‘flat (equienergy) spectrum’, or the spectrum of the illuminant that is important in a given setting. That is not to say that such a definition is not an arbitrary act from the vantage point of colorimetry. It is particularly useful to pick a spectrum whose spectral radiant power density dominates the spectra occurring in a given setting throughout the spectrum. This is a frequently occurring case in practice: one has the beam of an illuminant (say sunlight) and all other beams are created by taking away (that is attenuating) radiant energy from this spectrum. This is essentially Goethe’s notion (dating back to the Greeks) that ‘colours are shadow-like entities’. In such a case we assign the illuminating beam as the ‘achromatic’ one. 
The chromaticity of the achromatic beam represents a half-ray that we will call the ‘achromatic axis’. This immediately induces additional geometrical structure, namely a sheaf of planes that all contain the achromatic axis. Each such plane meets the boundary of the cone of colours along a generator that either represents the chromaticity of a monochromatic beam, or a purple one. We may thus label the planes with the corresponding monochromatic beam. In the case of a purple we simply extend the plane beyond the achromatic axis and find the monochromatic beam at the opposite side of the cone. We label the planes with the wavelength of the corresponding monochromatic beam, in the case of a purple we prefix a minus sign. One calls this the ‘dominant wavelength’ of any colour contained in the plane. By extending the planes beyond the achromatic axis we establish a relation between pairs of dominant wavelengths, conventionally these are called mutually ‘complementary’ wavelengths. In the chromaticity plane the achromatic beam is represented by an achromatic point, and the sheaf of planes by a fan of lines on this point. The dominant and complementary wavelengths are found as the intersections of the lines of the fan with the spectral locus and line of purples. Using this construction we may add various geometrical configurations to the basic structure of colour space. First we notice that the plane through the monochromatic chromaticity of (about) 537 nm divides the cone of colours into two parts, the warm and the cold colours. Now we have extended the definition of warm and cold to all colours, not just the monochromatic ones. There is no way to do this without the achromatic axis and the construction indeed depends on the (arbitrary!) choice of this axis. Notice that we also
Figure 1.13 The CIE chromaticity diagram with several remarkable objects: the purple line with the spectral limit points, the warm–cold division on the spectral locus (see colour Plate 4 in the centre of this book). The achromatic point defines the complementarities of the spectral limits and the plane that divides all chromaticities into warm and cold. All these objects depend for their existence on the introduction of an achromatic beam.
obtain a special purple, dividing the plane of purples into a warm and a cold part. Next we notice that the spectral limits (‘red’ and ‘blue’) are special generators of the spectral locus (clearly invariant against changes of primaries) and that these also define unique planes with the achromatic axis. Extending these to the opposite side yields two special chromaticities, namely the complementary of the short wavelength end (‘yellow’) and the complementary of the long wavelength end (‘cyan’ or ‘blue–green’). These geometrical entities turn out to be very significant in the further development of the structure of colour space. They do depend on the (again, arbitrary) choice of the achromatic axis though (see Fig. 1.13).
Newton’s spectrum

From Newton’s famous drawing of the spectrum we can estimate that the effective ‘slitwidth’ must have been somewhere between 50 and 100 nm. Thus the spectrum cannot have been very ‘pure’! Yet one easily checks that the spectrum looks fine at such a slitwidth (as shown by Newton [31,32]), despite the fact that the ‘spectral beams’ are not monochromatic beams (‘homogeneous lights’ in Newton’s terminology) by a long shot, but rather ‘confused mixtures’ as Newton would have it. Indeed, when one tries to improve the situation by decreasing the slitwidth, the spectrum doesn’t really look any better but it becomes dimmer
Figure 1.14 The Newtonian spectrum at various slitwidths (see colour Plate 5 in the centre of this book).
and for very narrow slits (the ‘ideal’ situation), actually appears black. The reason is simple enough: a very narrow slit hardly lets any radiation pass, thus the beams approach the black beam O. Only in very special laboratory situations does one ever get to see very pure spectra.¹⁵ They really don’t look any better than Newton’s impure spectrum though, so the effort is really wasted. When one increases the slitwidth the colours become brighter (the slit passes more radiation), but from a certain point on they tend to ‘desaturate’, that is, they approach the colour of the entrance beam (let’s call it ‘white’ for the moment). When the slit is very wide the spectrum is lost and one sees only a white beam. Clearly, there is some optimum slitwidth at which the spectrum appears ‘most colourful’ (see Fig. 1.14).

¹⁵ One has to increase the radiance of the entrance beam enormously to get some radiation to enter the eye and screen the observer from it in order to avoid dazzling the visual system.
[Figure 1.15 panel labels: colour patch with dominant wavelength 560 nm, colorimetric purity 60%, luminance 464 cd/m²; spectrum of CIE D65 daylight (400–700 nm); remitted spectrum of a green leaf (#1: leskenlehti) (400–700 nm); CIE chromaticity diagram.]
Figure 1.15 Example of ‘Helmholtz coordinates’. The remitted spectrum of a green leaf illuminated with average daylight in the CIE chromaticity diagram. We can obtain the colour as a mixture of the achromatic beam with a monochromatic beam (indicated). The proportions are easily obtained from the chromaticity diagram. (See colour Plate 6 in the centre of this book.)
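How the proportions are read off the chromaticity diagram can be sketched as follows. The example relies on the centre-of-gravity rule for additive mixtures in a chromaticity diagram, with the tristimulus sums X + Y + Z as weights. The chromaticities used for the achromatic point (roughly CIE D65) and for the 560 nm monochromatic beam are approximate CIE 1931 values assumed here for illustration only, the mixing weights are invented, and the function names are our own; the example does not reproduce the leaf of the figure.

```python
def mix(chroma_a, weight_a, chroma_b, weight_b):
    """Chromaticity of an additive mixture (centre-of-gravity rule).
    The weights are the tristimulus sums X + Y + Z of the two components."""
    total = weight_a + weight_b
    return tuple((weight_a * a + weight_b * b) / total
                 for a, b in zip(chroma_a, chroma_b))

def fraction_along(white, mono, sample):
    """Position of `sample` on the segment from `white` to `mono`
    (0 = achromatic point, 1 = spectral locus)."""
    dx, dy = mono[0] - white[0], mono[1] - white[1]
    px, py = sample[0] - white[0], sample[1] - white[1]
    return (px * dx + py * dy) / (dx * dx + dy * dy)

# Assumed, approximate CIE 1931 chromaticities (illustrative values only).
white = (0.3127, 0.3290)  # achromatic point, roughly CIE D65
mono = (0.3731, 0.6245)   # monochromatic beam near 560 nm

# Build a colour from 2 parts achromatic and 3 parts monochromatic
# (parts measured in X + Y + Z), then recover the proportion geometrically.
sample = mix(white, 2.0, mono, 3.0)
t = fraction_along(white, mono, sample)
print(sample, t)
```

The recovered fraction `t` is the share of the monochromatic component in the mixture, measured in X + Y + Z; this is what is usually called excitation purity. Colorimetric purity, as quoted in the figure, is defined through luminances instead and will in general give a different number for the same colour.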
For a fixed entrance beam it is not hard to calculate the optimum slitwidth. One uses Helmholtz’s and Graßmann’s observation [46] that any beam admits of a metameric beam that is made up of an achromatic beam and a monochromatic beam. Thus any beam can be characterized through the intensity of its ‘achromatic component’, the intensity of the ‘monochromatic component’ and the wavelength of this monochromatic beam. This particular description is often known as the ‘Helmholtz’ coordinates of the beam (Fig. 1.15). In our case we are especially interested in the intensity of this (purely hypothetical) monochromatic component, we’re looking for the slitwidth at which it reaches a maximum. (It is a priori clear that there will be an optimum for some slitwidth since the intensity of the monochromatic beam clearly decreases for very narrow and very broad slits: in the first case the beam becomes black, in the second it becomes white.) The result depends on the spectrum of the entrance beam in so far that we have to designate it the achromatic beam. The solution was guessed intuitively by Ostwald [34,35] and later rigorously proved by Schrödinger [42]. It is very simple, the edges of the slit should be located at complementary wavelengths. (Notice that this result indeed depends on the achromatic beam, in this case the entrance beam.) If this is not possible (there exists no complementary wavelength for the left or right edge of the slit), then the slit should be opened such that only one edge is located in the visual region. In that case either a long wavelength or a short wavelength side of the spectrum is passed by the ‘slit’. That this is a reasonable result is perhaps illustrated by the following. Consider a colour at the optimum slitwidth. Let’s take a ‘green’ for instance, then we have the case of two
Figure 1.16 Complementary spectra: Newton’s spectrum and the inverted spectrum at some reasonable slitwidth. Subtractive combination yields black (top), additive combination white (or rather achromatic, bottom). (See colour Plate 7 in the centre of this book.)
complementary slit edges. Suppose we slightly open the slit, taking care to preserve the dominant wavelength of the exit beam. Then we effectively add two complementary beams that superimpose to an achromatic beam: we add no colour, we add white! Likewise, consider what happens when we decrease the slitwidth slightly, again preserving the dominant wavelength of the exit beam. In this case we subtract an achromatic beam, or, as Ostwald would have it, ‘we add black’. In either case, slightly increasing or decreasing the slitwidth doesn’t change the amount of colour, thus we must be at an optimum. These colours are known as ‘semichromes’ or ‘full colours’ (Ostwald called them Vollfarben). They depend on the spectrum, not just the chromaticity, of the entrance beam (this is the reason why we introduced achromatic beams instead of mere achromatic chromaticities). Thus the Newtonian spectrum is indeed most colourful when the slit is very wide, about half of the visual region. This indicates that the monochromatic beams are more like certain limiting cases than the basic building blocks that Newton took them to be. In the case that one slit edge lies outside the visual region (this happens for the reddish and bluish optimal colours), it is a matter of taste whether one asserts that part of the spectrum is admitted or that part of the spectrum is blocked. We may extrapolate this to the case of the slit proper and consider what happens when we use an obstructing bar instead of a slit (let’s call it a ‘complementary slit’). The first to try this was Goethe [44,45].16 He took a prism before the eye and looked at a white stripe on a black card (one sees the Newtonian spectrum), but also at a dark stripe on a white card. In the latter case one sees the ‘inverted spectrum’. In the inverted spectrum all colours are exactly complementary to the colours seen in the Newtonian spectrum (see Fig. 1.16). This is no great surprise since the slits themselves are complementary in the geometrical sense. From geometrical optics it is clear that the exit beams for the Newtonian and the inverted spectrum must always add to the entrance beam (achromatic); this is technically known as ‘Babinet’s principle’ [4]. Notice that the addition of Goethe’s white stripe and dark stripe is simply a uniformly white card. The inverted spectrum looks as colourful as the Newtonian spectrum (see Fig. 1.17). Even more interesting, all experiments that one can do with the Newtonian spectrum one can also do with the inverted spectrum, and with similar results. In particular, Newton’s experimentum crucis, believed to ‘prove’ that white light is ‘actually’ a confused mixture of
Figure 1.17 The inverted spectrum at various slit widths. (See colour Plate 8 in the centre of this book.)
16 We don’t want to join the discussion regarding Goethe’s and Newton’s relative achievements. When Goethe took his first look through a prism at a white wall and didn’t see the promised spectral colours he knew then and there that Newton was wrong. His acid polemics triggered much unfortunate debate.
Figure 1.18 How the closed sequence of full colours is generated from the spectrum (an open linear segment).
homogeneous lights (monochromatic beams), is easily repeated. The conclusion has to be that Newton’s experiments were OK, but his conclusions from the experiments overly hasty [18–20]. For colorimetry, the monochromatic beams are nothing special. If it is asserted that colours ‘really are’ superpositions of monochromatic beams, then—by the same logic—one should be ready to assert that they ‘are really’ superpositions of beams from the inverted spectrum. This evidently defeats Newton’s purpose.17 An interesting observation is that whereas the Newtonian spectrum lacks the purples (and thus cannot be closed to obtain the colour circle as Newton erroneously did, freely inventing novel colours for the express purpose), the inverted spectrum contains the purples but lacks the greens. This suggests that all colours are somehow represented by a combination of the Newtonian and the inverted spectrum (indeed, we will show later that exactly half of the colours belong to the Newtonian spectrum, whereas the other half stem from the inverted spectrum). The correct way to proceed is to consider the full set of Ostwald’s semichromes [34]. The simplest way to obtain an overview of this set is to draw the slit edge positions in a chromaticity diagram. One simply draws a line through the achromatic point and notices the intersections with the spectral locus (see Figs 1.18 and 1.19). There exists either a pair or only a single intersection. By rotating the line over all orientations one obtains a continuous periodic series of slits and thus a continuous periodic series of semichromes. This construction proves geometrically that the linear sequence of semichromes is closed, i.e. has the topology of a circle. Here we find the relation between the open-ended linear segment of wavelengths that is the Newtonian spectrum and the colour circle which has always been the intuitive representation of colours by visual artists. 
One doesn’t obtain the colour circle by simply tying the spectral limits together as Newton did; rather, Ostwald’s definition of the Vollfarben (semichromes) [34] provides a natural map of all colours on a closed manifold. Notice that this construction depends critically upon the introduction of an achromatic beam. It cannot be done in the austere colorimetry without such an arbitrary fiducial beam. We consider these insights to be a major contribution by Ostwald to colorimetry. When we circumnavigate the set of semichromes we encounter four different kinds of these, namely (moving in a steady progression in the direction from blue through green to red):
1. The ‘short wavelength boundary colours’. Here one edge of the slit is beyond the blue spectral limit, the other moves from the complementary of the red limit to the complementary of the blue limit. These patches tend to appear bluish.
2. The ‘band pass colours’. Here both edges of the slit are in the visual region. One edge of the slit moves from the blue end of the spectrum to the complementary of the red limit, the other edge moves from the complementary of the blue limit to the red limit. These patches tend to appear greenish.
3. The ‘long wavelength boundary colours’. Here one edge of the slit is beyond the red limit of the visual region. The other edge moves from the complementary of the red limit to the complementary of the blue limit. These patches tend to appear reddish.
4. The ‘band stop colours’. Here both edges of a complementary slit are in the visual region. One edge moves from the complementary of the blue limit to the red limit, whereas the other edge runs from the blue limit to the complementary of the red limit. Notice that here we meet the short wavelength boundary colours again: the set of optimal colours closes. These patches tend to appear purplish (or ‘magenta’).
Figure 1.19 Some representative full colours: spectra (left) and chips (squares on the far right). The panels show low pass optimal colours (485 nm, 560 nm), band pass optimal colours (450 nm to 564 nm, 485 nm to 611 nm), high pass optimal colours (490 nm, 560 nm) and band stop optimal colours (450 nm to 564 nm, 485 nm to 611 nm). (See colour Plate 9 in the centre of this book.)
17 This should not be read as ‘Newton bashing’! It would indeed be easy enough to add some negative remarks concerning Goethe’s contributions.
Notice that the band pass colours are part of the Newtonian spectrum whereas the band stop colours are part of the inverted spectrum. We have indeed combined these complementary entities! The ‘boundary colours’ (Kantenfarben, Fig. 1.20) were first studied by Goethe [44,45]. One sees them when looking at a light–dark boundary through a prism. By changing the orientation of the prism or the light–dark edge, one switches from the short wavelength boundary colours to the long wavelength boundary colours or vice versa. Only parts of the sets of boundary colours appear as optimal colours (as only parts of the Newtonian spectrum and the inverted spectrum appear as optimal colours). The full progression of short wavelength boundary colours runs from dark blue over cyan to bright white; that of the long wavelength boundary colours runs from dark red over yellow to bright white. As Goethe noticed, the boundary colours contain neither greens nor purples. If one is inclined to do so, the boundary colours can be mixed to obtain monochromatic beams.18 Likewise one can mix monochromatic beams to obtain boundary colours.19 This shows that it is really immaterial whether one bases spectral descriptions on Newton’s ‘homogeneous lights’ [31,32] or on Goethe’s Kantenfarben [44,45]. The boundary colours have some advantages, e.g. they can be produced easily and exactly, whereas monochromatic beams can only be produced problematically and approximately (the ideal ones—for zero slit width—obviously cannot be produced at all and in that sense cannot even be said to exist). The whole discussion on which is more fundamental is really of little or no fundamental importance. We can plot the loci of boundary colours and semichromes in the chromaticity plane or in colour space itself (Figs 1.20 and 1.21). The boundary colours describe spirals from the origin to the colour of the entrance beam [11,27,33]. The semichromes describe a closed, twisted space curve encircling the achromatic axis. 
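The two sequences of boundary colours can be illustrated numerically. The following is only a minimal sketch: it assumes a spectrum coarsely sampled in seven wavelength bins, a flat entrance beam, and a made-up table of colour-matching values (`TOY_CMF` is hypothetical, not CIE data); all function names are likewise introduced here for illustration. It cumulates the entrance beam from the short wavelength side and from the long wavelength side.

```python
# Hypothetical colour-matching table: one row per colour coordinate,
# one column per wavelength bin (short -> long). Made-up numbers, NOT CIE data.
TOY_CMF = [
    [0, 0, 1, 3, 5, 4, 1],
    [0, 1, 3, 5, 3, 1, 0],
    [3, 5, 2, 1, 0, 0, 0],
]
ENTRANCE_BEAM = [1, 1, 1, 1, 1, 1, 1]  # assumed radiant power per bin


def colour(beam):
    """Map a sampled beam to its three colour coordinates (a linear map)."""
    return tuple(sum(c * p for c, p in zip(row, beam)) for row in TOY_CMF)


def short_wavelength_boundary(edge):
    """Beam that passes only the bins below the edge position."""
    return [p if i < edge else 0 for i, p in enumerate(ENTRANCE_BEAM)]


def long_wavelength_boundary(edge):
    """Beam that passes only the bins at or above the edge position."""
    return [p if i >= edge else 0 for i, p in enumerate(ENTRANCE_BEAM)]


edges = range(len(ENTRANCE_BEAM) + 1)
short_seq = [colour(short_wavelength_boundary(e)) for e in edges]
long_seq = [colour(long_wavelength_boundary(e)) for e in edges]
white = colour(ENTRANCE_BEAM)

# Print the two sequences side by side, one edge position per line.
for s, l in zip(short_seq, long_seq):
    print(s, l)
```

In this toy setting the short wavelength sequence starts at the origin (black) and ends at the colour of the entrance beam, the long wavelength sequence runs the other way, and for every edge position the two colours add up to the colour of the entrance beam.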
We will discuss the fundamental relevance of these curves later. For the moment it is most useful to plot the curves in the chromaticity plane, thus abstracting from the brightness of the entrance beam. The resulting configuration depends on the spectral composition of the entrance beam, i.e. the achromatic beam. We obtain a configuration of three mutually concentric entities:
1. The achromatic point. This is the chromaticity of the spectral colours with wide open slit. They all look white.
2. The closed curve of semichromes. These are the chromaticities of the spectrum and inverted spectrum for optimum slitwidth, i.e. these are the most colourful points in the chromaticity plane.
3. The closed curve made up of the spectral locus and a segment of the line of purples. This is the boundary of the real (as opposed to virtual) chromaticities. These points represent the spectral colours at vanishing slitwidth, i.e. black or total darkness.
Figure 1.20 The boundary colours in the CIE chromaticity diagram and impressions of the sequences of hues of the short wavelength and long wavelength boundary colours. (See colour Plate 10 in the centre of this book.)
Figure 1.21 The full colour locus in C (left) and in the CIE chromaticity diagram. (Labels in the figure: full colour locus, white point, black point.)
18 One simply looks at a narrow white bar on black paper.
19 One simply combines parts of the spectrum with Maxwell’s [28,29] or Ostwald’s [34] apparatus.
Figure 1.22 Spectra and samples (chips) of a ‘yellow’ paint (see colour Plate 11 in the centre of this book). The panels show band pass optimal colours with pass bands 570 nm to 575 nm, 560 nm to 580 nm, 555 nm to 590 nm, 535 nm to 620 nm, 485 nm to 700 nm, 470 nm to 700 nm, 440 nm to 700 nm and 400 nm to 700 nm. The difference is the width of the spectrum remitted by the paints (they are all optimal colours). When this range is very narrow, the paint appears dark brown. When it is very large (the whole visual region), the paint looks white. The ‘best yellow paint’ remits all wavelengths above 490 nm. Notice that this paint is a long wavelength boundary colour.
Notice that this picture is quite different from the conventional one where the most colourful points are put at the boundary of the region of real chromaticities. In such a conventional representation it is silently assumed that when you close the slit you simultaneously increase the intensity of the entrance beam. The points at the boundary then represent exit beams at zero slitwidth and infinite radiant spectral power density. Such points are only approximately reached under special laboratory conditions. In real life, colourful colours are always nearer to the semichromes and the achromatic point, not to the boundary of the region of real colours. This fact was discovered empirically by Ostwald [35], who simply put highly coloured pigments before a spectroscope (Fig. 1.22). He noticed that the best pigments remit about a semichrome. For instance, the best yellow pigments remit all wavelengths over 490 nm and absorb those below it. Since yellow occurs in the spectrum at a wavelength of about 575 nm, this finding shocked many physicists [35] at the time, who believed that the strongest colours should necessarily be like Newton’s ‘homogeneous lights’; thus a yellow paint was expected to reflect only a narrow wavelength region at about
575 nm. Such a paint would be black though, because it would hardly remit any radiation from such a narrow wavelength region. These relations are illustrated in Fig. 1.22.
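A rough piece of arithmetic shows how little a narrow-band paint would remit. The sketch below rests on assumptions of our own choosing: a flat illuminant spectrum, a visual region taken as 400 nm to 700 nm (the limits used in the labels of Fig. 1.22), an ideal paint with remittance zero or one, and an illustrative 10 nm band around 575 nm. It compares radiant power only and says nothing about luminance or perceived brightness.

```python
VISUAL_REGION = (400.0, 700.0)  # nm; assumed limits of the visual region


def remitted_fraction(pass_lo, pass_hi, region=VISUAL_REGION):
    """Fraction of a flat illuminant's power remitted by an ideal paint
    that remits everything between pass_lo and pass_hi (nm) and nothing else."""
    lo = max(pass_lo, region[0])
    hi = min(pass_hi, region[1])
    return max(0.0, hi - lo) / (region[1] - region[0])


narrow_band = remitted_fraction(570, 580)  # hypothetical 'homogeneous light' paint
long_pass = remitted_fraction(490, 700)    # paint remitting all wavelengths above 490 nm

print(round(narrow_band, 3), round(long_pass, 3))  # 0.033 0.7
```

Under these assumptions the narrow-band paint remits about 3% of the incident power, whereas the paint that remits everything above 490 nm remits 70%.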
The space of object colours
In the case of object colours we meet a situation much like that considered in the previous section. In a rather simplified setting we have an illuminant and surfaces which remit part of the beam of the illuminant. We assume that only reflection and scattering occur (no fluorescence). Then the remitted beams are essentially equal to the beam of the illuminant with the radiant power at certain wavelength regions selectively attenuated (Goethe’s notion of colours as ‘shadow-like’ entities [44,45]). In order to make this more precise we will define a new entity that should figure in colorimetry next to the concepts of ‘beam’ and ‘patch’. We will call it a ‘chip’, after the chips to be found in conventional colour atlases. Thus chips, unlike beams (not to speak of patches!), are material things that you may indeed grasp with your hand. However, we are only concerned with some optical properties of these objects. First we define the concept of a ‘black chip’. This is very simple: a black chip is a surface that does not remit any of the impinging radiation (at least within the visual region). For if it were otherwise, we could certainly produce an even ‘deeper’ black. Notice that this is a physical definition that allows one to construct approximations to black chips (e.g. black satin velvet, soot, a hole in a box, etc.) and that the definition is not dependent on colorimetric notions, nor need one store a black chip as an international standard. Next we define the concept of a white chip. This is rather more intricate. First of all, a white chip should remit all impinging radiation. For otherwise I could certainly construct a chip looking brighter than the white chip under the same illuminant, which defeats the very notion of a white chip. But this is not sufficient. For instance, a perfect mirror also remits all impinging radiation, yet no one would be prepared to call a mirror ‘white’.
The reason is that a perfect mirror, illuminated with a collimated beam will actually look black from almost all vantage points, except from the specular direction from which it looks dazzlingly bright (e.g. much brighter than writing paper or white chalk, the whitest materials we can imagine). Some thought reveals that white objects should look the same (namely white) from all vantage points. This actually suffices to define the white chip: it must be a surface that remits all radiation (‘unit albedo’) and scatters the radiation equally in all directions (a ‘Lambertian surface’). Again, this definition does not draw on colorimetric notions, nor will it be necessary to store an international standard. The definition is a recipe to construct white chips. Good approximations are flour, white chalk, matte writing paper or the classical magnesium oxide smoked upon glass from the photometric laboratory. This definition of a white surface is essentially due to Ostwald [34,35] and, again, is to be considered a major contribution to colorimetry. Once one has defined the white chip, it is easy enough to define general (coloured) chips. These are exactly like the white chip except for a wavelength-dependent attenuation. Thus the chips can be characterized by a spectral remittance factor, which is simply the ratio of the spectral radiant power density remitted by the chip to that remitted by the white chip. The beams remitted to the observers can then be taken as the spectral remittance factor times the white beam, that is the beam remitted from a white chip. Thus given the illuminant
(i.e. the beam remitted from a white chip) we can uniquely specify the beams remitted by given chips. Then a chip plus an illuminant define a beam which causes a patch to be seen which can be ascribed a colour (through the colorimetric paradigm), in this case an ‘object colour’ (but see below). In this setting the choice of the achromatic beam is obvious: of course, we simply take the white beam. Thus we have completed the canonical setting for the colorimetry of object colours. Notice that whereas an aperture colour depends on the beam only (we take the eye as a constant factor here), the ‘surface colours’ depend on the illuminant and a chip. One way to handle this formally is to specify an ‘object colour’ as a pair of aperture colours, namely the pair made up of a colour due to the chip and one due to the white chip. Operationally, this would correspond to the chip being presented on a white background, supplying the (really minimal) ‘context’ necessary for a colour to be a ‘surface colour’ to start with.
Change of illuminant
When one changes the illuminant the remitted beams associated with a given chip change. The achromatic beam changes, too. As a result, the colours of the chips change. The transformation is a complicated one: it is not simply that we obtain a deformation (reshuffling or isomorphism) of colour space. Rather, it may well happen that two chips looking alike under one illuminant look unlike under another illuminant and vice versa. This clashes with the naive notion that chips possess a ‘real’ or ‘intrinsic’ colour. It is not quite clear what one could mean by the ‘real’ colour of a chip. Apparently it should be taken to mean something like the colour of the chip under a standard illuminant (which is a novel entity that enters the picture here). But if this is to be the definition, then the real colour of the chip will never be perceived, except in very special laboratory conditions! This can hardly be considered satisfactory.
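The remark that two chips may look alike under one illuminant and unlike under another can be made concrete with a small numerical sketch. Everything in it is assumed for illustration: four wavelength samples, a made-up colour-matching table (`TOY_CMF`, not CIE data), two hypothetical chips and two hypothetical illuminants. Exact fractions are used so that equality of colours can be tested without rounding issues.

```python
from fractions import Fraction as F

# Hypothetical colour-matching table for four wavelength samples.
# Made-up numbers chosen for easy arithmetic, NOT CIE data.
TOY_CMF = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]


def colour(remittance, illuminant):
    """Colour of a chip: the table acts on the remitted beam,
    i.e. on the spectral remittance factor times the illuminant spectrum."""
    beam = [r * e for r, e in zip(remittance, illuminant)]
    return tuple(sum(c * b for c, b in zip(row, beam)) for row in TOY_CMF)


# Two chips with different spectral remittance factors (hypothetical).
grey_chip = [F(1, 2), F(1, 2), F(1, 2), F(1, 2)]
comb_chip = [F(3, 5), F(2, 5), F(3, 5), F(2, 5)]

flat_illuminant = [F(1), F(1), F(1), F(1)]
other_illuminant = [F(1), F(1, 2), F(1), F(1)]

match_under_flat = colour(grey_chip, flat_illuminant) == colour(comb_chip, flat_illuminant)
match_under_other = colour(grey_chip, other_illuminant) == colour(comb_chip, other_illuminant)
print(match_under_flat, match_under_other)  # True False
```

The two chips differ by a remittance pattern that the toy table cannot register under the flat illuminant; reweighting the spectrum with the second illuminant makes the difference visible.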
As a physicist one is perhaps induced to take the spectral remittance factor as the ‘real’ colour of a chip. This definition has at least the advantage that it does not depend on any (arbitrary) ‘standard illuminant’. It is a property that is characteristic of the material of the chip, quite independent of its arbitrary irradiation in some photometric setting. A problem is, of course, that the spectral remittance factor as such is never seen. What one may perceive are the colours under various illuminants, like variations on a theme. The invariant theme, then, would be the spectral remittance factor or the ‘true’ colour. From the vantage point of the physicist this is not unreasonable, for it exactly describes the method of spectroscopy. In spectroscopy one irradiates the sample with monochromatic beams and measures the remittance: this is a direct determination of the spectral remittance factor. A problem is that this requires that one knows the illuminant. This can be solved by measuring the sample against a white background and indeed, this is common spectroscopic practice (it is the preferred method because random fluctuations of the source then tend to cancel out automatically). Our formal device of representing object colours as pairs of aperture colours (due to the chip and the white chip) reflects this. One way to describe this in colorimetric terms is to factor the multiplicative relation: colour = colour-matching functions ⊗ spectral remittance factor ⊗ illuminant spectrum
in a different way. One contracts the colour-matching functions with the illuminant spectrum, to obtain ‘the colour-matching functions for the given illuminant’ and lets them work on the spectral remittance factor. Then the illuminant ‘changes the eye’, whereas the chips are described in a way that is independent of the illuminant (by their ‘true colour’). Thus
colour = eye ⊗ spectral remittance factor,
where eye should be interpreted as
eye = colour-matching functions ⊗ illuminant spectrum.
This is often an advantageous way to frame the problem and perhaps most closely captures our intuitive notions. It is fully equivalent to the other formulation of the same problem: one contracts the spectral remittance factor with the illuminant spectrum (the ‘remitted beam’) and lets the regular colour-matching functions work on it. In this view one looks at the ‘apparent colour’ of the chip, which is the spectrum under the given illuminant, not a material property of the chip. The two views are, of course, formally equivalent.
The colour solid
In the history of colour science one encounters many different types of ‘colour solid’. One finds colour pyramids (Lambert [24]), spheres (Runge [38]), double cones (Ostwald [34]), shapeless colour trees (Munsell [10,30]), colour cubes (Hickethier [23]), etc. Perhaps the most immediately remarkable aspect is the fact that these colour solids are finite volumes (except perhaps for the Munsell tree, which is, at least theoretically, permitted to grow without bounds, although in actual fact we are only given a finite set of samples), whereas the colour cone is an infinite volume. The reason is that the colour solids put a simultaneous order on surface colours (chips), whereas the colour cone puts an order on beams. Now the colour of beams may vary from darkness to dazzlingly bright (infinite distance from the origin), whereas surface colours may only vary between black and white.
All chips remit less than the white chip, thus they should map on a finite region of C. Let us consider the shape of this region (might it be a tree, pyramid, sphere, etc.?). First, consider the representation of the beams remitted by the chips in S. In order to simplify the discussion we will start with an illuminant that has a flat spectrum (same spectral radiant power density at all wavelengths). Then the beams remitted by the chips have spectral radiant power densities that vary between zero and the density remitted by the white chip (which doesn’t depend on wavelength). Geometrically this means that the region filled by the beams remitted from all conceivable chips is a hypercube (same edge length in every dimension). The colour solid then is simply the projection of this hypercube in three-dimensional colour space. One can show that such a projection appears as a fusiform (Zeppelin-shaped) body in three-space (Fig. 1.23). When the illuminant doesn’t have a flat spectrum, little changes. Instead of a hypercube we get a hyperbox and the fusiform body deforms a bit but doesn’t change qualitatively. Another way to find the colour solid in C was developed by Schrödinger [42]. He first solves the problem of ‘which paints are the brightest’ for a given illuminant and a given
Figure 1.23 Some projections of the seven-dimensional hypercube. In the middle view the essential fusiform shape of the colour solid is easily visible.
chromaticity. He proves in a very general way that such paints have an ‘ideal’ spectral remittance factor, namely, their spectral remittance factor is either zero or unity.20 Moreover there are no more than two transitions in the visual region:21 such chips are called ‘optimal’. (Clearly, the boundary colours and full colours considered earlier are examples of such optimal colours.) Then it is clear that the optimal colours must lie on the boundary of the colour solid, for otherwise one could find a still brighter paint at a given chromaticity. Thus the optimal colours form an explicit parameterization of the boundary of the colour solid. As parameters one may use the transition wavelengths, thus the optimal colours form a two-parameter family, i.e. generically a surface. When we plot such a surface we obtain a fusiform body, indeed the projection of the hyperbox defined by the illuminant in S (Figs. 1.24–1.27). Notice that the boundary colours are only one-parameter families, thus generically curves. Indeed, the boundary colour loci are spirals in C that lie on the surface of the colour solid [11,27,33]. They divide the boundary of the colour solid into two (congruent) parts, one part containing the band pass optimal colours (Newtonian ‘impure’ spectral colours), the other part the colours of the impure inverted spectrum. Thus exactly half of the optimal colours are Newtonian spectral colours, the other half are inverted spectral colours, whereas the boundary colours form a set of vanishing measure. Again, this is a striking demonstration that the Newtonian ‘homogeneous lights’ are nothing special. The colour solid inherits central symmetry from the hyperbox of which it is the projection. The symmetry centre is the median grey (remittance factor 50% for all wavelengths) chip. We define the ‘grey axis’ as the line connecting the white and black points. The grey axis is obviously a segment of the achromatic axis. 
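The relation between the hyperbox in S, the optimal colours and the central symmetry can be sketched in a coarse discretization. The sketch assumes seven wavelength bins (echoing the seven-dimensional hypercube of Fig. 1.23), a flat illuminant, and a made-up colour-matching table (`TOY_CMF`, not CIE data); all names are hypothetical. It enumerates the vertices of the hypercube (remittance zero or one in each bin), keeps those with no more than two transitions, and checks that each of these and its complement add up to white. It does not attempt to verify that these colours lie on the boundary of the projected body.

```python
from itertools import product

# Hypothetical colour-matching table: one row per colour coordinate,
# one column per wavelength bin (short -> long). Made-up numbers, NOT CIE data.
TOY_CMF = [
    [0, 0, 1, 3, 5, 4, 1],
    [0, 1, 3, 5, 3, 1, 0],
    [3, 5, 2, 1, 0, 0, 0],
]
ILLUMINANT = [1, 1, 1, 1, 1, 1, 1]  # assumed flat spectrum
N_BINS = len(ILLUMINANT)


def colour(remittance):
    """Project a spectral remittance factor (one value per bin) into colour space."""
    return tuple(
        sum(c * e * r for c, e, r in zip(row, ILLUMINANT, remittance))
        for row in TOY_CMF
    )


def transitions(remittance):
    """Number of jumps between zero and unity along the spectrum."""
    return sum(1 for a, b in zip(remittance, remittance[1:]) if a != b)


def complement(remittance):
    """Remittance of the complementary chip (unity minus the remittance)."""
    return tuple(1 - r for r in remittance)


vertices = list(product((0, 1), repeat=N_BINS))          # corners of the hypercube
optimal = [v for v in vertices if transitions(v) <= 2]   # at most two transitions
white = colour((1,) * N_BINS)

symmetric = all(
    tuple(a + b for a, b in zip(colour(v), colour(complement(v)))) == white
    for v in optimal
)
print(len(vertices), len(optimal), symmetric)  # 128 44 True
```

Since every such pair adds up to white, the two colours lie symmetrically about half the white colour, which in this toy setting plays the role of the median grey chip.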
It is a natural question to ask for those optimal colours that are as far removed from the grey axis as possible, for these will be the most ‘colourful’ optimal colours. These colours turn out to be exactly Ostwald’s semichromes, or full colours (Vollfarben) [34], i.e. optimal colours with complementary transition wavelengths. The locus of semichromes runs as an ‘equator’ (except that it is not a planar curve) over the boundary of the colour solid, encircling the grey axis. Like the colour solid itself, it has central symmetry about the median grey chip.
Figure 1.24 The colour solid in the canonical (SVD) basis. Here is a view of the white pole (see colour Plate 12 in the centre of this book).
Figure 1.25 The colour solid in the canonical (SVD) basis. Here is a view in the direction of the third dimension (panel title: Projection on U–V plane; see colour Plate 13 in the centre of this book).
20 The argument is simple and general. Suppose there existed a region of intermediary reflectance. Then one might perturb the reflectance at three places and thus obtain a colour change, e.g. one might perturb such as to add more of the fiducial colour! This clearly should not work, thus the reflectance should be either zero or one.
21 The argument is again simple and general. Suppose there existed three or more transitions. Then one could slightly perturb the locations of three of them and thus mix yet more of the fiducial colour! This should be impossible, thus there cannot be more than two transitions.
Figure 1.26 The colour solid in the canonical (SVD) basis. Here is a view in the direction of the second dimension (panel title: Projection on U–W plane; see colour Plate 14 in the centre of this book).
Figure 1.27 The colour solid in the canonical (SVD) basis. Here is a view in the direction of the first dimension (panel title: Projection on V–W plane; see colour Plate 15 in the centre of this book).
For very dark optimal colours, the optimal colours approach the monochromatic beams (very ‘narrow slits’, thus hardly any radiation remitted). This means that the shape of the fusiform body near the black point is approximately that of the spectral cone boundary. In the limit, the spectral cone boundary is tangent to the conical singularity of the fusiform body at the black point. If we take a myopic view and see only the neighbourhood of the black point, then the colour solid looks in no way different from the space of aperture colours! By central symmetry we see that at the white point the tangent cone is the inverted spectral
cone. This can be seen as the spectral cone reflected on the median grey, but, equivalently, also as the cone due to the inverted spectrum. Indeed, near the white point the optimal colours are like colours from the inverted spectrum with very narrow (complementary) slitwidth. (The myopic observer at the white point sees an ‘upside down’ and complementary version of the space of aperture colours.) Thus we see that the colour solid is a centrally symmetric body (like the cube, the sphere, or the double cone) and that it is everywhere smooth, except for conical singularities at the white and black points (unlike the cube which has edges, the sphere which has no conical singularities, or the double cone which has a crease at the equator). It is totally unlike the Munsell tree, which is a (theoretically) unbounded structure. The idea of the Munsell tree is that you can extend it ad libitum when novel (ever more colourful) pigments become available. The point (lost to Munsell) is that this will definitely never happen: the colour solid is bounded because the optimal colours are the brightest possible and, indeed, lie on the natural boundary. This answers some of our initial questions. Although some of the colour solids suggested in the literature have properties not unlike the true colour solid, all deviate in arbitrary ways, and some configurations are clearly very unfortunate or even mistaken. We should regard all properties not present in the true colour solid as irrelevant fancies and distortions of reality.
Spectral dominance
The surface colours are due to spectrally selective attenuation of the white beam. This suggests that it may be of interest to put a partial order on the space of all beams (S) in the following way. Beam A will be said to ‘spectrally dominate’ beam B when it is the case that, for all wavelengths throughout the visual region, the spectral radiant power of beam A is not less than the spectral radiant power of beam B.
This makes the set of beams a poset (partially ordered set). In the case of the surface colours the white beam spectrally dominates all remitted beams. Optimal colour A spectrally dominates optimal colour B if the pass band of B is fully contained within the pass band of A. Thus the subset of optimal colours forms a poset under inclusion of pass bands. The white colour is the unique LUB (least upper bound) and the black colour the unique GLB (greatest lower bound) of this subset. Thus the optimal colours satisfy a lattice order, or a true hierarchy. We may continue in this spirit and define several additional useful relations. Two optimal colours will be called ‘categorically different’ when their pass bands are disjoint. When two optimal colours are not categorically different we can always find illuminants under which they appear equal (though not black). When two optimal colours are categorically different one can find no such illuminant: the colours will either look different, or both will appear black. For any two beams we can define a novel and very strong notion of complementarity. We call two beams ‘complementary’ if their superposition equals the white beam. Here ‘equality’ is not colorimetric equality, but actual equality of the spectra. It is evident that such complementary beams will also have complementary dominant wavelengths. However, beams with complementary dominant wavelengths are unlikely to be complementary in this
novel, strong sense. The notion of complementarity is important because it is immediately related to the central symmetry of the colour solid. For any beam I can construct the complementary beam, which is again a possible surface colour, and which has a colour that is symmetrically located with respect to the median grey point for the simple reason that the colours should always add to the white point. Notice that when A spectrally dominates B, then the complementary beam Ā of A will be spectrally dominated by the complementary beam B̄ of B. A better term would perhaps be ‘supplementary colours’ (German: Ergänzungsfarben), but ‘complementary’ is regrettably the common term.
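The relations defined above can be illustrated with a minimal sketch. It assumes that beams are represented as spectra sampled at a handful of wavelengths (eight samples here, an arbitrary choice), that the white beam has unit power at every sample, and that an optimal colour is a 0/1 pass band. The function names are illustrative only.

```python
# Sketch: spectral dominance, strong complementarity and categorical difference
# for sampled spectra. The sampling and all names are assumptions of this example.
N_SAMPLES = 8
WHITE = [1.0] * N_SAMPLES  # white beam: unit power at every sample


def band_pass(start, stop, n=N_SAMPLES):
    """Optimal colour that passes samples start..stop-1 and blocks the rest."""
    return [1.0 if start <= i < stop else 0.0 for i in range(n)]


def dominates(a, b):
    """True if beam a is not less than beam b at every sampled wavelength."""
    return all(x >= y for x, y in zip(a, b))


def complement(a, white=WHITE):
    """Beam whose superposition with a equals the white beam, sample by sample."""
    return [w - x for w, x in zip(white, a)]


def categorically_different(a, b):
    """True if the pass bands of a and b do not overlap."""
    return all(x == 0.0 or y == 0.0 for x, y in zip(a, b))


A = band_pass(1, 6)  # wide pass band
B = band_pass(2, 4)  # narrower pass band contained in that of A

print(dominates(A, B))                          # A dominates B
print(dominates(complement(B), complement(A)))  # the order reverses for complements
```

Because the pass band of B lies inside that of A, A dominates B, and taking complements reverses the order, as stated in the text.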
Colour atlases
The idea of a colour atlas is both simple and attractive [1]. One produces fiducial chips and ‘measures’ a given sample by comparison with the fiducial chips under some standard illuminant. Of course there are numerous problems with this concept, both of a practical and of a theoretical nature [22]. Here we only consider some of the conceptual issues. We start by noticing that there are two quite incompatible approaches to be found in the literature. In one approach one simply forgets about colorimetry and attempts to arrive at a perceptually ‘evenly spaced’ and ‘naturally ordered’ system by purely psychological means. (Of course, one may measure the result colorimetrically and attempt to build a bridge to colorimetry in retrospect, but that doesn’t interest us at this point.) The Munsell system [10,30] is perhaps the best known example. This is a valid approach in itself though it has nothing to do with colorimetry. Consequently it doesn’t concern us here. Another approach is to attempt to construct a colour order system on colorimetric principles. (Of course, one may perform psychological measurements and attempt to build a bridge to perceptually even spacings in retrospect, but that doesn’t interest us at this point.) The best example is perhaps Ostwald’s atlas [34]. There exist many attempts of a mixed nature (e.g. the CIE’s Lab-system, the DIN system) but such mongrel attempts are not very relevant from a principled point of view. Since the Ostwald system is by far the most rational attempt, we consider it here in some detail. Historically it is a unique attempt at a rational systematization of object colours. Unfortunately Ostwald committed some (relatively minor) errors in the process. We don’t consider that these glitches detract from the fundamental importance of the attempt though. In any case, no one has done better since. The fundamental advances proposed by Ostwald have largely been lost on the colorimetric community though.
Although one finds it mentioned (a rare enough occasion!) we know of no author past the 1950s (say) who recognizes the essential conceptual differences with, for example, the Munsell system. Ostwald bases his atlas on the semichromes. He considers any object colour as made up of fractions of optimal colour, white and black, the fractions (colour content, white content and black content) adding up to 100% (Figs 1.28 and 1.29). This immediately leads to his double cone representation. The oversight is that there exist colours (for instance, the optimal colours other than white, black or the full colours) that cannot be incorporated in this scheme and will fall outside the double cone. The double cone only represents part of the colour solid, namely the convex hull of the full colours, the white and black point. However, this oversight is relatively minor and the problem can easily be mended in Ostwald’s own spirit (see below).
Figure 1.28 (a) Principle of an Ostwald page. The page consists of partitive ternary mixtures of white, black and an optimal colour. (b) The same Ostwald page as in (a), but with the white, black and colour sectors mixed (one may think of a set of Maxwell tops being spun). (See colour Plate 16 in the centre of this book.)
[Figure 1.29 labels: 24-hue circle with the Ostwald hue names Gelb, Kress, Rot, Veil, Ublau, Eisblau, Seegrün and Laubgrün plus their intermediates; Ostwald page #12; colour chip 12 {50, 25, 25}, parameterized by hue, full colour, white and black content.]
Figure 1.29 The basic structure of the Ostwald atlas: colour circle (mensurated set of Vollfarben), single page, single chip. (See colour Plate 17 in the centre of this book.)
The double cone is a rather arbitrary topological deformation of the colour solid as we have discussed it. Although complementarity and linear structure in planar sections through the grey axis are conserved, the general linear structure of C is destroyed. This is a severe (and unnecessary) disadvantage. Notice that with the introduction of the colour, white and black content, Ostwald has solved the simultaneous order in the planes of constant dominant wavelength. In order to complete the ordering, he has to put a rational order on the semichromes themselves. As Ostwald remarks, the semichromes are like ‘beads on a string’, in the sense that one
Figure 1.30 Ostwald’s principle of internal symmetry in action. Here we have mensurated a circle and deformed it into an ellipse. The ellipse is automatically mensurated because Ostwald’s principle is affinely invariant. (See colour Plate 18 in the centre of this book.)
may push them about as one pleases. Some principled method has to be found to fix their positions. The solution offered by Ostwald is the principle of internal symmetry (Fig. 1.30). What he means is the following. Suppose we want to subdivide the locus of optimal colours in M ‘cardinal colours’ (our term). Let’s take M = 24 for concreteness, as Ostwald did in his second attempt (the exact number is not important). Then the central symmetry means that cardinal colours i and i +12 must be complementary. Thus one need only order the first 12 cardinal colours. The principle of internal symmetry states that cardinal colour i should have the same dominant wavelength as the equal mixture of cardinal colours i − 1 and i + 1. It is not clear how Ostwald conceived of the idea, but it is certainly an algorithm that leads to a unique mensuration of the semichromes through purely colorimetric calculations. The idea is contained in a letter by Graßmann [16] and is used, as a matter of fact, by some painters [37]. It must be said that Ostwald was somewhat confused on the issue and added irrelevant and even inconsistent axioms to the principle of inner symmetry. This has been cleared up admirably by Bouma [5]. In retrospect, the principle of inner symmetry is quite sophisticated. In the limit for large M it defines a kind of parameterization of the full colour locus by arc length. Were the semichrome locus a flat curve, then it would indeed be a parameterization by affine arc length and would lead to an affinely invariant mensuration of the semichromes, quite independent of the choice of primaries. As the case is, the semichrome locus is a general (twisted) space curve and the situation is much more complicated. Again, though defective in practice, this idea of Ostwald’s contains the nucleus of a great idea, namely the invariant mensuration of the semichromes by purely colorimetric means. 
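The principle of internal symmetry, and its affine invariance for a flat locus as in Fig. 1.30, can be checked numerically. The following is a minimal sketch under stated assumptions: the locus is taken to be planar, hues are given as 2D coordinates relative to the achromatic point, and ‘same dominant wavelength’ is read as ‘same direction from the achromatic point’. The function names and the particular linear map are hypothetical.

```python
import math

# Sketch: test whether every cardinal colour has the same hue direction as the
# equal mixture of its two neighbours. Planar locus and names are assumptions.


def circle_hues(m=24):
    """m equally spaced hue points on a unit circle around the achromatic point."""
    return [(math.cos(2 * math.pi * i / m), math.sin(2 * math.pi * i / m))
            for i in range(m)]


def linear_map(p, a=2.0, b=0.7, c=0.0, d=0.5):
    """An arbitrary invertible linear deformation (circle to ellipse)."""
    x, y = p
    return (a * x + b * y, c * x + d * y)


def is_internally_symmetric(points, tol=1e-12):
    """True if each point is parallel to the midpoint of its two neighbours."""
    n = len(points)
    for i in range(n):
        px, py = points[i]
        ax, ay = points[i - 1]
        bx, by = points[(i + 1) % n]
        mx, my = (ax + bx) / 2.0, (ay + by) / 2.0  # equal mixture of neighbours
        cross = px * my - py * mx                  # zero when directions agree
        dot = px * mx + py * my                    # positive when on the same side
        if abs(cross) > tol or dot <= 0.0:
            return False
    return True


circle = circle_hues()
ellipse = [linear_map(p) for p in circle]
print(is_internally_symmetric(circle), is_internally_symmetric(ellipse))
```

The equally spaced circle satisfies the condition, and so does its linearly deformed image, which is the sense in which the mensuration carries over to the ellipse.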
Ostwald has been severely criticized for his errors, but the fact is that all later attempts are feeble or arbitrary by comparison. Instead of critique, it might be more constructive to attempt a rationalization
Figure 1.31 How the spectral cone and its complementary image (inverted spectral cone at the white point) define a double conical volume in C. (See colour Plate 19 in the centre of this book.)
of the simultaneous order of object colours by colorimetric means in the spirit of Ostwald, but avoiding his less fortunate decisions. We offer exactly such an attempt here. Perhaps the first unfortunate decision of Ostwald was to base his atlas upon the semichromes. For one reason, the colours of the semichromes depend on the spectrum, not just the chromaticity of the illuminant. Moreover, many colours (i.e. the optimal colours) appear as ‘supersaturated’. It appears to be more natural to let the semichromes be less than maximally saturated since one may certainly spot some whiteness in them. For instance, especially in the red, people have often remarked that some of the darker chips of Ostwald’s atlas look more strongly coloured than the Vollfarbe. (Bouma [6] has a good discussion.) There exists a simple way to meet these problems (Figs 1.31 and 1.32). Instead of basing the atlas on the semichromes, we base it on a certain family of virtual colours that we will designate ‘characteristic colours’. We define them in the following way. The colour solid is a smooth body except for the conical points at the white and black points. If we construct these tangent cones (spectral cone and inverted spectral cone), we find that they intersect in a curve that, like the semichrome locus, encircles the grey axis. We define this curve as the locus of the ‘characteristic colours’. The characteristic colours depend on the chromaticity of the illuminating beam (the position of the white point in C), but not on its spectrum (thus unlike the semichromes). If one varies the spectrum of the illuminating beam, leaving the white point invariant, the colour solid (and thus the semichromes) changes, but the double cone (not to be confused with Ostwald’s ‘double cone’!) with the characteristic colours remains invariant. Moreover, the double cone is the envelope of all colour solids obtained in this way. 
Thus we may unambiguously describe any colour in Ostwald’s tradition as characteristic colour content plus white content plus black content. In this scheme the semichromes have less than 100% colour content and finite white and black content. No supersaturated colours occur. This essentially solves the major problem found in Ostwald’s scheme. The other problem involves the mensuration of the characteristic colour locus. We cannot simply use Ostwald’s principle of inner symmetry, since the locus is not a planar curve. Instead, we use a similar principle that is manifestly affinely invariant. We use the fact that
Figure 1.32 Various views of the intersection of the spectral cone at the black point and the inverted spectral cone at the white point. The sharp edge (equator) is the locus of characteristic colours. Notice that the overall shape is strongly determined by the (flat) purple sector and its inverted copy. (See colour Plate 20 in the centre of this book.)
$[K, W, p_{i-1}, p_i] = [K, W, p_i, p_{i+1}]$
Figure 1.33 The geometry of the mensuration by volume ratios.
we have a special segment on the origin, namely the grey axis. If we consider three nearby characteristic colours in sequence, $p_1$, $p_2$ and $p_3$, say, we may find the volumes (Fig. 1.33) of the tetrahedra $k p_1 p_2 w$ and $k p_2 p_3 w$ (here k denotes the black and w the white point). We notice that the ratio of these volumes is an affine invariant, in particular it will be invariant
against changes of the primaries. This ratio can serve to define an affine invariant arc length of the locus of characteristic colours. When we simply cumulate incremental volumes and divide by the total volume we have constructed an affinely invariant parameter that increases by unity when one circumnavigates the locus of characteristic colours once. This completes the colorimetric mensuration of the colour solid in the Ostwaldian spirit. It yields a fully rational simultaneous order of object colours. When we analyze the volumetric mensuration analytically we find that, perhaps surprisingly, for small increments (formally ‘infinitesimal’, in practice a 24-hue division will do) the equal mixture of hues $p_i$ and $p_{i+2}$ lies in the plane $k w p_{i+1}$. Thus $p_{i+1}$ has the same dominant wavelength as the equal mixture of $p_i$ and $p_{i+2}$, which, again, is exactly a reformulation of Ostwald’s principle of internal symmetry: we have found the correct generalization of Ostwald’s principle. When formulated in this way, the principle is a flawless affine invariant. Such a principled scheme is interesting because it yields a global order on the surface colours. A major problem with the psychological schemes based on colour difference judgments is that their order is only locally well defined, but globally not well determined. This is the case because it is very hard to judge equal differences between very different chips. Thus such systems are locally even but globally uneven. Of course, experiment has to decide whether the principled order is anything close to a perceptually even one, for colorimetry proper has nothing to say on this issue. In Fig. 1.34 we present a comparison of the results of a mensuration of the locus of characteristic colours via the (new) principle of inner symmetry with the (perceptually uniform) Munsell scale. Deviations are of the same order as deviations between the various perceptually uniform scales (one to several steps on a 24-hue scale).
We conclude that ‘perceptually uniform’ essentially coincides with metrical uniformity. From a pragmatic point of view, colour atlases are probably best constructed from first principles, rather than via laborious and noisy perceptual judgements.
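The mensuration by cumulated tetrahedron volumes can be sketched in a few lines. The example below assumes that the locus of characteristic colours is given as a closed polygon of 3D points, and it uses as test data the six vertices R, Y, G, C, B, M of the unit RGB-cube treated in the next section. The helper names are illustrative.

```python
# Sketch: affinely invariant hue parameter from cumulated tetrahedron volumes.
# A polygonal locus and all names below are assumptions of this example.


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def sub(p, q):
    """Componentwise difference p - q."""
    return tuple(x - y for x, y in zip(p, q))


def tetra_volume(k, p, q, w):
    """Volume of the tetrahedron with vertices k (black), p, q and w (white)."""
    return abs(det3(sub(p, k), sub(q, k), sub(w, k))) / 6.0


def volume_parameter(locus, k, w):
    """Cumulated volume fractions along the closed locus, running from 0 to 1."""
    n = len(locus)
    vols = [tetra_volume(k, locus[i], locus[(i + 1) % n], w) for i in range(n)]
    total = sum(vols)
    params, running = [0.0], 0.0
    for v in vols:
        running += v
        params.append(running / total)
    return params


K, W = (0, 0, 0), (1, 1, 1)
HEXAGON = [(1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 1, 1), (0, 0, 1), (1, 0, 1)]
HUE_PARAMS = volume_parameter(HEXAGON, K, W)
print(HUE_PARAMS)  # equal steps of one sixth for the RGB hexagon
```

Since an invertible linear change of primaries multiplies every tetrahedron volume by the same factor, the normalized parameter is unchanged by it.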
The colour cube: RGB-display colour space
The colours on CRT screens are produced by way of additive mixture of three beams that can be attenuated in programmable proportions. A typical program statement controlling the colour of a ‘pixel’ is ‘RGBColor[r, g, b]’. Here numbers should be substituted for the actual parameters (r, g, b). Typically the beams are off (black pixel) for (0, 0, 0), and white (also the brightest colour) is produced for (1, 1, 1). The phosphors are such that RGBColor[1,0,0] looks red, RGBColor[0,1,0] green and RGBColor[0,0,1] blue. Then RGBColor[1,1,0] will produce yellow, RGBColor[0,1,1] cyan (blue–green) and RGBColor[1,0,1] magenta (purple). Notice that in the case of these CRT screen colours the space of beams, S, is very limited: it is a mere three-dimensional space. This means that there exists a 1–1 map of S into C, although (depending on the phosphors used) only part of the colour cone can be reached. This is an instructive example because of its low dimensionality. We will use it to illustrate several of the configurations of C in an intuitively very obvious way.
[Figure 1.34 panels: Munsell hue circle (left; wavelength labels from 450 to 650); Munsell–internal symmetry comparison (right; axis ticks 0, 6, 12, 18 on a 24-hue scale).]
Figure 1.34 A comparison of the mensuration of the characteristic colours with the Munsell scale (right). Both scales have been resampled to 24-hue scales. Maximum deviations are about one and a half unit on this scale, the root mean square deviation is about half a unit. Such deviations are of the same order as the deviations between perceptually uniform scales among each other, e.g. the Munsell scale and the DIN scale. Notice that the scales themselves are very non-linear, as is illustrated with the Munsell color circle (left).
First notice that all beams can be understood as selective attenuations of the white beam (1, 1, 1). The region of S filled with real beams is easy to construct: it is simply the unit cube 0 ≤ r ≤ 1, 0 ≤ g ≤ 1, 0 ≤ b ≤ 1 in (rgb)-space. The projection in C will be a parallelepiped, but when we pick a nice basis for C we may actually use the cube in (r, g, b)-space as a (in this case fully congruent) model of the colour solid. We will refer to this as the RGB-cube. We can think of it as residing in S or in C; in this case it makes no difference (Figs 1.35 and 1.36). We can use this RGB-cube to illustrate many of the geometrical properties of the colour solid in a very simple way. First notice that spectral dominance is simply defined as A dominates B if $r_A \ge r_B \wedge g_A \ge g_B \wedge b_A \ge b_B$. Thus white (W) dominates all colours, in particular the binary colours cyan (C), magenta (M) and yellow (Y). Cyan dominates green (G) and blue (B), magenta dominates red (R) and blue (B), and yellow dominates red (R) and green (G). All colours dominate black (K), in particular black is dominated by the unary colours red, green and blue. When we plot the lattice structure as a Hasse diagram, we notice that this has the structure of the projection of a cube; in a way, the RGB-cube can double as the Hasse diagram of its dominance hierarchy. The topological structure of the RGB-cube is also apparent in the familiar tricolor diagram of additive ternary mixing of beams (it has the topological structure of the ‘Schlegel diagram’ of the cube, Fig. 1.37). It is easy enough to determine the complementary beams of the ternary, binary and unary mixtures: the complementary pairs are W–K, C–R, M–G and Y–B. Thus we can
Figure 1.35 A generic view of the RGB-cube. (See colour Plate 21 in the centre of this book.)
Figure 1.36 A view at the white pole of the RGB-cube. (See colour Plate 22 in the centre of this book.)
immediately construct the inverse dominance hierarchy. In the RGB-cube the grey axis is the body diagonal W–K. Its midpoint is the median grey and it is indeed a symmetry centre of the cube. By inversion in this centre the vertices (K, R, G, B, C, M, Y, W) go over into (W, C, M, Y, R, G, B, K) that are the complementaries.
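The dominance hierarchy and the complementary pairs of the RGB-cube are small enough to enumerate directly. The following minimal sketch represents the eight vertices as 0/1 triples; the dictionary and function names are illustrative, not part of any graphics API.

```python
from itertools import product

# Sketch: spectral dominance, complements and Hasse diagram of the RGB-cube.
# Vertex naming follows the text; the helper names are assumptions.
NAMES = {
    (0, 0, 0): "K", (1, 0, 0): "R", (0, 1, 0): "G", (0, 0, 1): "B",
    (0, 1, 1): "C", (1, 0, 1): "M", (1, 1, 0): "Y", (1, 1, 1): "W",
}


def dominates(a, b):
    """A dominates B if no component of A is smaller than that of B."""
    return all(x >= y for x, y in zip(a, b))


def complement(a):
    """Inversion in the median grey: the beam that adds to white (1, 1, 1)."""
    return tuple(1 - x for x in a)


def hasse_edges():
    """Covering pairs: a dominates b and they differ in exactly one component."""
    edges = []
    for a, b in product(NAMES, repeat=2):
        if dominates(a, b) and sum(a) - sum(b) == 1:
            edges.append((NAMES[a], NAMES[b]))
    return edges


COMPLEMENTS = {NAMES[v]: NAMES[complement(v)] for v in NAMES}
EDGES = hasse_edges()
print(COMPLEMENTS)
print(len(EDGES))  # one Hasse edge per edge of the cube
```

The covering relations come out as the twelve edges of the cube, which is why the cube can double as its own Hasse diagram.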
Figure 1.37 Tricolour spot diagrams for subtractive (top left) and additive (top right) colour mixture. At the bottom, the Hasse diagram of spectral dominance, on the left with the RGB contributions explicitly drawn, at the right with the hues indicated. (See colour Plate 23 in the centre of this book.)
Notice that the ‘spectral cone’ is made up of the three faces of the cube that meet at the black point, whereas the inverted spectral cone is made up of the three faces that meet at the white point (they go over into each other through the central symmetry). The spectral cone and the inverted spectral cone together exhaust the surface of the RGB-cube. They intersect in the closed polygonal arc RYGCBM. This is the locus of the characteristic colours. In this case the colour solid coincides with the envelope (the intersection of the spectral cone and the inverted spectral cone), thus the closed polygonal arc RYGCBM is also the locus of the full colours or semichromes (Fig. 1.38). The boundary colours (Fig. 1.39) are spirals between the white and black point. The long wavelength series of boundary colours lie on the polygonal arc KRYW, whereas the short
Figure 1.38 Two views of the locus of full colours on the RGB-cube. (See colour Plate 24 in the centre of this book.)
Figure 1.39 The loci of boundary colours on the RGB-cube with sequences of boundary colour hues. (See colour Plate 25 in the centre of this book.)
wavelength series of boundary colours lie on the polygonal arc KBCW. When you trace these arcs on the RGB-cube you see that they are congruent twisted space curves (spirals) of opposite chirality. They divide the surface of the RGB-cube (the optimal colours) into two equal areas, the band pass optimal colours (colours from the—impure—Newtonian spectrum) and the band stop optimal colours (colours from the—impure—inverted spectrum).
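These statements about the RGB-cube can be verified by enumeration. In the minimal sketch below, a point of the cube lies on the spectral cone when at least one component is 0 (a face through black) and on the inverted spectral cone when at least one component is 1 (a face through white). The names are illustrative, and the mirror used to compare the two boundary arcs (exchange of the r and b components) is one choice made for this example.

```python
from itertools import product

# Sketch: cones, characteristic locus and boundary arcs on the RGB-cube.
# All names are assumptions of this example.
VERTICES = list(product((0, 1), repeat=3))


def on_spectral_cone(p):
    """Point lies on one of the three faces meeting at black."""
    return min(p) == 0


def on_inverted_cone(p):
    """Point lies on one of the three faces meeting at white."""
    return max(p) == 1


def mirror_rb(p):
    """Reflection that exchanges the red and blue components."""
    r, g, b = p
    return (b, g, r)


# Vertices shared by both cones: the corners of the closed arc RYGCBM.
LOCUS_VERTICES = [v for v in VERTICES
                  if on_spectral_cone(v) and on_inverted_cone(v)]

ARC_KRYW = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]  # long-wavelength series
ARC_KBCW = [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]  # short-wavelength series

print(len(LOCUS_VERTICES))
print([mirror_rb(p) for p in ARC_KRYW] == ARC_KBCW)
```

Exactly six vertices lie on both cones, and the two boundary arcs are mirror images of one another, which is consistent with congruent curves of opposite chirality.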
Figure 1.40 The RGB chromaticity diagram. The boundary colour loci are plotted in the RGB colour triangle and in the complementary (inverted) triangle. (See colour Plate 26 in the centre of this book.)
We can also construct the representation of the various geometrical loci in the chromaticity diagram. One obtains a neat, symmetrical representation when one selects a chromaticity plane that is orthogonal to the grey axis. We may attach it to the white point, for instance, and project from the black point. In this case the spectral cone at the black point projects into an equilateral triangle (it degenerates into a curve). Two sides of this triangle are the ‘spectral locus’ (namely RGB), the remaining side (BR) is the line of purples (Figs 1.40 and 1.41). The inverted cone at the white point maps into the full interior of this boundary. We can easily find the locus of semichromes RYGCBM; it coincides with the boundary. The loci of boundary colours KRYW and KBCW appear much as in the CIE chromaticity diagram for the general case. Finally, we can put an Ostwald-style atlas structure on the RGB-cube. The locus of characteristic colours is the closed polygonal arc RYGCBM. We may consider it mensurated (six divisions) since the vertices are equally distributed according to the equal volumes scheme. The characteristic colours can be used to label the planes of a sheaf of planes on the grey axis. Each plane on a certain characteristic colour defines a triangle WCK (C the characteristic colour). Such a triangle is a ‘single page’ from the atlas. We may indicate colours on the page by their colour, white and black content. Thus each point inside the RGB-cube is uniquely specified via the four numbers: index of characteristic colour (0–5), white content (0 ≤ w ≤ 1), black content (0 ≤ b ≤ 1) and colour content (0 ≤ c ≤ 1) (there are only three degrees of freedom because of the constraint w + b + c = 1). This is a much more intuitive and practical way to specify a colour than by red-fraction, green-fraction and blue-fraction (the ‘RGBColor[r, g, b]’ method). It is used in some ‘colour pickers’ for
Figure 1.41 The RGB colour triangle. At each chromaticity we have plotted the brightest RGB colour. (See colour Plate 27 in the centre of this book.)
graphics applications, notably the ‘Painter’ application of Fractal Design Corporation. Of course one can easily convert between the RGB and Ostwald denotations. Thus we see that we find exactly the same geometrical structures in this simple and intuitive situation as we find in the general case (though here the formal structure is much more complicated because of the colour matching functions). Indeed, the structure of the RGB-cube can often be used to gain a more intimate understanding of the essential structure of the general case. This throws a new light on one of the questions posed at the beginning: Is there any reason to prefer one of the many ‘colour bodies’ (spheres, double cones, cubes, trees, etc.)? It appears that the cube definitely has some advantages over the others since it preserves most of the structure that characterizes the general case. The reason is simply the three-dimensionality of C.
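The conversion between the RGB and Ostwald denotations can be sketched as follows. This is one possible reading, not the method of any particular colour picker: the hue is taken as a continuous position in [0, 6) along the arc RYGCBM (R = 0, Y = 1, G = 2, C = 3, B = 4, M = 5), white content is the smallest component, black content is one minus the largest component, and colour content is their difference. The function names are hypothetical.

```python
# Sketch: RGB <-> Ostwald-style (hue, white, black, colour) on the unit RGB-cube.
# The continuous hue parameter and all names are assumptions of this example.


def rgb_to_ostwald(r, g, b):
    """Return (hue, white, black, colour); hue is None for greys."""
    hi, lo = max(r, g, b), min(r, g, b)
    white, black, colour = lo, 1.0 - hi, hi - lo
    if colour == 0:
        return None, white, black, 0.0
    if r == hi:
        hue = ((g - b) / colour) % 6.0   # between magenta, red and yellow
    elif g == hi:
        hue = 2.0 + (b - r) / colour     # between yellow, green and cyan
    else:
        hue = 4.0 + (r - g) / colour     # between cyan, blue and magenta
    return hue, white, black, colour


def full_colour(hue):
    """Point on the closed polygonal arc RYGCBM at position hue in [0, 6)."""
    h = hue % 6.0
    sector, t = int(h), h - int(h)
    return [
        (1.0, t, 0.0), (1.0 - t, 1.0, 0.0), (0.0, 1.0, t),
        (0.0, 1.0 - t, 1.0), (t, 0.0, 1.0), (1.0, 0.0, 1.0 - t),
    ][sector]


def ostwald_to_rgb(hue, white, colour):
    """Mix white and the full colour; black content contributes nothing."""
    return tuple(white + colour * f for f in full_colour(hue))


print(rgb_to_ostwald(0.75, 0.25, 0.25))
print(ostwald_to_rgb(0.0, 0.25, 0.5))
```

By construction the three contents add to one, so only the hue and two of the contents need to be stored.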
Discussion
Did we find any answers to the problems posed in the introduction? Here we take them in order:
• That there exist so many colour solids is largely the result of human fancy. The one
feature that is common to (almost) all colour solids that have been proposed is that they are convex, finite bodies with pronounced singularities at the white and black poles. This feature, at least, has firm roots in colorimetry. The only aberrations here are the ‘colour trees’, which do not explicitly endorse convexity (nor finiteness for that matter), and systems such as Lambert’s and Helmholtz’s, which are really based on colours in the
aperture mode (no black). The latter are strikingly at odds with the (very fundamental) central symmetry of the colour solid.
• Since biological systems (by all odds) developed outside the laboratory, they are set to
deal with the structure of continuous spectra, not monochromatic beams. The surface colours are most naturally parameterized via their common boundary, the optimal colours. The ‘equator’ of the optimal colours (full colour locus) is a closed curve. Thus the hues quite naturally fall into a periodic order. The open, linear segment of the Newtonian spectrum is to be considered a laboratory artefact with little relevance for natural vision (monochromatic paints are black!).
• The warm and cold colour families have not been formalized as colorimetric concepts up till now (at least as far as we are aware). We have shown that the dichotomy is implicit in the most primitive, affine colorimetric structure, one does not even need the notion of an achromatic beam for its definition. It is to be considered one of the most basic invariants of colour space.
• Newton’s ‘homogeneous lights’ are nothing special. Although they often occur in the
formalism they rarely occur in real life and are to be considered more of a laboratory artefact than a basic building block. Ostwald was on the right track when he replaced the monochromatic beams with his Vollfarben. Indeed, one of the more rational bases for object colours are the attenuated optimal colours. At least they are complete. The Newtonian (impure) spectral colours represent only half of the object colours. Goethe’s Kantenfarben are in many formal respects similar to monochromatic beams. They are more robust than monochromatic beams and can actually be produced easily in the laboratory [technically, they are the ‘primitives’ (cumulated integrals) of the monochromatic spectra]. The relation between the full colours and the boundary colours is an intimate one. ‘White’ is simply one of the ideal colours, the highest (LUB) in the spectral dominance hierarchy. To consider white a ‘confused mixture of homogeneous lights’ buys one nothing, to consider it the ‘mother of all colours’ is closer to the actual state of affairs.
• There does exist a natural metric, it is simply the metric provided by the physics.
Since colour space is essentially a low-dimensional image (linear projection) of the space of beams, we have an induced metric for the colours. Thus not all affinely equivalent copies of colour space are equal—the large majority of them shows us only a deformed view of fundamental space. Only a small set (related via isometries) shows us an undeformed, ‘true’ view of the structure of fundamental space. Even among those ‘nice’ colour spaces we can make a rational choice and pick a truly ‘canonical’ one: we might pick a preferred spatial orientation on the basis of the fiducial directions specified via the achromatic beam and the plane separating the warm and cold colour families. Such a choice has the obvious advantage that it can be re-established from first principles (and colorimetric experiments, namely ‘gauging the spectrum’) without arbitrary international agreements such as the CIE 1931 basis.
• There are indeed principled ways to mensurate the colour circle. This may be understood
in different ways—one may either mensurate the full colour locus (Ostwald’s choice) or the locus of characteristic colours. As we have shown, the latter choice has the advantage
that it depends only on the colour, but not on the spectrum of ‘white’. Principled methods are invariant against variation of the primaries. Ostwald’s principle of internal symmetry is an attractive idea, and indeed contains the nucleus of a solution, but fails because the full colour locus is non-planar. We have identified two invariant methods: a division by arc length in the canonical basis and a division by volumes (incremental volume between two characteristic colours is the volume of the tetrahedron defined by the two full colours, white and black). The methods seem to offer slightly different advantages and we won’t venture a final choice here though the latter can be shown to be the natural (and correct!) generalization of Ostwald’s somewhat mystical principle of internal symmetry. An exciting empirical finding is that the results of ‘eye measure’ (purely psychological ‘uniform spacing’) are quite similar to each of the principled methods. This indicates that vision uses a metric close to the one induced from the physics, and suggests that one should prefer a rational method in practice. Results are probably not significantly different from a pragmatic point of view. The principled methods are well reproducible and guarantee globally uniform results, whereas ‘eye measure’ scores badly on both counts.
• The colours of colorimetry are indeed a truthful (though limited) reflection of the
physical structure of electromagnetic radiation. Colour vision—in the approximation of colorimetry—is largely a form of low-resolution spectroscopy and the ‘observer’s share’ appears to be minimal. This is both an exciting and prima facie surprising fact (in retrospect, however, one should probably expect the results of evolution to converge to such a state of affairs). It is exciting because it offers us a powerful handle on the interaction of humans with their physical environment. It is surprising because only the minutest move from pristine colorimetry gets us into situations where we are at a loss to predict even the simplest empirical facts relating to ‘colour vision’; this is where (puritan) psychophysics ends and psychology starts.
Apart from the key questions, we have discussed several more technical points. In the process we have introduced a few novel concepts and developments that readers with some background in colorimetry may have noticed.
References
1. Anonymous. ISCC–NBS color-name charts illustrated with centroid colors. Standard sample No. 2106, Supplement to NBS Circular 553.
2. S. Axler. Linear algebra done right. Springer-Verlag, New York, 1996.
3. F. W. J. Billmeyer. Survey of color order systems. Col. Res. Appl., 12:173–186, 1987.
4. M. Born and E. Wolf. Principles of optics, electromagnetic theory of propagation, interference and diffraction of light (2nd edn). Pergamon Press, Oxford, 1964.
5. P. J. Bouma. Zur Einteilung des Ostwaldschen Farbtonkreises. Experientia, 2:99–103, 1946.
6. P. J. Bouma. Physical aspects of colour. N.V. Philips Gloeilampenfabrieken, Eindhoven, 1947.
7. G. Brindley. Physiology of the retina and the visual pathway. Edward Arnold, London, 1970.
8. CIE. CIE proceedings. Cambridge University Press, Cambridge, 1924.
9. CIE. CIE proceedings. Cambridge University Press, Cambridge, 1931.
10. M. C. Company. Munsell book of color. Munsell Color Co., 10 East Franklin Street, Baltimore, MD, 1929.
11. C. Dolland. Über den Luther-Nybergschen Farbkörper. Die Farbe, 5:113–136, 1956.
12. R. M. Evans. An introduction to color. John Wiley & Sons, New York, 1948.
13. A. Gershun. Sur une théorie du champ lumineux. Rev. Gen. de l’Elec., 44:307, 1938.
14. A. Gershun. The light field. English translation by Moon and Timishenko. J. Math. Phys., 18(51), 1939.
15. H. Grassmann. Zur Theorie der Farbenmischung. Ann. Phys., 89:69–84, 1853.
16. H. Grassmann. Bemerkungen zur Theorie der Farbenempfindungen. Va. Selbstanzeige von V. Königsb. Rep. II, pp. 213–221, 1879.
17. E. Hering. Grundzüge der Lehre vom Lichtsinn. Julius Springer, Berlin, 1920.
18. T. Holtsmark. Newton’s experimentum crucis reconsidered. Am. J. Phys., 38:1229–1235, 1970.
19. T. Holtsmark. Das Experimentum Crucis und die Theorie der Dispersion. Opt. Acta, 18:867–873, 1971.
20. T. Holtsmark and A. Valberg. On complementary color transitions due to dispersion. Am. J. Phys., 39:201–204, 1971.
21. D. Judd. Color in business, science and industry (1st edn). John Wiley & Sons, New York, 1952.
22. J. J. Koenderink. Color atlas theory. J. Opt. Soc. Am., A4:1314–1321, 1987.
23. H. Küppers. Das Grundgesetz der Farbenlehre. DuMont Buchverlag, Köln, 1978.
24. J. Lambert. Photometria sive de Mensura et Gradibus Luminis Colorum et Umbrae. Eberhard Klett, Augsburg, 1760.
25. E. H. Land. Experiments in color vision. In R. Held and W. Richards, editors, Readings from Scientific American (originally Scientific American May 1959), number 28, pp. 286–298. W. H. Freeman and Company, San Francisco, 1972.
26. R. Longhurst. Geometrical and physical optics (1st edn). Longmans, Green and Co., London, 1957.
27. R. Luther. Aus dem Gebiet der Farbreizmetrik. Z. f. techn. Physik, 12:540–558, 1927.
28. J. C. Maxwell. XVIII. Experiments on colour, as perceived by the eye, with remarks on colourblindness. Trans. of the Roy. Soc. Edinburgh, pp. 275–298, 1855.
29. J. C. Maxwell. On the theory of compound colours, and the relations of the colours of the spectrum. Phil. Trans., pp. 57–84, 1860.
30. A. Munsell. A color notation (10th edn). Munsell Color Co., 10 East Franklin Street, Baltimore, MD, 1947.
31. I. Newton. A new theory about light and colors. Am. J. Phys., 61:108–112, 1993.
32. I. Newton. Opticks. Dover Publications, New York, 1952.
33. N. Nyberg. Zum Aufbau des Farbenkörpers im Raume aller Lichtempfindungen. Z. f. Physik, 52:406–419, 1928.
34. W. Ostwald. Das absolute System der Farben. Z. f. physik. Chem., 92:222–226, 1917.
35. W. Ostwald. Er und Ich. Theodor Martins Textilverlag, Leipzig, 1936.
36. C. Parkhurst and R. L. Feller. Who invented the color wheel? Col. Res. Appl., 7:217–230, 1982.
37. S. Quiller. Color Choices, Making Color Sense out of Color Theory. Watson-Guptill Publications, New York, 1989.
38. P. O. Runge. Farben-Kugel oder Construction des Verhältnisses aller Mischungen der Farben zu einander, und ihrer vollständigen Affinität, mit angehängtem Versuch einer Ableitung der Harmonie in den Zusammenstellungen der Farben. Friedrich Perthes, Hamburg, 1810.
39. J. Ruskin. The Elements of Drawing. Dover Publications, New York, 1971.
40. E. Schrödinger. Grundlinien einer Theorie der Farbenmetrik im Tagessehen. Parts 1 and 2. Ann. Phys., 63:397–456, 1920.
41. E. Schrödinger. Grundlinien einer Theorie der Farbenmetrik im Tagessehen. Part 3. Ann. Phys., 63:481–520, 1920.
42. E. Schrödinger. Theorie der Pigmente von grösster Leuchtkraft. Ann. Phys., 62:603–622, 1920.
43. G. Strang. Linear Algebra and its Applications. Academic Press, New York, 1976.
44. J. W. von Goethe. Theory of Colours (6th edn). The MIT Press, Cambridge, MA, 1982.
45. J. W. von Goethe. Farbenlehre, mit Einleitungen und Kommentaren von Rudolf Steiner, volumes 1–3 (3rd edn). Verlag Freies Geistesleben, Stuttgart, 1984.
46. H. von Helmholtz. Handbuch der Physiologischen Optik (2nd edn). Voss, Hamburg, 1896.
47. G. Wyszecki and W. S. Stiles. Color science: concepts and methods, quantitative data and formulas. John Wiley & Sons, New York, 1967.
commentary: perspectives on colour space
Commentaries on Koenderink and van Doorn
From physics to perception through colorimetry: a bridge too far?
Donald I. A. MacLeod
Koenderink and van Doorn’s ‘Perspectives on colour space’ is a landmark contribution, a uniquely scholarly and insightful synthesis of old and new ideas. Unlike the authors I regard it as a contribution to the literature of ‘colour science’ as well as of colorimetry. It can, at least, be read with profit by all colour scientists (who, whether they like it or not, have to deal with colorimetry). I admire it for its range and clarity, its bold originality and its sometimes amazing ingenuity. But my admiration is tempered by alarm.
One of the things I like about the essay is its liberating effect. It loosens the grip of the Newtonian paradigm on current thinking about colour and colorimetry. By focusing on surface colours instead of on the monochromatic lights into which Newton decomposed the spectrum, it draws attention to interesting but neglected avenues for exploration in the physics of colour, and revives the unduly neglected tradition of Goethe, Schopenhauer, and Ostwald. The chapter has much to teach us, both by precept and example. For instance, it is the best extant demonstration of the advantages of adopting a full three-dimensional geometrical description of the colour stimulus, rather than the two-dimensional projections (chromaticity diagrams) that most discussions have preferred to concentrate on.
What is alarming is that, in their attempt to say something interesting about colour based on physics only, the authors have enhanced the attractiveness, to unwary readers, of what Mausfeld (2002) has called the ‘physicalist trap’.
The physicalist trap is the attempt to give purely physical accounts of perceptual phenomena, and it is a conceptual pitfall into which we are all prone to fall (with the frequent exception of physicists themselves, from Newton with his famous insistence that ‘the rays are not coloured’ to the authors of the present chapter). We fall into the trap because evolution has made us victims of what might be called the illusion of objectivity: the pre-scientific starting point for any scientific consideration of perception is the intellectually naive but biologically necessary conviction that the phenomenal world of perception is nothing different from the world of external physical reality. This illusion of objectivity remains in the cognitive background as a conceptual pitfall when we try to develop scientific accounts of perception. The authors are not naive realists, and they are in no real danger of falling into the physicalist trap themselves while they so successfully embellish it. They simply invite the reader to note the intriguing qualitative and quantitative correspondences they have discovered between the world of physical colour stimuli and the world of colour appearance. Their project is to promote a view in which the naive notion of identity of the phenomenal and the physical is left behind, but the notion of a simple (albeit limited and qualified) isomorphism between the two is retained. Although the tendency of their work is in this sense merely quasi-reductionist rather than genuinely reductionist, I believe it is dangerous. Even if the physicalist trap fails to capture the authors or their sophisticated readers outright, it can have insidious effects by its mere presence in the cognitive background, because it lends a spurious significance to parallels between the physical and phenomenal worlds—parallels sometimes so indirect that they may be almost accidental. 
This chapter identifies many such parallels that are novel and intriguing, but whose significance becomes questionable on close scrutiny. And while it might be interesting to endow these distant parallels with significance within either a mechanistic framework (with a basis in physiology) or an evolutionary framework (with a basis in ecology, including natural scene statistics), it should be clear that any such project would have to invoke an elaborate set of new and disagreeably uncertain assumptions. The authors have understandably chosen not to emphasize such difficulties and uncertainties in their enterprise of tracing colour to its physical roots. In what follows, I try to do this for them.
From physics to colorimetry
The authors want to limit themselves to colorimetry, excluding other aspects of colour perception (although, as we will see, they occasionally stray beyond that self-imposed boundary). Colorimetry is the part of colour vision most closely tied to physics. Indeed, Koenderink and van Doorn characterize colorimetry, with the colour-matching functions on which it is based, as ‘almost pure physics’. They recognize that all colorimetry depends on the form of the cone spectral sensitivities, but they refer to those as ‘accidental’, and dismiss them (in their note 5) as an unnecessary complication, as if the colour-matching functions belong to physics, but the cone excitations to physiology. Yet in reality the cone excitations determine the colour-matching functions. The judgements of a subject about a colour match, on which colorimetry depends, are therefore at least as ‘accidental’, and at least as divorced from ‘pure physics’, as the cone excitations that the compared stimuli elicit in the subject’s retina. How, then, could physics include the colour-matching functions but exclude the cone excitations? That conception of the scope of physics may draw its plausibility from the illusion of objectivity, which encourages us to identify the phenomenal world of colour with the physical world of colour stimuli, and to neglect, when making this identification, the intervening physiological processes on which phenomenal experience undoubtedly depends.
Be that as it may, the cone photoreceptor excitations that the authors are careful to exclude from their discussion have provided natural and convenient coordinate systems for characterizing colour stimuli, both in the early days of colorimetry (Luther, 1927) and in more recent discussions (MacLeod and Boynton, 1979).
By abandoning this obvious, but overtly physiological, choice of coordinates, Koenderink and van Doorn find themselves faced with an embarrassment of choice among various more or less arbitrary coordinate systems for representing the colour stimulus. I consider next the choices that they make and that determine their perspective on colour space.
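The claim that the cone excitations determine the colour-matching functions can be made concrete with a small numerical sketch. Everything specific here is an assumption for illustration: the Gaussian curves are toy stand-ins for cone spectral sensitivities (not measured data), and the three monochromatic primaries at 630, 530, and 450 nm are arbitrary. Under those assumptions, the colour-matching functions are just a fixed 3×3 linear transform of the cone sensitivities.

```python
import numpy as np

wl = np.arange(400.0, 701.0, 5.0)  # wavelength grid in nm (assumed sampling)

def gaussian(peak, width):
    """Toy bell-shaped sensitivity curve on the wavelength grid."""
    return np.exp(-0.5 * ((wl - peak) / width) ** 2)

# Hypothetical cone sensitivities (rows: L, M, S); not real cone fundamentals.
cones = np.vstack([gaussian(565, 45), gaussian(540, 40), gaussian(445, 25)])

def monochromatic(peak_nm):
    """Unit-power light concentrated at the nearest grid wavelength."""
    spd = np.zeros_like(wl)
    spd[np.argmin(np.abs(wl - peak_nm))] = 1.0
    return spd

# Three hypothetical primaries, stored as the columns of P.
P = np.column_stack([monochromatic(p) for p in (630, 530, 450)])

# M[i, j] = excitation of cone i by a unit amount of primary j.
M = cones @ P

# Colour-matching functions for these primaries: the amount of each primary
# needed to match unit power at each wavelength. They are obtained from the
# cone sensitivities by inverting M, with no further input.
cmf = np.linalg.solve(M, cones)  # shape (3, number of wavelengths)

# Check on an arbitrary test light: the primary mixture prescribed by the
# colour-matching functions evokes the same three cone excitations.
rng = np.random.default_rng(0)
test_light = rng.random(wl.size)
weights = cmf @ test_light       # some weights may be negative
mixture = P @ weights
assert np.allclose(cones @ mixture, cones @ test_light)
```

Choosing different primaries changes `M`, and hence `cmf`, but the cone sensitivities stay fixed, which is the sense in which they are the more basic description in this sketch.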
The quest for physical invariants
Koenderink and van Doorn opt to use as a starting point the CIE XYZ coordinate system, whose claim to reality is less physical than sociological–historical. Recognizing the conventional nature of that framework, they cleverly transform it, following Cohen and Kappauf, to a physical stimulus representation that is invariant with the initial choice of primaries in terms of which the colour-matching functions of a given observer are expressed. As they explain (p. 17), in this representation, any given spectral energy distribution is decomposed into two component spectral energy distributions: a ‘fundamental component’ and a black component. The fundamental component is an energy distribution (virtual, because sometimes negative) constructed as a linear combination of three primaries, each with a spectral energy distribution proportional to one of the colour-matching functions for an equal energy spectrum; indeed, it is the one such combination that matches the given beam. Each row and each column of matrix R is the fundamental component, in that sense, of some monochromatic beam from the equal energy spectrum. If now the colour matches of the observer are described in terms of new primaries (say, red, green, and blue instead of X, Y, and Z), the colour-matching functions themselves will change. But their weighting as primaries in the construction of the fundamental component of any given beam must change correspondingly, so as to keep the fundamental component itself the same (since it must match the given beam).
But what is gained by this detour back to the colour-matching functions in the choice of the primaries from which a beam’s ‘fundamental component’ is built up? After all, the beam could equally well be regarded as the combination of any of its metamers with some suitably chosen black: the fundamental component preferred by the authors is just one choice among the infinity of possibilities.
It could be misleading, therefore, to think of this ‘fundamental’ spectrum as ‘the unique causally
efficient part of the beam’ (p. 17). It has that status only if the particular, somewhat arbitrary decomposition favoured by the authors is adopted. Not only the fundamental status of the fundamental components and of the matrix R, but their invariance as well, must be qualified: the fundamental and matrix R remain completely tied to convention in their dependence on the equal energy spectrum. There is no obvious physical warrant for giving that spectrum a special status (as opposed to, for instance, one with equal quantum flux per unit frequency), and no ecological one either. In view of these considerations it is not clear when, or why, we might prefer to calculate the ‘fundamental’ spectral component of a stimulus, instead of (for instance) the corresponding triplet of cone excitations. The decomposition into the invariant fundamental spectrum plus black is formally elegant but lacks practical utility. The only use the authors suggest for it is to generate metamers for a given beam. Surely the cone excitations have a less hollow claim to fundamental status: even in the assessment of the degree of metamerism between two given beams, the cone excitation triplet provides a much simpler alternative representation of the visually effective stimulus, in the form of three quantities that are literally fundamental for colour vision, including colour discrimination. Much as Koenderink and van Doorn would like to banish them, the spectral colours not only appear in the matrix R but return to haunt them persistently in their later enlightening discussion of full colours and of the colour solid for reflecting surfaces. The difficulty here again arises from a lack of invariance. The colour solid, including the locus of full colours or semichromes (p. 27) in particular, is illumination dependent. 
The envelope of that solid for all possible illuminants is very different from any particular realization of the solid, and indeed is none other than the convex hull of spectral colours: with no restriction on illuminant power, the colour solid can contact that spectral cone at any point, as is clear in the limiting case of intense monochromatic illumination. Koenderink and van Doorn show that a particular closed curve of ‘characteristic colours’ will form, when linked to white and black, the envelope of the colour solid for all illuminants that share a particular colour (metameric illuminants). But these characteristic colours are again simply the colours of monochromatic lights. (Each has the maximal intensity that could be remitted by a white surface under any illumination metameric with the prevailing one. That intensity is achieved when the illuminant is just a mixture of the monochromatic light and some other spectral light or purple.) The search for invariance here forces a return to the decomposition of the spectrum into monochromatic lights. This lack of invariance is an unwelcome complication for the chapter’s enterprise of providing a general treatment of the colorimetry of reflecting surfaces. But it may be helpful to the visual system in the context of colour constancy. The gamut—the solid shape in colour space that can be filled by surface colours under a given illuminant—is an unduly neglected physical constraint in natural colour vision, a constraint that this chapter explains with unprecedented clarity and elegance. And just because it is illuminant-dependent, the gamut affords cues that could be very useful for estimating the illuminant and achieving a relatively illumination-invariant estimate of surface colour in natural scenes. This possibility has been explored by Forsyth, and is a focus of Chapter 7 (this volume) by MacLeod and Golz. 
The vague notion of those authors that ‘when the light gets red the reds get lighter’ has a definite realization in the behaviour of Koenderink and van Doorn’s illuminant-dependent locus of characteristic colours. Under a red illuminant, the inverted spectral cone that tapers to white is translated redward, and the locus of characteristic colours, where the origin-centred spectral cone meets the inverted cone, gets tilted away from the origin on the red side. This serves to illustrate how the shape of the full colour solid, which depends on the illuminant colour, can provide a cue to that colour (if there are enough visible samples from the colour solid to give information about its shape). But the theoretical utility of the spectral characteristic colours in this and other contexts may be limited because they are a very extreme case, encircling the colour solids for typical natural illuminants at a
far greater distance from the achromatic axis. Here invariance comes at the price of the artificiality that the authors have been keen to avoid.
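The decomposition discussed in this section can be sketched numerically. The sketch assumes toy Gaussian curves in place of real colour-matching functions, and writes the matrix R as the orthogonal projector A(AᵀA)⁻¹Aᵀ onto their span, which is the usual form of Cohen's construction. The reweighting by `w` at the end is a hypothetical way of modelling a reference spectrum other than the equal energy one; it is not taken from the chapter.

```python
import numpy as np

wl = np.arange(400.0, 701.0, 5.0)  # wavelength grid in nm (assumed sampling)

def gaussian(peak, width):
    return np.exp(-0.5 * ((wl - peak) / width) ** 2)

# Columns of A: three toy curves standing in for colour-matching functions.
A = np.column_stack([gaussian(565, 45), gaussian(540, 40), gaussian(445, 25)])

def projector(A):
    """Orthogonal projector onto the column space of A: A (A^T A)^-1 A^T."""
    return A @ np.linalg.solve(A.T @ A, A.T)

R = projector(A)

rng = np.random.default_rng(1)
beam = rng.random(wl.size)       # arbitrary spectral energy distribution
fundamental = R @ beam           # 'fundamental component' of the beam
black = beam - fundamental       # residual 'black' component

# The black component is invisible to all three curves, so the fundamental
# component is a metamer of the original beam.
assert np.allclose(A.T @ black, 0.0)
assert np.allclose(A.T @ fundamental, A.T @ beam)

# Invariance: an invertible change of primaries (hypothetical matrix T)
# changes the curves but leaves the projector R unchanged.
T = np.array([[1.0, 0.2, 0.0],
              [0.3, 1.0, 0.1],
              [0.0, 0.4, 1.0]])
assert np.allclose(projector(A @ T), R)

# Convention dependence: measure the beam relative to a non-flat reference
# spectrum w (assumed), project orthogonally in those units, convert back.
w = np.linspace(0.5, 1.5, wl.size)
W2A = (w ** 2)[:, None] * A
alt_fundamental = W2A @ np.linalg.solve(A.T @ W2A, A.T @ beam)

assert np.allclose(A.T @ alt_fundamental, A.T @ beam)   # still a metamer
assert not np.allclose(alt_fundamental, fundamental)    # but a different one
```

Both `fundamental` and `alt_fundamental` match the beam, which illustrates the point made above: the invariance holds across choices of primaries, but which metamer counts as 'fundamental' depends on the spectral units adopted.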
Colorimetry and perception
The discussion of canonical bases (p. 18) suggests that a particular linear transform of colour space offers a natural or undistorted view of it, unique except for a choice of viewpoint through rigid rotation. The very idea that some particular view of colour space can be regarded as undistorted, with a faithful rendition of distances and angles, is a provocative one. In proposing it, the authors stray beyond the domain of colorimetry into that of colour perception, since (as they note) colorimetry proper does not constrain distances or angles. This makes it all the more interesting if an undistorted view can be determined on the basis of physics (as represented in the colour-matching functions) alone. And it is, indeed, intriguing that the distances in the undistorted view of colorimetric colour space tally so well with the Munsell distances that represent phenomenal colour differences.
Unfortunately, though, the ‘undistorted’ basis functions obtained by singular value decomposition (SVD) have hardly a better claim to physical, biological or psychological reality than the colour-matching functions from which they were derived. Like the untransformed space of ‘fundamental components’ discussed above, the structure derived using SVD is satisfyingly independent of the initial choice of primaries, but is strongly affected by the conventional choice of the equal energy spectral colours as a stimulus set. So the SVD ‘canonical’ basis can be faulted on the same grounds that its authors fault so much of contemporary colorimetry: it gives an unjustified primacy to the spectral lights (and to the equal energy spectrum in particular). To be consistent with the authors’ position elsewhere in this chapter, the suggestion that the brain achieves ‘an undistorted representation of the physical structure of beams’ (Fig. 1.11) should be evaluated with reference to natural colours, not the Newtonian spectrum.
Recent investigations of the coding of natural colours, for instance the chapter by MacLeod and von der Twer (Chapter 5 this volume), illustrate this approach. But any set of colours is a somewhat arbitrary choice, so the derived view of colour space will always lack the objectivity and generality that we expect for a datum of physics. The canonical orientation suggested for colour space adopts as one axis the ‘achromatic axis’ (p. 23) that is said to represent points of indeterminate hue. But the existence and location of such an axis are facts of psychology, not of physics. An observer might, for instance see all realizable colours, including any candidate ‘whites’ suggested by physics, as reddish. The authors acknowledge such difficulties by saying that the choice of which axis to identify as achromatic is completely arbitrary; yet if the choice is truly arbitrary, surely nothing but confusion can result from labelling it the achromatic axis. This problem is compounded when the authors designate Ostwald’s semichromes as the most ‘colourful’ colours: the semichromes (or full colours) must indeed be optimal in colorimetric purity in the sense illustrated in Fig. 1.15, but this carries no implications whatever about their appearance. More generally: the cone sensitivities, and the colorimetric data that they determine, place no constraint whatever on the way that colour appearance varies across physical colour space. Failure to appreciate this point is a serious, though ubiquitous, theoretical error. It is an intellectual blind spot fostered by the illusion of objectivity: we assume an objective physical basis for qualities of our experience that are in reality accidents of our physiology (or if you prefer, of our psychology). This point deserves elaboration. For concreteness, consider observers who have the normal three cone photoreceptors together with three or more postreceptoral neural signals that determine colour appearance. 
The postreceptoral signals each depend, let’s say continuously but non-linearly, on all the cone excitations. Even this minimally complex and familiar colour vision system allows each of the three colour signals to take any value at any point in colour space (subject only to the continuity constraint). There are
many possible structures for the phenomenal colour space of observers of this general type (a class which doubtless includes humans). Such an observer might, for instance, perceive the entire locus of semichromes as perfectly achromatic rather than maximally coloured, with colours inside that locus in the chromaticity diagram appearing greenish (for example) and colours outside it appearing reddish. The ‘achromatic axis’ could be perceptually the greenest colour of all! And this is among the least outlandish of the possibilities . . . . In fact, of course, the semichromes are indeed relatively colourful in appearance (more so than the necessarily very dark, narrow-band reflectances), but this has nothing directly to do with their special physical nature. This chapter provides no physical, physiological, or functional rationale for expecting any correspondence between colorimetric purity and colourful appearance (nor does that correspondence hold strictly even as a matter of observation: exceptions are noted later in the chapter). Although physics does not tell us why stimuli on a particular colorimetric axis appear perceptually achromatic, the chapter by MacLeod and von der Twer (p. 155) attempts an answer based on ecology, and on efficiency of neural coding. There it is suggested that for efficient use of noisy and compressively non-linear neural signals, extremes of sensation (and maximal signals) should be associated with stimuli close to the boundary of the probability density function for natural stimuli, while the most frequent stimuli are perceptually encoded as close to neutral. This simple postdiction is approximately fulfilled: in any reasonable colour space, the most frequent natural stimuli are ones that appear nearly white (though often slightly yellowish and/or greenish). Koenderink and van Doorn may be excused, or even congratulated, for having avoided such messy lines of enquiry. 
But, is it realistic to hope that less messy answers could ever suffice for such inherently messy questions? Along with the achromatic axis, the authors identify a new axis in colour space, that links the origin with the spectral boundary between warm and cold colours. This boundary they locate in the spectrum at 537 nm, which they show is, in any linear colour space, the farthest point on the equal energy spectrum locus from the plane of purples. This colorimetric definition of warm and cool colours is a novel and intriguing proposal. But readers should ask themselves: (1) how the proposed correspondence should be explained or understood, if not as a coincidence; and (2) why it should hold for the equal energy spectrum in particular (other spectra, such as that of sunlight, place different wavelengths farthest from purple). A further technical point is pertinent. Strictly, the unique plane of purples is a fiction. The notion of a limited spectrum has no support from physics, and behavioural data also fail to support it in the required sense of suggesting tangent directions at the two spectral extremes. Very short wavelengths, in particular, exhibit no well-defined limiting chromaticity, so no ‘plane of purples’ is objectively given. Instead, a conventional choice of wavelengths for the ends of the spectrum defines the set of colours identified as the plane of purples, and this in turn determines the wavelength most distant from that plane. Less arbitrary, though more overtly biological, axes can be suggested for colorimetry. An axis representing luminance is a standard choice, roughly a substitute for the ‘achromatic’ axis. Koenderink and van Doorn cheerfully characterize the status of luminance as ‘rock bottom’. Yet in the context of human vision, the status of energy is lower still (unless one’s interest is in cooking the retina). To represent a local light stimulus by a single number, luminance is the only reasonable choice. 
It not only corresponds roughly to perceived brightness, but also determines, among other things, visual performance in acuity, reaction time, and temporal resolution. Luminance-normalized cone excitations provide serviceable chromaticity coordinates (Luther, 1927; MacLeod and Boynton, 1979), and the addition of luminance itself to those (Derrington, Krauskopf and Lennie, 1984) provides a useful, albeit not physically derived, three-dimensional representation of the colour stimulus. In these physiologically based representations of colour, the S cone excitation axis provides a second axis,
naturally orthogonal to the luminance axis and likewise capturing directly information relevant for (among other things) spatial and temporal resolution and distinctness of borders. As these competing examples suggest, the considerations that have been adduced for favouring certain candidate axes for colour space do not come from the domain of physics. They derive instead from psychophysics or physiology. But merely by preferring certain axes to others, we are leaving the domain of colorimetry proper. As this chapter makes clear, colorimetrically relevant properties of colour space are unaffected by origin-preserving linear (affine) transformations. Thus if our concern is purely with colorimetry, the very enterprise of searching for canonical axes is at best unnecessary and at worst misguided. It is natural to want to use the freedom of axis choice to capture something meaningful outside of the domain of colorimetry, and that is what all the candidate choices attempt to do. This is harmless in itself. But in making such a choice, it is important to remember how restricted are the explanatory limits of colorimetry per se. For colorimetric purposes, a colour space merely has to indicate whether colour stimuli match (have the same coordinates). If we take the liberty of adopting the cone spectral sensitivities (or if you prefer, the colour-matching functions) as a physical datum, physics determines that much of colour vision—but no more: the physically determined coordinates need not indicate anything at all about how two distinct colours relate to one another in their appearance. Two avenues, neither of them through physics, allow us to advance our understanding of how colours look. Mechanistic hypotheses must involve physiological postulates or data, as in the tradition of Hering, together with implicit or explicit psychophysical linking hypotheses (Brindley). 
And teleological or evolutionary hypotheses must involve ecological postulates or data that pertain to the organism’s visual interactions with its environment.
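The point that colorimetry fixes only whether stimuli match, and not their distances or angles, can be illustrated with a short sketch. The coordinates and the transformation matrix below are made-up numbers chosen only for illustration. An invertible linear change of coordinates preserves every match and every mismatch, yet it alters the ratios of distances and the angles between colour vectors.

```python
import numpy as np

# Made-up tristimulus coordinates in some linear colour space.
a = np.array([0.4, 0.5, 0.2])
b = a.copy()                      # a metamer of a: identical coordinates
c = np.array([0.6, 0.3, 0.3])
d = np.array([0.2, 0.6, 0.5])

# An arbitrary invertible linear change of coordinates (hypothetical).
T = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.1, 0.0, 0.5]])

def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    cos = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(cos))

# Matches and mismatches survive the transformation.
assert np.allclose(T @ a, T @ b)
assert not np.allclose(T @ a, T @ c)

# Relative distances do not survive it.
ratio_before = np.linalg.norm(a - c) / np.linalg.norm(a - d)
ratio_after = np.linalg.norm(T @ (a - c)) / np.linalg.norm(T @ (a - d))
assert not np.isclose(ratio_before, ratio_after)

# Neither do angles.
assert not np.isclose(angle_deg(c, d), angle_deg(T @ c, T @ d))
```

Any statement about which colours are close together or orthogonal therefore needs information from outside colorimetry, as the text argues.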
Conclusion
Although my purpose in this postscript has been to register objections and suggest qualifications to a few of the chapter’s claims, I hope those remarks will only encourage readers to investigate Koenderink and van Doorn’s magnificent analysis of the structure of colour space more comprehensively. But to those readers I say again: beware, for this is treacherous, though fascinating, territory, to be explored with caution! The physicalist trap, that ever present pitfall in the study of perception, has here been most dangerously camouflaged by the artful physicist authors. Even if their intentions are innocent, avoiding the trap is up to you alone.
References
Brindley, G. S. (1970). Physiology of the retina and visual pathway. Williams and Wilkins, Baltimore.
Derrington, A. M., Krauskopf, J., and Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. Journal of Physiology (London), 357, 241–65.
Luther, R. (1927). Aus dem Gebiet der Farbreizmetrik. Z. Tech. Phys., 8, 540–58.
MacLeod, D. I. A. and Boynton, R. M. (1979). Chromaticity diagram showing cone excitation by stimuli of equal luminance. Journal of the Optical Society of America, 69, 1183–6.
Mausfeld, R. (2002). The physicalistic trap in perception. In Perception and the physical world. (ed. D. Heyer and R. Mausfeld), pp. 75–112. Wiley, Chichester.
Commentaries on Koenderink and van Doorn
Colorimetry fortified
Paul Whittle
Colour has long elicited geometrical representation. In this chapter, by two physicists steeped in the intuitive geometry of Hilbert and Cohn-Vossen (1932), we have the best summary exposition and tutorial of colorimetry and its associated colour spaces that I know of. Each reading raised something new for me, not least gaps in my own understanding. The authors are at pains to stress the austerity of their subject matter, how it deals only with arid and abstract things of little interest to the general reader, overlaps hardly at all with ‘colour science’ and eschews completely the ‘mess’ of ‘the world of colour’. At least the first disclaimer can be taken with a pinch of salt. For these two physicists are no more able than the rest of us to avoid the seductions of colour, and so cannot resist introducing the fertile notion of an achromatic point, and later on several ‘intuitions’ that take them further. It is fortified colorimetry (‘supplied with added nutrients’, Oxford English Dictionary). For me one of the challenges of the chapter was to spot what was being slipped in and where, to work out whether its appearance of going a long way on remarkably little fuel was a conjuring trick or not. I try to do it here for the colour circle.
It should be noted that the authors give a principled account. They are not at all concerned with empirical details. They do not mention, for example, that the results of metameric matching can vary considerably with the stimulus parameters. I find this refreshing in a field that is often so empiricist that understanding is continually postponed. I wish we had more of it.
Colorimetry is independent of photometry
The beginning student of colour naturally assumes that colorimetry must be more complex than photometry. Is it not three channels versus (effectively) one? The authors show that the opposite is the case. Colorimetry has a formal elegance that quite escapes photometry. At first sight this is a little puzzling, because colorimetry, of course, includes the intensity dimension. But its intensity is purely radiant intensity: a scalar multiplying the absorptions of all three pigments by the same factor. The photometry question is quite unrelated: the relative effectiveness of lights of different chromaticity. There are many criteria for effectiveness and the results do not agree exactly (‘very low (really rock bottom!) scientific status’). Consensus has been imposed only by committee. Colorimetry, however, for all its formal elegance, has not escaped the same fate. The idealizations of science are one thing, their application another; see Johnston (2001) for a good account of the committee wrangles in both fields. But the contrast is instructive: colorimetry is biophysics; it only scrapes the surface of the organism, the photopigments, and ignores the nervous system proper, whereas the effectiveness that photometry is concerned with involves the whole organism.
Colorimetry does reduce the stimuli
The authors point out that the conventional use of isolated lights in dark surrounds for metameric matching is unnecessary. It can just as well be done with a small patch in the middle of a natural scene. It is response reduction, not stimulus reduction, that is crucial. The subject has only to respond ‘same’ or not and this can be reduced to the detection of an edge in a bipartite field or a flicker in a temporally alternated one. No reports of colour are needed; no colour concepts. Animals can easily be trained to do it.
But there is stimulus reduction too. If for you the essence of colour—your most vivid experiences of it—is in the play of colour, light and shade in a complex changing scene, such as a wood on a sunny spring day with a slight breeze moving the foliage, or a glistening salmon run in the hills, then working with one small textureless element at a time is as severe a stimulus reduction as could be. It excludes the greater part of what makes the world colourful for you. And that will never be restored. Colorimetry is resolutely atomistic. But the authors’ fortified colorimetry is not quite so atomistic. Their exposition deals with the full (though static) manifold of all visible ‘beams’, and the colour concepts that fortify it depend, as we shall see, on having the manifold there.
Spectrum to colour circle A ‘major topic of this paper’ is ‘Why are most colour order systems based on the “colour circle”, that is a periodic linear sequence whereas the spectral colors are naturally ordered as a linear open segment?’ After reading it I felt I understood this much better than before. In this section I try to say how, and to pick up what presuppositions are being introduced along the way. First, where does the cone of colours in colour space come from? We start with a ‘space of beams’ in which each axis corresponds to the radiation at one wavelength. ‘Geometrical intuition suggests that all beams fill an “∞tant” . . . in the linear space.’ It is this manifold that projects into the cone of colours, via the projection operator that ‘gauging the spectrum’ empirically determines. It gives readymade a single bounded cone with all colours nicely inside the monochromatic ones which, together with the purples, constitute its surface. Note that the cone can expand or contract a little depending on the choice of wavelength interval, which can only be pragmatic: neither too wide nor too narrow. Note also that the nice projected cone is not pure geometry. Biology also comes in (note 9). Where do the non-monochromatic purples come from that close off the cone? They arrive because we are projecting a filled ∞tant (the manifold of all visible beams) so these mixtures of far red and far blue are already included. Their particular plane in the space of beams just finds itself on the outside of the cone of colours because of the form of the projection operator (our photopigments plus the particular experimental procedure). I’m jumping the gun in referring to ‘colours’ and ‘purples’. So far these are purely formal entities: ‘metamers’, projections of subspaces of indistinguishable beams. To start talking about colours as we know them, we need appearances, which, as the authors note, takes us outside pure colorimetry. 
And to get from the bounded cone of metamers to the colour circle proper, we also need more structure: in order, chromaticities, chromaticity planes, an achromatic point, and semichromes. Chromaticities first. ‘If one attenuates a beam (“sunglasses”) one notices a decrease in “brightness” whereas the “colour” in some restricted sense appears to remain invariant . . . This is the reason why one often finds it convenient to regard colours modulo their magnitudes . . . and call them “the chromaticities”.’ A lot enters at ‘in some restricted sense’. These are our concepts, which took centuries to develop (Gage 1993). In many cultures, and in many contexts in ours, a light and a dark red that differed only via ‘sunglasses’ would not be seen as qualitatively similar. The authors’ description is particularly persuasive in a laboratory where one can continuously attenuate beams, or change wavelength or purity. The concept depends for its persuasiveness (this is rhetoric, not logic) on having the manifold of colours present in practice or in imagination and navigating it in a particular way. Factoring out an intensity dimension is fundamental to radiation-based systems of colour, but also marks their restricted context of applicability. For instance, black is an anomaly in such systems; it has ‘undetermined chromaticity’. But in a set of paint samples, black is a colour like any other. Of course all this is why the authors say ‘ “colour” in some restricted sense’. The restricted sense is precisely that which makes their scheme work. For their purposes it fits the world. But not for all.
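The idea of regarding colours ‘modulo their magnitudes’ can be made concrete with a small sketch. It is only an illustration under stated assumptions: a colour is taken to be three non-negative coordinates, the magnitude is taken to be their plain sum (an arbitrary choice, not a photometric luminance), and the function names `chromaticity` and `attenuate` are hypothetical. It shows that ‘sunglasses’ leave the chromaticity unchanged, and that black has no determined chromaticity.

```python
from typing import Optional, Tuple

Colour = Tuple[float, float, float]


def chromaticity(colour: Colour) -> Optional[Tuple[float, float]]:
    """Factor out the magnitude of a colour, leaving two coordinates.

    The magnitude is assumed to be the plain sum of the three coordinates.
    Black (all zeros) has no determined chromaticity, so None is returned.
    """
    total = sum(colour)
    if total == 0:
        return None
    return (colour[0] / total, colour[1] / total)


def attenuate(colour: Colour, factor: float) -> Colour:
    """Model 'sunglasses': scale all three coordinates by the same factor."""
    return (factor * colour[0], factor * colour[1], factor * colour[2])


bright_red = (0.6, 0.3, 0.1)            # illustrative values only
dark_red = attenuate(bright_red, 0.25)  # the same beam seen through sunglasses

print(chromaticity(bright_red))
print(chromaticity(dark_red))
print(chromaticity((0.0, 0.0, 0.0)))    # black: undetermined
```

The anomaly of black discussed above appears here as the special case that has to be handled separately: once the magnitude is divided out, there is nothing left to say what colour black is.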
They then introduce a ‘chromaticity plane’ that transects the cone. In the section ‘Additional structure . . .’ much more comes in. ‘When we circumnavigate the boundary of the cone of colours . . . we experience a continuous change of “hue” . . . Since real colours in the interior of the colour cone also have “hues”, one wonders about the loci of constant hue in colour space. They must be surfaces that intersect the boundary of the colour cone transversally. Some topological reasoning soon reveals that somewhere in the interior must exist a singular curve of points of undetermined hue.’ Here we have formal definitions (‘labelling’, ‘calling’ . . .), some natural history of colours (‘experience’, ‘noticing’), implicit assumptions (e.g. of continuity of hue ‘in some restricted sense’ as one moves inwards from the spectral locus; a very big assumption), and gestures to mathematics (‘some topological reasoning’). Again the persuasiveness depends on navigating the manifold in particular ways. It is hard to be clear about the relative roles of the components in this section, and to separate the formal structure from one’s understanding of it in terms of colour experience. In the same section, an ‘arbitrary achromatic locus’ was introduced. ‘Arbitrary’ to maintain colorimetry’s independence of appearances, although the illuminant is a ‘useful’ choice since its spectrum then dominates all the other (reflected) lights in a scene. ‘All other beams are created by taking away . . . radiant energy from this spectrum.’ Colours can then be thought of as spectral ‘shadows’, a notion to be found in ancient Greek thought and in Goethe. It is surely one of the most appealing and suggestive answers to the question ‘What is colour?’ The authors forbear from mentioning it, but the arbitrariness of the achromatic point is of course matched by the eye’s ability to adapt, to take, to a considerable degree, any prevailing illuminant as achromatic (‘discounting the illuminant’). 
The same phenomenon is alluded to later on, where the option is raised of formally treating the illuminant spectrum as part of ‘the eye’. At these points the principled account could get slightly further into the nervous system. As also in the derivation of optic-nerve-like opponent colour functions from considerations of a canonical basis for the projection of beams to colours. I note that both these scarcely get beyond the retina. We now have a closed curve of colours, a proto-colour-circle. The authors go on to show some interesting things about this circle, and to do that they need to get a better look at it, that is, to get away from the spectral locus, whose colours in the limit are invisible, and find conditions where the gamut of hues is best displayed. For this, they turn to material considerations, viz., how colours can be produced with a spectrometer. How does the colour vary with slit width? A very narrow slit doesn’t pass enough light to see: it generates black. A very wide slit passes the whole spectrum which recombines to white (if that is the colour of the entrance beam; the arbitrary achromatic point has to be really achromatic for you to follow the exposition in this section). There must be an optimum somewhere in between: a slit that generates the most colourful colours. This occurs for a surprisingly wide slit, with its ends at complementaries (Ostwald, Schrödinger). These colourful colours are therefore known as ‘semichromes’: made from half of the spectral-locus + purples (Fig. 1.18) (also known as ‘full colours’ or Vollfarben; confusingly, ‘optimal colours’, defined later, are different). There is a paradox here. The argument holds for any given spectrometer. But if we have a more powerful light source and use narrower slits, the ‘high purity’ colours produced are different from less pure ones. We can discriminate purity, at constant luminance, up to the highest value we can make (Wyszecki and Stiles 1967, p. 
509), and increasing purity certainly doesn’t make colours look less ‘colourful’. My memory is that they acquire an intensely coloured, jewel-like quality. However, the authors assert that a spectrum made up of such colours ‘really [doesn’t] look any better than Newton’s impure spectrum though, so the effort is really wasted’. The tone is pragmatic, as also in ‘In real life, colourful colours are always nearer to [sic] the semichromes’. That invocation of ‘real life’ is significant. This is a point where material considerations are important: the spectroscope or Ostwald’s most colourful paints. How relevant is it that the colourfulness of a spectral display is a different criterion
from that of individual colours? We perhaps shouldn’t worry exactly how well the semichromes fit ‘the mess’ of “the world of colour”. As I’ve already noted, it is good to have a principled exposition. As the wide slit that produces semichromes is moved across the spectrum (rotated around the achromatic point in Fig. 1.18, a very helpful diagram), one end of it will go beyond the visible region. One end of the spectrum is then passed by the slit, but we could equally well say that the other end is blocked. What happens if one uses from the start, instead of a slit, a stop (a ‘complementary slit’), which blocks off part of the spectrum? This, in effect, was Goethe’s procedure, and the colours produced are the ‘inverted spectrum’ which shares yellows, reds, and blues with the Newtonian (slit) spectrum, but now includes the purples but not the greens. The complementary geometry of the slit/stop generates complementary colours. As far as generating colours is concerned, Newton’s monochromatic lights were nothing special; Goethe’s ‘boundary colours’ will do just as well. (The authors do a fine job of historical rehabilitation on both Goethe and Ostwald.) The story of the colour circle continues into what is perhaps the most original contribution of the chapter: the development of the surface-colour solid and the mensuration of the colour circle, but I stop here for reasons of space. So, what have we learnt so far about the relation of spectrum to colour circle? We learnt how we get a cone of colours by projecting a manifold of beams; we learnt where in the chromaticity diagram to look for the best colours, and how to produce them; and any residual impression that Newton’s spectrum was primary was removed by the slit/stop complementarity. But note that the crucial substrate of appearances was brought in from outside; just taken as given. It was then woven into the development. 
The situation is not that of footnotes in a mathematics text mentioning applications, where the body of the text is an independent formal argument. This, of course, is just repeating that the chapter is fortified colorimetry, which the authors would not deny. What I’ve tried to do is to pinpoint some of the added nutrients.
What’s left out?
When I first read this chapter, I was struck by how many of the topics of colour science were touched on, even though some, like adaptation, are implicit rather than explicit, and others enter surreptitiously. I found it an effort to write a list of topics that were completely excluded. But I can also see it in quite a different perspective, in which I agree with the authors’ disclaimers about its relevance to colour at large. Look at it this way. The chapter is about trichromacy. But is trichromacy any more relevant to most of what we do with colour than is the writing system to the content of a book? Anyone who has transcribed a spoken tape knows that much of what went on (prosody, hesitations, changes of tone and so on) is lost in transcription. Similarly, our photopigments filter out much of the spectral variety of the world. But from most points of view they are both just interfaces. They explain very little indeed of what we do with colour or what is written in books.
References
Gage, J. (1993). Colour and culture. Thames & Hudson, London.
Hilbert, D. and Cohn-Vossen, S. (1932). Geometry and the imagination. English translation 1952. Chelsea, New York.
Johnston, S. F. (2001). A history of light and colour measurement. Institute of Physics Publishing, Bristol.
Wyszecki, G. and Stiles, W. S. (1967). Color science (1st edn). Wiley, New York.
chapter 2
LIGHT ADAPTATION, CONTRAST ADAPTATION, AND HUMAN COLOUR VISION
Michael A. Webster
Preface
I recently had the chance to watch a hypnotist perform at a county fair. To fend off skeptics, he began the show with a remarkable demonstration of the power of suggestion. We were first told to stare at a spinning spiral that appeared to be steadily expanding. (This ‘Wheel of Artemis’ had apparently been devised centuries ago as an aid to hypnosis.) He then commanded us to look into his eyes and succumb to the suggestion that his head was shrinking. The illusion of his contracting face was overwhelming, and from that moment the audience was hooked! As an illustration of his own hypnotic powers, I would have found it more compelling had he instructed us to see his head, like the wheel, also expanding. But the power of the stimulus was undeniable. It left me thinking that the environment is always presenting suggestions to us, and that we are constantly under its spell. The spell takes hold of us through the medium of adaptation. In this chapter I wanted to consider how adaptation alters visual sensitivity, and what this can tell us about our colour vision. Perception is so malleable that we can easily change the state of adaptation in the lab. Artificially manipulating these states can provide important insights into the neural representation of colour (though at times what I think is an insight others might view as a misguided bias!). A particularly powerful feature of colour adaptation is that simple changes in the stimulus can tap into very different levels of the visual system and reveal very different representations, and I have focused on my own interests in colour coding in the cortex. Given that the visual system adapts so readily to the patterns of stimulation before it, I also wanted to consider how our colour vision is shaped by adaptation to the colours we encounter in the natural environment. 
We actually know very little about this, but I hope this chapter might stimulate some interest. The motion after-effect induced by spinning wheels is among the most striking of visual illusions, yet we appreciate it only because of the sudden switch from a moving to a static stimulus, and often fail to notice that our visual system has changed during the inducing movement. In the same way, my guess is that we often fail to recognize the tremendous hold that the world exerts on our vision as we adapt to the patterns it presents us. We are only beginning to explore what the relevant patterns of colour and form in images might be, but I would risk a further misguided bias to suggest that adaptation to these will prove to be one of the fundamental factors determining what we see. Indeed, it may be that most of what we notice about the world will turn out to be a visual after-effect. M. A. Webster
Introduction
One of the most remarkable properties of visual coding is that it changes, or adapts, its properties in response to specific properties of the prevailing stimulus. These adjustments allow the visual system to follow and tune for the ever-varying characteristics of the visual environment. Adaptation adjusts sensitivity at multiple levels of the visual system and to multiple aspects of the stimulus. The patterns of these sensitivity changes reveal much about how the visual system is organized to represent information. Indeed, adaptation effects have been called the ‘psychologist’s microelectrode’ (Frisby 1979) because they have proven to be one of the most powerful tools for probing the nature of visual mechanisms. In this chapter we will examine how adaptation can be used to explore different levels of colour processing. A second theme is to explore the nature of the adaptation processes themselves. Adaptation is inherent in the visual response to any stimulus, and thus the workings of the visual system can only be understood within the context of the prevailing adaptational states. Moreover, as several chapters in this book illustrate, adjustments in these states play a critical role in many visual functions (far more, perhaps, than we currently understand). The study of adaptation is thus of great importance in its own right, because it is central to an understanding of the kinds of information that the visual system is designed to extract, and because its properties determine the capacities and limits of our perception. In thinking about adaptation in the context of colour vision, it is instructive to consider that the colour of any stimulus can be decomposed into two components: the overall mean colour, and the variations in colour relative to the mean. These two components are tied to two fundamentally different forms of visual adaptation. 
Light adaptation adjusts sensitivity to the mean luminance and chromaticity averaged over some time and region of the image, and produces mean shifts in colour perception. Contrast adaptation adjusts sensitivity according to how the ensemble of luminances and chromaticities are distributed around the mean, and instead alters colour appearance by changing the perceived contrast along different directions in colour space (Fig. 2.1). In the following sections we first review the general characteristics of these two distinct forms of visual adjustment, and then consider how they combine to influence colour perception. This will lead, in the final sections, to a consideration of how the two forms of adaptation adjust our colour vision to the natural visual environment.
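The decomposition of a stimulus into its mean colour and the variations around that mean can be sketched numerically. The following is a minimal illustration, not a model taken from the chapter: colours are assumed to be points in a two-dimensional chromatic plane (an L axis and an S axis), and the function names `decompose` and `max_variance_axis_deg` are hypothetical. The mean is the quantity that light adaptation is said to track; the axis of greatest variance is where contrast adaptation would be expected to reduce sensitivity most.

```python
import math


def decompose(colours):
    """Split a list of (l, s) points into their mean and the deviations from it."""
    n = len(colours)
    mean = (sum(c[0] for c in colours) / n, sum(c[1] for c in colours) / n)
    deviations = [(c[0] - mean[0], c[1] - mean[1]) for c in colours]
    return mean, deviations


def max_variance_axis_deg(deviations):
    """Angle in degrees (0 to 180) of the axis along which the deviations vary most."""
    n = len(deviations)
    sxx = sum(dx * dx for dx, _ in deviations) / n
    syy = sum(dy * dy for _, dy in deviations) / n
    sxy = sum(dx * dy for dx, dy in deviations) / n
    # Orientation of the principal axis of the 2 x 2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.degrees(angle) % 180.0


# Illustrative stimulus: colours spread along a diagonal through (1.0, 2.0).
stimulus = [(1.0 + t, 2.0 + t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
mean, deviations = decompose(stimulus)
print("mean colour:", mean)
print("axis of highest variance (deg):", max_variance_axis_deg(deviations))
```

An axis is reported modulo 180 degrees, so a result near 45 describes the same line as the 45–225 degree axis.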
Light adaptation
That our vision adjusts to the ambient light level is common knowledge. Several minutes may be required to adapt to a dark theatre, and stars disappear against the background of daylight. The dependence of visual sensitivity on average light level can be characterized by measuring the minimum light increment visible on backgrounds of different intensities. Figure 2.2 shows an example of these threshold-versus-intensity (t.v.i.) curves. Vision is limited in the dark by the absolute threshold of the system, but becomes progressively desensitized as the background intensity increases. For large fields and long flashes, the slope of the curve is close to 1.0, so that the just-visible increment is a constant proportion of the background level and thus represents constant sensitivity to contrast (i.e. ΔI/I = constant, known as Weber’s law). However, thresholds for small, brief flashes yield shallower slopes that instead approach a limit of 0.5 (Barlow 1972).
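The shape of a t.v.i. curve can be sketched with a simple formula. The functional form used here, ΔI = k (I + I0)^n, is an assumption chosen for illustration and is not taken from the chapter: I0 is a hypothetical ‘dark light’ term that produces the absolute-threshold plateau, k is a hypothetical Weber fraction, and the exponent n is set to 1.0 for the Weber case or 0.5 for the shallower limit. All parameter values are arbitrary.

```python
import math


def increment_threshold(background, dark_light=1e-3, k=0.1, exponent=1.0):
    """Just-visible increment on a background of the given intensity.

    Assumed form: delta_I = k * (I + I0) ** exponent.
    exponent=1.0 gives Weber behaviour on bright backgrounds;
    exponent=0.5 gives the shallower square-root behaviour.
    """
    return k * (background + dark_light) ** exponent


def tvi_slope(low, high, **params):
    """Slope of the t.v.i. curve between two backgrounds on log-log axes."""
    rise = (math.log10(increment_threshold(high, **params))
            - math.log10(increment_threshold(low, **params)))
    run = math.log10(high) - math.log10(low)
    return rise / run


for label, exponent in (("large, long test", 1.0), ("small, brief test", 0.5)):
    dark = tvi_slope(1e-7, 1e-6, exponent=exponent)
    bright = tvi_slope(10.0, 100.0, exponent=exponent)
    print(f"{label}: slope near darkness {dark:.3f}, on bright backgrounds {bright:.3f}")
```

With the exponent at 1.0 the ratio of threshold to background settles at k once the background is well above I0, which is the constant contrast sensitivity that Weber’s law describes.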
[Figure 2.1: three panels, (a), (b) ‘Light adaptation’ and (c) ‘Contrast adaptation’, each plotted on an L axis (–L to +L) and an S axis (–S to +S). Labels in the panels mark the ‘mean’, the ‘contrasts’, and ‘white’.]
Figure 2.1 (a) Two different types of adaptation adjust to two different properties of the stimulus. (b) Light adaptation shifts the mean perceived colour toward white. (c) Contrast adaptation reduces sensitivity according to how the individual colours are distributed around the mean. The largest sensitivity losses occur along the axes with the highest variance (the 45–225° axis in this example).
[Figure 2.2: log increment threshold (vertical axis, –1 to 3) plotted against log background radiance (horizontal axis, –4 to 0). One curve is labelled ‘short duration, small area’ and the other ‘long duration, large area’; an inset shows the increment (inc) on the background (bkgd).]
Figure 2.2 Threshold-versus-intensity (t.v.i.) curves illustrate losses in sensitivity resulting from light adaptation as background light level increases (Barlow 1972). However, sensitivity to contrast (ΔI/I) is maintained (large stimulus) or improved (small stimulus).
[Figure 2.3: (a) log increment threshold (475 nm) plotted against log background radiance (550 nm), with one branch labelled ‘blue-sensitive’ mechanism and the other ‘green-sensitive’ mechanism. (b) Log relative sensitivity plotted against wavelength (400–700 nm) for the S, M, and L cones, with 475 nm and 550 nm marked.]
Figure 2.3 Two-colour thresholds. (a) The t.v.i. curve for a blue (475 nm) test on a yellow–green (550 nm) background reveals two branches reflecting light adaptation in different colour-selective mechanisms (Stiles 1961). These ‘pi’ mechanisms approximate the form of the cone spectral sensitivities (b) (from Stockman et al. 1993), but also reflect post-receptoral influences.
In a classic series of experiments Stiles (1959) measured t.v.i. curves for tests and backgrounds that differed in wavelength. Figure 2.3a shows an example of these ‘two-colour thresholds’, for the case of a 475 nm test on a 550 nm background. Threshold follows the normal t.v.i. curve but then there is a break followed by a new plateau from which threshold again rises. The two branches suggest that thresholds are limited by two different mechanisms with different spectral sensitivities. While we will see below that the interpretation of these curves is more complex, we can understand the general flavour of these results by relating the sensitivity changes to the spectral sensitivities of the cone receptors, as shown in Fig. 2.3b. Due in part to their sparsity, the short-wavelength (S) cones have a much higher threshold than the long- (L) and medium-wavelength (M) cones, so that in the dark the M cones are most sensitive to the 475 nm test and limit threshold. Yet the L and M cones are also much more sensitive to the 550 nm background, so that as background intensity rises they adapt more than the S cones. Eventually the greater adaptation of the M cones renders them less sensitive to the test, and the thresholds then follow the t.v.i. curve for the S cones. By analysing a large combination of tests and backgrounds, Stiles showed that to a first approximation the thresholds could be accounted for by independent adaptation in a small set of ‘pi mechanisms’. The t.v.i. curves for individual mechanisms have a constant shape, but are shifted vertically or horizontally, depending on their sensitivity to the test or background wavelength, respectively. The two-colour threshold experiments show that light adaptation is selective for the chromatic properties of the stimulus (and in this context is referred to as chromatic adaptation). Suppose this selectivity arises because light adaptation occurs independently within each class of cone. 
A specific formulation of this hypothesis is known as von Kries
adaptation: adaptation adjusts responses in the three cone types separately and is equivalent to multiplying their fixed spectral sensitivities by a scaling constant (von Kries 1970). If the scaling weights (von Kries coefficients) are inversely proportional to the absorption of light by each cone type, then von Kries scaling maintains a constant mean response within each cone class. This effectively discounts the mean colour of the illuminant, so that the cone signals instead convey the differences, or contrasts, relative to the mean (see Whittle, Chapter 3, this volume). For example, a red adapting light might stimulate the L cones more than a green light, but adaptation to the red light will be proportionately greater, so that after complete von Kries adaptation the average response to the two lights is the same. We will see below that this provides a simple yet powerful mechanism for maintaining the perceived colour of objects despite changes in illumination (see also Chapters 7, 8, and 10, on colour constancy). Under a number of conditions von Kries scaling provides a good account of the effects of light adaptation on colour sensitivity and appearance (e.g. Brainard and Wandell 1992; Chichilnisky and Wandell 1995; Webster and Mollon 1995), yet it does not provide a complete account. Early measurements of chromatic adaptation and colour appearance suggested that the colour changes could be complex and could not be explained by receptor changes alone (Wyszecki 1986). In terms of sensitivity, the pi mechanisms identified by Stiles cannot reflect independent cone types, because more than three were required to account for all conditions, and the spectral sensitivities are broader than estimates for individual cones (although under certain conditions the curves for some pi mechanisms closely approximate receptor spectral curves). 
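The von Kries scaling described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the cone excitations are invented numbers, the function names `von_kries_coefficients` and `adapt` are hypothetical, and the effect of an illuminant on a surface is assumed to be a separate scaling of each cone signal, which is a simplification.

```python
def von_kries_coefficients(adapting_lms):
    """Scaling weights inversely proportional to the mean L, M, S excitations."""
    return tuple(1.0 / value for value in adapting_lms)


def adapt(lms, coefficients):
    """Scale each cone signal independently by its own coefficient."""
    return tuple(c * value for c, value in zip(coefficients, lms))


# Invented mean cone excitations (L, M, S) under two adapting lights.
red_light = (0.8, 0.3, 0.05)
green_light = (0.4, 0.6, 0.10)

# After complete adaptation the mean response is the same for both lights.
print(adapt(red_light, von_kries_coefficients(red_light)))
print(adapt(green_light, von_kries_coefficients(green_light)))

# A surface is assumed to scale each cone signal of the illuminant separately.
surface = (0.5, 0.2, 0.9)
for light in (red_light, green_light):
    reflected = tuple(s * i for s, i in zip(surface, light))
    print(adapt(reflected, von_kries_coefficients(light)))
```

Under these assumptions the adapted signals for the surface are the same under both lights, which is the sense in which the scaling discounts the illuminant. The shortcomings discussed in the surrounding text are reasons not to read the sketch as a complete account.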
Pugh and Mollon (1979) showed that the different short-wavelength pi mechanisms could be explained by assuming two sites of S cone adaptation, one cone-specific and the second resulting from polarization in opponent channels. Polarization effects also occur under conditions where detection is mediated by L and M cones (Boynton and Kambe 1980; Stromeyer et al. 1985). This second-site adaptation is consistent with a variety of chromatic adaptation phenomena (e.g. additivity failures for adapting backgrounds and ‘transient tritanopia’, an increase in short-wavelength threshold when long-wave backgrounds are turned off) (Pugh and Mollon 1979), and is a major source of evidence for colour opponency in post-receptoral channels.
A related shortcoming of von Kries adaptation is that it explains only part of the way in which the visual system adjusts to adapting backgrounds (Shevell 1978). In addition to setting the gain of the visual system, backgrounds also add light physically to any increment presented on them. However, large, steady backgrounds have much less effect on colour appearance than would be predicted from the added light. For example, suppose a light that appears unique yellow is flashed as a spot on a large red background. The background may add some redness to the flash, but much less than predicted by the physical mixture. Thus the colour appearance depends much more on the light coming from the flash than from the background. To a large extent, the visual system discounts or subtracts out the background so that the response is primarily to spatial and temporal transients.
The work of several authors has led to a detailed account of how multiplicative and subtractive processes combine to control visual sensitivity (see Walraven et al. 1990). The different processes are revealed psychophysically by elaborating on the threshold-versus-intensity experiment to measure, for each single adapting background, the threshold for
[Figure 2.4: (a) Response (0.1 to 1.0) plotted against log flashed background radiance, with an inset showing the time course of the increment (inc), the flashed background, and the steady background. (b) Log increment threshold plotted against log flash background radiance. Curves in both panels are labelled ‘no adaptation’, ‘multiplicative’, ‘multiplicative and subtractive’, and ‘no background’.]
Figure 2.4 The effects of multiplicative and subtractive sensitivity changes on the visual response (a) and incremental sensitivity (b) (Walraven et al. 1990). The dark-adapted (no steady background) responses follow a compressive non-linearity that gives a limited operating range leading to response saturation for bright flashes. With no adaptation, backgrounds would add to the response leading to rapid saturation. Multiplicative adaptation attenuates the response to both the steady background and the probe and flash, recovering part of the response range and thus increasing sensitivity. Subtractive adaptation selectively attenuates the response to the steady background, preserving the response range for signalling the transients.
detecting a probe superposed on brief flashes that vary over a range of intensities. Figure 2.4 shows a model of the visual responses to the probe and backgrounds, and how these are manifest in the measured probe–flash curve. The curves are generated by assuming that the threshold for the probe represents a fixed increment in response above the response to the flash. At any given adaptation level, the dynamic range of the visual system is limited and follows a compressive non-linearity. Thus in the dark (no background) the incremental response to the probe becomes progressively weaker as flash intensity increases, and the response eventually saturates. If the visual system did not adapt, then adding a steady background would simply add a baseline response, compressing the dynamic range available for signalling the flashes. Backgrounds could thus greatly reduce sensitivity by themselves saturating the system. However, multiplicative adaptation restores part of the dynamic range by attenuating the response to both the background and the flashes, an effect equivalent to shifting the dynamic range by placing neutral density filters or ‘dark glasses’ (MacLeod 1978) in front of the eyes. Subtractive processes further restore the dynamic range available for flashes by selectively attenuating the response to the background, an effect equivalent to high pass filtering. If the background is discounted completely, then the full response range is preserved for signalling the flashes, and the background serves only to set the gain of the system. This two-stage theory also provides a good account of how simple adapting backgrounds affect the perceived colour and brightness of lights (Whittle and Challands 1969; Jameson and Hurvich 1972; Wei and Shevell 1995). The processes of light adaptation occur primarily within the retina (Shapley and Enroth-Cugell 1984; Hood 1998). 
At high light levels one form of multiplicative adaptation is provided by photopigment bleaching, which scales photon capture in the receptors and is, in fact, the mechanism that protects the cone response from saturating at bright
backgrounds. (The rod system saturates below intensities where bleaching becomes important.) However, sensitivity changes are pronounced at light levels too low to produce significant bleaching. The site of this neural gain control is uncertain. At least some gain changes are largely cone-specific (e.g. Chaparro et al. 1995; Chichilnisky and Wandell 1995; Webster and Mollon 1995) and the adaptation appears to pool signals over areas no larger than the diameter of individual cones (Cicerone et al. 1990; MacLeod et al. 1992). Both results point to an early locus of light adaptation that may be as early as the receptors, as suggested also by electroretinogram measurements (Seiple et al. 1992). However, adaptation within individual cones is not well established physiologically, and appears too weak to account for psychophysical changes in sensitivity (Schnapf et al. 1990). Moreover, there appears to be more than one site of multiplicative scaling. Some of the gain changes are extremely rapid while others take seconds or even minutes to asymptote (Hayhoe and Wenderoth 1991; Fairchild and Reniff 1995). These may have different time constants for luminance and chromatic stimuli, thus pointing to post-receptoral sites. There also appears to be more than one process underlying subtractive adaptation. Spatial subtraction is consistent with the centre-surround antagonism of retinal cells, which filters out the response to large uniform areas and thus emphasizes spatial transients (Hayhoe 1990). Visual acuity is better at brighter light levels, and this suggested that receptive field properties might adapt to different light levels (with surround strength increasing at higher intensities). However, these differences can also be explained by purely local light adaptation (Chen et al. 1987) or by detection within different spatially selective channels (Kortum and Geisler 1995). 
If spatial subtraction does depend on static receptive field profiles, then it is not a form of adaptation in the strict sense of this review (because it would not reflect a change in response properties). However, there is also a form of temporal subtraction that leads to a loss in response to the steady-state background over time. Unlike spatial subtraction, which is very rapid, this temporal subtraction requires many seconds to asymptote (Hayhoe et al. 1992).
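The two-stage account of multiplicative and subtractive adaptation described in this section can be sketched as a toy calculation. It is a sketch under stated assumptions and not the model of Walraven et al. (1990): the compressive non-linearity is assumed to be the saturating function x / (x + σ), the probe threshold is the increment needed to raise the response by a fixed criterion, and every name and parameter value (`gain`, `subtract`, `sigma`, `criterion`) is hypothetical.

```python
import math


def response(x, sigma=1.0):
    """Assumed compressive non-linearity: rises with x and saturates at 1."""
    return x / (x + sigma)


def probe_threshold(flash, background=0.0, gain=1.0, subtract=0.0,
                    sigma=1.0, criterion=0.01):
    """Probe increment needed for a fixed rise in response above the flash.

    gain     -- multiplicative attenuation applied to all incoming light
    subtract -- fraction of the steady background that is discounted (0 to 1)
    """
    x = gain * (flash + (1.0 - subtract) * background)
    target = response(x, sigma) + criterion
    if target >= 1.0:
        return math.inf  # the response has saturated; no probe is detectable
    x_needed = sigma * target / (1.0 - target)  # inverse of the non-linearity
    return (x_needed - x) / gain                # back to physical light units


conditions = {
    "no adaptation": dict(background=10.0),
    "multiplicative": dict(background=10.0, gain=0.1),
    "multiplicative and subtractive": dict(background=10.0, gain=0.1, subtract=1.0),
}
for name, params in conditions.items():
    print(f"{name}: threshold on a flash of 1.0 = {probe_threshold(1.0, **params):.3f}")
```

In this sketch the probe threshold on a steady background falls at each step, from no adaptation, to multiplicative scaling alone, to multiplicative scaling with the background fully subtracted. That ordering mirrors the qualitative description above, although the sizes of the effects depend entirely on the assumed parameters.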
Contrast adaptation
As Fig. 2.1 illustrates, there is beyond light adaptation a second general form of perceptual adaptation that adjusts sensitivity to contrast. Individual examples may differ in their specific characteristics and thus involve different processes, but as a class, contrast adaptation effects share the property that they reflect adjustments to the structure or pattern in the stimulus. This structure may refer to the spatial or temporal properties of the stimulus (e.g. the orientation of a grating or the direction of movement) or, as we focus on here, to the colour properties of the stimulus. Blakemore and Sutton (1969) noted that exposure to the adapting pattern can cause four general classes of after-effects: (1) contrast thresholds for detecting similar patterns are elevated; (2) above threshold, the apparent contrast of similar patterns is reduced; (3) stimuli that differ from the adapting pattern are distorted in appearance away from the adapting stimulus; and (4) a ‘neutral’ test stimulus may appear to take on the complementary property of the adapting stimulus. Examples of the perceptual distortions induced by adaptation include the tilt after-effect (in which viewing an oblique line causes a vertical line to appear tilted in the opposite direction; Gibson and Radner
1937), classical figural after-effects (e.g. changes in the perceived shape of simple patterns; Köhler and Wallach 1944), and shifts in perceived size or spatial frequency (Blakemore and Sutton 1969). An example of a complementary after-effect is the motion after-effect, in which adaptation to rightward motion causes a stationary object to appear to drift to the left (Wohlgemuth 1911). In most experiments on contrast adaptation, observers are first exposed to the adapting stimulus for a short period (usually from a few to several minutes). The stimulus is then turned off and a test stimulus presented briefly (usually for less than 1 second). The long exposure and then removal of the adapting stimulus prior to testing is what distinguishes adaptation from masking paradigms (Graham 1989). Masks alter sensitivity to test stimuli by changing the response level of the visual system, whereas the contrast adaptation stimulus is thought to instead change the ‘responsiveness’ of the system, without producing its own response during the test. To maintain a constant state of adaptation, it is common to interleave repeated tests with brief re-adaptation intervals (usually several seconds). Most adaptation effects build up rapidly with only a few minutes exposure, and may decline exponentially when the adapting stimulus is removed (e.g. Greenlee et al. 1991). However, some after-effects can show very long persistence if observers are not re-adapted to a competing stimulus (Stromeyer 1978). The after-effects of contrast adaptation reflect a selective loss in sensitivity to the adapting stimulus. 
Thus adaptation to a vertical, 2 c/deg luminance grating will reduce sensitivity to similar patterns but, in general, would have little influence on the thresholds for stimuli that differed by more than 45° from vertical (Gilinsky 1968), or whose spatial frequency was more than about 1 octave above or below 2 c/deg (Blakemore and Campbell 1969), or for stimuli that varied in colour rather than luminance (Bradley et al. 1988). Selective adaptation has been well established for each of the primary attributes of visual stimuli (i.e. form, motion, depth, and colour). The adaptation is also specific to retinal location, but not to the pattern of activity in individual cones. In fact, for spatial adapting patterns it is a frequent procedure to move the stimuli over the retina (or ask observers to move their eyes over the stimulus) in order to avoid afterimages owing to local light adaptation. The common interpretation of contrast adaptation is that it reflects sensitivity changes in channels tuned to specific features of the visual stimulus, and that the specificity of the adaptation gives a measure of the tuning function. Indeed, in the majority of studies involving contrast adaptation, the adaptation itself is of little direct interest and instead serves only as a tool for measuring the selectivities of visual channels. While much of the adjustment in light adaptation appears to occur in the retina, three sources of evidence point to a cortical locus for contrast adaptation. First, most contrast adaptation effects show significant interocular transfer (so that adapting with one eye affects sensitivity for targets viewed with the other eye) and signals from the two eyes first converge in the striate cortex (Blake et al. 1981). Secondly, the selectivity of the after-effects often appears to parallel the receptive field properties of cortical cells. For example, the striate cortex is the first level of the primate visual system to exhibit directional selectivity and strong orientation selectivity, implying that the cortex is the site of motion and tilt after-effects. Thirdly, physiological studies have found strong response changes in cortical cells as a result of prior exposure to contrast, while contrast adaptation in retinal and geniculate
[Graphs for Figure 2.5 not reproduced. Panels (a) and (b). Axis labels: test contrast, matching contrast, log change in contrast. Curve labels: pre-adapt, post-adapt, sub, mul, L–M, S.]
Figure 2.5 (a) Apparent contrast of spatial gratings as a function of physical contrast, after adapting to a high-contrast grating (Blakemore et al. 1971). (b) Changes in perceived contrast of chromatic stimuli, after adapting to temporal modulations along different chromatic axes (Webster and Mollon 1994). Filled symbols show contrast losses along the adapting axis, open symbols along the orthogonal colour direction. For both the spatial and temporal contrast, the effects of adaptation on perceived contrast are proportionately larger for lower test contrasts.
cells appears to be much weaker (Maffei et al. 1973; Ohzawa et al. 1982; Albrecht et al. 1984; Saul and Cynader 1989; Sclar et al. 1989; but see Smirnakis et al. 1997). While collectively these results point to the striate cortex as the earliest principal locus for contrast adaptation, they do not preclude a role in adaptation of more central, extrastriate areas (Wenderoth et al. 1988; Paradiso et al. 1989). In general, higher-contrast adapting stimuli lead to larger sensitivity losses, and tend to have larger effects on lower-contrast test stimuli. Yet the specific form of the response change is not well defined, and may vary considerably with the specific task. One way to examine the form of the sensitivity change is to measure the changes in apparent contrast over a range of test contrasts. Figure 2.5a shows results from Blakemore et al. (1971). Adaptation to a high-contrast luminance grating lowered perceived contrast, but only for test gratings with physical contrasts lower than the adapting contrast. The sensitivity losses are intermediate to the contrast changes predicted by a subtractive or multiplicative change in apparent contrast, and instead closely follow a power function (i.e. on log–log axes the main effect of the adaptation is to change the slope of the line relating the match and test contrasts). Webster and Mollon (1994) obtained similar results following adaptation to chromatic contrast, presented as a temporal modulation in a uniform field (Fig. 2.5b). However, at lower adapting contrasts the changes in apparent contrast may more closely approximate a subtractive change (Georgeson 1985), and again this pattern appears to be similar for luminance and chromatic contrast (Webster et al. 1987). A second method for exploring the response changes is by examining how contrast adaptation alters the ability to discriminate changes in contrast. 
In analogy with the probe-flash paradigm, this typically involves measuring (usually for spatial gratings) the threshold for detecting an increment in contrast (probe) as a function of pedestal contrast (flash)
after adapting to contrast (background). However, unlike studies of light adaptation, the adapting stimulus is typically turned off before each presentation of the test stimulus. Before adaptation, contrast discrimination in gratings is characterized by a ‘dipper’ function, with facilitation near threshold and decreasing sensitivity above threshold that follows a power function (with exponents between 0.5 and 1.0) (Legge and Foley 1980). Numerous studies have examined how this function is changed by adaptation, but results have varied widely (e.g. Barlow et al. 1976; Greenlee and Heitger 1988; Maattanen and Koenderink 1991; Ross et al. 1993; Wilson and Humanski 1993; for analogous measurements for temporal contrast, see also Shapiro and Zaidi 1992). Figure 2.6a shows results from Webster et al. (1987) for either luminance or chromatic gratings. The contrast discrimination functions are very similar for luminance and colour (Switkes et al. 1988), and prior adaptation affected both functions in similar ways, by elevating discrimination thresholds at low pedestal contrasts but not at moderate to high contrasts. Webster et al. (1987) also measured how contrast adaptation influenced contrast discrimination when the test, mask and adapting grating could each be defined by either luminance contrast or chromatic contrast. Thresholds for chromatic contrast are strongly affected by the presence of a luminance pedestal, and vice versa, yet there is little cross-adaptation between luminance and chromatic gratings (Bradley et al. 1988; Switkes et al. 1988). Thus stimuli could be chosen so that adaptation affected only the pedestal or only the test. (For example, for chromatic tests on luminance pedestals, luminance adaptation altered sensitivity to the pedestal contrast while chromatic adaptation altered sensitivity to the test.) The results, shown in Fig. 2.6b and c, suggested that adaptation had little effect on high (pedestal) contrasts (thus resembling a subtractive sensitivity change), but approached a multiplicative change at very low (test) contrasts, and this is qualitatively consistent with the effects of adaptation on perceived contrast. The results also suggest that contrast adaptation occurs at, or prior to, the site of the simultaneous interactions between luminance and colour, thus supporting a cortical locus for the simultaneous interactions. For example, if the facilitation of colour thresholds by luminance pedestals (Fig. 2.6b) occurred prior to the site of the adaptation, then reducing sensitivity to the luminance pedestal by luminance adaptation should not have altered the colour thresholds.
[Plots for Figure 2.6 not reproduced. Stimulus sequence labels: adapt, re-adapt, test, pedestal, time. Axis labels: test contrast threshold against pedestal contrast (× threshold) in (a); chromatic contrast threshold against luminance pedestal contrast (× threshold) in (b); log change in luminance threshold against chromatic pedestal contrast (× threshold) in (c). Curve labels: no adapt, adapt, lum adapt, chrom adapt, sub, mult.]
Figure 2.6 Contrast discrimination before or after contrast adaptation, measured with spatial gratings (Webster et al. 1987). The three panels plot the contrast thresholds for a test grating superposed on a pedestal grating, before or after adaptation to a grating. Different panels plot the results when each component (adapt, pedestal, or test) was either a luminance grating or a colour grating. (a) Adapt, pedestal, and test gratings were all luminance (circles) or all chromatic (triangles). Before adaptation, thresholds follow a dipper function that is similar for luminance and colour (Switkes et al. 1988). Adaptation raises the threshold at low pedestal contrasts but not at high. Lines plot the changes predicted by a subtractive (dashed) or multiplicative (solid) sensitivity change in the contrast response function. (b) Thresholds for detecting a colour grating superposed on a luminance pedestal grating. Moderate-contrast luminance pedestals facilitate colour detection (unfilled circles) (Switkes et al. 1988). Adapting to a chromatic grating selectively affects chromatic sensitivity and raises thresholds on all pedestals by roughly a constant factor (unfilled squares and diamonds, for two different contrasts of the adapting colour grating). Adapting to a luminance grating selectively affects luminance sensitivity and reduces the facilitation of colour tests by luminance pedestals, but only at low pedestal contrasts (filled triangles, for two different contrasts of the adapting luminance grating). (c) Thresholds for a luminance test superposed on a chromatic pedestal show masking (unfilled circles) (Switkes et al. 1988). Adaptation to a luminance grating raises thresholds for the luminance test (unfilled diamonds), while adaptation to a chromatic grating does not alter the luminance thresholds (filled triangles). The results for all conditions are consistent, with similar but separable adaptation effects for luminance and colour that have weak effects on high-contrast (pedestal) targets but strong effects, approaching a multiplicative sensitivity change, on very low-contrast (test) targets.
Contrast adaptation and colour vision
Contrast adaptation was first used to explore the mechanisms of colour vision by Krauskopf and colleagues (Krauskopf et al. 1982, 1986a) and Guth and colleagues (Benzschawel and
Guth 1982; Guth 1982; Guth and Moxley 1982). Krauskopf et al. (1982) measured the thresholds for detecting brief changes in the colour and luminance of a white field after adapting to temporal modulations of colour and luminance in the field. For example, during adaptation, observers viewed a field that might slowly flicker in luminance or colour (e.g. along a reddish–greenish axis). The flicker was then briefly extinguished and test pulses of luminance or colour were presented in the same field. The sequence of re-adaptation and test was continued while observers set thresholds for detecting the test pulses. The adapting modulations varied along different axes in colour–luminance space but all had the same average luminance and chromaticity, so that the state of light adaptation (to the time-averaged stimulus) remained constant. Krauskopf et al. (1982) found that the sensitivity losses following adaptation were primarily selective for three stimulus directions: an achromatic axis (that varies in luminance but not colour), and two chromatic axes that vary in opposing signals in the L and M cones (L vs M) or opposing signals in the S versus the L and M cones (S vs LM). For example, after adapting to modulations along the L vs M axis, thresholds were elevated for detecting colour changes along the L vs M axis, but not for changes along the S vs LM or achromatic axes. Such results bolstered earlier suggestions that these stimulus directions represent critical dimensions in colour coding (e.g. Le Grand 1949; MacLeod and Boynton 1979; Boynton and Kambe 1980). However, a second critical finding was that selectivity was not limited to these axes. Adaptation to intermediate axes tended to produce the largest threshold elevations along the adapting axis (Krauskopf et al. 1986a), and Guth (1982) showed that perceived hue was always biased away from the adapting axis. 
Both results were inconsistent with sensitivity changes in only three channels tuned to fixed directions in colour space. Webster and Mollon (1991, 1994) extended the adaptation paradigm of Krauskopf et al. to examine how contrast adaptation influences suprathreshold colour appearance. In their studies, observers adapted to modulations in a 2-deg test field placed to one side of a fixation point. Brief test stimuli were then interleaved with the adapting modulations, and were matched by adjusting the colour and luminance of a matching stimulus, presented simultaneously with the test but in a neutral matching field placed on the other side of fixation (Fig. 2.7). To illustrate how contrast adaptation can be used to explore the properties of colour channels, consider the changes in colour appearance predicted by a conventional model of colour coding. Figure 2.8a illustrates a standard two-stage model, in which the signals from the three classes of cone receptor are combined to form three post-receptoral channels that encode luminance or L vs M or S vs LM chromatic contrast. Figure 2.8b shows the volume of colour–luminance space in terms of the stimulus dimensions that isolate each channel (Derrington et al. 1984). Here we consider how this volume could be distorted by adaptation. Because the stimulus modulations (around a fixed mean colour) maintain the visual system in a constant state of light adaptation, the colour changes induced by the adaptation should reflect directly the sensitivity changes in the second-stage mechanisms. Figure 2.8c shows the changes in perceived colour that would be predicted by independent gain changes within the two chromatic channels tuned to the L vs M and S vs LM axes. Adaptation could reduce perceived contrast (‘saturation’) by reducing sensitivity in one or both of the channels. However, the contrast losses should always be greatest or least along the L vs M or S vs LM axes (depending on which channel is adapted more). Most
[Diagram for Figure 2.7 not reproduced. Labels: adapt and test field, fixation cross (+), match field, chromaticity, DC + AC adaptation, test, DC, 1 s, match, Illuminant C.]
Figure 2.7 Spatial and temporal arrangement of the asymmetric matching procedure used to examine contrast adaptation (AC) or the combined effects of light adaptation and contrast adaptation (DC + AC).
stimuli produce responses in both of the channels. Adaptation could change the perceived direction (‘hue’) of these stimuli whenever it reduced the response in one channel more than the other; perceived hue should always rotate away from the axis of the more strongly adapted channel and toward the axis of the less strongly adapted channel. Note that these hue changes are analogous to tilt after-effects, because they represent tilts in the perceived orientation of axes within colour space. However, stimuli that differ from the background only along the L vs M or S vs LM axes might appear desaturated but should not change in perceived direction, because these stimuli isolate a single channel, and thus adaptation could not change the relative responses to these stimuli across the two channels. Figure 2.9 shows the actual colour changes produced by adaptation to eight different directions within the equiluminant plane. Each panel plots the matches to the same set of test stimuli after adapting to modulations along two different axes, 90° apart. In each case the saturation losses are largest along the adapting axis, and weakest along an axis approximately 90° away. Thus adaptation produces changes in perceived contrast that apparently can be selective for any chromatic direction. Similar selective changes were found when the adapting and test stimuli varied in both colour and luminance, even when the chromatic variations were chosen to modulate signals only in the S cones. Thus the adaptation effects can show clear selectivity for how luminance contrast and S-cone contrast co-vary, even though S cones make little contribution to conventional measures of luminance sensitivity (Lennie et al. 1993). The sensitivity changes produced by contrast adaptation thus are clearly inconsistent with models of colour vision that assume only three discrete post-receptoral channels that adapt independently.
Figure 2.10 illustrates how adaptation altered the perceived direction of the test stimuli. The left panel plots the angular difference between each test and its matching coordinates following adaptation to modulations along a single adapting axis (L vs M), while the right panel instead shows the effects of different adapting directions on a single test direction (the L vs M axis). In each case the appearance of test stimuli shifted away from the adapting
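[Diagrams for Figure 2.8 not reproduced. (a) S, M, and L cone signals combined into S – (L + M) ‘blue–yellow’, (L – M) ‘red–green’, and (L + M) ‘white–black’ channels. (b) Colour space with axes +L (–M) at 0 and –L (+M) at 180, +S at 90 and –S at 270, and +Luminance at 90 and –Luminance at 270. (c) Test and match coordinates plotted over the range –18 to 18, with the horizontal axis labelled L–M.]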
Figure 2.8 (a) A standard two-stage model of human colour mechanisms. Signals from the three cone types are recombined to form one luminance and two chromatic channels. (b) All lights can be defined in terms of a colour space, the axes of which represent the stimulus variations that isolate each second-stage mechanism (after Derrington et al. 1984). (c) Plot of the equiluminant plane, showing the colour changes in a set of test stimuli (squares) predicted by independent adaptation in mechanisms sensitive to the L vs M and S vs LM axes (Webster and Mollon 1991). Adaptation may desaturate tests (so that they are matched by stimuli that plot closer to the origin), but the largest and smallest changes should occur along the cardinal axes. Adapting the L vs M axis more should cause the perceived direction of all stimuli to rotate away from the L vs M axis and toward the S vs LM axis. However, contrast adaptation to any axis should not alter the perceived direction of stimuli on either cardinal axis.
axis and toward a second axis approximately 90° away from the adapting axis, consistent with a selective sensitivity loss along the adapting axis. These hue shifts suggest that there is no test axis whose direction remains invariant across different adapting directions, and thus no axis that invariably isolates a single chromatic channel. Similar changes in perceived direction are again observed when adapting and test stimuli vary in both luminance and colour. For example, adaptation to a modulation between bright-red and dark-green causes a luminance increment to appear greenish and a luminance decrement to appear reddish, and causes an equiluminant red to appear darker while an equiluminant green appears brighter. Again, each change represents a rotation in perceived direction away from the adapting
[Plots for Figure 2.9 not reproduced. Four panels, (a) to (d), each plotting S against L–M over the range –18 to 18. Labelled directions: 45, 135, 225, 315.]
Figure 2.9 Matches to test stimuli (filled squares) following adaptation to eight different directions within the equiluminant plane (Webster and Mollon 1994). Saturation losses are always largest along the adapting axis, and weakest roughly 90° away. (a) Adaptation to the L vs M (unfilled circles) or S vs LM (triangles) axes. (b) Adapting axes of 22.5–202.5° (unfilled circles) or 112.5–292.5° (triangles). (c) Adapting axes of 45–225° (unfilled circles) or 135–315° (triangles). (d) Adapting axes of 67.5–247.5° (unfilled circles) or 157.5–337.5° (triangles). Dashed lines and small circles plot the matches predicted by a model based on multiple colour channels (see text).
axis, and suggests that pure luminance and pure chromatic stimuli do not invariably isolate pure luminance- or pure chromatic-sensitive mechanisms. Thus the adaptation effects are again inconsistent with conventional models of colour vision based on only three independently adaptable channels, and instead suggest that the central representation of colour involves either multiple channels or channels that can alter their tuning functions through adaptation.
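To make concrete the prediction that these results contradict, the conventional model of Fig. 2.8c can be sketched as two independent gains applied to the L vs M and S vs LM components of a test stimulus. This is a minimal sketch under stated assumptions: the function name, the linear channel responses, and the gain values (0.5 and 1.0) are hypothetical choices for illustration, not parameters from the studies cited.

```python
import math


def two_channel_match(test_angle_deg, test_contrast, gain_lm, gain_s):
    """Predicted match under independent gain changes in two cardinal channels.

    Angles are directions in the equiluminant plane (0 = L vs M, 90 = S vs LM).
    gain_lm and gain_s are hypothetical post-adaptation gains (1.0 = no loss).
    Returns (matched_angle_deg, matched_contrast).
    """
    theta = math.radians(test_angle_deg)
    lm = gain_lm * test_contrast * math.cos(theta)
    s = gain_s * test_contrast * math.sin(theta)
    return math.degrees(math.atan2(s, lm)), math.hypot(lm, s)


if __name__ == "__main__":
    # Adapt the L vs M channel more strongly than the S vs LM channel.
    for angle in (0, 45, 90):
        print(angle, two_channel_match(angle, 1.0, gain_lm=0.5, gain_s=1.0))
```

In this sketch an intermediate test (45°) rotates toward the S vs LM axis, while tests on either cardinal axis lose contrast without changing direction. The observed rotation of cardinal-axis tests after adaptation to intermediate axes is what such a model cannot produce.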
[Plots for Figure 2.10 not reproduced. Panels (a) and (b). Axis labels: angle change (match–test), match–test angle, test angle (°), adapting angle (°). Other labels: +LUM, –LUM, +L, –L, L–M, S.]
Figure 2.10 Changes in the perceived hue of test stimuli following contrast adaptation (Webster and Mollon 1994). (a) Adaptation to the L vs M axis causes all hues to rotate toward the S vs LM axis, but does not affect the perceived direction of stimuli on the L vs M or S vs LM axis. (b) However, any intermediate adapting direction rotates the perceived direction of L vs M tests away from the adapting axis, showing that the L vs M tests do not isolate a single mechanism.
Spatial factors in colour contrast adaptation
In the preceding results, observers adapted to temporal modulations in uniform 2° fields, yet similar after-effects occur for spatially varying adapting and test patterns, such as gratings, and demonstrate that colour contrast adaptation is also spatially selective. For example, Webster and Mollon (1993) found that adaptation to gratings with correlated luminance and chromatic contrast (e.g. bright-red/dark-green) produced large brightness differences between the red and green components of chromatic gratings and large hue differences between the bright and dark bars of achromatic gratings. Yet these after-effects were abolished when the same luminance and chromatic adapting components were presented as temporal modulations in a uniform field, suggesting at least coarse size tuning. Flanagan et al. (1989) showed that the hue changes induced by adaptation to chromatic gratings are orientation selective, and conversely, tilt and spatial frequency after-effects are colour selective (Elsner 1978; Favreau and Cavanagh 1981; Flanagan et al. 1990). These suprathreshold after-effects parallel results showing that changes in threshold sensitivity following colour contrast adaptation are spatially selective (Stromeyer et al. 1980; Bradley et al. 1988). The classic demonstration of the spatial selectivity of colour after-effects is the McCollough effect (McCollough 1965). After adapting to a vertical grating of red and black stripes alternated with a horizontal grating of green and black stripes, an achromatic vertical grating appears greenish and an achromatic horizontal grating appears reddish. Thus the adaptation induces an orientation-selective colour after-effect in the test grating. It is unclear whether such after-effects represent a special form of adaptation that is distinct from contrast adaptation. For example, compared to conventional contrast adaptation
effects, the McCollough effect is thought to have unusually long persistence and to exhibit less interocular transfer (Stromeyer 1978). Many theories have been proposed to account for the contingencies between form and colour in the McCollough effect (e.g. Stromeyer 1978; Dodwell and Humphrey 1990; Siegel and Allan 1992), yet the selectivity of colour contrast adaptation for any colour– luminance direction suggests an interpretation with a different emphasis (Webster and Malkoc 2000). Specifically, the colour changes in the after-effect may not depend on the pairing of orientation and colour per se, but rather on the pairing between orientation and colour–luminance direction. For example, adaptation to a bright-red grating may reduce selectively the sensitivity to bright red. An achromatic grating subsequently appears greenish because the achromatic axis has been ‘tilted’ away from the adapting axis (because the response distribution of colour–luminance mechanisms encoding whiteness is biased against bright-red and thus toward bright-green). The importance of the McCollough effect is in showing that the perceived colour changes are orientation-selective, consistent with the general spatial selectivity of contrast adaptation. Yet the response biases underlying the colour changes may reflect primarily how the visual system is organized to represent the volume of colour–luminance space, and how this volume is distorted by selective adaptation to different colour–luminance directions (Webster and Malkoc 2000).
Models of contrast adaptation
We have seen that colour contrast adaptation effects cannot be accounted for by standard models of colour vision based on three independent post-receptoral channels. This section considers two alternative models of post-receptoral colour vision that can predict more closely the pattern of colour changes. These differ from the standard model either by assuming many more than three channels that adapt independently, or by assuming channels whose tuning functions can be altered by adaptation-dependent interactions. The two models are thus based on different assumptions about the nature of contrast adaptation. Webster and Mollon (1994) examined whether the contrast adaptation effects they observed within the equiluminant plane were consistent with a model based on multiple chromatic channels that adapt independently. Their model is similar to conventional multiple-channel models of spatial contrast adaptation effects that have been proposed to account for phenomena such as the tilt after-effect or shifts in apparent size (spatial frequency) (Braddick et al. 1978). The logic of such models is illustrated in Fig. 2.11. The stimulus dimension (e.g. orientation) is assumed to be encoded by a large number of channels with overlapping tuning functions. A stimulus is therefore represented by the distribution of activity across a subset of channels. Adaptation desensitizes the channels that respond to the adapting stimulus, with more adaptation in channels that respond most. When a test stimulus is now presented after adaptation, the sensitivity to stimuli similar to the adapting stimulus is reduced, and stimuli near the adapting value appear shifted in appearance away from the adapting value, because the distribution representing the test has been skewed by the adaptation.
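The logic of a multiple-channel account can be sketched in a few lines, here applied to chromatic direction within the equiluminant plane rather than to orientation. Everything specific in the sketch is an assumption made for illustration: the 36 evenly spaced channels, the half-wave rectified cosine tuning, the linear gain loss (`loss`), and the population-vector read-out are hypothetical choices and are not the fitted model of Webster and Mollon (1994).

```python
import math

N_CHANNELS = 36  # hypothetical number of channels, evenly spaced over 360 deg


def channel_response(stimulus_deg, preferred_deg):
    """Half-wave rectified cosine tuning (an assumed tuning function)."""
    return max(0.0, math.cos(math.radians(stimulus_deg - preferred_deg)))


def decode(test_deg, adapt_axis_deg=None, loss=0.5):
    """Population-vector estimate of perceived direction and contrast.

    If adapt_axis_deg is given, each channel's gain is reduced in proportion
    to its response to either pole of the adapting modulation.
    Returns (perceived_angle_deg, vector_length).
    """
    x = y = 0.0
    for i in range(N_CHANNELS):
        pref = i * 360.0 / N_CHANNELS
        gain = 1.0
        if adapt_axis_deg is not None:
            drive = max(channel_response(adapt_axis_deg, pref),
                        channel_response(adapt_axis_deg + 180.0, pref))
            gain = 1.0 - loss * drive  # more adaptation where response is larger
        r = gain * channel_response(test_deg, pref)
        x += r * math.cos(math.radians(pref))
        y += r * math.sin(math.radians(pref))
    return math.degrees(math.atan2(y, x)), math.hypot(x, y)


if __name__ == "__main__":
    print(decode(30.0))                      # before adaptation
    print(decode(30.0, adapt_axis_deg=0.0))  # after adapting to the 0-180 axis
```

Under these assumptions a 30° test is read out at a larger angle after adaptation to the 0–180° axis (that is, shifted away from the adapting axis), and the loss in vector length is greatest for tests on the adapting axis and least for tests 90° away.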
[Diagram for Figure 2.11 not reproduced. Labels: channel sensitivities, relative activity, test, adapt, retest, orientation, perceived.]
Figure 2.11 Multiple channel models of contrast adaptation (Braddick et al. 1978). Adaptation to an oblique orientation reduces sensitivity to a subset of channels selective for similar orientations. This biases the distribution of responses to a vertical test line, causing it to appear tilted in the opposite direction.
Note that the predictions in Fig. 2.8c actually represent an extreme case of this model, in which there are only two linear chromatic channels. At the other extreme, the equiluminant plane might be spanned by an effectively uniform distribution of channels, each tuned to a different direction within the plane. Figure 2.12 suggests that the distribution that best fits the observed colour changes is somewhere in between—channels selective for intermediate directions are required to account for the selectivity of the adaptation for any chromatic direction, yet biases along the L vs M and S vs LM axes are evident. The dashed lines in Fig. 2.9 show that such multiple-channel models can closely approximate the observed colour changes. The channel distribution implied by this analysis is substantially broader than the spread of preferred directions estimated for cells in the parvocellular geniculate (Derrington et al. 1984), and thus could reflect the broader range of chromatic preferences found in striate cortex (Lennie et al. 1990). Several authors have suggested that adaptation does not reflect independent response changes within channels, but rather inhibitory interactions between channels (see Graham 1989). Barlow and Földiák (Barlow and Földiák 1989; Barlow 1990b) proposed a novel interpretation of contrast adaptation effects based on inhibition. They suggested that mutual inhibition builds up between two channels whenever their responses are correlated, and that this alters the tuning functions of the channels until their responses become statistically independent. For example, Fig. 2.13a illustrates a stimulus that produces co-varying
[Plot for Figure 2.12 not reproduced. Axes: density (0.00 to 1.00) against chromatic axis (°), from –45 to 135, with L–M at 0 and S at 90. Curve labels: AS, MW, AE, JM, LGN.]
Figure 2.12 Distribution of colour channels estimated from colour contrast adaptation effects for four observers (Webster and Mollon 1994). Dashed lines plot the distribution of chromatic directions for lateral geniculate nucleus (LGN) cells (Derrington et al. 1984). Note the figure does not reflect differences in absolute density along the L vs M and S vs LM axes.
responses within a pair of mechanisms. The outputs of the two mechanisms can be decorrelated by an oblique rotation of their response axes, equivalent to mutual inhibition that subtracts from each mechanism’s output some fraction of the response from the second mechanism (Fig. 2.13b). As a result, responses to the oblique adapting axis are selectively reduced, and all other stimuli are rotated away from the adapting axis toward the orthogonal axis. An important feature of this model is that sensitivity changes can show selectivity for stimuli to which none of the individual channels are tuned, and thus selective adaptation for a stimulus dimension does not demonstrate unequivocally the existence of a mechanism aligned to that dimension. Webster and Mollon (1991, 1994) suggested that adaptation-dependent interactions of the type proposed by Barlow provided an alternative basis for the selectivity of colour contrast adaptation for multiple colour–luminance directions. Formal models of decorrelation in colour channels were subsequently developed by Atick et al. (1993) and Zaidi and Shapiro (1993). For example, Atick et al. showed that the observed selectivity of adaptation for multiple directions within the colour–luminance plane can be generated by only two discrete mechanisms coupled by adaptable lateral feedback links that decorrelate and normalize their responses. In their algorithm, the effect of these interactions is equivalent to gain controlling the signals along the adapting axis and the orthogonal axes so that they have equal variance. This distorts the original circle of test stimuli into an ellipse whose
minor axis is aligned to the principal axis of the adapting distribution, and whose minor and major axis lengths (determined experimentally) are presumed to be inversely related to the variance in the adapting distribution along the principal and orthogonal axes. As Fig. 2.9 shows, such ellipses again closely characterize the perceived colour changes in the test stimuli.

The observed results for colour contrast adaptation do not clearly discriminate between these alternative models. An important insight of the decorrelation model is that it provides a functional basis for contrast adaptation, which is lacking in multiple channel models (see below). However, as we review in the following section, results from other psychophysical paradigms suggest the presence of more than three channels even in a single state of adaptation, and again, multiple channels are consistent with the variability in cone inputs to individual cells observed physiologically. This inherent redundancy presents a challenge for models based on coding efficiency and decorrelation, for if the number of post-receptoral cone combinations is greater than the number of cone types, then there will always be some redundancy in the responses of post-receptoral channels. [However, colour coding is coupled to the spatial properties of receptive fields, and thus a complete account of decorrelation would also require consideration of the spatial redundancies between the channels (Atick 1992; Atick et al. 1992).] A second challenge arises from the response changes produced by adaptation, which we have seen are progressively weaker at higher contrasts. This suggests that the responses to the adapting stimulus itself are not completely decorrelated.

Figure 2.13 Decorrelation model of adaptation (Barlow 1990b). (a) A stimulus distribution that produces correlated responses in two mechanisms tuned to stimulus dimensions, A and B. Redundancies in the responses can be removed by changing the response axes of the mechanisms through mutual inhibition, so that the mechanisms instead encode the oblique dimensions P (= A + kB) and Q (= B + kA). (b) This results in an oblique rotation of the response axes. Stimuli that originally isolated one of the mechanisms (A or B) now have an inhibitory effect on the second mechanism, so that they appear rotated in the response space away from the adapting direction.
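The mutual-inhibition scheme of Fig. 2.13 can be illustrated numerically. The following is a minimal sketch, not the algorithm of Barlow or Atick et al.: it assumes a single symmetric coupling k, solves for the value at which P = A + kB and Q = B + kA have zero covariance, and applies it to a made-up adapting distribution. The function names, the sample data, and the choice of the weaker of the two possible couplings are all assumptions made for illustration.

```python
import random

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def decorrelating_k(a, b):
    """Coupling k for which P = A + k*B and Q = B + k*A are uncorrelated.

    cov(P, Q) = c*k**2 + (vA + vB)*k + c, where c = cov(A, B).
    The root of smaller magnitude (the weaker inhibition) is returned.
    """
    va, vb, c = covariance(a, a), covariance(b, b), covariance(a, b)
    if c == 0:
        return 0.0  # already decorrelated, no coupling needed
    s = va + vb
    return (-s + (s * s - 4 * c * c) ** 0.5) / (2 * c)

# Hypothetical adapting distribution biased along an oblique axis,
# so that the responses of mechanisms A and B are positively correlated.
rng = random.Random(0)
common = [rng.gauss(0, 1) for _ in range(2000)]
a = [x + 0.5 * rng.gauss(0, 1) for x in common]
b = [x + 0.5 * rng.gauss(0, 1) for x in common]

k = decorrelating_k(a, b)                 # negative: each mechanism inhibits the other
p = [x + k * y for x, y in zip(a, b)]     # P = A + kB
q = [y + k * x for x, y in zip(a, b)]     # Q = B + kA

print("k =", round(k, 3))
print("cov(A, B) =", round(covariance(a, b), 3))
print("cov(P, Q) =", round(covariance(p, q), 12))
```

For positively correlated inputs the coupling comes out negative, which corresponds to the subtraction of a fraction of each mechanism's response from the other. The sketch covers only the decorrelation step; the normalization of response variance described by Atick et al. (1993) is not included.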
Contrast adaptation and colour coding

As the preceding section illustrates, the implications of the adaptation effects for colour coding depend on assumptions about the properties of the adaptation itself. Their interpretation must also be tempered by assumptions about the general nature of post-receptoral colour vision. The number of ways in which the cone signals are transformed within post-receptoral pathways, and the nature of those transformations, remain poorly defined, and different paradigms often point to very different conclusions. These are often treated as contradictory evidence about a single, canonical organization (e.g. whether there are three channels or many), yet they might all be compatible if they reflect different aspects and levels of the system. In either case, it is important to ask how the adaptation effects might relate to other observations on colour coding. Here we consider this relationship in terms of several important aspects of post-receptoral colour vision.

Spectral sensitivities of post-receptoral channels

Measurements based on colour appearance and colour sensitivity make different predictions about how the cones are combined within post-receptoral channels. Both adaptation and threshold sensitivity typically reveal chromatic channels organized along the L vs M and S vs LM cardinal axes (Le Grand 1949; Boynton and Kambe 1980; Krauskopf et al. 1982, 1986a, b; Stromeyer and Lee 1988; Nagy and Sanchez 1990; Webster and Mollon 1991, 1994; Krauskopf and Gegenfurtner 1992; Cole et al. 1993; Sankeralli and Mullen 1996) (although some threshold measurements point to interactions between these axes; Nagy et al. 1987; Regan et al. 1994; Stromeyer et al. 1998). These dimensions also appear orthogonal within the mechanisms underlying chromatic induction (Shevell 1992) and motion perception (Krauskopf et al. 1996; Webster and Mollon 1997b), and characterize the average cone inputs to cells in the lateral geniculate (Derrington et al. 1984).
Moreover, differences in the genes encoding the photopigments suggest that these dimensions may represent two subsystems of primate colour vision that evolved at very different times (Mollon 1989). However, these axes differ markedly from the principal axes suggested by subjective colour experience, a discrepancy that was evident in early measurements of colour coding in the geniculate (De Valois et al. 1966). Stimuli that isolate the L vs M axis vary from reddish to cyan, while the S vs LM axis varies from purple to yellow–green, colour variations that appear as inconspicuous mixtures of the red–green and blue–yellow dimensions that are central to models based on colour appearance (Abramov and Gordon 1994).

Similar discrepancies are well known between luminance and perceived brightness. Luminance is the visual analogue of radiance, or the ‘visual effectiveness’ of light (Lennie et al. 1993). It can be measured by a variety of tasks based on nulling (e.g. of perceived flicker, motion, or border distinctness) or thresholds (e.g. acuity or sensitivity). Most of these tasks reveal an additive spectral luminosity function, Vλ, that depends on a weighted sum of L and M cones (with a weak input from S cones revealed under some conditions (Stockman et al. 1991; Stromeyer et al. 1991)). However, two stimuli with equal luminance may differ in perceived brightness, and brightness matches are not additive. Such observations have suggested that brightness reflects the outputs of both luminance and chromatic channels.
Multiple post-receptoral stages

If early post-receptoral mechanisms are organized in terms of the L vs M and S vs LM dimensions, this leaves the question of why the unique hues are such salient dimensions of our perceptual experience of colour. Two very different answers have been suggested. One is that the unique hues are not in fact tied to special states of neural activity (i.e. to the nulls of colour-opponent mechanisms) but are instead related to prominent properties of the visual environment, such as normalization to the average white (Pokorny and Smith 1977). An alternative suggestion is that the second-stage mechanisms are recombined to form ‘third-stage’ mechanisms whose spectral sensitivities do agree with the unique-hue axes (Guth 1991; De Valois and De Valois 1993). However, beyond phenomenological observations, the evidence for this specific transformation remains lacking.

A related problem concerns the separation of luminance and chromatic signals. Most geniculate cells respond to both luminance and chromatic contrast but with different spatial sensitivities (De Valois and De Valois 1975; Ingling and Martinez-Uriegas 1983), yet psychophysically, luminance and chromatic contrast and colour versus spatial sensitivity appear to be separable stimulus dimensions (Poirson and Wandell 1996). Schemes have been proposed for recombining the geniculate cells to yield pure luminance or pure chromatic mechanisms within the cortex (D’Zmura and Lennie 1986; Mullen and Kingdom 1991; De Valois and De Valois 1993). Colour coding in striate cortex appears clearly different from the organization found in the lateral geniculate nucleus (LGN) (Lennie et al. 1990; Cottaris and De Valois 1998), yet the specific transformations implied by psychophysics have yet to be clearly confirmed in single-unit studies.
Multiple colour–luminance mechanisms

If trichromacy holds in the cones, then this necessarily limits colour vision to be three-dimensional, but does not limit the number of possible ways that post-receptoral mechanisms might recombine the cone signals. Many phenomena in colour vision, from colour naming to threshold contours for different colour–luminance directions, can be parsimoniously accounted for by assuming only three post-receptoral channels. Yet most such results could also be explained by assuming multiple colour–luminance mechanisms, each tuned to a different colour–luminance axis. The selectivity of the adaptation effects for multiple directions is consistent with results from a variety of different paradigms showing that sensitivity can be selective for more than three fixed directions within colour–luminance space (Guth 1982; Krauskopf et al. 1982, 1986a, b, 1996; Flanagan et al. 1990; D’Zmura 1991; Webster and Mollon 1991, 1993, 1994; Krauskopf and Gegenfurtner 1992; Zaidi and Halevy 1993; Zaidi and Shapiro 1993). (Results from masking studies have been mixed in this regard; Gegenfurtner and Kiper 1992; Li and Lennie 1997; Sankeralli and Mullen 1997; D’Zmura and Knoblauch 1998; Giulianini and Eskew 1998.)

Physiologically, the true discreteness of the cone pigments stands in marked contrast to the variability in the spectral sensitivities of post-receptoral neurons (De Valois et al. 1966; Zrenner 1983; Derrington et al. 1984). The preferred directions of cells in the parvocellular LGN are strongly clustered along the L vs M and S vs LM axes, but there are nevertheless substantial differences in the cone inputs to individual cells, and wide variability in the elevation of preferred direction out of the equiluminant plane. Moreover, in the striate
cortex any bimodality in the chromatic preferences appears much weaker (Lennie et al. 1990). This variability poses a problem for models that attempt to construct three discrete psychophysical channels out of mechanisms defined physiologically, and in fact offers an alternative to such models. For example, pure luminance and chromatic mechanisms may never be explicitly built out of parvocellular LGN cells, for the perceptual phenomena that have suggested this transformation might be equally consistent with the pattern of responses across a population of varied cells.

The evidence for multiple colour–luminance mechanisms calls into question a fundamental assumption of three-channel models of colour vision: that stimulus variations along some axes isolate the responses of only a single post-receptoral channel. Axes can be chosen so that they are visible only to a single class of cone, but if there are more post-receptoral colour channels than cone types, then there will likely always be more than one channel encoding the axis. This complicates the interpretation of studies that seek to examine the properties of ‘isolated’ channels, as in the many studies that have sought to isolate pure chromatic or luminance mechanisms (Livingstone and Hubel 1988) (although isolation might still be possible near threshold if only the most sensitive channel is responsive). As noted above, this redundancy also poses a challenge to models of colour vision based on coding efficiency.

On–off pathways

Adaptation to sinusoidal modulations induces symmetrical sensitivity changes along opposite poles of the adapting axis. However, with asymmetric waveforms it is possible to selectively adapt to the complementary directions of a single luminance or chromatic axis (De Valois 1977; Krauskopf et al. 1982; Krauskopf and Zaidi 1986), and colour after-effects like the McCollough effect appear to imply polarity-specific mechanisms (Stromeyer and Dawson 1978; Webster and Malkoc 1999).
Contrast adaptation thus appears consistent with a wide range of evidence supporting the separate encoding of increments and decrements (Fiorentini et al. 1990).

Multiple contrast mechanisms

Just as there may be multiple mechanisms tuned to different axes within colour–luminance space, and to opposite poles of a single axis, there may be multiple mechanisms representing the signals along a single axis. Specifically, individual channels might respond to limited but overlapping contrast ranges, either because they have different contrast sensitivities and/or because their null points vary along the axis (see MacLeod and von der Twer, Chapter 5, this volume). Multiple mechanisms centred on different contrast levels have been proposed previously for the encoding of luminance (Albrecht and Hamilton 1982; Georgeson 1985). We will consider below evidence from adaptation that might point to a similar basis for contrast coding along chromatic axes.

Multiple visual pathways

Beginning with the cones, the different cell types involved at each stage of colour vision are designed to extract in parallel different information about the spectral qualities of light. These channels are, in turn, part of parallel subsystems that encode different properties of
the visual stimulus, e.g. different spatiotemporal ranges of the stimuli, or different stimulus features, such as movement or colour (Zeki 1978; Livingstone and Hubel 1988; Schiller and Logothetis 1990; Shapley 1990; Merigan and Maunsell 1993). Within these different subsystems the signals from the cones may be combined very differently. Two principal subsystems identified in the primate retinocortical projection are the magnocellular and parvocellular pathways (M and P, named for the different layers of the geniculate through which they project). M and P cells differ on a number of dimensions, but in the present context the most important are that M cells exhibit only weak colour opponency and are the likely substrate of conventional measures of luminance sensitivity measured with rapid flicker or motion (Lee et al. 1988). In contrast, P cells are strongly colour-opponent, respond well to both luminance and colour, and are the likely pathway for colour and brightness appearance. The fact that different pathways may draw on the cone signals in different ways demands caution in comparing the colour organization implied by results from different experimental paradigms. For example, Webster and Mollon (1993) showed that different measures of the luminous efficiency of lights can be biased by adaptation in different ways, and this may arise in part because the alternative measures depend on different subsystems that have different sensitivities to luminance and chromatic contrast. Even a single response measure might reflect the influence of multiple pathways. For example, if luminance and chromatic thresholds are limited by separate subsystems, then the threshold contours for different directions within a luminance–chromatic plane might reveal more about the differences between the two subsystems than the organization of colour within either subsystem, and both might have little in common with the organization underlying the colour appearance of suprathreshold lights.
Combined effects of light adaptation and contrast adaptation

In the preceding sections we considered the effects of light adaptation and contrast adaptation in isolation, by holding one of the adjustments constant. But how do the two forms of adaptation combine to influence colour sensitivity and appearance? We can assess this by generalizing the contrast adaptation experiment of Fig. 2.9 to examine what happens when the modulations in the adapting field are now centred around different points in colour space. Figure 2.14 shows results from an experiment in which observers adapted either to a static colour in a field, or to 1 Hz modulations around the mean colour along chromatic axes of 45–225° or 135–315° (Webster and Mollon 1995). They then matched the colour of a set of test stimuli bracketing the mean. The four panels plot the results for four different mean chromaticities, centred in different quadrants of the equiluminant plane. Light adaptation (to the mean alone) reduces the apparent saturation of the mean so that it appears nearly white, and produces corresponding mean shifts in the appearance of all of the test colours. Thus the set of test stimuli in the upper-right quadrant, which all appear reddish-purple in a neutral state of light adaptation, are shifted by the light adaptation so that they are centred around white, and take on the full gamut of hues. These mean shifts occur independently along the S vs LM and L vs M axes, and are very close to the adjustments predicted by
von Kries scaling within each class of cone (dashed lines). Thus, for these conditions, von Kries adaptation provides a largely complete account of the colour changes produced by adaptation to the background colour. (Subtractive adjustments may also be present but are not evident because the von Kries scaling has already nearly equated the background signal for the test and match fields.) Adapting to modulations centred on the mean leads to the same mean colour shifts, but in addition induces a loss in sensitivity that is selective for each adapting axis. Since these axes co-vary the cone signals, the selectivity for them again implies adaptation within mechanisms that combine the cone signals. The results show that light adaptation and contrast adaptation produce independent and qualitatively different changes in colour appearance that can be readily dissociated. Light adaptation alters the cone signals available to later stages, but otherwise appears to have little influence on adaptation to contrast. Conversely, the fact that adding large modulations around the mean does not alter the state of light adaptation suggests that the visual response to contrast up to the sites of light adaptation is linear (or symmetric for opposite excursions from the mean). Thus, for such conditions, light adaptation and contrast adaptation can be treated as separable and successive influences on colour appearance.

Figure 2.14 Combined effects of light adaptation and contrast adaptation (Webster and Mollon 1995). Matches were made to test stimuli (filled circles) after adapting to a static adapting colour (DC) or to modulations around the DC along the 45–225° or 135–315° axes. Adaptation to the DC shifts the mean colour and all test stimuli toward white, consistent with predictions for von Kries scaling (small circles, dotted lines). Contrast adaptation produces, in addition, a selective loss in sensitivity to the 45° (filled triangles) or 135° (unfilled triangles) adapting axis. The four panels show similar results for four different DC colours. (Panel axes: L–M and S.)
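The idea that the two adjustments act as separable, successive stages can be made concrete with a toy calculation. The sketch below is illustrative only and assumes complete von Kries scaling: stage one divides each cone signal by its response to the adapting mean, and stage two attenuates the component of the resulting contrast that lies along the adapting axis. The opponent coordinates, the cone excitations, and the gain of 0.5 are hypothetical values chosen for the example and are not taken from Webster and Mollon (1995).

```python
import math

def von_kries(cones, adapt):
    """Stage 1: scale each cone signal by the reciprocal of its response to the adapting mean."""
    return tuple(c / a for c, a in zip(cones, adapt))

def opponent(cones):
    """Hypothetical opponent coordinates (L - M, S - (L + M) / 2).

    Chosen only so that equal scaled cone signals (1, 1, 1) map to the origin ("white").
    """
    l, m, s = cones
    return (l - m, s - (l + m) / 2)

def contrast_adapt(point, axis_deg, gain):
    """Stage 2: attenuate the component of `point` along the adapting axis by `gain`."""
    t = math.radians(axis_deg)
    ux, uy = math.cos(t), math.sin(t)
    along = (point[0] * ux + point[1] * uy) * gain   # component on the adapting axis
    across = -point[0] * uy + point[1] * ux          # orthogonal component, left unchanged
    return (along * ux - across * uy, along * uy + across * ux)

adapt_mean = (0.70, 0.30, 0.05)   # hypothetical L, M, S excitations of the DC colour
test = (0.72, 0.29, 0.06)         # hypothetical test stimulus near the mean

mean_after_dc = opponent(von_kries(adapt_mean, adapt_mean))
test_after_dc = opponent(von_kries(test, adapt_mean))
test_after_dc_ac = contrast_adapt(test_after_dc, axis_deg=45, gain=0.5)

print("mean after DC adaptation:", mean_after_dc)
print("test after DC adaptation:", test_after_dc)
print("test after DC + AC adaptation:", test_after_dc_ac)
```

In this sketch the mean maps to the origin after stage one, and stage two leaves the origin untouched, so contrast adaptation changes the spread of the matches without shifting their centre. That is the separable behaviour described above; it does not model the incomplete light adaptation discussed in the next section.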
Contrast adaptation with incomplete light adaptation

To the extent that light adaptation and contrast adaptation do represent independent and successive adjustments by the visual system, identical contrast adaptation effects should occur for the modulations within each of the quadrants of Fig. 2.14, but only if the von Kries scaling adjusts completely to the different mean colours. In fact, for the conditions of these experiments this was usually not the case. After prolonged viewing, the static coloured fields appear very desaturated, but not completely white. What does incomplete light adaptation imply for the contrast adaptation effects?

Figure 2.15 shows predictions based on standard assumptions. The first panel plots the sensitivity of a prototypical opponent mechanism. The zero-crossing corresponds to the ‘achromatic’ stimulus that elicits no response. Adapting to a DC colour (static adapting colour) alters the relative sensitivity of the cones, and this shifts the mechanism’s neutral point toward the adapting chromaticity. Any residual colour in the adapting field can be accounted for by assuming that the von Kries scaling is incomplete, so that the zero-crossing falls short of the adaptation point. Contrast adaptation should reduce the gain of the response, without shifting the mean. Thus the response to all stimuli except at the achromatic point should be reduced. The prediction is thus that any residual perceived colour that remains in the adapting field should be greatly reduced by the contrast adaptation, shifting the response to the DC colour closer to white.

Surprisingly, this is not what happens. Contrast adaptation collapses perceived contrast relative to the mean chromaticity, and not relative to the stimulus that appears white (Webster and Wilson 2000). In Fig. 2.14 the residual mean colour is so small that this effect is difficult to see. Figure 2.16 shows results for a second observer who exhibited weaker light adaptation effects.
In this example, the subject adapted to a purple background that differed from the white reference only in S-cone excitation. The first panel shows the S-cone coordinates of matches to tests spanning the mean colour of the field, made after adapting to the static field or to modulations around the mean along the S axis. The second panel replots the matches as a function of test contrast. Light adaptation again shifts the perceived colour of all of the tests toward white, but in this case the adapting field retained a perceived colour that was several times threshold. Contrast adaptation strongly reduced the perceived contrasts along the S axis, as indicated by the difference in the slopes of the regression lines fitted to the two sets of matches. The critical point is where the two fitted lines cross, for this gives an estimate of the one stimulus that was not affected by contrast adaptation. The intersection occurs very close to the mean chromaticity, and not near the stimulus that appeared achromatic. The adaptation effects thus reveal a dissociation between two alternative definitions of ‘zero chromatic contrast’, i.e. between the chromaticity that appears achromatic and the chromaticity that is the null point for contrast adaptation. Indeed, Fig. 2.16 suggests that there are cases when contrast adaptation might actually increase perceived saturation near the achromatic point by decreasing perceived contrast relative to the background chromaticity. It is hard to reconcile these dissociations with the notion that ‘white’ reflects an absence of activity in chromatic channels. Instead, it may be that white reflects the balance of activity across many channels, tuned not only to many different directions within colour space but also centred on different null points along any single axis.

Figure 2.15 Predicted colour changes for contrast adaptation after incomplete light adaptation. (a) White (zero-contrast) corresponds to the null point for the opponent channel. (b) Adaptation to the DC biases the cone inputs, shifting the null towards the DC, but leaving a residual colour if the adaptation is incomplete. Contrast adaptation reduces the gain of the channel and should collapse responses relative to the mean, reducing the apparent contrast of the DC. (Panel titles: (a) Neutral adaptation; (b) Incomplete DC adaptation. Curves in (b): DC adapt; DC + AC adapt.)

Figure 2.16 (a) Matches to test stimuli that lie along the S vs LM axis, after adapting to a mean colour change along the S axis (DC) or to modulations (AC) around the mean along the S axis (DC + AC). Light adaptation shifts the apparent colour of all tests towards white. Contrast adaptation in addition collapses perceived contrasts. (b) The DC and DC + AC matches are plotted as a function of test contrast. Contrast adaptation is indicated by the shallower slope for the DC + AC matches. The matches cross at a ‘zero-contrast’ equal to the DC chromaticity and not at the ‘zero-contrast’ predicted by the stimulus that appears achromatic. (Axes: (a) S contrast (× threshold); (b) match contrast (× threshold) against test contrast (× threshold). Fitted lines in (b): DC adapt, m = 0.75t – 25.2; DC + AC adapt, m = 0.36t – 9.57.)
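The crossing point of the two regression lines can be worked out directly from the fits printed in Fig. 2.16(b), m = 0.75t – 25.2 for DC adaptation and m = 0.36t – 9.57 for DC + AC adaptation, where t is test contrast and m is match contrast, both in multiples of threshold. The short sketch below does this arithmetic and, for comparison, also finds the test contrast whose match after DC adaptation is zero, taken here as the stimulus that would appear achromatic. The function name and that reading of the zero-match point are assumptions made for illustration.

```python
def crossing(slope1, icpt1, slope2, icpt2):
    """Return (t, m) at which the lines m = slope1*t + icpt1 and m = slope2*t + icpt2 meet."""
    t = (icpt2 - icpt1) / (slope1 - slope2)
    return t, slope1 * t + icpt1

# Fits reported in Fig. 2.16(b), in multiples of threshold.
t_cross, m_cross = crossing(0.75, -25.2, 0.36, -9.57)

# Test contrast whose match is zero after DC adaptation alone
# (assumed here to be the stimulus that appears achromatic).
t_white = 25.2 / 0.75

print("lines cross at test contrast", round(t_cross, 2), "with match contrast", round(m_cross, 2))
print("achromatic match at test contrast", round(t_white, 2))
```

On these fits the lines cross at a test contrast of about 40.1, where the match contrast is still about 4.9 times threshold, whereas the zero-match stimulus lies at a test contrast of 33.6. The stimulus left unchanged by contrast adaptation is therefore one that still appears coloured, which is consistent with the dissociation described in the text.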
The functions of adaptation

We have characterized some of the more salient properties of light adaptation and contrast adaptation, but what are the benefits of these adjustments to colour vision, and to perception in general? For light adaptation there are clear answers. The range of light levels to which we are normally exposed varies dramatically (e.g. nearly 10 log units from starlight to bright daylight), yet the dynamic range of visual cells is at best 2 to 3 log units. If cells had to devote their operating range to encoding the full range of stimulus intensities, then they would have little sensitivity to small stimulus differences. This limitation is critical because the range of contrasts in any individual scene is typically small. Light adaptation thus serves to maintain high contrast sensitivity around the mean light level, both by shifting the steep contrast response functions of neurons so that they are positioned near the ambient background, and by discounting responses to the background so that the dynamic range is devoted to signalling differences around the mean (Walraven et al. 1990).

A second function of light adaptation is to help maintain lightness constancy. Changing the overall light level in a scene changes the amount of light reflected from any surface, but does not alter the ratio of quanta reflected from different surfaces. Thus these ratios describe stable properties of the scene (the relative reflectances of surfaces) and are the property to which our lightness perception corresponds (e.g. in our perception of dark versus light objects). The multiplicative gain controls of von Kries adaptation capture this invariance by rescaling the cone signals so that across different light levels a constant difference in outputs corresponds to a constant ratio of inputs.

Finally, as several chapters in this book illustrate, light adaptation may be a major factor contributing to colour constancy.
Studies of colour constancy are concerned with how and to what extent the visual system can maintain a stable representation of the colour of surfaces under changes in illumination or viewing conditions. For natural illuminants and reflectance spectra—that vary only gradually across the visible spectrum (see Maloney, Chapter 9, this volume)—the main effect of an illumination change is to rescale the absorptions within each cone class for the set of objects in the scene (Dannemiller 1993; Foster and Nascimento 1994). For example, shifting from a yellow to blue illuminant will tend to increase the S cone response to each of the objects while decreasing the response in
L cones to all objects. This is precisely the change that von Kries adaptation can compensate for—by rescaling sensitivity to maintain lightness constancy independently for each cone type (MacLeod 1985). However, for objects in the scene with different reflectance spectra, the relative lightnesses seen by each cone will also vary somewhat as the spectrum of the illuminant changes, so that rescaling alone cannot achieve perfect constancy. The influence of adaptation on colour constancy is illustrated in Fig. 2.17, which simulates the visual response to a set of natural surfaces viewed under two daylight illuminants (4800 K or 10 000 K) (Webster and Mollon 1995). The illuminant spectra were constructed out of the first three basis functions derived for daylight by Judd et al. (1964). The surfaces were each constructed from the first three basis functions derived by Cohen (1964) for the reflectance spectra of Munsell chips. The surfaces were chosen to form an equiluminant distribution centred on the chromaticity of the 4800 K illuminant, but with a bias along the S axis in the distribution’s range. When the same surfaces are viewed under the 10 000 K illuminant, the light reflected from all of the surfaces shifts toward blue. The middle panel shows the responses to the two distributions following complete von Kries adaptation. von Kries scaling adjusts for the mean of each distribution, so that the average colour under either illuminant appears white. This factors out most of the difference introduced by the illuminant change, but residual differences remain, because the 10 000 K illuminant has also induced a systematic tilt in the distribution. Thus von Kries scaling can compensate for much of the illuminant shift but cannot discount the illuminant completely. 
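The reach and the limits of von Kries scaling under an illuminant change can be seen in a small numerical example. The sketch below is hypothetical throughout: the cone absorptions, the per-cone gains standing in for the illuminant shift, and the surface-dependent perturbation of the S-cone signal are invented for illustration, and are not the Judd et al. (1964) or Cohen (1964) basis functions used for Fig. 2.17. It normalizes each cone class by its mean over the scene and compares the result across illuminants.

```python
def normalise_by_mean(signals):
    """von Kries scaling: divide each cone class by its mean over the scene."""
    n = len(signals)
    means = [sum(s[i] for s in signals) / n for i in range(3)]
    return [tuple(s[i] / means[i] for i in range(3)) for s in signals]

def max_abs_diff(xs, ys):
    """Largest absolute difference between corresponding cone signals."""
    return max(abs(a - b) for x, y in zip(xs, ys) for a, b in zip(x, y))

# Hypothetical (L, M, S) absorptions for four surfaces under a yellowish illuminant.
scene_yellow = [(0.60, 0.40, 0.10), (0.50, 0.45, 0.20),
                (0.70, 0.35, 0.05), (0.55, 0.50, 0.25)]

# Case 1: a bluer illuminant that purely rescales each cone class.
gains = (0.9, 1.0, 1.6)
scene_blue = [tuple(c * g for c, g in zip(s, gains)) for s in scene_yellow]

# Case 2: the same rescaling plus a small surface-dependent change in the S signal.
tilt = (1.00, 1.05, 0.95, 1.02)
scene_blue_tilted = [(l, m, s * k) for (l, m, s), k in zip(scene_blue, tilt)]

exact = max_abs_diff(normalise_by_mean(scene_yellow), normalise_by_mean(scene_blue))
residual = max_abs_diff(normalise_by_mean(scene_yellow), normalise_by_mean(scene_blue_tilted))

print("pure rescaling, largest mismatch after von Kries:", exact)
print("surface-dependent change, largest mismatch:", round(residual, 4))
```

When the illuminant change is a pure rescaling within each cone class, the normalized signals agree to within rounding error. Once the change depends on the surface, a residual mismatch survives the scaling, which is the kind of error that leaves constancy approximate rather than perfect.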
The residual errors following von Kries adaptation have suggested that the visual system might adjust to the illuminant through a second stage of post-receptoral adaptation (D’Zmura and Lennie 1986; Dannemiller 1989). These second-site adjustments could undo all of the illuminant change if they could estimate and correct for the weights on basis functions defining the two illuminants. Several methods have been proposed for judging the illuminant, yet none of these is likely to provide a basis for perfect colour constancy in human observers (Pokorny et al. 1991). On the other hand, there are actual ‘second-site’ adjustments like contrast adaptation, that could influence colour perception under the two illuminants. Before considering this additional stage of adaptation to an illuminant change, we first consider some of the functional implications of contrast adaptation. Compared to light adaptation, the functions of contrast adaptation have proven elusive, and clear improvements in performance following contrast adaptation have been difficult to demonstrate. In fact, a common assumption is that the sensitivity losses produced by contrast adaptation are without functional benefits, and represent only temporary ‘fatigue’ due to prolonged stimulus exposure. However, by this account it is unclear why retinal and geniculate cells (which should generally be more strongly stimulated than cortical cells) should exhibit weaker adaptation than cortical cells. Part of the problem in assessing the functional role of contrast adaptation may be the choice of an appropriate baseline for addressing this question. Adaptation effects are usually assessed relative to performance in a neutral, or zero-contrast adaptation state (or relative to an unknown state if the observer carries into the experiment the residual effects of past stimulus exposures). 
Neutral adaptation may be a rare anomaly in waking experience, and it may be more appropriate to look for the advantages of adaptation by comparing performance across different, natural adaptation states.
Figure 2.17 Light adaptation and contrast adaptation induced by an illuminant change (Webster and Mollon 1995). (a) Reflectances of simulated Munsell chips were chosen to form an equiluminant distribution of chromaticities (triangles) centred on the chromaticity of the 4800 K illumination but biased along the S vs LM axis. Circles show the distribution when the same surfaces are viewed under the 10 000 K illuminant. Without adaptation, the colours of the chips appear very different under the two illuminants. (b) von Kries adaptation adjusts responses in each cone so that the mean response under the two illuminants is the same. This compensates for most of the illuminant change, giving approximate colour constancy, but the 10 000 K distribution remains tilted 10° off the S axis. (c) The differences in the von Kries-scaled distributions lead to different states of contrast adaptation. Decorrelation results in sensitivity losses selective for either the 90° (S) axis (under the 4800 K illuminant) or the 100° axis (under the 10 000 K illuminant). As a result, colour signals that matched at the retinal level of light adaptation [e.g. the large circle of colours in (b)] no longer match at the cortical locus of contrast adaptation [as shown by the two different ellipses in (c)]. (Panel axes: L–M and S.)
One proposed advantage of contrast adaptation is that it may adjust the visual system so that sensitivity is highest around the average contrast in a scene, just as light adaptation adjusts sensitivity to the prevailing mean light level. In fact, this effect can be readily demonstrated in cortical neurons (Ohzawa et al. 1982; Albrecht et al. 1984; Sclar et al.
1989). Individual neurons typically respond only over a narrow range of contrasts. Adaptation tends to centre this range on the adapting contrast level. As a result, the neuron can signal differences between high contrast stimuli that before adaptation all produced equivalent, saturating responses in the cell. Psychophysical studies have, under some conditions, found that contrast discrimination may be improved by prior adaptation (Greenlee and Heitger 1988; Wilson and Humanski 1993), yet the majority of attempts to test this have instead found no effect or decreases in sensitivity (e.g. Barlow et al. 1976; Webster et al. 1987; Maattanen and Koenderink 1991; Ross et al. 1993). One reason for not expecting comparable sensitivity gains psychophysically is that the psychophysical contrast response function does not exhibit the degree of saturation observed in individual cells. A related possible function of contrast adaptation is to support ‘contrast constancy’, just as light adaptation supports lightness constancy. Scenes with higher or lower contrasts will induce stronger or weaker adaptation, respectively, so that the dynamic range might tend to be matched to the prevailing gamut of contrasts. Such adjustments might partially correct for changes in visibility owing, for example, to fog or rain (Brown and MacLeod 1997). Finally, we considered above the proposal that contrast adaptation reflects mutual inhibition between channels that serves to decorrelate their outputs. Combined with gain control this yields the most efficient representation of the visual stimulus because it allows each channel to devote its full dynamic range to carry independent information about the stimulus (Barlow 1990b; Atick et al. 1993). Such adjustments may be necessary because the statistical structure of images can vary markedly (see below). 
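The combination of decorrelation and gain control mentioned above can be illustrated with a generic whitening transform. This is a textbook symmetric whitening applied to synthetic data, offered as a stand-in; it is not the specific algorithm of Atick et al. (1993), and the function name, sample size, and 100° bias are illustrative assumptions.

```python
import numpy as np

def whiten(responses):
    """Decorrelate channels and equalize their variances (symmetric whitening)."""
    centred = responses - responses.mean(axis=0)
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # cov = V diag(w) V^T
    transform = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return centred @ transform, transform

rng = np.random.default_rng(0)

# Synthetic chromatic contrasts elongated along a 100 deg axis
# in an (L vs M, S vs LM) plane; the spreads are arbitrary.
angle = np.deg2rad(100.0)
axis = np.array([np.cos(angle), np.sin(angle)])
ortho = np.array([-np.sin(angle), np.cos(angle)])
samples = (rng.normal(0, 30, (5000, 1)) * axis
           + rng.normal(0, 10, (5000, 1)) * ortho)

whitened, transform = whiten(samples)

# Effective gain applied along and across the biased axis.
gain_along = np.linalg.norm(transform @ axis)
gain_across = np.linalg.norm(transform @ ortho)
print("gain along biased axis:", gain_along)
print("gain across biased axis:", gain_across)
```

The gain comes out lowest along the direction of greatest variance, which corresponds to a sensitivity loss that is selective for the principal axis of the adapting distribution.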
How is the goal of coding efficiency related to the goal in colour constancy of maintaining a stable representation of surface colour under different viewing conditions? We saw in Fig. 2.17b that von Kries adaptation corrects for most of the effects of an illuminant change, but leaves residual differences between the distributions. Moreover, the second-site adjustments that could, in theory, correct for these differences are not observed in empirical measurements of colour judgements under different illuminants (Brainard and Wandell 1992; Arend 1993). This suggests that differences remaining in the von Kries-scaled distributions survive to cortical sites and thus could drive the visual system to different states of contrast adaptation. The final panel in Fig. 2.17 shows the predicted colour matches under the two illuminants if contrast adaptation induces sensitivity losses that are selective for the principal axis of each distribution. The adaptation was modelled by decorrelating and normalizing the responses to the L vs M and S vs LM contrasts using the algorithm of Atick et al. (1993). The shift from the 4800 K to the 10 000 K illuminant tilts the initial distribution off the S (90°) axis to an axis of 100°. As a result, colours that matched under the two illuminants at the level of light adaptation in the visual system no longer match at the level of contrast adaptation, because they are embedded in different colour distributions and thus are affected by contrast adaptation in different ways. This difference is illustrated by the ellipses within each distribution, which fall on a common circle of chromaticities after von Kries scaling but are distorted by contrast adaptation into ellipses with minor axes aligned along the different principal axes of the two distributions. Webster and Mollon (1995) showed that the colour changes produced by adaptation to random samples from the two distributions in Fig. 2.17a were consistent with von
Kries scaling followed by selective sensitivity losses specific to each distribution. Contrast adaptation may therefore play a significant role in the visual adjustment to an illuminant change. However, while the observed selectivity of contrast adaptation effects may lead to a more efficient representation of colour, it does not lead to colour constancy, for it does not undo the tilts that different illuminants induce in the colour distributions. Thus far, the possible functions we have considered for contrast adaptation have focused on the role that adaptation may play in helping to encode the ambient stimulus. For example, we saw that decorrelation improves the representation of the adapting stimulus by removing the redundancies in responses to the stimulus. However, Barlow (1990a) suggested that such adjustments may also serve a further, complementary function: by discounting the prevailing stimulus contingencies, adaptation could aid the visual system in detecting novel associations, or ‘suspicious coincidences’, in the environment. Adaptation could thus provide an important mechanism for the perception and learning of causal structure, and may enhance the salience of novel structure by reducing sensitivity to the prevailing background. We were led by these ideas to explore how adaptation influences performance in a naturalistic task like visual search, in which observers must try to locate a novel target rapidly against a background of distractors. To test this we developed a ‘foraging’ paradigm that simulates the problem of finding a fruit among foliage (Webster et al. 1998). The background is composed of a dense and random array of ellipses with colours drawn randomly from a specified colour distribution, and thus has the dense and variegated structure typical of many natural scenes. The target is a circle of variable colour placed at a random location on the background, and reaction times are measured for detecting the target location. 
Search times were measured for a wide range of test chromaticities on the achromatic background or on backgrounds that varied along the L vs M or S vs LM axis or two intermediate chromatic directions. To assess the effects of prior adaptation to the backgrounds, observers first adapted to a rapid and random succession of samples from a given background, and then searched for targets on the same background (e.g. L vs M) or the orthogonal background (e.g. S vs LM). The rapid sequence was chosen to simulate the pattern of stimulation that might arise from sampling the background with rapid and random eye movements. Figure 2.18 shows the search times on each background, by plotting reaction times as a function of the distance of the test contrast (e.g. along the S vs LM axis) from the background axis (e.g. L vs M). The search times vary from slow (for test colours within and thus camouflaged by the background distribution) to very rapid (for test colours that differ substantially from the background and thus appear to ‘pop out’). These variations are clearly selective for the background colour direction, confirming that visual search for colour depends on mechanisms that can be tuned to different chromatic directions (D’Zmura 1991). In each case the results also suggest that search is facilitated by prior adaptation to the background on which the search is performed (triangles), while hindered by prior adaptation to an inappropriate, orthogonal background (filled circles), and such effects are a plausible consequence of the selective changes in sensitivity that adaptation should induce for each background. These preliminary results thus lend support to the
[Figure 2.18, four panels (0°, 90°, 45°, and 135° backgrounds): reaction time (s) plotted against target contrast.]
Figure 2.18 Adaptation effects on visual search. Reaction times for detecting a target are plotted as a function of the colour distance from the target to the background colour axis (e.g. target contrast equals S contrast for L vs M backgrounds, and vice versa). Overall search times are faster after adapting to the same background (triangles) and slower after adapting to an orthogonal background (filled circles) relative to neutral adaptation (to a uniform background; open circles). Lines plot search times on an achromatic background. The four panels show similar results for four different background colour axes.
proposal that adaptation increases the salience of novel stimuli by partially discounting the ambient background.
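The search display and the target-contrast measure plotted in Fig. 2.18 can be sketched as follows, assuming colours are represented as (L vs M, S vs LM) contrast pairs. The function names, the uniform sampling, the element count, and the contrast range are hypothetical choices for illustration, not the parameters used by Webster et al. (1998).

```python
import numpy as np

def background_colours(axis_deg, n_elements, max_contrast, rng):
    """Draw colours at random along one chromatic axis of the (L-M, S) plane."""
    theta = np.deg2rad(axis_deg)
    direction = np.array([np.cos(theta), np.sin(theta)])
    contrasts = rng.uniform(-max_contrast, max_contrast, size=(n_elements, 1))
    return contrasts * direction

def distance_from_axis(colour, axis_deg):
    """Perpendicular distance of a colour from the background colour axis."""
    theta = np.deg2rad(axis_deg)
    normal = np.array([-np.sin(theta), np.cos(theta)])
    return abs(float(np.dot(colour, normal)))

rng = np.random.default_rng(1)

# Background elements varying along the L vs M (0 deg) axis.
background = background_colours(axis_deg=0.0, n_elements=200,
                                max_contrast=80.0, rng=rng)
positions = rng.uniform(0.0, 1.0, size=(200, 2))    # random element locations

# Hypothetical target colour and random target location.
target_colour = np.array([10.0, 40.0])
target_position = rng.uniform(0.0, 1.0, size=2)

# For an L vs M background the target contrast is the target's S contrast.
target_contrast = distance_from_axis(target_colour, axis_deg=0.0)
print("target contrast relative to background axis:", target_contrast)
```

With the background on the 0° axis, the distance reduces to the S vs LM component of the target, matching the definition of target contrast given in the caption to Fig. 2.18.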
Adaptation and the statistics of natural images
The preceding discussion suggests that the functions of adaptation might best be revealed by exploring natural visual tasks and how they are affected by natural variations in the state of adaptation. This raises the question of how the states of colour adaptation depend
[Figure 2.19, two panels: tests and matches plotted in the L–M vs S plane; sensitivity (match / test) plotted against L–M contrast. Legend labels: MB, MW.]
Figure 2.19 Contrast adaptation to biased distributions. (a) Matches to tests were made after adapting to a fixed modulation (48 × threshold) along the S vs LM axis paired out of phase with modulations along the L vs M axis with varied contrast (indicated by numbers). (b) Sensitivity changes along the S vs LM and L vs M axes estimated by ellipses fitted to two observers’ results. Even weak biases in the relative contrasts along the two adapting axes cause changes in sensitivity that are (weakly) selective for the higher-contrast axis.
on the colour distributions that are characteristic of the natural visual environment. In anticipating these effects it should be emphasized that adaptation induces selective changes in colour appearance even when the adapting distribution has only a modest bias in its colour direction. For example, Fig. 2.19 shows the effects of adapting to a fixed-contrast modulation along the S vs LM axis, paired with varying levels of modulation along the L vs M axis. The two components were combined 90° out of phase in time, so that the resulting stimuli varied along ellipses within the equiluminant plane. Figure 2.19b shows estimates of perceived contrast along the two axes. The sensitivity losses along the fixed-contrast S vs LM axis remain remarkably constant, while increasing modulations along the L vs M axis lead to systematically larger sensitivity losses along this axis. Very similar results were obtained when the adapting components instead varied along intermediate colour directions. These results therefore suggest that even weak biases in the adapting distributions should lead to changes in colour appearance that are (at least weakly) selective for the adapting stimulus. Few measurements have been made of the colour statistics of natural images (Moorhead 1985; Burton and Moorhead 1987; Nagle and Osorio 1993; Webster and Mollon 1997a; Párraga et al. 1998; Ruderman et al. 1998; see also MacLeod and von der Twer, Chapter 5, this volume). We recently examined the colour distributions for a wide range of scenes, with the goal of characterizing the states of adaptation induced by natural colour distributions. The set of colours within a scene was recorded by sampling the colour signals over a grid of spatial locations with a spectroradiometer, or by reconstructing the colour signals from successive images of the scene captured with a digital camera through a set of
[Figure 2.20, panels (a) and (b): scatter plots of S – (L + M) against L–M, and of luminance contrast against L–M and against S – (L + M).]
Figure 2.20 The distribution of chromatic and luminance contrasts measured for two outdoor scenes (Webster and Mollon 1997a). Top panels show a distribution from a meadow backed by mountains and sky. Chromaticities are tightly clustered along an axis extending from blue sky to the yellow grasses of the meadow. Bottom panels show a distribution from within a forest. In this case chromaticities are instead moderately biased along the S vs LM axis.
interference filters spanning the visible spectrum. Figure 2.20 shows examples of the colour distributions for two individual scenes. The top panels plot the distribution obtained for a Sierra meadow backed by mountains and sky. Chromaticities are tightly clustered along a bluish–yellowish axis with a high correlation between the L vs M and S vs LM axes, and chromatic and luminance contrasts span a comparable range. The bottom panels plot a distribution measured in a forest in India. For this scene the colour gamut is more restricted and instead shows a moderate bias along the S vs LM axis, and larger variations in luminance than in colour (when axes are scaled to equate the adaptation effects for different colour–luminance directions). Thus individual scenes vary substantially in the range and biases of their colour distributions, so that no static set of post-receptoral mechanisms could represent different scenes efficiently. Contrast adaptation could therefore play an important role in adjusting the visual system to the properties of individual scenes. Figure 2.21 shows measurements of asymmetric colour matches following adaptation to the two colour distributions of Fig. 2.20. The adaptation was achieved by presenting, in a 2° field, a sequence of individual colours sampled at random from the distribution every 200 ms. Again this simulates for a single retinal locus the pattern of stimulation
[Figure 2.21, two rows of panels: S–(L + M) plotted against L–M, and luminance contrast plotted against L–M and against S–(L + M).]
Figure 2.21 Colour changes produced by adaptation to the colour distributions of Fig. 2.20. Test stimuli (unfilled squares) were chosen to bracket the mean chromaticity of the distribution. Matches (filled circles) exhibit mean colour shifts induced by light adaptation and losses in contrast sensitivity that are selective for the principal axes of the individual distributions. Triangles show the matches predicted if the visual system exhibited only von Kries adaptation.
that should arise from rapid and random eye movements within the scenes. The resulting matches illustrate three adjustments that the visual system makes to the colour structure of the scenes. Light adaptation adjusts to the mean colour, while contrast adaptation adjusts both to the range of contrasts in the scene and selectively to the principal axes of each distribution. The changes in apparent contrast represent a partial tendency to maintain contrast constancy (Brown and MacLeod 1997), while the selective sensitivity loss to the background axis biases perceived colour toward the orthogonal chromatic axis, and thus toward the more novel colour directions within each image. Such results suggest that the characteristic biases in natural colour distributions may often induce large and selective biases in our colour perception. In turn, this suggests that contrast adaptation is not a maladaptive response to aberrant stimulation, but rather, like light adaptation, is a natural and intrinsic component in the visual coding of the natural environment. The substantial variability in the colour statistics of scenes points to the possible need for such adaptive adjustments, both across different scenes and within the same scene over time (e.g. because of changes in illumination, the weather or the seasons). On the other hand, the range of chromatic axes is limited: in most scenes colour varies principally along bluish to yellowish-green axes, with most distributions falling along axes that range from the S vs LM axis, which is more typical of scenes dominated by foliage
(Webster and Mollon 1997a; Ruderman et al. 1998) to a perceptual blue–yellow axis, which is more typical of arid or panoramic scenes (Burton and Moorhead 1987; Webster and Mollon 1997a). (It is notable that the range thus appears roughly bounded by two chromatic axes that are central to models of colour vision.) This limited range suggests that the natural environment may maintain the visual system in only a limited range of adapted states. Characterizing these states is important because they determine the range of natural operating states of our colour vision. These results have implications beyond colour vision. For example, natural scenes have a characteristic spatial structure, with amplitude spectra that fall off with frequency roughly as 1/f (e.g. Burton and Moorhead 1987; Field 1987; Tolhurst et al. 1992; Ruderman 1994; Dong and Atick 1995). This structure has been found for a very diverse range of images and appears similar for luminance and chromatic contrast (Webster and Mollon 1997a; Párraga et al. 1998; Ruderman et al. 1998). We have examined how adaptation to this spatial structure alters sensitivity to spatial contrast (Webster and Miyahara 1997; Webster 1999). Figure 2.22 shows how the contrast sensitivity function (CSF) for luminance or chromatic (L vs M) contrast is affected by adaptation to images with 1/f spectra. The adaptation induces selective losses in sensitivity at lower spatial frequencies. For colour, these losses are large and tend to distort the normally lowpass CSF to become more nearly bandpass. (This selective effect may depend not only on the intrinsic bias in the image spectra, but also on interactions in the adaptation effects at different spatial scales; Webster 1999.) The characteristic spatial structure of natural images may therefore maintain the visual system in characteristic states of spatial contrast adaptation that can strongly influence the properties of our spatial vision. 
It is these adapted states that are most relevant for understanding how the visual system responds to natural patterns of stimulation.
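As a rough illustration of the image statistic involved, a noise image with an approximately 1/f amplitude spectrum can be synthesized by filtering white noise in the Fourier domain. This is a generic sketch with assumed names and an arbitrary image size; it is not the procedure used to produce the adapting images in the studies cited above.

```python
import numpy as np

def one_over_f_noise(size, rng):
    """Filter white noise so that its amplitude spectrum falls roughly as 1/f."""
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    f = np.hypot(fx, fy)                       # radial spatial frequency
    inv_f = np.zeros_like(f)
    inv_f[f > 0] = 1.0 / f[f > 0]              # zero gain at DC removes the mean
    white = rng.standard_normal((size, size))
    image = np.fft.ifft2(np.fft.fft2(white) * inv_f).real
    return image / image.std(), f

rng = np.random.default_rng(2)
image, f = one_over_f_noise(128, rng)

# Check the spectral slope with a log-log fit over all non-zero frequencies.
amplitude = np.abs(np.fft.fft2(image))
mask = f > 0
slope = np.polyfit(np.log(f[mask]), np.log(amplitude[mask]), 1)[0]
print("fitted log-log slope:", slope)
```

The fitted slope should come out close to -1, the fall-off described for natural images; individual frequencies scatter around this trend because the underlying noise is random.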
[Figure 2.22, panels (a) Luminance and (b) L–M: contrast sensitivity plotted against spatial frequency (c/deg), from 0.25 to 16 c/deg. Panel label: KL.]
Figure 2.22 (a) Changes in threshold sensitivity for luminance spatial gratings following adaptation to luminance images with 1/f amplitude spectra (Webster and Miyahara 1997). Adaptation reduces sensitivity at low to medium spatial frequencies. (b) Comparable results for chromatic (L vs M) gratings and adapting images (Webster 1999).
Acknowledgements
This chapter is based on an earlier review by Webster (1996). Supported by EY10834.
References Abramov, I. and Gordon, J. (1994). Color appearance: on seeing red – or yellow, or green, or blue. Annual Review of Psychology 45, 451–485. Albrecht, D.G. and Hamilton, D.B. (1982). Striate cortex of monkey and cat: Contrast response function. Journal of Neurophysiology 48, 217–237. Albrecht, D.G., Farrar, S.B., and Hamilton, D.B. (1984). Spatial contrast adaptation characteristics of neurones recorded in the cat’s visual cortex. Journal of Physiology 347, 713–739. Arend, L.E. (1993). How much does illuminant color affect unattributed colors? Journal of the Optical Society of America A 10, 2134–2147. Atick, J.J. (1992). Could information theory provide an ecological theory of sensory processing? Network 3, 213–251. Atick, J.J., Li, Z., and Redlich, A.N. (1992). Understanding retinal color coding from first principles. Neural Computation 4, 559–572. Atick, J.J., Li, Z., and Redlich, A.N. (1993). What does post-adaptation color appearance reveal about cortical color representation? Vision Research 33, 123–129. Barlow, H.B. (1972). Dark and light adaptation: Psychophysics. In Handbook of sensory physiology, VII/4, (ed. D. Jameson and L.M. Hurvich), pp. 1–28. Springer-Verlag, New York. Barlow, H.B. (1990a). Conditions for versatile learning, Helmholtz’s unconscious inference, and the task of perception. Vision Research 30, 1561–1571. Barlow, H.B. (1990b). A theory about the functional role and synaptic mechanism of visual after-effects. In Vision: coding and efficiency, (ed. C. Blakemore), pp. 363–375. Cambridge University Press, Cambridge. Barlow, H.B. and Földiák, P. (1989). Adaptation and decorrelation in the cortex. In The computing neuron, (ed. R. Durbin, C. Miall, and G.J. Mitchison), pp. 54–72. Addison-Wesley, Wokingham. Barlow, H.B., MacLeod, D.I.A., and van Meeteren, A. (1976). Adaptation to gratings: no compensatory advantages found. Vision Research 16, 1043–1045. Benzschawel, T. and Guth, S.L. (1982). 
Post-receptor chromatic mechanisms revealed by flickering vs fused adaptation. Vision Research 22, 69–75. Blake, R., Overton, R., and Lema-Stern, S. (1981). Interocular transfer of visual aftereffects. Journal of Experimental Psychology: Human Perception and Performance 7, 367–381. Blakemore, C. and Campbell, F.W. (1969). On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. Journal of Physiology 203, 237–260. Blakemore, C. and Sutton, P. (1969). Size adaptation: A new aftereffect. Science 166, 245–247. Blakemore, C., Muncey, J.P.J., and Ridley, R.M. (1971). Perceptual fading of a stabilized cortical image. Nature 233, 204–205. Boynton, R.M. and Kambe, N. (1980). Chromatic difference steps of moderate size measured along theoretically critical axes. Color Research and Application 5, 13–23. Braddick, O., Campbell, F.W., and Atkinson, J. (1978). Channels in vision: Basic aspects. In Handbook of sensory physiology VIII, (ed. R. Held, H. W. Leibowitz, and H. Teuber), pp. 3–38. Springer-Verlag, Berlin. Bradley, A., Switkes, E., and De Valois, K.K. (1988). Orientation and spatial frequency selectivity of adaptation to color and luminance gratings. Vision Research 28, 841–856. Brainard, D.H. and Wandell, B.A. (1992). Asymmetric color matching: How color appearance depends on the illuminant. Journal of the Optical Society of America A 9, 1433–1448.
Brown, R.O. and MacLeod, D.I.A. (1997). Color appearance depends on the variance of surround colors. Current Biology 7, 844–849. Burton, G.J. and Moorhead, I.R. (1987). Color and spatial structure in natural scenes. Applied Optics 26, 157–170. Chaparro, A., Stromeyer, C.F., Chen, G., and Kronauer, R.E. (1995). Human cones appear to adapt at low light levels: Measurements on the red–green detection mechanism. Vision Research 35, 3103–3118. Chen, B., MacLeod, D.I.A., and Stockman, A. (1987). Improvement in human vision under bright light: Grain or gain? Journal of Physiology 394, 41–66. Chichilnisky, E.-J. and Wandell, B.A. (1995). Photoreceptor sensitivity changes explain color appearance shifts induced by large uniform backgrounds in dichoptic matching. Vision Research 35, 239–254. Cicerone, C.M., Hayhoe, M.M., and MacLeod, D.I.A. (1990). The spread of adaptation in human foveal and parafoveal cone vision. Vision Research 30, 1603–1615. Cohen, J. (1964). Dependency of the spectral reflectance curves of the Munsell color chips. Psychonomic Science 1, 369–370. Cole, G.R., Hine, T., and McIlhagga, W. (1993). Detection mechanisms in L-, M-, and S-cone contrast space. Journal of the Optical Society of America A 10, 38–51. Cottaris, N.P. and De Valois, R.L. (1998). Temporal dynamics of chromatic tuning in macaque primary visual cortex. Nature 395, 896–900. Dannemiller, J.L. (1989). Computational approaches to color constancy: Adaptive and ontogenetic considerations. Psychological Review 96, 255–266. Dannemiller, J.L. (1993). Rank orderings of photoreceptor photon catches from natural objects are nearly illuminant-invariant. Vision Research 33, 131–140. De Valois, K.K. (1977). Independence of black and white: Phase-specific adaptation. Vision Research 17, 209–215. De Valois, R.L. and De Valois, K.K. (1975). Neural coding of color. In Handbook of perception, (ed. E.C. Carterette and M.P. Friedman), Vol. 5, pp. 117–166. Academic Press, New York. De Valois, R.L. 
and De Valois, K.K. (1993). A multi-stage color model. Vision Research 33, 1053–1065. De Valois, R.L., Abramov, I., and Jacobs, G.H. (1966). Analysis of response patterns of LGN cells. Journal of the Optical Society of America 56, 966–977. Derrington, A.M., Krauskopf, J., and Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. Journal of Physiology 357, 241–265. Dodwell, P.C. and Humphrey, G.K. (1990). A functional theory of the McCollough Effect. Psychological Review 97, 78–89. Dong, D.W. and Atick, J.J. (1995). Statistics of time-varying images. Network: Computation in Neural Systems 6, 345–358. D’Zmura, M. (1991). Color in visual search. Vision Research 31, 951–966. D’Zmura, M. and Knoblauch, K. (1998). Spectral bandwidths for the detection of color. Vision Research 38, 3117–3128. D’Zmura, M. and Lennie, P. (1986). Mechanisms of color constancy. Journal of the Optical Society of America A 3, 1662–1672. Elsner, A. (1978). Hue difference contours can be used in processing orientation information. Perception and Psychophysics 25, 451–456. Fairchild, M.D. and Reniff, L. (1995). Time course of chromatic adaptation for color-appearance judgments. Journal of the Optical Society of America A 12, 824–833. Favreau, O.E. and Cavanagh, P. (1981). Color and luminance: Independent frequency shifts. Science 212, 831–2.
Field, D.J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A 4, 2379–2394. Fiorentini, A., Baumgartner, G., Magnussen, S., Schiller, P.H., and Thomas, J.P. (1990). The perception of brightness and darkness: Relations to neuronal receptive fields. In Visual perception The neurophysiological foundations, (ed. L. Spillmann and J.S. Werner), pp. 129–161. Academic, San Diego. Flanagan, P., Cavanagh, P., and Crassini, B. (1989). McCollough effects for equiluminant gratings. Investigative Ophthalmology and Visual Science (Suppl.) 30, 130. Flanagan, P., Cavanagh, P., and Favreau, O.E. (1990). Independent orientation-selective mechanisms for the cardinal directions of color space. Vision Research 30, 769–778. Foster, D.H. and Nascimento, M.C. (1994). Relational color constancy from invariant cone-excitation ratios. Proceedings of the Royal Society of London B 257, 115–121. Frisby, J.P. (1979). Seeing: Illusion, brain and mind. Oxford University Press, Oxford. Gegenfurtner, K. and Kiper, D.C. (1992). Contrast detection in luminance and chromatic noise. Journal of the Optical Society of America A 9, 1880–1888. Georgeson, M.A. (1985). The effect of spatial adaptation on perceived contrast. Spatial Vision 1, 103–112. Gibson, J.J. and Radner, M. (1937). Adaptation, after-effect and contrast in the perception of tilted lines. I. Quantitative studies. Journal of Experimental Psychology 20, 453–467. Gilinsky, A.S. (1968). Orientation-specific effects of patterns of adapting light on visual acuity. Journal of the Optical Society of America 58, 13–17. Giulianini, F. and Eskew, R.T.J. (1998). Chromatic masking in the (L/L, M/M) plane of cone-contrast space reveals only two detection mechanisms. Vision Research 38, 3913–3926. Graham, N. (1989). Visual pattern analyzers. Oxford University Press, Oxford. Greenlee, M.W. and Heitger, F. (1988). The functional role of contrast adaptation. 
Vision Research 28, 791–797. Greenlee, M.W., Georgeson, M.A., Magnussen, S., and Harris, J.P. (1991). The time course of adaptation to spatial contrast. Vision Research 31, 223–236. Guth, S.L. (1982). Hue shifts following flicker vs. fused adaptation reveal initial opponent mechanisms. Investigative Ophthalmology and Visual Science (Suppl.) 22, 78. Guth, S.L. (1991). Model for color vision and light adaptation. Journal of the Optical Society of America A 8, 976–993. Guth, S.L. and Moxley, J.P. (1982). Hue shifts following differential postreceptor achromatic adaptation. Journal of the Optical Society of America 72, 301–303. Hayhoe, M. and Wenderoth, P. (1991). Adaptation mechanisms in color and brightness. In From pigments to perception, (ed. A. Valberg and B.B. Lee), pp. 353–367. Plenum, New York. Hayhoe, M.M. (1990). Spatial interactions and models of adaptation. Vision Research 30, 957–965. Hayhoe, M.M., Levin, M.E., and Koshel, R.J. (1992). Subtractive processes in light adaptation. Vision Research 32, 323–333. Hood, D.C. (1998). Lower-level visual processing and models of light adaptation. Annual Review of Psychology 49, 503–535. Ingling, C.R. and Martinez-Uriegas, E. (1983). The relationship between spectral sensitivity and spatial sensitivity for the primate r-g X-channel. Vision Research 23, 1495–1500. Jameson, D. and Hurvich, L.M. (1972). Color adaptation: Sensitivity, contrast, after-images. In Handbook of sensory physiology, VII/4, (ed. D. Jameson and L.M. Hurvich). Springer, New York. Judd, D.B., MacAdam, D.L., and Wyszecki, G. (1964). Spectral distribution of typical daylight as a function of correlated color temperature. Journal of the Optical Society of America 54, 1031–1040. Köhler, W. and Wallach, H. (1944). Figural aftereffects: An investigation of visual processes. Proceedings of the American Philosophical Society 88, 269–357.
Kortum, P.T. and Geisler, W.S. (1995). Adaptation mechanisms in spatial vision – II. Flash thresholds and background adaptation. Vision Research 35, 1595–1609.
Krauskopf, J. and Gegenfurtner, K. (1992). Color discrimination and adaptation. Vision Research 32, 2165–2175.
Krauskopf, J. and Zaidi, Q. (1986). Induced desensitization. Vision Research 26, 759–762.
Krauskopf, J., Williams, D.R., and Heeley, D.W. (1982). Cardinal directions of color space. Vision Research 22, 1123–1131.
Krauskopf, J., Williams, D.R., Mandler, M.B., and Brown, A.M. (1986a). Higher order color mechanisms. Vision Research 26, 23–32.
Krauskopf, J., Zaidi, Q., and Mandler, M.B. (1986b). Mechanisms of simultaneous color induction. Journal of the Optical Society of America A 3, 1752–1757.
Krauskopf, J., Wu, H.-J., and Farell, B. (1996). Coherence, cardinal directions, and higher-order mechanisms. Vision Research 36, 1235–1245.
Le Grand, Y. (1949). Les seuils différentiels de couleurs dans la théorie de Young. Revue d’optique théorique et instrumentale 28, 261–278.
Lee, B.B., Martin, P.R., and Valberg, A. (1988). The physiological basis of heterochromatic flicker photometry demonstrated in ganglion cells of the macaque retina. Journal of Physiology 404, 323–347.
Legge, G.E. and Foley, J.M. (1980). Contrast masking in human vision. Journal of the Optical Society of America 70, 1458–1469.
Lennie, P., Krauskopf, J., and Sclar, G. (1990). Chromatic mechanisms in striate cortex of macaque. Journal of Neuroscience 10, 649–669.
Lennie, P., Pokorny, J., and Smith, V.C. (1993). Luminance. Journal of the Optical Society of America A 10, 1283–1293.
Li, A. and Lennie, P. (1997). Mechanisms underlying segmentation of colored textures. Vision Research 37, 83–97.
Livingstone, M. and Hubel, D. (1988). Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science 240, 740–749.
Maattanen, L.M. and Koenderink, J.J. (1991). Contrast adaptation and contrast gain control. Experimental Brain Research 87, 205–212.
MacLeod, D.I.A. (1978). Visual sensitivity. Annual Review of Psychology 29, 613–645.
MacLeod, D.I.A. (1985). Receptoral constraints on colour appearance. In Central and peripheral mechanisms of colour vision, (ed. D. Ottoson and S. Zeki), p. 103. Macmillan, London.
MacLeod, D.I.A. and Boynton, R.M. (1979). Chromaticity diagram showing cone excitation by stimuli of equal luminance. Journal of the Optical Society of America 69, 1183–1186.
MacLeod, D.I.A., Williams, D.R., and Makous, W. (1992). A visual nonlinearity fed by single cones. Vision Research 32, 347–363.
Maffei, L., Fiorentini, A., and Bisti, S. (1973). Neural correlate of perceptual adaptation to gratings. Science 182, 1036–1038.
McCollough, C. (1965). Color adaptation of edge-detectors in the human visual system. Science 149, 1115–1116.
Merigan, W.H. and Maunsell, J.H.R. (1993). How parallel are the primate visual pathways? Annual Review of Neuroscience 16, 369–402.
Mollon, J.D. (1989). ‘Tho’ she kneel’d in that place where they grew…’. Journal of Experimental Biology 146, 21–38.
Moorhead, I.R. (1985). Human color vision and natural images. Inst. Elec. Radio Eng. Pub. 61, 21.
Mullen, K.T. and Kingdom, F.A.A. (1991). Colour contrast and form perception. In Vision and visual dysfunction 6: The perception of colour, (ed. P. Gouras), pp. 198–217. Macmillan, London.
Nagle, M.G. and Osorio, D. (1993). The tuning of human photopigments may minimize red–green chromatic signals in natural conditions. Proceedings of the Royal Society London B 252, 209–213.
Nagy, A.L. and Sanchez, R.R. (1990). Critical color differences determined with a visual search task. Journal of the Optical Society of America A 7, 1209–1217.
Nagy, A.L., Eskew, R.T.J., and Boynton, R.M. (1987). Analysis of color-matching ellipses in cone-excitation space. Journal of the Optical Society of America A 4, 756–768.
Ohzawa, I., Sclar, G., and Freeman, R.D. (1982). Contrast gain control in the cat visual cortex. Nature 298, 266–268.
Paradiso, M.A., Shimojo, S., and Nakayama, K. (1989). Subjective contours, tilt aftereffects, and visual cortical organization. Vision Research 29, 1205–1213.
Párraga, C.A., Brelstaff, G., Troscianko, T., and Moorehead, I.R. (1998). Color and luminance information in natural scenes. Journal of the Optical Society of America A 15, 563–569.
Poirson, A.B. and Wandell, B.A. (1996). Pattern-color separable pathways predict sensitivity to simple colored patterns. Vision Research 36, 515–526.
Pokorny, J. and Smith, V.C. (1977). Evaluation of single-pigment shift model of anomalous trichromacy. Journal of the Optical Society of America 67, 1196–1209.
Pokorny, J., Shevell, S., and Smith, V.C. (1991). Colour appearance and color constancy. In Vision and visual dysfunction 6: The perception of colour, (ed. P. Gouras). Macmillan, London.
Pugh, E.N. and Mollon, J.D. (1979). A theory of the π1 and π3 color mechanisms of Stiles. Vision Research 19, 293–312.
Regan, B.C., Reffin, J.P., and Mollon, J.D. (1994). Luminance noise and the rapid determination of discrimination ellipses in colour deficiency. Vision Research 34, 1279–1299.
Ross, J., Speed, H.D., and Morgan, M.J. (1993). The effects of adaptation and masking on incremental thresholds for contrast. Vision Research 33, 2051–2056.
Ruderman, D.L. (1994). The statistics of natural images. Network 5, 517–548.
Ruderman, D.L., Cronin, T.W., and Chiao, C.-C. (1998). Statistics of cone responses to natural images: implications for visual coding. Journal of the Optical Society of America A 15, 2036–2045.
Sankeralli, M.J. and Mullen, K.T. (1996). Estimation of the L-, M-, and S-cone weights of the postreceptoral detection mechanisms. Journal of the Optical Society of America A 13, 906–915.
Sankeralli, M.J. and Mullen, K.T. (1997). Postreceptoral chromatic detection mechanisms revealed by noise masking in three-dimensional cone contrast space. Journal of the Optical Society of America A 14, 2633–2646.
Saul, A.B. and Cynader, M.S. (1989). Adaptation in single units in visual cortex: The tuning of aftereffects in the spatial domain. Visual Neuroscience 2, 593–607.
Schiller, P.H. and Logothetis, N.K. (1990). The color-opponent and broad-band channels of the primate visual system. Trends in Neuroscience 13, 392–398.
Schnapf, J.L., Nunn, B.J., Meister, M., and Baylor, D.A. (1990). Visual transduction in the cones of the monkey Macaca fascicularis. Journal of Physiology 427, 681–713.
Sclar, G., Lennie, P., and DePriest, D.D. (1989). Contrast adaptation in striate cortex of macaque. Vision Research 29, 747–755.
Seiple, W., Holopigian, K., Greenstein, V., and Hood, D.C. (1992). Temporal frequency dependent adaptation at the level of the outer retina in humans. Vision Research 32, 2043–2048.
Shapiro, A. and Zaidi, Q. (1992). The effects of prolonged temporal modulation on the differential response of color mechanisms. Vision Research 32, 2065–2075.
Shapley, R. (1990). Visual sensitivity and parallel retinocortical channels. Annual Review of Psychology 41, 635–658.
Shapley, R.M. and Enroth-Cugell, C. (1984). Visual adaptation and retinal gain controls. In Progress in retinal research, (ed. N.N. Osborne and G.J. Chader), Vol. 3, pp. 263–343. Pergamon, Oxford.
Shevell, S.K. (1978). The dual role of chromatic backgrounds in color perception. Vision Research 18, 1649–1661.
Shevell, S.K. (1992). Redness from short-wavelength-sensitive cones does not induce greenness. Vision Research 32, 1551–1556.
Siegel, S. and Allan, L.G. (1992). Pairings in learning and perception: Pavlovian conditioning and contingent aftereffects. In The psychology of learning and motivation, (ed. D. Medin), Vol. 28, pp. 127–160. Academic Press, New York.
Smirnakis, S.M., Berry, M.J., Warland, D.K., Bialek, W., and Meister, M. (1997). Adaptation of retinal processing to image contrast and spatial scale. Nature 386, 69–73.
Stiles, W.S. (1959). Color vision: The approach through increment-threshold sensitivity. Proceedings of the National Academy of Sciences 45, 100–114.
Stiles, W.S. (1961). Adaptation, chromatic adaptation, colour transformation. Anales Real Sociedad Española de Fisica y Quimica 57, 149–175.
Stockman, A., MacLeod, D.I.A., and DePriest, D.D. (1991). The temporal properties of the human short-wave photoreceptors and their associated pathways. Vision Research 31, 189–208.
Stockman, A., MacLeod, D.I.A., and Johnson, N.E. (1993). Spectral sensitivities of the human cones. Journal of the Optical Society of America A 10, 2491–2521.
Stromeyer, C.F. (1978). Form-color aftereffects in human vision. In Handbook of sensory physiology, VIII, (ed. R. Held, H.W. Leibowitz, and H.L. Teuber). Springer-Verlag, New York.
Stromeyer, C.F., Chaparro, A., Rodriguez, C., Chen, D., Hu, E., and Kronauer, R.E. (1998). Short-wave cone signal in the red–green detection mechanism. Vision Research 38, 813–826.
Stromeyer, C.F., Cole, G.R., and Kronauer, R.E. (1985). Second-site adaptation in the red–green chromatic pathways. Vision Research 25, 219–237.
Stromeyer, C.F. and Dawson, B.M. (1978). Form-color aftereffects: selectivity to local luminance contrast. Perception 7, 407–415.
Stromeyer, C.F., Eskew, R.T., Kronauer, R.E., and Spillmann, L. (1991). Temporal phase response of the short-wave cone signal for color and luminance. Vision Research 31, 787–803.
Stromeyer, C.F., Kronauer, R.E., Madsen, J.C., and Cohen, M.A. (1980). Spatial adaptation of short-wavelength pathways in humans. Science 207, 555–557.
Stromeyer, C.F. and Lee, J. (1988). Adaptational effects of short wave cone signals on red–green chromatic detection. Vision Research 28, 931–940.
Switkes, E., Bradley, A., and De Valois, K.K. (1988). Contrast dependence and mechanisms of masking interactions among chromatic and luminance gratings. Journal of the Optical Society of America A 5, 1149–1162.
Tolhurst, D.J., Tadmor, Y., and Chao, T. (1992). Amplitude spectra of natural images. Ophthalmic and Physiological Optics 12, 229–232.
von Kries, J. (1970). Festschrift der Albrecht-Ludwigs-Universität (Fribourg, 1902). In Sources of color science, (ed. D.L. MacAdam). MIT Press, Cambridge MA, USA.
Walraven, J., Enroth-Cugell, C., Hood, D.C., MacLeod, D.I.A., and Schnapf, J.L. (1990). The control of visual sensitivity: Receptoral and postreceptoral processes. In Visual perception: The neurophysiological foundations, (ed. L. Spillmann and J.S. Werner), pp. 53–101. Academic Press, San Diego.
Webster, M.A. (1996). Human color perception and its adaptation. Network: Computation in Neural Systems 7, 587–634.
Webster, M.A. (1999). Contrast sensitivity under natural states of adaptation. In Human vision and electronic imaging, (ed. B. Rogowitz and T. Pappas). SPIE 3644, 58–70.
Webster, M.A., De Valois, K.K., and Switkes, E. (1987). Effect of contrast adaptation on color and luminance interactions. Investigative Ophthalmology and Visual Science (Suppl.) 28, 214.
Webster, M.A. and Malkoc, G. (2000). Color-luminance relationships and the McCollough Effect. Perception and Psychophysics 62, 659–672.
Webster, M.A. and Miyahara, E. (1997). Contrast adaptation and the spatial structure of natural images. Journal of the Optical Society of America A 14, 2355–2366.
Webster, M.A. and Mollon, J.D. (1991). Changes in colour appearance following post-receptoral adaptation. Nature 349, 235–238.
Webster, M.A. and Mollon, J.D. (1993). Contrast adaptation dissociates different measures of luminous efficiency. Journal of the Optical Society of America A 10, 1332–1340.
Webster, M.A. and Mollon, J.D. (1994). The influence of contrast adaptation on color appearance. Vision Research 34, 1993–2020.
Webster, M.A. and Mollon, J.D. (1995). Colour constancy influenced by contrast adaptation. Nature 373, 694–698.
Webster, M.A. and Mollon, J.D. (1997a). Adaptation and the color statistics of natural images. Vision Research 37, 3283–3298.
Webster, M.A. and Mollon, J.D. (1997b). Motion minima for different directions in color space. Vision Research 37, 1479–1498.
Webster, M.A., Raker, V.E., and Malkoc, G. (1998). Visual search and natural color distributions. In Human vision and electronic imaging, (ed. B. Rogowitz and T. Pappas). SPIE 3299, pp. 264–273.
Webster, M.A. and Wilson, J.A. (2000). Interactions between chromatic adaptation and contrast adaptation in colour appearance. Vision Research 40, 3801–3816.
Wei, J. and Shevell, S.K. (1995). Colour appearance under chromatic adaptation varied along theoretically significant axes in color space. Journal of the Optical Society of America A 12, 36–46.
Wenderoth, P., Bray, R., and Johnstone, S. (1988). Psychophysical evidence for an extrastriate contribution to a pattern-selective motion aftereffect. Perception 17, 81–91.
Whittle, P. and Challands, P.D.C. (1969). The effect of background luminance on the brightness of flashes. Vision Research 9, 1095–1110.
Wilson, H.R. and Humanski, R. (1993). Spatial frequency adaptation and contrast gain control. Vision Research 33, 1133–1149.
Wohlgemuth, A. (1911). On the after-effect of seen movement. British Journal of Psychology Monograph Supplements 1, 1–117.
Wyszecki, G. (1986). Color appearance. In Handbook of perception and human performance, Vol. 1: Sensory Processes and Perception, (ed. K. Boff, L. Kaufman, and J. Thomas), pp. 9–57. Wiley, New York.
Zaidi, Q. and Halevy, D. (1993). Visual mechanisms that signal the direction of color changes. Vision Research 33, 1037–1051.
Zaidi, Q. and Shapiro, A.G. (1993). Adaptive orthogonalization of opponent-color signals. Biological Cybernetics 69, 415–428.
Zeki, S.M. (1978). Functional specialization in the visual cortex of the rhesus monkey. Nature 274, 423–428.
Zrenner, E. (1983). Neurophysiological aspects of color vision in primates. Springer-Verlag, Berlin.
commentary: light adaptation, contrast adaptation
Commentary on Webster
Adaptation and the ambiguity of response measures with respect to internal structure
Franz Faul

In his contribution to this book, Webster examines different adaptation processes in human colour vision. He covers a broad range of topics, from basic processes of light adaptation to functional aspects, and provides a lucid and integrated review of important results in the field. Given the great diversity of relevant findings, this is not an easy task; however, Webster’s integrative treatment offers a sense of orientation in a very complex field governed by methodological and theoretical eclecticism, which has its obvious advantages, but also poses extraordinary challenges to anyone who aims at a more coherent theoretical picture. Webster’s integrative account provides an excellent basis for discussing the more general theme of how adaptation experiments can be used in psychophysical research to reveal basic structural properties of the visual system. The question I would like to pursue in this commentary is: ‘To what extent can one draw sound conclusions regarding the structure of the visual system on the basis of adaptation effects observable in adequately designed experimental settings?’ I will focus on this methodological issue, whereby most of my remarks elaborate on critical points already mentioned by the author.

The evaluation of the methodological status of adaptation experiments is complicated by the fact that without a precisely specified theoretical context ‘adaptation’ is merely a descriptive rather than an explanatory category. It seems therefore appropriate to first clarify the conceptual framework: in a general sense, adaptation means that a system changes its properties depending upon the prevailing spatiotemporal context. In a formal description of the system, these dynamic properties would normally be modelled by a set of parameters. Possible purposes of such dynamic adjustments are manifold.
In sensory systems they may serve, for instance, to protect the system, to map a large input range to a smaller dynamic range of the sensors, to enhance relevant attributes of target objects, to allow a more efficient encoding of the signal, or to recalibrate the system when the structure of the input changes systematically over time. Each of these functional goals can be achieved by a great variety of potential mechanisms. Typical examples of mechanisms postulated in biological systems are sensitivity changes in receptors or the adjustment of weights in neuronal networks.

If the internal structure of the system is known, then it is clear which properties of the system are dynamic and how the response characteristics of the system change with modifications of its parameters. However, the goal of vision science is to uncover the structural properties and mechanisms of a largely unknown system. In psychophysical investigations it is therefore necessary to rely on experimental observations to identify adaptive properties of the system. To this end the following criterion is usually employed: if the system’s response to a constant input is changed in a systematic and reproducible way when the spatiotemporal context is altered, then it may be concluded that the context has influenced an adaptive mechanism. Essentially, this means that adaptation is identified with a systematic context-dependent change in the system’s response to a constant stimulus.

However, this approach has a serious flaw, since without thorough knowledge of the system’s design we lack a criterion by which to decide whether a certain stimulus condition represents a constant input for the subsystem we are interested in. It is therefore possible that context-dependent changes in the response of the system are actually due to the specific coding properties of the system. The subtractive component of light adaptation may serve as a typical example. It is
generally assumed that a centre-surround antagonism of retinal cells is responsible for this subtractive component. According to this assumption, the change in the system’s response to a constant test field that can be observed when the brightness of the surround is varied is due to a change in the relative input of the interacting subsystems (retinal cells) and not to a dynamic change of their parameters. The ‘adaptation criterion’ does not apply in this example, because holding only the infield colour constant does not represent a constant input in the putative antagonistic mechanism. The above-mentioned problem is particularly grave if we consider information-processing systems such as the brain, since the responses of such systems do not depend solely on the input, but are also influenced by interpretations of the system, often referred to as top-down effects. For example, there is ample evidence that the visual system treats object and shadow borders differently (Gilchrist et al. 1983). Systematic changes in the response properties with varying context information may therefore also result, if the context triggers different interpretations and correspondingly different mechanisms of the system. This is a problematic point, since at present we have only limited knowledge about the interpretations made by the system and the cues that may trigger them. The research on light adaptation summarized by Webster in his first subsection demonstrates that it is possible to infer basic mechanisms of the visual system despite these difficulties. We may therefore use this research as an example to identify useful strategies to reduce the inherent ambiguity of response measures with respect to internal structure. Three characteristic features of this research seem to be especially important. A first point is that the relevant input into the system and its dependence upon properties of surfaces and illuminations can be clearly described. 
A second point is that most processes of light adaptation seem to be located in the retina, which is comparatively well understood. This knowledge of structural properties of the system can be used as a research guide, since it places specific constraints on possible mechanisms. Objections raised against Stiles’ proposal of four independent π-mechanisms, for example, were mainly based on the argument that it lacks plausibility with respect to known structural properties of the retina. A further, and very important, point is that research on light adaptation is led by clear hypotheses concerning the function of the mechanism. One searches explicitly for mechanisms that, on the one hand, map the large dynamic range of the input on the restricted range of neuronal responses and, on the other hand, isolate contrast, which contains information on the reflectance properties of surfaces that is invariant against changes in the overall illumination. It is instructive to compare the strategy used in research on light adaptation with the very different ‘general’ approach outlined in Webster’s section on contrast adaptation. The main reason for the evident methodological shift to a more general approach is, in my view, that the theoretical framework is considerably more vague in the latter case: (1) The structure of the input into the putative submechanisms is less clearly defined (‘structure or pattern in the stimulus’); (2) many structural properties of cortical areas, where the mechanisms of contrast adaptation are supposed to be located, are unknown; and (3) the functional role of contrast adaptation is also mostly unclear (but see Webster and Mollon 1995). Correspondingly, assumptions specific to the function or mechanism under scrutiny are (implicitly) replaced by more general assumptions about the structure of the visual system. 
Based on these assumptions, adaptation experiments are now interpreted as a general paradigm that follows a logic that is independent from the particularities of the subsystems under investigation and that can therefore be used as a general tool to probe structural properties of the visual system. The labelling of adaptation experiments as the ‘psychologist’s microelectrode’ nicely characterizes this picture. The basic assumption underlying this approach was aptly described by Barlow (1999): ‘Possibly the whole sensory cortex should be viewed as an immense bank of tuned filters, each collecting the information that enables it to detect with high sensitivity the occurrence of a patterned feature having
characteristics lying within a specific range.’ In addition, it is usually assumed that specific stimulus attributes, as for example colour or motion, are processed separately, in parallel, hierarchically organized pathways in which higher levels encode increasingly complex stimulus attributes. This general picture is supplemented with the assumption that the postulated feature detectors can fatigue independently of each other after prolonged selective stimulation, which results in specific aftereffects. Using this set of assumptions, the strategy is to deduce from specific after-effects observable after prolonged stimulation the existence of a detector for the attribute varied in the adaptation stimulus. Whether this reasoning is justified depends on the plausibility of the above-mentioned assumptions. However, a critical analysis reveals that serious objections may be raised against virtually all of them. A first point concerns the general picture underlying this approach: It is in no way a firmly established fact that the cortex can be described as an ‘immense bank of tuned filters’. Barlow and Földiák (1989, p. 54) characterize our present knowledge as follows: ‘Many readers will be familiar with the picture of neural activity in the primary visual cortex that has been revealed by the work of Hubel (1988), Wiesel and many others [. . .], but most will probably agree that nobody knows quite well what will happen next.’ A recent analysis of a large body of physiological results by Lennie (1998, 1999) queries the assumption of independent stimulus-specific pathways in particular. Contrary to this assumption, the evidence suggests that, beginning with V1, a close-coupled analysis of different stimulus attributes takes place in cortical areas. 
With respect to the task of higher cortical areas, he concludes that ‘one should view them as undertaking analyses of the image at successively more complex levels of aggregation (the units of analysis are more complex object elements) rather than along different stimulus dimensions’ (Lennie 1999, p. 245). This view is compatible with the more general theoretical picture outlined by Mausfeld (Chapter 13, this volume) that emphasizes the importance of object-related internal categories as induced by corresponding representational primitives. From this perspective one would expect that there is no simple relation between the response of cortical neurons and variations of elementary stimulus attributes in the image, but that their response depends in complex ways on interpretations of the image that are related to attributes of objects.

The assumption of feature-detectors that fatigue independently may also be problematic. This is clearly recognized by Webster in his evaluation of the work of Atick et al. (1993). These authors show that the results of Webster and Mollon (1991), which were thought to support a model of multiple independent detectors, may be described equally well by a decorrelation model based on mutual interactions between detectors. This example demonstrates that it is, in general, not possible to infer from specific after-effects the existence of a detector for the attribute that is varied in the adapting stimulus. Although both types of model describe the data to a comparable degree of precision, there are two characteristics of the decorrelation approach that let it appear more favourable. First, it provides a functional basis for adaptation processes and, secondly, it avoids the theoretically unsatisfying concept of fatigue, which is in my view rightly criticized by Atick et al. (1993, p.
129): ‘We find the concept of fatigue uncompelling; it is hard to accept that a system such as the brain—which is known to exhibit all sorts of intricate adaptations—does so merely because of a breakdown of its neuronal response abilities and not in order to serve some function.’ Given the questionable empirical status of basic assumptions underlying the general adaptation paradigm, conclusions from experimental findings on structural properties of the visual system can only be drawn with great caution. As a possible way to alleviate this uncertainty Webster proposes to use consistency with other findings in the field as a criterion to decide between alternative interpretations. However, although consistency is surely something we should require in any sound theoretical approach, it is notoriously hard to evaluate on the basis of results from different paradigms. The main reason is that apparently contradictory findings may ‘be compatible if they reflect different aspects and levels of the system’, as Webster rightly states. Considering this circumstance, the consistency criterion
may lead to a self-immunization of theories instead of helping to decide between different approaches: Faced with contradictory results, it is always tempting to refer to hitherto unknown aspects of the highly complex and barely understood visual system. Thus inconsistencies between results from different paradigms may be resolved by referring to higher processes, and deviations of experimental data from model predictions may be resolved by postulating additional mechanisms. Signs of such an inflation of explanatory entities are clearly evident in the long list of potential mechanisms considered by Webster: multiple post-receptoral stages, multiple colour–luminance mechanisms, on–off pathways, multiple contrast-mechanisms, and multiple visual pathways. Given this state of affairs, there is the obvious danger of ending in a ‘Ptolemaic theory’ of the visual system that is descriptively satisfying but theoretically unfruitful.

A more promising approach seems to be to embed investigations of adaptation processes in specific theoretical frameworks that allow computational analyses and considerations of functional aspects. Examples of such approaches are given in the last sections of Webster’s review and in several other chapters of this volume. From this perspective, the characterization of adaptation experiments as the ‘psychologist’s microelectrode’ has something to it, since they seem to share a problem with single-unit studies: ‘Single-unit recording on its own is a weak instrument for discovering what the cortex really does, but when harnessed to a theory it is immensely valuable’ (Lennie 1998, p. 924).
References
Atick, J.J., Li, Z., and Redlich, A.N. (1993). What does post-adaptation color appearance reveal about cortical color representation? Vision Research 33, 123–129.
Barlow, H.B. (1999). Feature detectors. In The MIT encyclopedia of the cognitive sciences, (ed. F.C. Keil), pp. 311–314. MIT Press, Cambridge, MA, USA.
Barlow, H.B. and Földiák, P. (1989). Adaptation and decorrelation in the cortex. In The computing neuron, (ed. G. Mitchison), pp. 54–72. Addison-Wesley, Wokingham, England.
Gilchrist, A., Delman, S., and Jacobsen, A. (1983). The classification and integration of edges as critical to the perception of reflectance and illumination. Perception and Psychophysics 33, 425–436.
Lennie, P. (1998). Single units and visual cortical organization. Perception 27, 889–935.
Lennie, P. (1999). Color coding in the cortex. In Color vision: From genes to perception, (ed. L.T. Sharpe), pp. 235–247. Cambridge University Press, Cambridge, UK.
Webster, M.A. and Mollon, J.D. (1991). Changes in colour appearance following post-receptoral adaptation. Nature 349, 235–238.
Webster, M.A. and Mollon, J.D. (1995). Colour constancy influenced by contrast adaptation. Nature 373, 694–698.
chapter 3
CONTRAST COLOURS
Paul Whittle

So the judgements that we hold about the colours of objects seem not to depend uniquely on the absolute nature of the rays of light that paint the picture of objects on the retina; our judgements can be changed by the surroundings, and it is probable that we are influenced more by the ratio of some of the properties of the light rays than by the properties themselves, considered in an absolute manner.
Monge (1789, cited by Mollon 1995)

Hue is perhaps the direct perception of the direction of departures of the stimulus from some reference stimulus.
Evans (1974, p. 234)

The brain forms colors by comparing objects to their background and not by analyzing their local spectral reflectance. An object is bright or dark, and of a particular color, only in relationship to its background.
Gouras and Zrenner (1981)
Preface
I emphasize in this chapter both the long history and the paradoxical nature of the simplest description of chromatic adaptation, best known in the form of von Kries’s ‘law of coefficients’, and currently demonstrated by the explanatory value of ‘cone contrasts’. Over more than 100 years this has repeatedly been discovered, forgotten, re-discovered, proved, and disproved in different contexts. It is time that we put an end to this repetition compulsion, accepted its problematic nature (that it sometimes holds and sometimes does not), and faced the perhaps uncomfortable implications. It is probably a good approximate description of retinal function (‘contrast coding’) but perceived colour depends not only on that level but also on the parsing of the scene into surface colour, lighting, and spatial layout. These higher-level interactions can counteract the lower and, in most situations, all levels are in play.

Returning to this chapter some time after I wrote it, I am less sanguine about ‘putting an end to the repetition’. Perhaps the best to be hoped for is that at least some people will realize the history and complexity of the topic before embarking on a new study, as so many have, in blithe confidence that all will be clarified by a few new measurements. In the intervening period I had occasion to read more of the writings of W. H. R. Rivers, one of the founders of experimental psychology and social anthropology in Great Britain, whose review I cite in this chapter. I came upon the following comment by one of his students, C. S. Myers (1923): ‘Rivers clearly showed that the effect of psychological factors is not to create but to mask the phenomena of simultaneous contrast, which are really dependent on what he terms “the physiological reciprocity of adjoining retinal areas.” ’ Replacing the last phrase by ‘retinal contrast coding’, Rivers’s thesis anticipates my own by about a century. I take this as further confirmation of the repetitiveness of the field.
P. Whittle
Introduction
Most of the science of early vision has made the ‘contrast turn’ in the past 30 years. That is, it has become generally accepted that relative stimulus magnitude, along whatever dimension, is more important than absolute, and that this is so because early stages of the visual pathway code information in terms of contrast.¹ Accounts of colour appearance, as opposed to colour discrimination, have lagged somewhat behind, or followed only in a zigzag manner. Thus, on the one hand, I could cover pages with quotations like those above, which occur in the contexts of colour constancy, simultaneous contrast, adaptation, and retinal physiology. On the other hand, much colour science pays little attention to the surroundings of coloured objects. So much so that demonstrations of vivid contrast colours, such as coloured shadows or Land’s two-colour projections, often amaze colour scientists as much as anyone. Yet it is 200 years since Goethe, 150 since Chevreul, 100 since Hering, to name only three of the best-known figures who have written at length and eloquently on chromatic contrast effects. One would have thought that these observations could at last be assimilated and find their home in a visual science that has made the contrast turn. But would-be comprehensive treatments of colour appearance, such as Kaiser and Boynton’s text (1996) or Abramov and Gordon’s much cited review (1994), are still being published, which treat effects of contrast as minor side-effects.

This is a curious state of affairs, but it is not without reason, for the idea that colour is always perceived relative to its background is contradicted by the everyday observation that if you move an object against a variegated background, it is often hard to see any changes in its colour at all. In the jargon of colour science, ‘simultaneous colour contrast’ between neighbouring objects is often a very small effect in ordinary vision, as has been observed at least since Helmholtz.
I try in this chapter to understand this situation and to point to ways in which it might be resolved. I discuss conditions in which colour is a function of contrast, in the strong sense that patch and surround lights have equal and opposite influence, along all three dimensions of colour. If we represent colours in a three-dimensional space, the perceived colour depends only on the direction and length of the vector from background to object colour, and not on its absolute position. We can conveniently call such colours ‘contrast colours’, using a term also current since Helmholtz. I argue that they reflect a stage of early visual processing through which all colour information passes. But contrast colours are seen clearly only under certain conditions. This can be understood in terms of two kinds of colour constancy that the visual system achieves, one with respect to illumination and the other with respect to neighbouring object colours. Contrast coding facilitates the first but impedes the second. Since both occur, the brain must be able to evaluate the contrast code differently, depending on (or as part of) the parsing of a scene into surface colour, lighting, and spatial layout.
1 I refer to the ‘contrast turn’ by analogy with the ‘linguistic turn’ in philosophy. I use ‘contrast’ imprecisely to mean ‘relative stimulus magnitude’, particular expressions being discussed later on. Contrast is a physical quantity. I refer to psychological contrast effects by longer phrases such as ‘simultaneous contrast’.
contrast colours
Different stimulus conditions show up different aspects of these processes, some showing contrast dependence, and some independence.2 I find it helpful to think of contrast colours as analogous to stereo depth. Both reflect a relative or contrast code computed early in the visual pathway. It takes special situations to show them up in their purest form. In ordinary vision, contrast combines with other determinants of colour, just as binocular disparity combines with other depth cues. This analogy can be pursued in considerable detail [as Brookes and Stevens (1989) do for the brightness dimension]. I want to draw attention here to one particular aspect. The apparently simple question ‘How far away is x?’ can be answered in many ways. For example, by giving an absolute judgement in metres, by a relative judgement (‘about 5 metres behind that house’), or by reaching with the hand (‘this far’). It is the same with colour. We can always see at least the colours of both object and lighting. But although this has been pointed out over and over again, we continually fall back into the assumption that each region of the visual field has one and only one colour. To understand contrast colours, however, we have to remain aware of the ambiguity of colour. This is a gain, not a loss. It helps us to remain aware of the true complexity of experience. The discussion in this chapter is mostly concerned with uniform colours seen against larger uniform backgrounds. Both temporal and spatial interactions may be involved. I discuss their relative roles briefly later in the chapter.
Demonstrating contrast colours
Contrast colours can now be made and studied with great freedom by using a computer monitor and appropriate software. Set up a large uniform grey field as a background, and vary the colour of a small central region. Suppose it can be controlled in a three-dimensional space with axes red–green, blue–yellow and black–white, with the background grey at the origin. If you vary the central patch along the axes, those are the colours you see. Now change the background to some other colour, and vary the small central region along directions parallel to the axes. The remarkable thing is that it doesn’t make much difference what background colour you start with, provided the eye has been allowed to adapt to it: you still see red–green, blue–yellow and black–white. The colours you see are contrast colours, dependent primarily on the relative rather than the absolute physical colours of patch and background. If you move along intermediate directions, the colour behaves as you would expect from vector addition. A direction between red and blue will look purple, one between green, yellow and black will look olive green, and so on. The relativity of the contrast colours can be enhanced by various tricks. A simple one is just to flash them, say 1 s on, 2 s off, with only the uniform background between flashes. Here is another more elaborate but very striking demonstration.
2 I put forward an earlier version of arguments in this chapter for the intensive dimensions of colour (brightness and lightness) in Whittle (1994a, b). Here I develop and modify the argument for colour as a whole, present it less cluttered with psychophysical detail, and review some new evidence in the colour domain.
Arrange two rows of eight coloured patches as in Fig. 3.1a, with one row and its background seen by each eye and the two superimposed binocularly so that one row is seen above the other, apparently on the same background.3 Choose the colours of one row to form a circle in the red–orange region of a chromaticity diagram, and those of the other row a circle in the blue–violet region (as in Fig. 3.1b). Make both backgrounds a neutral colour mid-way between the two circles. You now have a row of reds and oranges above a row of blues and violets, against a neutral background. Within each row the colours are quite similar, and the two groups of colours completely disjunct. Now comes the crucial move: change each background colour to the chromaticity at the centre of its corresponding circle. The backgrounds can be brighter or darker than the patches, provided the luminance contrasts of both rows are the same. The transformation in the apparent colours of the patches, which are physically unchanged, is rapid and startling. Each row now contains a complete gamut of hues (red, purple, blue, green, yellow, and orange), and the two rows more or less match, each ‘red–orange’ with its corresponding ‘blue–violet’. All that is needed to produce the full gamut of hues is to choose background colours so that the vectors from background to patches point in all directions, in any chromaticity diagram. The matches, however, are more satisfactory in some colour spaces than others. The diagram of Fig. 3.1b, in which they are good, is a surface in the three-dimensional space whose axes are log L, log M and log S, where L, M and S are the quantum catches of the long-, medium- and short-wave cones. In that space, within quite wide limits, the colour of a light is constant if the vector from background to patch is constant (Fig. 3.2).4 The components of the vector are log (L/Lb), etc., where the subscript b denotes background. These are the ‘cone contrasts’ for the three types of cone.
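The invariance just described can be checked numerically. The following is a minimal sketch, not part of the original chapter: the quantum catches and the shift vector are hypothetical values chosen only for illustration, and the function names are assumptions. It translates a patch and its background by the same vector in (log L, log M, log S) space and confirms that the log cone contrasts are unchanged.

```python
import math

def log_cone_contrasts(patch, background):
    """Return (log10 L/Lb, log10 M/Mb, log10 S/Sb) for two LMS triples."""
    return tuple(math.log10(p / b) for p, b in zip(patch, background))

def translate(lms, shift_log):
    """Rigidly translate an LMS point by shift_log (a log10 triple) in log space."""
    return tuple(v * 10 ** s for v, s in zip(lms, shift_log))

# Hypothetical quantum catches in arbitrary units (illustration only).
background_1 = (20.0, 15.0, 5.0)
patch_1 = (30.0, 12.0, 6.0)

# Move patch and background by the same vector in log L, log M, log S.
shift = (0.3, -0.1, 0.5)
background_2 = translate(background_1, shift)
patch_2 = translate(patch_1, shift)

c1 = log_cone_contrasts(patch_1, background_1)
c2 = log_cone_contrasts(patch_2, background_2)

# The vector from background to patch is the same before and after.
same = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(c1, c2))
print(c1, c2, same)
```

Because a translation in log space multiplies patch and background by the same factor for each cone type, the ratios L/Lb, M/Mb and S/Sb cannot change; the sketch only makes that arithmetic explicit.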
The vectors are the same in the two rows of colours, so the fact that the rows match shows that colours match if the three corresponding cone contrasts are equal (L/Lb = L′/Lb′, etc.). I call this the ‘cone contrast rule’. It is yet another illustration of the Weberian principle that sensation depends on ratios of stimulus magnitudes, not on the absolute values. It also supports von Kries’s law of coefficients (von Kries 1905), which asserts that any state of chromatic adaptation alters the sensitivities of the three types of cone simply by three multiplicative factors. In this case, the sensitivity coefficients are inversely proportional to the effects of the background on each cone type. There is nothing special about using red and blue patches. The circles of colours can be anywhere in the chromaticity diagram. But, as for all perceptual demonstrations, certain conditions have to be satisfied. In this case the most important is that the patches are spatially separate so that each is entirely surrounded by the background colour. The effect is enhanced by allowing each eye to adapt fully to the background, and presenting the colours in such a way that the background colours cannot be compared. The trick of binocular superimposition is just one way of ensuring this. Other tricks are presenting the backgrounds as a stabilized image (Larimer and Piantanida, 1988), blurring the edge
3 The binocular superimposition can be achieved by altering eye vergence or by using mirrors.
4 This is approximately, but not exactly, true for a vector translated in the logarithmic MacLeod–Boynton diagram of Fig. 3.1b.
[Figure 3.1 panels: (a) Right eye, Left eye, Binocular; (b) axes Log L/(L + M) and Log S/(L + M), with points labelled R, G, B.]
Figure 3.1 (a) A version of the haploscopic superimposed display (HSD) with eight patches in each eye. (b) Two sets of eight colours in a constant luminance diagram (the logarithmic version of the MacLeod–Boynton diagram). The clown’s hat shape is the gamut of colours available on the computer monitor.
[Figure 3.2 labels: axes Log L and Log M; points Patch 1, Bkgd 1, Patch 2, Bkgd 2; annotation Log L − Log Lb = Log L/Lb.]
Figure 3.2 A rigid translation of a vector in the log L, log M plane preserves cone contrasts.
between them (Wuerger, 1996), or using only temporal contrast so that the backgrounds are not separately visible during the matching (Webster and Mollon, 1995). Since coloured shadows are the subset of contrast colours in which the light in the patch is obtained by blocking out some of the surround light, their hues can therefore also be predicted by the direction of the vector from background to patch in a chromaticity diagram. This was pointed out by Evans in his book The perception of color, published posthumously: As far as hue is concerned these effects [coloured shadows] can all be predicted by a single generalization as startling in its implications as in its predictions: if the chromaticities of the two sources are plotted on the CIE diagram and the line connecting them extended to the spectrum locus in both directions, the intercepts of this line indicate the wavelengths (or complementary wavelengths) that, seen as isolated colors, would be of roughly the same hues as the two patches, regardless of where the source points lie on the diagram. Since the mixture chromaticity of the two sources lies between them on this same line, this leads to a definition of the complementary with respect to this mixture point. (Evans 1974, p. 222)
Notice that Evans finds this ‘startling.’ His book is a striking example of the claims I made in my opening paragraphs. It is all about ‘related’ colours, that is, colours seen in an illuminated surround and strongly affected by it, or what I call contrast colours. Yet Evans had not made the contrast turn. The word ‘contrast’ is not indexed, and he always described his stimuli in terms of luminance and wavelength, never contrast. We can therefore say that in some situations any light can be made to appear any colour whatsoever by choosing a suitable background, provided that the background light is available within the gamut offered by the physical set-up and the limitations of the eye.5
5 A calculation set as an exercise for students in Le Grand (1948, p. 462, Ex. 23).
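Under the cone contrast rule, the background required to give a fixed light a chosen contrast colour follows by simple division. The sketch below is an illustration only: the patch values and target ratios are hypothetical, the labels ‘cherry-like’ and ‘teal-like’ are loose assumptions about appearance, and no check is made that the resulting background lies within any real display gamut.

```python
def background_for_target(patch, target_ratios):
    """Background LMS such that patch/background equals target_ratios for each cone."""
    return tuple(p / r for p, r in zip(patch, target_ratios))

# One fixed, physically unchanged light (hypothetical quantum catches).
patch = (30.0, 12.0, 6.0)

# Two hypothetical targets, expressed as the ratios L/Lb, M/Mb, S/Sb.
cherry_like = (1.2, 0.8, 1.0)   # L-cone increment, M-cone decrement
teal_like = (0.8, 1.2, 1.0)     # L-cone decrement, M-cone increment

bg_cherry = background_for_target(patch, cherry_like)
bg_teal = background_for_target(patch, teal_like)

# Recover the ratios to confirm each background yields its target.
recovered_cherry = tuple(p / b for p, b in zip(patch, bg_cherry))
recovered_teal = tuple(p / b for p, b in zip(patch, bg_teal))
print(bg_cherry, bg_teal)
```

The same light therefore sits at opposite ends of a chromatic direction depending only on which background is chosen, which is the arithmetic content of the claim; whether the background is physically attainable is the proviso stated in the text.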
This striking characteristic of our colour vision must surely be of profound functional importance.
Experimental evidence for the cone contrast rule
Chichilnisky and Wandell (1995) reported the results of hundreds of matches in a haploscopic set-up similar to that of Fig. 3.1a, although with only a single patch shown to each eye. This is a type of ‘asymmetric matching’ between lights seen with eyes or retinal regions in different states of adaptation. I will call displays such as that of Fig. 3.1a ‘haploscopic superimposed displays’, or ‘HSDs’. They were introduced by Hering (1890). Chichilnisky and Wandell’s data obeyed the cone contrast rule to a good approximation. They thus provided the most direct and fullest confirmation of the thesis that colour depended on the three cone contrasts rather than simply on the numbers of photons absorbed by each class of cone.6 Various precursors of this idea have been proposed over the past 150 years. These have varied in their choice of colour dimensions (CIE dimensions, cone channels, retinexes), in the contrast expressions (ratios, delta-contrasts, etc.), and in their context (adaptation, constancy, simultaneous contrast). An early one is Rollett (1867; colour systems, contrast colours). McDougall (1901) argued that contrast colours depend on interactions within the three (R, G, B) colour systems rather than on Hering’s opponent colour mechanisms. Von Kries’s (1905) statement of the coefficient rule is the best known older principle (cones, ratios, adaptation). Ives (1912) applied the von Kries rule to colour constancy. Spencer (1943) proposed a revision of colorimetry to make it more compatible with colour appearances, because ‘the trichromatic system ignores adaptation’. She expressed colours as tensors in CIE XYZ space, which were equivalent to vectors whose components were the three delta-contrasts ΔX/Xb, etc., and showed that her scheme fitted the data of Helson (1938). This work was described in Le Grand’s (1948) influential text.
Both Spencer and Le Grand saw it as subsuming all of adaptation, simultaneous contrast and constancy. Alpern (1964) suggested that simultaneous colour contrast could be explained by interactions within each π-mechanism. The ‘retinex’ theory of Land and McCann (1971) explained colour constancy in terms of the computation of image contrasts in each cone mechanism. Walraven (1976, 1981) showed that contrast colours could be described by delta-contrasts computed within each π-mechanism. Mausfeld and Niederée (1993) proposed a general scheme for colour coding in terms of the three cone delta-contrasts. Webster and Mollon (1995) showed von Kries transformations of colours in chromatic adaptation. It is striking how neglected most of the early papers are. Rollett (1867) was known by McDougall (1901), who had independently reached the same conclusions, and reviewed by Rivers (1900) and Tschermak (1903), but then forgotten, at any rate in the English language
6 It is of interest in this connection that Evans (1974, p. 122) also wrote: ‘What’s needed are “inter-eye comparison methods” but these have been considered respectable only in quite recent years. I am sure that a systematic study of perceived hue by this technique would unravel most of the mysteries of color perception.’ Asymmetric matching between the eyes exploits the fact that the states of adaptation of the eyes are almost completely independent. Matching in the HSD is particularly easy and reliable, for reasons discussed later.
literature.7 Ives’s (1912) paper was neglected (see Brill 1995). Spencer’s (1943) work was described by Le Grand, but I have come across no other citation. Apparently these were ideas whose time had not yet come. I discuss why in a later section. This brief history shows the same basic idea being put forward repeatedly. This is the Weberian relativity of colour perception shown by constancy and contrast colours. These are robust phenomena, and can be described in various ways. The description must contain ratios between light and background, but, given that, the precise choices of expressions and variables are, for many purposes, unimportant. But we can also ask, as many have, what is the most accurate mathematical description, and where exactly in the visual pathway is the relativity imposed? With regard to the mathematical description, an important distinction is between ratios and contrast expressions that also involve differences. This concerns what is meant by ‘contrast’. Weber contrast and Michelson contrast are the most common expressions for it, not the simple ratio L/Lb. Weber contrast = (L − Lb)/Lb, usually written ΔL/Lb. When the backgrounds are weak a ‘dark light’ constant must be included: ΔL/(Lb + L0). Michelson contrast = |ΔL|/(L + Lb). When people talk of contrast coding in the early stages of the visual pathway, they usually mean that the firing rate of sensory neurons is a function of, perhaps proportional to, Weber contrast or Michelson contrast. Such a code combines differencing (calculating L − Lb) with normalization (attenuating by some function of the absolute stimulus level). The two components probably serve different functions. Normalization copes with the wide range of illuminations under which we can see, and contributes to constancy with respect to illumination changes, both in intensity and colour. Differencing seems to be associated with enhanced discrimination about an adaptation level.8 What is the evidence for differencing?
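The contrast expressions defined in this passage are easy to compare side by side. The sketch below is illustrative only: the function names, the luminance values, and the dark-light value L0 = 1.0 are assumptions, not taken from any data set. It shows that two patch/background pairs with the same ratio have the same Weber and Michelson contrasts, that a non-zero dark light breaks this equality, and that a differencing code gives zero when patch and background are equal.

```python
import math

def weber_contrast(L, Lb, L0=0.0):
    """Weber contrast (L - Lb)/(Lb + L0); L0 is the optional 'dark light' constant."""
    return (L - Lb) / (Lb + L0)

def michelson_contrast(L, Lb):
    """Unsigned Michelson contrast |L - Lb|/(L + Lb), as written in the text."""
    return abs(L - Lb) / (L + Lb)

# Two hypothetical patch/background pairs with the same ratio L/Lb = 1.5.
bright = (30.0, 20.0)
dim = (3.0, 2.0)

w_bright = weber_contrast(*bright)            # equals L/Lb - 1 when L0 = 0
w_dim = weber_contrast(*dim)
w_bright_dl = weber_contrast(*bright, L0=1.0)  # hypothetical dark light
w_dim_dl = weber_contrast(*dim, L0=1.0)
m_bright = michelson_contrast(*bright)
m_dim = michelson_contrast(*dim)

# Differencing: the code is zero when patch and background are equal.
zero_difference = weber_contrast(20.0, 20.0)
print(w_bright, w_dim, w_bright_dl, w_dim_dl, m_bright, m_dim, zero_difference)
```

With L0 = 0 the Weber contrast is simply the ratio minus one, so a matching experiment cannot separate the two descriptions; only a non-zero dark light, most influential on weak backgrounds, pulls them apart.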
The most direct is that many responses fall to zero when the difference is zero. The object disappears or the firing rate of a neuron is at resting level. Other bits of evidence come from the analysis of psychophysical data. Walraven (1976, 1981) provided one. His subjects saw a test patch made of superimposed red and green lights, on various coloured backgrounds. When they set the ratio of red to green to produce ‘unique’ yellow, they kept the ratio of π4 to π5 (effectively M and L cones) Weber contrasts constant. Here the differences were important; simple ratios L/Lb and M/Mb didn’t do the trick. Another line of evidence comes from studying the enhanced discriminability (or rapid change of appearance) around the background colour, described by von Bezold (1874) and called ‘the crispening effect’ by Takasaki (1966, 1967). For luminance, the form of the data implies that differences are computed (Semmelroth 1970; Whittle 1992, 1994a). The story is not yet fully worked out for colour, but the fact that there is also a chromatic crispening effect (Takasaki 1967; Ovenston and Whittle 1996) suggests a comparable importance for chromatic differences. The implications of matching experiments obeying the cone contrast rule depend on the values of the dark light constants L0, M0, S0. If these are zero, matches
7 Tschermak’s long review, with 180 references, is a valuable and little-known resource.
8 It is tempting to associate them with the subtractive and multiplicative components of adaptation that have been demonstrated by many workers (see Walraven et al. 1990), but I doubt whether this subtractive component does actually play the required role (Whittle 1994a, p. 104).
do not distinguish equating ratios from equating Weber (or Michelson) contrasts, because those expressions are equal if the ratios are equal and vice versa. Chichilnisky and Wandell (1995) fit their data with L0, etc. > 0. I am not clear whether the constants were sufficiently large to provide strong evidence for differencing, although they argue on other grounds (p. 251) that differential stimuli were being equated. Walraven’s result had a remarkable implication, which I discuss briefly, both because it is taken up later and because it gave rise to a controversy which seemed to impugn the validity of the cone contrast story. His data implied that the visual system was comparing only the cone increments, ΔL and ΔM. But these were superimposed on bright red or green backgrounds, which of course added physically to them and could be clearly seen. Nevertheless, neither the physical addition nor the subjective appearance seemed to influence subjects’ settings. It was as though the background was subtracted out by the visual system and influenced the colour of superimposed stimuli only by setting the adaptational state of the retina. Walraven called this ‘discounting the background’, echoing the older description of colour constancy as ‘discounting the illuminant’. This finding did not go uncontested. Shevell (1978), using the same technique, found that there was also an additive effect of the background hue, particularly when the patch was near threshold. But the additive effect seems, rather surprisingly, to depend on the subjective appearance of the background, not on its physical addition to the test patch. This is implied by an ingenious experiment by Nerger et al. (1993). They put an annulus of a different colour around the background and stabilized its inner edge on the retina. When this stabilized edge faded, as such edges always do, the inner background disappeared and its colour was filled in by the colour of the annulus.
With an ordinary long-wavelength red background, they found Shevell’s additive effect. But when this background was made to appear yellow by filling-in, the effect disappeared. Therefore, it depended on the appearance of the background, not its physical composition, which was unchanged by stabilization. The measurements of Chichilnisky and Wandell (1995) strengthened the evidence for the cone contrast rule in two ways. First, they were able to explore more of the colour domain than Walraven, because they were not restricted to ‘unique’ hues. Secondly, the HSD produced particularly good cone contrast matching. If Nerger and co-workers are right, one reason was that it prevented subjects seeing the separate monocular backgrounds, thus removing Shevell’s additive effect.9 Although equating cone contrasts provides a good first-order description of contrast colours, it is not the whole story, even in situations like the HSD that favour it. Whittle and Arend (1991) adjusted a patch in a grey surround in the HSD to match ‘homochromatic’ standards. These were patches that differed from the background only in luminance, although both were intensely coloured, being produced by single monitor phosphors— red, green, or blue. When there is only a luminance difference, the cone contrasts are all equal to the luminance contrast, so a matching patch should be set with the same characteristics, differing from its grey background only in luminance. This was true for weak decrements, but increments were set to a desaturated version of the coloured background, 9 This has been confirmed in my lab in an undergraduate research project by Nick Blaker, who directly compared matching in the HSD with setting equilibrium hues.
and higher contrast decrements to a colour approximately complementary. This resembles the Helson–Judd effect for judgements of colours under chromatic illumination (Helson 1938; Judd 1940). Chichilnisky and Wandell (1996) found evidence for the same effect in experiments in which subjects set a patch to achromatic in various coloured backgrounds. They also found evidence for adaptation in opponent colour mechanisms, the stage beyond ‘receptor adaptation’, which was their preferred description for what I call cone contrast matching. Finally, Chichilnisky and Wandell (1997) admitted that they also found small differences between increments and decrements in the experiments of their 1995 paper. All these results imply deviations from exact cone contrast matching. There is another type of experiment where patches were set to neutral in an illuminated surround, the results of which cannot be explained in terms of keeping the ratios of cone contrasts constant. These are the experiments on the black threshold by Werner and colleagues (see, for example, Shinomori et al. 1997). They found the luminance at which lights of different wavelength in a constant surround became black, and also the luminance of surround lights that induced black in a constant patch. The spectral sensitivities were quite different, contradicting the symmetry implied by a cone contrast scheme. The sensitivity in the patch showed the multiple peaks characteristic of opponent colour mechanisms, whereas that in the surround followed the single-peaked luminosity function.
Cone contrasts and opponent colours
The notion of colour opponency was partly motivated by contrast colours, and indeed it could have been designed expressly for them. They have an ‘opponent’ structure which is more general, and in one way more clearly defined, than the opponency of ‘unrelated’ colours in a dark surround. If an unrelated colour is varied along a line through neutral in a chromaticity diagram, the series will contain just two hues, such as green and magenta, divided by the neutral point into two saturation series: saturated magenta to no hue, and no hue to saturated green.10 The magenta and green are alternatives, ‘opponent’ colours. They can never be seen together as components of a mixture, as red and yellow can be seen in orange. The neutral point of the magenta–green series can be judged reliably, but it takes care. If the line is tilted out of the chromaticity (constant luminance) plane, a black–white component is added. The colours might now run from bright magenta to dark green. If the line is vertical (black–white), the division point becomes arbitrary. Contrast colours generalize this structure in that the background provides a balance point that can be any colour, not just neutral. If the eye is adapted to it, then any line through any background colour contains just two categorically different colours, such as bright magenta and dark green, dividing at the background. The balance point is easy to judge and never arbitrary. It is where patch and background are equal: zero contrast, no object. The background colour may be visible, but that belongs to a different object, not in the same continuum.
10 I ignore the curvature of constant-hue lines and, in the case of contrast colours, Shevell’s additive effect, which can give low-contrast stimuli a tinge of the background colour.
It would therefore be somewhat paradoxical if contrast colours were to be explained in terms of cone contrasts rather than the opponent colour mechanisms that phenomenology and physiology imply. But what is the relationship between these two frameworks? The question is sometimes mentioned in the literature (e.g. Wyszecki and Stiles 1967, p. 556; Mollon 1987, p. 35), but is remarkably little discussed considering the long history of the use of both frameworks. One of the interests of contrast colours is that they open it up. Two different types of colour space have been mentioned in passing. One in which the axes represent some quantity (quantum catch, contrast, or whatever) corresponding to each cone type, and one in which two axes represent opponent chromatic dimensions and the third axis the light–dark dimension. A popular version of the latter type is the ‘DKL’ space proposed by Derrington et al. (1984), who found it a convenient framework within which to represent the responses in different classes of retinal ganglion cell. The x and y axes are the MacLeod–Boynton axes L/(L + M) and S/(L + M), and the z axis is luminance (L + M). What is the relationship between these two types of colour space, and which best represents contrast colours? We are, of course, interested in functional localization as well as representation. Just as we can say that metameric matching is determined by the photopigments, we would like to know what level of the visual pathway determines contrast colours. Since contrast colours match, to a first approximation, if their three cone contrasts are all equal (the cone contrast rule), one might expect a cone contrast space, with axes ΔL/Lb, etc., to be a good representation for them. But the DKL space also has advantages. Both represent colours as vectors from a background, but the DKL space represents colour appearance more intuitively.
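The axes named above can be written out directly. This is a minimal sketch that follows the chapter's verbal description only: it assumes the cone signals are already scaled so that L + M stands for luminance, the numerical values are hypothetical, and it does not reproduce the full DKL construction of Derrington et al. (1984).

```python
def macleod_boynton(L, M, S):
    """Return (L/(L+M), S/(L+M), L+M): two chromatic axes and a luminance axis."""
    luminance = L + M
    return L / luminance, S / luminance, luminance

def offset_from_background(patch, background):
    """Vector from background to patch in the coordinates above."""
    p = macleod_boynton(*patch)
    b = macleod_boynton(*background)
    return tuple(pi - bi for pi, bi in zip(p, b))

# Hypothetical cone signals (illustration only).
patch = (30.0, 12.0, 6.0)
brighter_patch = tuple(2.0 * v for v in patch)

l1, s1, lum1 = macleod_boynton(*patch)
l2, s2, lum2 = macleod_boynton(*brighter_patch)

# A patch identical to its background sits at the origin of the offset space.
at_background = offset_from_background(patch, patch)
print((l1, s1, lum1), (l2, s2, lum2), at_background)
```

Scaling all three cone signals together changes only the third coordinate, which is the sense in which such a space separates luminance from chromaticity.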
It segregates brightness, corresponding to the vertical luminance axis, from chromaticity, represented in horizontal planes. Within any horizontal plane, polar coordinates r, θ map on to saturation and hue. The planes and axes of cone contrast space, on the other hand, do not map conveniently on to familiar dimensions of colour. Brightness corresponds not to an axis, but to the diagonal through the points (1,1,1) and (−1, −1, −1). None of the axes maps simply on to hue. For example, although varying S-cone contrast varies hue along a violet–chartreuse dimension, excellent violet or chartreuse can be produced with zero S-cone contrast.11 This follows immediately from the fact that the violet–chartreuse axis in the MacLeod–Boynton chromaticity diagram is the ratio S/(L + M), so you can manipulate this dimension by changing (L + M) with S constant just as well as by changing S. Similarly, cherry is produced by M-cone decrements just as well as by L-cone increments, and vice versa for teal. The relative stimulation of cone types, which is what is computed by opponent colour mechanisms, maps more simply on to hue than the stimulation of separate cone types. This is well known, but it is necessary to remind ourselves of it in the present context. Furthermore, there is empirical evidence that opponent colour mechanisms are involved in generating contrast colours. I mentioned Chichilnisky and Wandell (1996) above. Poirson and Wandell (1993) matched a uniform region to the bars of square wave gratings of
11 I refer to the colours of the MacLeod–Boynton axes as violet versus chartreuse and cherry versus teal (Abramov and Gordon 1994) to emphasize the fact that the relation of these ‘geniculate’ axes to the ‘basic colours’ (red, green, blue, and yellow) is a major unsolved problem.
various frequencies. The latter were contrast colours in my sense because their hue would be determined as much by the spectral composition of neighbouring bars as by their own. They showed that opponent colour spectral sensitivities could be derived from their data on the simple assumption of pattern-colour separability—that the relative chromatic sensitivities are independent of spatial frequency and vice versa. It is not difficult to find a variant of the DKL space that can accommodate both cone contrast and opponent colour findings.12 One possibility is to take the axes as x = CL − CM , y = CS −(CL +CM )/2, z = (CL +CM )/2, where CL , CM and CS are the Weber or Michelson cone contrasts for the L, M, and S cones. As in the DKL space, the x axis depends on the relative L and M cone signals, although now they are contrast signals, and is independent of S. The z axis is a bright–dark dimension: the average of the L and M cone contrasts. The y axis still represents the S cone signal (now contrast) relative to the bright–dark dimension. These coordinates have several advantages for representing contrast colours. The background colour is at the origin (0,0,0) because all cone contrasts are zero there. Two points coincide if their cone contrasts are equal: the cone contrast rule. Therefore graphs plotted on these axes immediately show the matches to be predicted from that rule, and the deviations from it. Finally, such coordinates express the increasingly accepted idea that most adaptation—the computation of contrast—occurs prior to the opponent stage. This is the ‘receptor adaptation’ of Chichilnisky and Wandell (1995). For recent evidence, see He and MacLeod (1998). Differences of cone contrasts have been used extensively in the literature on chromatic discrimination (e.g. Friele 1961; see Wyszecki and Stiles 1967, p. 558; Vingrys and Mahon 1998).13 In that literature, the Cs are usually Weber contrasts. 
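The suggested coordinates can be sketched in a few lines. The function names and the sample contrasts are hypothetical; the forward transform is the one given in the text, and the inverse is plain algebra added here to show that the mapping is one-to-one, so that equal (x, y, z) and equal cone contrasts imply each other.

```python
def opponent_coords(CL, CM, CS):
    """Map cone contrasts to x = CL - CM, y = CS - (CL + CM)/2, z = (CL + CM)/2."""
    x = CL - CM
    z = (CL + CM) / 2.0
    y = CS - z
    return x, y, z

def cone_contrasts_from_opponent(x, y, z):
    """Algebraic inverse of opponent_coords."""
    CL = z + x / 2.0
    CM = z - x / 2.0
    CS = y + z
    return CL, CM, CS

# The background has zero contrast in every cone, so it maps to the origin.
origin = opponent_coords(0.0, 0.0, 0.0)

# Hypothetical cone contrasts for one patch (illustration only).
contrasts = (0.2, -0.1, 0.3)
xyz = opponent_coords(*contrasts)
round_trip = cone_contrasts_from_opponent(*xyz)
print(origin, xyz, round_trip)
```

Since the transform is linear and invertible, plotting matches on these axes loses nothing relative to cone contrast space; it only re-expresses the same three numbers in opponent form.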
For describing colour appearance, however, signed Michelson contrast (L − Lb)/(L + Lb) has two major advantages. First, it provides a compressive transform of the S-cone axis that yields approximately equal subjective intervals,14 whereas Weber contrast preserves the linear S scale, which is a poor subjective scale because it compresses the chartreuse end relative to the blue–violet end (Le Grand 1949). We see this in the MacLeod–Boynton diagram, for example, which puts neutral near the bottom, red–green, edge. Secondly, Fig. 3.3a shows that the differences of Michelson contrasts behave well as luminance contrast is varied with chromaticity constant. They have maximum magnitude near isoluminance, and drop to zero at high positive or negative contrasts. This accords with experience. Differences of Weber contrasts, on the other hand, increase monotonically from black, tending to infinity at high positive contrasts (Fig. 3.3b), which is not at all how subjective hue and saturation behave. However, Michelson contrasts also have a major disadvantage. Consider a coloured light in a dark surround. All three Michelson cone contrasts are 1.0, so the chromatic x and y coordinates defined above are both zero. Yet such a light can look intensely saturated.15 How can this be?
12 Simply making the axes logarithmic works quite well, although it does not give a convenient origin.
13 ‘For modest signals under a constant adaptation state, single-cell responses and psychophysical sensitivity are consistent with mechanisms that respond to simple sums or differences of the cone contrasts’ (see Chapter 2 this volume).
14 (L − Lb)/(L + Lb) is approximately equal to log (L/Lb), for contrasts between about ±80%.
15 I am indebted to Donald MacLeod for this point. Note that the inverse stimulus, a dark patch in a saturated surround, has all three contrasts equal to –1.0, and looks black, as that would imply.
[Figure 3.3 plot. Panel (a): Differences of Michelson contrasts; panel (b): Differences of Weber contrasts. Vertical axes: $C_L - C_M$, or $C_S - (C_L + C_M)/2$. Horizontal axes: Lum contrast, My.]
Figure 3.3 Showing how the differences of contrasts, as opponent colour expressions, vary with luminance contrast (between ±1 Michelson contrast), for saturated purple (squares) and green (circles) of constant chromaticity. (a) For signed Michelson contrasts; (b) for Weber contrasts. Filled symbols show differences between S and the average of L and M cone contrasts, open ones between L and M cone contrasts.
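The qualitative behaviour plotted in Fig. 3.3 can be reproduced with a rough numerical sketch. The assumptions here are mine, not the author's: luminance is taken as L + M, the background is an arbitrary equal-excitation field, and the 'purple' direction is an invented set of cone excitations. Only the shape of the curves is meant to be informative.

```python
import numpy as np

def michelson(lms, bg):
    lms, bg = np.asarray(lms, float), np.asarray(bg, float)
    return (lms - bg) / (lms + bg)

def weber(lms, bg):
    lms, bg = np.asarray(lms, float), np.asarray(bg, float)
    return (lms - bg) / bg

def opponent_differences(c):
    # Returns (C_L - C_M, C_S - (C_L + C_M)/2) for cone contrasts c.
    c_l, c_m, c_s = c
    return c_l - c_m, c_s - (c_l + c_m) / 2

def patch_at_luminance_contrast(direction, bg, m):
    """Scale a fixed-chromaticity cone vector so that its Michelson
    luminance contrast against bg equals m (assumes luminance = L + M)."""
    direction, bg = np.asarray(direction, float), np.asarray(bg, float)
    lum_bg = bg[0] + bg[1]
    lum = lum_bg * (1 + m) / (1 - m)
    return direction * lum / (direction[0] + direction[1])

bg = [1.0, 1.0, 1.0]        # hypothetical neutral background
purple = [0.6, 0.4, 3.0]    # hypothetical saturated purple direction
contrasts = np.linspace(-0.99, 0.99, 199)

mich_x = np.array([opponent_differences(
    michelson(patch_at_luminance_contrast(purple, bg, m), bg))[0] for m in contrasts])
web_x = np.array([opponent_differences(
    weber(patch_at_luminance_contrast(purple, bg, m), bg))[0] for m in contrasts])

# A light in a dark surround: all Michelson cone contrasts equal 1.
dark_surround = opponent_differences(michelson([5.0, 2.0, 9.0], [0.0, 0.0, 0.0]))
```

In this sketch `mich_x` peaks close to zero luminance contrast and falls towards zero at both extremes, while `web_x` rises monotonically. `dark_surround` comes out as zero on both chromatic axes, which is the paradox raised in the text.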
Using modified Weber contrasts incorporating ‘dark light’ constants, $L/(L_b + L_0)$, etc., could avoid the paradox. If $L_0 = M_0 = S_0$, for example, with a dark surround, for which $L_b = M_b = S_b = 0$, the three cone contrasts would be proportional to L, M, and S, and strong hues would be expected. These questions obviously remain very open. I offer these remarks mainly as an example of a level of discussion that I feel is needed, intermediate between analysing particular data sets and would-be comprehensive models of colour vision.16

If the matching rule in this space were just that x, y, and z were each equated, this would be empirically indistinguishable from equating cone contrasts, because each would imply the other. But, in fact, results like the Helson–Judd effect, found by Whittle and Arend (1991), and the other deviations from cone contrast matching discussed above, require postulating a linear mapping between the coordinates of the stimuli being matched, where the mapping coefficients may vary between octants (increment versus decrement, cherry versus teal, violet versus chartreuse). The focus of my current research is on determining these mappings. Note that the diagonal terms (constants a in $x = ax'$, etc.) can be called ‘contrast gain’ values because the coordinates are contrast expressions. So this may allow a rapprochement between ‘second site adaptation’ and the ‘contrast gain’ of, for instance, Chubb et al. (1989), although the present ‘contrast gains’ are set by uniform adapting fields, not by contrast stimuli (see Webster 1996, and Chapter 2 this volume).

One further point about the suggested space should be mentioned briefly because it raises the vexed question of the dimensionality of contrast colours (Evans 1974; Mausfeld et al. 1993). Since three variables are required to express the visual effect of the surround, and another three that of the focal patch, this dimensionality could be as much as six. I proposed $(C_L + C_M)/2$ just now as the expression for the intensity dimension, because of the algebraic convenience of treating contrast colour as a function of only the three variables $C_L$, $C_M$, and $C_S$. But if, for example, luminance contrast, $C_{L+M}$, were to describe matches better, as is sometimes true for my data, then since this is not predictable from the three cone contrasts alone, no three-dimensional space would give a complete representation, although it might provide a useful approximate one.
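The remark that equating x, y, and z is equivalent to equating cone contrasts ('each would imply the other') amounts to saying that the linear map between the two descriptions is invertible. A minimal check, with the matrix written out from the axis definitions given earlier (the variable and function names are illustrative only):

```python
import numpy as np

# Rows give x, y, z; columns are the cone contrasts C_L, C_M, C_S.
A = np.array([
    [ 1.0, -1.0, 0.0],   # x = C_L - C_M
    [-0.5, -0.5, 1.0],   # y = C_S - (C_L + C_M)/2
    [ 0.5,  0.5, 0.0],   # z = (C_L + C_M)/2
])

def to_xyz(cone_contrasts):
    return A @ np.asarray(cone_contrasts, float)

def to_cone_contrasts(xyz):
    # Solving the linear system recovers the cone contrasts uniquely
    # because A has a non-zero determinant.
    return np.linalg.solve(A, np.asarray(xyz, float))

determinant = np.linalg.det(A)
example = np.array([0.3, -0.1, 0.5])        # arbitrary cone contrasts
recovered = to_cone_contrasts(to_xyz(example))
```

Because the determinant is non-zero, any departure from cone contrast matching has to show up as something other than the identity mapping between the coordinates of the two stimuli, which is where the octant-dependent coefficients enter.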
Contrast colours form a natural space for colour appearance

This section is trying to articulate a hunch, a strong feeling that I have when working with contrast colours. The main component is that spaces like the DKL space, or the variant of it just discussed, are the right representation for this work in so many ways (notwithstanding the problem of dimensionality just raised). One feels at home in them. Sometimes I feel that they know more than I do: they seem to lead me into good experiments or fruitful analyses of the data. This is not a feeling that one often has in experimental psychology, so I think it is worth trying to articulate it, to unpack it a little. Furthermore, these spaces are also the right representation for the behaviour of primate retinal ganglion cells, and are closely related to the colour solids that express the structure of surface colours. This threefold coincidence is surely of significance.17

16 General discussions like this of the merits of different opponent colour expressions seem remarkably rare, given the importance of the topic, and the fairly obvious points that can be made. For example, one still finds the simple difference L − M being proposed. But this would vary with luminance from zero to infinity, which would be absurd for an index of chromaticity.
17 One could speculate on evolutionary links between the items.

I want to say ‘surely such a space is
the natural space for representing colour appearance’. This is in contrast, particularly, to spectrum-based representations or the CIE xy diagram, which are right for metamerism or other phenomena that depend on the visual pigments, but are clumsy for representing colour appearance. None of this is new, but we seem to have difficulty grasping the whole pattern. This class of spaces is characterized by a vertical (z) axis representing intensity (luminance, luminance contrast, etc.), and horizontal (xy) planes representing chromatic colour (hue, saturation, purity, etc.) at constant intensity. The x and y axes are opponent-colour axes, loosely speaking. Any horizontal plane has colour circles containing all the hues. Colours can be represented as vectors from the reference colour at the origin, which is neutral physically or psychologically or both. Such a space accords with the phenomenology of colour, as Hering pointed out. Colours are all represented. There are no arbitrary absences like brown or the extra-spectral purples. Further, and most importantly, we can navigate easily in such a space. Subjects who are given a joystick connected so that its lateral movements take them round the colour circle, its front–back movements vary the radius of the circle (chromatic contrast), and which has two buttons for varying intensity, learn to make accurate three-dimensional colour matches within a few minutes. This is the most concrete way in which colours form a space: that we can find spatial representations which translate bodily movements into operations on colours. The origin can be treated just as a computational reference point. But in work with contrast colours, it can represent the background colour, which is physically present. The vector from origin to colour is then physically realized as the contrast at the edge of the patch and is seen as a contrast colour. 
The demonstrations I described, and measurements such as Chichilnisky and Wandell’s, show that the background can be any colour. The eye adapts and makes it the current neutral. So these spaces accord particularly well with the phenomenology of contrast colour. The length of the vector represents a generalized contrast including both colour and luminance. It can be resolved into chromatic (saturation or colourfulness) and light–dark components. The azimuth represents hue, and the altitude luminance contrast. In this representation we can have negative light (decrements), which generalizes some colour algebra. The bipolar z axis dividing at the origin, where the object is absent (zero contrast), represents light–dark, or ‘contrast brightness’ (comprising lightness of surfaces and the brightness of light sources; Whittle 1994a). In fact all straight lines through the origin represent just two (complementary or opponent) hues, with their category boundary at the origin. Thus most qualities find a convenient representation in this space.18

Although the space accords well with colour appearance, the axes, of course, represent physically defined quantities, albeit ‘anthropocentric’ ones (Hilbert 1987). One important move has been to make them reflect the biology of the visual system. That is the main difference between the DKL space and variants whose axes are also functions of the cone signals, and the three-dimensional spaces (colour solids) used in colour order systems (reviewed by Derefeldt 1991). Spaces whose axes are physiologically meaningful are therefore powerful research tools in studying the relationships between physical, physiological, and psychological variables.

18 An exception is the ‘brilliance’ dimension of Evans (1974), the bipolar dimension comprising greyness and fluorence.

I have tried to unpack the intuition of ‘right representation’. It has at least the following components:

• Physical, physiological, and subjective quantities are all represented.
• The vertical axis and the horizontal chromaticity planes are easy to interpret (compare, say, the x axis of the CIE xy diagram).
• All colours are represented.
• Easy to navigate.
• Allows flexibility, crucial for a research tool, in the choice of the precise quantities to plot on the axes. See previous section.
• In my experience, although I cannot document it here, it is also serendipitous in suggesting new connections, and cleanly distinguishing different influences on data.
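The vector description given earlier in this section (length as generalized contrast, azimuth as hue, and a light–dark component) can be sketched as a conversion from (x, y, z) to polar quantities. The function name and dictionary keys are hypothetical, and reading 'altitude' as an elevation angle above the chromatic plane is my assumption; the raw z value is returned as well so that either reading is available.

```python
import math

def polar_description(x, y, z):
    """Resolve a contrast-colour vector into polar quantities.
    Interpreting 'altitude' as an elevation angle is an assumption."""
    chroma = math.hypot(x, y)                     # chromatic component
    return {
        "length": math.sqrt(x * x + y * y + z * z),   # generalized contrast
        "chromatic": chroma,
        "light_dark": z,                          # signed: decrements are negative
        "azimuth_deg": math.degrees(math.atan2(y, x)) % 360.0,   # hue angle
        "elevation_deg": math.degrees(math.atan2(z, chroma)),
    }

a = polar_description(0.3, 0.0, 0.0)
b = polar_description(-0.3, 0.0, 0.0)    # opposite end of the same line
up = polar_description(0.0, 0.0, 0.5)    # pure increment
```

Points `a` and `b` lie on one straight line through the origin and their azimuths differ by 180 degrees, which corresponds to the statement that each such line carries just two opponent hues.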
It will have occurred to many readers that there is a dialectical relationship between much of what I am saying and the ubiquitous—in vision labs and out—current technology of colour cathode ray tube (CRT) devices, such as TVs and computer monitors. The spaces I have been discussing are particularly appropriate to them. (Although not only to them: Hering’s opponent colour space antedates them by decades and colour solids antedate them by centuries. Derefeldt lists a version by Forsius from as early as 1611.) It is no coincidence that the DKL space and the use of a CRT to present stimuli for colour research developed together. The CRT makes it particularly easy to generate and study contrast colours, to modulate colours about a mean, and so on. For instance, increments and decrements are equally easy to produce on a CRT, which is not at all the case using projectors or optical systems of lenses and mirrors. This ease is reflected in the bipolar intensity axis of the DKL space. Further, monitor colours are often textureless and always co-planar, which are features that enhance simultaneous contrast (e.g. Woodworth 1938), that is, seeing contrast colours. Much of this chapter could be framed as a contribution to an emerging technology of colour representation suited to the CRT.19 This doesn’t mean that the science is just technology in disguise. New technologies are successful only if they fit (and enlarge) our capacities, and they, in turn, allow us to better understand those capacities. With regard to the importance of contrast colours, the argument of this section is as follows. First, these spaces reflect the structure of colour appearance and of a stage of early visual processing. Secondly, contrast colours exemplify the structure particularly clearly. They enable you to ‘see’ the vectors which, for isolated surface colours, are merely an ordering device. Therefore contrast colours are likely to be revealing something important. 
19 Indeed, I fear that in colour vision labs which habitually use CRTs and the DKL space, much of what I am saying will seem to be just spelling out the obvious. My defences are that not everyone has such experience, and that the obvious usually merits more reflection.
Why contrast colours are not accepted as fundamental; two types of colour constancy

If one were acquainted only with the striking phenomena of contrast colours, such as the demonstrations described above, and their simple mathematical structure, the cone contrast rule, a form of the well-known coefficient rule of von Kries, it would seem surprising that they have not been generally accepted as a basic fact of colour vision, on a par with metamerism. But, in fact, in spite of strong statements like the epigraphs to this chapter, their significance has remained controversial and often unappreciated. This was illustrated vividly by the stir created by Edwin Land’s two-colour projection demonstrations, which it was quickly shown could be predicted from known laws of chromatic contrast and adaptation (Judd 1960; Walls 1960; Wilson and Brocklebank 1960). Only in a climate of ignorance of the behaviour of contrast colours could Land have made such a stir, and have believed that he was revolutionizing colour science.

One could argue that this lack of acceptance is because the right experiments had not been done until recently, and that they were dependent on accurate knowledge of the cone spectral sensitivities, but I do not think those are the main reasons. The HSD and other ways of generating strong contrast colours were available in the nineteenth century, and König’s or Fick’s fundamentals were sufficiently accurate to allow Le Grand in 1949, to take just one example, to perform analyses of psychophysical data of lasting validity. Le Grand had also, in describing Spencer’s (1943) work, stated clearly the dependence of contrast colours on the cone Weber contrasts (Le Grand 1948, p. 222). I think a much more important reason is that it has been, and still is, quite reasonably assumed that if there are strong effects of contrast on colour then they should be clearly visible whenever two colours are juxtaposed. However, that is simply not so.
If you place saturated red and green papers side by side, and place a grey square on each, the squares do not generally acquire clear contrast colours. Similar illustrations of ‘simultaneous colour contrast’ in textbooks are often rather unconvincing. Furthermore, if you move a coloured object above a variegated background, you will usually see little change in its colour as it moves over different colours of floor or furniture. This constancy of colour in the presence of different surrounding surface colours or different backgrounds is a second type of colour constancy, in addition to the classical type with respect to changes in illumination. The most convincing evidence that there are two separate types of constancy, as opposed to a single complex algorithm that takes all the visual field into account, as in retinex theory, is provided by situations where the apparent colours change markedly as a result of changing only the perceptual interpretation, not the stimuli on the retina. An informal demonstration was described by Evans (1948, p. 166). Gilchrist et al. (1983) achieved a good approximation to the same situation. These were both in the lightness domain (see Whittle 1994b, for a fuller account). Both kinds of constancy have recently been demonstrated for colour in a series of experiments by Brainard and colleagues (Brainard 1998; Brainard et al. 1998, Chapter 10 this volume). Subjects saw largish (4◦ × 6◦ ) coloured rectangles mounted on a wall, the illumination of which could be changed. They were not CRT simulations as in so many current studies. In one experiment, subjects set a rectangle to look achromatic
under different colours of ambient illumination.20 Their settings approximately tracked the illuminant, so that if it was red light, for example, the patch was set so that it sent red light to the eye. This is classical colour constancy: the patch looked grey when the eye received almost the same light that a grey surface would reflect under each ambient illuminant (80% constancy according to their index). But when the surroundings were varied in colour not by ambient illumination but by placing around the patch either a large coloured board (16◦ × 20◦ ) or a sheet that covered the whole wall, there was very little effect on the achromatic settings. The patch was kept close to neutral reflectance. This is constancy with respect to neighbouring objects. Clearly, if the arguments of this chapter are correct, that the dependence of colour on contrast is mediated by low-level, perhaps retinal, mechanisms, this second kind of constancy is very puzzling. The classical type is one context in which the role of contrast in determining perceived colour has been most strongly asserted. The argument has been that since contrast is approximately independent of overall illumination, the constancy of perceived colour with respect to illumination, also approximate, would be explained if perceived colour depended on contrast. This argument is less straightforward for chromatic than for intensity changes, because the cone contrasts are not, in fact, at all independent of the spectral composition of illumination for arbitrary surfaces and illuminants—consider the extreme case of narrowband reflectors and illuminants. However, they are approximately so for most actual surfaces and natural illuminants (Foster and Nascimento 1994). But while a contrast code is an ideal mechanism for maintaining constancy with respect to illumination, it ought to produce strong ‘simultaneous contrast’ effects (i.e. contrast colours) when the colours of surrounding objects are changed. 
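The argument of the preceding sentences can be made concrete with a toy model. It assumes, as a simplification that the text itself flags as only approximately true, that an illuminant change acts as a separate multiplicative factor on each cone class for every surface (a von Kries-style diagonal model). The surface and illuminant values are invented for illustration.

```python
import numpy as np

def cone_weber_contrasts(patch_refl, surround_refl, illuminant):
    """Cone Weber contrasts of a patch against its surround, assuming
    cone excitation = illuminant factor * surface factor per cone class."""
    illuminant = np.asarray(illuminant, float)
    patch = illuminant * np.asarray(patch_refl, float)
    surround = illuminant * np.asarray(surround_refl, float)
    return (patch - surround) / surround

grey_patch = [0.5, 0.5, 0.5]     # hypothetical surfaces (L, M, S factors)
grey_wall = [0.4, 0.4, 0.4]
red_board = [0.6, 0.3, 0.1]
white_light = [1.0, 1.0, 1.0]    # hypothetical illuminants
red_light = [1.5, 0.8, 0.4]

under_white = cone_weber_contrasts(grey_patch, grey_wall, white_light)
under_red = cone_weber_contrasts(grey_patch, grey_wall, red_light)
on_red_board = cone_weber_contrasts(grey_patch, red_board, white_light)
```

Under this model `under_white` and `under_red` are identical, so a contrast code gives constancy with respect to illumination for free, whereas `on_red_board` differs markedly, so the same code predicts a strong contrast colour when only the neighbouring surface changes.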
But instead, we find the second type of constancy: an object is little affected by the colours of surrounding objects. There are two major puzzles here: first, how do we distinguish illumination changes from surrounding-object changes? Secondly, in the latter case, how is constancy maintained if a contrast code is all the brain receives? We don’t know the answer to either question, and this is not the place to speculate at length. The distinction between the two situations might require a global parsing into objects, lighting, and spatial layout (many phenomena show linkages between these three). But it is also possible that it could be done on the basis of relatively low-level cues, such as the structure of intersections, which can generate different percepts of transparency or opacity (Metelli 1974; Gerbino 1994). The colours of objects under a changing illumination can be contrast colours, derived directly from a contrast code. But when it is the surrounding surfaces that are changing around an object, if the object’s colour is coded in terms of contrast with respect to those surfaces, then to recover the constant colour of the object, these changes in contrast must somehow be compensated for. This suggests a process of integration, and various ideas of ‘edge integration’ have been put forward, sometimes after sorting the edges into reflectance and illumination edges. See Gilchrist (1994a) for some discussion of these ideas.

20 The adjustment was by concealed projectors illuminating just the test rectangle, and set up so that the rectangle always looked like a pigmented surface, except for its strange variability.
The constant of integration (the ‘anchoring problem’) may be supplied by an average of the colours in the scene: the ‘grey world’ assumption. Or perhaps the brain does not receive only contrast signals: there might be absolute information such as controls pupil size, though I do not know of any evidence for that in the colour domain. Another possibility is that the brain receives only contrast signals, but that there are many types of them, transmitted by different retinal ganglion cell types (which is likely), and that these could somehow be solved to extract absolute information.21 At present, this is a major area of ignorance in our understanding of colour vision. It may be that only when these problems are solved will contrast colours really find a secure, and probably fundamental, place in colour science. The important point in the present context, which it seems to me cannot be overstated, is that since we know from both everyday experience and experimental evidence that both these constancies exist, we cannot expect strong contrast colours (or von Kries adaptation) to show up every time we juxtapose different coloured lights, or present lights in different states of adaptation. The outcome will depend on how it is done, and in a rather subtle way. Specifically, on whether the variations in surround colours are seen as changes in illumination or changes in neighbouring surface colours. When the surround colours cannot be seen, or at any rate cannot be compared, as in the HSD, this choice is unavailable, and subjects are constrained to respond only on the basis of physical contrast. But the stimulus displays commonly used are, in effect, abstract pictures, which leave the perceptual interpretation indeterminate. The equivocal results of experiments on simultaneous contrast or on von Kries adaptation should be expected. 
The fact is that asymmetric matching experiments in which the background colours can be seen and compared, as they cannot be in the HSD, often do not produce strong contrast colours. Varying the background colour generally has a much weaker effect than varying patch colour, so measurements do not follow the cone contrast rule. But this is not an invariant result. The only accurate generalization about these experiments is that the results are variable. Sometimes they give results following the cone contrast rule [for example, some results of Lucassen (1993)], and sometimes not. There have been an immense number of such experiments, dating back at least to Kirschmann (1890). Wyszecki and Stiles (1982) and Wandell (1995) review some of them. The indeterminacy of their results is presumably one reason why each new arrival in the field feels they have to make their own measurements. But it is not enough to control only the spectral composition of patches and surrounds. New measurements will serve no purpose unless the perceptual interpretation of the surround colour is controlled.
Spatial versus temporal interactions

The concept of ‘contrast colours’ is shorthand for much of what is discussed under the headings of simultaneous and successive colour contrast, chromatic adaptation, and chromatic induction.

21 For example, if $L/L_b$ and $L/(L_b + L_0)$ are signalled, and the constant $L_0$ is known, then $L$ and $L_b$ can be derived.
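Footnote 21 can be checked with a little algebra. The sketch below assumes the two signalled quantities are the ratio L/Lb and the 'dark light' form L/(Lb + L0) introduced earlier; the subscripts are partly lost in the printed footnote, so this reading is an assumption. Since 1/(L/(Lb + L0)) − 1/(L/Lb) = L0/L, a known L0 yields L and then Lb. The function names are hypothetical.

```python
def modified_contrast(l, l_b, l_0):
    # 'Dark light' form from the text: L / (Lb + L0)
    return l / (l_b + l_0)

def recover_absolute(ratio_plain, ratio_dark, l_0):
    """Recover (L, Lb) from ratio_plain = L/Lb and ratio_dark = L/(Lb + L0).
    Requires l_0 > 0, otherwise the two ratios carry the same information."""
    l = l_0 / (1.0 / ratio_dark - 1.0 / ratio_plain)   # because the bracket equals L0 / L
    l_b = l / ratio_plain
    return l, l_b

# Hypothetical values: L = 12, Lb = 4, L0 = 2.
l, l_b = recover_absolute(12.0 / 4.0, modified_contrast(12.0, 4.0, 2.0), 2.0)

# With a dark surround (Lb = 0) the modified contrast is proportional to L.
dark_a = modified_contrast(6.0, 0.0, 2.0)
dark_b = modified_contrast(12.0, 0.0, 2.0)
```

The second part illustrates the earlier suggestion that, with a dark surround, such modified contrasts scale with the cone signals themselves instead of saturating at 1.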
It is often asked whether contrast colours are the product of spatial or temporal interactions or both. We need here to distinguish interactions between neighbouring or successive parts of the stimulus, from the neural mechanisms mediating them. For example, to produce a dark colour as a steady object rather than a transient appearance requires a bright surround. In that sense spatial interactions are required. But the effect of the surround could be mediated entirely by temporal neural interactions, produced by eye movements jiggling the image of the edge back and forth across the photoreceptors, rather than by the spatially opponent colour receptive fields which we know exist. Good contrast colours can certainly be produced by temporal stimulus and neural interactions alone, as is shown, for example, by Webster and Mollon (1995). The instantaneous appearance of coloured shadows and other contrast effects is often cited as evidence that contrast colours can be produced by spatial neural interactions alone. However, I do not know of really convincing experimental evidence, for example from stabilized image experiments. The results of most studies could be due to eye movements plus temporal neural interactions. One piece of evidence comes from the crispening effect (see above). For contrast brightnesses this effect is markedly dependent on the exact edge profile, suggesting that spatial interactions across the edge are important. The evidence is that the effect is abolished by a thin black ring round the edge of the patch (Whittle 1992). However, a black ring makes no difference at all to the chromatic version of the effect (Ovenston 1998). This suggests, although not conclusively, that spatial neural interactions may be more important for brightness than for colour.
It should be noted that even if contrast colours are produced predominantly by temporal interactions rather than spatial, this cannot account for the weakness of simultaneous colour contrast with respect to neighbouring objects (i.e. to the second type of colour constancy). Retinal generation of chromatic contrast signals, whether by spatially opponent colour receptive fields or receptor adaptation, or whatever, encodes the proximal stimulus, not the distal, and will therefore be indifferent to whether the neighbouring colour is a surface or an illumination colour.
Further complications: colour is not single-valued

The phenomenon of contrast colours has surprising implications. It is easy to say ‘situations under which colour is determined by contrast’, but if such situations are common in everyday life, they radically upset common beliefs about colour. Take a large bright red field. Make a small region of it slightly darker, keeping its physical colour (the spectral distribution of energy) the same, so that the relative stimulation of the three cone classes is unchanged. Therefore the three cone contrasts are all the same, and equal to the luminance contrast. So according to the cone contrast rule it should match a patch of the same luminance contrast in a background of any other colour, including grey. Another way to put it is to say that since it has no chromatic contrast with respect to its surround, if colour is determined by chromatic contrast, it should look colourless, grey. But it was bright red a moment ago. Can the slight darkening or lightening have so dramatically changed its colour? We know this sort of discontinuity in colour appearance doesn’t happen. What we will have is the bright red
field with a small region a bit darker or lighter, but still red. Yet in the HSD it does happen, in the sense that the small region turns out to match a grey patch in a grey surround (Whittle et al. 1991). The matches seem to imply a discontinuity that you do not see in the colour as normally viewed. But in fact there is a corresponding discontinuity in normal vision. The slight darkening gives birth to totally new ways to see the small region: for example as a grey shadow on the red background, or as a grey object in a lighter grey surround both illuminated by intense red light. Contrast creates boundaries, and it is boundaries that organize the field and allow different ways of seeing: create ‘the structure “lighting–object lighted” ’ (Merleau-Ponty 1945, p. 307 of English translation). In this sense it can suddenly ‘look grey’ in ordinary vision.22 The corresponding mode of seeing, relative colour, is the only one allowed by the HSD, in which the background is masked.23 So the phenomena of contrast colours lead us to the ambiguity of colour: that regions of the visual field do not just ‘have a colour’, but can be seen in different ways. Colours can vary dramatically, depending on how one parses the scene into objects, lighting, and transparent media. Jakobsson et al. (1998) give a striking example of a pattern of saturated colours that looks achromatic when the saturated colour is parsed (and therefore discounted) as the illuminant. And it is quite normal to see two colours in the same place. Here is Evans again:

Perhaps the greatest contribution Katz made in his book (Katz, 1911/1935) was his insistence that the light that is seen to be illuminating objects is perceived as separate from the objects. This is a concept that is obvious to the naive and obscure to scientists; it is a basic tenet of the present book that this is so fundamentally true that no perception of a complex scene can be analyzed without taking it into account. (Evans 1974, p. 91)
Both these kinds of ambiguity contradict the notion that perceived colour is just a transduction or transformation of the physical colour. This notion, which Gilchrist (1994b) has called the ‘photometer metaphor’, and Mausfeld (1998, Chapter 13 in this volume) the ‘measurement device conception of perception’, has dominated psychophysics at least since Fechner. Although many people have argued persuasively against it, it is so temptingly simple, allows just enough space for some neural complexities (like lateral inhibition and receptive fields), and meshes with so many of our habitual Cartesian assumptions, that it has great staying power. Its dominance is perhaps the deepest reason why contrast colours remain in their uneasy position in colour science: invoked when convenient, but much of the time ignored.

22 Dejan Todorovich notes: ‘The predicted grayness of its appearance is in accord with our phenomenal impression, noted by Kardos (1934), that, if we ascribe a color to the shadow at all, it is usually gray and not chromatic (the case of genuine “colored shadows” excluded, but it is notable that for the layman that is a counterintuitive phenomenon). Kardos notes that it is very hard for us to see a shaded portion of a chromatic surface as a darker shade of chromatic color.’ (Todorovich, personal communication).
23 The psychophysics of contrast colours has a dual relation to normal vision. On the one hand, it describes some mechanisms of early vision. On the other hand, it measures, fixes, expresses, particular ways of seeing coloured patches: primarily as object colours, but also as shadows. To be good psychophysics, it has to remove the ambiguity of colour.
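The red-field example that opens this section can be verified numerically. In this sketch the cone excitations for the 'red' and 'grey' fields are invented, signed Michelson contrasts are used, and the coordinates are the x, y, z defined earlier in the chapter; the function name is hypothetical.

```python
import numpy as np

def xyz_from_lms(lms, bg):
    """Contrast-colour coordinates from signed Michelson cone contrasts."""
    lms, bg = np.asarray(lms, float), np.asarray(bg, float)
    c_l, c_m, c_s = (lms - bg) / (lms + bg)
    return np.array([c_l - c_m, c_s - (c_l + c_m) / 2, (c_l + c_m) / 2])

red_field = np.array([30.0, 8.0, 1.0])      # hypothetical bright red field
grey_field = np.array([10.0, 10.0, 10.0])   # hypothetical grey field

# Darken a small region by 20% without changing its chromaticity.
darker_on_red = xyz_from_lms(0.8 * red_field, red_field)
darker_on_grey = xyz_from_lms(0.8 * grey_field, grey_field)
```

Both regions land at the same point, with zero on the two chromatic axes and only a negative light–dark component, so the cone contrast rule predicts that the darkened red region matches a darkened grey one. That is the match reported for the HSD.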
References

Abramov, I. and Gordon, J. (1994). Color appearance: on seeing red – or yellow, or green, or blue. Annual Review of Psychology 45, 451–485.
Alpern, M. (1964). Relation between brightness and color contrast. Journal of the Optical Society of America 54, 1491–1492.
Bezold, W. von (1874). Die Farbenlehre im Hinblick auf Kunst und Kunstgewerbe. Westermann, Brunswick. [English translation by S. R. Koehler (1876). The theory of color and its relation to art and art-industry. Prang, Boston.]
Brainard, D. H. (1998). Color constancy in the nearly natural image. 2. Achromatic loci. Journal of the Optical Society of America A 15, 307–325.
Brainard, D. H., Brunt, W. A., and Speigle, J. M. (1998). Color constancy in the nearly natural image. 1. Asymmetric matches. Journal of the Optical Society of America A 14, 2091–2110.
Brill, M. H. (1995). Commentary on Ives, 1912. Color Research Applications 20, 70–71.
Brookes, A. and Stevens, K. A. (1989). The analogy between stereo depth and brightness. Perception 18, 601–614.
Chichilnisky, E. J. and Wandell, B. A. (1995). Photoreceptor sensitivity changes explain color appearance shifts induced by large uniform backgrounds in dichoptic matching. Vision Research 35, 239–254.
Chichilnisky, E. J. and Wandell, B. A. (1996). Seeing gray through the ON- and OFF-pathways. Visual Neuroscience 13, 591–596.
Chichilnisky, E. J. and Wandell, B. A. (1997). Increment-decrement asymmetry in adaptation. Vision Research 37, 616.
Chubb, C., Sperling, G., and Solomon, J. A. (1989). Texture interactions determine perceived contrast. Proceedings of the National Academy of Science USA 86, 9631–9635.
Derefeldt, G. (1991). Colour appearance systems. In The perception of colour (ed. P. Gouras), pp. 218–261. Vol. 6 of Vision and visual dysfunction. Macmillan, London.
Derrington, A. M., Krauskopf, J., and Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. Journal of Physiology 357, 241–265.
Evans, R. M. (1948). An introduction to color. Wiley, New York.
Evans, R. M. (1974). The perception of color. Wiley, New York.
Foster, D. H. and Nascimento, S. M. C. (1994). Relational colour constancy from invariant cone-excitation ratios. Proceedings of the Royal Society B 257, 115–121.
Friele, L. F. C. (1961). Analysis of the Brown and Brown–MacAdam colour discrimination data. Die Farbe 10, 193.
Gerbino, W. (1994). Achromatic transparency. In Lightness, brightness and transparency (ed. A. L. Gilchrist). Erlbaum, Hillsdale, NJ.
Gilchrist, A. L. (ed.) (1994a). Lightness, brightness and transparency. Erlbaum, Hillsdale, NJ.
Gilchrist, A. L. (1994b). Introduction: Absolute versus relative theories of lightness perception. In Lightness, brightness and transparency (ed. A. L. Gilchrist). Erlbaum, Hillsdale, NJ.
Gilchrist, A. L., Delman, S., and Jacobsen, A. (1983). The classification and integration of edges as critical to the perception of reflectances and illumination. Perception and Psychophysics 33, 425–436.
Gouras, P. and Zrenner, E. (1981). Color vision: a review from a neurophysiological perspective. Progress in Sensory Physiology 1, 139–179.
He, S. and MacLeod, D. I. A. (1998a). Contrast-modulation flicker: dynamics and spatial resolution of the light adaptation process. Vision Research 38, 985–1000.
Helson, H. (1938). Fundamental Problems in color vision I. The principle governing changes in hue, saturation and lightness of non-selective samples in chromatic illumination. Journal of Experimental Psychology 23, 439–476.
contrast colours
137
Hering, E. (1890). Beitrag zur Lehre vom Simultankontrast. Zeitschrift für Psychology 1, 18–28. Hilbert, D. R. (1987). Color and color perception: A study in anthropocentric realism. CSLI Lecture Notes, Stanford CA. Ives, H. E. (1912). The relation between the color of the illuminant and the color of the illuminated object. Transactions of the Illuminating Engineering Society 7, 62–72. Jakobsson, T., Bergström, S. S., Gustafsson, K.-A., and Fedorovskaya, E. (1998). Ambiguities in colour constancy and shape from shading. Perception 26, 531–541. Judd, D. B. (1940). Hue, saturation and the lightness of surface colors with chromatic illumination. Journal of the Optical Society of America 30, 2–32. Judd, D. B. (1960). Appraisal of Land’s work on two-primary color projection. Journal of the Optical Society of America 50, 254–268. Kaiser, P. K. and Boynton, R. M. (1996). Human color vision, (2nd edn). Optical Society of America, Washington, DC. Kardos, L. (1934). Ding und Schatten. Zeitschrift für Psychologie, Ergänzungsband 23, 1–184. Katz, D. (1935). The world of colour. Kegan Paul, Trench, Trubner, London. Kirschmann, A. (1890). Über die quantitativen Verhältnisse der simultanen Helligkeits- und Farbenkontrastes. Philosophisches Studien 6, 417–491. Kries, J. von (1905). Einfluß der Stimmung des Sehorgans auf die durch Lichtreize hervorgerufenen Erfolge. In Handbuch der Physiologie des Menschen, Vol. 3, (ed. W. Nagel), pp. 208–214. [English translation in D. L. MacAdam (ed.) (1970). Sources of color science. MIT Press, Cambridge, MA.] Land, E. H. and McCann, J. J. (1971). Lightness and retinex theory. Journal of the Optical Society of America 61, 1–11. Larimer, J. and Piantanida, T. P. (1988). The impact of boundaries on colour: stabilized image studies. Society of Photo-optical Instrument Engineers, Vol. 901, Image processing, Analysis, Measurement and Quality, 241–247. Le Grand, Y. (1948). Light, colour and vision. Chapman & Hall, London. Le Grand, Y. (1949). 
Les seuils différentiels de couleurs dans la théorie de Young. Revue d’Optique 28, 261–278 [English translation: Color research and applications (1994) 19, 296–309]. Lucassen, M. P. (1993). Quantitative studies of color constancy. Dissertation, TNO. Mausfeld, R. (1998). Colour perception: From Grassmann codes to a dual code for object and illumination colours. In Color vision: A perspective from different disciplines, (ed. W. Backhaus, R. Kliegl, and J. S. Werner), pp. 219–250. de Gruyter, Berlin. Mausfeld, R. and Niederée, R. (1993). An inquiry into relational concepts of colour based on incremental principles of colour coding for minimal relational stimuli. Perception 22, 427–462. McDougall, W. (1901). Some new observations in support of Thomas Young’s theory of light and colourvision, III. Mind 10, 347–382. Merleau-Ponty, M. (1945). Phenomenology of perception. Tel, Paris. [English translation (1962). Routledge & Kegan Paul, London]. Metelli, F. (1974). The perception of transparency. Scientific American 230, 90–98. Mollon, J. D. (1987). On the nature of models of colour vision. Die Farbe 34, 29–46. Mollon, J. D. (1995). Seeing colour. In Colour: art and science, (ed. T. Lamb and J. Bourriau), pp. 127–150. Cambridge University Press, Cambridge. Monge, G. (1789). Mémoire sur quelques phénomènes de la vision. Annales de Chimie 3, 131–147. Myers C. S. (1923) The influence of the late W. H. R Rivers. In Pychology and politics and other essays, (ed. W. H. R. Rivers). Kegan Paul, Trench, Trubner, London. Nerger, J. L., Piantanida, T. P., and Larimer, J. (1993). Color appearance of filled-in backgrounds affects hue cancellation, but not detection thresholds. Vision Research 33, 165–172. Ovenston, C. A. (1998). The scaling and discrimination of contrast colours. PhD dissertation, Cambridge University.
138
colour perception
Ovenston, C. A. and Whittle, P. (1996). Improved discrimination near the background field: transfer of the ‘crispening effect’ to colour. Perception (Suppl.) 25, 16. Poirson, A. B. and Wandell, B. A. (1993). The appearance of colored patterns: pattern–color separability. Journal of the Optical Society of America 12, 2458–2471. Rivers, W. H. R. (1900). Vision. In Textbook of physiology, Vol. 2, (ed. E. A. Schäfer), pp. 1026–1148. Young J. Pentland, Edinburgh. Rollett, A. (1867). Zur Physiologie der Kontrastfarben. Sitzungsberichte Akademie der Wissenschaften, Mathematisch-naturwissenschaftliche Klasse 55, 741–766. Semmelroth, C. C. (1970). Prediction of lightness and brightness on different backgrounds. Journal of the Optical Society of America 60, 1685–1689. Shevell, S. K. (1978). The dual role of chromatic backgrounds in color perception. Vision Research 18, 1649–61. Shinomori, K., Schefrin, B. E. and Werner, J. S. (1997). Spectral mechanisms of spatially induced blackness: data and quantitative model. Journal of the Optical Society of America A 14, 372–387. Spencer, D. E. (1943). Adaptation in color space. Journal of the Optical Society of America 33, 10–17. Takasaki, H. (1966). Lightness change of grays induced by change in reflectance of gray background. Journal of the Optical Society of America 56, 504–509. Takasaki, H. (1967). Chromatic changes induced by changes in chromaticity of background of constant lightness. Journal of the Optical Society of America 57, 93–96. Tschermak, A. (1903). Über Kontrast und Irradiation. Ergebnisse der Physiologie 2, 726–798. Vingrys, A. J. and Mahon, L. E. (1998). Color and luminance detection and discrimination asymmetries and interactions. Vision Research 38, 1085–1096. Walls, G. L. (1960). Land! Land! Psychological Bulletin 57, 29–48. Walraven, J. (1976). Discounting the background—the missing link in the explanation of chromatic induction. Vision Research 16, 289–295. Walraven, J. (1981). 
Perceived colour under conditions of chromatic adaptation: evidence for gain control by mechanisms. Vision Research 21, 611–620. Walraven, J., Enroth-Cugell, C., Hood, D. C., MacLeod, D. I. A., and Schnapf, J. (1990). The control of visual sensitivity: receptoral and post-receptoral processes. In Visual perception: Neurophysiological foundations, (ed. L. Spillmann and J. S. Werner). Academic Press, San Diego, CA. Wandell, B. A. (1995). Foundations of vision. Sinauer, Sunderland MA. Webster, M. A. (1996). Human colour perception and its adaptation. Network: Computation in Neural Systems 7, 587–634. Webster, M. A. and Mollon, J. D. (1995). Colour constancy influenced by contrast adaptation. Nature 373, 694–8. Whittle, P. (1992). Brightness, discriminability and the ‘Crispening Effect’. Vision Research 32, 1493–1507. Whittle, P. (1994a). The psychophysics of contrast-brightness. In Lightness, brightness and transparency, (ed. A. L. Gilchrist). Lawrence Erlbaum Associates, Mahwah, NJ. Whittle, P. (1994b). Contrast-brightness and ordinary seeing. In Lightness, brightness and transparency, (ed. A. L. Gilchrist). Lawrence Erlbaum Associates, Mahwah, NJ. Whittle, P. and Arend, L. E. (1991). Homochromatic colour induction. Perception (Suppl.) 20, 99. Wilson, M. H. and Brocklebank, R. W. (1960). Two-colour projection phenomena. Journal of Photographic Science 8, 142–150. Woodworth, R. S. (1938). Experimental psychology. Henry Holt, New York. Wuerger, S. (1996). Color appearance changes resulting from iso-luminant chromatic adaptation. Vision Research 36, 3107–3118. Wyszecki, G. and Stiles, W. S. (1967). Color science. Wiley, New York.
commentary: contrast colours
139
Commentaries on Whittle
A background to colour vision
Michael A. Webster
Paul Whittle once whisked me into his lab to try out an experiment. An array of discs was arranged in a spiral on a monitor. The two extremes were black and white, and I was to set the intermediate ones so that they formed equal steps in between. Somehow I began by accidentally clicking a button that changed the uniform background, so that it too varied from light to dark across the screen. While he waited in anticipation next door, I also spent much longer than I was supposed to meticulously readjusting each circle. In the end I was satisfied with my results, and even a little pleased to think I was clever enough to figure out exactly what he was up to. But the moment he returned my theories were dashed (and the settings were trashed), for it was not the dots but the background that he was interested in.
In this chapter Paul provides a forceful argument for considering the background context in colour perception. Backgrounds define the baseline against which stimuli are judged, and it is the deviations from this baseline (the contrasts) that form the critical signals in colour vision.
There are two central messages in the chapter. The first is that contrast is fundamental to colour perception, as it likely is to all of perception. As the many examples he discusses illustrate, the hue or lightness of a stimulus can be varied at will simply by changing the context in which it is embedded. That the experience of colour can be so strongly decoupled from the physical spectrum is a well-known, yet somehow discomforting, thought. However, it becomes more obvious and reassuring when it is remembered that vision is not about experiencing light but rather the things that light reveals.
Contrast is much more closely tied to the characteristics of surfaces than the illuminant, and the way it is emphasized throughout visual coding provides one of the most compelling examples that perceptual processes are designed to extract ecologically relevant features of the environment. The second important message is that—as important as contrast is—we are a long way from understanding what it is. There are several ways in which the concept of contrast remains poorly defined. One, which is considered at length in the chapter, concerns the problem of defining an appropriate metric. As Paul notes, choosing the right measure is problematic even for the simplest case of a uniform spot and background. These problems become much worse as the stimuli become more complicated. For example, Michelson contrast, which is widely used for simple spatial patterns, depends only on the maximum and minimum values in the stimulus, and may be useless for describing many of the perceptually relevant characteristics of contrast in natural scenes. A second problem is trying to understand what parts of the stimulus control contrast. Contrast is relative to the background, but what is the background? For example, the colour of a stimulus depends in large part on how we are light adapted at the moment it appears. But over what window of time are signals integrated to determine the adapting background? Similarly, spatial contrast depends on the stimulus and its surround, but which parts of the surrounding field are important is often hard to ascertain and may vary dramatically depending on the nature of the stimulus. These are difficult questions because there are probably many answers. The visual system must solve many problems with competing demands. The meaning of contrast may be very different within these. The temporal contrast of adaptation may have a very different basis and function than the spatial contrast of induction. 
Moreover, there are many striking examples of how the perception of lightness and colour can be drastically altered by subtle stimulus changes that nevertheless suggest completely different interpretations about the geometry of scenes. Contrast may therefore be a collection of many principles, yet these share the common idea that perception depends on comparisons (a phrase Semir Zeki described as the best single ‘sound bite’ to describe perception). Paul Whittle’s chapter leaves little doubt that these comparisons are at the centre of colour vision. I returned many times to his lab to marvel at his remarkable demonstrations and experiments on contrast effects, but I don’t think he ever asked me again to be a subject.
Commentaries on Whittle
Contrast coding and what else?
Hans Irtel
The visual system achieves two kinds of colour constancy: invariance against (1) illumination changes and (2) changes of surrounding object colours. Empirical data (Foster and Nascimento 1994) show that spatial contrast coding of receptor signals might be a mechanism for obtaining invariance against illumination changes in natural light environments, where illuminant and surface remittance spectra are smooth and of low dimension. However, people may hesitate to accept the idea that the visual system has to reconstruct the visual world from contrast signals, since logic tells us that contrast coding also implies a loss of information about illumination and physical surface properties. Not only do we rarely observe changes of object colours when moving these across differently coloured background surfaces, we are also quite able to identify illumination properties. It is no problem to differentiate warm morning light from cool high-noon lighting, and tungsten light from fluorescent light sources, even if the sources themselves are invisible.
We ran an experiment to investigate how the size of the visual angle of a background stimulus affects adaptation. We compared the 14° viewing angle achievable by using a CRT monitor to full visual field adaptation created by controlling the coloured fluorescent tubes of a room illumination system. The subject’s task was to select an achromatic looking patch from a sample of Munsell chips simulated on a CRT screen. The data show almost 100% adaptation to the background colour under full view adaptation. The patch selected as achromatic has the same chromaticity coordinates as the background light. However, for every subject it is completely clear that the room illuminant is not achromatic but strongly chromatic. Thus subjects adapt but still are able to identify the illuminant colour.
This observation does not strictly rule out contrast coding but makes clear that there may be intensity-dependent non-linearities involved which effectively retain information about absolute signals. Accepting contrast coding as one of the early stages of visual processing thus creates new problems which seem to be very hard to solve. One of them, not mentioned by Paul Whittle, is how the visual system can create homogeneous areas from contrast signals at edges only. Actually the text is rather vague about whether the signals are of spatial or temporal origin and whether all colour signals are contrast signals only. For now, let’s assume we only have spatial contrast signals available. It is well known that there is a fundamental difference between the achromatic and the chromatic system in the spatial domain: the chromatic system behaves like a spatial low-pass filter while the achromatic system behaves like a spatial band-pass filter. Thus we can easily detect smooth chromatic changes where smooth brightness changes go unnoticed. Furthermore, large homogeneously coloured surfaces exist and are perceived as such. It seems to be completely unclear which mechanisms of the visual system are capable of such a long-range interaction within the visual field, which would be necessary to recreate homogeneous areas from contours only. Let’s take the complexity one step further: there might be a simple solution to the case where the colour of a homogeneous patch within a homogeneous surround has to be recreated from the contrast at the contour. However, in most cases there will be more than a single contrast value available, since most coloured areas have contours with different background surfaces and still appear homogeneously coloured. And the question is how multiple background elements and their contrast values are processed to create the apparent colour of a target area.
A first step is to ask whether for any complex context pattern there exists a homogeneous context pattern which has the same context effect on any arbitrary target stimulus. Such an experiment was run by Bruno et al. (1997) and Irtel (1998). The results show a strong target dependence of the equivalent homogeneous context and point to a two-process model of lateral interaction. The first process seems to be strongly related to illumination invariance, as captured by contrast coding: the intensity of every patch in the visual field is normalized by a reference, which may be built from a low-pass filtered channel of the visual image.
The data of Irtel (1998) favour an averaging process over the anchoring at the maximum suggested by Gilchrist et al. (1999) and Gilchrist, Chapter 14, this volume. This process is responsible for most of the effects that are captured by von Kries-type normalization models. There is, however, a second process, which serves contrast enhancement in those cases where target patches have homogeneous borders with their surrounding fields. This seems to be a rather local contrast enhancement which also is responsible for the ‘crispening effect’ mentioned by Paul Whittle. This local contrast enhancement is strongly reduced, or even disappears, when the target-surrounding areas are not homogeneous but vary in intensity. The consequence is that no simple model, based on contrast signals only, is able to predict the parameters of a homogeneous surround field which has context effects equivalent to some complex surround. Spatial arrangement has to be incorporated into the model.
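The first process described above, normalization of every patch by a reference, can be sketched in a few lines. This is only an illustration under simple assumptions: the reference is either the plain average of all patches (the crudest possible low-pass signal) or the maximum (an anchoring rule), and the function names and intensity values are hypothetical rather than taken from the experiments.

```python
def normalise_by_mean(intensities):
    """Divide each patch intensity by the average over all patches."""
    reference = sum(intensities) / len(intensities)
    return [value / reference for value in intensities]


def normalise_by_max(intensities):
    """Divide each patch intensity by the highest intensity (anchoring)."""
    reference = max(intensities)
    return [value / reference for value in intensities]


# Hypothetical scene: the same three surfaces under a dim light and under
# a light three times as intense.
dim_scene = [10.0, 20.0, 30.0]
bright_scene = [3.0 * value for value in dim_scene]

mean_coded_dim = normalise_by_mean(dim_scene)
mean_coded_bright = normalise_by_mean(bright_scene)
max_coded_dim = normalise_by_max(dim_scene)
max_coded_bright = normalise_by_max(bright_scene)
```

Both rules remove a common scaling of the illumination, so the coded values of the dim and bright scenes agree. They differ in which patch receives the value 1, which is why data on equivalent backgrounds can in principle discriminate between them.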
References
Bruno, N., Bernardis, P., and Schirillo, J. (1997). Lightness, equivalent backgrounds, and anchoring. Perception and Psychophysics 59, 643–654.
Foster, D. H. and Nascimento, S. M. C. (1994). Relational colour constancy from invariant cone-excitation ratios. Proceedings of the Royal Society of London, B 257, 115–121.
Gilchrist, A., Kossyfidis, Ch., Bonato, F., Agostini, T., Cataliotti, J., Li, X., Spehar, B., Annan, V., and Economou, E. (1999). An anchoring theory of lightness perception. Psychological Review 106, 795–834.
Irtel, H. (1998). Equivalent homogenous context intensity depends on average intensity, contrast range, and relative target intensity of complex context patterns. Perception 27 (suppl.), 88–89.
chapter 4
COLOUR AND THE PROCESSING OF CHROMATIC INFORMATION
Michael D’Zmura
Preface
My interest in colour vision was formed during my graduate work with Peter Lennie at the University of Rochester, New York, during the mid-1980s. Peter Lennie and John Krauskopf were studying chromatic processing in macaque lateral geniculate nucleus (LGN) using electrophysiological techniques, and I fell naturally into related research using psychophysical techniques. Krauskopf’s psychophysical experiments with chromatic habituation in the early 1980s revealed cardinal axes for colour vision which matched peak chromatic sensitivities in the LGN. The cardinal axes do not match the colour-opponent mechanism sensitivities found with judgements of colour appearance, like those of Hurvich and Jameson (1955), and the question arose of the relationship between detection-based cardinal axis sensitivities and appearance-based colour-opponent mechanism sensitivities. The thinking is that experiments involving detection are more objective than those involving judgements of appearance, and the underlying question was thus whether one could study the mechanisms involved in the conscious representation of colour using objective techniques. Work at that time with noise-masking techniques to study achromatic vision (particularly that of Denis Pelli) prompted me to study this question using noise masking. I thought that perhaps the standard red–green and yellow–blue mechanism sensitivities could be measured in a detection experiment using a noise-masking variant of Stiles’ techniques. For instance, a red signal would be visible to only the red–green mechanism, and threshold elevation by noise along various axes in colour space could be used to measure its sensitivity. Experimental results seemed to confirm the hypothesis that, in detection tasks, signals of unique hue could be used to reveal the sensitivities of appearance-based mechanisms.
Yet the results found with signals of intermediate hue, such as orange, suggested that observers detect such a signal by using a mixture of mechanisms that depends on the chromatic properties of the noise mask. For instance, one can detect an orange signal presented in yellow–blue noise using a ‘red’-sensitive mechanism, but one detects the same signal using a ‘yellow’-sensitive mechanism when it is presented in red–green noise. This ‘off-axis looking’ led me to formulate a model in which noise is represented by a covariance matrix, the inversion of which reveals the chromatic sensitivity of the mechanism with the greatest signal-to-noise ratio. Such a model, I felt, would quantify observers’ tendency to detect a signal of intermediate hue using a mechanism sensitive to the signal yet minimally sensitive to a particular noise. Donald MacLeod pointed out to me that such a model is isotropic, and cannot distinguish between unique-hue stimuli and those of intermediate hue. The model suggests that observers should use varying detection mechanisms to detect not only an orange signal but also unique red, unique green, unique yellow, and unique blue signals. Yet the results of my experiments did not reveal off-axis looking for unique-hue signals. I went back to the drawing board. I suspected that observers may have been able to improve their sensitivity to unique-hue signals using off-axis looking, but was not willing to redo the noise-masking experiments, as they are quite tedious and involve looking for things which typically cannot be seen. I turned to the visual search paradigm used by Anne Treisman, which was in vogue among vision researchers at the time. The results of these experiments were consistent with the isotropic model; they did not reveal the standard colour-opponent mechanism sensitivities. These and further experiments are discussed in this chapter. M. D’Zmura
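The covariance-based model mentioned in this preface can be illustrated with a minimal sketch. It assumes a two-dimensional colour plane with a red–green axis and a yellow–blue axis, and it uses the standard matched-filter result that the linear mechanism with the greatest signal-to-noise ratio has weights proportional to the inverse noise covariance applied to the signal. The coordinates, noise variances, and function names are hypothetical and chosen only for illustration; this is not the fitted model from the experiments.

```python
import math


def invert_2x2(matrix):
    """Invert a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return ((d / det, -b / det), (-c / det, a / det))


def best_mechanism(signal, noise_cov):
    """Unit-length weights proportional to inverse(noise_cov) applied to signal."""
    inv = invert_2x2(noise_cov)
    w = (inv[0][0] * signal[0] + inv[0][1] * signal[1],
         inv[1][0] * signal[0] + inv[1][1] * signal[1])
    norm = math.hypot(w[0], w[1])
    return (w[0] / norm, w[1] / norm)


def snr(weights, signal, noise_cov):
    """Squared mean response to the signal divided by the noise variance."""
    response = weights[0] * signal[0] + weights[1] * signal[1]
    variance = sum(weights[i] * noise_cov[i][j] * weights[j]
                   for i in range(2) for j in range(2))
    return response ** 2 / variance


# Axes: first coordinate red-green, second coordinate yellow-blue.
ORANGE = (math.sqrt(0.5), math.sqrt(0.5))   # hypothetical orange signal
YB_NOISE = ((0.01, 0.0), (0.0, 1.0))        # noise mostly along yellow-blue
RG_NOISE = ((1.0, 0.0), (0.0, 0.01))        # noise mostly along red-green

w_in_yb_noise = best_mechanism(ORANGE, YB_NOISE)
w_in_rg_noise = best_mechanism(ORANGE, RG_NOISE)
```

With these assumed values, the best mechanism for the orange signal in yellow–blue noise points almost along the red–green axis, and in red–green noise almost along the yellow–blue axis. Nothing in the computation singles out unique hues, which is the isotropy noted by MacLeod.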
Introduction
The study of colour vision has focused on the relation between colour appearance and the physical stimulation of the eye by light. Newton (1704/1952) spread sunlight into a visible spectrum using a prism and so linked light wavelength and perceived hue. Young (1802) and Helmholtz (1909/1962) accounted for colour-matching data by developing the trichromatic theory of colour vision, which held that perceived colour is determined by the extent to which a light stimulates three classes of retinal photoreceptor. The work of Hering (1878/1964) and, later, Hurvich and Jameson (1957), completed the present, standard account of colour vision by adding a second, colour-opponent, stage of chromatic processing that transforms signals from the initial photoreceptoral stage. In this view, the appearance of a light is determined by the extent to which it stimulates three classes of colour-opponent neurons, rather than the extent to which it stimulates three classes of photoreceptors.
Trichromatic and colour-opponent theories have been used to account for how our visual system uses spectral information. One such use is to discriminate between two lights on the basis of a difference in their spectral properties. Helmholtz (1891, 1892) used a line-element model which measures the difference in stimulation of the three classes of photoreceptor to fit wavelength discrimination data. The activities of the photoreceptors are then scaled to quantify the two lights’ discriminability (von Kries 1905). Other trichromatic line-element models scale and combine photoreceptoral information in different ways (e.g. Stiles 1946). Colour-opponent theory has also resulted in its share of line-element and related models of colour discrimination (e.g. Hurvich and Jameson 1955; Vos and Walraven 1972a, b). The aim, again, has been to fit colour discrimination data by supposing that activity determining colour appearance also determines discriminability.
This work shows that outputs from colour-opponent units must be scaled to take into account adaptation, viz. the regulation of visual sensitivity. Indeed, there was one point in the history of colour science when there was strong evidence for unitary mechanisms involved in colour discrimination, Stiles’ mechanisms, that corresponded neither to the three classes of photoreceptor nor to the black–white, red–green, and yellow–blue colour-opponent processes (Stiles 1978). Yet these mechanisms were soon understood in terms of the adaptation of the standard colour-opponent mechanisms (Pugh 1976).
In summary, accounts of colour discrimination and other tasks that involve the visual processing of spectral information have been derived historically from accounts of colour appearance. Indeed, all is well in the world of colour research when performance in chromatic processing tasks can be understood as a by-product of colour appearance. The problem is that psychophysical and electrophysiological research in colour vision no longer supports this simple view. This chapter briefly reviews work showing that we use mechanisms, in colour detection tasks, which have sensitivities that lie between those of the standard black–white, red–green, and yellow–blue mechanisms. Discovered by Krauskopf and his colleagues, working with chromatic habituation (Krauskopf et al. 1986), these mechanisms with intermediate sensitivities have a clear anatomical correlate in the macaque visual cortex and have been shown to determine behaviour in practical visual search and colour detection tasks.
Chromatic habituation
Krauskopf and colleagues used a habituation technique to isolate colour detection mechanisms. Observers in their experiments exposed themselves to high-contrast modulations of chromaticity along various directions in colour space. Prolonged exposure to the high-contrast modulations desensitizes observers in a colour-selective way. For instance, if observers stare at a pulsing red–green stimulus for 1 min, their ability to detect red or green stimuli in the immediately following period is poorer, but their ability to detect yellow or blue stimuli is relatively unimpaired. Likewise, staring at a yellow–blue pulsation impairs the detection of yellow or blue signals but has little effect on the detection of red or green signals. The interpretation of these results is that observers possess both red–green and yellow–blue opponent mechanisms, and that these can be desensitized or habituated independently of one another (Krauskopf et al. 1982).
The interesting part arose when Krauskopf and colleagues asked their observers to view pulsing stimuli with intermediate colours, such as orange. If observers have only the red–green and yellow–blue colour-opponent mechanisms, then one would expect that a pulsing orange–blue-green stimulus will desensitize both the red–green and the yellow–blue mechanisms: detection of all colours will be equally impaired. In fact, the ability of observers to detect chromatic signals was impaired most along the orange–blue-green direction in colour space, and least along a perpendicular, yellow-green–violet direction. This result suggests that observers possess colour-opponent mechanisms with hue sensitivities that lie between those of the standard mechanisms (Krauskopf et al. 1986). These observations have since been extended in a variety of work with the chromatic habituation paradigm (see Webster Chapter 2, this volume).
Monkey visual cortex
Studies of colour-sensitive neurons in the macaque visual cortex paint a picture that is consistent with the activity of mechanisms with intermediate hue sensitivities. At the time when Krauskopf and colleagues performed their studies, little quantitative work had been done on chromatic sensitivity in the macaque cortex. More work had been done with neurons in the lower levels of the monkey visual system, including retina and lateral geniculate nucleus (LGN). Studies of the LGN indicate an organization that is partially consistent with, but not identical to, the standard colour-opponent scheme (DeValois et al. 1966; Derrington et al. 1984). The chromatic sensitivities of parvocellular colour-sensitive neurons in the LGN cluster around two axes in colour space that correspond only roughly to what one would expect of a red–green and a yellow–blue mechanism. One cannot identify these neurons with the standard colour-opponent channels for two reasons. First, the spectral sensitivity of the red–green mechanism suggested by studies of the LGN does not match the sensitivity deduced psychophysically. The putative red–green neurons tend to have no input from short-wavelength-sensitive cones, yet standard accounts of colour-opponency invoke a short-wavelength-sensitive cone contribution to redness, in order to help account for the redness seen within violet lights of short wavelength (Hassenstein 1968; Jameson and Hurvich 1968; D’Zmura 1990). Secondly, many colour-sensitive neurons in the LGN
are also sensitive to black–white modulation; the red–green and yellow–blue mechanisms of standard colour-opponent theory are insensitive to black–white modulation (Ingling and Martinez-Uriegas 1983). One may combine signals from neurons in the LGN to create colour sensitivities that are like those required by the standard theory (D’Zmura and Lennie 1986). As shown in Fig. 4.1, the addition of signals from colour-sensitive neurons in the LGN can provide neurons insensitive to achromatic signals but with chromatic sensitivities that match red–green and yellow–blue. Figure 4.1 also shows that adding such signals can give rise to neurons with intermediate spectral sensitivities: neurons that are tuned to orange, yellow-green, blue-green, and violet directions in colour space.
Figure 4.1 Formation of colour-sensitive neurons with intermediate spectral sensitivities. The bottom row shows schematically centre-surround configurations of neurons in the parvocellular layers of the lateral geniculate nucleus (LGN). They are labelled according to their long- (L), medium- (M) and short-wavelength (S) sensitive photoreceptoral inputs. Signals from LGN units are combined linearly to provide units in the cortex with spatially uniform colour sensitivities. The middle row shows these units and their preferred hue directions in the colour plane. Their signals may be added to create units with arbitrary, intermediate hue sensitivities. (After D’Zmura and Lennie 1986, figure 8.)
Work over the past decade has shown conclusively that the visual cortex of macaque monkeys is replete with colour-sensitive neurons with hue sensitivities that lie intermediate to red–green and yellow–blue. Lennie et al. (1990) showed that the spectral sensitivities of colour-sensitive neurons in cortical area V1 are scattered uniformly within the colour plane. This result was replicated by DeValois et al. (1997). Kiper et al. (1997) found a similar result in cortical area V2, while Gegenfurtner et al. (1997) found it in cortical area V3. The visual cortex has many colour-sensitive neurons which are most sensitive to hues that lie between those that correspond to the standard colour-opponent mechanisms.
Figure 4.2 Interpretation of the results from a search for an orange disc (open circles) among (A) red and green distractors (filled circles), (B) yellow and blue distractors, (C) yellow-green and violet distractors, and (D) red and yellow distractors. In cases of a parallel search (A, B, C), the lines marked ‘T’ represent thresholds on the responses of linear chromatic mechanisms tuned to ‘yellow’, ‘red’, and ‘orange’, respectively. In (D) are shown two possible non-linear mechanisms for detecting the orange target disc among red and yellow distractors. In this case the search cannot be conducted spatially in parallel. (After D’Zmura 1991, Figure 3.)
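The additive combination described for Fig. 4.1 can be sketched numerically. The sketch assumes, purely for illustration, that each unit is linear with a cosine-shaped sensitivity around a preferred azimuth in the colour plane, and that the ‘red’, ‘yellow’, ‘green’, and ‘blue’ units sit at 0°, 90°, 180°, and 270°. These azimuths, the equal weights, and all names are hypothetical.

```python
import math

# Assumed preferred azimuths (degrees) of the four basic units.
PREFERRED = {"red": 0.0, "yellow": 90.0, "green": 180.0, "blue": 270.0}


def unit_response(preferred_deg, stimulus_deg):
    """Response of a linear unit to a unit-contrast stimulus at stimulus_deg."""
    return math.cos(math.radians(stimulus_deg - preferred_deg))


def combined_response(weights, stimulus_deg):
    """Weighted sum of the basic units' responses to one stimulus."""
    return sum(weight * unit_response(PREFERRED[name], stimulus_deg)
               for name, weight in weights.items())


def preferred_direction(weights):
    """Azimuth (degrees, 0 to 360) at which the combined unit responds most."""
    x = sum(w * math.cos(math.radians(PREFERRED[n])) for n, w in weights.items())
    y = sum(w * math.sin(math.radians(PREFERRED[n])) for n, w in weights.items())
    return math.degrees(math.atan2(y, x)) % 360.0


orange_unit = {"red": 1.0, "yellow": 1.0}   # 'red' + 'yellow'
violet_unit = {"red": 1.0, "blue": 1.0}     # 'red' + 'blue'

orange_azimuth = preferred_direction(orange_unit)
violet_azimuth = preferred_direction(violet_unit)
```

Under these assumptions the sum of the ‘red’ and ‘yellow’ units is itself a linear unit tuned halfway between them, and unequal weights would place the preferred direction anywhere in between.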
Colour in visual search
Visual search experiments provided the first evidence that the mechanisms tuned to intermediate hues are active in everyday colour detection tasks (D’Zmura 1991). Observers sought an orange disc placed among distractor discs with different colours. The Treisman visual search paradigm (Treisman and Gelade 1980) was used to determine how effectively the distractors hindered the search for the orange target disc. Four experimental conditions that correspond to different colours for the distractor discs are shown diagrammatically in Fig. 4.2. The colour of the orange disc is shown in the colour plane by the unfilled circle in each of the panels. Distractor colours are shown by the filled circles. Figure 4.2a refers to an experimental condition in which the orange disc is presented among red and green distractors. Results showed that the search for an orange target among red and green distractors occurs spatially in parallel across the central visual field. The orange target ‘pops out’ of the field of red and green distractors. The yellow–blue
opponent mechanism is a plausible candidate for detection under these circumstances. The yellow–blue mechanism can pick up the yellow within the orange target disc, but is completely insensitive to the red and green distractor discs.
The experimental condition in which the orange disc is presented among yellow and blue distractors is shown diagrammatically in Fig. 4.2b. Again, the orange target pops out. The red–green opponent mechanism is a plausible candidate for detection in this condition: it will pick up the red within the orange target disc, but will be completely insensitive to the yellow and blue distractor discs.
What happens if the orange target is displayed among yellow-green and violet distractors, as shown in Fig. 4.2c? The yellow–blue mechanism cannot detect the orange target reliably, because the mechanism is also sensitive to the yellow-green distractors. Likewise, the red–green mechanism cannot detect the orange target reliably, because it is sensitive to the redness within the violet distractors. Nevertheless, the orange target pops out. The simplest explanation is that we possess a detection mechanism tuned to orange that is simultaneously insensitive to yellow-green and violet.
The spectral width of the orange-sensitive mechanism is tested in the condition shown in Fig. 4.2d. If the distracting colours are brought too close to the colour of the orange target, then what was formerly an easy, parallel search becomes a difficult search, most likely conducted in a serial fashion. Evidently, the orange mechanism used under the conditions of Fig. 4.2c does not have a narrow sensitivity, because crowding the orange target with red and yellow distractors makes the search difficult. The pattern of results suggests that the colour mechanisms used in parallel search have broad, linear spectral sensitivities. In the parallel search conditions, observers can set a threshold on a linear mechanism that distinguishes the target from the distractors.
These thresholds are indicated in Fig. 4.2A–C by the lines marked ‘T’ that separate the orange target from the various sets of distractors. This pattern of findings was shown to hold true for red, yellow, orange, and violet targets. It provides conclusive evidence for the activity of mechanisms with intermediate hue sensitivities in everyday detection tasks (D’Zmura 1991). Further work on colour in visual search by Bauer et al. (1996, 1998) and by Olds et al. (1999) has replicated and extended these results in several interesting ways. The work by Olds and colleagues has shown, in particular, that a parallel search for a target of intermediate hue does not depend on an observer’s prior knowledge of target and distractor colours. The mechanisms sensitive to intermediate hues are not created through top-down awareness. They are simply there, ready to go. D’Zmura et al. (1997) showed that these mechanisms can operate at a high level of visual processing, after figure/ground segmentation has taken place. Shaded, coloured discs were used in visual search experiments. The discs were shaded in such a way that some of them, positioned randomly, appeared within the perceived figure, whereas others appeared in the ground. An orange target disc was placed among yellow-green and violet distractors in the figure. Orange distractor discs were placed in the ground. Results showed that the search for the orange target among the orange, yellow-green and violet distractors was conducted spatially in parallel. The orange distractors in the ground provided no problem whatsoever, presumably because the colour-based search using the ‘orange’ mechanism occurred within a segmented representation of the display.
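The threshold account of parallel search given above can be made concrete with a small sketch. It is an illustration under assumptions, not the analysis of D’Zmura (1991): the hue angles assigned to the colour names are hypothetical, stimuli are points in a two-axis colour plane, and for condition (d) the target is placed at the midpoint of the red and yellow distractors, an assumption made here so that no linear threshold can isolate it. The function scans linear mechanisms in 1° steps and returns the direction and threshold that best separate the target from every distractor, if any exist.

```python
import math

def hue_point(hue_deg, contrast=1.0):
    """Point in the (assumed) colour plane at a hue angle and contrast."""
    a = math.radians(hue_deg)
    return (contrast * math.cos(a), contrast * math.sin(a))

def linear_response(direction_deg, stimulus):
    """Response of a linear mechanism tuned to direction_deg."""
    a = math.radians(direction_deg)
    return math.cos(a) * stimulus[0] + math.sin(a) * stimulus[1]

def find_threshold_mechanism(target, distractors, step_deg=1, tol=1e-9):
    """Return (direction, threshold) with the largest target-distractor margin.

    Returns None when no scanned linear mechanism responds more strongly to
    the target than to every distractor.
    """
    best = None
    for direction in range(0, 360, step_deg):
        t = linear_response(direction, target)
        d = max(linear_response(direction, s) for s in distractors)
        margin = t - d
        if margin > tol and (best is None or margin > best[2]):
            best = (direction, (t + d) / 2.0, margin)
    return None if best is None else (best[0], best[1])

# Hypothetical hue angles for the colour names used in the text.
RED, ORANGE, YELLOW, YELLOW_GREEN, GREEN, BLUE, VIOLET = 0, 45, 90, 135, 180, 270, 315

conditions = {
    "a": (hue_point(ORANGE), [hue_point(RED), hue_point(GREEN)]),
    "b": (hue_point(ORANGE), [hue_point(YELLOW), hue_point(BLUE)]),
    "c": (hue_point(ORANGE), [hue_point(YELLOW_GREEN), hue_point(VIOLET)]),
    # Assumption: the target lies midway between the two distractors.
    "d": ((0.5, 0.5), [hue_point(RED), hue_point(YELLOW)]),
}

for name, (target, distractors) in conditions.items():
    print(name, find_threshold_mechanism(target, distractors))
```

Under these assumptions the best mechanism points along the yellow direction in (a), the red direction in (b), and the orange direction in (c), in line with the caption to Fig. 4.2, while no linear mechanism is found in (d).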
Colour detection
The mechanisms sensitive to intermediate hues function not only at the suprathreshold stimulus levels used in the visual search experiments, but also at parathreshold levels common in standard detection experiments. This was shown in a recent noise-masking study that measured the spectral properties of chromatic detection mechanisms (D’Zmura and Knoblauch 1998). Earlier noise-masking studies used a simple kind of noise called ‘axial’ noise, which is a random modulation of colour along a particular axis in the colour plane. For instance, a light that flickers randomly among various lights along the red–green axis is red–green axial noise. One expects, correctly, that detecting a faint red signal pulse will be made more difficult if one adds a sufficient amount of red–green axial noise to the signal (D’Zmura 1990). Unfortunately, one cannot use axial noise to measure the spectral properties of detection mechanisms. Consider an experiment, similar to that shown in Fig. 4.2, in which a faint orange signal is masked by axial noise. One might think that one can measure the spectral sensitivity of the orange detection mechanism by measuring how well noise, along systematically varied axes, masks the orange signal. This strategy does not work. For example, by masking the orange signal with red–green noise, one forces the observer to use the ‘yellow’ mechanism to detect the orange stimulus. By masking the orange signal with yellow–blue noise, one forces the observer to use the ‘red’ mechanism. Clearly, the mechanism that one uses to detect a single chromatic signal depends on the chromatic properties of the noise, and measuring the spectral sensitivity of a single detection mechanism in this manner is impossible. D’Zmura and Knoblauch (1998) used a new kind of noise, ‘sectored’ noise, to overcome this weakness of earlier studies with axial noise. As shown in Fig.
4.3, noise samples are drawn from a sector in colour space that is centred on an axis a of the colour signal to be detected. One varies the sector width and measures how well the noise masks the signal. The technique borrows from earlier work in studies of ‘critical bands’ in audition (Fletcher 1940). By measuring the potency of noise masking as a function of sector width, one can distinguish three possibilities: (1) that the signal is detected by a linear detection mechanism that matches the axis of the signal; (2) that the signal is detected by a mechanism with a narrow spectral sensitivity that matches the axis of the signal; and (3) that the signal is detected by two standard colour-opponent mechanisms with sensitivities that do not depend on the choice of signal. In the first case, varying sector width has no effect on signal detection. In the second case, increasing sector width causes noise samples to become increasingly less effective in stimulating the detection mechanism, because the noise samples fall outside the narrow region of sensitivity. In the third case, increasing sector width can cause the noise to grow more effective, because noise samples fall increasingly in areas to which the standard mechanisms are most sensitive.
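[Figure 4.3: three panels of the DKL colour plane (axes LM and S), each showing the signal axis a, the perpendicular axis a⊥, and noise of amplitude n filling a sector of half-width θ1, θ2, or θ3.]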
Figure 4.3 Sectored noise used to characterize properties of detection mechanisms. To axial noise of amplitude n along axis a is added noise along the perpendicular axis a⊥, modulated by the original noise in such a way as to fill a sector in the DKL colour plane (Derrington et al. 1984). Shown in the three panels are noises of narrow (top), moderate (middle) and wide (bottom) sector half-widths θ1, θ2, and θ3. Noises that vary in sector width alone, as pictured below, have identical effects on the detection of a signal along axis a, if the detection mechanism is a broadband, linear one with a peak spectral sensitivity that matches axis a. (After D’Zmura and Knoblauch 1998, figure 4).
Experimental results show that noise masking is independent of sector width, for signals that appear yellow, orange, red, and violet. One infers that observers use linear detection mechanisms tuned to these hues. The ready generalization is that colour detection is served by mechanisms tuned to a variety of directions in colour space, and that these mechanisms have broad, linear spectral sensitivities (D’Zmura and Knoblauch 1998).
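The logic of the sectored-noise test can be illustrated with a short simulation. This is a sketch under assumptions rather than the procedure of D’Zmura and Knoblauch (1998): following the caption to Fig. 4.3, each noise sample is taken to be an axial component n along the signal axis plus a perpendicular component n·tan(φ), with φ drawn uniformly within the sector half-width; the Gaussian amplitude distribution, the 45° signal axis, and the function names are all hypothetical choices. The simulation compares the noise driving a linear mechanism matched to the signal axis with the noise driving a mechanism fixed on the S axis (90°).

```python
import math
import random
import statistics

def sectored_noise(axis_deg, half_width_deg, n_samples=20000, seed=1):
    """Noise samples filling a sector centred on axis_deg (assumed construction)."""
    rng = random.Random(seed)
    a = math.radians(axis_deg)
    along = (math.cos(a), math.sin(a))    # signal axis a
    perp = (-math.sin(a), math.cos(a))    # perpendicular axis
    theta = math.radians(half_width_deg)
    samples = []
    for _ in range(n_samples):
        n = rng.gauss(0.0, 1.0)           # axial noise amplitude
        phi = rng.uniform(-theta, theta)  # direction within the sector
        p = n * math.tan(phi)             # perpendicular part scales with n
        samples.append((n * along[0] + p * perp[0],
                        n * along[1] + p * perp[1]))
    return samples

def response_sd(samples, mechanism_deg):
    """Standard deviation of a linear mechanism's response to the noise."""
    m = math.radians(mechanism_deg)
    responses = [x * math.cos(m) + y * math.sin(m) for x, y in samples]
    return statistics.pstdev(responses)

SIGNAL_AXIS = 45  # hypothetical intermediate direction in the colour plane
for half_width in (5, 30, 60):
    noise = sectored_noise(SIGNAL_AXIS, half_width)
    print(half_width,
          round(response_sd(noise, SIGNAL_AXIS), 4),  # mechanism matched to a
          round(response_sd(noise, 90), 4))           # mechanism fixed on the S axis
```

In this sketch the matched linear mechanism sees the same noise at every sector width, because the perpendicular component projects to zero on axis a, whereas the fixed S-axis mechanism sees more noise as the sector widens. These correspond to the first and third of the possibilities distinguished in the text.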
Discussion
Colour appearance is available only through conscious representation. Yet most would agree that consciousness does not provide a complete picture of all neural processing. For instance, consider a machine vision system that uses chromatic information to perform a task such as colour quality control on a factory assembly line. One would be hard pressed to claim that such a system performs its task using colour representations in consciousness. There is no reason to assume that colour appearance, available only through consciousness, can be used to understand performance in chromatic processing tasks.
This important point has been made repeatedly by Hurvich, who has reminded many speakers in public meetings that chromatic processing must be distinguished from colour appearance. Yet Hurvich and Jameson have had difficulty in articulating this position, because the mechanisms that serve chromatic processing and those that serve colour appearance are isomorphic in their colour-opponent theory (Hurvich and Jameson 1957). The experimental work that has been reviewed in this chapter makes clear that detection mechanisms are not structured according to standard colour-opponent theory. Rather, the peak spectral sensitivities of detection mechanisms in habituation, visual search, and noise-masking tasks are scattered uniformly in the colour plane. We can ask, conversely: do the mechanisms that are directly responsible for colour appearance have this multiple-mechanism sort of organization rather than the standard red–green and yellow–blue organization? There is good reason to think not. The unique hues provide robust psychological evidence for the standard colour-opponent organization. Unique red, yellow, green, and blue have a unitary, fundamental quality, while orange, yellow-green, blue-green, and violet are composite in nature. Although the multiple-mechanism organization can be forced to provide unique hues, this can only be accomplished in a post hoc fashion. The evidence suggests, then, that we must distinguish carefully between colour and the processing of chromatic information. Different organizations of colour-sensitive mechanisms appear to underlie behaviour concerning colour appearance and behaviour in detection tasks.
Acknowledgements
This work was supported by National Eye Institute grant EY10014.
References
Bauer, B., Jolicoeur, P., and Cowan, W. B. (1996). Visual search for colour targets that are or are not linearly separable from distractors. Vision Research 36, 1439–1465.
Bauer, B., Jolicoeur, P., and Cowan, W. B. (1998). The linear separability effect in colour visual search: Ruling out the additive-colour hypothesis. Perception and Psychophysics 60, 1083–1093.
Derrington, A. M., Krauskopf, J., and Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. Journal of Physiology (London) 357, 241–265.
DeValois, R. L., Abramov, I., and Jacobs, G. H. (1966). Analysis of response patterns of LGN cells. Journal of the Optical Society of America 56, 966–977.
DeValois, R. L., Cottaris, N., and Elfar, S. (1997). S-cone inputs to striate cortex cells. Investigative Ophthalmology and Visual Science 38, 15.
D’Zmura, M. (1990). Surface color psychophysics. PhD Dissertation, University of Rochester, New York.
D’Zmura, M. (1991). Color in visual search. Vision Research 31, 951–966.
D’Zmura, M. and Knoblauch, K. (1998). Spectral bandwidths for the detection of color. Vision Research 38, 3117–3128.
D’Zmura, M. and Lennie, P. (1986). Mechanisms of color constancy. Journal of the Optical Society of America A 3, 1662–1672.
D’Zmura, M., Lennie, P., and Tiana, C. (1997). Color search and visual field segregation. Perception and Psychophysics 59, 381–388.
Fletcher, H. (1940). Auditory patterns. Reviews of Modern Physics 12, 47–65.
Gegenfurtner, K. R., Kiper, D. C., and Levitt, J. B. (1997). Functional properties of neurons in macaque area V3. Journal of Neurophysiology 77, 1906–1923.
Hassenstein, B. (1968). Modellrechnung zur Datenverarbeitung beim Farbensehen des Menschen. Kybernetik 4, 209–223.
Helmholtz, H. von (1909/1962). Treatise on physiological optics, (3rd edn), (ed. J. P. C. Southall). Dover, New York.
Helmholtz, H. von (1891). Versuch einer erweiterten Anwendung des Fechnerschen Gesetzes im Farbensystem. Zeitschrift für Psychologie und Physiologie der Sinnesorgane 2, 1–30.
Helmholtz, H. von (1892). Versuch, das psychophysische Gesetz auf die Farbenunterschiede trichromatischer Augen anzuwenden. Zeitschrift für Psychologie und Physiologie der Sinnesorgane 3, 1–20.
Hering, E. (1878/1964). Outlines of a theory of the light sense (translated by L. M. Hurvich and D. Jameson). Harvard University Press, Cambridge, MA.
Hurvich, L. M. and Jameson, D. (1955). Some quantitative aspects of an opponent-colors theory. II. Brightness, saturation, and hue in normal and dichromatic vision. Journal of the Optical Society of America 45, 602–616.
Hurvich, L. M. and Jameson, D. (1957). An opponent-process theory of color vision. Psychological Review 64, 384–404.
Ingling, C. R., Jr, and Martinez-Uriegas, E. (1983). The relationship between spectral sensitivity and spatial sensitivity for the primary, r,g x-channel. Vision Research 23, 1495–1500.
Jameson, D. and Hurvich, L. M. (1968). Opponent response functions related to measured cone photopigments. Journal of the Optical Society of America 58, 429–430.
Kiper, D. C., Fenstemaker, S. B., and Gegenfurtner, K. R. (1997). Chromatic properties of neurons in macaque area V2. Visual Neuroscience 14, 1061–1072.
Krauskopf, J., Williams, D. R., and Heeley, D. W. (1982). Cardinal directions of color space. Vision Research 22, 1123–1131.
Krauskopf, J., Williams, D. R., Mandler, M., and Brown, A. (1986). Higher-order color mechanisms. Vision Research 26, 23–32.
Kries, J. von (1904). Die Gesichtsempfindungen. In Handbuch der Physiologie des Menschen, Vol. 3, (ed. W. Nagel), pp. 109–279. Vieweg, Braunschweig.
Lennie, P., Krauskopf, J., and Sclar, G. (1990). Chromatic mechanisms in striate cortex of macaque. Journal of Neuroscience 10, 649–669.
Newton, I. (1704/1952). Optics. In Great books of the Western world, Vol. 34 (ed. R. M. Hutchins), pp. 377–544. William Benton, Chicago.
Olds, E. S., Cowan, W. B., and Jolicoeur, P. (1999). Stimulus-determined mechanisms for color search. Perception and Psychophysics 61, 1038–1045.
Pugh, E. N., Jr (1976). The nature of the π1 colour mechanism of W. S. Stiles. Journal of Physiology (London) 257, 713–747.
Stiles, W. S. (1946). A modified Helmholtz line-element in brightness-colour space. Proceedings of the Physical Society 58, 113–137.
Stiles, W. S. (1978). Mechanisms of colour vision. Academic Press, New York.
Treisman, A. M. and Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology 12, 97–136.
Vos, J. J. and Walraven, P. L. (1972a). An analytical description of the line element in the zone-fluctuation model of colour vision. I. Basic concepts. Vision Research 12, 1327–1344.
Vos, J. J. and Walraven, P. L. (1972b). An analytical description of the line element in the zone-fluctuation model of colour vision. II. The derivation of the line element. Vision Research 12, 1345–1365.
Young, T. (1802). On the theory of light and colours. Philosophical Transactions of the Royal Society London 92, 12–48.
Commentary on D’Zmura
The processing of chromatic information
Laurence T. Maloney
In his chapter, D’Zmura presents a lucid and intriguing review of the connections between theories of chromatic discrimination and theories of colour appearance. The best preparation before reading it would be to recall that psychophysics originated in Gustav Fechner’s attempt to explain the apparent intensity of sensory variables, such as lightness, in terms of the observer’s ability to discriminate intensities. This derivation of Fechner’s law from Weber’s has been challenged repeatedly, in particular by S. S. Stevens (Stevens 1957; Roberts 1979, pp. 149–184) and this controversy alone should alert us to the possibility that there is no simple relation between human chromatic discrimination performance and human judgements of colour appearance. D’Zmura’s point is important. To concentrate on colour appearance alone, as so many of the other chapters in this volume do, is analogous to treating statistics as a collection of methods for estimation, ignoring the statistical machinery that is appropriate and relevant for hypothesis testing. One can push this analogy a bit further. In the context of colour vision, ‘estimation’ amounts to assigning colours to the pieces that make up the world. It is evidently difficult to estimate colour so that assigned colours correspond to surface properties, the problem of colour constancy. But the goal of the estimation problem is clear. Hypothesis testing is manifold: it can be used to decide whether a specific piece of the world has changed recently, whether one piece is different from another, whether something seen a while ago has returned, or whether one piece really doesn’t fit in very well with the rest. There are psychophysical procedures used in colour vision corresponding to each sort of hypothesis test just outlined.
The tests involve comparison across space and time, detection of visual transients, and detection of anomalies in complex arrays of stimulus elements. It is evidently a challenge to develop a computational model of colour processing that combines estimation and all the different kinds of neural ‘machinery’ needed for testing the kinds of hypotheses just outlined. D’Zmura emphasizes that there is psychophysical and neurophysiological evidence implying that the representations underlying colour appearance and chromatic discrimination are remarkably different. He describes his work in collaboration with Knoblauch, involving an elegant noise-masking method (analogous to that used in critical-band experiments in audition), as well as his work in the visual search for coloured targets. It is impressive that the conclusions drawn from two psychophysical tasks that are, in appearance, unrelated are so similar, and that this common conclusion is in agreement with the neurophysiological evidence: the distribution of mechanisms encoding colour information is roughly uniform in colour space. There are no distinguished directions in colour space corresponding to those isolated by experiments concerning colour appearance. D’Zmura makes his point very well. It’s not appropriate, though, to talk about a representation without also specifying the neural machinery that links it to psychophysical judgements of various kinds. As D’Zmura demonstrates, simple weighted linear sums can be used to create virtual mechanisms that can have any direction in colour space. What sort of role do spontaneously created virtual mechanisms of this sort have in carrying out the psychophysical tasks he describes, that correspond to different sorts of hypothesis tests? Indeed, what exactly happens when we compare two locations separated in space or in time, or try to detect the ‘odd man out’ in a visual search experiment? 
It is possible that estimation and hypothesis testing tasks can share a common representation and yet differ in the neural operations that each task employs. For example, the visual system may represent colour information interchangeably in many directions in colour space, but a critical piece of neural machinery required to carry out hue cancellation may not operate equally well along arbitrary directions in colour space.
D’Zmura’s chapter is a succinct and well-written summary of his own work and the work of others concerning the neural representation of colour and what we can learn about it by psychophysical means.
References
Roberts, F. S. (1979). Measurement theory with applications to decision making, utility and the social sciences. Addison-Wesley, Reading, MA.
Stevens, S. S. (1957). On the psychophysical law. Psychological Review 64, 153–181.
chapter 5
THE PLEISTOCHROME: OPTIMAL OPPONENT CODES FOR NATURAL COLOURS
Donald I. A. MacLeod and Tassilo von der Twer
Preface
Colour opponency is one of the most obvious features of the visual system’s functional organization; here we consider the benefits that accrue from colour opponency, indeed from opponency in general. We claim that the advantages of opponency are traceable to the statistics of colour in natural scenes. The authors, a vision researcher and a mathematician, were brought together by the ZiF project in an atmosphere highly conducive to theoretical rumination, and their collaboration on these and related issues has been maintained through subsequent visits by von der Twer to San Diego. The problem to which, we suggest, colour opponency provides the answer is: what non-linear input–output function for a graded neural signal allows that signal to represent environmental stimuli best, under the constraint of a definitely restricted neural response range? Although we consider this problem in the particular context of colour vision for the sake of concreteness, the arguments are perfectly general. In colour, as in sensory processing in general, a quantitative input has to be represented quantitatively, and different system designs may differ in the precision with which they represent the input. The essence of the problem can be quickly appreciated with the help of a crude analogy (which ultimately proves to be too crude): a slicing of the input space. In the space of cone excitations that defines the initial colour stimulus, natural colours are clustered around white, so discrimination is ipso facto more important around white than in the outlying regions. The bins, or slices, into which the system can reliably divide its inputs correspond to successive levels of each output signal. The total number of these slices is fixed for a given output signal by the available range of output signal values.
However, by suitable choice of the mapping from input to output, these slices can be arranged to span part or all of the input range consecutively in any desired manner, with any desired variation in slice thickness across the input range. Laughlin (1981) investigated the problem of the best slicing pattern. His answer is that the thickness of the slices should be chosen by the infomax criterion, maximizing the mutual information between input and output. The infomax criterion leads to the arrangement of slice thickness known as histogram equalization, in which slice thickness (or histogram bin width) varies in inverse proportion to the input probability density (so that all bins are equally populated). This arrangement minimizes the probability that pairs of randomly chosen inputs will occupy a common slice and be confused. Although histogram equalization minimizes confusion rates, rather different considerations arise when the output signal is contaminated by random variation (noise). In the presence of noise, any stimulus is confused with a slightly different one each time it is presented, and what is important is to minimize the magnitude of these errors. Here the slicing metaphor becomes inexact, the infomax criterion becomes too simple, and histogram equalization is no longer optimal for rational cost functions that associate increasing costs with increasing errors. The subject of this chapter is the optimization of non-linearity for precise representation of inputs in the presence of noise. We assume initially that the noise originates at
the output, after the non-linear encoding that is to be optimized, but we discuss how this assumption can be relaxed. The derived optimum non-linear response function is not very different from Laughlin’s, but provides much better discrimination in the margins of colour space. An extension of the analysis provides a justification for colour-opponent coding, as well as for compressive non-linearity in the contrast–response functions of the opponent signals. We make comparisons with psychophysical and physiological evidence, and these suggest only a very approximate correspondence between theoretical expectation and observation as to the extent of the nonlinearity. The discrepancies bear on one of the most basic questions about colour vision: What is it for? Do brightness and colour provide a specification of viewed objects suitable for identifying and characterizing them, or is their main value to permit objects to be distinguished from their backgrounds? We suggest that the visual system’s M cells, which lack chromatic opponency, are designed to serve the latter role; this is reflected in their almost all-or-none response, a non-linearity more extreme than is required for the quantitative representation of natural inputs. The chromatic-opponent P cells, on the other hand, are linear enough to provide such a quantitative representation. D.I.A. MacLeod, T. von der Twer
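The histogram-equalization arrangement described in the Preface can be sketched in a few lines. This is an illustration under assumptions, not the authors' derivation or Laughlin's procedure: a unit-variance Gaussian sample stands in for a peaked distribution of natural inputs, the output range is divided into ten equal levels, and the function names are hypothetical. The equalizing response function is taken to be the empirical cumulative distribution of the inputs, so that slice thickness is small where inputs are dense.

```python
import bisect
import random

def equalizing_response(samples):
    """Response function g(x) = empirical cumulative distribution, in [0, 1]."""
    ordered = sorted(samples)
    n = len(ordered)
    def g(x):
        return bisect.bisect_right(ordered, x) / n
    return g

def linear_response(samples):
    """Linear mapping of the observed input range onto [0, 1], for comparison."""
    lo, hi = min(samples), max(samples)
    def g(x):
        return (x - lo) / (hi - lo)
    return g

def level_counts(outputs, n_levels):
    """Number of inputs falling into each of n_levels equal output slices."""
    counts = [0] * n_levels
    for y in outputs:
        counts[min(int(y * n_levels), n_levels - 1)] += 1
    return counts

rng = random.Random(0)
inputs = [rng.gauss(0.0, 1.0) for _ in range(10000)]  # assumed peaked input distribution
N_LEVELS = 10

g_eq = equalizing_response(inputs)
g_lin = linear_response(inputs)
equalized = level_counts([g_eq(x) for x in inputs], N_LEVELS)
linear = level_counts([g_lin(x) for x in inputs], N_LEVELS)

print("equalized:", equalized)
print("linear:   ", linear)
```

With the equalizing function every output level is occupied by nearly the same number of inputs, whereas the linear mapping crowds most inputs into the central levels. The sketch does not address the noisy case taken up in the chapter, for which the Preface states that histogram equalization is no longer optimal.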
Introduction
Rich though it is, the perceptual world of conscious experience is far more impoverished, in terms of sheer information content, than either the external reality of which it is a representation, or the proximal stimulus from which it is constructed. Many distinctions that are present in the stimulus fail to register in perception. The domain of colour vision provides two clear examples of this. First, the initial encoding of colour by the human retina is only three-dimensional, with the result that very different spectral energy distributions may be absolutely equivalent visually. Secondly, even stimulus differences of a kind that do affect the visual system must escape notice when sufficiently small in magnitude. In this chapter we address this second limitation on colour vision, by analysing the discrimination of colour and brightness within a framework that is both mechanistic and ecological. Discrimination is a primitive perceptual accomplishment that lends itself to a mechanistic analysis informed by neurophysiology. Two stimuli that are physically different will be perceptually discriminable only if their neural representations are reliably different at some level. The earliest such representation in vision is the set of excitations of the photoreceptor cells in the retina. But a mechanistic analysis cannot be based on that alone: as we will see, post-receptoral re-coding alters the neural representation of colour radically, even within the retina, and information may be lost in this re-coding. No matter which brain loci form the immediate substrates of visual experience, any distinctions that are lost in the retinal output can hardly be restored in the brain or in conscious experience. Which stimulus distinctions are lost in this way, and which are retained, will depend on the nature of the neural code at the retinal output, as we will see below. This is a purely mechanistic, or neurophysiological issue.
But the visual system’s neural code for colour can be regarded as a choice made either during evolution, or during individual development under the guidance of genetically allowed plasticity. In either case, its selection poses a problem of design. Here we encounter the
ecological aspect of the problem. The code should be one that reliably represents important stimulus differences, obliterating only the less important ones. Thus there is no need for precision in the neural representation of stimuli that never occur in the natural environment. Our argument in this chapter is that the post-receptoral neural code for colour is nicely adapted to the representation of naturally common stimuli. One of the simplest statistical characterizations of a sensory environment is the probability distribution of input values. In this chapter, we consider the implications of that distribution for perceptual coding. The distribution is commonly a peaked one: in the case of colour, for example, natural colours are usually nearly neutral (delivering roughly comparable intensities of stimulation to each of the three types of cone photoreceptor) rather than vividly saturated (delivering very different intensities of stimulation). In the case of visual motion, the angular velocities of viewed objects relative to the observer are typically small or zero, and for any given direction of motion, the larger velocities, positive or negative, are progressively less frequent. Turning from the stimulus to its neural representation, we commonly find that the code adopted is an opponent one. The clearest instance of this is in the representation of colour, where individual neurons at stages following the photoreceptor stage are excited by certain parts of the spectrum and inhibited by others (DeValois and DeValois 1975; Derrington et al. 1984). This happens because these neurons are excited by one or two types of spectrally selective photoreceptor and inhibited by others. A somewhat paradoxical consequence of this encoding scheme is that the post-receptoral neurons are poorly responsive to the physiologically and phenomenally neutral stimuli that are most abundant in the environment. 
Likewise, in the case of motion detection, directionally selective neurons respond poorly to static or nearly static stimuli, with inhibitory or zero response for motion in directions opposed to the preferred direction. Thus the cases of motion and colour both exemplify what we will call a ‘split-range’ code, where an input continuum such as colour, or (signed) input velocity, is divided at a physiological null point (near-white, or zero velocity), and where separate neurons respond to inputs on opposite sides of that null point. In this chapter we ask: Why is this non-linear encoding scheme a good one? We first demonstrate theoretically a quantitative connection between the statistics of environmental inputs and the split-range code evolved by the visual system. Namely, by adopting a split-range code, and representing opposite segments of the input range by neurons that each have rectifying and compressive response non-linearity, the visual system maximizes the average precision in its representation of natural inputs in the presence of neural noise introduced at the output. We then compare the optimal form of the split-range non-linearity with experimental estimates from psychophysics and from neurophysiology, finding very rough agreement. Our discussion is focused on colour vision, although the theoretical arguments are quite general. We therefore begin with a rough characterization of the distribution of natural colours.
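A split-range code of the kind just described can be written down in a few lines. The sketch below is illustrative only: the hyperbolic tangent is an assumed stand-in for a rectifying, compressive response function and is not the optimal form discussed in this chapter, and the function names and the `scale` parameter are hypothetical. A signed input, measured from the null point, is carried by two units, one responding to positive excursions and the other to negative ones.

```python
import math

def split_range_encode(x, scale=1.0):
    """Encode a signed input as (on, off) responses, each in [0, 1).

    Each unit is rectifying (silent on the wrong side of the null point)
    and compressive (tanh is an assumed, illustrative non-linearity).
    """
    on = math.tanh(max(0.0, x) / scale)
    off = math.tanh(max(0.0, -x) / scale)
    return on, off

def split_range_decode(on, off, scale=1.0):
    """Recover the input from noise-free responses (valid for responses below 1)."""
    return scale * (math.atanh(on) - math.atanh(off))

def local_gain(x, scale=1.0):
    """Slope of the active unit's response at input x."""
    return (1.0 - math.tanh(abs(x) / scale) ** 2) / scale

for x in (-2.0, -0.2, 0.0, 0.2, 2.0):
    on, off = split_range_encode(x)
    print(x, round(on, 3), round(off, 3), round(local_gain(x), 3))
```

In this sketch the gain is highest near the null point, where inputs are assumed to be most frequent, and falls away for extreme inputs. With noise added at the output, a given noise amplitude therefore corresponds to a smaller input error near the null point than in the margins of the range.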
The distribution of surface colours
Because the cone spectral sensitivities are broad, with substantial overlap, the ratio of the sensitivity of the long-wavelength sensitive cone photoreceptors (L cones) to midspectral (M) cone sensitivity varies only by 20:1 across the spectrum (Stockman et al. 1993), and
L and M cone excitations are strongly correlated. For equal-energy spectral lights, L and M cones alike are most strongly excited in the yellow-green part of the spectrum; the L and M cone excitations vary with a correlation of 0.84 in the range from 400 to 700 nm. As Fukurotani (1982) and Buchsbaum and Gottschalk (1983) have noted, this rather high correlation for spectral colours means that the L and M cones measure almost the same thing. The difference between their excitations, on which perception of colour depends, is very small. Moreover, this problem is enormously exacerbated by the characteristically broad spectral energy distributions of natural stimuli. For natural stimuli, the ratio of L to M cone sensitivity seldom approaches the limiting values that can be attained by spectral lights, but both L and M cone excitations vary together with varying surface lightness. The correlation between the cone excitations is correspondingly higher for natural colours than for spectral lights. This is clearly apparent in each of two sets of measurements on natural colours (Brown 1994; Ruderman et al. 1998), on which we have mainly relied in our analysis. Figure 5.1 shows results for a set of 574 natural spectral reflectance functions measured in San Diego by Brown (1994). The sample included flowers, fruits, leaves, barks, and soils, with a few samples of water and sky. For these 574 samples, the correlation between L and M cone excitation is 0.985 (Fig. 5.1). Since it seems impossible to define a representative sample, Brown made no attempt to select in any systematic way but measured various things that happened to catch his eye; vivid colours are accordingly over-represented. Ruderman et al. (1998) obtained spectral reflectance estimates, pixel by 3′-arc pixel, for 12 entire views of natural environments, thereby avoiding arbitrary selection within each scene.
For this data set comprising nearly 200 000 pixels, the correlation between L and M cone excitation is even higher than in Fig. 5.1 (0.9983). The dispersion of the distribution is important for our analysis. As Fig. 5.1 implies, the variation is far less in the red–green direction than in the luminance direction, even when the effects of variations in the daylight illuminants [which Brown (1994) has noted
[Figure 5.1, panels (a) and (b): scatter plots of M cone excitation (vertical axis, 0–20) against L cone excitation (horizontal axis, 0–20).]
Figure 5.1 A scatter plot for natural surface colour stimuli measured by R. O. Brown in the (L,M) plane, divided by a grid into (a) 10 distinguishable levels of L and of M cone excitation; (b) 10 levels of L + M and 10 levels of L − M. (In this plot the values of M have been scaled up by 2.5 relative to a luminance basis, hence true constant luminance contours deviate from the negative diagonal shown.)
are particularly large in the luminance direction] are normalized out. To normalize for illumination variation, each surface was characterized by its spectral reflectance relative to a full reflectance white standard measured under the same illumination (Brown) or in the same scene (Ruderman et al.). We derived cone excitations from these reflectances by integrating their cross-products with the cone sensitivities of Stockman et al. (1993), either without or with a weighting factor for the spectral energy distribution of typical (D65) daylight. The resulting numbers are the cone excitations produced by each surface viewed under, respectively, a standard equal energy white illuminant or a D65 illuminant. Surface luminance is given by the summed excitations of L and M cones, which we denote here simply by L and M (Eisner and MacLeod 1980; Lennie et al. 1993). The standard deviation of log10(L + M) in Brown’s data set is 0.46, which corresponds to a factor of three in luminance; for Ruderman et al.’s data the value is 0.24, a little less than a factor of two. The purely chromatic variations are conveniently indexed by two axes: r = L/(L + M) forms the (roughly speaking) ‘red–green’ axis of a photoreceptor-based chromaticity diagram (Luther 1927; MacLeod and Boynton 1979) and is proportional to L cone excitation per unit luminance. In Fig. 5.1 it is nearly proportional to the angle, from vertical, of the line connecting a colour point to the origin. As the high correlation between L and M implies, the standard deviation of r is far smaller than that for luminance (only 7.5%, or 0.03 in the decimal logarithm in Fig. 5.1, and only 1% for the entire-scene data of Ruderman et al.). For the remaining chromatic axis, we adopt the luminance-normalized S cone excitation, b = S/(L + M), which is low for yellows and very high for violets.
The standard deviation of b is more than 10 times that for r, and about as high as the one for luminance: 0.39 in log10(b) or a factor of about 2.5 for Brown’s data, or 0.11 in log10(b) for the data of Ruderman et al. This greater variability for b than for r arises partly because natural surface reflectances have more variation at short wavelengths than at long, but partly it arises because the S cone spectral sensitivity curve is spectrally remote: it is displaced some six times further from the L and M sensitivity curves than those are from each other, with the result that the ratio of S sensitivity to L or M sensitivity varies more than a million-fold across the spectrum (Boynton 1980; Stockman et al. 1993; McMahon and MacLeod 1998).
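The coordinates just defined, and the conversion from a standard deviation in log10 units to a multiplicative factor, can be sketched in a few lines. This is an illustration only: the function names `chromaticity` and `log10_sd_to_factor` are hypothetical, and the standard deviations are simply the values quoted above.

```python
def chromaticity(L, M, S):
    """Return (luminance, r, b) for cone excitations L, M and S.

    luminance = L + M, r = L / (L + M), b = S / (L + M), as defined in the text.
    """
    luminance = L + M
    return luminance, L / luminance, S / luminance


def log10_sd_to_factor(sd_log10):
    """Convert a standard deviation in log10 units to a multiplicative factor."""
    return 10.0 ** sd_log10


# Standard deviations of log10(L + M) quoted in the text.
brown_factor = log10_sd_to_factor(0.46)      # close to "a factor of three"
ruderman_factor = log10_sd_to_factor(0.24)   # "a little less than a factor of two"

# Hypothetical cone excitations, chosen only to exercise the function.
luminance, r, b = chromaticity(2.0, 1.0, 0.6)
```

The same conversion applied to the value 0.39 for log10(b) gives the factor of about 2.5 mentioned for Brown's data.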
Colour discrimination as a slicing of colour space

We assume initially that the goal of the encoding of colour and lightness is to characterize or specify surfaces in terms of colour and lightness. While this may appear tautologous, there are other possibilities (Boynton 1980; Morgan et al. 1992); for instance, the goal of detecting all object boundaries has rather different requirements, which we consider briefly below (p. 178). Roughly speaking, characterization of any given surface is made most precise by making the number of distinguishable colour–lightness combinations as large as possible. The number of distinctions made by a single neuron is limited by restrictions on its firing rate (from zero to some maximum), and by the random fluctuations in the firing rate. As an aid to intuition at the outset, the following crude idealization may be useful. Imagine a discrete set of N progressively increasing firing rates, that span the range from zero to the maximum firing rate, with each one just reliably different from the last, as defining the number N of distinct signals that a neuron can generate. Somewhat more precisely, the
neuron might be regarded as classifying any stimulus into one of N classes, on the basis of the firing rate that the stimulus evokes. A neuron fed by L cones alone will distinguish colour stimuli on the basis of L cone excitation alone; in colour space, planes of constant L cone excitation will bound the N colour classes that it can distinguish. On this view, the neuron performs a slicing of colour space, separating it into N distinguishable colours; in Fig. 5.1 an L cone-driven neuron would make vertical slices, while a purely M cone-driven neuron would make horizontal ones. The combination of a purely L-driven neuron with a purely M-driven one will slice both horizontally and vertically, creating a grid of N² distinguishable colours that appear as squares in the (L,M) plane of Fig. 5.1 [and also in the three-dimensional (L,M,S) colour space, the S coordinate being irrelevant]. Yet as Fig. 5.1 shows, only a small number of these potentially distinguishable squares are actually occupied by natural colours. In the case illustrated, N = 10 and only 23 of a possible 100 squares are occupied. Although the retinal ganglion cells of the optic nerve, as well as visual neurons in the brain, are fed by multiple cone types, the planar slicing analogy is straightforwardly applicable to them also, since the signals of these neurons depend (albeit through a non-linear response function) on a weighted but (approximately, and with the proviso that the state of adaptation does not vary) linear combination of cone excitations (Derrington et al. 1984). The weighted linear combination of cone inputs has some parallels in psychophysically investigated opponent codes (Larimer et al. 1974). The slices associated with any such neuron remain planar and parallel, but are made at some characteristic angle to the axes of cone excitation space.
Importantly, the planar slicings also have, in general, an unequal spacing, expressing a combined influence of response non-linearity and noise factors (see p. 161).
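The slicing idealization can be made concrete with a small sketch. Everything here is an assumption made for illustration: the saturating response `x / (x + half_saturation)` is a hypothetical stand-in for the neural non-linearity, and the function names are invented. The point is only that a weighted sum of cone excitations followed by a compressive response and a division of the output range into N equal steps yields planar slices of unequal thickness at the input.

```python
def net_input(cone_excitations, weights):
    """Weighted linear combination of the cone excitations (L, M, S)."""
    return sum(w * c for w, c in zip(weights, cone_excitations))


def response(x, half_saturation=1.0):
    """Hypothetical rectifying, compressive response with values in [0, 1)."""
    x = max(x, 0.0)                      # rectification: no output below zero
    return x / (x + half_saturation)


def signal_class(cone_excitations, weights, n_levels=10):
    """Index (0 .. n_levels - 1) of the output slice into which a stimulus falls."""
    y = response(net_input(cone_excitations, weights))
    return min(int(y * n_levels), n_levels - 1)


def slice_boundaries(n_levels=10, half_saturation=1.0):
    """Net-input values at which the response crosses k / n_levels."""
    return [half_saturation * (k / n_levels) / (1.0 - k / n_levels)
            for k in range(1, n_levels)]


# A purely L-driven neuron ignores M and S: both stimuli land in the same class.
l_cone_neuron = (1.0, 0.0, 0.0)
class_a = signal_class((0.5, 0.1, 0.0), l_cone_neuron)
class_b = signal_class((0.5, 3.0, 2.0), l_cone_neuron)

# Equal steps at the output correspond to progressively thicker input slices.
boundaries = slice_boundaries()
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
```

Changing `weights` rotates the slicing planes in cone excitation space; changing the response function changes only the spacing of the slices.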
Best directions for slicing colour space

By choosing the weights of its cone inputs appropriately, a neuron can be designed to slice colour space in any desired direction. In Fig. 5.1, an attractive choice would be to confer on one type of neuron a luminance (= L + M) sensitivity, allowing such a neuron to make, say, 10 slices spanning the diagonal that forms the major axis of the distribution of natural colours. Then if a second neuron generates a signal that depends on an opponent combination of L and M excitations, its 10 slices could be spaced much more finely to span the more limited range of natural colours in the red–green direction, as shown in Fig. 5.1b; many more of the boxes representing potentially distinguishable colours would then be occupied. The domain of natural colours can, in this way, be divided into a much greater number of distinguishable colours than was possible using purely L- or M-cone driven neurons. Note, however, that this benefit of slicing along diagonals occurs only if the number of slices made by each neuron is fixed, in keeping with the assumed restriction on response range. If there were no such restriction, the grid of Fig. 5.1b could be rotated arbitrarily without reducing appreciably the number of distinct colours. For fine slicings, that number is simply equal to the volume of the colour ‘sausage’ being sliced, divided by volume of the pieces (distinguishable colour domains) into which it is sliced. It is therefore invariant with
the orientation of the slicing grid. The essential role of neural non-linearities, such as range restriction, in determining the optimal slicing pattern is not recognized or allowed for in treatments of colour coding that propose that the optimal directions for slicing are those of the principal components of the distribution of photoreceptor excitations (Fukurotani 1982; Buchsbaum and Gottschalk 1983; Atick et al. 1992). Principal components analysis (PCA), appropriate as it is for a linear system, provides no mathematical or biological rationale for preferring one slicing grid orientation over another (even for a non-linear system, whose non-linearity is not allowed for in the PCA).¹ The benefits of encoding the diagonal variables in Fig. 5.1, or uncorrelated variables in general, are contingent on the limited range of neural response, a non-linearity which limits the number (and thickness) of the slices that are created by each type of neuron. Moreover, while principal components analysis can specify a set of directions for slicing the input space, it cannot indicate a preferred origin of the coordinate system, and hence can give no rationale for opponent codes. We will present a rationale for opponent neural codes (and for split-range codes in particular) which takes as its starting point the idea that the thickness, and not the direction, of the slices should satisfy a principled requirement: specifically, the arrangement of slice thicknesses should provide the most precise representation of natural colours. This criterion allows determination of an optimum non-linear code, subject to the constraint of a limited output range. The optimal code is a split-range one, with rectifying opponent cells. Although the rationale for their derivation is novel, it turns out (p. 168) that the axis directions for the optimum non-linear code are simply the principal component directions, provided that the distribution of colours satisfies an independence condition.
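The dependence of the benefit on a fixed number of slices per neuron can be illustrated with synthetic data. The sketch below does not use the measurements of Brown or of Ruderman et al.; the point cloud is an assumed stand-in with a wide luminance range and a narrow spread of L/(L + M), and `occupied_cells` is a hypothetical helper. Each axis is cut into the same number of equal slices spanning the data, once for the cone axes (L, M) and once for the diagonal axes (L + M, L − M).

```python
import random


def occupied_cells(points, n_slices=10):
    """Count the cells of an n_slices x n_slices grid, spanning the data range
    on each axis, that contain at least one point."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)

    def index(value, lo, hi):
        # Clamp so that the maximum value falls in the last slice.
        return min(int((value - lo) / (hi - lo) * n_slices), n_slices - 1)

    return len({(index(x, x_lo, x_hi), index(y, y_lo, y_hi)) for x, y in points})


rng = random.Random(0)                      # fixed seed for repeatability
cone_points = []
for _ in range(2000):
    lum = rng.uniform(2.0, 20.0)            # synthetic luminance, L + M
    r = min(max(rng.gauss(0.5, 0.02), 0.4), 0.6)   # synthetic L / (L + M)
    cone_points.append((r * lum, (1.0 - r) * lum))

# Same stimuli, re-expressed on luminance and opponent axes.
opponent_points = [(L + M, L - M) for L, M in cone_points]

cells_cone = occupied_cells(cone_points)
cells_opponent = occupied_cells(opponent_points)
```

With ten slices per axis in both cases, the diagonal axes leave far fewer grid cells empty, because the opponent axis spends its ten slices on the narrow range actually occupied.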
Best choice of slice thickness: the pleistochrome

We have assumed that the relevant neural signals depend upon a weighted combination of cone excitations, a simplifying assumption supported by experiment (Larimer et al. 1974; Derrington et al. 1984). In terms of our crude slicing analogy, each neuron slices cone excitation space into slabs bounded by parallel planes. If the neural signal could be linear in its dependence on its net input, the thickness of these slices might be uniform. But owing to limitations on firing rate at both ends of the scale, the dependence of output on input is necessarily quite non-linear, and the slice thickness will be correspondingly non-uniform. The pattern of slice thickness in the input space will depend not only on the form of the non-linear neural input–output function, but also on the degree of random variability in the output, in a manner that we will consider shortly. A second assumption adopted for simplicity at the outset (we dispense with it in the following section) is that variability originating at the retinal output predominates over sources of error at earlier stages. The rationale for this is that the optic nerve constitutes an informational bottleneck for vision, where the number of nerve fibres and of nerve

¹ The principal component axes are optimal when there is a requirement for dimensionality reduction in a linear system, but no functionally important dimensionality reduction is typically involved in the post-receptoral coding of colour.
[Figure 5.2: neural response (vertical axis, 0–1) plotted against stimulus value (horizontal axis, 0–8).]

Figure 5.2 Compressively non-linear function for neural firing rate versus input stimulus value. Random fluctuations in firing rate occur with constant standard deviation, σ, spanning bands of equal height. The associated root-mean-square (RMS) errors in the estimation of the input value are shown by the widths of the horizontal bands, and vary in inverse proportion to the derivative of the compressive response function.
impulses is relatively limited: much less, for instance, than the number of absorbed photons at daylight light levels (Barlow 1965). Hence relatively large errors are introduced by random fluctuations in the optic nerve fibre impulse counts (Bialek and Rieke 1992; Lee et al. 1993). Any random error at the retinal output implies a corresponding error in the perceptual estimate of the input value. This error in the estimated input depends both on the output error and on the gradient of the input–output response function, as illustrated in Fig. 5.2. There the output noise, with root-mean-square (RMS) value σ, may be regarded as defining the (vertical) slice thickness at the output. In Fig. 5.2 the noise, and the vertical thickness of the illustrated horizontal slices, is constant for variation in the mean input or output. The associated variation in the represented input value (RMS equivalent input noise) is proportional directly to σ, and inversely to f′(x), the derivative of the function relating input to output. This defines reliably distinguishable slices at the input; in Fig. 5.2 these are non-uniform in width owing to the compressively non-linear² input–output function. Note that because of the reciprocal relation between the response function gradient and the associated error in the estimated stimulus value, discrimination around any input value can be made as good as desired simply by making the response function gradient at the relevant point as steep as necessary. But the constraint imposed by the limited total available range in firing rate means that (for monotonic response functions) an increase in

² For now, we make the traditional simplifying assumption that the relevant visual non-linearity can be treated as a static one. In reality, the non-linearity of post-receptoral visual neurons is preceded by a sensitivity-regulating mechanism. How recognition of this affects the analysis is considered on p. 172.
gradient at one point has to be accompanied by a decrease at other points within the input range, and hence by reduced discrimination at those points. Thus a problem confronts the visual system. By suitable choice of a non-linear response function, relative discriminative precision can be distributed in any desired way over the range of input values. So which choice is best? With a mathematical definition of ‘best’, this problem has, as we will show, a definite solution. We may ask, for example, what input–output function is best in the sense of giving the smallest error (for instance, in the least-squares sense) in the estimated input, averaged over all naturally occurring cases. The optimal code in this sense must depend on the distribution of environmental inputs. Clearly, it would be inefficient to make the code linear (with constant gradient and constant discrimination) over an input range greater than what is encountered naturally (dotted curve, Fig. 5.3). By doing this the system would sacrifice useful discrimination among naturally occurring stimuli in order to preserve discrimination in ranges where it is never needed. Other things being equal, it is advantageous to allocate discriminative power preferentially to the part of the input range where discrimination is most often needed (that is, the part where natural colours most frequently occur) by giving the response function a steep gradient at that point in the stimulus range. However, the opposite extreme from the linear code is also inefficient: a response function that steps abruptly from minimum to maximum firing rate at the peak of the distribution of natural colours (dashed curve, Fig. 5.3). This choice would lead to categorical perception, in which blues and yellows, for instance, might be unmistakably distinguished at a precisely defined bounding chromaticity, at the price of providing no
[Figure 5.3: stimulus frequency (crosses) or response, in arbitrary units (vertical axis, 0–1), plotted against log10(b) (horizontal axis, −0.8 to 0.4).]

Figure 5.3 Crosses: frequency distribution of log10(b) for Ruderman et al.’s set of natural colours; b specifies S cone excitation per unit luminance, i.e. b = S/(L + M). Whites and greens are near the middle of the distribution, with equal-energy white at log10(b) = 0. To the right lie bluish colours; to the left, generally yellowish or reddish ones. Candidate input–output functions: pleistochrome (circles), compared with linear (dotted) and stepwise (dashed) alternatives.
discrimination within each class. The best choice will be an intermediate one: a gently curving sigmoid with steepest gradient at the peak of the distribution, but with a non-zero gradient also in the tails of the distribution. This retains some discrimination in the tails while slicing colour space most finely at the peak.

The optimal response function can be found once the cost of a perceptual error has been defined. The rationale for finding it is as follows. The benefit of increasing the gradient very slightly from its optimum value at any chosen point must be exactly cancelled by the cost of the necessary equal reduction in the gradient at any other point. That is, the derivative of cost with respect to gradient must be the same at all points in the optimal condition. One definition of cost is implicit in the adoption of a mean squared error criterion: to minimize the sum of the squares of all errors made in the perception of the stimulus set is to minimize a cost proportional to the square of each error. We begin by considering the simplest case, where the cost of a given error is the same at all points in the stimulus range and proportional to the squared error, and show that in that case, the optimal condition is achieved by a response function with a gradient matched to the cube root of the probability density function of the input distribution.

Let the output signal of interest (one element of the vector making up the post-receptoral colour code) be $y^*$, with a mean $y = g(x)$ and an associated random variation (standard deviation) $\sigma_{y^*}(x)$, for a net input (a weighted sum of cone excitations), $x$. Denote the environmental probability distribution of $x$ (for all stimuli encountered, or of interest) by $p(x)$. We want to minimize the mean squared random error (MSE) in $x^*$, the perceptual estimate of the input value, $x$, based on the output, that originates from random variation in $y^*$.
For given input $x$, this mean squared error in $x^*$ is

$$\sigma_{x^*}^2(x) = \left[\frac{\sigma_{y^*}(x)}{g'(x)}\right]^2 \tag{5.1}$$

where $g'(x)$ is the response gradient, or the derivative of the response function $y = g(x)$ at $x$; here we make the simplifying assumption that $g(x)$ can be treated as linear within the range of the random variation. The average of $\sigma_{x^*}^2(x)$ for all inputs is its probability-weighted integral over $x$, which converges if $p(x)$ decreases more rapidly than $x^{-3}$ for large $x$:

$$\mathrm{MSE} = \int p(x)\,\sigma_{x^*}^2(x)\,dx$$

Consider the effect of small variations in the response gradient or incremental gain $g'(x)$ around its optimal value. As explained above, in the optimal condition $p(x)\,d[\sigma_{x^*}^2(x)]/d[g'(x)]$ must be independent of $x$. Thus

$$\frac{d[\sigma_{x^*}^2(x)]}{d[g'(x)]} = \frac{k}{p(x)}$$

where $k$ is a constant of proportionality. Since from Equation 5.1

$$\frac{d[\sigma_{x^*}^2(x)]}{d[g'(x)]} = -\frac{2[\sigma_{y^*}(x)]^2}{[g'(x)]^3} \tag{5.2}$$

the optimal condition occurs when (absorbing the numerical constants into a rescaled $k$)

$$g'(x) = \frac{1}{k}\left\{[\sigma_{y^*}(x)]^2\,p(x)\right\}^{1/3} \tag{5.3}$$

The scaling factor $k$ serves only to define units of measurement for $g(x)$ and can be set to 1 if those units are not defined independently. If $\sigma_{y^*}(x)$ is independent of $x$, and more generally as an approximation if the factor $p(x)$ predominates, the above reduces to $g'(x) = p(x)^{1/3}$, or

$$g(x) = \int_{-\infty}^{x} p(u)^{1/3}\,du + c \tag{5.4}$$
We refer to this optimal response function as the pleistochrome, from the Greek pleistos meaning ‘most’, since it may be described roughly as the function that makes available the greatest number of distinguishable colours. More exactly, it is the function that maximizes the average precision with which input colours are represented. For single-peaked stimulus distributions, the pleistochrome is a sigmoidal curve centred near the peak of the distribution of x (Fig. 5.4). It is roughly similar to the cumulative distribution of x, but wider than that function by a factor of about the square root of three. A similarly motivated proposal of Laughlin (1981, 1983) aims to maximize the information about the input, given the output, through histogram equalization. The infomax criterion does not derive from a noise-based theoretical framework, however, and lacks a rationale in terms of minimization of random error or the associated cost. It leads to a steeper non-linearity (by about a factor of the square root of three) than the least-error one embodied in the pleistochrome (Fig. 5.4).
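For a Gaussian input distribution the cube-root rule can be checked numerically. The following is a minimal sketch under stated assumptions: the input density is taken to be a unit-variance Gaussian (an idealization, not the measured colour distribution), the output range is scaled to run from 0 to 1, and the function names are hypothetical. The cube root of a Gaussian density is itself Gaussian in shape with √3 times the standard deviation, so the numerically integrated pleistochrome should coincide with a cumulative Gaussian of that width, and histogram equalization (the cumulative distribution itself) should be steeper at the peak by the same factor.

```python
import math


def gaussian_pdf(x, sd=1.0):
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def gaussian_cdf(x, sd=1.0):
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def pleistochrome(pdf, xs):
    """Cumulative trapezoidal integral of pdf(x)**(1/3) on the grid xs,
    scaled so that the response runs from 0 to 1."""
    gains = [pdf(x) ** (1.0 / 3.0) for x in xs]
    g = [0.0]
    for i in range(1, len(xs)):
        g.append(g[-1] + 0.5 * (gains[i - 1] + gains[i]) * (xs[i] - xs[i - 1]))
    return [value / g[-1] for value in g]


xs = [(i - 1500) * 0.01 for i in range(3001)]        # grid from -15 to 15
g = pleistochrome(gaussian_pdf, xs)

# Largest gap between the numerical pleistochrome and a cumulative Gaussian
# whose standard deviation is sqrt(3) times that of the input distribution.
max_gap = max(abs(gv - gaussian_cdf(x, sd=math.sqrt(3.0))) for gv, x in zip(g, xs))

# Gradient at the peak: histogram equalization versus the pleistochrome.
pleistochrome_slope = (g[1501] - g[1499]) / (xs[1501] - xs[1499])
slope_ratio = gaussian_pdf(0.0) / pleistochrome_slope
```

Replacing `gaussian_pdf` by any other single-peaked density gives the corresponding pleistochrome, although the simple √3 relation to the cumulative distribution then holds only approximately.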
The pleistochrome under less restrictive assumptions: non-uniform noise and other complications

More general cases than the simple one that led to Equation 5.4 turn out to be mathematically tractable. Here we list some ways to elaborate or extend the initial scenario. More formal and rigorous treatments of these cases, and of the pleistochrome in general, can be found in von der Twer and MacLeod (2001). Readers with limited enthusiasm for quantitative theory may prefer to skip the next section or two.

Output noise can be non-uniform

Equation 5.3 determines the pleistochrome in the general case where the random variation introduced into the output has a standard deviation that varies with the mean output. Equation 5.4, on the other hand, treats the output noise as independent of the mean output. But a very simple intuitive connection exists between that case and the general one. When output noise is output-dependent (but monotonic with mean output) there exists some non-linear monotonic transform of the output that has a standard deviation independent of its mean: the transforming function need only have a derivative inversely
[Figure 5.4: response (vertical axis, 0–1) plotted against stimulus value, log10(b) (horizontal axis, −1 to 0.5).]

Figure 5.4 Pleistochromes based on minimization of mean squared error (circles), or of mean absolute error (crosses), compared with the function that maximizes mutual information through histogram equalization (plain curve).
proportional to the output noise standard deviation at every point. The optimal condition occurs when this function of the output satisfies Equation 5.4.³ For instance, in the case of the Poisson process that provides the simplest idealization of a spiking neuron, and is often approximately descriptive of real ones (Tolhurst et al. 1983; Levine et al. 1988, 1992; but see also Croner et al. 1993), the standard deviation of the firing rate goes up as the square root of the mean firing rate, and so the square root of the firing rate has a constant standard deviation. For a neuron with a mean output firing rate $n(x)$ for input $x$, a lowest firing rate $n_{\min}$ at $x = -\infty$, a maximum rate $n_{\max}$ and a standard deviation $a\,n^{1/2}$, the standard deviation of $n(x)^{1/2}$ will be $a/2$ for all $x$, and the optimal dependence of $n(x)$ on $x$ is such that $n^{1/2}$ satisfies Equation 5.4, i.e.

$$n(x)^{1/2} = n_{\min}^{1/2} + \left(n_{\max}^{1/2} - n_{\min}^{1/2}\right)\int_{-\infty}^{x} p(u)^{1/3}\,du \tag{5.5}$$

so that $n(x)$ is given by the square of this expression (with $p(u)^{1/3}$ scaled so that its integral over all $u$ equals 1, which lets $n(x)$ reach $n_{\max}$). A second example: with noise fluctuations whose range spans a constant fraction of the mean rate, the log of the rate has a constant standard deviation. Then it is the log of the output that should satisfy Equation 5.4 in order to minimize the average error in the estimate of the input, so the optimal non-linear response function is an exponential function of $g(x)$ in Equation 5.4.

³ As this may suggest, optimizing the non-linearity makes the variation of equivalent input noise with input $x$ completely independent of $\sigma_{y^*}(x)$: the effects of the latter are cancelled (except for an overall sensitivity factor) if the non-linearity $y = g(x)$ is in each case the appropriate one.
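The Poisson case can be sketched numerically as follows. The sketch assumes a unit-variance Gaussian input distribution and hypothetical firing-rate limits of 4 and 100 (arbitrary units); the function names are invented. The cube-root integral is scaled to run from 0 to 1, the square root of the rate is made linear in it, and the rate is obtained by squaring.

```python
import math


def cube_root_cumulative(pdf, xs):
    """Cumulative trapezoidal integral of pdf(x)**(1/3), scaled to run from 0 to 1."""
    gains = [pdf(x) ** (1.0 / 3.0) for x in xs]
    c = [0.0]
    for i in range(1, len(xs)):
        c.append(c[-1] + 0.5 * (gains[i - 1] + gains[i]) * (xs[i] - xs[i - 1]))
    return [value / c[-1] for value in c]


def optimal_poisson_rate(pdf, xs, n_min, n_max):
    """Firing rate whose square root is linear in the scaled cube-root integral."""
    root_min, root_max = math.sqrt(n_min), math.sqrt(n_max)
    return [(root_min + (root_max - root_min) * c) ** 2
            for c in cube_root_cumulative(pdf, xs)]


def unit_gaussian(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


xs = [(i - 1000) * 0.01 for i in range(2001)]        # grid from -10 to 10
rates = optimal_poisson_rate(unit_gaussian, xs, n_min=4.0, n_max=100.0)
rate_at_peak = rates[1000]                           # input value x = 0
```

At the peak of the input distribution the square root of the rate lies midway between its limits, so the rate there is ((2 + 10)/2)² = 36, well below the arithmetic midpoint of the rate range.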
Multiple non-linearities are treatable stage by stage

In a system where there are multiple stages that impose non-linear transformations on their inputs, Equation 5.3 provides a prescription for the optimization of each such transform separately, proceeding downstream from the initial stimulus.⁴

The input can itself be contaminated by error

Sources of error may exist prior to the non-linear stage of interest, and this input noise may be stimulus-dependent. Conveniently enough, this doesn’t affect the pleistochrome at all: Equation 5.3 still guarantees least mean squared error. This is because the input noise (if uncorrelated with the output noise) simply adds an optimization-irrelevant constant to the mean squared error that must be minimized. When input noise is stimulus-dependent, however, it may be appropriate to weight the errors of estimation for different stimuli differently, as discussed next.

Input variability can be non-uniform (stimulus-dependent)

To the extent that Weber’s law applies, the cost of an error in estimating the stimulus value may be related less to its absolute magnitude than to its magnitude as a proportion of the value being estimated. In just the same way as was described for non-uniform output noise, one may then consider a transformation of the input x for which estimation errors of equal absolute magnitude are equally undesirable (e.g. a logarithmic one, if Weber’s law describes the error cost), and then apply Equation 5.3 to derive the optimum dependence of the output firing rate on this function f(x), and thence on x. Figure 5.3 illustrates this. The probability distribution given there is for the logarithm of the b chromaticity coordinate. As a result, the pleistochrome shown by the continuous curve in Fig. 5.3 is the response function that minimizes error in the estimate of log(b) rather than in the absolute value of b.
The rationale for preferring this to the response function that minimizes error in the recovery of linear b values is that equal differences in log(b) are about equally noticeable (Le Grand 1949; Boynton and Kambe 1980).

The cost of a given error of estimation may vary depending on the estimated stimulus value

Some discriminations are biologically important, others less so. If important discriminations tend to be concentrated at certain points in the input domain, some allowance should be made for this, and the establishment of a mathematical criterion for optimization might then seem hopeless. However, non-uniformity in the cost of error over the stimulus domain can in fact be handled formally in exactly the same way as non-uniform noise contamination of the input. When the cost of errors of estimation is different for different stimulus values, this can be dealt with by applying Equation 5.4 not to the initial input value x but to a transformation of it, f(x), chosen such that estimation errors of equal absolute magnitude in f(x) are equally undesirable.

⁴ Optimizing response non-linearity for each individual non-linear stage by this procedure is not, however, equivalent to optimization of the whole system. For that, iteration would be necessary.
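Applying the cube-root rule to a transformed input can be sketched with an empirical histogram. The sample below is synthetic: values of b are drawn so that log10(b) is Gaussian with a standard deviation of 0.11 (the figure quoted earlier for the data of Ruderman et al.) and an assumed mean of zero; the bin count and the function name `empirical_pleistochrome` are likewise illustrative choices. The response is built from the cube root of the bin counts of f(b) = log10(b), so that errors are equalized in log(b) rather than in b.

```python
import math
import random


def empirical_pleistochrome(values, n_bins=40):
    """Return (bin_edges, response_at_edges) with gain proportional to the
    cube root of the histogram density; the response runs from 0 to 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    gains = [count ** (1.0 / 3.0) for count in counts]
    total = sum(gains)
    edges = [lo + i * width for i in range(n_bins + 1)]
    response = [0.0]
    for gain in gains:
        response.append(response[-1] + gain / total)
    return edges, response


rng = random.Random(1)                                # fixed seed for repeatability
b_samples = [10.0 ** rng.gauss(0.0, 0.11) for _ in range(5000)]

# Transform first, then apply the cube-root rule in the transformed variable.
log_b = [math.log10(b) for b in b_samples]
edges, response = empirical_pleistochrome(log_b)

central_step = response[21] - response[20]            # near the peak of the histogram
tail_step = response[1] - response[0]                 # in the lower tail
```

The response for a given b is then read off at log10(b); a different cost structure would be expressed by substituting another transformation for the logarithm.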
The relative cost of small and large errors can be chosen freely

As noted, the use of least-squared error in the estimated input as a criterion for optimization is equivalent to assuming that errors entail a cost simply proportional to their squares. If we choose to minimize the absolute error rather than its square, small errors are not as well tolerated, and the incentive to make very thin slices in high-density regions is increased. By developments similar to those that led to Equation 5.3, this leads to an optimal incremental gain $g'(x) = [\sigma_{y^*}(x)\,p(x)]^{1/2}$, and thus to an input–output function spanning an input range a little less wide than in the least-squares case (Fig. 5.4).⁵
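The effect of the cost function on the width of the optimal sigmoid can be explored with a brute-force search. The sketch below makes several simplifying assumptions that are not part of the text: the input is a unit-variance Gaussian, the output noise is constant, and the candidate response functions are restricted to cumulative Gaussians with an output range of 0 to 1, so that only their width is varied. All names are hypothetical. The average cost is the probability-weighted integral of the equivalent input error raised to the chosen power.

```python
import math


def gaussian_pdf(x, sd=1.0):
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def average_cost(width, error_power, xs):
    """Average of (equivalent input error)**error_power, with unit output noise,
    for a cumulative-Gaussian response of the given width."""
    dx = xs[1] - xs[0]
    total = 0.0
    for x in xs:
        gain = gaussian_pdf(x, sd=width)       # gradient of the candidate sigmoid
        total += gaussian_pdf(x) * (1.0 / gain) ** error_power
    return total * dx


def best_width(error_power, xs, candidates):
    """Candidate width with the lowest average cost."""
    return min(candidates, key=lambda w: average_cost(w, error_power, xs))


xs = [(i - 400) * 0.05 for i in range(801)]            # grid from -20 to 20
candidates = [1.10 + 0.01 * k for k in range(150)]     # widths 1.10 .. 2.59

width_squared_error = best_width(2, xs, candidates)    # expected near sqrt(3)
width_absolute_error = best_width(1, xs, candidates)   # expected near sqrt(2)
```

Under these assumptions the best width is close to √3 times the input standard deviation for squared error and close to √2 times for absolute error, in keeping with gains proportional to the cube root and the square root of the density respectively.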
Multidimensional stimulus domains: slicing colour space

In extending these ideas to a multidimensional stimulus domain such as colour, we encounter interesting problems that we discuss more fully elsewhere (von der Twer and MacLeod, in preparation). A particularly simple extension is possible if the stimulus probability density function satisfies independence for some input quantities $u$ and $v$ (which need not be linearly related to the initial stimulus coordinates, for instance the cone excitations), that is if $p(u, v) = p_1(u)\,p_2(v)$. Then the pleistochromes for the marginal distributions $p_1(u)$ and $p_2(v)$ specify the optimal spacing of linear cuts parallel to the $u$ and $v$ axes (constant-response contours of signals encoding $u$ and $v$) for minimizing perceptual error in $(u, v)$. Moreover these linear cuts (or for higher dimensionality, planar cuts) in $(u, v)$ are more efficient in that sense than any curved cuts, since the $u$-pleistochromes are the same for all $v$ and vice versa, and since, as we show elsewhere, the choice of other axes than those satisfying independence introduces added reconstruction error. The optimal neural responses $f_u(u)$ and $f_v(v)$ are obtainable in terms of the original stimulus coordinates $x$ and $y$ by inverting the transformation that generated $u$ and $v$ from those stimulus values.⁶

⁵ An interesting situation arises if noise sources inherent in the input predominate over errors originating at the output. In the visual system, this situation arises at low light levels (Barlow et al. 1971; Baylor 1987; Donner 1992). In this case, a relatively small output-derived MSE is added to the relatively large MSE inherent in the input itself to give the total MSE in the perceptually estimated input. Provided that the output-derived increments in total MSE are sufficiently small, any measure of average error or of the associated cost (mean square, mean absolute error, or anything else) will increase approximately linearly both with the total MSE and with the mean square error contributed from sources of variation at the output. Thus, in this case, the cube root construction remains optimal even if the cost function for errors is not quadratic.

⁶ Two caveats apply. First, the error is minimized in the coordinate system $(u, v)$ where independence holds, not in the original stimulus coordinate frame. The optimization is valid, therefore, only where independence holds in some coordinate frame in which the cost associated with errors is uniform. This happy situation may exist for colour space, with the adoption of appropriate simple linear combinations of the logs of the cone excitations (quantities rather close to log(r), log(b) and log(luminance)) as coordinates (Ruderman et al. 1998). Secondly, there is some freedom with respect to the orientation of the coordinate frame in $(u, v)$. A suitable choice of $u$ and $v$ will make the marginal distributions Gaussian and of equal variance. If independence then holds in $(u, v)$, the distribution $p(u, v)$ will be bivariate Gaussian and be radially symmetric. Then, if equal costs are associated with equal errors in different directions in $(u, v)$, all orthogonal coordinate frames in $(u, v)$ are equally efficient. In Fig. 5.1, for example, the strategy of encoding the principal axes of the distribution (roughly, luminance and colour as in Fig. 5.1(b)) is only one possible choice; the other choices slice the plane of Fig. 5.1 into narrow diamonds rather than narrow rectangles.
More complicated (but still continuous) distributions that violate independence overall may still approximate it locally over sufficiently small neighbourhoods, and this will call for local variations in the orientation of the slicing grid.⁷ We have found that good contours for slicing a complicated distribution can be determined without iteration in the following way. We fix all but one of the stimulus variables, and determine the pleistochrome for variation in the remaining variable alone. When this is done in turn for a range of values of the other variables, we obtain constant-response surfaces for the neural signal of interest. If independence is not satisfied, these are curved, and crowd together in those regions where stimuli occur more frequently than expected from independence. We have found this approach quite effective in reducing the reconstruction error for challenging stimulus distributions, including the real colour distribution illustrated in Fig. 5.7. The technique shares with independent components analysis (ICA) (Bell and Sejnowski 1997) the merit that non-orthogonal rotations of the initial axes are permitted, but is more versatile than ICA in allowing non-planar cuts. The spacing rule of Equation 5.3 also differs from the one customarily adopted in ICA, which, like the proposal of Laughlin (1983), aims to maximize the information about the input, given the output, through histogram equalization. As Fig. 5.7 shows, the spacing of the constant-response contours generated by this error-minimization procedure is only slightly non-uniform within the (very limited) chromaticity range of natural colours. The implied relatively gentle non-linearity allows much better discrimination among colours near the margins of the distribution than would result from histogram equalization. Optimal treatment of complex distributions would require an iterative procedure. Kohonen’s procedure (1989) for creating self-organizing neural maps might be appropriate; intriguingly, this procedure has been shown to generate a set of spacings consistent with the cube-root pleistochrome (Equation 5.3) in the one-dimensional case (Ritter et al. 1989), even though it makes no attempt to evaluate or to minimize error in the representation of the input.
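The non-iterative slicing procedure described above (fix all but one variable, then find the pleistochrome along the remaining one) can be sketched numerically. The sketch below is only an illustration under stated assumptions: it takes the cube-root rule referred to in the text (response gradient proportional to the cube root of the probability density) as given, and the function names, the grid and the Gaussian test density are all hypothetical rather than taken from the authors' own computations.

```python
import numpy as np

def pleistochrome(density, x):
    """Response function g(x), scaled to [0, 1], for a 1-D density sampled at x.

    Assumes the cube-root rule: g'(x) proportional to density(x) ** (1/3).
    """
    gradient = np.cbrt(np.clip(density, 0.0, None))
    # Cumulative trapezoidal integral of the gradient.
    steps = 0.5 * (gradient[1:] + gradient[:-1]) * np.diff(x)
    g = np.concatenate(([0.0], np.cumsum(steps)))
    return g / g[-1]

def conditional_pleistochromes(joint, u):
    """Apply the 1-D rule along u for each fixed v (one row of `joint` per v)."""
    return np.vstack([pleistochrome(row, u) for row in joint])

def cut_positions(g, x, n_cuts):
    """Stimulus values at which the response crosses equally spaced levels."""
    levels = np.arange(1, n_cuts + 1) / (n_cuts + 1)
    return np.interp(levels, g, x)

# Hypothetical test density: independent Gaussians in u and v.
u = np.linspace(-3.0, 3.0, 121)
v = np.linspace(-3.0, 3.0, 61)
joint = np.exp(-0.5 * v[:, None] ** 2) * np.exp(-0.5 * u[None, :] ** 2)

responses = conditional_pleistochromes(joint, u)
cuts = cut_positions(responses[0], u, n_cuts=9)
```

For this independent test density every row of `responses` is the same, so the cuts are straight lines parallel to the v axis; a density that violates independence would give rows that differ, that is, curved constant-response contours.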
⁷ Even when independence does not hold, any single-peaked input probability density distribution can be mapped into a radially symmetrical one (though not, in this case, a Gaussian one) by some continuous one-to-one deformation of the input space, and here, just as in the case of independence, free choice of orientation of the coordinate system becomes possible. Failure of independence is, nevertheless, helpful for defining the best direction of the coordinate system for encoding the stimulus set, because the mapping of a distribution that violates independence into a radially symmetric one is more constrained than the mapping to (u, v) in the case of an independent distribution: independence is preserved under separate arbitrary transformations of u and v. Thus it is easier to find a coordinate frame satisfying independence than it is to find one yielding radial symmetry in a case where independence is not possible; and, by the same token, a successful choice is then less likely to be unique.

Benefits of split-range coding: something for nothing

Our discussion of optimum response non-linearity has not yet yielded the promised rationale for opponent codes. On the contrary, the sigmoidal nature of the pleistochrome, with its steepest gradient at or near white, is quite incompatible with a null response to white in a system restricted to non-negative signals. But when more than one neuron is available
to represent a single stimulus dimension, new opportunities for coding are introduced. The non-opponent sigmoidal pleistochrome that optimizes encoding by a single neuron has an opponent counterpart when encoding is done by a pair of neurons. Two rectifying neurons, one red-excitatory and the other green-excitatory, and each with a purely compressive non-linearity, can represent opposite halves of the red–green stimulus continuum with increases in firing rate, and with null responses to greenish or to reddish stimuli, respectively. As Fig. 5.5 shows, such a representation can be almost equivalent to that produced by a single neuron with sigmoidal non-linearity (Marr 1974). The responses of the two neurons (upper open circles and squares, Fig. 5.5) correspond to the two halves of the single-neuron pleistochrome sigmoid (plain curve, Fig. 5.5), but with the left half flipped up so that the response gradients for the neuron responding in the left half of the stimulus range are simply reversed. This, however, uses only the upper half of the output range of each neuron. If the full output range is used, the gradients of the response functions are doubled everywhere (filled circles and squares, Fig. 5.5). By using two neurons in this way, the visual system can double the precision in its representation of the input in the presence of output noise, since the gradients of the individual response functions have been doubled and the equivalent input noise thereby halved. If, alternatively, the two neurons had each been endowed with the same sigmoidal non-linearity that is optimal for single neurons, then averaging of their signals (on the generous assumption of independent noise) would have led to an improvement of only a factor of the square root of two in average error. Thus the net benefit of adopting the ‘split-range’ code (as opposed to the alternative of similar neurons operating in parallel with optimal non-linearity) is a square root of two reduction of average error.

Figure 5.5 From single-neuron pleistochrome (continuous sigmoidal curve, from Fig. 5.3) to two-neuron split-range opponent code. Axes: response (0.0 to 1.0) against stimulus value, Log10(b) (−0.6 to 0.4). Circles and squares show, for two rectifying neurons, how using the full output range for each neuron to cover only half the full input range (‘split-range’ code) allows doubled response function gradients. Upper open circles and squares use only the upper half of the output range, with no gain in efficiency over the original sigmoidal neural response function. But using the full response range (filled circles and squares) allows a two-fold vertical expansion, hence doubled differential sensitivity. Dashed curves (with open circles and squares) show how the full-range response functions indicated by the filled symbols are modified when Poisson noise and spontaneous activity are assumed.

The spontaneous activity shown by real neurons without stimulation, or on presentation of the null stimulus, entails some reduction in efficiency within the present framework, but if the spontaneous activity is a small fraction of the maximum firing rate, most of the advantage of split-range coding is preserved. Parenthetically, we note that reserving high firing rates for unusual stimuli may have other important advantages as well: it usefully facilitates selective response to unusual inputs (Barlow 1972; Field 1987), and by reducing average firing rate and neurotransmitter release it lowers the metabolic cost of perception.

With Poisson noise, low spontaneous activity n_min ≪ n(x), and a smoothly peaked g′(x), the split-range code with Equation 5.5 leads to a threshold-like, approximately quadratic increase in firing rate with stimulus value (here |x − x0|) as x moves away from the null stimulus value x0 (dashed curves, Fig. 5.5). In conjunction with the output-dependent noise, the effect of this is to make discrimination relatively keen but constant in that neighbourhood. Here we have an intriguing possible ecological justification for threshold non-linearity: its role could be not to make the effective precision of the neural representation non-uniform over the relevant part of the input range, but to make it uniform.

The split-range code may be viewed as a step from a purely analogue representation (the sigmoidal single-neuron pleistochrome) to a hybrid, analogue–digital one.
Although the benefits of taking that step may seem intuitively surprising (because, in adopting the split-range code, the visual system appears to get something for nothing), the process could be taken further, with still greater ensuing benefits. In a more fully digital encoding of a stimulus dimension, a set of N neurons, each with m reliably distinct outputs, can represent m^N different stimulus levels by allocating successive digits of the digital representation to the different neurons. This compares with just mN distinguishable levels (in the simplest case) for a split-range code where the input range is divided into N segments, each spanned by the graded firing range of one neuron, or with mN^(1/2) for a parallel averaging of signals from neurons with identical response functions (Fig. 5.6a). But the fully digital representation is dangerous and difficult to implement. Because individual neural outputs depend discontinuously (in a sawtooth manner) on the input value, it creates a risk of large errors.⁸ A split-range code with N > 2 (Fig. 5.6b) is not subject to any such problems: it allows a simple centre-of-gravity or weighted sum of neural firing rates to represent the stimulus value. This may be the only biologically plausible alternative to the simpler dual-opponent-neuron encoding scheme (N = 2) that is shown in Fig. 5.5 and that is usually taken to be representative of physiological findings in colour vision (DeValois and DeValois 1975; Derrington et al. 1984). But there seems as yet to be no clear evidence for the staggered arrangement of null planes and response functions among different neurons that this scheme would require.

⁸ Encoding schemes known as Gray codes (Savage 1997) avoid this discontinuity by replacing the sawtooth function of Fig. 5.6a with a symmetrical triangle function, but since these schemes still require the individual digit values to vary in a non-monotonic manner with the stimulus value, they still pose formidable difficulties for a biological system in both encoding and readout.
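The three level counts compared above can be tabulated directly. The following is a minimal sketch; the function name and the example values N = 2 and m = 10 are illustrative assumptions, not figures from the text.

```python
import math

def distinguishable_levels(n_neurons, m_outputs):
    """Level counts for N neurons with m reliably distinct outputs each."""
    return {
        "digital": m_outputs ** n_neurons,                      # m^N
        "split_range": m_outputs * n_neurons,                   # mN
        "parallel_average": m_outputs * math.sqrt(n_neurons),   # mN^(1/2)
    }

# Hypothetical example: a pair of neurons with ten distinct outputs each.
levels = distinguishable_levels(n_neurons=2, m_outputs=10)
gain_over_parallel = levels["split_range"] / levels["parallel_average"]
```

With N = 2 the ratio of split-range to parallel-averaging levels is the square root of two, which matches the net benefit stated for the two-neuron split-range code.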
[Figure 5.6: both panels plot response against stimulus value (0 to 40); the response axis runs from 0 to 6 in panel (a) and from 0 to 1 in panel (b).]
Figure 5.6 Other candidate multi-neuron encoding schemes. (a) Digital encoding, with one neuron (filled circles) for the more significant digit of the stimulus value, and a second neuron (open circles) for a second digit. (b) Split-range encoding with N > 2; multiple neurons (pluses, circles, crosses) have monotonic, but staggered, response functions.
Comparison of colour appearance and discrimination with predictions based on pleistochrome

Appearance

If the split-range colour opponent code is designed for optimal characterization of natural colours, and if the null stimulus in the chromaticity diagram is the subjectively achromatic white, typical natural colours should be nearly white. This is of course roughly correct, but, as shown in Fig. 5.7, the prediction is not fulfilled exactly. Typical natural colours, at least in the chosen environments, where vegetation tends to be predominant, are greenish and yellowish. The constant luminance chromaticity diagram of Fig. 5.7 has axes r = L/(L + M) and b = S/(L + M), which are closely related to the inputs to different classes of colour opponent neurons in the lateral geniculate nucleus under adaptation to equal-energy white (Derrington et al. 1984). The contour map is for Ruderman et al.’s distribution of colours of natural surfaces, under D65 illumination, which simulates a slightly overcast daylight. The equal-energy white stimulus is at r = 0.70, b = 1.0, and the heavy straight line, r = 0.723 − 0.0325b, shows the approximate locus of colours that are subjectively neither reddish nor greenish when presented in the dark (Larimer et al. 1974). Clearly, the centroid of the distribution of natural colours is displaced toward yellow (low b) and toward green (low r) from the subjective neutral point.⁹

⁹ In Brown’s sample of haphazardly selected colours, the mean is again yellowish, but is reddish rather than greenish: leaves form a dense concentration in the green, but their influence is outweighed in the case of the r axis by the inclusion of many reddish fruits and flowers, leading to mean (r, b) coordinates under D65 of (0.715, 0.636) as compared with (0.691, 0.750) for the complete images of Ruderman et al.
Figure 5.7 Constant luminance chromaticity diagram, with axes r = L/(L + M) (abscissa, 0.67 to 0.72) and b = S/(L + M) (ordinate, 0.4 to 1.2), with contours of Ruderman et al.’s distribution of colours of natural surfaces under D65 illumination. The equal-energy white plots at r = 0.70, b = 1.0. The straight line shows the locus of colours that are subjectively neither reddish nor greenish. The grid shows equally spaced constant-response contours for a two-neuron non-linear code optimized for the natural colour distribution.
There are a number of more or less plausible post hoc rationalizations for the somewhat unexpected placement of the white point. First, although most natural surfaces are yellowish, the sky is bluish. If equilibrium hue loci are adaptively fixed by the average input, the blue of the sky might act as a massive low-r and high-b counterweight to shift the environmental mean substantially from the mean of surface colours. Secondly, the location of the white point could be a compromise between optimizing discrimination for the most frequent surface colours (which, if the images of Ruderman et al. are typical, would put it in the part of colour space we actually identify as yellow-green) and preserving some discrimination for saturated blue and red surfaces. Even if this choice is not optimal by the unweighted least-squared-error criterion, it could be appropriate if saturated colours (other than the greens) tend to have greater than average biological importance. Osorio and Bossomaier (1992) suggest that discrimination among the greens of vegetation is not particularly important, whereas discrimination of reddish fruits from vegetation is. A null point, with optimal discrimination, near white might usefully promote those discriminations at the expense of the less important ones. As noted above, the orthogonal vectors in terms of which the natural colours in Fig. 5.7 can be represented with least average error by neurons with limited range, are those that best approximate the independence relation p(x, y) = p(x)p(y). The principal component vectors are therefore good candidates, but owing to the negative correlation (−0.18) between r and b in Fig. 5.7, the principal component coordinate frame in Fig. 5.7 is tilted anticlockwise from the (r, b) frame. The red–green equilibrium axis of the Hering opponent scheme,
which connects colours that are neither reddish nor greenish, is likewise tilted anticlockwise (straight line, Fig. 5.7). This raises the possibility that the subjective redness of violets (located above the red–green equilibrium locus near the middle of Fig. 5.7), and the associated near-circularity of the spectrum in phenomenal colour space, is needed for optimal discrimination among natural colours.¹⁰ The linear decorrelation principle does not, however, predict the red–green equilibrium locus exactly, as the experimental red–green equilibrium axis is tilted nearly five times more than the principal component direction for the natural stimulus distribution in Fig. 5.7, and about twice as much as the principal component direction for Brown’s similar data. As Fig. 5.7 shows, it is also somewhat more tilted than the system of curved constant-response contours generated by the non-linear algorithm of p. 168, but here the correspondence is closer.

¹⁰ These phenomena of colour appearance result from an alliance of the short-wavelength cones with the long-wavelength cones in the psychophysically defined red–green opponent system. No such alliance is found in the cells of the lateral geniculate nucleus (LGN) (Derrington et al. 1984). The LGN therefore embodies non-optimal (because correlated) signals. In the LGN representation, it is blue colours that most strongly polarize the M–L (‘red–green’) colour opponent signal in the M (‘green’) direction and are the most likely to overload it; in certain multiple-stage models of the colour system (Müller 1924; DeValois and DeValois 1993), this tendency is counteracted in the third and final stage by a short-wave cone input, synergistic with the long-wave cones and antagonistic to the midspectral cones.

Discrimination

We have suggested that the opponent split-range code has evolved to minimize errors in colour perception, through the adoption of an optimally designed non-linearity in neural response. That proposal leads to an ecologically based, but quite definite, quantitative prediction for the discriminability of stimuli that differ in colour or intensity. If neural non-linearity is designed to minimize average error in discrimination, discrimination thresholds should be inversely proportional to g′(x) in Equation 5.4, and hence to the cube root of the natural probability density function p(x). This prediction links discrimination thresholds to the form of the distribution of natural colours that the system has evolved to deal with, without reference to known or estimated physiological characteristics of the visual system.¹¹

¹¹ Conveniently enough, as noted on p. 165, non-uniformity in output noise (embodied in the unknown factor [σy*(x)]² in Equation 5.3) does not affect this prediction at all, provided that the non-linearity g(x) that couples the stimulus x to the response y is indeed optimized for the prevailing dependence of σy*(x) on y and x.

Data for evaluating this prediction are available. The mean stimulus difference needed to make a test stimulus just noticeably different from a standard depends upon the colour or intensity of the standard (Krauskopf and Gegenfurtner 1992; Miyahara et al. 1993). In typical experiments the test and standard stimuli are intensive or chromatic modulations that appear in separate regions within a steady, generally white adapting field. Discrimination is most acute if the standard and test are both very close in colour and intensity to the adapting white; and quite small differences of the standard from white seriously impair the precision of the comparison. Current experiments by Leonova (in preparation) quantify this for different directions in colour space, using as a metric for stimulus differences the cone contrast between the standard stimulus and the adapting white. For achromatic contrasts and achromatic intensity differences between test and standard, comparison error is doubled for a standard contrast of about 20%; for isoluminant yellow–blue differences the cone contrast (for S cones in this case) at which error is doubled is again about 20%. But in the case of red–green isoluminant stimuli a standard L cone contrast of only 2% is enough to double the mean comparison error (Fig. 5.8a). If the visual system adopts the encoding principle of the pleistochrome, we would therefore expect to find the probability density function p(r) for natural colours dropping to one-eighth of its peak value at r values that give a contrast of 2% with white (a doubled threshold means a halved g′(x), and the density varies as the cube of g′(x)). To a rough approximation, this prediction is borne out: the distribution of Ruderman et al. is indeed extremely narrow in the red–green direction, although it is about 25% wider than would be required for a best fit to the data of Fig. 5.8. Further, since discriminations in luminance, or in the blueness-related chromaticity coordinate b, are both maintained for a range of standard colours an order of magnitude greater than are discriminations in r, the distribution of environmental colours for these coordinates should be an order of magnitude or so broader than for r. And indeed it is wider, by an order of magnitude or more (see p. 158). These comparisons suggest that the operating ranges of the various relevant neurons are fairly well matched to the very diverse distributions of environmental inputs that they have to represent. By encoding the r dimension of colour space with a particularly steep and narrow sigmoid (with halved discrimination at a deviation of 2% from the null value of r), one that is matched to the small environmental range in r, the system is able to make correspondingly finer discriminations within that narrow range.
It would be disastrous, however, to encode only the same narrow range for b as for r, since this would result in almost incessant overloading of the blueness signal, with a purely categorical classification of the visual scene into intensely bluish or yellowish colours. In fact, however, the environmental distributions are somewhat broader than would be expected for strict consistency with the pleistochrome principle. As noted, the discrepancy is not large for r, at least if the Ruderman et al. scene statistics are adopted in making the comparison. But for b the environmental distributions are at least twice as broad, and for luminance, several times broader than expected on the basis of the discrimination data.¹² Why should the operating range of the visual system be narrower than ‘optimal’ in this way?

¹² The variation in luminance is greater than the variation in r by a factor of 15 (Brown 1994) or as much as 60 (Ruderman et al. 1998), instead of the factor of 10 or so expected from the discrimination data. This exacerbates the slight deviation from prediction that we noted in the analysis of red–green discrimination data. It is not clear which of the two divergent estimates of the ratio of achromatic to chromatic environmental variance is to be preferred. The ratio in the whole scene-based data could be inflated by the uncompensated effects of local variation in illumination within the scene (Brown 1994). On the other hand, Brown doubtless favoured vividly coloured objects in selecting his samples.

Figure 5.8 (a) Circles, with straight lines fit, show Δr, the difference in r = L/(L + M) just sufficient for 84% correct discrimination between isoluminant test and standard stimuli, as a function of the value of r for the standard stimulus (abscissa r, 0.64 to 0.76; ordinate Δr, 0.000 to 0.009). The surround was an equal-energy white, for which r = 0.70; hence abscissa values of 0.707 and 0.693 correspond to a 1% L cone contrast between standard and background. Dashed lines show predictions for the extreme cases of linear encoding (horizontal dashed line) and for step-function encoding (steep dashed line). The inset illustrates these encoding schemes. (b) Non-linear response functions generated by the data of (a) on the basis of Fechner’s integration (effectively assuming fixed output noise); the panel is annotated with threshold Δr ~ dr/dN, dr/dN = |r − r0|, and neural signal N = a ln|r − r0|. The required value of the half-gradient chromaticity r0 is roughly 2% in L cone contrast.

Adaptation, local contrast, and the pleistochrome

One answer points to a deficiency in our theoretical framework: we have thus far neglected visual adaptation. Prior to the extraction of a colour opponent signal, the cone photoreceptors individually take up a sensitivity inversely related to their short-term average intensity of stimulation (MacLeod et al. 1992; Webster, Chapter 2 this volume). For this and other reasons, retinally stable images fade in perception (Ditchburn 1973). Evidently, the visual signals that support vision are the transients generated as our eyes play over spatial gradients
in the image, and retinal physiology confirms that it is these spatial differences, rather than absolute colour and luminance at a point, that are encoded in the output from the retina. It is therefore relevant to consider the distribution of the spatial differences in cone excitation across a neighbourhood small enough to be traversed by the gaze of a fixating observer. In the images of Ruderman et al., the differences in luminance, in b, and in r between adjacent 3′-arc pixels are suited for this purpose. Because the excitations for adjacent pixels are correlated, the distributions of those differences are considerably tighter than those for the absolute values, with standard deviations of 30%, 13%, and 0.6% respectively. To achieve a maximally precise representation of local contrast, the visual system can therefore advantageously employ a correspondingly narrow, but dynamically shifting operating range (Craik 1938) for each cone type. This entails a roving null point, rather than a fixed one, for the colour-opponent code (Thornton and Pugh 1983; Krauskopf and Gegenfurtner 1992).¹³ The advantage of a roving null point is that the range of input values spanned by the neural response functions can be more restricted (it need only be wide enough to capture the relatively small deviations in the stimulus values from their time- and space-varying adapting levels), and the precision with which those values can be represented then becomes correspondingly greater. We have generated local-contrast pleistochromes from the images of Ruderman et al. on this basis. These are the contrast–response functions that lead to least error in the representation of pixelwise spatial differences. They are roughly consistent with the psychophysical results in the case of the chromatic variables. But for luminance, the contrast operating range implicit in the discrimination results remains several times narrower than the theoretically optimal one.
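Why correlation between adjacent pixels tightens the distribution of their differences can be illustrated with a small simulation. For two values with equal standard deviation σ and correlation ρ, the standard deviation of their difference is σ·sqrt(2(1 − ρ)). The sketch below checks this on a synthetic first-order autoregressive sequence; the sequence, the value ρ = 0.95 and the function names are assumptions chosen for illustration and are not derived from the images of Ruderman et al.

```python
import numpy as np

def difference_sd(sigma, rho):
    """SD of the difference of two equal-variance values with correlation rho."""
    return sigma * np.sqrt(2.0 * (1.0 - rho))

def simulate_ar1(n, rho, sigma, seed=0):
    """Synthetic stand-in for a row of pixel values: stationary AR(1) sequence."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = rng.normal(0.0, sigma)
    # Innovation SD chosen so that the marginal SD stays equal to sigma.
    innovation_sd = sigma * np.sqrt(1.0 - rho ** 2)
    for i in range(1, n):
        x[i] = rho * x[i - 1] + rng.normal(0.0, innovation_sd)
    return x

pixels = simulate_ar1(n=200_000, rho=0.95, sigma=1.0)
empirical = np.diff(pixels).std()          # SD of adjacent-pixel differences
predicted = difference_sd(sigma=1.0, rho=0.95)
```

With strongly correlated neighbours the differences occupy a much narrower range than the absolute values, which is the condition under which a narrow operating range with a roving null point pays off.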
¹³ An important problem for a system operating in this way is: how can the spatial and temporal differential signals be used to construct a precise representation of colour in absolute terms? This is discussed in other chapters and elsewhere (Land 1964; Arend 1973). The influence of non-linearity of the code in this context has not been much considered, but has been explored in current experiments by Brown, Leonova, and MacLeod that use non-uniform surrounds to diagnose the non-linearity in the contrast response (Brown and MacLeod 1991, 1997). Here we note only that to construct a metric representation of colour and brightness on the basis of spatio-temporal contrast signals generated at borders, those signals must themselves have a metrically meaningful dependence on border contrast, rather than being all-or-none. Our analysis of optimal non-linearity thus remains applicable, requiring only the modification considered here: that it is error in the representation of spatial contrast that must be minimized.

A second limitation in our initial framework, also connected with the role of adaptation, may account for this remaining discrepancy for luminance. We have taken for granted that the purpose of colour and lightness vision is to represent colours and lightnesses with the least possible error and to allow these attributes of a surface to be estimated as precisely as possible. But of course, differences in lightness and colour are also indispensable for the detection of spatial features (Morgan et al. 1992). Precision in the representation of surface elements that are already recognizably distinct is not useful for that purpose. All that edge detection requires is that local contrasts should be detected with the greatest possible sensitivity. For this purpose, the ideal encoding scheme is an all-or-none or categorical one, with a step-function non-linearity at a small threshold offset from the adapting background stimulus, as in the inset to Fig. 5.8a. The large, all-or-none spatial contrast signal
resists obliteration by fluctuations in the output, and the graded response of the pleistochrome is not needed. Visual non-linearity more step-like than the pleistochrome could therefore reflect a compromise in design between the conflicting requirements of surface identification and characterization, on the one hand, and detection of spatial features, on the other. If we adopt the common view that the luminance system (or in physiological terms, the magnocellular pathway) is more concerned with form and with detection of spatial structure than are the chromatic ones (e.g. Boynton et al. 1977; Gregory and Heard 1979; Livingstone and Hubel 1987; see also below), the unexpectedly abrupt non-linear saturation of the psychophysical signal for luminance contrast could have this as its functional justification.
Anisotropy of colour space

Thus far we have considered only how discriminative power is allocated (or should be allocated) to the different parts of each stimulus continuum, but our ecological framework also raises issues concerning the relative precision with which different stimulus dimensions are represented. If the same number of opponent neurons, with the same output range and variability, are used to encode the three dimensions of colour space, discrimination errors along each dimension will be scaled in the same way as the total operating range for that dimension. Thus, the mean error will be about tenfold less for r than for b or for luminance. To a very rough approximation, the data support this: Leonova’s lowest threshold values are 0.8% for luminance, 0.1% for r, and 3% for b. Red–green contrasts are indeed ‘what the eye sees best’ (Chaparro et al. 1993), but not in the sense that natural environments provide more detectable differences for the red–green dimension than for luminance. Rather, the sensitivity difference is as expected for two otherwise comparable systems, limited by output noise, that have two very different contrast responses: response functions matched in each case to the very different range of environmental inputs and spanning that range with about the same number of reliably distinguishable signal levels. But the visual system invests less in discrimination for b than for the other two dimensions. The neural investment required to represent a stimulus continuum depends on the operating range as well as on the discrimination threshold. It would be costly, for instance, to achieve the same precision in estimates of b as of r, while using a tenfold shallower response function to span the tenfold greater range of natural inputs. That would require 10 times as many ‘slices’ for b as for r, and hence the investment of 100 times as many neurons, if signal-to-noise ratio increases as the square root of the number of samples. In fact the visual system, with 30-fold better discrimination for r than for b, makes three times as many slices in the r as in the b direction, as if by allocating nine times more neurons to r.¹⁴

¹⁴ Considerations of spatial resolution also dictate a reduced number of S cones and of ‘yellow–blue’ postreceptoral cells that take input from the S cones. Here, too, what makes the sparseness of S cones a good design choice is their own spectral isolation at the short-wavelength end of the spectrum, where chromatic aberration prevents them from receiving a sharp image (Boynton 1980).
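The slice and neuron counts in the argument above follow from two simple relations: the number of slices along a dimension scales as operating range divided by threshold, and, if signal-to-noise ratio grows as the square root of the number of neurons, the neuron count scales as the square of the number of slices. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the ratios are the rounded ones quoted in the text.

```python
def relative_slices(range_ratio, threshold_ratio):
    """Slices along one dimension relative to another: range / threshold."""
    return range_ratio / threshold_ratio

def relative_neuron_count(slice_ratio):
    """Relative neuron investment, assuming SNR grows as sqrt(number of neurons)."""
    return slice_ratio ** 2

# Equal precision for b and r over a tenfold greater range for b.
slices_equal_precision = relative_slices(range_ratio=10, threshold_ratio=1)
neurons_equal_precision = relative_neuron_count(slices_equal_precision)

# Observed case: b spans a tenfold greater range with a 30-fold higher threshold.
slices_b_over_r = relative_slices(range_ratio=10, threshold_ratio=30)
neurons_r_over_b = relative_neuron_count(1 / slices_b_over_r)
```

The first case reproduces the 10-fold slice and 100-fold neuron cost of equal precision; the second reproduces the threefold slice and ninefold neuron advantage for r.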
Comparison with physiological input–output functions: variations on a theme of Fechner
Much of our discussion makes use of ideas traceable to G. T. Fechner (Fechner 1860), who postulated a non-linear relationship between stimulus magnitude and sensory effect to explain non-uniformity in discrimination thresholds. We next apply a Fechnerian construction to derive non-linear opponent codes for lightness and colour from discrimination data, such as those of Fig. 5.8a, and to compare the result with physiological data on the response functions of single neurons in the optic nerve. In Fig. 5.8a, predictions for the two extreme cases that were introduced in Fig. 5.3 are illustrated for comparison with the data. A linear code predicts uniform precision of discrimination (horizontal dashed line). An all-or-none response, that distinguishes sharply between reddish and greenish colours but makes no distinction among the colours of each category, permits standards of any redness to be distinguished only from greenish tests, and vice versa; hence the threshold $\Delta r = |r - r^*|$, where $r^*$ is the colour category boundary. In Fig. 5.8 the steep dashed V illustrates this prediction, assuming a category boundary at $r^* = 0.7$ (the value for white). Neither of these extreme models describes the data well; the condition for discrimination is neither constant nor as abruptly standard-dependent as the step non-linearity would require. Instead, the linear increase in threshold on each side of the white point suggests, by a straightforward extension of Fechner’s argument to the colour domain, a logarithmic compression of each of the two colour-opponent neural signals that form the split-range code (Fig. 5.8b). The linear variation of the discrimination threshold with $r$ on each side of the null point $r = 0.7$ in Fig. 5.8, with an abscissa intercept at $r_0$, leads to a response-intensity function of the form:
$$N = \ln |r - r_0| \qquad (5.6)$$
where $r_0$ has a value of about 0.714 for the ‘green’ response (applicable for $r < 0.7$) and 0.68 for the ‘red’ response. The value obtained by reflecting $r_0$ around the null point, $r = 0.7 + (0.7 - r_0)$, is the value of $r$ associated with a doubling of threshold, or a halving of differential sensitivity. This condition occurs at an L cone contrast of about 2% with respect to the null white stimulus in each case. By reversing the argument that led to Equation 5.4, one can then ask: for what distribution of environmental inputs is the Weber law discrimination function (and the logarithmic response non-linearity of Equation 5.6) optimal?15 The answer is $p(x) = p_{\max} / (1 + |x/x_0|)^3$. This function does fit tolerably well the central core of the distribution of local contrast in the images of Ruderman et al. Whether we accept Fechner’s integration or not, the need to perceptually reconstruct values distributed in this way adds a new functional rationale for Weber’s law for contrast.
15 Unfortunately the logarithmic non-linearity of Equation 5.6 and Fig. 5.8b has no strict response limit, so the assumptions underlying Equations 5.3 and 5.4 are not applicable. Modifications of that framework can, however, accommodate the Fechnerian non-linearity. It might be desired, for instance, to keep the firing rate below some practical limit, even though the logarithmic function does not entail such a limit. Or one could consider a revised optimization criterion: to minimize average error within the constraint of a given average response to all stimuli; we find that this yields optimal prescriptions only subtly different from the pleistochrome of Equation 5.4.
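The Fechnerian construction used above can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes a threshold exactly proportional to the distance from the intercept r0, and it counts just-detectable steps from the null point r = 0.7 by integrating the reciprocal of the threshold. The function names, the proportionality constant `k`, and the midpoint-rule integration are illustrative assumptions; only the values 0.7, 0.68, and 0.714 come from the text.

```python
import math

def threshold(r, r0, k=1.0):
    """Assumed Weber-like threshold: proportional to distance from the intercept r0."""
    return k * abs(r - r0)

def fechner_response(r, r0, r_null=0.7, steps=10_000):
    """Accumulate 1/threshold from the null point to r (midpoint rule).

    The result is the number of threshold steps separating r from the null
    point, in units set by k. It is returned as a positive magnitude on
    either side of the null point.
    """
    h = (r - r_null) / steps
    total = 0.0
    for i in range(steps):
        mid = r_null + (i + 0.5) * h
        total += abs(h) / threshold(mid, r0)
    return total

# 'Red' response (r > 0.7) with r0 = 0.68, as quoted in the text.
numeric = fechner_response(0.75, 0.68)
closed_form = math.log(abs(0.75 - 0.68)) - math.log(abs(0.70 - 0.68))
print(numeric, closed_form)
```

Up to an additive constant fixed by the choice of starting point, the numerical integral agrees with the logarithmic form of Equation 5.6. The same sketch also shows the doubling of threshold at the reflected value r = 0.72 for the 'red' response.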
The non-linearity implied by Equation 5.6 with the experimentally determined parameter values is quite severe. The gradient of the response function, assumed to be directly proportional to differential sensitivity, is halved at a cone contrast of 2%. No physiological data suggest quite so severely compressed a response function for responses to chromatic stimuli: half-saturation L cone contrasts of around 10% appear to be more typical for the red–green sensitive P cells of the parvocellular stream (Lee et al. 1990). Thus, although the psychophysically estimated visual operating range is efficiently matched to the range of environmental inputs, the physiological one apparently is not. This would not have alarmed Fechner, who located the logarithmic compression at the brain–mind interface and assumed that physiological processing would be linear. Even if we reject this particular reconciliation of non-linear psychophysics with (relatively) linear retinal physiology as metaphysically unsound, it remains possible that later stages of processing could be implementing the severe logarithmic contrast compression that Fechner’s integration entails. But the theoretical link between discrimination and physiological non-linearity is uncertain for at least two other reasons. First, physiological measures of half-saturating contrast may be made with stimuli that are not optimal in spatiotemporal structure. This leads to underestimation of half-saturating contrast, because the non-linearity of retinal ganglion cells is a function of response rather than of the stimulus contrast per se. Secondly, in applying Fechner’s integration to estimate hypothetical neural response functions, we assume that just detectable differences correspond to equal differences in the neural signal. This amounts to assuming fixed output noise, whereas physiological observation suggests instead that the standard deviation in firing rate increases with mean rate. 
In some experiments the increase is almost linear with the square root of the mean, as expected for a Poisson process (Tolhurst et al. 1983; Levine et al. 1988; Levine et al. 1992); in others, the increase in variability is small, and the fixed-noise idealization of cell behaviour is more appropriate (Croner et al. 1993). If we choose to model neural firing with a Poisson process rather than a fixed-noise assumption, then the compressive non-linearity required to model the red–green discrimination data becomes far less severe; spontaneous activity rate now becomes an important free parameter, but, with plausible estimates of that, the r0 value needed to model the data becomes an order of magnitude greater than in the fixed-noise analysis, more than enough to match roughly the physiologically measured non-linearity. Psychophysical chromatic discrimination data and (P cell) retinal physiological data are therefore consistent after all, if the assumptions made about neural noise are tailored for a good fit between them. Fortunately the theoretical connection between discrimination and the distribution of environmental inputs, explored above, is not subject to these uncertainties.
Turning to the achromatic axis of colour space, we saw on p. 177 that the psychophysical operating range in cone contrast along that axis is some tenfold greater than along the red–green axis, and that the ratio of the dispersions of the environmental inputs is at least that large. If physiologically documented non-linearities were consistent with visual performance, on the one hand, and with the stimulus statistics on the other, the cells mediating judgements of achromatic intensity should likewise have a tenfold greater contrast range than those mediating red–green sensitivity. But which are these cells that construct the achromatic axis of colour appearance? A popular answer would be: the M cells of the magnocellular pathway (Livingstone and Hubel 1987). However, these cells saturate at
very low achromatic cone contrasts, with half-saturation values of around 5%, even lower apparently than the 10% L cone contrast value quoted for red–green P cells (Kaplan and Shapley 1986; Lee et al. 1990; Wachtler et al. 1996). The M cells, then, deviate more than tenfold from the optimal behaviour embodied in the pleistochrome, and, as a result, could not support the observed keen discrimination between test patches with relatively high achromatic contrast relative to their surrounds. It is therefore likely that the M cells are not responsible for representing the achromatic attributes of surfaces in a continuous fashion, but serve instead as all-or-none detectors of spatial contrast in the sense discussed on p. 178. As noted there, a very rapidly saturating contrast-response function suits them well for such a role. The metric representation of the achromatic axis could be the job of the colour system (Allman and Zucker 1990). In support of this idea, physiological investigations such as those cited above have shown that the P cells have an almost linear response to achromatic contrast, consistent with the pleistochrome for achromatic inputs.
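The dependence of the inferred non-linearity on the noise assumption, discussed earlier in this section, can be sketched with a toy model. Everything here is an illustrative assumption rather than the authors' analysis: the response is taken as R(c) = spont + gain·ln(1 + c/c0), the threshold as noise divided by the local slope dR/dc, and the values gain = 10 and spont = 1 are hypothetical. The sketch asks which semi-saturation constant c0 makes the threshold double at 2% contrast, first with fixed noise and then with Poisson-like noise (standard deviation equal to the square root of the rate).

```python
import math

def threshold(c, c0, gain, spont, poisson):
    """Contrast threshold for the assumed response R(c) = spont + gain*ln(1 + c/c0)."""
    rate = spont + gain * math.log(1.0 + c / c0)
    slope = gain / (c0 + c)                       # dR/dc
    noise = math.sqrt(rate) if poisson else 1.0   # Poisson-like vs fixed noise
    return noise / slope

def threshold_ratio(c, c0, gain, spont, poisson):
    """Threshold at contrast c relative to the threshold at zero contrast."""
    return (threshold(c, c0, gain, spont, poisson)
            / threshold(0.0, c0, gain, spont, poisson))

def fit_c0(c_double, gain, spont, poisson, lo=1e-4, hi=10.0):
    """Find c0 such that the threshold doubles at c_double (geometric bisection)."""
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if threshold_ratio(c_double, mid, gain, spont, poisson) > 2.0:
            lo = mid   # too compressive: a larger c0 is needed
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical parameters; only the 2% doubling contrast is taken from the text.
c0_fixed = fit_c0(0.02, gain=10.0, spont=1.0, poisson=False)
c0_poisson = fit_c0(0.02, gain=10.0, spont=1.0, poisson=True)
print(c0_fixed, c0_poisson)
```

Under fixed noise the fitted c0 equals the doubling contrast itself. Under Poisson-like noise part of the threshold rise is carried by the growing noise, so the same data are fitted with a several-fold larger c0, that is, a milder compression. The exact factor depends strongly on the assumed gain and spontaneous rate.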
Concluding summary
We have shown how visual non-linearity can be optimized for the precise representation of environmental inputs. Such optimization leads to the adoption of opponent split-range codes, and the recognition of this provides a new ecological justification for such codes. The key points in our account are:
• Non-linearly compressed neural signals are needed in order to form the most precise representation of stimulus values from a peaked frequency distribution, using neurons of limited response range (p. 159).
• The optimal form for the non-linear response function (the pleistochrome) can be determined given the distribution of inputs (p. 161).
• The treatment can be extended to multidimensional stimulus domains, notably to colour space (p. 168).
• When a single stimulus dimension can be represented by more than one neuron, a dual opponent or ‘split range’ code, of the type familiar from the physiology of colour vision, is much more efficient than the optimal single-neuron code (p. 170).
• Some aspects of the phenomenology of colour vision, and of data on colour discrimination, are understandable on the assumption that the relevant neural codes have been selected for minimizing error in the perceptual estimation of stimulus parameters for natural colours (p. 172). In particular, for different dimensions of colour space the psychophysically assessed operating range of the visual system spans a range well matched to the environmental distribution of natural colours.
• Physiological data from the parvocellular pathway are also roughly consistent with the idea that these cells are optimized for the precise metric representation of colour (and perhaps lightness). But cells in the magnocellular pathway have a much stronger than optimal saturating non-linearity, and this supports the view that their function is mainly to detect boundaries rather than to specify contrast or lightness (p. 179).
Acknowledgements We are grateful to Richard Brown, Anya Leonova, Dan Ruderman, Tom Cronin, and C. C. Chiao for permission to use the data we cite from them (some of it not yet fully published). For useful conversations or comments we thank Richard Brown, Anya Leonova, Dan Ruderman, Simon Laughlin, Markus Meister, and Harvey Smallman. Supported by NIH grant EY01711.
References
Allman, J. and Zucker, S. (1990). Cytochrome oxidase and functional coding in primate striate cortex: A hypothesis. Cold Spring Harbor Symposium on Quantitative Biology 55, 979–982.
Arend, L. E. (1973). Spatial differential and integral operation in human vision: Implications of stabilized retinal image fading. Psychological Review 80, 374–395.
Atick, J. J., Li, Z. P., and Redlich, A. N. (1992). Understanding retinal color coding from 1st principles. Neural Computation 4, 559–572.
Barlow, H. B. (1965). Optic nerve impulses and Weber’s law. Cold Spring Harbor Symposium on Quantitative Biology 30, 539–546.
Barlow, H. B. (1972). Single units and sensation: A neuron doctrine for perceptual psychology? Perception 1, 371–394.
Barlow, H. B., Levick, W. R., and Yoon, M. (1971). Responses to single quanta of light in retinal ganglion cells of the cat. Vision Research, Suppl. 3, 87–101.
Baylor, D. A. (1987). Photoreceptor signals and vision. Proctor Lecture. Investigative Ophthalmology and Visual Science 28, 34–49.
Bell, A. J. and Sejnowski, T. J. (1997). The ‘independent components’ of natural scenes are edge filters. Vision Research 37, 3327–3338.
Bialek, W. and Rieke, F. (1992). Reliability and information transmission in spiking neurons. Trends in Neuroscience 15, 428–434.
Boynton, R. M. (1980). Design for an eye. In Neural mechanisms in behavior, (ed. D. McFadden). Springer-Verlag, Berlin.
Boynton, R. M. and Kambe, N. (1980). Chromatic difference steps of moderate size measured along theoretically critical axes. Color Research and Application 5, 13–23.
Boynton, R. M., Hayhoe, M. M., and MacLeod, D. I. A. (1977). The gap effect: Chromatic and achromatic visual discrimination as affected by field separation. Optica Acta 24, 159–177.
Brindley, G. S. (1960). Physiology of the retina and the visual pathway. Edward Arnold, London.
Brown, R. O. (1994). The world is not grey. Investigative Ophthalmology and Visual Science, Suppl. 35, 2165.
Brown, R. O. and MacLeod, D. I. A. (1991). Induction and constancy for color saturation and achromatic contrast variance. Investigative Ophthalmology and Visual Science, Suppl. 32, 1214.
Brown, R. O. and MacLeod, D. I. (1997). Color appearance depends on the variance of surround colors. Current Biology 7, 844–849.
Buchsbaum, G. and Gottschalk, A. (1983). Trichromacy, opponent colours coding and optimum colour information transmission in the retina. Proceedings of the Royal Society of London B 220, 89–113.
Chaparro, A., Stromeyer, C. F. III, Huang, E. P., Kronauer, R. E., and Eskew, R. T., Jr (1993). Colour is what the eye sees best. Nature 361, 348–350.
Craik, K. J. (1938). The effect of adaptation on differential brightness discrimination. Journal of Physiology (London) 92, 406–421.
Croner, L. J., Purpura, K., and Kaplan, E. (1993). Response variability in retinal ganglion cells of primates. Proceedings of the National Academy of Sciences, USA 90, 8128–8130.
Derrington, A. M., Krauskopf, J., and Lennie, P. (1984). Chromatic mechanisms in lateral geniculate nucleus of macaque. Journal of Physiology (London) 357, 241–265.
DeValois, R. L. and DeValois, K. K. (1975). Neural coding of color. In Handbook of perception, Vol. 5, (ed. E. D. Carterette and M. P. Friedman), pp. 117–166. Academic Press, New York.
DeValois, R. L. and DeValois, K. K. (1993). A multi-stage color model. Vision Research 33, 1053–1065.
Ditchburn, R. W. (1973). Eye-movements and visual perception. Clarendon Press, Oxford.
Donner, K. (1992). Noise and the absolute thresholds of cone and rod vision. Vision Research 32, 853–866.
Eisner, A. and MacLeod, D. I. A. (1980). Blue sensitive cones do not contribute to luminance. Journal of the Optical Society of America 70, 121–123.
Fechner, G. T. (1860). Elemente der Psychophysik. Breitkopf und Härtel, Leipzig.
Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A 4, 2379–2394.
Fukurotani, K. (1982). Color information coding of horizontal cell responses in fish retina. Color Research and Application 7, 146–148.
Gregory, R. L. and Heard, P. (1979). Border locking and the Cafe Wall illusion. Perception 8, 365–380.
Kaplan, E. and Shapley, R. M. (1986). The primate retina contains two types of ganglion cells, with high and low contrast sensitivity. Proceedings of the National Academy of Sciences, USA 83, 2755–2757.
Kohonen, T. (1989). Self-organization and associative memory. Springer-Verlag, Berlin.
Krauskopf, J. and Gegenfurtner, K. (1992). Color discrimination and adaptation. Vision Research 32, 2165–2175.
Land, E. H. (1964). The Retinex. American Scientist 52, 247–264.
Larimer, J., Krantz, D. H., and Cicerone, C. M. (1974). Opponent-process additivity–I. Red/green equilibria. Vision Research 14, 1127–1140.
Laughlin, S. B. (1981). A simple coding procedure enhances a neuron’s information capacity. Z Naturforsch [C] 36, 910–912.
Laughlin, S. B. (1983). Matching coding to scenes to enhance efficiency. In Biological processing of images, (ed. O. J. Braddick and A. C. Sleigh), pp. 42–52. Springer-Verlag, Berlin.
Lee, B. B., Pokorny, J., Smith, V. C., Martin, P. R., and Valberg, A. (1990). Luminance and chromatic modulation sensitivity of macaque ganglion cells and human observers. Journal of the Optical Society of America A 7, 2223–2236.
Lee, B. B., Wehrhahn, C., Westheimer, G., and Kremers, J. (1993). Macaque ganglion cell responses to stimuli that elicit hyperacuity in man: detection of small displacements. Journal of Neuroscience 13, 1001–1009.
Le Grand, Y. (1949). Les seuils différentiels de couleurs dans la théorie de Young. Revue d’Optique 28, 261–278.
Lennie, P., Pokorny, J., and Smith, V. C. (1993). Luminance. Journal of the Optical Society of America A 10, 1283–1293.
Levine, M. W., Zimmerman, R. P., and Carrion-Carire, V. (1988). Variability in responses of retinal ganglion cells. Journal of the Optical Society of America A 5, 593–597.
Levine, M. W., Cleland, B. G., and Zimmerman, R. P. (1992). Variability of responses of cat retinal ganglion cells. Visual Neuroscience 8, 277–279.
Livingstone, M. S. and Hubel, D. H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience 7, 3416–3468.
Luther, R. (1927). Aus dem Gebiet der Farbreizmetrik. Zeitschrift für Technische Physik 8, 540–558.
MacLeod, D. I. A. and Boynton, R. M. (1979). Chromaticity diagram showing cone excitation by stimuli of equal luminance. Journal of the Optical Society of America 69, 1183–1186.
MacLeod, D. I. A., Williams, D. R., and Makous, W. (1992). A visual nonlinearity fed by single cones. Vision Research 32, 347–63.
Marr, D. (1974). The computation of lightness by the primate retina. Vision Research 14, 1377–1388.
McMahon, M. J. and MacLeod, D. I. A. (1998). Dichromatic vision at high light levels: red/green discrimination using the blue-sensitive mechanism. Vision Research 38, 973–983.
Miyahara, E., Smith, V. C., and Pokorny, J. (1993). How surrounds affect chromaticity discrimination. Journal of the Optical Society of America A 10, 545–553.
Morgan, M. J., Adam, A., and Mollon, J. D. (1992). Dichromats detect colour-camouflaged objects that are not detected by trichromats. Proceedings of the Royal Society of London B 248, 291–295.
Müller, G. E. (1924). Darstellung und Erklärung der verschiedenen Typen der Farbenblindheit. Vandenhoeck & Ruprecht, Göttingen.
Osorio, D. and Bossomaier, T. R. J. (1992). Human cone-pigment spectral sensitivities and the reflectances of natural surfaces. Biological Cybernetics 67, 217–222.
Ritter, H. J., Martinetz, T. M., and Schulten, K. J. (1989). Topology-conserving maps for learning visuomotor-coordination. Neural Networks 2, 159–168.
Ruderman, D. L., Cronin, T. W., and Chiao, C. C. (1998). Statistics of cone responses to natural images: Implications for visual coding. Journal of the Optical Society of America A 15, 2036–2045.
Savage, C. (1997). A survey of combinatorial Gray codes. SIAM Review 39, 605–629.
Stockman, A., MacLeod, D. I. A., and Johnson, N. E. (1993). Spectral sensitivities of the human cones. Journal of the Optical Society of America A 10, 2491–2521.
Thornton, J. E. and Pugh, E. N. Jr (1983). Red/Green color opponency at detection threshold. Science 219, 191–193.
Tolhurst, D. J., Movshon, J. A., and Dean, A. F. (1983). The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Research 23, 775–786.
von der Twer, T. and MacLeod, D. I. A. (2001). Optimal nonlinear codes for the perception of natural colours. Network 12, 395–407.
Wachtler, T., Wehrhahn, C., and Lee, B. B. (1996). A simple model of human foveal ganglion cell responses to hyperacuity stimuli. Journal of Computational Neuroscience 3, 73–82.
Commentary on MacLeod and von der Twer Thinking outside the black box Michael A. Webster One way to try to understand our colour perception is to ask what the actual properties of human colour vision are—for example, the number and selectivities of mechanisms measured by different tasks. Another way is to ask what properties we might expect our colour vision to have—that is, what design might best achieve a given goal. The chapter by MacLeod and von der Twer provides an exciting illustration of the insights that can be gained by combining these two approaches. The goal they examine is how to encode colour in a way that maintains optimal discrimination over the range of colours we encounter. This is a fundamental problem in perception because neurons are noisy and have limited dynamic range, so they can reliably signal only a small number of stimulus levels. How should we allocate these levels to different regions of colour space? A large part of the answer lies not in the observer but in the scenes they are viewing. Most points in the image have low contrast, so that the distribution of colours is sharply peaked around the mean. It would therefore make sense for the contrast response function to be steepest near the mean (so that the small differences we are frequently faced with can be distinguished accurately), while asymptoting at high contrasts (which are only rarely encountered). By formalizing this problem in terms of minimizing the error of representation, the authors show that it is possible to derive a precise prediction for the contrast response to colour. This work follows in the spirit of Laughlin’s analysis of contrast coding in the fly visual system (Laughlin 1987), although MacLeod and von der Twer pursue a different optimization rule that they show is more appropriate for a noise-limited system. 
One prediction of this analysis is that the dynamic range of different post-receptoral channels should be matched to the specific gamut of colour signals to which they are tuned. For example, because the spectral sensitivities of the L and M cones overlap, colour signals that depend on the difference between L and M activity are necessarily smaller than luminance signals that depend on their sum. Accordingly, an L − M chromatic channel should devote its capacity to a much narrower range of cone contrasts than a luminance (L + M) channel. Does the visual system actually do this? To complement their theoretical analyses, the authors examine empirical measurements of contrast coding. A well-established property of our colour perception is that we are best at discriminating small differences around the white point and become increasingly worse as the two colours to be distinguished are further removed from white. Intriguingly, they show that the rate of this saturation follows roughly the rate predicted by natural colour distributions. They also go on to show that the predicted response functions are roughly consistent with the measured dynamic range of neurons in the parvocellular layers of the LGN. Notably, luminance coding in the magnocellular pathway saturates too rapidly. This provides powerful support for the argument that it is the parvocellular pathway that encodes both the colour and the lightness of stimuli, while the luminance signals in the magnocellular system may instead be used to represent ‘non-colour’ properties such as edges and motion. A serious problem facing the visual system is that the environment varies, and thus what was optimal under one context will not be under others. As the authors note, this problem can be solved by adaptation. The chapter considers how retinal adaptation can adjust to changes in the average level of the distributions. However, adaptation can also adjust to the range or overall contrast of the distribution. 
While this also has a fundamental influence on contrast coding, the sensitivity changes underlying it arise mainly in the cortex, at sites subsequent to those they focus on. This raises the question of which visual levels are best revealed by their analysis and by psychophysical measurements of contrast coding. The output of the retina is a clear bottleneck in the visual system and thus an
obvious candidate to consider. Yet it is not until the cortex that some properties predicted by the analysis emerge fully. One of these properties is rectification (while another is the representation of stimulus motion, to which the approach is generalized). Among the most powerful insights of their analysis is the demonstration that coding efficiency is enhanced by a ‘split-code’, in which separate neurons respond to increments and decrements. This is because each neuron can then devote its full response range to coding only half the input range, thereby halving the problem of noise. The work thus provides a theoretical justification for separate ‘on’ and ‘off’ responses in the visual system. Rectified responses are more clearly evident in cortical cells because their lower spontaneous activity restricts responses to an increase in firing rates. The authors also suggest that there may be advantages to coding contrast in multiple channels, each with a limited dynamic range. In fact there is again some evidence for this in the cortex (Albrecht and Hamilton 1982). This leads to a fundamental question: to what extent are attributes such as contrast or saturation encoded in a different way than attributes such as hue? It is typical to assume that hue is represented by the distribution of responses across channels, while saturation is represented by the size of the channel responses. But if contrast itself is represented by multiple channels, then the contrast response function may begin to take on the flavour of a confidence rating, and not only a representation of stimulus intensity. The power of the approaches they develop is hard to overemphasize, and they are part of a growing trend to understand visual coding by attending to properties of the visual environment. This perspective has yielded many recent successes, and promises many further avenues to pursue (Simoncelli and Olshausen 2001). 
In fact, if we can predict so well the properties of vision from the visual environment, it is tempting to suppose that we could also move in the opposite direction—to predict the physical world from our vision. The colour characteristics of different environments vary widely. Could we identify the colour distributions of the specific environment that shaped our colour vision—either through evolution or short-term adaptation—by measuring the precise response properties of our colour vision? As MacLeod and von der Twer emphasize, while their focus is on colour vision, the principles are general and thus apply to many aspects of visual coding, not only of simple stimulus attributes but perhaps also of very complex ones. Consider the problem of face perception. We are very good at distinguishing subtle differences between familiar faces, while comparatively poor at discriminating between faces that share less familiar features, a phenomenon known as the ‘other race’ effect. Could this effect be a form of saturation in the ‘contrast’ response for faces, a function that is optimized for discriminating differences from the average face? Could the architecture we uncover for colour vision turn out to provide a feasible model for the representation of higher-order attributes such as faces?
References Albrecht, D. G. and Hamilton, D. B. (1982). Striate cortex of cat and monkey: Contrast response function. Journal of Neurophysiology 48, 217–237. Laughlin, S. B. (1987). Form and function in retinal processing. Trends in Neuroscience 10, 478–483. Simoncelli, E. P. and Olshausen, B. A. (2001). Natural image statistics and neural representation. Annual Review of Neuroscience 24, 1193–1216.
Chapter 6
OBJECTIVITY AND SUBJECTIVITY REVISITED: COLOUR AS A PSYCHOBIOLOGICAL PROPERTY
Gary Hatfield
Preface The status of sensory qualities has been a topic in philosophy since the ancient Greeks. In the history of thought about colour, scientific theories were formed in relation to a background of philosophy, and philosophical theories often arose in conjunction with, or as a result of, scientific work (as in the cases of Aristotle, Ibn al-Haytham, Galileo, Descartes, Boyle, Locke, and Newton). Through the end of the nineteenth century it was usual for colour scientists, such as Helmholtz and Hering, to address the philosophical assumptions and implications of their work. They engaged such assumptions directly, and examined them with philosophical thoroughness. During the middle of the twentieth century philosophical and scientific thought about colour separated. Especially in the decades after 1950, philosophers offered ‘physicalist’ theories of colour without knowing much about the physics, physiology, and psychology of colour vision. Their work was often guided by what they imagined the ‘ordinary’ person would say. But this imagined ‘ordinary’ person usually advanced theses recognizable from previous, or old, philosophy and science. At the same time, the scientists came to believe that they could proceed without philosophy, or without themselves adopting philosophical assumptions. In any area of active science which moves at the border of the unknown, there is no such thing as doing science without philosophy. To attempt to do so simply means that one’s philosophical assumptions go unexamined. That may not cause much damage locally, but it can limit scientific imagination if one is stuck in old philosophy. It can be damaging in colour science if one’s philosophical assumptions, imbibed in the final decades of behaviourism and expressed through an unthinking commitment to physicalist reductionism, lead one to be suspicious of phenomenal experience and biological function, and hence of the very substance of colour vision. 
During the 1970s and 1980s philosophy of science, led by the subfields of philosophy of biology and philosophy of physics, but including philosophy of psychology as well, re-engaged the scientific literature. In philosophical theories about colour this meant coming to terms with the physics, physiology, and psychology (including the phenomenology) of colour vision. That was a good thing. However ubiquitous and ‘obvious’ colour experience may be, the basis of colour vision in the properties of objects, in the structure and functioning of the nervous system, and in the psychological processes of colour vision, is complex. ‘Ordinary’ assumptions about causation, propertyhood, and ‘how things are or must be’ cannot carry the day. In this area, as in many other areas of philosophy, one can’t make philosophical progress without knowing anything else, that is, without engaging with what others know about colour. That’s how it should be. Philosophy aims at generality, but must earn its broad perspective one step at a time, working from the bottom up, while keeping in sight the general vista it demands of itself. Workers in philosophy, as in other areas of the humanities, must earn their abstractions. They are fortunate to be able to do so through the pleasurable toil of understanding. G. Hatfield
Introduction Philosophical theories of colour divide into three. There are the so-called objectivists, who argue that colour is a mind-independent property of objects. There are the subjectivists, who argue that colour is not a property of objects, but an internal state of the perceiver or the subjective content of a perceiver’s experience. And there are the relationalists, who argue that colour, considered as a property of objects, is a relational property; it is a property that surfaces and light sources have of causing experiences with various phenomenal characters in perceivers. These philosophical theories differ on the question of what colour is. Objectivists think of colour as a physical property, which is in principle independent of colour experience and visual perception. Subjectivists make colour experience primary in their conceptions of colour; indeed, they think that the notion of colour has primary reference only to visual experience. Relationalists also define colour in relation to colour experience; however, they are able to define colour as a property of the surfaces of objects by considering the relation between objects and colour experience. Proponents of all three positions marshal the available scientific evidence in their support. To support objectivism, Hilbert (1987, 1992) appeals to Maloney and Wandell’s (1986) analysis of colour constancy as an inference to the spectral reflectance distribution of a given surface (see Maloney, Chapter 9 this volume). The objective colours of things are equated with individual surface reflectance distributions. In arguing for a subjectivist position, Hardin (1988) points to facts of perceiver variability and variety in the physical causes of colour phenomena. He argues that because colour cannot be equated with a specific physical kind, colour experience is a (useful) illusion. 
The relationalist uses similar sorts of data to argue that colour as a property of objects is constituted by the fact that illuminated objects have a disposition to cause perceivers to experience colour visually (Johnston 1992; Campbell 1993; Harman 1996). Some relationalists appeal to a functional notion of colour perception, perhaps supplemented with data concerning inter-species differences, to argue that colour is a psychobiological property, and that a primary function of colour perception is discrimination among objects (Hatfield 1992; Thompson 1995). The frequent appeal to the facts of colour science in the philosophical colour literature is a good thing. It is an instance of the more general trend in philosophy of science to expect that the philosopher’s examples and arguments are responsive to actual scientific positions and to common features of scientific practice (see Hatfield 1995). At the same time, attention to scientific practice reveals that interesting questions at the forefronts of science typically are not resolved by a bare appeal to facts, but to facts in relation to a background of scientific theory and philosophical assumption. The same feature is present in philosophical debates on colour. The three major positions just named depend heavily on background understandings of theoretical terms from both science and philosophy; two of the most important are the terms ‘objective’ and ‘subjective’ themselves. In this chapter I will focus on the notion of colour as a property of the surfaces of objects. Examination of the arguments of the objectivists will help us understand how they seek to reduce colour to a physical property of object surfaces. Subjectivists, by contrast, seek to argue that no such reduction is possible, and hence that colour must be wholly subjective.
I will argue that when functional considerations are taken into account, a relationalist position best accommodates the primary data concerning colour perception, and permits a better understanding of the ways in which colour is both objective and subjective. The chapter ends with a reconsideration of the notions of objectivity and subjectivity themselves, and a consideration of how modern technology can foster misleading expectations about the specificity of colour properties.
Objectivism

Traditional objectivists hold that colour is a mind-independent physical property of objects. The most likely candidate for such a property is the surface spectral reflectance (SSR) of an object. The SSR is the percentage of the light at each wavelength across the visible spectrum that is reflected by a surface. The amount reflected depends on the percentage of the light absorbed by the surface, the remainder being reflected. Chapters 9 and 11 in this book include examples of surface spectral reflectance distributions (or reflectance functions). The most important characteristic of such distributions for our purpose is that they tend, in natural objects, to be relatively smooth functions, which differ in shape. As we will see, the relation between such distributions and perceived colour can be complex. But there are some regularities, such as that typical red objects will reflect more light toward the long wavelength or red end of the visible spectrum, and typical blue objects will reflect more light toward the short wavelength or blue end of the spectrum. Sophisticated objectivists such as Hilbert (1987, 1992), Maloney (Chapter 9 this volume), and Wandell (1995, Chapter 9) identify object colours with surface spectral reflectances. They see the visual system as seeking to develop a stable representation of the surface reflectance (or a more basic physical property related to that reflectance, such as Maloney’s bi-directional reflectance density function, see Chapter 9 this volume). The ability to develop a stable representation of surface colour under variations in ambient illumination is known as colour constancy. The light received at the eyes from an object is a function of both the object’s reflectance properties and the spectral composition of the illuminant (e.g. dawn sunlight, incandescent light, midday sunlight, all of which differ). Therefore, if constancy is to be achieved, the illuminant properties must somehow be accounted for.
As traditionally conceived, this would require solving a problem in two unknowns by contemplating only a single value (the light received at the eyes); so stated, the problem cannot be solved. Additional information of some sort is needed. Maloney and his colleagues have developed ingenious linear models of colour constancy that attribute to the organism some engineering assumptions concerning candidate spectral reflectance distributions and the candidate illuminants, making the problem soluble within certain ranges of accuracy. Some objectivists, including Barlow (1982), Hilbert (1992), and Shepard (1992), see colour constancy as the driving force behind trichromacy (the three-pigment system in human and some other primate eyes). That is, they think that trichromacy evolved because it allows the eye to serve as a better instrument by which the visual system can recover information about SSRs. Two aspects of the objectivist stance are of interest here. First, its overall conception of the task of colour perception is what I have called a ‘physical instruments’ conception (Hatfield
1992, pp. 496–9). Objectivists see the perceptual system as seeking a representation of a distal physical property, such as the SSR. Mausfeld criticizes this view of perception for treating the visual system as a ‘measurement device’ (Mausfeld 1998, p. 224; Chapter 13 this volume). On such a conception, physics provides the appropriate concepts for describing the representational task in colour vision, which is to achieve a representation of physical properties described as such. The relationalist functional view presented below offers an alternative to this conception of the visual system’s function in colour perception. The second point of interest concerns the objectivist’s response to the fact of metamerism. Metameric surface colours occur when different SSRs yield the same perceived colour under specified conditions of illumination. This means that physically distinct stimuli, which exhibit different functions relating wavelength to the absorption and reflection of light, yield phenomenally indistinguishable colour experiences. The phenomenon of metamerism is well established. The interesting question is how to interpret it. Objectivists such as Hilbert (1987, Chapter 5) and Barlow (1982) respond by saying that there are many more colours than we perceive. Having defined surface colour in terms of SSR (or a related measure), they identify each SSR as a distinct colour. If the human visual system, or any visual system, fails to discriminate among SSRs, then it fails to discriminate all the colours there are. Consonant with their physical instrument conception of the function of colour vision, the physical description of surface properties provides the standard for individuating colours, and not the facts of colour experience.
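The structure of metamerism described above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the light reaching the eye in each wavelength band is the product of illuminant power and surface reflectance, and that each receptor class sums that light weighted by its sensitivity. The four wavelength bands, the sensitivity values, the two illuminants, and the reflectances are all hypothetical numbers chosen for the example, not measured data, and the helper names are invented here.

```python
# Illustrative sketch only: all numbers below are hypothetical, not measured data.

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def null_vector(matrix):
    """Return n with matrix @ n == 0 for a 3x4 matrix (signed 3x3 minors)."""
    n = []
    for j in range(4):
        minor = [[row[k] for k in range(4) if k != j] for row in matrix]
        n.append((-1) ** j * det3(minor))
    return n


def responses(reflectance, illuminant, sensors):
    """Each receptor class sums illuminant x reflectance x sensitivity over bands."""
    return [
        sum(s * e * r for s, e, r in zip(sensor, illuminant, reflectance))
        for sensor in sensors
    ]


# Hypothetical sensitivities of three receptor classes in four wavelength bands.
SENSORS = [
    [0.9, 0.3, 0.0, 0.0],  # short-wavelength class
    [0.1, 0.8, 0.6, 0.1],  # middle-wavelength class
    [0.0, 0.4, 0.9, 0.5],  # long-wavelength class
]
ILLUMINANT_1 = [0.8, 1.0, 1.0, 0.9]  # hypothetical relative power per band
ILLUMINANT_2 = [1.0, 0.9, 0.7, 0.5]  # a second, differently shaped illuminant

surface_a = [0.4, 0.5, 0.5, 0.4]  # hypothetical SSR (fraction reflected per band)

# Build a second SSR that differs from surface_a only along a direction
# that the three receptor classes cannot register under ILLUMINANT_1.
weighted = [[s * e for s, e in zip(sensor, ILLUMINANT_1)] for sensor in SENSORS]
n = null_vector(weighted)
scale = 0.1 / max(abs(x) for x in n)  # keep the change within +/- 0.1
surface_b = [r + scale * x for r, x in zip(surface_a, n)]


def max_difference(illuminant):
    """Largest gap between the receptor responses to the two surfaces."""
    ra = responses(surface_a, illuminant, SENSORS)
    rb = responses(surface_b, illuminant, SENSORS)
    return max(abs(x - y) for x, y in zip(ra, rb))


print("SSR a:", surface_a)
print("SSR b:", [round(r, 3) for r in surface_b])
print("max response difference, illuminant 1:", max_difference(ILLUMINANT_1))
print("max response difference, illuminant 2:", max_difference(ILLUMINANT_2))
```

Under the first illuminant the two physically distinct reflectances produce receptor responses that agree up to rounding error, so on these assumptions nothing downstream of the receptors could tell them apart. Under the second illuminant the match breaks down, which is why metamerism is stated relative to specified conditions of illumination. The sketch says nothing about which interpretation of the phenomenon is correct; it only displays the many-to-one mapping that the objectivist and the subjectivist interpret differently.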
Subjectivism

The position of subjectivism is most prominently associated with C. L. Hardin’s 1988 book, Color for philosophers. This book raised the standard of philosophical discussions of colour by paying close attention to scientific work. Hardin examined the various objectivist and dispositionalist theories. (Dispositionalism is a type of relationalist theory.) He rejected objectivism and physicalism on the grounds that there is no single physical property corresponding to the colours we experience. In so doing, he adopted a phenomenalist stance: he took it that a theory of colour should be driven by the facts of colour perception. To this extent, he made colour experience, or at least colour response, fundamental in colour theory considered as a part of the theory of vision. Hardin argued that if physical properties cannot be put into sufficiently direct relation to colour experience, the notion that colours are objective should be rejected (see also Boghossian and Velleman 1991). He disposed of an objectivism similar to Hilbert’s (1987) by appealing to metamerism and certain other higher-order properties of colour, such as the finding that red, green, yellow, blue, black, and white are the primary colours. Hardin contended that since objectivists cannot explain the special status of these primaries by appeal to physical properties alone, their attempted reduction fails. (Jackson and Pargetter 1987, though not responding specifically to Hardin, provide the basis for an objectivist reply that allows subjective variability but identifies colour as the physical property that causes experience in individual perceivers in specific circumstances. This position, although interesting for its admission of the variability of relation between physical properties and
colour experience, fails to respond adequately to the objectivist desideratum of making colour a mind-independent physical property; on which see Hilbert 1992.) In addition, Hardin (1988) moved against dispositionalist theories of colour vision, which he characterized as a variant of subjectivism. A common form of dispositionalism, descended from the natural philosophies of René Descartes and Robert Boyle, and made prominent in the philosophy of John Locke, holds that colours are secondary qualities (for a review, see Hilbert 1987, Chapter 1). A secondary quality is a property of an object that is defined by the object’s standard effect on something else. In the case of colour, the standard effect is the ‘idea’ or experience of colour. In more recent language, the position holds that for an object to be a certain shade of blue is for it to produce a specific experience of blue in standard observers under standard conditions. The appeal to standard conditions takes account of differences in illumination; objects that look white in daylight may, in certain conditions, look red under red light. A dispositionalist theory might make daylight the standard condition, in which case the object would be classed as white. The notion of a standard observer rules out colour-blind observers, or observers in special states of adaptation or in drug-altered states. Boyle and Locke expressed this position using the language of primary and secondary qualities. Primary qualities are physically basic. For Boyle and Locke, they include the size, shape, position, and motion of the microscopic corpuscles that they held to constitute matter. Candidates for the relevant primary qualities today might be the absorption and reflectance properties of surfaces, or the underlying atomic and molecular properties that determine those properties. Colours, sounds, tastes, odours, and tactual qualities such as hot and cold are secondary qualities. 
Physically, they are constituted from primary qualities (for Locke and Boyle, configurations of corpuscles). But they are defined as powers to produce sensations or ideas in the minds of observers. In this sense, they are relational properties. If there were no (actual, or perhaps possible) observers, there would be no secondary qualities—the existence of the secondary qualities depends upon there being observers in which the experience of colour, for example, can be caused. Hardin (1988) sought to show that the notions of standard conditions and standard observers cannot support a view that colours are stable dispositions of objects to produce experiences. The scientific literature shows that colour constancy is not perfect. So if colour in objects is the disposition to produce colour experience of a specific hue (or shade of colour) in standard observers under standard conditions, nature does not cooperate. Under any natural (i.e. not artificially restricted) interpretation of what might count as standard conditions or standard observers, the conditions and observers can be fixed and yet the colour response vary (among standard observers and within the class of standard conditions). If the dispositionalist wants to assign to objects specific, stable, intersubjectively common hues using the relation between surface reflectance properties and the colour experience of observers, the evidence Hardin presents poses a serious problem. In the end, Hardin argues that close scrutiny of the notions of standard conditions and observers reveals that colour is an interest-relative and subjective notion with no objective basis. He concludes that colour experience is a useful illusion; it presents objects as having properties they do not have. The illusion results from properties that objects and perceivers do have, hence has some foundation in reality, is persistent, and so permits the use of
colour appearances in the classification of objects (Hardin 1988, Chap. 2). But, Hardin thinks, these findings undermine any attempt to ascribe colour as an objective property to objects (see also Boghossian and Velleman 1989). In my view Hardin’s response to the scientific evidence is too extreme. By re-examining the notions of subjectivity and objectivity and reflecting further on the notion of property, I think we can find a place for a relationalist functional theory of colour that permits colour to be a subject-relative, but in important respects objective, psychobiological property of objects. These reflections will not require that we examine or qualify the empirical results Hardin describes. For our purposes, we need not re-examine his facts. Rather, we will look at the theoretical context and philosophical assumptions he and others use to interpret those and other facts.
Relational functionalism

I agree with Hardin that colour experience should be an important component in any analysis of colour as a property. My analysis therefore begins from the place of colour in perception. From this position one might, or might not, come to reduce colour as a property of objects to a mind-independent physical property. In fact, I also agree with the tenor of Hardin’s response to physicalist objectivism (Hatfield 1992). However, I think that the sort of facts he presents can be made consistent with a certain kind of objectivist view of colour, a relational functionalist view. My view is relationalist in that, like the dispositionalist, it accepts that colour as a property in things consists in the disposition of things to cause experiences of a certain sort in perceivers. It is functionalist in that it looks to the biological function of colour vision for guidance about what sort of property is constituted by the relations between objects and perceivers. The analysis I will present disagrees with the physicalist objectivists on four important points. I will argue that:

1. Colour constancy need not be the driving force toward trichromacy.
2. To possess colour an object need not be assigned a precise shade of colour.
3. Properties can be species-relative.
4. Objectivity is not always incompatible with subjectivity.
These points, taken together, are consonant with a view that trichromatic colour vision evolved in primates as a means for discriminating objects by their surface properties, for which exact constancy is not needed. In opposition to Hardin’s (1988) subjectivism, these points can serve as the basis for assigning colour to objects as an objective, subject- and species-relative property. One way of asking what property coloured objects have is to ask what representational content is found in colour experience. That is, what does experienced colour represent about objects? The physicalist objectivist thinks that colour experiences do, or should, represent individual SSRs, and then concludes that to the extent that colour experience does not uniquely reveal SSRs, it falls short of its representational task. My approach is that colour experience represents surfaces as having properties that make them instances of a hue class.
It may do so by representing the surface as having a specific hue, but this does not mean that the object can or should be assigned that particular shade as its colour. Rather, the object is assigned a colour type, in relation to its appearance to colour observers of a specific type (e.g. normal human observers) under ecologically standard conditions (e.g. daylight viewing). If an object appears green, blue, red, yellow, etc., in daylight, then it is assigned that colour, but need not be assigned (as a stable, objective property) the particular shade it appears as having to an individual observer under a given instance of daylight. This position arises from a functionalist conception of assigning representational content in perception. A functional approach assigns content in relation to a task analysis, or an analysis of the function of the representational system in question (Hatfield 1988, 1991; Matthen 1988). Thus, one function of vision is surely to represent the spatial layout; various spatial structures would be assigned as contents of visual experiences under this analysis. In the case of colour vision, to apply this sort of analysis one would seek to determine what the (or a) function of colour vision is for a given species. (There need not be only one function in a given species, or across species.) Ascriptions of such functions are based in biology, and typically appeal to evolutionary theory. The basic idea is that a structure or system is assigned a function in accordance with the selection pressures that lead to its evolution and maintenance in a type of organism. Consequently, if colour vision has come to have other, culturally defined functions that have not been active in natural selection, those functions are described as artefact functions, and are left out of the primary analysis of colour as a naturally occurring property (more on this below). 
The long history of the evolution of eyes shows that visual pigments are adapted to prevailing light conditions. The pure rod retinas of deep-sea fish are adapted to the small segment of the visible spectrum that penetrates to their depth (Lythgoe 1972; Lythgoe and Partridge 1991). In those with only one type of rod pigment, the wavelength of maximum light sensitivity of the rods closely matches the peak ambient light. That sort of match would be effective for fish who hunt from below, seeing their prey as dark areas against the downward light. Many fishes are dichromats. Investigators have wondered how a two-cone system could evolve. They have considered evolutionary scenarios in which a stable two-cone retina might evolve prior to the development of dichromatic colour vision itself. (The possession of two types of visual pigment is not sufficient for colour vision; the visual system must compare the outputs of the two types for colour discrimination to occur.) McFarland and Munz (1975) argue that the original selection pressure for two types of cones in ocean fish that hunt near the surface might have come from the demands of two sorts of discriminatory tasks. For hunting from below, such fish would be well served by cones with maximum sensitivity matching the peak wavelength in the available downwelling light, as for the deep-sea fish. That would make any object seen from below dark against a bright background. Along the horizontal line of sight, the peak available light is of shorter wavelength than the broad spectrum downwelling light (within several metres of the surface). Hence, for hunting objects along that line of sight, it is better to have a cone type with maximum sensitivity offset toward the long wavelengths. In that way, the ambient spacelight of the background would appear darker, and objects reflecting the broad-band downwelling light would stand out.
McFarland and Munz (1975) contend that two-pigment cone retinas might have evolved so that both sorts of discrimination could be served by a single eye. That would require separate visual pathways for each cone type, a precursor to colour vision. They conjecture that ‘the evolution of high visual acuity with maximum contrast under varied photic conditions would favor the selection and maintenance of separate visual pathways for these different cases. In other words, we have described the elements necessary for color vision’ (McFarland and Munz 1975, p. 1073). Colour vision would not be needed initially to explain the advantage of this system, and could evolve subsequently, once the two visual pathways were available to allow further selection on neural wiring. Adopting a biofunctional and comparative attitude, we may ask what colour vision is ‘for’ in (at least some) mammals. After a thorough review of the literature, Jacobs (1993, pp. 456–7) concluded that colour vision serves the following functions: (1) to provide contrast not based on achromatic brightness or lightness; (2) to aid in the detection of small objects in a dappled environment, where lightness cues are largely masked (e.g. fruit in trees); (3) to aid in segregation of objects divided by occlusion (e.g. fruit seen through leaves, see Mollon 1989); (4) to identify objects by their stably perceived colour (requires something approaching colour constancy). Only item (4) requires something approaching colour constancy, and even it does not require perfect constancy; it would suffice if environmentally salient objects could be stably re-identified by colour class. The fineness of the partition of the hue space needed to achieve this task would depend on the characteristics of the objects to be sorted (Hatfield 1992, 1999). That, of course, is an empirical matter that would require analysis of the photic properties of biologically significant objects on a species by species basis. 
Much of the literature on comparative colour vision, and on the evolution of trichromacy in primates, stresses functions (1) to (3). Mammalian trichromacy is comparatively recent, having evolved in the Cenozoic era, after the adaptive radiation of mammals some 65 million years ago (Goldsmith 1990). Genetic analysis suggests that it evolved through selection on naturally occurring polymorphism in the middle-wavelength sensitive (MWS) cone. Thus, the short-wavelength cone is thought to have been stable, but the MWS cone to have exhibited polymorphic variance that provided instances of the MWS and LWS cone types, in relation to which selection for neural wiring to permit trichromatic colour vision might occur. Trichromatic colour vision of this sort would allow better discrimination of yellow, red, and orange objects found among green leaves. For such discrimination to occur, perfect or near-perfect colour constancy would not be needed. Rather, it would need only be the case that yellow, red, and orange fruit was more easily discriminable to a trichromat (by comparison with a dichromat) across a significant range of natural lighting conditions. This ‘fruit detection’ hypothesis has long been favoured as the explanation of the development of colour vision (e.g. Allen 1879, Chapter 6; Walls 1942, Chapter 12) and trichromacy (Polyak 1957, pp. 972–4), and receives support from recent empirical studies such as those reported by Mollon (1989) and Jacobs (1996).
According to this analysis, when trichromacy evolved things gained new colours, as the visual system became able to group things using a more fine-grained partition of the chromatic appearance of surfaces. Thus, fruit and leaves came to appear more distinctly different, chromatically, than before. For tasks (1) to (3), there is no need for precise colour constancy, nor any need that colour properties be equated with specific shades (that is, highly determinate hues).
Colour as a psychobiological property of surfaces

Colour is an attribute of objects that makes surfaces visually discriminable without a difference in brightness or lightness. Focusing for the moment on human colour vision, it makes objects discriminable because they appear with differing hue or chromaticity. More generally, ascriptions of colour vision to various animals can be made by finding that the members of a species (or a subpopulation of the species, e.g. normal trichromatic humans) can discriminate independently of brightness or lightness in E (an environment, normally specified by ecologically typical conditions). Under this analysis, colour is a relational attribute, analogous to being a solvent. The existence of colour as an attribute of objects depends on the normal effects of objects on perceiving subjects. In humans, these effects include a phenomenal or experiential component. Accordingly, for an object to possess colour is for it to have a surface reflectance that produces a phenomenal chromatic visual presence that permits discrimination among objects independent of brightness or lightness by members of a type of population in E. The colours under which objects appear can serve as the basis for categorizing objects. However, qualitatively similar clusters of colour experiences are not themselves categories (pace Thompson 1995, pp. 184, 196). For the colours of objects to be useful for categorization, the same object should appear with the same hue-type under a variety of conditions, but it need not appear with the same specific hue. It is consistent with an object possessing colour that it appear differently under differing conditions (of the perceiver, and/or the environment); such differences would be multiplied if there were no colour constancy, but objects would still possess the attribute of colour.
Even with some degree of colour constancy, the expression of the attribute of colour can be affected by environmental conditions and the state of the perceiver. Modern colour science has developed colorimetry, or the alignment of colour judgements with combinations of wavelengths, into an exact art (Kaiser and Boynton 1996, pp. 25–6 and appendix). This art is made possible by severely restricting the conditions under which colour observations are made by test observers. The high degree of accuracy achieved makes possible standardized dyes, and serves engineering functions, such as the production of colour television sets. The specificity found in laboratory colorimetry should not result in our treating the colour attribute as if it were realized by a set of finely differentiated colour properties (corresponding to the range of highly specific hues). For certain cultural, scientific, or industrial purposes, such specificity is desirable. However, when colour vision is regarded as a biological capacity of sighted animals, the resulting functional approach to the colour attribute suggests it is realized by surface characteristics that yield varying colour responses across differences in ambient conditions and in the type and state of the perceiving subjects.
This variation also is recognized in colour science. The attitude toward it varies. We have seen that many objectivists view ‘the colour of an object’ as a highly specific physical property that may be recovered with more or less success by natural visual systems under ecological photic conditions; under this conception, the same response to differing SSRs, or differing responses to the same SSR, indicate error. Subjectivists have concluded that the extant variation undermines the very notion that objects are really coloured (have a colour property). In my view, the subjectivist gives up on colour properties too quickly, while the objectivist divorces the colour property from colour experience and misdescribes the function of colour vision. There is a prejudice in ordinary philosophical uses of language against relational attributes and properties, and against attributes that do not stably possess determinate values. Yet there are perfectly good relational properties which, in virtue of their relativity, may be differently assigned to one and the same object at the same time. An example is the biological property of being nutritious. To be nutritious is to be usable in metabolism. The property of being nutritious is species relative. Wood is nutritious for termites, not for humans; that is, it possesses the property of being nutritious for termites, but does not have a nutritive property in relation to humans. Its being nutritious depends on its physico-chemical properties. These physico-chemical properties have effects on all sorts of things, and interact with other chemicals during metabolism. Being nutritious does not add anything to the chemical constitution of wood. Yet it is a property that wood might or might not have. If there could be no wood-eating animals, wood would not be an animal nutrient. It would not be altered physically by facts about its being or not being a nutrient. But it would have, or not have, a biological property.
Colour as an attribute of objects is analogous to the property of being nutritious, except that the effect it has on organisms has a mental component. Hence, I denominate colour a psychobiological attribute. It is a property objects have, in relation to perceivers, of being visually discriminable by phenomenal hue rather than lightness or brightness. (Notice that I take phenomenal hue, colour, or chromaticity as primitives, and do not try to define them in terms of something else; that is a characteristic of theories that make colour experience, or colour discriminatory capacities, theoretically primary.) Because colour properties are individuated in relation to perceivers, objects might be described under more than one colour name at the same time, in relation to various populations of seers. That is fine, because they have as many instances of the relational colour property as there are distinct classes of perceivers to which objects are related. Objects that may be assigned more than one colour name (e.g. they are yellow to certain dichromats but orange to trichromats) possess two (or more) distinct colour properties at the same time, depending on how many type-distinct classes of colour perceivers there are for whom they appear chromatically distinct. This does not, of course, imply that they have mutually exclusive properties (being yellow and being orange in the same respect) at one and the same time; they have as many different colour properties as there are types of perceiver in which they cause type-distinct colour responses. Moreover, if there were not (and could not be?) any chromatically endowed perceivers, there would be no colours. There would, of course, still be photons and reflectances.
The metaphysics of relational and dispositional properties is intricate (see Chapter 16 this volume, for an analysis). When I say that colour is a relational property that involves the disposition of objects to cause experiences of certain sorts in a population of perceivers, I am telling you what kind of property it is. I am not trying to capture ordinary language talk about colours. [Philosophical colour theories (see, for example, Jackson and Pargetter 1987; Johnston 1992) are often driven by ‘ordinary’ intuitions about property and causal talk, but such language has no particular authority in my view.] In particular, I am not trying to capture language about the causal relation between objects and colour experience, or about the notion of ‘property’ as distilled from ordinary talk of objects. My aim has been to locate the colour property within a biofunctional conception of the senses. Once the basic notion of colour as a psychobiological property is in place, there is no reason to preclude use of a notion of ‘physical colour’ that is independent of colour as a visual property. Visually, colour is a relational property involving both objects and perceivers. But we could also speak of ‘physical colour’ as a property of reflecting light according to a specific SSR. Even while granting that the relational notion of colour as a psychobiological property is primary, we might choose to develop a perceiver-independent notion of ‘physical colour’ as a means of describing the reflective properties of objects, or the spectral composition of light. To avoid confusion, it would be necessary to keep in mind that such physical colours would be defined without relation to colour experience or colour perception; they would be defined in a purely physical vocabulary of wavelength or photon vibration. 
Whatever language we choose for describing the physical properties of light and of surface reflectances, it is in virtue of its physical SSR that an object is able to affect light and produce a colour response in an observer. But the colours of objects cannot be reduced to or identified with SSRs. Rather, object colours are to be identified with properties objects have of causing colour experiences in perceivers. A physical SSR may help us identify this class, but using it alone, independent of the colour-discrimination capacities of organisms, we could not define real colours. There would be no physical reason for marking off the ‘visible spectrum’ or carving it into colour regions independent of the visual capacities of organisms. Colour is a perceiver-dependent property of objects.
Objectivity and subjectivity revisited

Hardin (1988) opposed his brand of subjectivism to the sort of objectivism espoused by Hilbert (1987), although Hardin in fact addressed earlier forms of the position, as in Armstrong (1961) and Smart (1961). The arguments of the various objectivists and subjectivists share a common conception of objectivity, according to which objectivity requires mind-independence. This conception of objectivity allows Hardin to argue that if there is no candidate colour property individuated by purely physical criteria independent of effects on perceivers, colour is not an objective property, but is wholly subjective or illusory. In my view this particular dichotomy of positions into objectivist and subjectivist relies on an overly coarse analysis of the notions of objectivity and subjectivity themselves.
The notion of objectivity is complex and many faceted. It can include at least the following aspects:
(1) pertains to a mind-independent reality;
(2) pertains to the object;
(3) sustains factual claims;
(4) pertains to publicly available states of affairs;
(5) is real.
Item (1) is often invoked in discussions of colour, but the other factors are important, too. Moreover, most or all of the other aspects are independent of (1). Although some philosophers still question whether there can be factual claims about mind-dependent or mind-supported states of affairs, such as the sensations, thoughts, and feelings of individual subjects, experimental psychology has been offering measurements of psychological states for more than 150 years. Of course, those psychologists who consider themselves to be determining the experiential sensory states of their subjects may be wrong, in the general sense that all science is fallible and not absolute. However, in what follows I will explore the implications of thinking that they are right.
The notion of the subjective is also complex and many faceted. It can include the following:
(A) is dependent on the mind alone (with no dependence on objects);
(B) pertains to the subject;
(C) varies idiosyncratically (no intersubjective agreement);
(D) pertains to experiential, private states of affairs;
(E) is not real.
The root notion of ‘subjective’ is that it pertains to the subject (B), which need not entail that it depends on the mind alone (A). A feeling of hunger pertains to the subject and involves a mental state, but it may depend on the state of the digestive system and blood chemistry. Students who accuse professors of ‘subjective grading’ have aspect (C) in mind. Aspect (D) is sometimes thought to preclude intersubjective knowledge of a subjective state, but that depends on what grounds there might be for inferences across subjects. It is sometimes suggested that something wholly mind-dependent ‘is not real’ or does not belong to the world (E). On the other hand, one might argue that minds (or brain-dependent experiential mental states) exist and so must belong to the world—that is, must be real.
(Indeed, even dualists such as Descartes typically thought of the mind as existing in the natural world, and hence did not exclude dualistically conceived mental states from the ‘reality’ of the natural world; see Hatfield 2000.) Colour as a psychobiological property of objects is ‘objective’ in senses (2) to (5). It lacks only (1), mind-independence. But even if (1) is denied, we can retain (2) to (4), which allow a robust notion of objectivity. Items (2) to (4) include pertaining to the object, sustaining factual claims, and pertaining to publicly available states of affairs. They permit a notion of objectivity including publicly available facts. I like item (5) as well; even though the
relational notion of colour depends on mental experiences for its paradigm statement (in the case of human beings), one might well assert that human phenomenal experience is none the less ‘real’ (i.e. a part of the world). Colours as relational properties of objects are objective in that they:
(2) pertain to the object;
(3) sustain factual claims;
(4) pertain to publicly available states of affairs;
(5) are real.
But this is not inconsistent with their:
(A′) being dependent on the mind, because attributed relative to effects on experience;
(B) pertaining to (an experiential effect on) the subject.
(A′) is rewritten from (A) to make explicit that mind-dependence can include relations to extra-mental or extra-brain states of affairs. Even when colour is defined in relation to phenomenal experience, then, it has elements of both objectivity and subjectivity. It is subjective in senses (A′) and (B), but not (E). As regards (C), some intersubjective variation occurs, but it often (and increasingly with the growth of knowledge) can be explained in a systematic fashion by taking into account physiological differences among subjects. Sense (D) should be divided. Colour defined in relation to experience is subjective in sense (D′): the experiences of individuals are ontologically private, that is, a given instance of a colour experience can be ‘had’ by only one person. But it need not be, and typically is not, subjective in sense (D′′): epistemically private. Third parties can make reasonable claims about someone else’s colour experience, arguing from analogy with their own experience (and, if needed, pointing to species-shared biological characteristics). Hence, the subjectivity of colour experience in senses (A′), (B), and (D′) is not inconsistent with the public availability of colour as a species-relative property.
Culture, naming, and property specificity
Culturally, we have exploited the chromatic sensitivity of our visual systems to develop finely divided colour categories, and we exploit visual sensitivity to use colour in systems of identification and contrast, which we rely on for many practical purposes. Colour coding is used in medical and engineering contexts where life-or-death outcomes depend on colour discrimination. Artists and decorators rely on the availability of stable, reproducible paints and dyes exhibiting a highly specific hue under a range of conditions. Such scientific and cultural uses of our abilities for fine-grained colour discrimination have led some to mistakenly concretize the colour names as well-behaved colour predicates for which we should expect to find a corresponding mind-independent physical property in the world. This has resulted in misplaced demands on candidate colour ‘properties’, as in expectations of transitivity of colour matches, excessively stable possession of determinate colour values, and so on.
These are unreasonable expectations about colour, which may come from supposing that if colour is to be a property it must be a mind-independent property and behave like a physically measurable state of an object, taken in isolation. Such unreasonable demands on analyses of colour as a property can be avoided by recognizing that:
• Colour as an experience is a way our visual system presents objects.
• Colour as an attribute of objects is defined in relation to the ways objects produce in us representations of their surfaces, discriminable by hue class.
• Biologically, colour attributes are broadly tuned dispositional relational attributes of objects.
Not every property is a physical property. The property of being nutritious is not. Neither is colour. They are both biofunctional properties. Colour, as a property defined in relation to phenomenal experience or psychological discriminatory capacities, is a psychobiological property. As such, its basis may be found in the relation of subjects to objects. It is in relevant respects both subjective and objective. As explained, there need be no paradox in that.
Acknowledgements
Earlier versions of this paper were presented at the Conference on Color Science and Philosophy, Institute for Research in Cognitive Science, University of Pennsylvania (April, 1994), to the Philosophy Colloquium, City University of New York Graduate Center (April, 1996), and in the colloquium series Perception and Evolution, ZiF, Bielefeld (May, 1996). I am grateful to members of the audiences on those occasions for their interest and conversation. I am indebted to Yumiko Inukai for helpful criticism of a recent version.
References
Allen, G. (1879). The colour-sense: Its origin and development: An essay in comparative psychology. Trübner, London.
Armstrong, D. M. (1961). Perception and the physical world. Humanities Press, New York.
Barlow, H. B. (1982). What causes trichromacy? A theoretical analysis using comb-filtered spectra. Vision Research 22, 635–644.
Boghossian, P. A. and Velleman, J. D. (1989). Colour as a secondary quality. Mind 98, 81–103.
Boghossian, P. A. and Velleman, J. D. (1991). Physicalist theories of color. Philosophical Review 100, 67–106.
Campbell, J. (1993). A simple view of colour. In Reality, representation, and projection (ed. J. Haldane and C. Wright), pp. 257–268. Oxford University Press, New York.
Goldsmith, T. H. (1990). Optimization, constraint, and history in the evolution of eyes. Quarterly Review of Biology 65, 281–322.
Hardin, C. L. (1988). Color for philosophers: Unweaving the rainbow. Hackett, Indianapolis.
Harman, G. (1996). Explaining objective color in terms of subjective reactions. In Perception (ed. E. Villanueva), Philosophical Issues Series no. 7, pp. 1–17. Ridgeview, Atascadero, CA.
Hatfield, G. (1988). Representation and content in some (actual) theories of perception. Studies in History and Philosophy of Science 19, 175–214.
Hatfield, G. (1991). Representation in perception and cognition: Connectionist affordances. In Philosophy and Connectionist Theory (ed. W. Ramsey, D. Rumelhart, and S. Stich), pp. 163–195. Lawrence Erlbaum, Hillsdale, New Jersey.
Hatfield, G. (1992). Color perception and neural encoding: Does metameric matching entail a loss of information? In Philosophy of Science Association 1992 (ed. D. Hull and M. Forbes), vol. I, pp. 492–504. Philosophy of Science Association, East Lansing, MI.
Hatfield, G. (1995). Philosophy of psychology as philosophy of science. In Philosophy of Science Association 1994 (ed. D. Hull, M. Forbes and R. Burian), vol. 2, pp. 19–23. Philosophy of Science Association, East Lansing, MI.
Hatfield, G. (1999). Mental functions as constraints on neurophysiology: Biology and psychology of vision. In Where biology meets psychology: Philosophical essays (ed. V. Hardcastle), pp. 251–271. MIT Press, Cambridge, MA.
Hatfield, G. (2000). Descartes’ naturalism about the mental. In Descartes’ natural philosophy (ed. S. Gaukroger, J. Schuster, and J. Sutton), pp. 630–658. Routledge, London.
Hilbert, D. R. (1987). Color and color perception: A study in anthropocentric realism. Center for the Study of Language and Information, Stanford.
Hilbert, D. R. (1992). What is color vision? Philosophical Studies 68, 351–370.
Jacobs, G. H. (1993). Distribution and nature of colour vision among the mammals. Biological Review 68, 413–471.
Jacobs, G. H. (1996). Primate photopigments and primate color vision. Proceedings of the National Academy of Science, USA 93, 577–581.
Jackson, F. and Pargetter, R. (1987). An objectivist’s guide to subjectivism about colour. Revue Internationale de Philosophie 41, 127–141.
Johnston, M. (1992). How to speak of the colors. Philosophical Studies 68, 221–263.
Kaiser, P. K. and Boynton, R. M. (1996). Human color vision (2nd edn). Optical Society of America, Washington, DC.
Lythgoe, J. N. (1972). Adaptation of visual pigments to the photic environment. In Handbook of sensory physiology, VII.1, Photochemistry of vision (ed. H. J. A. Dartnall), pp. 566–603. Springer-Verlag, Berlin.
Lythgoe, J. N. and Partridge, J. C. (1991). Modelling of optimal visual pigments of dichromatic teleosts in green coastal waters. Vision Research 31, 361–371.
Maloney, L. T. and Wandell, B. A. (1986). Color constancy: A method for recovering surface spectral reflectance. Journal of the Optical Society of America A 3, 29–33.
Matthen, M. (1988). Biological functions and perceptual content. Journal of Philosophy 85, 5–27.
Mausfeld, R. (1998). Color perception: From Grassmann codes to a dual code for object and illumination colors. In Color vision: Perspectives from different disciplines (ed. W. Backhaus, R. Kliegl and J. S. Werner), pp. 219–50. de Gruyter, Berlin.
McFarland, W. N. and Munz, F. W. (1975). Evolution of photopic visual pigments in fishes. Vision Research 15, 1071–1080.
Mollon, J. D. (1989). ‘Tho’ she kneel’d in that place where they grew . . .’ The uses and origins of primate colour vision. Journal of Experimental Biology 146, 21–38.
Polyak, S. L. (1957). The vertebrate visual system. University of Chicago Press.
Shepard, R. (1992). The perceptual organization of colors: An adaptation to regularities of the terrestrial world? In The adapted mind: Evolutionary psychology and the generation of culture (ed. J. H. Barkow, L. Cosmides, and J. Tooby), pp. 495–532. Oxford University Press, New York.
Smart, J. J. C. (1961). Colours. Philosophy 36, 128–142.
Thompson, E. (1995). Colour vision. Routledge, London.
Walls, G. L. (1942). The vertebrate eye and its adaptive radiation. Cranbrook Institute of Science, Bloomfield Hills, MI.
Wandell, B. A. (1995). Foundations of vision. Sinauer, Sunderland, MA.
Commentary on Hatfield
Why is this game still being played?
Paul Whittle
Hatfield offers a nuanced biofunctional treatment of colour, characterizing it as a phenomenon of the interaction of organism and world, and hence being in some respects objective and in some, subjective. His listing of multiple criteria for ‘objectivity’ and ‘subjectivity’ is particularly helpful. A few years ago when I was working on colour in a university department of Experimental Psychology, his treatment would have seemed to me unexceptionable. However, having in the meantime severed those institutional ties, I find myself more resistant to it. My principal reaction now is to query this whole genre of philosophical discussions of colour. Why are we (particularly philosophers) still caught up in the problem of objectivity versus subjectivity? We find puzzles about the appearance or reality of colour as far back as Plato’s Timaeus, and they were given a crucial role in the foundations of modern scientific thought by Galileo’s partitioning of perceived qualities into primary and secondary. But although we are all interactionists now, in one way or another, whatever our preferred label (even Hardin allows ‘some foundation in reality’), we seem unable to let the matter rest. Scientists are as divided in their opinions as philosophers, although in my experience their work is remarkably independent of their metaphysics, and when necessary they set up dual definitions of, for example, ‘psychophysical’ and ‘psychological’ aspects of colour (e.g. Wyszecki and Stiles 1967, p. 229).
Consider a perceptual domain which is in some ways comparable: vowel quality. In both colour and vowel quality, advances in technology, in pigments and speech synthesis respectively, have allowed a multidimensional continuum to be set up.
Gage (1993) describes one instance of the former when in the late Middle Ages the development of oil paint lifted the taboo on mixing, and manuals of painting switched from talking primarily in terms of individual pigments to talking in terms of hues. In both domains, neuroscience confronts problems of transduction, encoding, analysis, and recognition. In both there are the same questions about categorization: the relative roles of biology and culture, the fuzziness of the boundaries, the relevance of ideal types (prototypes or focal instances). Yet in the case of vowel quality, there is not, as far as I know, the slightest interest in whether it is ‘objective’ or ‘subjective’. To everyone concerned it is obvious that vowel sounds have physical characteristics, and that the subject’s role in perceiving them is of paramount importance. To argue about objectivity or subjectivity would be felt as verbal time-wasting. Psycholinguists and phoneticians have more compelling things to do. There are, of course, some differences between the domains. First, vowels are produced by humans: speaking is as important as hearing. In speech science the two sides anchor and structure each other, sometimes intimately, as in the analysis-by-synthesis theory of speech perception. Theorists of speech perception are much less likely to lose sight of action than in the case of vision. Secondly, there is a context not only of action but of performative use: speech. The scientific questions centre around communication. Accordingly there is very little interest in ‘what it is like’ questions, in vowel ‘qualia’. It would seem absurd to regard the experience of a vowel sound as constituting its essence, whereas the corresponding claim for colour is still common in philosophical discussion. (Hatfield does not talk of essences, but he does give prominence to experience.) Thirdly, speech sounds are transient events, embedded in a flow of interaction. 
They do not hang around to be contemplated, in the way that coloured objects generally do. This allows the experience of colour to be more prominent, because as Merleau-Ponty remarks, ‘The quality, the separate sensory impact occurs . . . when instead of living
the vision, I question myself about it, . . . in order to catch and describe it’ (Merleau-Ponty 1962, p. 227). So, in the light of this example, can we say any more about why contemporary philosophers are still so concerned to place colour along an objective–subjective dimension? I have only a few suggestions. First, both poles are very prominent in experience. Colour is the most obviously ‘objective’ of the ‘secondary’ qualities. Your food on the plate may look tasty but it is always coloured. And on the other hand, as I’ve said, we attend surprisingly much to the ‘appearances’ of colour. Secondly, there is the two millennia of ‘historical ontology’, with its associated metaphysical and theological baggage. To the primary/secondary distinction already mentioned, one could add the importance of the rainbow, that awe-inspiring spectral ‘illusion’, whose gradual mathematical elucidation (by Roger Bacon, for example) was one of the corner-stones of optics, which in turn was a paradigm of the new mathematical sciences (partly stimulated, note, by a ‘secondary’ quality). But although there is this and more in the background, I remain at a loss to understand what is driving the contemporary discussions. I find it hard to see what the excitement is about, other than to extrapolate from my own experience to the cynical view that they are caught in a professional conversation with its ‘secondary gains’, or in Merleau-Ponty’s more dismissive phrase, are ‘bogged down in a scholastic’. With perhaps also the perpetual optimism, which I am familiar with in different contexts, that if only we can crack this crucial case, many others will fall into line. It might be thought that these grumbles reflect a scientist’s impatience with philosophy per se. Quite the opposite. 
They are aggravated by my conviction that at present philosophy (together with history and anthropology) has more to teach us about colour and perception than does experimental science, which seems to me limited by a too restrictive conceptual framework. The science of perception needs philosophy just now, and philosophy has much to offer. I am thinking, for example, of the dissolution of encrusted dualisms, which we see in the current re-evaluation of the American Pragmatists, in the analytical philosophy of Wittgenstein, Sellars, Davidson, and others, and in the European traditions that seek to recover the life-world and embodiment. The same three strands appear in the work of Brandom (2000), whose inferentialism allows us to open up the question, neglected by cognitive science under the influence of its ‘modular’ concept of cognition, of what might be specific to human perception. These are but two examples of the resources in contemporary philosophy for thinking afresh about perception. It has been my dawning awareness of these resources that has made me impatient with philosophy that seems to be playing old tunes with only minor variations. It seems to me not to be examining its concepts radically enough, and to be too in thrall to aspects of science which themselves need conceptual overhaul.
References
Brandom, R. (2000). Articulating reasons. Harvard University Press, Cambridge, MA.
Gage, J. (1993). Colour and culture. Thames & Hudson, London.
Merleau-Ponty, M. (1962). Phenomenology of perception (translated by C. Smith). Routledge and Kegan Paul, London.
Wyszecki, G. and Stiles, W. S. (1967). Color science. Wiley, New York.
Chapter 7
A COMPUTATIONAL ANALYSIS OF COLOUR CONSTANCY
Donald I. A. MacLeod and Jürgen Golz
Preface
In the field of colour vision, sharp certainties about the earliest processes at the photoreceptor level shade off rapidly into foggy confusion about almost everything else. The conditions for two patches of light to match when they are presented in the same context have long been well understood. But when the illumination on a scene changes, we have limited quantitative data and no complete theoretical models for how the changing context affects perception. The basis for colour constancy in particular—the relative stability of surface appearance under changes of illumination that drastically affect the retinal stimulus—is still uncertain. Like many others who have attempted to bring clarity to the confusion, we proceeded by considering quantitatively defined mappings in the three-dimensional space of photoreceptor excitations. It has been clear for some time that changes of illumination alter photoreceptor excitations generated by elements of the scene in complex ways. Part of our project was to investigate these changes numerically, initially using spectroradiometric data from natural surfaces measured in San Diego by Richard Brown (Chapter 8 this volume). The resulting analysis is ‘computational’, both in the ordinary sense and in Marr’s sense, in that it investigates what would be required to achieve colour constancy under natural conditions. The complexity of the interplay between surface reflectance and illumination has customarily been analysed (following the work of Brill, Maloney, and Wandell, and others over the past 25 years) by considering the various natural surface reflectance functions and illuminant spectral power distributions as weighted sums of a diverse but fixed set of basis functions (what we call the ‘Linear World’).
This captures the fact that when the illumination changes in colour, the photoreceptor excitations generated by different elements of the scene may change by different factors, a fact that appears to preclude recovery of surface colour by a simple illuminant-compensating normalization of the excitation values for each cone independently (the ‘von Kries law of coefficients’, Chapter 3 this volume) in the way that could work if surface reflectance or illuminant energy were confined to three wavelengths. Here we introduce a competing simple idealization for surface-illuminant interplay (the Gaussian World) where that interplay imposes a simple algebraical form on the illuminant-dependent changes of the colour code distribution. The intellectual environment of the ZiF project provided an encouraging forum for the elaboration of these ideas. Surprisingly, it turns out that chromatic values in the Gaussian World would be accurately corrected by a von Kries normalization, leaving only a failure of lightness constancy reflecting the fact that under red light, the stimuli provided at the eye by reddish surfaces become more intense relative to other elements within the scene. This theoretically expected luminance–colour correlation could be useful for solving what Alan Gilchrist has called the ‘anchoring problem’ in colour vision, helping the observer distinguish between a reddish scene under neutral light and a neutral scene under reddish light.
We found experimental support for this when Jürgen Golz was able to come from Kiel for a year in San Diego. The idea that statistics of the distribution of the input colour codes can provide diagnostic information about illumination was already advocated long ago by Forsyth (whose focus on the form of the gamut was helpful in guiding our own thinking), and (in a very different line of thought) by Maloney and Wandell. The current work of Mausfeld and Andres on the significance of chromatic variance in the image is a second experimental example. One of us (Jürgen Golz) found support (not experimentally though) for the relevance of a reduced chromatic variance for avoiding embarrassing situations: one night he picked up from a drugstore photos that he had turned in for development. On the way back to his car he opened the envelope to look at the pictures. Much to his dismay, he realized that the laboratory had made a mistake and had developed all photos in black and white instead of colour. So he went back to the drugstore and complained. But when the lady behind the counter opened the envelope, everything was in brilliant colours and she asked: ‘You are colour blind, aren’t you?’ When he replied ‘No, I am not. But I am a colour scientist and I already have an idea what’s wrong outside at the parking lot’, she looked as if she thought he might be missing more than merely some cones in his retina. What he had not known is that all public lighting in the San Diego area uses sodium light so that the nearby Palomar Observatory can filter this portion of stray light. Under such narrow-band illuminants all surfaces reflect this spectral light only in different intensities, compressing the gamut of the retinal chromaticities to a point on the spectral locus (as discussed on p. 216). He simply never imagined encountering that extreme case outside the lab. Discussions of colour constancy can be divided into mechanistic and cue-based accounts. 
Because we consider colour constancy as a definite and simple mapping from stimulus to perceived chromatic values, our discussion has a superficially mechanistic flavour and might encourage what Mausfeld (Chapter 13 this volume) calls the ‘measurement device’ conception of perception. Yet, in reality, even the von Kries normalization can hardly—when applied to whole scenes—have the simple basis that it probably does in situations such as Whittle’s experiments (Chapter 3 this volume), where local contrast, extracted as a sensory primitive, appears to determine perception. Moreover, the later incorporation of scene statistics transforms our discussion into an explicitly cue-oriented account. Surely the union of these two approaches is ‘a consummation devoutly to be wished’, but in the meantime, our analysis is best regarded as purely descriptive. D. I. A. MacLeod, J. Golz
Introduction
The visual stimulus associated with an object surface varies with the illumination falling on the object. Constancy of perceived surface colour under changing illumination implies that the illumination-dependent mapping from surface reflectance to stimulus is cancelled by a compensatory variation in the mapping from retinal stimulus to perceived surface colour. To accomplish this compensation, the visual system must in some sense internalize, or at least exploit, the regularities of the environmental interaction between observed surfaces and the incident light, as these jointly determine the visual stimulus. In this chapter we begin by asking: what kinds of mappings from cone excitation triplets to surface colour appearance are required to make the latter constant under changes of illumination? This is a question not about the visual process itself, but about the environmental interplay between lights and surfaces, for which the visual system must compensate if it is to achieve colour constancy. In modern parlance (Marr 1982; Hurlbert 1998), it is a computational issue. We discuss it first for various idealized worlds that are more or less mathematically tractable (p. 207 and Appendix to this chapter), and next for the real world (p. 216). These investigations reveal the nature of the compensation that a colour-constant visual system should perform in each world. We next consider (p. 222) what mappings the visual system actually makes in the interests of colour constancy, and how these differ from the optimal ones. Appropriate compensation for the illuminant presupposes that the illuminant is appropriately estimated, and in the last section (p. 224) we turn from the compensation problem to the equally critical (and perhaps more challenging) estimation problem that a colour-constant visual system must surmount. What stimulus information is available to estimate the illuminant and thereby select the appropriate compensatory mapping?
Among many visible effects of changing illumination we concentrate here on one class of cue: the information inherent in the illumination-dependent mapping itself. We use the results of the first two sections of the chapter to investigate how the statistics of the cone excitations associated with individual scene elements can help in estimating the illuminant, and we present experimental results that suggest that certain statistical sources of information are indeed exploited in solving the estimation problem.
The compensation problem: ideal worlds and their illumination-dependent mappings
Vision generally depends on light that is incident on object surfaces, gets reflected toward the eye, and is absorbed in the retinal photoreceptors. At each wavelength in the visible spectrum, the light that gets to the eye is the product of:
(1) what is incident on the surface, represented by the spectral power distribution (SPD) of the incident illumination, I(λ) (which is constrained, but not strictly determined, by the intensity and colour of the light source); and
(2) the fraction reflected by the surface, given by the surface’s spectral reflectance function S(λ).
Thus the mapping from local surface reflectance to the associated retinal stimulus (a triplet of cone excitations) is illumination-dependent. To achieve an illumination-invariant representation of surface colour, the visual system must introduce a compensatory (hence also illumination-dependent) mapping from retinal stimulus to perceived colour. Can this be done—and, if so, what is the nature of the required compensatory mapping? We investigate this question for four quite different idealizations of the visual environment, discussing these in turn in the context of a very simple proposal about the nature of the visual system’s compensatory mapping.
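The product of illuminant and reflectance described above, and the resulting illumination dependence of the cone excitation triplet, can be illustrated with a minimal numerical sketch. Everything specific in it is an assumption made for illustration: the Gaussian-shaped cone sensitivities, the two illuminant spectra, and the surface reflectance are hypothetical stand-ins, not measured data, and the names `cone_excitations`, `flat_light`, `reddish_light`, and `surface` do not come from the chapter.

```python
import numpy as np

# Wavelength samples across the visible spectrum (nm); the 5 nm step is an assumption.
wavelengths = np.arange(400.0, 701.0, 5.0)

def gaussian(center, width):
    """Smooth bump used to build the hypothetical spectra below."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

# Hypothetical cone sensitivities (rows: L, M, S); not measured cone fundamentals.
cone_sensitivities = np.stack([gaussian(565, 45), gaussian(540, 40), gaussian(445, 25)])

def cone_excitations(illuminant_spd, reflectance):
    """Sum sensitivity * I(lambda) * S(lambda) over wavelength for each cone class."""
    stimulus = illuminant_spd * reflectance   # light reaching the eye at each wavelength
    return cone_sensitivities @ stimulus

flat_light = np.ones_like(wavelengths)                    # equal-energy illuminant
reddish_light = np.linspace(0.4, 1.6, wavelengths.size)   # more power at long wavelengths
surface = 0.2 + 0.6 * gaussian(600, 40)                   # hypothetical reddish surface

print(cone_excitations(flat_light, surface))
print(cone_excitations(reddish_light, surface))
```

The same surface yields two different excitation triplets under the two lights, which is the illumination dependence that a compensatory mapping would have to undo.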
Normalization
The effects of changes in the intensity of the illuminant could, in principle, be compensated by a reciprocal adjustment of sensitivity, for instance at the photoreceptor level. It has proved attractive to extend this idea to changes in the colour balance of the illuminant (e.g. Helmholtz 1896; Land and McCann 1971). A change in the colour balance of the illuminant might be signalled effectively by its effect on the predominant colour of a scene, since it affects the chromaticity of every surface lit by the source, and it might be possible to compensate for these effects by applying reciprocal sensitivity adjustments selectively to different regions of the spectrum. The idea that constancy is implemented by operations equivalent to a scaling of sensitivity to incident light is a recurring one in both current and earlier discussions of the topic, and it will recur in our discussion here. To avoid premature implications about the physiological mechanisms responsible, we refer to it as the normalization model.1 Normalization, then, could be implemented either by a scaling of sensitivity to light at the photoreceptor level, or by later, and perhaps much more complex, processes, so long as these processes finally change colour appearance in the same way that photoreceptor sensitivity scaling would. The term ‘normalization’ might suggest that the change in effective sensitivity is reciprocal with some measure of the cone excitation generated by the observed scene—a measure that defines an implicit estimate of the illuminant, the effects of which are cancelled by the normalizing compensation. But a compensatory normalization could be associated with any illuminant estimate, no matter how the estimate is derived, and no matter whether it is accurate or not. Thus the normalization factors need not be reciprocally related to stimulus intensity measures, nor need they each depend only on the excitation of the photoreceptor to which they apply.
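As a rough illustration of the normalization model, the following sketch scales each cone signal by the reciprocal of an illuminant estimate. The function name `von_kries_normalize`, the excitation values, and the choice of the scene mean as the estimate are all assumptions made for this example; as the text stresses, the estimate could be derived in any other way and need not be accurate.

```python
import numpy as np

def von_kries_normalize(cone_excitations, illuminant_estimate):
    """Scale each cone signal by the reciprocal of the estimated illuminant's cone signal."""
    cone_excitations = np.asarray(cone_excitations, dtype=float)
    illuminant_estimate = np.asarray(illuminant_estimate, dtype=float)
    return cone_excitations / illuminant_estimate

# Hypothetical (L, M, S) excitations for two scene elements under some light.
scene = np.array([[12.0, 9.0, 3.0],
                  [6.0, 6.0, 4.0]])

# One possible estimate (an assumption): the mean excitation over the scene.
# Any other estimate, accurate or not, could be passed in instead.
estimate = scene.mean(axis=0)

print(von_kries_normalize(scene, estimate))
```

With the scene mean as the estimate, tripling the intensity of the light triples both the excitations and the estimate, so the normalized values are unchanged; this is the reciprocal sensitivity adjustment in its simplest form.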
The normalization model has a clear rationale if the photoreceptors each respond to a band within the visible spectrum so narrow that illuminant power or surface reflectance is always effectively constant across the band (a situation we consider below). In the real visual system, however, the spectral bands to which the three types of cone photoreceptor are sensitive are not very narrow (for good reason, since making them narrow would severely impair visual sensitivity by neglecting all incident light energy outside the sensitive band). It has been emphasized increasingly (e.g. Brill 1978; Buchsbaum 1980; Worthey and Brill 1986; Hurlbert 1998; Maloney, Chapter 9 this volume) that determining the colour of a surface under changing or unknown illumination is a difficult problem for a visual system with broad-band photoreceptors; so difficult, in fact, as to be, in general, unsolvable.
1 Hurlbert (1998) calls such models ‘lightness models’. The normalization factors associated with each cone excitation are often referred to as ‘von Kries coefficients’.
The Chaotic World The nature of the problem can be appreciated by considering two metameric yellow illuminants, visually similar but physically different. Suppose that each supplies light in only two narrow spectral bands, one red and one green, but that the bands supplied by the two light sources are slightly offset from one another, by enough that they do not overlap. Denote the bands by R1 and G1 for illuminant 1 and R2 and G2 for illuminant 2. Imagine each of these sources in turn lighting the following two yellow surfaces. One surface (which we may call surface R1G2), has a spectral reflectance that takes non-zero values only in the bands R1 and G2. The other yellow surface, surface R2G1, reflects only in the other two bands. Surface R1G2 is physically red under source 1: it only reflects the red band R1 of radiations to the eye, and not the second, green band, G1, supplied by source 1. Surface R2G1, on the other hand, turns green under source 1, because only its green reflection band G1, and not the red one R2, is present in the spectrum of source 1. The situation is completely reversed under source 2. Now surface R2G1 reflects red light only, and surface R1G2 reflects only green. This example is enough to show how there are, in general, no constraints at all on how a change of light source will change the colour of a surface: any set of colours can be transformed into any other set. In this example the two surfaces, as well as the two illuminants, can be indistinguishable yellows under a white light source (because the reflection bands can be arbitrarily narrow and correspondingly closely spaced), but they can assume any two chromaticities, limited only by the chromaticities of their reflection bands, when a suitably contrived illuminant is switched on to light them. 
The chromaticities they adopt can span the physically realizable gamut, and the changes due to changing the illuminant are independent for different surfaces, in the sense that any set of surface colours can be transformed into any other set. Such chaotic colour changes would defeat any attempt to achieve colour constancy by any systematic remapping of retinal stimuli on to subjective appearances. If a change of light source could be relied on to transform the intensity and chromaticity of retinal stimuli in a more or less orderly way, the visual system might be able to keep the surface colours constant by reversing that orderly transformation or mapping. But you can’t create order by unscrambling total chaos. So colour constancy is not in general possible! But of course, colour constancy does occur, because the Chaotic World of this scenario is not the real world we live in. This raises the question: under what real-world constraints would a simple compensating process, such as normalization through sensitivity adjustments, give us colour constancy?
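The two-illuminant, two-surface example can be checked with a toy calculation. The sketch below represents each spectrum by unit power in four narrow bands; the band names follow the text, while the helper names and the use of unit values are assumptions made for illustration.

```python
import numpy as np

BANDS = ["G1", "G2", "R1", "R2"]

def spectrum(*active_bands):
    """Unit power (or reflectance) in the named bands, zero elsewhere."""
    return np.array([1.0 if band in active_bands else 0.0 for band in BANDS])

illuminant_1 = spectrum("R1", "G1")
illuminant_2 = spectrum("R2", "G2")
surface_r1g2 = spectrum("R1", "G2")
surface_r2g1 = spectrum("R2", "G1")

def reflected_bands(illuminant, surface):
    """Names of the bands in which any light reaches the eye."""
    signal = illuminant * surface
    return [band for band, power in zip(BANDS, signal) if power > 0]

for name, light in [("illuminant 1", illuminant_1), ("illuminant 2", illuminant_2)]:
    print(name,
          "R1G2 ->", reflected_bands(light, surface_r1g2),
          "R2G1 ->", reflected_bands(light, surface_r2g1))
```

Under illuminant 1 surface R1G2 returns only a red band and surface R2G1 only a green band; under illuminant 2 the roles are exchanged, which is the reversal described above.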
The Three-band World The answer to this question depends on whether we require perfect colour constancy, or merely a useful approximation to constancy. For perfect or nearly perfect constancy through normalization, some extremely artificial conditions must be satisfied (Worthey and Brill 1986). First, the three cone types must respond to non-overlapping bands in the illuminant spectra. This will happen if the cone spectral sensitivities have no overlap, or if the illuminant energy is always confined to discrete non-overlapping bands, with one cone type sensitive to each such band. Secondly, illuminants must never vary significantly in their relative distribution of power across any one such band; they can vary only in the amounts of power they radiate in the three bands.2 Under this ‘three-band’ scenario, the cone excitations for each cone type vary in proportion to the total illuminant power within the corresponding spectral band, and this variation can be compensated—for all surfaces at once—by a reciprocal adjustment of sensitivity appropriate to the scene as a whole. Thus the Three-band World exemplifies what we call normalization-compatible mapping: the possible illumination-dependent mappings from surface reflectance to cone excitations differ only in the scaling factors applied to the three cone excitations. A convenient way of picturing the effects of changes in intensity and in colour balance of the illuminant is to consider a logarithmic cone excitation space in which three coordinates represent the excitations of the long-, mid-, and short-wavelength cones on a log scale. The position of the point representing any surface moves parallel to the positive diagonal under variations in illuminant intensity. If the colour of the illuminant changes, then surfaces move in the two orthogonal, ‘chromatic’ directions as well. 
But in the Three-band World, the constellation of points representing a set of surfaces in a scene moves rigidly under all changes in illuminant. This happens because a change of illuminant changes the excitation of, say, the long-wavelength cones by exactly the same factor for all surfaces in the scene: the factor by which the illuminant power in the red band is changed. Each cone excitation is scaled by the power being supplied within its associated spectral band. The logs of these three cone-specific scaling factors are added to the corresponding coordinate in the logarithmic plot. If we assume that a change of illuminant not only changes the average cone excitation within the scene but also triggers a reciprocal adjustment of sensitivity in each cone type, then the changes of log sensitivity would exactly cancel the change of log cone excitation for each surface, and the resulting normalized excitations would provide, for each surface, a representation independent of the illuminant colour. Geometrically, in log cone excitation space, the constellation of points representing the surfaces would be translated back in all three dimensions and would be brought back to where it started.
There are many objections, mainly of a non-computational character, to the adoption of reciprocal adjustments of photoreceptor sensitivity as a model for constancy (e.g. MacLeod 1985). A couple of these merit at least brief acknowledgement, prior to leaving them aside for the moment. First, cone sensitivity regulation is mainly local (MacLeod et al. 1992) rather than being based on excitation averaged over a scene. However, local sensitivity regulation can also produce an illumination-invariant representation, but of the local contrast at edges within the scene, not of local lightness and colour directly. In this way it can form the initial basis of a ‘retinex’ model for colour constancy (Land and McCann 1971). Secondly, the model normalizes out variations of lightness and colour balance inherent to the surfaces of a scene, as well as those due to the illuminant. We return briefly to this deficiency in discussing the ‘Grey World assumption’ (p. 225). In the context of a computational analysis, a more serious embarrassment is that the three-band scenario is hardly defensible as even a crude approximation to the real world. Obviously, we need to consider the possibilities for constancy in a slightly more natural scenario, in which cones, and especially surfaces and illuminants, have fairly broad and smooth spectral characteristics. Any principles of illumination-dependent mapping found to hold in such a world are more likely to be applicable in the real one.
2 Alternatively, but still less plausibly, the surface reflectances, rather than the illuminant power spectral densities, could be uniform in this sense for each band.
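The rigid translation in log cone excitation space that characterizes the Three-band World can be illustrated numerically. In this sketch each surface is reduced to one reflectance value per band and each illuminant to one power value per band; the random reflectances and the two sets of band powers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reflectances of five surfaces in the three bands,
# one band per cone type (columns: long, mid, short).
band_reflectance = rng.uniform(0.05, 0.95, size=(5, 3))

# Two hypothetical illuminants, described only by their power in each band.
light_a = np.array([1.0, 1.0, 1.0])
light_b = np.array([2.0, 1.2, 0.5])

def log_cone_excitations(reflectance, band_power):
    """Log10 excitation of each cone type for each surface."""
    return np.log10(reflectance * band_power)

shift = (log_cone_excitations(band_reflectance, light_b)
         - log_cone_excitations(band_reflectance, light_a))

# Every surface is displaced by the same vector: the log of the band-power ratio.
print(np.allclose(shift, np.log10(light_b / light_a)))

# Subtracting that common vector (a reciprocal sensitivity adjustment in log
# terms) returns the constellation to where it started.
restored = log_cone_excitations(band_reflectance, light_b) - np.log10(light_b / light_a)
print(np.allclose(restored, log_cone_excitations(band_reflectance, light_a)))
```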
The Linear World In the search for order in the chromatic universe, the approach usually taken is to approximate naturally occurring spectral reflectance functions and spectral power distributions by a weighted sum of suitably chosen basis functions, the latter forming a fixed set common to all spectra (e.g. Sällström 1973; Brill 1978; Buchsbaum 1980; Maloney, Chapter 9 this volume). To account for trichromatic perception of surface colour, we need, minimally, three basis functions for surface reflectance. Schemes like this that use linear models of reflectance and power distributions have been investigated extensively. One of the first things to emerge is that the kind of normalization that worked for constancy in the three-band scenario won’t do at all, in principle, for the Linear World. This is not surprising, since it is only in the hopelessly artificial Three-band World that the cone excitations from different surfaces all change by the same factor when the light changes. Yet illumination-dependent mapping in the Linear World remains formally fairly simple. Each of the three surface reflectance components generates an illumination-dependent triplet of cone excitations (a cone excitation vector). Each such cone excitation vector is the weighted sum of N_i fixed vectors contributed by the N_i possible illuminant components: the fixed vectors are the cone excitation triplets resulting from the interaction of the illuminant component (in unit quantity) with the surface component under consideration, and the weights are determined by the particular illuminant’s composition. 
The visual system can recover an illumination-invariant representation of surface colour (its composition in terms of the three basis functions for reflectance) if it multiplies the cone excitation triplet by the inverse of an illuminant-specific 3 × 3 matrix that contains the cone excitation vectors that would be generated by unit values of each surface component under the prevailing illumination. The approximation of natural reflectance functions with a combination of three basis functions might not be considered satisfactory (it accounts for only 60% of the chromatic variance in a haphazard sample of natural surface reflectance functions measured spectroradiometrically by Brown, Chapter 8 this volume), but the Linear World is at least a
great improvement over the Three-band World in this respect, and as a result it has come to dominate discussions of colour constancy from a computational perspective. Yet the Linear World has serious shortcomings as a theoretical framework. At a purely descriptive level, the linear description both profits by and suffers from an arbitrary element in the choice of the basis functions. Free choice of the basis functions permits more accurate description of a given set of spectra with a small number of weight parameters. On the other hand, optimization of the basis functions for a particular target set of spectra (such as Munsell papers, or natural terrains, or haphazard collections of interesting objects from one environment or another) is not a principled justification for their choice for general application. A more serious shortcoming is that the world of linearly modelled surface and illuminant spectra harbours a fundamental inconsistency: when visually identical stimuli originate from physically different sources, their Linear World approximations will generally be different in ways that should make them visually distinguishable. This inconsistency has its origin in an awkward general feature of the Linear World: even with the minimal three degrees of freedom each for light source and surface, the stimulus spectra have six degrees of freedom; this is more than is necessary or desirable for characterizing them if three are sufficient for the functions that generate them (unless, of course, the linear model happens to be an accurate idealization of reality in this respect—a point that has not been established). Moreover, the simplicity and order implied in the linear mapping principle are not as complete as might appear. The framework of the linear model places no constraint on the individual components of the reflectance and illumination spectra, so their interactions could, in principle, be as arbitrary as the ones in the Chaotic World. 
To guarantee order in this sense, the basis functions should be smooth. But, as we will see, the same objective can be secured by modelling the surface and illuminant spectra themselves with smooth functions. The theoretical value of the Linear World scenario is also limited. The formal simplicity of the compensation operation (multiplication by an inverse matrix) is not supported by any known or plausible basis in visual processing. In particular, there is a lack of evidence that the visual system does, in any sense, internalize the matrices of coefficients specifying cone excitations for different combinations of surface and illumination components, or that its colour corrections are consistent with such a computational scheme (we return to this point below, p. 222). In general, despite extensive work employing the Linear World as an analytical tool, it does not seem to have produced much insight either into the actual principles of illumination-dependent mapping in the environment, or into the visual processes subserving colour constancy. The fondness of theorists for the linear idealization of the chromatic universe may, in part, reflect the lack of an alternative. In the ideal ideal world, the appropriate illumination-dependent or compensatory transformations would be more intuitively understandable and mechanistically plausible, as is the case with simple normalization, and regularity would be guaranteed by restricting the model to smooth (and not arbitrary) functions throughout. We next introduce one such alternative, which we have found illuminating in the computational analysis of colour constancy.
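The Linear World compensation discussed in this section (multiplying the cone excitation triplet by the inverse of an illuminant-specific 3 × 3 matrix) can be sketched as follows. Everything specific here is an assumption made for illustration: the polynomial basis functions shared by surfaces and illuminants, the Gaussian-shaped cone sensitivities with peaks at 565, 540, and 440 nm, and the example weights. The sketch also assumes that the illuminant weights are already known, which is the difficult part of the problem.

```python
import numpy as np

wavelength = np.linspace(400.0, 700.0, 301)      # nm
step = wavelength[1] - wavelength[0]
x = (wavelength - 550.0) / 150.0                 # rescaled spectral variable

# Hypothetical smooth basis functions (1, x, x^2), shape (3, 301).
basis = np.stack([np.ones_like(x), x, x ** 2])

# Hypothetical cone sensitivities: Gaussians with assumed peaks and width.
peaks = np.array([565.0, 540.0, 440.0])
cones = np.exp(-0.5 * ((wavelength[None, :] - peaks[:, None]) / 40.0) ** 2)

illuminant_weights = np.array([1.0, 0.3, -0.1])  # assumed known to the observer
surface_weights = np.array([0.5, 0.2, -0.1])     # what we want to recover

illuminant = illuminant_weights @ basis
reflectance = surface_weights @ basis

# Cone excitations computed directly from the spectra.
cone_excitations = (cones * illuminant * reflectance).sum(axis=1) * step

# Illuminant-specific matrix: column j holds the cone excitation vector
# produced by unit quantity of surface component j under this illuminant.
lighting_matrix = (cones[:, None, :] * illuminant * basis[None, :, :]).sum(axis=2) * step

# Solving the 3 x 3 system is equivalent to multiplying by the inverse matrix.
recovered = np.linalg.solve(lighting_matrix, cone_excitations)
print(recovered)
```

Because the matrix is not diagonal in general, no set of three independent gains reproduces this correction, which is why simple normalization fails in principle in the Linear World.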
The Gaussian World Special cases of the Linear World may involve a principled choice of basis functions and guarantee smooth spectra. One such scheme approximates natural spectral reflectance functions (cf. Moon and Spencer 1945) and spectral power distributions by polynomials in wavelength or some other spectral variable. Trichromacy requires at least second-order polynomials for the surfaces, so a second-order polynomial constraint on the surfaces and illuminants yields a minimal trichromatic world. The Gaussian World we introduce here differs from this in just the following way: it is the logarithms of the spectral distributions that are built up from additive components. That is, the log of the radiant power or of the surface reflectance is always describable as a second-order polynomial. Depending on the sign of the quadratic term, these idealized spectral functions are either Gaussians (with exponentials as a special case), or the reciprocals of Gaussians.3 Like the Linear World with three degrees of freedom for surfaces and illuminants, the Gaussian World is a fully (but minimally) trichromatic world. It is an equally crude approximation to reality. But the benefits of abandoning the Linear World by taking logs are many:
• We don’t have to worry about negative light emissions or negative reflectances, since the numbers represented by their logarithms can’t go negative. And because of their more natural approach to zero, the Gaussian descriptions fit natural spectra better than do the comparable linear ones.
• We can include monochromatic radiations in the chromatic universe as a limiting case, something not possible in the Linear World.
• There is an important gain in mathematical simplicity and tractability: the proliferation of degrees of freedom that occurs in the Linear World in the progression from light and surface to the visual stimulus is here entirely avoided. When incident light is multiplied by surface reflectance, the second-order polynomials descriptive of the log power and the log reflectance simply add to form the log of the retinal stimulus, making this, too, a second-order polynomial (in non-logarithmic terms, the product of the Gaussians for surface and illuminant yields another Gaussian). Thus the function specifying the proximal stimulus still has the same form as the surface spectral reflectance and the illuminant spectral power distribution (SPD), and the same three degrees of freedom (each of its three parameters being a function of the surface and illuminant parameters), whereas in the Linear World it has six.
• Each stimulus chromaticity is associated with the same spectral energy distribution, no matter what the light and surface reflectance that combined to generate it, so indistinguishable stimuli are now consistently represented as indistinguishable.
• Most important, the cone spectral sensitivities themselves can be approximated by Gaussians,4 and this gives us something not available in the other theoretical worlds: a simple equation (see the appendix to this chapter, eqn A7.1) for the cone excitations generated by any surface and illuminant, where the only variables are the parameters specifying cone sensitivity, reflectance, and illuminant.
3 With integration over a finite spectral range, we need not be troubled by the fact that the latter functions approach infinity in the limit of infinite wavelength.
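The point that log-quadratic spectra simply add when light meets surface can be verified in a few lines. In this sketch a spectrum is represented by the three coefficients of its log polynomial in a rescaled spectral variable; the coefficient values are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = np.linspace(-1.0, 1.0, 201)   # rescaled spectral variable

# Coefficients (c0, c1, c2) of log spectrum = c0 + c1*x + c2*x**2.
log_illuminant = np.array([0.0, 0.8, -0.2])     # hypothetical illuminant
log_reflectance = np.array([-1.0, -0.3, -1.5])  # hypothetical surface

illuminant = np.exp(P.polyval(x, log_illuminant))
reflectance = np.exp(P.polyval(x, log_reflectance))

# Physical interaction: the spectra multiply wavelength by wavelength.
stimulus = illuminant * reflectance

# Gaussian World bookkeeping: the three log coefficients just add,
# so the stimulus keeps the same three degrees of freedom.
log_stimulus = log_illuminant + log_reflectance

print(np.allclose(stimulus, np.exp(P.polyval(x, log_stimulus))))
```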
The appendix to this chapter is an algebraic analysis of illumination-dependent mapping in the Gaussian World. In Equation A7.1 of the appendix, cone excitation is a product of three Gaussian factors. One factor represents surface colour, independent of the illuminant. It is a Gaussian function of the spectral distance of the surface’s spectral centroid from the λmax of the cone photoreceptor. This is what we want to preserve in a colour-constant visual system. A second factor similarly depends on the illuminant colour. If surface bandwidths are not too narrow, this factor is approximately the same for all surfaces, as it depends only on the relation between the illuminant spectral centroid and the cone’s wavelength of peak sensitivity. This means it can be successfully normalized out, in just the way that we saw with the Three-band World. The equation for cone excitation is completed by another factor that depends both on the illuminant and the surface parameters. In a sense this factor precludes full colour constancy based on normalization, since the normalization factor required to remove it is different for different surfaces within a scene. Luckily though, this factor is independent of the cone’s preferred wavelength, which means that it generates variations in effective intensity only, not colour. What this factor means is simply that when the lighting gets red, the reds get lighter, relative to the blues and greens.
Illumination-dependent mapping in the Gaussian World therefore has the following characteristics, which are specified quantitatively by the equations in the appendix.
(1) Normalization-compatible mapping for colour. Independent normalization of the cone photoreceptor excitations can recover the chromatic aspects of surface reflectance accurately for sufficiently broad-band surfaces and illuminants. In that sense, normalization could yield colour constancy in this idealized world. Normalization-compatible mapping of chromatic values has a simple geometric expression. As we noted in discussing the Three-band World, normalization-compatible mapping of both colour and intensity implies rigid translation in log(L, M, S) space under changes of illumination. We can abandon the requirement for correct recovery of intensity by considering Fig. 7.1, where chromatic values of stimuli are represented in a planar cross-section of that space by projecting along parallel lines of variable intensity but constant chromaticity. The axes are the differences in log cone excitation between different pairs of cones. Normalization-compatible mapping of chromatic values requires that under different illuminants, the constellation of points
4 The Gaussian description of cone sensitivity fails in the tails, but that doesn’t matter much if we are interested in broad-band stimuli. This shortcoming can be alleviated by choosing a suitable non-linear function of wavelength as the spectral variable. We have not explored this, since (for natural surfaces) the errors introduced by approximating the cone sensitivities are much smaller than those inherent in the idealization of the reflectances and illuminants.
Figure 7.1 (a) Illumination-dependent chromatic values in the Gaussian World. The kite represents stimuli generated by a set of natural surfaces centred on white and representative of natural scenes; it is formed from lines of constant spectral centroid (the radii) or of constant spectral curvature (the crossbars). With illuminants roughly approximating 4000K (left), equal-energy (centre) or 20 000K (right), the kite undergoes a translation, coupled with slight compression and skewing. Units for the cone excitations are chosen to place the equal-energy white at the origin. (b) The same plot, with the addition of the spectrum locus as the illumination-invariant limit approached for monochromatic surface reflectances. This illustrates how the chromatic shifts for colours typical of natural scenes are theoretically related to the spectrum locus (although the model does not provide a usefully accurate construction of the spectrum locus itself). [Axes in both panels: log(L/M), horizontal; log(S) – log(L), vertical.]
representing a set of surfaces must be translated rigidly in this plane. For broad-band colours, the logarithmic plot of the (r, b) coordinates of MacLeod and Boynton (1979) is practically equivalent to such a plane, and is used below to evaluate the rigidity of shifts for sets of natural colours. But normalization-compatibility is only approximate, and the equations in the appendix also show how the Gaussian World violates normalization-compatibility in three distinct and quantitatively specifiable (but generally minor) ways.
(2) When the lighting gets red, the reds get lighter. First, as mentioned above, and as the appendix demonstrates, there remains (after normalization) a deviation from constancy, occurring for both broad-band and narrow-band colours, that can be corrected by a change of stimulus intensity alone: surfaces similar to the illuminant in colour are rendered as lighter than surfaces dissimilar to the illuminant in colour. While it precludes rigid translation in log cone excitation space, this does not affect chromatic values and thus preserves rigid translation in Fig. 7.1.
For narrow-band colours, normalization fails to restore even the chromatic values correctly. The appendix identifies two such deviations from normalization-compatibility.
(3) Resistance of narrow-band surfaces to chromaticity shift. Stimuli from narrow-band surfaces are resistant to shifts in log cone excitation space, with illumination-invariance as a limit approached for monochromatic reflectances (these remain at the spectrum locus).
(4) Gamut compression with narrow-band illuminants. For narrow-band illuminants, the spectrum locus is again approached as a limit, but this time for all surfaces, so the gamut is compressed.
Figure 7.1 illustrates these principles of the mapping of chromatic values in the Gaussian World. The ‘kite’ represents a set of surface colours, centred on a non-selective white and with a dispersion roughly representative of natural scenes (see p. 216). 
For the centre kite, the surfaces are viewed under equal-energy illumination; at left and right, under an idealization of bluish and reddish daylights. The kite is formed from lines of constant surface-reflectance spectral centroid (its radii) or of constant surface-reflectance bandwidth or spectral curvature (its crossbars).5 Under chromatic illumination, the kite is translated almost rigidly, illustrating the normalization-compatible mapping of (1) above. But it also undergoes a slight skewing because the more narrow-band reflectances (e.g. the yellows forming the bottom edge of the kite) are more resistant to the illuminant-dependent chromaticity shift. Figure 7.1b clarifies the geometry of this skewing effect [(3) above]: the kite radii pivot around the fixed points obtained by extrapolating them to the spectrum locus. In addition to skewing, the kite undergoes a less obvious illuminant-dependent compression [(4) above]. Because the illuminants considered are all broad-band (or more precisely, have close to exponential spectra, giving them low spectral curvatures in the sense defined in the appendix), the compression here is minimal.
Since discussions framed in the Linear World have emphasized that normalization is, in principle, inadequate, the approximate validity of normalization in the Gaussian World is somewhat surprising. This raises the question: which ideal world is more pertinent to real vision in the real world? Later (p. 222) we will review briefly evidence about the types of compensation actually implemented by the visual system. But first we consider the illumination-dependent mapping found in the real world, and compare it with what happens in the ideal worlds, with particular reference to the Gaussian one, since that model of the environment is particularly tightly constrained and provides a rich set of easily tested predictions.
5 In these logarithmic coordinates, lines of constant bandwidth, including the spectrum locus, are entirely straight. But in a chromaticity diagram whose coordinates are linear with the cone excitations, these straight lines are bent into the more familiar curved form, with a straight region only where S cone excitation is negligible.
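The qualitative properties listed for the Gaussian World can be checked with a small numerical experiment. The sketch below is not the appendix's Equation A7.1, which is not reproduced in this section; it is an independent illustration under stated assumptions: Gaussian L and M cones with hypothetical peaks (565 and 540 nm) and a shared width of 40 nm, Gaussian surface reflectances, and an exponential illuminant exp(tilt × (λ − 550)). Under those assumptions, completing the square gives a shift in ln(L/M) of tilt × (peak gap) × a_cone / (a_cone + a_surface), where a = 1/σ², and the code compares direct integration against that expression.

```python
import numpy as np

# Spectral axis (nm), wide enough that the Gaussian tails are negligible.
WAVELENGTH = np.linspace(200.0, 900.0, 7001)
STEP = WAVELENGTH[1] - WAVELENGTH[0]

CONE_PEAKS = {"L": 565.0, "M": 540.0}   # hypothetical peak wavelengths
CONE_SIGMA = 40.0                       # hypothetical shared cone bandwidth

def gaussian(centre, sigma):
    return np.exp(-0.5 * ((WAVELENGTH - centre) / sigma) ** 2)

def log_excitation(cone, surface_centre, surface_sigma, tilt):
    """Natural log of cone excitation, by direct numerical integration."""
    illuminant = np.exp(tilt * (WAVELENGTH - 550.0))   # exponential spectrum
    stimulus = illuminant * gaussian(surface_centre, surface_sigma)
    return np.log(np.sum(gaussian(CONE_PEAKS[cone], CONE_SIGMA) * stimulus) * STEP)

def chromatic_shift(surface_centre, surface_sigma, tilt):
    """Change in ln(L/M) when the illuminant tilt goes from 0 to `tilt`."""
    def ln_l_over_m(t):
        return (log_excitation("L", surface_centre, surface_sigma, t)
                - log_excitation("M", surface_centre, surface_sigma, t))
    return ln_l_over_m(tilt) - ln_l_over_m(0.0)

def predicted_chromatic_shift(surface_sigma, tilt):
    """Closed form obtained by completing the square (this sketch's assumptions)."""
    a_cone = 1.0 / CONE_SIGMA ** 2
    a_surface = 1.0 / surface_sigma ** 2
    peak_gap = CONE_PEAKS["L"] - CONE_PEAKS["M"]
    return tilt * peak_gap * a_cone / (a_cone + a_surface)

def lightening(surface_centre, surface_sigma, tilt):
    """Change in ln(L excitation) when the tilt goes from 0 to `tilt`."""
    return (log_excitation("L", surface_centre, surface_sigma, tilt)
            - log_excitation("L", surface_centre, surface_sigma, 0.0))

TILT = 0.003   # per nm; positive values give a "redder" light

broad = chromatic_shift(550.0, 150.0, TILT)    # broad-band surface
narrow = chromatic_shift(550.0, 10.0, TILT)    # narrow-band surface
red_minus_blue = lightening(620.0, 150.0, TILT) - lightening(480.0, 150.0, TILT)

print(broad, predicted_chromatic_shift(150.0, TILT))
print(narrow, predicted_chromatic_shift(10.0, TILT))
print(red_minus_blue)
```

In this toy setting the chromatic shift does not depend on the surface centroid (rigid translation for surfaces of equal bandwidth), it shrinks as the surface becomes narrow-band (shift resistance), and a long-wavelength surface gains a few per cent in excitation relative to a short-wavelength one when the light gets redder.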
The real world For an initial sample of the real world we have relied mainly on data of Ruderman et al. (1998), who obtained spectral reflectance estimates, pixel by 3′-arc pixel, for 12 entire views of natural environments, creating a data set comprising nearly 200 000 pixels. They characterized each surface by its spectral reflectance relative to a full reflectance white standard measured in the same scene. We derived the cone excitations L, M, and S (for long-wavelength, mid-spectral, and short-wavelength cones) from the measured spectral reflectances by integrating their cross-products with the energy basis cone sensitivities of Stockman et al. (1993), assuming four different illuminants from the set of CIE daylight spectra (Wyszecki and Stiles 1982, p. 145), with correlated colour temperatures 4000K, 5500K, 8500K, and 20 000K, covering the extreme range of unimpeded daylight illuminants (Fig. 7.2). The measure of surface luminance is given by the summed excitations of L and M cones. The intensity-invariant chromatic measures we mainly consider are the (r, b) chromaticity coordinates of MacLeod and Boynton (1979), defined by r = L/(L + M) and b = S/(L + M), with units for b chosen to make it unity for white. The choice of cone sensitivities or chromaticity measures is not critical. The particular logarithmic colour coordinates adopted by Ruderman et al. in their own analysis, for example, are practically linear with log(r) and log(b) in the case of natural broad-band stimuli, although the equivalence is not exact. We discuss in turn the different aspects of illumination-dependent mapping that were listed above.
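The chromatic measures defined above, together with one simple way of quantifying how rigidly they shift between two illuminants, can be sketched as follows. The function names, the synthetic pixel data, the reference white, and the cone gains are all hypothetical; the sketch does not use the Ruderman et al. data.

```python
import numpy as np

def macleod_boynton(lms, white_lms):
    """Return luminance, r and b for an (N, 3) array of L, M, S excitations.

    b is rescaled so that the reference white has b = 1.
    """
    lms = np.asarray(lms, dtype=float)
    white_lms = np.asarray(white_lms, dtype=float)
    L, M, S = lms.T
    luminance = L + M
    r = L / luminance
    b_white = white_lms[2] / (white_lms[0] + white_lms[1])
    b = (S / luminance) / b_white
    return luminance, r, b

def log_shift_spread(values_a, values_b):
    """Standard deviation, across pixels, of the change in log10 value.

    Zero means a perfectly rigid translation on that logarithmic axis.
    """
    shift = np.log10(values_b) - np.log10(values_a)
    return shift.std()

# Synthetic pixels under a first light, and an idealized second light that
# applies the same gain to L and M and a different gain to S.
rng = np.random.default_rng(1)
pixels_a = rng.uniform(0.1, 1.0, size=(1000, 3))
pixels_b = pixels_a * np.array([1.2, 1.2, 0.7])
white = np.array([1.0, 1.0, 1.0])

lum_a, r_a, b_a = macleod_boynton(pixels_a, white)
lum_b, r_b, b_b = macleod_boynton(pixels_b, white)

print(log_shift_spread(lum_a, lum_b),
      log_shift_spread(r_a, r_b),
      log_shift_spread(b_a, b_b))
```

With these idealized gains all three spreads are zero to within rounding error; with measured reflectances and real daylight spectra they would be small but non-zero.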
Normalization-compatible mapping As noted above, normalization-compatible mapping implies a rigid translation in log(L, M, S) space under changing illumination. For all individual pixels in the natural scenes of Ruderman et al. we plotted the appropriate logarithmic coordinate for 20 000K illumination against its value under the 4000K illuminant. Figure 7.3 shows results for one typical scene (their Park4). Despite large chromatic shifts, the pixels cluster closely around lines of slope 1, as expected if the change of illumination from 4000K to 20 000K makes
Figure 7.2 Spectra of the three CIE daylights applied to the natural images. [Axes: wavelength (nm), 450 to 650, horizontal; power density, 0 to 300, vertical. Curves labelled 4000K, 5500K, 8500K, and 20 000K.]
the stimulus chromaticities all undergo the same shift in log(L, M, S) space. These results support those of Dannemiller (1993) based on Krinov’s average terrain data, and of Foster and Nascimento (1994) based on Munsell papers, and extend them to individual elements of natural scenes. A novel feature of our analysis is the separate treatment of the chromatic and luminance axes. While the validity of the rigid translation approximation appears better for the luminance axis (Fig. 7.3c) than for the chromatic ones, it should be noted that the axis for log(r) has a very different scale from the other two, due to the very small variance of this quantity in natural images (Ruderman et al. 1998), which is accompanied by a commensurate sensitivity of the visual system to this variable (MacLeod and von der Twer, Chapter 5 this volume). Approximately rigid displacements measured on the coarse scale required for displaying the individual cone excitations, or luminance, do not imply (perceptually) rigid displacement along this highly sensitive chromatic axis. Nevertheless, the rigidity principle appears here to be a useful approximation for all three axes. Averaged over 12 scenes, the standard deviation of the changes in the (decimal) log of r (among the pixels of one scene) is only 0.0024; this means that the factor by which r changes is stable across pixels with a standard deviation of 0.55% of its mean. For b and for luminance, the standard deviation is 1.8%; all three deviations from rigid
Figure 7.3 Chromaticity and luminance values for individual pixels of the Park4 scene of Ruderman et al. (1998), under 4000K illumination (horizontal axis) and under 20 000K illumination (vertical axis). Normalization-compatible mapping requires proportionality between the values, or clustering of the points around lines of slope 1 in these double-logarithmic plots. Note the very different scales for the different plots. [Panels: (a) log(r); (b) log(b); (c) log(luminance).]
translation are small enough to be undetectable or barely detectable, hence perceptually inconsequential. Although the deviations imply that a constancy compensation based on normalization could not be strictly perfect, they are so small that it could be practically perfect by comparison with the conspicuously imperfect constancy that characterizes human vision. The closely proportional variation of the luminance values under the two illuminants (Fig. 7.3c) warrants further comment. As mentioned on p. 214, strict proportionality is not characteristic of the Gaussian World. Nor is it, in fact, characteristic of normalization-compatible mapping in the sense we consider here, where each cone type has its own
normalization factor. This requires proportional variation of the L and the M cone excitations under different illuminants, rather than proportional variation of luminance (which is well modelled by L + M ). These are not equivalent (unless the proportionality constant is the same for L and M cones). But if L and M cone excitations are considered separately, the proportionality holds still more precisely, with a standard deviation of 0.84% for L and 0.89% for M, as opposed to the 1.8% value for luminance. An expected improvement of this sort formed the basis of the suggestion by von Campenhausen (1986) and Shepard (1992) that the requirements of lightness constancy have generated significant selective pressure for the evolution of trichromacy. But the already very close proportionality found for luminance undermines this argument: a system using the slightly broader L + M function as its sole spectral sensitivity could achieve very good constancy through normalization (with a root mean square error of 1.8% in the comparison of intensities between extreme daylight illuminants), so the improvement in constancy obtainable by dividing the red–green spectral range between the L and M cones is minimal. The real world, then, is far from chaotic; and while it is not perfectly orderly like the Three-band World, on this evidence it exhibits normalization-compatible mapping to a perceptually acceptable approximation. The Gaussian idealization appears to be a viable model in this respect, and the added intricacy of the linear model may not be necessary, or even appropriate. We next ask whether the real world exhibits the deviations from normalization-compatible mapping that characterize the Gaussian World. When the light gets red, the reds get lighter Figure 7.4 shows for the same scene the change in luminance of individual pixels in going from the bluish 20 000K illuminant to the reddish 4000K illuminant, plotted versus the pixel’s r coordinate. 
The expected correlation is present: the reds (mostly) get lighter and the blues and greens dimmer. The magnitude of the effect is only a few per cent, but as the figure shows, it is a major source of all departure from normalization-compatible mapping for the luminance axis (since the departures are themselves small). The effect is expected for the Gaussian World idealization but is also generated in the Linear World model with appropriate basis functions. Shift resistance and gamut compression While it is inevitable on physical grounds that surfaces with sufficiently narrow-band reflectance will undergo a smaller chromaticity shift toward the illuminant, this fact cannot be exploited in the recovery of surface colour unless the observers can recognize the differences in surface bandwidth. Here we assume that the only information available for that purpose is the triple of cone excitations generated by the surface. This means that only those bandwidth effects that are correlated with surface colour are relevant. Two such surface bandwidth effects emerged in our simulations. The first is illustrated for the Park4 scene in Fig. 7.5. In this and in the other scenes, surface-reflectance bandwidth (or more precisely, signed spectral curvature, as defined in
Figure 7.4 Change in log10(luminance) for individual pixels of PARK4 when the 4000K illuminant replaces the 20 000K illuminant (vertical axis: log(luminance) under 4000K minus log(luminance) under 20 000K), plotted versus redness of the pixel (as measured by log10(r) under 7000K).
the Appendix) tends to vary markedly along the b axis, with narrower bandwidth for the abundant yellowish colours than for whites, or for the relatively infrequent purplish colours, which tend to have concave-upward spectra (negative spectral curvature). Accordingly, Fig. 7.5 shows a result otherwise surprising: the illumination-induced shift in r is greater for surfaces with high b. While not large, this deviation from rigid logarithmic translation is enough, in many scenes, to skew the distribution of pixels in the [log(r), log(b)] plane in just the manner illustrated for the Gaussian World in Fig. 7.1. Secondly, the simulations revealed a weak tendency for the lighter surfaces to be more stable in physical chromaticity under change of illuminant. This was largely mediated by the correlation between luminance and b, the lighter surfaces being more yellowish in these scenes. Gamut compression by narrow-band illumination was not clearly evident in our simulations. Variances among the scene elements in log cone excitation space did not differ systematically for the four illuminants. But the chosen daylight illuminants are not well suited to reveal such an effect. As Fig. 7.2 shows, they are roughly exponential in the visible range, and in the algebraic analysis of the Gaussian World (Appendix) they would be assigned spectral curvatures close to zero, and therefore would not be expected to restrict the range of log(r) or log(b) much. Gamut compression is physically inevitable when the illumination is of sufficiently restricted bandwidth, but in the natural world it may become pronounced (if at all) only under restricted or indirect illumination (for instance, in forest scenes).
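As a rough illustration of how the rigidity of a logarithmic shift can be quantified, the following Python sketch computes the per-pixel shift between two illuminants, its standard deviation, and optionally its regression slope against a covariate such as log(b). The function names and the toy data are assumptions made for this illustration; they are not taken from the simulations reported in this chapter.

```python
import numpy as np

def sd_log10_to_percent(sd_log10):
    """Convert a small SD of log10(factor) into an approximate SD of the
    factor itself, in per cent of its mean (first-order approximation)."""
    return 100.0 * np.log(10.0) * sd_log10

def log_shift_stats(x_a, x_b, covariate=None):
    """Summarize the log-space shift of one quantity between two illuminants.

    x_a, x_b: positive per-pixel values (e.g. r, b or luminance) of the
    same scene under illuminants A and B.
    covariate: optional per-pixel values (e.g. log10(b)) against which
    the shift is regressed.
    """
    shift = np.log10(x_b) - np.log10(x_a)   # per-pixel displacement
    stats = {
        "mean_shift": shift.mean(),         # the rigid translation
        "sd_shift": shift.std(),            # departure from rigidity
        "sd_percent": sd_log10_to_percent(shift.std()),
    }
    if covariate is not None:
        # A non-zero slope means the shift depends on the covariate,
        # i.e. the translation is not rigid.
        slope, _intercept = np.polyfit(covariate, shift, 1)
        stats["slope_vs_covariate"] = slope
    return stats

# Toy data (hypothetical): a perfectly rigid shift of r.
rng = np.random.default_rng(0)
r_20000 = 10 ** rng.normal(-0.17, 0.02, size=1000)
r_4000 = 1.07 * r_20000
print(log_shift_stats(r_20000, r_4000))
print(sd_log10_to_percent(0.0024))  # about 0.55 (per cent)
```

With real scene data, a slope of the shift in log(r) against log(b) that differs from zero would correspond to the kind of skewing described above.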
Figure 7.5 Change in redness [log10(r)] for individual pixels of PARK4 when the 4000K illuminant replaces the 20 000K illuminant (vertical axis: log(r) under 4000K minus log(r) under 20 000K), plotted versus blueness of the pixel [as measured by log10(b) under 7000K]. The yellowish surfaces, at left, undergo smaller shifts.
Compensation by the visual system: which world do we internalize?
Having established that the illumination-dependent mappings found both in a plausible ideal world (the Gaussian World) and in the real one are approximately normalization-compatible, we now ask whether the visual system has adapted to this environmental regularity by adopting something like a normalization algorithm for its recovery of surface colour. Evidence to be reviewed supports this, but leaves it uncertain whether the visual process is able to compensate for the relatively small natural deviations from normalization-compatible mapping.
Normalization for colour Brainard and Wandell (1992) made memory matches between computer-simulated surfaces viewed under two different (computer-simulated) conditions of illumination, and compared the errors in prediction of these matches for different candidate models of human colour constancy. The normalization model (with only three free parameters, one for scaling each cone sensitivity) predicted the human matches very nearly as well as Linear World
models in which the human observer is supposed to internalize a linear description of the spectral distributions, using nine or more parameters per illuminant, and do a matrix inversion to recover the surface colours. Under very different conditions involving binocular matching, Chichilnisky and Wandell (1995) obtained similar results. And under more natural conditions, Brainard et al. (1997) found the same. In all these studies, a simple cone-sensitivity scaling or normalization model mimics human visual judgments almost as well as the Linear World based models can do with their extra parameters. Nor is there any suggestion that specific features of the linear model’s predictions are reflected in the human matches. Further evidence that our visual systems are designed for a normalization-compatible world comes from an interesting experiment by Nascimento and Foster (1997). They made careful simulations of illuminant changes applied simultaneously to several surfaces in a cathode ray tube (CRT) display, and asked observers to compare these with precisely normalization-compatible transformations. Although the latter are not as accurately representative of the physical effects of illumination change, they were more likely to be identified as illumination changes by the observers.
When the lighting gets red, the reds get lighter
Lightness variations like this are easily demonstrated or observed in common experience with artificial light (when the light gets red, the reds do get lighter perceptually), but they seem to be taken for granted and have not been discussed in the context of colour constancy or even quantified experimentally. They represent a partial or complete failure of constancy. It is a failure that could, in principle, be eliminated relatively easily by a sophisticated visual system that has internalized the relevant environmental regularities.
But its elimination would require interaction between the intensive and the purely chromatic components of the neural representation, and this might be problematic for a primitive compensation mechanism. Whatever the reason, this environmental deviation from normalization-compatible mapping may not, on the present limited evidence, have been internalized in the functional organization of our visual system. Resistance of narrow-band surfaces to chromaticity shift We have noted that to compensate for this deviation from normalization-compatible mapping, the visual system must exploit correlations between surface colour and surface bandwidth, which cause certain colours, in general, to be more resistant to chromaticity shifts. Two such effects were noted on pp. 220–1. The reduced illumination-induced (redward) chromaticity shift for yellows as compared with whites, in the metric of Fig. 7.5, is a violation of normalization-compatible mapping, that calls for the visual system to make a greater compensatory correction for whites (or desaturated purples) than for yellows. The question whether such differential compensation actually occurs has not been specifically addressed experimentally, and it is not clear, in the data of Ware and Cowan (1982), for example, whether the visual system merely adopts the simple but environmentally suboptimal normalization algorithm. Chichilnisky
and Wandell (1995) and Whittle (Chapter 3 this volume) note that equal logarithmic differences relative to differently coloured backgrounds in Whittle’s haploscopic display are generally perceptually equal, as if only normalization were operative. But deviations from this principle appear with saturated backgrounds. These differences are in the expected direction, viz. that equal logarithmic differences are perceptually larger in the case of the saturated background, but it is not yet clear whether they quantitatively or comprehensively support the idea that the visual system may have successfully internalized this environmental deviation from normalization-compatible mapping. The tendency for brighter surfaces to change less in chromaticity, mentioned on p. 221, provides a possible (eco-)logical basis for the Helson–Judd effect (Helson 1938), in which the lighter surfaces in an artificial physically achromatic scene (made up of spectrally nonselective surfaces) perceptually assume a faint tint of the colour of the illuminant, and darker surfaces assume the complementary colour. In Helson’s situation the change in stimulus chromaticity was independent of luminance. If the visual system has a basis in environmental statistics for expecting the more luminous surfaces to change less than the darker ones, then the shifts toward the illuminant chromaticity for the more luminous surfaces in Helson’s scenes would be unexpectedly large, and the resulting stimulus chromaticities might logically be taken as evidence that the lighter surfaces have an inherent colouration similar to that of the illuminant.
Gamut compression with narrow-band illuminants
When illuminant bandwidth becomes narrow, the configuration of surfaces in log cone excitation space is compressed toward the locus of physically non-selective reflectances (as discussed quantitatively in the Appendix).
Cone signal normalization can effect compensatory translations in this space but cannot counteract the compression. Yet there is evidence that we have quite powerful perceptual compensatory mechanisms available to deal with such gamut compression. Brown and MacLeod (1992, 1997) found that when a test field is surrounded by elements of uniform or nearly uniform colour (as would occur with a narrow-band illuminant) the colour of the test field is perceptually enhanced when comparison is made with a test field in a chromatically varied surround. This perceptual gamut expansion, cued by the physical heterogeneity of other stimuli within the scene, would be useful in compensating for physical gamut compression when the illuminant bandwidth becomes narrow. This is one aspect of colour constancy that can owe nothing to normalization. It could, however, originate from something as primitive as sensitivity modifications at post-receptoral (spatially opponent and/or colour-opponent) stages of visual processing. Perceptual gamut expansion may not, however, have evolved for dealing with natural illuminants of restricted bandwidth, since as we have noted, bandwidth restrictions with unimpeded daylight illumination are not pronounced. Perceptual gamut expansion is probably more valuable under hazy conditions (the situation investigated for the achromatic domain by Gilchrist and Jacobsen 1983) than in the rare case where the illuminant bandwidth becomes narrow. Our conclusion with regard to the compensation problem is that the visual system employs a colour-correcting mapping approximately equivalent to normalization,
but may not compensate appropriately for the relatively small natural deviations from normalization-compatible mapping that characterize both the real world and its most useful idealizations. We now turn from the compensation problem to the complementary problem of colour constancy: the problem of estimating the illuminant.
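The distinction drawn in this section, that cone-signal normalization can undo a translation in log cone excitation space but cannot undo a compression, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the toy data, the use of the scene mean as the normalizing factor, and the modelling of gamut compression as a halving of log-space deviations.

```python
import numpy as np

def normalize_cones(lms):
    """Scale each cone class by the reciprocal of its scene mean.

    lms: array of shape (n_pixels, 3) of positive L, M, S excitations.
    Using the scene mean as the normalizing factor is an assumption of
    this sketch, not a claim about the visual system.
    """
    return lms / lms.mean(axis=0, keepdims=True)

rng = np.random.default_rng(1)
scene = 10 ** rng.normal(0.0, 0.3, size=(500, 3))   # toy cone excitations

# Illuminant change modelled as one gain per cone class
# (a rigid translation in log cone excitation space).
shifted = scene * np.array([1.4, 1.0, 0.6])

# Toy gamut compression: log values pulled halfway toward their mean.
log_scene = np.log10(scene)
log_mean = log_scene.mean(axis=0)
compressed = 10 ** (log_mean + 0.5 * (log_scene - log_mean))

# Normalization removes the per-cone gains completely ...
print(np.allclose(normalize_cones(shifted), normalize_cones(scene)))
# ... but leaves the spread of the compressed scene at half its original size.
print(np.log10(normalize_cones(scene)).std(axis=0))
print(np.log10(normalize_cones(compressed)).std(axis=0))
```

Any mechanism that restores the original spread would have to rescale contrasts around the mean, which is outside what a per-cone gain can do.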
The estimation problem: inferring the illuminant from scene statistics
Here we consider one source of information for estimating the illuminant: the illumination-dependent mapping itself. Illuminants may be recognizable by their effects on the scene statistics, or specifically on the distribution of scene elements in cone excitation space. This leaves aside many other potential cues to the illuminant, particularly ones that depend on the three-dimensional geometry of the scene (Hurlbert 1998). Effectively, we consider a world of co-planar diffusely reflecting surfaces, with illumination incident uniformly over each scene. The most obvious candidate scene statistics are the scene-averages or maximum values of relevant quantities, but here we will conclude that environmental deviations from normalization-compatible mapping allow other scene statistics to play an important independent role.
The limited value of the scene-averaged chromaticity for estimating the illuminant
When the illumination of a scene is changed (for example toward more energy at long wavelengths), all reflected lights will change correspondingly, and the chromaticity averaged over the entire image will become more reddish. A simple approach to colour constancy could therefore take the space average chromaticity as a cue for the chromaticity of the illuminant and use this to correct for shifts in chromaticity of objects due to non-neutral illuminants. One model for colour constancy that uses a spatial average of the receptor responses to estimate the illumination of the image was proposed by Buchsbaum (1980).
His estimation of the illuminant is based on the assumption that for all scenes the field average reflectance is equal to an internal reflectance standard S0(λ).6 The illuminant is estimated by determining which illuminant would have resulted in the actual obtained average receptor response, assuming that the scene is illuminated uniformly and the spatial average reflectance of the scene is S0(λ). This estimate is then used together with the responses for the subfields in the image in order to obtain illuminant-independent reflectance descriptors for each subfield. For scenes for which the mean reflectance function differs from S0(λ), the algorithm wrongly attributes the mean receptor response to a spectrally biased illuminant. Every change in the mean receptor response is interpreted as due to a change in the illumination. Thus this algorithm, like many others relying on what has been called the ‘Grey World assumption’, is not able to deal with scenes in which coloured surfaces that differ in a particular direction from the known standard are predominant. This weakness is inherent to all models that use a space-averaged chromaticity of the scene to estimate the illuminant. The reason is that this measure is ambiguous: for a given scene, the mean chromaticity could be reddish due to a predominance of reddish surfaces within this scene, or to a reddish illuminant (Fig. 7.6a).
6 Even though this internal reflectance standard does not have to be spectrally neutral, this approach is often referred to as one representative of Grey World algorithms.
One approach that takes into account more information about the scene than mean chromaticity is the framework of probabilistic colour constancy. D’Zmura et al. (1995) presented a stochastic linear model for estimating the illuminant of a scene. This scheme uses a priori knowledge about reflectance and illuminant probabilities in order to calculate how likely it is that the viewed scene is illuminated by a particular light, given the chromaticities of the reflected lights. The most likely illuminant is then determined by a maximum likelihood estimation procedure. Let p[S(λ)] describe the a priori probability that the surface reflectance function S(λ) is encountered in the world. If the probability distribution p[S(λ)] is known, one can determine the probability distribution of the receptor responses r = (r1, r2, r3) for a trichromatic visual system. Let I(λ) be the scene illuminant and Ci(λ) the sensitivity function of the ith receptor; then it follows that
\[ r_i = \int C_i(\lambda)\, I(\lambda)\, S(\lambda)\, d\lambda \qquad (i = 1, 2, 3). \]
The conditional probability distribution p[(r1, r2, r3)|I(λ)] for the receptor responses given a particular illuminant I(λ) can thus be calculated. Now consider a scene containing a set of NS surfaces drawn independently and viewed under an unknown illuminant, resulting in a set of NS cone responses. For a particular illuminant I(λ) the likelihood L[I(λ)] for this set of cone responses can be calculated as
\[ L[I(\lambda)] = \prod_{n=1}^{N_S} p\big((r_1, r_2, r_3)_n \mid I(\lambda)\big). \]
A simple way to estimate the illuminant of an image is to take the illuminant that has the highest likelihood for the given set of responses. This maximum likelihood estimate coincides with the maximum a posteriori (MAP) estimate when all illuminants are a priori equally probable. If one also has prior knowledge about the probability of encountering various illuminants in the world, one can refine the estimation by maximizing a function that also takes this probability distribution into account:
\[ L[I(\lambda)] = \prod_{n=1}^{N_S} p\big((r_1, r_2, r_3)_n \mid I(\lambda)\big)\, p(I(\lambda)) \]
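The two estimators just described can be sketched in a few lines of Python. This is a toy version under strong assumptions that are not part of the schemes cited here: the candidate illuminants form a small discrete set, and p(r | I) is taken to be a multivariate Gaussian over log cone responses with a hypothetical mean and covariance for each candidate. The names `log_gauss` and `estimate_illuminant` are invented for this illustration.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log density of a multivariate normal, evaluated for each row of x."""
    d = x - mean
    inv = np.linalg.inv(cov)
    _sign, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ni,ij,nj->n", d, inv, d)
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2.0 * np.pi))

def estimate_illuminant(responses, candidates, log_prior=None):
    """Pick the candidate illuminant with the highest (log) score.

    responses:  (N_S, 3) array of log cone responses from one scene.
    candidates: dict name -> (mean, cov), an assumed Gaussian p(r | I).
    log_prior:  optional dict name -> log p(I); None means a flat prior,
                so the result is the maximum likelihood estimate.
    """
    scores = {}
    for name, (mean, cov) in candidates.items():
        # Log of the product over surfaces = sum of the log densities.
        score = log_gauss(responses, mean, cov).sum()
        if log_prior is not None:
            score += log_prior[name]   # the prior enters once per scene
        scores[name] = score
    return max(scores, key=scores.get), scores

# Hypothetical candidates and a toy scene drawn under the "reddish" one.
candidates = {
    "reddish": (np.array([0.1, 0.0, -0.1]), 0.01 * np.eye(3)),
    "bluish": (np.array([-0.1, 0.0, 0.1]), 0.01 * np.eye(3)),
}
rng = np.random.default_rng(2)
scene = rng.multivariate_normal(*candidates["reddish"], size=20)
best, scores = estimate_illuminant(scene, candidates)
print(best, scores)
```

Working with sums of log densities rather than products of densities avoids numerical underflow when the number of surfaces is large.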
This estimate of the illuminant then can be used to recover the surface reflectance functions of the scene. D’Zmura et al. (1995) implemented a maximum likelihood estimation scheme that uses only two chromaticity values for each reflected light and employs a linear model for the
Figure 7.6 (a) The mean chromaticity of a scene is an ambiguous measure for estimating the illuminant. (b) How this ambiguity could be resolved by high-order scene statistics.
surface reflectances and illuminations. They present a Monte Carlo simulation which shows that the chromaticities of relatively few reflected lights are sufficient to recover an illuminant accurately. Brainard and Freeman (1997) presented a similar Bayesian scheme to determine the illuminant, although they use a different optimality criterion for finding the best estimate. They argue that this so-called maximum local mass (MLM) estimator is more appropriate for perceptual tasks than other estimators. They compare the simulated performance of this scheme with a Bayesian scheme using a MAP estimator, a Grey World algorithm and other colour constancy algorithms. The MLM method performs better than all other algorithms. But if the mean chromaticity of the sample of surfaces in a scene is biased, all algorithms perform poorly. Thus, these methods have only a limited capability to separate illuminant changes from changes in the surface collection under the simple viewing conditions considered here. By using the probability of the observed chromaticities, both Bayesian schemes take into account information about the entire distribution of the reflected lights, not just the mean. If other statistics change in a characteristic way under changing illumination and the employed a priori knowledge about probabilities mirrors these regularities of the world, these schemes automatically gain improved performance. Nevertheless, in adopting the assumption that all scenes draw their surfaces independently from a single p[S(λ)] distribution that is characteristic of the world in general, these schemes still embody the Grey World assumption (albeit in a statistical reformulation), and inherit its problems. In the real world, where different scenes have surfaces drawn from different populations, the expected values of all statistics differ from one type of scene to another, and these schemes deliver correspondingly incorrect estimates of the illuminant. 
Consideration of scene statistics does, however, allow a more radical departure from the Grey World assumption. The assumption of a single p[S(λ)] can be abandoned, and replaced by other, more general constraints on the distribution of cone excitations. An interesting example is the proposal of Forsyth (1990), which considers the gamut of cone excitations that are physically realizable under a particular illuminant. This gamut is fixed, for any given illuminant, by the constraint that whatever collection of surfaces is present, none can reflect more than 100% of the incident light at any wavelength (Koenderink and van Doorn, Chapter 1 this volume). Forsyth (1990) modelled the colour of an object as the receptor responses of the object under a fixed canonical light. In order to achieve colour constancy, one must then estimate which illuminant is present and try to estimate what the receptor responses for the objects would have been if the scene had been illuminated by the canonical light. Every illuminant is associated with a linear mapping that maps the responses under the canonical light to a different set of responses under this particular illuminant. Once one has estimated the illuminant in a scene, one can then apply the inverse mapping to determine the responses under the canonical light, and thus obtain the constant colour descriptors. Forsyth (1990) described the circumstances under which these mappings are invertible and the conditions and assumptions necessary for finding the illuminant of an image (and therefore the mapping).
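The physical bound described above (no surface reflects more than 100% of the incident light at any wavelength) implies that, under a given illuminant, no cone response can exceed the response produced by a perfect reflector. The following Python sketch uses only this per-cone upper bound to rule out candidate illuminants. It is a deliberately simplified stand-in, not Forsyth's algorithm, which works with the full convex gamut; the function name and the numerical values are hypothetical.

```python
import numpy as np

def feasible_illuminants(responses, white_responses):
    """Return the candidate illuminants compatible with the observed image.

    responses:       (n_pixels, 3) cone responses observed in the image.
    white_responses: dict name -> length-3 cone response that a perfect
                     (100%) reflector would produce under that illuminant
                     (hypothetical values supplied by the caller).
    A candidate is kept only if no observed response exceeds its bound.
    """
    observed_max = np.asarray(responses).max(axis=0)
    return [name for name, white in white_responses.items()
            if np.all(observed_max <= np.asarray(white))]

# Hypothetical perfect-reflector responses (L, M, S) for two illuminants.
whites = {
    "reddish": [1.2, 1.0, 0.4],
    "bluish": [0.7, 1.0, 1.1],
}
# A toy image containing one patch that strongly excites the L cones.
image = np.array([
    [0.9, 0.5, 0.1],
    [0.3, 0.4, 0.2],
])
print(feasible_illuminants(image, whites))  # only "reddish" remains
```

Because the test is a necessary condition only, the list it returns can be larger than the feasible set obtained from the full gamut.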
The estimation of the illuminant is based mainly on the fact that a surface cannot reflect more light than is cast on it at any wavelength. For a particular illuminant, therefore, many receptor responses cannot be achieved; the set of possible receptor responses is bounded. The illuminant is constrained by the observed image: some illuminants are not compatible with the set of responses given in the image. If, for example, a patch strongly excites the long-wavelength receptor, it cannot be illuminated by a blue light. Using the gamut of the receptor responses of an image, one can therefore exclude some illuminants, leaving only the set of possible illuminants (the ‘feasible set’). Other visual information may then provide cues that reduce the feasible set further, or an estimator is needed to choose the most likely illuminant from this set. Forsyth (1990) specified an algorithm, Crule, using the convex hull of the receptor responses of an image in order to find the feasible set. He compared the performance of this algorithm with that of the retinex algorithm of Land and McCann (1971). Unlike the retinex algorithm, Crule is only slightly disturbed if the mean chromaticity of an image is biased by a predominance of one colour in the scene. Use of the hull of the receptor responses instead of an average allows the algorithm to better separate changes in illumination from changes in surface ensemble. The power of the Crule algorithm derives from the fact that the gamut is both characteristic of the illuminant and invariant (as a limit) with changing scene content. Given uncertainty about scene composition, the gamut constraint can be helpful for illuminant estimation even in an environment where illumination-dependent mapping is normalization-compatible. But deviations from normalization-compatible mapping provide further cues for illuminant estimation.
One example is the case of almost-monochromatic illuminants, which each produce a highly characteristic distribution in cone excitation space, whatever surfaces may populate the scene. Unfortunately, the gamut is seldom approached by natural surfaces. But the real-world constraints that determine the gamut may also influence appropriate statistics of the distribution of cone excitations, making these statistics usefully diagnostic of the illuminant. We next discuss the usefulness of various scene statistics for this purpose.
Higher-order chromatic scene statistics as cues for the illuminant: their distribution in natural scenes
Correlations
As mentioned above, the visual system has to deal with an ambiguity if it uses only the mean chromaticity of an image to estimate the chromaticity of its illumination. This statistic alone does not allow us to distinguish a scene under a chromatic illumination from a scene with a chromatically biased surface ensemble (Fig. 7.6a). But since a red illumination makes red surfaces (relatively) lighter, as shown in the analysis of the natural scenes (p. 220) and as predicted by the Gaussian World (p. 214), the correlation between redness and luminance may be diagnostic for the illumination. A high luminance–redness correlation among the elements of a scene might thus by itself suggest a reddish illuminant, no matter what the scene-averaged chromaticity (Fig. 7.6b), and could thus provide an estimate of illuminant
colour balance, even in a world where different scenes differ in predominant colour and hence violate the Grey World assumption. To express the point more generally: by evaluating both mean and correlation, an observer can estimate two unknowns—the predominant colour inherent in the objects making up the scene, and the chromaticity of the light source that illuminates the scene. In this way, higher-order statistics of the distribution of surface luminance and chromaticity within a scene can resolve the ambiguity encountered in considering scene-averaged chromaticity alone. We performed a theoretical analysis of higher-order chromatic scene statistics in natural scenes in order to investigate whether these measures can usefully support inferences about the illuminant. Figure 7.7 shows the correlation between redness and luminance within a scene versus the mean scene chromaticity for the 12 scenes under each illuminant. The mean redness of all scenes is, of course, highest under 4000K illumination. Contrary to our initial expectation, the correlations are almost independent of illumination.
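The two statistics under discussion, the scene-averaged redness and the within-scene correlation between redness and luminance, are straightforward to compute from per-pixel values. The Python sketch below shows one way to do so; the function name and the toy pixel values are assumptions for illustration and do not reproduce the analysis reported here.

```python
import numpy as np

def redness_statistics(log_r, log_lum):
    """Mean redness and luminance-redness correlation for one scene image.

    log_r:   per-pixel log10(r) values.
    log_lum: per-pixel log10(luminance) values, same length as log_r.
    """
    log_r = np.asarray(log_r, dtype=float)
    log_lum = np.asarray(log_lum, dtype=float)
    return {
        "mean_log_r": float(log_r.mean()),
        # Pearson correlation across the pixels of the scene
        "lum_red_corr": float(np.corrcoef(log_r, log_lum)[0, 1]),
    }

# Toy scene (hypothetical values) in which redder pixels are darker.
rng = np.random.default_rng(3)
log_r = rng.normal(-0.17, 0.02, size=2000)
log_lum = 1.0 - 5.0 * (log_r + 0.17) + rng.normal(0.0, 0.2, size=2000)
print(redness_statistics(log_r, log_lum))
```

An observer with access to both numbers has two measurements with which to estimate the two unknowns named above: the predominant surface colour and the illuminant chromaticity.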
Figure 7.7 Luminance–redness correlation and mean redness of natural scenes under different illuminations. The vertical coordinate of each data point represents the correlation between pixel redness (log(r)) and pixel luminance (log(luminance)) within an image of a scene under a particular illuminant. This correlation is plotted here against space average image redness (mean of log(r)). The four clusters show data for 12 natural scenes under four different illuminants, each plotted with its own symbol (colour temperatures 4000K, 5500K, 8500K and 20 000K). For a given scene, the luminance–redness correlations are almost independent of illumination, but within each illuminant cluster they are more negative for the redder scenes, as shown by the negatively sloped regression lines. Thus correlation and mean together separate the distributions of images resulting from different illuminants and make it possible to distinguish scene redness from illuminant redness.
But note that, in Fig. 7.7, for each illuminant the distribution of the two statistics across scenes is negatively sloped, as indicated by the regression lines. Thus, despite this invariance with illuminant, the correlation measure can resolve an ambiguity with which a visual system would have to deal, if it took only the mean chromaticity of a scene as a cue for the illuminant. This is possible because scene redness and illuminant redness affect the correlation differently:
• For a reddish scene under neutral lighting, the visual system’s diminishing sensitivity for long wavelengths introduces a negative correlation between pixel redness and pixel luminance. (This effect is reduced or absent for predominantly greenish scenes, since the redder pixels within such scenes are likely to have spectral distributions better placed in relation to the luminosity function. This accounts for the sloped regression lines in Fig. 7.7.)
• For a neutral scene under reddish light, the low luminosity of reds is counteracted by the illuminant’s greater energy at long wavelengths, making the correlation between pixel redness and pixel luminance greater in such a scene than for the reddish scene under neutral light.
Consider, as an example, a scene for which the mean log10(r) is −0.155. If only this mean chromaticity is known, this could be a reddish scene under neutral light or a neutral scene under reddish light. But if the visual system takes into account the correlation between redness and luminance, it can distinguish between illumination redness and scene redness. If the correlation is low, then a reddish scene under a neutral illuminant is more likely. Thus the use of the correlation measure can improve the estimation of the illuminant, even though this statistic is almost unaffected by changes in illumination. The correlation between luminance and blueness within a scene was almost independent of illumination for the set of scenes we used. Unlike the correlation between luminance and redness, the correlation between luminance and blueness varied too widely across these scenes for it to play a useful disambiguating role.
Variances
We also investigated the variance of the chromaticities within scenes. As Fig. 7.8 shows, the standard deviation of log(r) increases for only some scenes when the colour temperature of the illumination changes from 20 000K to 4000K, and there is no indication that this statistic could play the disambiguating role suggested for the luminance–redness correlation in Fig. 7.7. The lack of a consistent effect of illuminant on chromatic variance within scenes is not surprising since, as noted above, the illuminants we used did not vary much in bandwidth. The variance statistic could be diagnostic of a chromatically biased illuminant in more extreme cases than the ones considered here. The standard deviation of log(b), as well as the standard deviation of log(luminance), were likewise almost independent of illumination, and were similarly distributed for scenes of different mean chromaticity. This means that they cannot be very useful for diagnosing the illuminant.
Figure 7.8 Standard deviation of pixel redness within a scene, plotted versus scene-average redness for 12 natural scenes under four different illuminants, each plotted with its own symbol (colour temperatures 4000K, 5500K, 8500K and 20 000K).
Skewness

For the reddest surfaces in any scene, the redward shift in stimulus chromaticity under a reddish illuminant is ultimately limited by the spectrum locus. One might therefore suppose that a shift to reddish illumination would shift the already reddish pixels less toward red than greenish pixels. The distribution of chromaticities would in this way get skewed away from the chromaticity of the illuminant. The results of the simulation actually revealed an opposite trend. For each of the 12 natural scenes, the shift in log(r) in going from 20 000K illumination to 4000K illumination was higher for pixels with high redness (as measured by log(r) values under 7000K illumination), if we consider separately sets of pixels of similar blueness (similar values of the chromaticity coordinate, b). This may be due in part to the abundance in the natural scenes of greenish vegetation, which tends to have a relatively narrow reflectance band, and consequently to be resistant to chromaticity shift. In any case, this relation does not emerge clearly if one considers the whole set of pixels for a scene, since there is also a dependence of the shift in redness on the blueness of the pixels (p. 220), and this latter effect counterbalances the former if one pools all slices of constant blueness values. Thus the calculated Fisher skewness in log(r) or in log(b) within the scenes is almost independent of the illuminant.

The use of higher-order scene statistics by the visual system

One experiment that investigated the influence of higher-order scene statistics on colour constancy was performed by Mausfeld and Andres (2002). They created a new type of
stimulus that makes it possible to vary certain statistics of the chromaticity distribution of the image independently. These computer-generated displays consist of a random structure of overlapping circles around a central test spot.7 By varying the modulation of the colour of the circles along the luminance axis and along the red–green axis, Mausfeld and Andres (2002) obtained a set of such ‘Seurat’ stimuli, each with a different combination of variances for chromaticity and luminance. The spatial mean chromaticity and luminance of the surround were equal for all stimuli. Subjects made red–green equilibrium settings (‘unique yellow’ settings) by a double random staircase procedure. This subjective yellow point shifts toward the (mean) surround chromaticity, as expected, but the shift was greater for surrounds of reduced chromatic variance than for very heterogeneous surrounds. Mausfeld (1998; Chapter 13 this volume) discusses these results within an ethologically inspired perspective: certain stimulus characteristics trigger the visual system to interpret an image in terms of certain representational primitives. He argues that a reduced chromatic variance increases the tendency of the visual system to interpret a scene with a biased average chromaticity as chromatically illuminated.8 This results in a stronger correction for the appearance of all patches, including the test spot. Thus the effect of a chromatically non-neutral surround on unique yellow settings is larger if the surround has low chromatic variance than if it has high chromatic variance. That is, these surrounds are not ‘functionally equivalent’ even though the space-averaged chromaticity of both surrounds is equal. This contradicts algorithms for colour constancy in which only the space-average of the surround elements is important (e.g., those that rely on a Grey World assumption). 
7 If the diameter of the circles is chosen to be very small, these images resemble Neo-Impressionistic paintings. Therefore Mausfeld (1998) refers to these stimuli as Seurat-type configurations.
8 This assumption is based on the fact that a chromatically biased illuminant that is narrow in bandwidth will lead to a reduced variance of chromaticity values for the reflected lights of a scene. For a discussion of whether the chromatic variance of natural scenes changes under a range of typical daylights, see p. 231.

Another experiment that tackled the question of whether surrounds with the same space-averaged chromaticity always cause the same changes in appearance of a test region was reported by Jenness and Shevell (1995). They used red backgrounds with randomly scattered sparse white or green dots, and compared them to the red background without these dots and to uniform backgrounds with the same space-averaged chromaticity as the inhomogeneous backgrounds. Subjects had to set the test field so that it appeared neither reddish nor greenish. The influence of the inhomogeneous background on these unique yellow settings differed from that of the uniform background with the same space average.

To find out whether human vision employs higher-order scene statistics to estimate the illuminant, we performed experiments using stimuli similar to those used by Mausfeld and Andres (2002), but varying other statistics. We asked subjects to adjust the colour of a circular test field embedded in such a computer-generated display so that it appeared neutral grey. In our main experiment (Golz and MacLeod 2002) we varied the correlation between log(r) and log(luminance) independently of other statistics (means and variances). For a given condition, the chromaticity and luminance values for the circles in the surround were
Figure 7.9 Dependence of centre test spot settings on correlation between redness and luminance in background for subject JG. Closed circles are the means of log(r) for perceptually achromatic test fields, error bars are ± one standard error of the mean. If the settings were not dependent on the correlation but only on the mean of the chromaticity of the background, the settings should be the same for all conditions (horizontal dashed line). For the case that the settings are not dependent on the correlation but only on the luminance-weighted mean of the chromaticity of the background, the results are predicted by the oblique dashed line. The measured settings are significantly different from both models (P < 0.001).
chosen to achieve a certain correlation value (−1, −0.8, 0.0, 0.8 or 1). If the perceived colour of the centre test spot was not influenced by the varied correlation, then the settings to make the test spot neutral grey should be the same for all five conditions. This is because the space-averaged chromaticities of the backgrounds were the same. The backgrounds would then be functionally equivalent with respect to the perceived colour of the centre test spot. Results for subject JG are shown in Fig. 7.9. For conditions with higher correlation between redness and luminance, a more reddish chromaticity was required to make the test field subjectively achromatic. The data for eight of ten subjects tested were quantitatively similar and individually statistically significant. When the correlation between redness and luminance was positive, all subjects selected a physically more reddish (higher r) test field as neutral grey. Since higher r-values are associated with redder illumination of a physically neutral surface, this is the result expected if the observer infers a more reddish illumination in the case of positive luminance–redness correlation, and perceives neutral grey when a correspondingly reddish light stimulus is received. These results are thus consistent with the possible use of the luminance–redness correlation as a cue for the chromaticity of the illuminant, as suggested by the analysis of natural scenes earlier in this section. How much weight should a smart visual system give to the correlation between redness and luminance in estimating the illumination? To answer this question for our simulated world, we calculated a maximum likelihood estimate of the chromaticity of the illumination, based on the mean and correlation values of the scenes. Figure 7.10 shows the effect of the correlation between redness and luminance on the test spot settings for all subjects. The
Figure 7.10 Experimental results for all subjects (circles) compared with predictions for an estimation of the illuminant making optimal use of the correlation statistic (dashed lines). The steeper-sloped line is for the model in which the visual system uses the luminance-weighted mean chromaticity of the background; the more shallow-sloped line is for the unweighted mean. Error bars for the experimental results are ± one standard error for subject variability.
dashed lines through the data are parameter-free theoretical predictions, on the hypothesis that optimal weight is given to the correlation measure in estimating the illuminant. Two cases are illustrated. The steeper line is obtained for an optimal visual system that uses (in addition to the luminance–redness correlation) the luminance-weighted mean chromaticity (or the mean cone excitations) of the surround; the more shallow sloped line is for a system that applies no luminance weighting in evaluating the mean chromaticity. The small effect observed in our experiment is thus roughly consistent with optimal computation. Our experiments revealed no comparable effect for the correlation between blueness and luminance, or for the skewness of the blueness and redness distributions. This is also consistent with the theoretical analysis of natural scenes, in which the latter statistics did not appear helpful for estimating the illuminant. To summarize our discussion of the estimation problem: the effects of changing illumination on natural scenes indicate that the statistics of the cone excitations associated with individual scene elements are potentially helpful in resolving the ambiguity inherent in use of scene-averages alone. And recent evidence from perceptual experiments suggests that certain of these statistics are indeed exploited, possibly in a statistically nearly optimal manner.
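The surrounds in the main experiment described in this section differed in their luminance–redness correlation while their means and variances were held fixed. One way in which such value pairs could be constructed is sketched below. This is an illustrative construction only, not the stimulus-generation procedure actually used; the function name, the sample size and all numerical values are hypothetical.

```python
import numpy as np

def correlated_pairs(n, rho, mean_x, sd_x, mean_y, sd_y, seed=0):
    """Return n (x, y) pairs whose sample means, (population) standard
    deviations and Pearson correlation match the requested values up to
    floating-point rounding."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    z = rng.standard_normal(n)
    # Standardize x to zero mean and unit standard deviation.
    x = (x - x.mean()) / x.std()
    # Remove from z its mean and its component along x, then standardize.
    z = z - z.mean()
    z = z - (z @ x) / n * x
    z = z / z.std()
    # Mix the two uncorrelated components to obtain the target correlation.
    y = rho * x + np.sqrt(1.0 - rho ** 2) * z
    return mean_x + sd_x * x, mean_y + sd_y * y

# The five correlation conditions named in the text; means and spreads are
# placeholders for log(luminance) and log(r).
for rho in (-1.0, -0.8, 0.0, 0.8, 1.0):
    log_lum, log_r = correlated_pairs(200, rho, 1.0, 0.2, -0.155, 0.03)
    print(rho, round(float(np.corrcoef(log_lum, log_r)[0, 1]), 6))
```

Because the second component is made exactly uncorrelated with the first before mixing, only the correlation changes between conditions, which is the property the experimental design requires.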
Concluding summary

We introduce this chapter with an idealization of the world of colour in which the interplay of illuminants, surfaces and photoreceptors becomes mathematically tractable. Normalization is a fairly effective algorithm for colour correction in this Gaussian World, and also in the
real world, as viewed by human retinas (pp. 214–22). The recent evidence reviewed on pp. 222–24 indicates that normalization may be a fairly good model for the human visual system’s compensation for changing illumination. On pp. 224–35 we turn to the problem of illuminant estimation. There we show, theoretically and experimentally, how considering higher-order statistics of the cone excitation distribution makes it possible to dissociate scene-average colouration from illuminant colouration.
Acknowledgements

We thank D. L. Ruderman, T. W. Cronin, and C. C. Chiao for supplying us with their spectral data of natural scenes. This work was supported by NIH grant EY01711. J. Golz was supported by the German–American Fulbright Commission.
References

Brainard, D. H. and Freeman, W. T. (1997). Bayesian color constancy. Journal of the Optical Society of America A 14, 1393–1411.
Brainard, D. H. and Wandell, B. A. (1992). Asymmetric color matching: how color appearance depends on the illuminant. Journal of the Optical Society of America A 9, 1433–1448.
Brainard, D. H., Brunt, W. A., and Speigle, J. M. (1997). Color constancy in the nearly natural image. I. Asymmetric matches. Journal of the Optical Society of America A 14, 2091–2110.
Brill, M. H. (1978). A device performing illuminant-invariant assessment of chromatic relations. Journal of Theoretical Biology 71, 473–478.
Brown, R. O. and MacLeod, D. I. A. (1992). Saturation and color constancy. Advances in Color Vision Technical Digest (Optical Society of America) 4, 110–111.
Brown, R. O. and MacLeod, D. I. A. (1997). Color appearance depends on the variance of surround colors. Current Biology 7, 844–849.
Buchsbaum, G. (1980). A spatial processor model for object colour perception. Journal of The Franklin Institute 310, 1–26.
Campenhausen, C. V. (1986). Photoreceptors, lightness constancy and color vision. Naturwissenschaften 73, 674–675.
Chichilnisky, E. J. and Wandell, B. A. (1995). Photoreceptor sensitivity changes explain color appearance shifts induced by large uniform backgrounds in dichoptic matching. Vision Research 35, 239–254.
Dannemiller, J. L. (1993). Rank orderings of photoreceptor photon catches from natural objects are nearly illuminant-invariant. Vision Research 33, 131–40.
D’Zmura, M., Iverson, G., and Singer, B. (1995). Probabilistic color constancy. In Geometric representations of perceptual phenomena, (ed. D. Luce, M. D’Zmura, D. Hoffman, G. Iverson, and A. Romney). Lawrence Erlbaum Associates, Mahwah.
Forsyth, D. A. (1990). Colour constancy. In AI and the Eye, (ed. A. Blake), pp. 201–228. Wiley, Chichester.
Foster, D. H. and Nascimento, S. M. (1994). Relational colour constancy from invariant cone-excitation ratios. Proceedings of the Royal Society of London. Series B: Biological Sciences 257, 115–121.
Gilchrist, A. L. and Jacobsen, A. (1983). Lightness constancy through a veiling luminance. Journal of Experimental Psychology 9, 936–944.
Golz, J. and MacLeod, D. I. A. (2002). Influence of scene statistics on colour constancy. Nature 415, 637–640.
Helmholtz, H. von (1896). Handbuch der Physiologischen Optik, (2nd edn). Voss, Hamburg.
Helson, H. (1938). Fundamental problems in color vision. I. The principle governing changes in hue, saturation, and lightness of nonselective samples in chromatic illumination. Journal of Experimental Psychology 23, 439–476.
Hurlbert, A. C. (1998). Computational models of color constancy. In Perceptual constancy: Why things look as they do, (ed. V. Walsh and J. Kulikowski), pp. 283–322. Cambridge University Press, Cambridge.
Jenness, J. W. and Shevell, S. K. (1995). Color appearance with sparse chromatic context. Vision Research 35, 797–805.
Land, E. H. and McCann, J. J. (1971). Lightness and retinex theory. Journal of the Optical Society of America 61, 1–11.
MacLeod, D. I. A. (1985). Receptoral constraints on color appearance. In Central and peripheral mechanisms of color vision, (ed. D. Otterson and S. Zeki). MacMillan, London.
MacLeod, D. I. A. and Boynton, R. M. (1979). Chromaticity diagram showing cone excitation by stimuli of equal luminance. Journal of the Optical Society of America 69, 1183–1186.
MacLeod, D. I. A., Williams, D. R. and Makous, W. (1992). A visual nonlinearity fed by single cones. Vision Research 32, 347–363.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. W.H. Freeman, San Francisco.
Mausfeld, R. (1998). Color perception: From Grassmann codes to a dual code for object and illumination colors. In Color vision: Perspectives from different disciplines, (ed. W. Backhaus, R. Kliegel, and J. S. Werner), pp. 219–250. de Gruyter, Berlin.
Mausfeld, R. and Andres, J. (2002). Second order statistics of colour codes modulate transformations that effectuate varying degrees of scene invariance and illumination invariance. Perception 31, 209–224.
Moon, P. and Spencer, D. E. (1945). Polynomial representation of spectral curves. Journal of the Optical Society of America 35, 597–600.
Nascimento, S. M. and Foster, D. H. (1997). Detecting natural changes of cone-excitation ratios in simple and complex coloured images. Proceedings of the Royal Society of London. Series B: Biological Sciences 264, 1395–1402.
Ruderman, D. L., Cronin, T. W., and Chiao, C. C. (1998). Statistics of cone responses to natural images: Implications for visual coding. Journal of the Optical Society of America A 15, 2036–2045.
Sällström, P. (1973). Colour and physics: Some remarks concerning the physical aspects of human colour vision (Report No. 73–09). Institute of Physics, University of Stockholm, Stockholm.
Shepard, R. N. (1992). The Perceptual organization of colors: An adaptation to regularities of the terrestrial world? In The adapted mind, (ed. J. H. Barkow, L. Cosmides, and J. Tooby). Oxford University Press, New York.
Stockman, A., MacLeod, D. I. A., and Johnson, N. E. (1993). Spectral sensitivities of the human cones. Journal of the Optical Society of America A 10, 2491–2521.
Ware, C. and Cowan, W. B. (1982). Changes in perceived color due to chromatic interactions. Vision Research 22, 1353–1362.
Worthey, J. A. and Brill, M. H. (1986). Heuristic analysis of von Kries color constancy. Journal of the Optical Society of America A 3, 1708–1712.
Wyszecki, G. and Stiles, W. S. (1982). Color science: Concepts and methods, quantitative data and formulae, (2nd edn). John Wiley & Sons, New York.
Appendix: Cone excitations in the Gaussian World

In the Gaussian World, three parameters characterize illuminants, surfaces or cone sensitivities. These are: the spectral centroid; the spectral curvature, which is related to spectral dispersion or bandwidth; and the value at the centroid, which is a maximum or a minimum, depending on the sign of the spectral curvature. We denote the spectral centroids for illuminant power distribution, surface reflectance and cone sensitivity by $\lambda_I$, $\lambda_S$, and $\lambda_C$ respectively, where $\lambda_C$ is, of course, a different number for each cone type. Spectral curvatures are similarly $k_I$, $k_S$, and $k_C$; centroid values (maximum or minimum values, in either case scaling the entire function) are $I_{\max}$, $S_{\max}$, and $C_{\max}$, respectively. The following three equations then apply, with the parameter values appropriate to any given surface, illuminant and cone type:

Illuminant spectral power density for illuminant I:
$$I(\lambda) = I_{\max} \exp[-k_I (\lambda - \lambda_I)^2]$$
Surface spectral reflectance function for surface S:
$$S(\lambda) = S_{\max} \exp[-k_S (\lambda - \lambda_S)^2]$$
Cone sensitivity function for cone C:
$$C(\lambda) = C_{\max} \exp[-k_C (\lambda - \lambda_C)^2]$$
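If $k_I + k_S + k_C > 0$, as is generally the case since the cone sensitivity function is typically narrower than $I(\lambda)$ or $S(\lambda)$, the cone excitation elicited in cone C by surface S under illuminant I is given by:
$$\begin{aligned}
E(I, S, C) &= \int_{-\infty}^{\infty} I(\lambda) S(\lambda) C(\lambda)\, d\lambda \\
&= I_{\max} S_{\max} C_{\max} \int_{-\infty}^{\infty} \exp[-k_I (\lambda - \lambda_I)^2 - k_S (\lambda - \lambda_S)^2 - k_C (\lambda - \lambda_C)^2]\, d\lambda
\end{aligned}$$
The integral can be evaluated by grouping the terms involving $\lambda$ and then completing the square in the exponent, transferring the term independent of $\lambda$ outside of the integrand:
$$\begin{aligned}
E(I, S, C) &= I_{\max} S_{\max} C_{\max} \exp[-(k_I \lambda_I^2 + k_S \lambda_S^2 + k_C \lambda_C^2)] \\
&\quad \times \int_{-\infty}^{\infty} \exp[-(k_I + k_S + k_C)\lambda^2 + 2(k_I \lambda_I + k_S \lambda_S + k_C \lambda_C)\lambda]\, d\lambda \\
&= I_{\max} S_{\max} C_{\max} \exp\left[-(k_I \lambda_I^2 + k_S \lambda_S^2 + k_C \lambda_C^2) + \frac{(k_I \lambda_I + k_S \lambda_S + k_C \lambda_C)^2}{k_I + k_S + k_C}\right] \\
&\quad \times \int_{-\infty}^{\infty} \exp\left[-\left(\sqrt{k_I + k_S + k_C}\,\lambda - \frac{k_I \lambda_I + k_S \lambda_S + k_C \lambda_C}{\sqrt{k_I + k_S + k_C}}\right)^2\right] d\lambda \\
&= k_0 \exp\left[-(k_I \lambda_I^2 + k_S \lambda_S^2 + k_C \lambda_C^2) + \frac{(k_I \lambda_I + k_S \lambda_S + k_C \lambda_C)^2}{k_I + k_S + k_C}\right]
\end{aligned}$$
where $k_0 = I_{\max} S_{\max} C_{\max} \sqrt{\pi/(k_I + k_S + k_C)}$. The square root factor here is the value of the Gaussian integral in the previous expression.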
The terms in the exponent can now be rearranged as a sum of squares, thereby representing the cone excitation as a product of three Gaussian factors (together with the intensity factor $k_0$, which in many contexts may be neglected, since it is independent of the spectral variables but incorporates a relatively slight dependence on bandwidth).
$$\begin{aligned}
E(I, S, C) &= k_0 \exp\left\{-\frac{1}{k_I + k_S + k_C}\left[k_I k_S (\lambda_I - \lambda_S)^2 + k_I k_C (\lambda_I - \lambda_C)^2 + k_S k_C (\lambda_S - \lambda_C)^2\right]\right\} \\
&= k_0 \exp\left[-\frac{k_I k_S}{k_I + k_S + k_C}(\lambda_I - \lambda_S)^2\right] \exp\left[-\frac{k_I k_C}{k_I + k_S + k_C}(\lambda_I - \lambda_C)^2\right] \\
&\quad \times \exp\left[-\frac{k_S k_C}{k_I + k_S + k_C}(\lambda_S - \lambda_C)^2\right] \qquad \text{(A7.1)}
\end{aligned}$$
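The closed form in Equation A7.1 can be checked against direct numerical integration of the product of the three exponential-quadratic functions. The sketch below does this for one arbitrary parameter set; the function names, the wavelength units (nm) and all numerical values are illustrative assumptions, not values from the chapter.

```python
import numpy as np

def gaussian(lam, peak, k, centre):
    """Exponential-quadratic spectrum: peak * exp(-k * (lam - centre)**2)."""
    return peak * np.exp(-k * (lam - centre) ** 2)

def excitation_closed_form(illum, surf, cone):
    """Equation A7.1; each argument is a (peak, k, centre) triple."""
    (i_max, k_i, lam_i), (s_max, k_s, lam_s), (c_max, k_c, lam_c) = illum, surf, cone
    k_sum = k_i + k_s + k_c
    k0 = i_max * s_max * c_max * np.sqrt(np.pi / k_sum)
    exponent = -(k_i * k_s * (lam_i - lam_s) ** 2
                 + k_i * k_c * (lam_i - lam_c) ** 2
                 + k_s * k_c * (lam_s - lam_c) ** 2) / k_sum
    return k0 * np.exp(exponent)

def excitation_numerical(illum, surf, cone, lo=-2000.0, hi=3000.0, n=500_001):
    """Riemann-sum approximation of the integral of I * S * C over wavelength."""
    lam = np.linspace(lo, hi, n)
    integrand = gaussian(lam, *illum) * gaussian(lam, *surf) * gaussian(lam, *cone)
    return integrand.sum() * (lam[1] - lam[0])

# Illustrative parameters (wavelengths in nm, curvatures in nm^-2).
illuminant = (1.0, 2e-6, 600.0)
surface = (0.5, 5e-6, 520.0)
cone_L = (1.0, 3e-4, 565.0)
closed = excitation_closed_form(illuminant, surface, cone_L)
numeric = excitation_numerical(illuminant, surface, cone_L)
print(closed, numeric)
```

The integration limits extend well beyond the visible range only because the idealized functions are defined over the whole real line; the integrand is negligible outside a few cone bandwidths of its peak.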
The first Gaussian factor depends both on the illuminant and the surface parameters, and therefore might appear to jeopardize colour constancy based on normalization, since it can’t be normalized out for all surfaces at the same time. But this factor is independent of the cone spectral centroid, which means that it generates variations in effective intensity only. What this factor means is simply that (e.g.) when the lighting gets red, the reds get lighter relative to the shorter-wavelength surfaces. Unlike other deviations from strict normalization-compatibility mentioned below, this factor is potentially significant even in the important case of broadband illuminants and surfaces, that is if kS ≪ kC and kI ≪ kC , provided neither spectral curvature is zero. The second Gaussian factor similarly determines the illuminant colour. A key point to note is that this factor is approximately the same for all broadband surfaces, (surfaces for which kS ≪ kI + kC ), as it then depends only on (λI − λC ), the difference between the illuminant spectral centroid and that of the cone spectral sensitivity. This means that this illuminant factor can be removed, for all such surfaces, by a simple normalization, in just the way that was possible only for effectively monochromatic radiations in the Three-band World (p. 209). The final Gaussian is a factor representing surface colour, independent of the illuminant. It is a Gaussian function (or if kS kC is negative, the reciprocal of one) of the spectral distance of the surface’s spectral centroid from that of the cone. This is what normalization preserves, and it is what a colour constant visual system needs to preserve.
Colour differences approximately invariant with illumination

Chromatic values in the Gaussian World are most easily analysed by considering intensity-invariant chromatic coordinates in log cone excitation space. One such coordinate is the difference in the logs of the L and M cone excitations. To evaluate this we replace λC by λM and λL respectively in Equation A7.1, assuming the same cone spectral bandwidths kC in each case (both for simplicity, and because the bandwidths can be made equal by adopting a more suitable function of wavelength, approximately its logarithm, as the
spectral variable at the outset). We also assume the same value of the peak cone sensitivity $C_{\max}$; this can always be arranged by appropriate choice of the units for the cone excitations. By substituting into Equation A7.1 we obtain the difference between the natural logs of the L and M cone excitations:
$$\log(L) - \log(M) = \log(L/M) = \frac{2(\lambda_L - \lambda_M) k_C \{k_I[\lambda_I - (\lambda_L + \lambda_M)/2] + k_S[\lambda_S - (\lambda_L + \lambda_M)/2]\}}{k_I + k_S + k_C} \qquad \text{(A7.2)}$$
This expression gives a particularly simple mathematical embodiment to the trade-off between illuminant colour and surface colour in the determination of the retinal stimulus, since illumination and surface colour here contribute two symmetrical terms, derived respectively from the second and the third Gaussian factors in Equation A7.1. (The $k_0$ factor and the first of the three Gaussian factors in Equation A7.1 have been cancelled, being equal for the two cone types compared.) The sum of the terms in braces involving $k_I$ and $k_S$ is the effective spectral centroid of the stimulus, measured from the mean peak wavelength of the L and M cones. The difference in log excitation is proportional to that spectral distance and also to two other factors: the spectral separation of the L and M cone sensitivities, and a squared-bandwidth factor $k_C/(k_I + k_S + k_C)$. The bandwidth factor is an inverse measure of the spectral curvature of a cone’s effective stimulus, relative to that of the cone sensitivity itself. Similarly, if $\lambda_B$ is the spectral centroid of the spectral sensitivity of the blue or short-wavelength cones (referred to elsewhere as the S cones, but here, to avoid confusion with the surface identifier, as the B cones, with excitation B),
$$\log(B) - \log(L) = \frac{2(\lambda_B - \lambda_L) k_C \{k_I[\lambda_I - (\lambda_B + \lambda_L)/2] + k_S[\lambda_S - (\lambda_B + \lambda_L)/2]\}}{k_I + k_S + k_C}. \qquad \text{(A7.3)}$$
Equations A7.2 and A7.3 allow us to locate surface-derived stimuli in the chromatic plane $[\log(L) - \log(M), \log(B) - \log(L)]$, shown in Fig. 7.1. (Note that the quantities r and log(r) employed in much of our analysis of natural images are both practically linear with $\log(L) - \log(M)$, making the three quantities practically equivalent for most theoretical purposes. Likewise, log(b) is almost linear with $\log(B) - \log(L)$ for natural colours.) Under spectrally flat illumination ($k_I = 0$), spectrally non-selective surfaces ($k_S = 0$) are at the origin in Fig. 7.1 since the illuminant term and the surface colour term are both zero. For selective surfaces, the non-zero surface colour term generates the stimulus coordinates:
$$\log(L) - \log(M) = \frac{2(\lambda_L - \lambda_M) k_S k_C}{k_S + k_C}[\lambda_S - (\lambda_L + \lambda_M)/2],$$
$$\log(B) - \log(L) = \frac{2(\lambda_B - \lambda_L) k_S k_C}{k_S + k_C}[\lambda_S - (\lambda_B + \lambda_L)/2].$$
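Equations A7.2 and A7.3 can be evaluated numerically and checked against the log ratios obtained directly from Equation A7.1. In the sketch below the cone peak wavelengths and the shared cone curvature are hypothetical round numbers chosen for illustration, as are the function names.

```python
import numpy as np

# Hypothetical cone parameters: peak wavelengths in nm, shared curvature in nm^-2.
LAM_L, LAM_M, LAM_B = 565.0, 535.0, 440.0
K_C = 3e-4

def log_excitation(k_i, lam_i, k_s, lam_s, lam_c, k_c=K_C):
    """Log of Equation A7.1 without the k0 term, which is common to all cones."""
    k_sum = k_i + k_s + k_c
    return -(k_i * k_s * (lam_i - lam_s) ** 2
             + k_i * k_c * (lam_i - lam_c) ** 2
             + k_s * k_c * (lam_s - lam_c) ** 2) / k_sum

def opponent_coordinate(lam_1, lam_2, k_i, lam_i, k_s, lam_s, k_c=K_C):
    """log(E1) - log(E2) for cones peaking at lam_1 and lam_2 (A7.2 / A7.3)."""
    mid = 0.5 * (lam_1 + lam_2)
    return (2.0 * (lam_1 - lam_2) * k_c
            * (k_i * (lam_i - mid) + k_s * (lam_s - mid)) / (k_i + k_s + k_c))

def chromatic_plane(k_i, lam_i, k_s, lam_s):
    """Return (log L - log M, log B - log L)."""
    return (opponent_coordinate(LAM_L, LAM_M, k_i, lam_i, k_s, lam_s),
            opponent_coordinate(LAM_B, LAM_L, k_i, lam_i, k_s, lam_s))

# One illustrative illuminant/surface pair.
k_i, lam_i, k_s, lam_s = 2e-6, 600.0, 5e-6, 520.0
x, y = chromatic_plane(k_i, lam_i, k_s, lam_s)
x_direct = (log_excitation(k_i, lam_i, k_s, lam_s, LAM_L)
            - log_excitation(k_i, lam_i, k_s, lam_s, LAM_M))
y_direct = (log_excitation(k_i, lam_i, k_s, lam_s, LAM_B)
            - log_excitation(k_i, lam_i, k_s, lam_s, LAM_L))
print(x, x_direct, y, y_direct)
```

With `k_i = 0` and `k_s = 0` both coordinates are zero, which places the non-selective surface under flat illumination at the origin, as stated above.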
Loci of constant λS , traced out by varying kS , radiate in straight lines from the origin, each coordinate being proportional to a purity measure, kS /(kS + kC ), that approaches unity for
monochromatic reflectances. Loci of constant surface bandwidth $k_S$ lie on parallel straight lines, such as the crossbars of the ‘kites’ in Fig. 7.1, that have a slope equal to the ratio of the separations between the cone sensitivity peaks. As a special case, the spectrum locus is entirely straight in these logarithmic coordinates, though not, of course, in a chromaticity diagram where the coordinates are linearly related to the cone excitations. Under chromatic illumination, the illuminant term becomes non-zero and the spectrally non-selective surface shifts from the origin to the point:
$$\left(\frac{2(\lambda_L - \lambda_M) k_I k_C}{k_I + k_C}[\lambda_I - (\lambda_L + \lambda_M)/2],\ \frac{2(\lambda_B - \lambda_L) k_I k_C}{k_I + k_C}[\lambda_I - (\lambda_B + \lambda_L)/2]\right).$$
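The shift of the non-selective surface given by this expression can be compared numerically with the displacement that the same change of illuminant produces for selective surfaces of different bandwidth. The following sketch uses hypothetical cone and spectral parameters (all values, and the function names, are assumptions made for illustration); it is meant only to show the size of the effects, not to reproduce Fig. 7.1.

```python
import numpy as np

LAM_L, LAM_M, LAM_B = 565.0, 535.0, 440.0  # hypothetical cone peaks (nm)
K_C = 3e-4                                  # hypothetical cone curvature (nm^-2)

def chromatic_plane(k_i, lam_i, k_s, lam_s):
    """(log L - log M, log B - log L) following Equations A7.2 and A7.3."""
    k_sum = k_i + k_s + K_C

    def coord(lam_1, lam_2):
        mid = 0.5 * (lam_1 + lam_2)
        return (2.0 * (lam_1 - lam_2) * K_C
                * (k_i * (lam_i - mid) + k_s * (lam_s - mid)) / k_sum)

    return np.array([coord(LAM_L, LAM_M), coord(LAM_B, LAM_L)])

def displacement(k_i, lam_i, k_s, lam_s):
    """Shift of a surface's coordinates when flat light (k_I = 0) is replaced
    by the illuminant (k_i, lam_i)."""
    return (chromatic_plane(k_i, lam_i, k_s, lam_s)
            - chromatic_plane(0.0, lam_i, k_s, lam_s))

k_i, lam_i = 2e-6, 600.0                          # a broadband, reddish illuminant
neutral = displacement(k_i, lam_i, 0.0, 550.0)    # non-selective surface
broad = displacement(k_i, lam_i, 2e-6, 520.0)     # broadband greenish surface
narrow = displacement(k_i, lam_i, 3e-3, 520.0)    # narrow-band greenish surface
print(neutral, broad, narrow)
```

For these values the broadband surface is displaced almost exactly as the neutral surface is, whereas the narrow-band surface moves much less.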
Selective surfaces undergo a comparable displacement. In the important range of conditions where the illuminant bandwidth and surface bandwidth are both large enough, relative to the cone bandwidth, to keep the factor kC /(kI + kS + kC ) close to 1, the displacement is approximately the same for all surfaces. Under such conditions, normalization achieves approximate colour constancy. Deviations from this simple pattern of rigid translation take two forms: gamut compression due to illuminant bandwidth restriction and shift resistance due to surface-reflectance bandwidth restriction. If illuminant bandwidths differ, the narrower band illuminants (with high kI ) reduce saturation, compressing the constellation of stimulus colours by the factor kC /(kI + kS + kC ) toward the locus of the neutral (non-selective) surface under the prevailing illuminant; we refer to this as gamut compression. Likewise, if surface bandwidths differ, variation in kS reduces the shift of the narrower band surfaces toward the illuminant colour, by this same factor; in Fig. 7.1 this is seen as a skewing of the kites as their radii each pivot around a fixed point on the spectrum locus. In the limit of narrow illuminant bandwidth, surface stimuli all assume the chromaticity of the monochromatic illuminant. In the limit of narrow surface bandwidth (monochromatic reflectance), changing illumination makes no difference to the chromaticity of the stimulus. Under narrow-band illuminants, shift resistance and gamut compression work together to compress colour differences orthogonal to the spectrum locus (‘saturation’ values). As a result, these differences are compressed more than the ‘hue’ differences that are parallel to the spectrum locus.

The illumination-invariant plane

The reason why changing illumination does not lead to strictly rigid translation in the coordinate system of Fig.
7.1 is that the coordinates (from Equations A7.2 and A7.3) are not additive combinations of an illuminant component and a surface component. Indeed, the bandwidth factor kC /(kI + kS + kC ) by itself is not additive in that sense. But its reciprocal is, and so too is the effective spectral centroid kI λI + kS λS . By adopting these two quantities as coordinates, we can create a colour plane that is illumination-invariant, in the sense that differences between any pair of surface colours are represented by illumination-invariant vectors. In this plane, the effect of changing illumination is always a precisely rigid translation of the constellation of points representing the different surfaces within a scene. Geometrically, the plane of Fig. 7.1 is a perspective view of this illumination-invariant plane. Instead of regarding the ‘kites’ as vertically oriented in the plane of the
paper and tethered to the spectrum locus below, they should be imagined as oriented in an orthogonal plane that recedes in depth toward the spectrum locus at infinite distance. The triples of apparently converging ‘tether’ lines are parallel within the illumination-invariant plane, where they recede toward their vanishing points on the spectrum locus. The depth coordinate is linear with kI + kS , a spectral curvature value that approaches infinity for monochromatic stimuli. The horizontal coordinate is linear with the effective spectral centroid kI λI + kS λS . These coordinates of the illumination-invariant plane are each expressible as a ratio of linear combinations of the logs of the cone excitations. But the visual system has little to gain by computing them, since for natural colours, illumination-dependent mapping is generally well approximated by translation in Fig. 7.1, and is therefore normalization-compatible.
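The rigid-translation property of the illumination-invariant plane can be illustrated numerically. In the sketch below, the log coordinates of Equations A7.2 and A7.3 are computed for two surfaces under two illuminants, the quantities kI + kS and kI λI + kS λS are recovered from them by solving a pair of linear equations, and the difference vector between the surfaces is compared across illuminants. Cone parameters, spectral parameters and function names are all hypothetical choices made for this illustration.

```python
import numpy as np

LAM_L, LAM_M, LAM_B = 565.0, 535.0, 440.0   # hypothetical cone peaks (nm)
K_C = 3e-4                                   # hypothetical cone curvature (nm^-2)

def log_coordinates(k_i, lam_i, k_s, lam_s):
    """(log L - log M, log B - log L) from Equations A7.2 and A7.3."""
    k_sum = k_i + k_s + K_C
    centroid = k_i * lam_i + k_s * lam_s      # effective spectral centroid
    curvature = k_i + k_s
    x = 2 * (LAM_L - LAM_M) * K_C * (centroid - curvature * (LAM_L + LAM_M) / 2) / k_sum
    y = 2 * (LAM_B - LAM_L) * K_C * (centroid - curvature * (LAM_B + LAM_L) / 2) / k_sum
    return x, y

def invariant_coordinates(x, y):
    """Recover (k_I + k_S, k_I*lam_I + k_S*lam_S) from the log coordinates
    by solving the two equations, which are linear in these unknowns."""
    a1, m1 = 2 * (LAM_L - LAM_M) * K_C, (LAM_L + LAM_M) / 2
    a2, m2 = 2 * (LAM_B - LAM_L) * K_C, (LAM_B + LAM_L) / 2
    # x * (curv + K_C) = a1 * (cent - curv * m1), and likewise for y.
    A = np.array([[x + a1 * m1, -a1],
                  [y + a2 * m2, -a2]])
    rhs = np.array([-x * K_C, -y * K_C])
    curvature, centroid = np.linalg.solve(A, rhs)
    return np.array([curvature, centroid])

surface_a = (5e-6, 520.0)                      # (k_S, lam_S), illustrative values
surface_b = (2e-5, 610.0)
illuminants = [(0.0, 550.0), (2e-5, 620.0)]    # (k_I, lam_I): flat, then reddish

differences = []
for k_i, lam_i in illuminants:
    pa = invariant_coordinates(*log_coordinates(k_i, lam_i, *surface_a))
    pb = invariant_coordinates(*log_coordinates(k_i, lam_i, *surface_b))
    differences.append(pb - pa)
print(differences)
```

The two difference vectors agree, whereas the corresponding differences in the log coordinates themselves do not, because the bandwidth factor changes with the illuminant.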
Commentary on MacLeod and Golz

The importance of realistic models of surface and light in the study of human colour vision

Laurence T. Maloney

MacLeod and Golz propose explicit models of possible surface reflectance functions and illuminant spectral power distributions as part of a computational model of human colour vision. In their idealized Gaussian World, reflectance functions, spectral power distributions, and photoreceptor spectral sensitivity functions can all be expressed, across the human-visible spectrum, as exponential-quadratic functions of wavelength. They derive a closed-form formula for the excitation of a photoreceptor exposed to the light emitted from a particular surface under a particular illuminant. The resulting computational model of human colour vision comprises: (1) the Gaussian World model of image formation, (2) the assumption that human chromatic processing is confined to a scaling of photoreceptor excitations, and (3) an unspecified algorithm that selects the scale factors for each class of photoreceptor based on scene statistics. Analysis of the model leads them to predict a particular lightness interaction in human vision: ‘when the light gets redder, the reds get lighter’.

Their model is an impressive attempt to link the basic physics of light-surface interaction to observers’ colour judgements in psychophysical tasks. While they present it as a package, it is evident that each component can be examined and tested in isolation. The component I will concentrate on is the one they characterize most completely: the model of surface spectral reflectance functions and spectral power distributions of lights that they refer to as Gaussian World. In Gaussian World, a surface spectral reflectance function is an exponential-quadratic,
$$S(\lambda) = S_m \exp(-k_S (\lambda - \lambda_S)^2) \qquad (7.1)$$
It is characterized by the three parameters (Sm , kS , λS ), which are readily interpretable as controlling the vertical scale, ‘narrowness’ and location of the surface reflectance within the visible spectrum. Note that the parameter kS need not be positive. The spectral power distribution of the light is similarly specified by the three parameters (Im , kI , λI ), and the spectral sensitivity of a photoreceptor by the three values, (Cm , kC , λC ), which, of course, do not change from scene to scene and are therefore physical constants characterizing the visual system. MacLeod and Golz compute, in closed form, the excitation of a photoreceptor exposed to an illuminant reflected from a given surface in terms of the six parameters characterizing the light and the surface. The evident question is, how accurate is the ‘Gaussian World’ idealization as a model of surfaces and illuminants encountered in the natural environment? This question could be addressed directly by fitting the model in Equation 7.1 to empirical data (by whatever criterion the authors believe is appropriate) and reporting the distribution of goodness-of-fit measures. The authors do not do this (or cite any readily available evidence bearing on the issue). It is difficult to see how their models could fit the approximately step-function spectral reflectance functions that typically correspond to reddish or yellowish biological colourants (Lythgoe 1979; Chittka et al. 1994). Even if Gaussian World were not a very accurate model of the physical environment, we could still ask, is the model of chromatic processing proposed by MacLeod and Golz an accurate model of human chromatic processing? To address this question, we could study the colour judgements of observers embedded in a simulated Gaussian World, something that is now possible experimentally (Maloney and Yang, Chapter 11 this volume). 
Observers in a Gaussian World environment should exhibit the specific, quantitative pattern of errors predicted by MacLeod and Golz.
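To make the idea of a closed-form excitation concrete, the following is a minimal sketch rather than MacLeod and Golz's own derivation. It assumes that the integral is taken over the whole real line (not just the visible spectrum) and that kI + kS + kC > 0, in which case the standard Gaussian integral applies. All parameter values and function names are hypothetical and chosen only for illustration.

```python
import math

def exp_quadratic(lam, m, k, centre):
    """Exponential-quadratic m * exp(-k * (lam - centre)**2), the form of Equation 7.1."""
    return m * math.exp(-k * (lam - centre) ** 2)

def excitation_closed_form(light, surface, cone):
    """Integral over the real line of light * surface * cone.

    Assumption: K = kI + kS + kC must be positive, otherwise the integral diverges.
    """
    (i_m, k_i, lam_i), (s_m, k_s, lam_s), (c_m, k_c, lam_c) = light, surface, cone
    K = k_i + k_s + k_c
    if K <= 0:
        raise ValueError("kI + kS + kC must be positive")
    pairwise = (k_i * k_s * (lam_i - lam_s) ** 2
                + k_i * k_c * (lam_i - lam_c) ** 2
                + k_s * k_c * (lam_s - lam_c) ** 2)
    return i_m * s_m * c_m * math.sqrt(math.pi / K) * math.exp(-pairwise / K)

def excitation_numeric(light, surface, cone, lo=0.0, hi=1200.0, steps=24000):
    """Midpoint-rule check of the same integral on a wide wavelength range (nm)."""
    width = (hi - lo) / steps
    total = 0.0
    for n in range(steps):
        lam = lo + (n + 0.5) * width
        total += (exp_quadratic(lam, *light)
                  * exp_quadratic(lam, *surface)
                  * exp_quadratic(lam, *cone))
    return total * width

# Hypothetical (scale, narrowness, location) triples; wavelengths in nm.
LIGHT = (1.0, 2e-5, 600.0)
SURFACE = (0.6, 1e-4, 620.0)
INVERSE_SURFACE = (0.3, -1e-5, 500.0)  # kS need not be positive
CONE = (1.0, 4e-4, 560.0)
```

The numerical check is there only to confirm the algebra. Restricting the integral to the visible spectrum would change the result when the product is not negligible outside it.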
A fair test of the model would include the possibility that observers do not correctly estimate the illuminant in a given scene. As a consequence, they may not select the correct photoreceptor scaling factors. By treating these factors as free parameters and fitting them to the observer’s data, we can obviate this difficulty and still test the model. In doing this, we adapt the equivalent illuminant approach of Brainard and colleagues to Gaussian World (Brainard et al. 1997; Brainard 1998). In contrast, MacLeod and Golz develop a series of qualitative predictions, including, for example, the prediction that ‘when the lighting gets red, the reds get lighter relative to the shorter wavelength surfaces’ (p. 000). This sentence occurs in their discussion of the surface-light interaction term,
$$\exp\left(-\frac{k_I k_S}{k_I + k_S + k_C}\,(\lambda_I - \lambda_S)^2\right), \qquad (7.2)$$
which achieves a maximum when kI kS > 0 and λI = λS . This term is common to the excitation of all photoreceptor classes and, for any surface considered in isolation, the term is confounded with Sm , the mean reflectivity of the surface. MacLeod and Golz assume that visual chromatic processing involves no compensation for the term and, as a consequence, it should be possible to develop explicit, quantitative predictions of the apparent lightness of surfaces under different illuminants. It is consequently surprising to find that MacLeod and Golz make only ordinal predictions (‘lighter’) concerning the characteristic failures predicted by their chromatic processing model. Other models of chromatic processing predict the same qualitative failures as they themselves note. That the ‘reds get lighter as the lights get redder’ is scarcely compelling evidence in favour of their computational model or any of its components, including Gaussian World. More use could also be made of the equation above, in deriving quantitative, testable predictions in the case where kI kS < 0 and either light or surface is represented by an inverse Gaussian: ‘when the light gets purpler. . .’ . At times MacLeod and Golz seem to be arguing that the success of their model in predicting phenomena in human colour perception lends support to each component in their theory, something along the lines of: The Gaussian World approximation leads to a model of adaptation involving approximate photoreceptor scaling and, therefore, evidence for photoreceptor scaling in human vision is evidence in favour of the Gaussian World approximation. It is likely that any plausible model of light-surface interaction in the natural environment will be consistent with the prediction that changes of illumination lead to approximate scaling of photoreceptor sensitivities (see, for example, Dannemiller 1993; Foster and Nascimento 1994). The different components of Gaussian World need to be tested independently. 
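The kind of quantitative prediction asked for above can be sketched directly from the interaction term in Equation 7.2. The example below is an illustration under stated assumptions, not a result from MacLeod and Golz: the parameter values and the helper `red_to_blue_ratio` are hypothetical, and kI + kS + kC is assumed positive.

```python
import math

def interaction_term(k_i, k_s, k_c, lam_i, lam_s):
    """Surface-light interaction term of Equation 7.2."""
    total = k_i + k_s + k_c  # assumed positive
    return math.exp(-(k_i * k_s / total) * (lam_i - lam_s) ** 2)

# Hypothetical narrowness parameters (nm^-2) and surface locations (nm).
K_LIGHT, K_SURFACE, K_CONE = 2e-5, 1e-4, 4e-4
RED_SURFACE, BLUE_SURFACE = 620.0, 460.0

def red_to_blue_ratio(lam_i):
    """Interaction term of the red surface relative to the blue one for a light at lam_i."""
    red = interaction_term(K_LIGHT, K_SURFACE, K_CONE, lam_i, RED_SURFACE)
    blue = interaction_term(K_LIGHT, K_SURFACE, K_CONE, lam_i, BLUE_SURFACE)
    return red / blue

# As the light's location moves toward long wavelengths, the ratio rises:
# a numerical version of 'when the light gets redder, the reds get lighter'.
RATIOS = [red_to_blue_ratio(lam) for lam in (500.0, 540.0, 580.0, 620.0)]
```

With kI kS < 0 (and the sum still positive) the exponent changes sign, so the term is smallest, not largest, at λI = λS. That is the case the text suggests exploring further.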
A substantial portion of the chapter is a spirited assault on an alternative method for representing surfaces, the Linear Models approach (Brian Wandell and I coined the term). A three-dimensional Linear Model representation of a surface reflectance function is a truncated generalized Fourier series,
$$S_\sigma(\lambda) = \sum_{j=1}^{3} \sigma_j S_j(\lambda) \qquad (7.3)$$
where the basis surfaces Sj (λ) are fixed. The scalar weights σj are varied to produce the range of possible surface reflectance functions allowed by the model. These functions form a linear function subspace (Apostol 1969) and by choosing the basis surfaces, Sj (λ), the researcher selects the subspace. Linear Models of light are similarly defined. A finite Fourier series is a special case of a Linear Model that the reader is likely to be familiar with. The approach has been discussed extensively and evaluated
elsewhere by several different authors (for reviews and references see Hurlbert 1998; Maloney 1999, Chapter 9 this volume). One evident difference between Gaussian World and a world described by a Linear Model is that an additive mixture of surface reflectance functions in ‘Linear World’ has an effective surface reflectance function that is still in the same ‘Linear World’. This sort of mixture occurs in a pointilliste composition when viewed from a sufficient distance or in coloured surfaces mixed by spinning a disc rapidly: Linear World is closed under additive mixtures of surfaces. It is also closed under additive mixtures of lights. Gaussian World is not. A landscape composed of surfaces drawn from Gaussian World, viewed at a distance, is not equivalent to any surface drawn from Gaussian World. For example, if three narrowband Gaussian surface reflectances are mixed additively, the resulting effective surface reflectance can be tri-modal and difficult to approximate as an exponential-quadratic. Gaussian World seems to be inadequate to express the mixtures of light and surface to be expected in complex scenes with multiple illuminants and mixtures of heterogeneous surfaces viewed at a distance. MacLeod and Golz make a few puzzling claims concerning the Linear Models approach. For example, they argue that ‘even with the minimal three degrees of freedom each for light source and surface, the stimulus and spectra have six degrees of freedom; this is more than is necessary or desirable for characterizing them if three are sufficient for the functions that generate them. . . .’ (p. 000). Yet, as noted above, it is evident that the expression for the stimulus spectra (the light re-emitted from the surface) in Gaussian World is written in terms of six free parameters: (Sm , kS , λS ) for the surface, and (Im , kI , λI ) for the light. It is possible that they are trying to make a different point in the section contrasting the two models.
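The contrast between the two worlds under additive mixture can be illustrated with a short sketch. The basis functions, weights, and Gaussian parameters below are hypothetical and chosen only for illustration; they are not taken from any published Linear Model.

```python
import math

def linear_model(weights, basis):
    """Reflectance S(lam) = sum_j weights[j] * basis[j](lam), as in Equation 7.3."""
    def reflectance(lam):
        return sum(w * b(lam) for w, b in zip(weights, basis))
    return reflectance

def mix_weights(w_a, w_b, fraction_a=0.5):
    """Weights of the additive mixture of two surfaces in the same Linear World."""
    return [fraction_a * a + (1.0 - fraction_a) * b for a, b in zip(w_a, w_b)]

def gaussian(m, k, centre):
    """Exponential-quadratic reflectance, as in Equation 7.1."""
    return lambda lam: m * math.exp(-k * (lam - centre) ** 2)

def count_local_maxima(f, lo=400, hi=700):
    """Count strict interior maxima of f sampled at 1 nm steps."""
    values = [f(lam) for lam in range(lo, hi + 1)]
    return sum(1 for n in range(1, len(values) - 1)
               if values[n - 1] < values[n] > values[n + 1])

# Hypothetical three-dimensional basis: constant, ramp, and half-cosine.
BASIS = [lambda lam: 1.0,
         lambda lam: (lam - 550.0) / 150.0,
         lambda lam: math.cos(math.pi * (lam - 400.0) / 300.0)]
W_A, W_B = [0.5, 0.2, 0.1], [0.4, -0.1, 0.2]
SURFACE_A, SURFACE_B = linear_model(W_A, BASIS), linear_model(W_B, BASIS)
MIXED = linear_model(mix_weights(W_A, W_B), BASIS)  # still three weights

# Three narrowband Gaussian surfaces, mixed additively in equal parts.
NARROW = [gaussian(1.0, 0.002, c) for c in (450.0, 550.0, 650.0)]
def gaussian_mixture(lam):
    return sum(g(lam) for g in NARROW) / len(NARROW)
```

In the Linear World the mixture is described by averaged weights on the same basis. The Gaussian mixture has three separate peaks, which no single exponential-quadratic can reproduce.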
Suppose that, in a complex scene, light is successively absorbed and re-emitted from two or more surfaces before it reaches the eye, what the computer graphics community refers to as ‘multi-bounce’. The spectral power distribution of the light is multiplied by the surface spectral reflectance of each surface in turn. The product of two elements of Gaussian World is an element of Gaussian World and, as the light ‘bounces’ from surface to surface, the cumulative effect is always representable in Gaussian World. Linear Models, in contrast, need not be closed under multiplication.¹ To represent the consequences of ‘multi-bounce’, it is typically necessary to increase the dimensionality of the Linear Model employed. Gaussian World is closed under multiplication and that is useful in modelling complex scenes with mutual illumination of surfaces. MacLeod and Golz conclude that the Linear Models approach ‘does not seem to have produced much insight . . . into the actual principles of illumination-dependent mapping in the environment or into the visual processes’ (p. 000). In reaching this conclusion, I think they have overlooked the considerable analytical work by several authors, notably D’Zmura and Iverson (1993a,b, 1994); the construction, by several authors, of algorithms for deriving information concerning the illuminant from realistic, three-dimensional scenes (see Maloney 1999, for a review); and the growing consensus that the visual system forms explicit estimates of illuminant chromaticity that must be based on sources of information concerning illumination within the scene (Maloney and Yang, Chapter 11 this volume). Furthermore, it is only recently that researchers have been able to simulate binocularly viewed scenes containing lights and surfaces with any degree of realism (Yang and Maloney 2001) and, consequently, it is only recently that the kinds of experiments needed to test hypotheses arising from the Linear Models approach have become feasible.
¹ Certain linear models are closed under both addition and multiplication. Yang and Maloney (2001) use such step-function models, described in the Appendix, to improve the spectral accuracy of rendering in complex scenes. With only three parameters, these models cannot represent lights or surfaces accurately, and Yang and Maloney use nine or more.
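The closure of Gaussian World under multiplication, which makes ‘multi-bounce’ tractable, can be written out as a short sketch. It follows from completing the square in the exponent. The function names and parameter values are hypothetical, and the sketch excludes the degenerate case in which the two narrowness parameters sum to zero, since the product then no longer has the three-parameter form of Equation 7.1.

```python
import math

def evaluate(params, lam):
    """Value of the exponential-quadratic (m, k, centre) at wavelength lam."""
    m, k, centre = params
    return m * math.exp(-k * (lam - centre) ** 2)

def multiply(a, b):
    """Parameters of the product of two exponential-quadratics."""
    (m1, k1, c1), (m2, k2, c2) = a, b
    k = k1 + k2
    if k == 0:
        raise ValueError("k1 + k2 = 0: the product is not of the form of Equation 7.1")
    centre = (k1 * c1 + k2 * c2) / k
    m = m1 * m2 * math.exp(-(k1 * k2 / k) * (c1 - c2) ** 2)
    return (m, k, centre)

# Hypothetical light and two surfaces (wavelengths in nm); one surface has k < 0.
LIGHT = (1.0, 2e-5, 600.0)
FIRST_SURFACE = (0.6, 1e-4, 620.0)
SECOND_SURFACE = (0.4, -1e-5, 480.0)

# Light after two bounces is still described by just three numbers.
AFTER_TWO_BOUNCES = multiply(multiply(LIGHT, FIRST_SURFACE), SECOND_SURFACE)
```

A Linear Model would in general need extra basis functions to represent the same two-bounce product exactly.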
Better models of light–surface interaction, based on physical principles, will certainly be developed and these will replace the current, ad hoc Linear Models. However, I think that it is premature to assess their long-term impact, or lack thereof, for the reasons just given. As Chou En Lai, past Premier of China, said to Richard Nixon, when asked his opinion of the impact of the French Revolution, ‘Too early to tell.’
References
Apostol, T. M. (1969). Calculus, Vol. II, (2nd edn). Xerox, Waltham, Massachusetts.
Brainard, D. H. (1998). Color constancy in the nearly natural image. 2. Achromatic loci. Journal of the Optical Society of America A 15, 307–325.
Brainard, D. H., Brunt, W. A., and Speigle, J. M. (1997). Color constancy in the nearly natural image. 1. Asymmetric matches. Journal of the Optical Society of America A 14, 2091–2110.
Chittka, L., Shmida, A., Troje, N., and Menzel, R. (1994). Ultraviolets as a component of flower reflections, and the color perception of hymenoptera. Vision Research 34, 1489–1508.
Dannemiller, J. L. (1993). Rank orderings of photoreceptor photon catches from natural objects are nearly illumination-invariant. Vision Research 33, 131–140.
D’Zmura, M. and Iverson, G. (1993a). Color constancy: I. Basic theory of two-stage linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 10, 2148–2165.
D’Zmura, M. and Iverson, G. (1993b). Color constancy: II. Results for two-stage linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 10, 2166–2180.
D’Zmura, M. and Iverson, G. (1994). Color constancy: III. General linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 11, 2389–2400.
Foster, D. H. and Nascimento, M. C. (1994). Relational colour constancy from invariant cone-excitation ratios. Proceedings of the Royal Society of London B 257, 115–121.
van Hateren, J. H. (1993). Spatial, temporal and spectral pre-processing for color vision. Proceedings of the Royal Society of London Series B 251, 61–68.
Hurlbert, A. (1998). Computational models of color constancy. In Perceptual constancies, (ed. V. Walsh and J. Kulikowski). Cambridge University Press, Cambridge.
Lythgoe, J. N. (1979). The ecology of vision. Clarendon, Oxford.
Maloney, L. T. (1999). Physics-based models of surface color perception. In Color vision: From genes to perception, (ed. K. R. Gegenfurtner and L. T. Sharpe), pp. 387–418. Cambridge University Press, Cambridge.
Yang, J. N. and Maloney, L. T. (2001). Illuminant cues in surface color perception: Tests of three candidate cues. Vision Research 41, 2581–2600.
Chapter 8
BACKGROUNDS AND ILLUMINANTS: THE YIN AND YANG OF COLOUR CONSTANCY
Richard O. Brown
Preface
I am very grateful to Rainer Mausfeld, Dieter Heyer, and Reinhard Niederee for their inspiration and efforts in organizing the ZiF conference, and thank them for offering me the chance to participate in this intensive examination of scientific and philosophical aspects of colour perception. Like many neurobiologists, I was initially drawn to study the brain, rather than, say, volcanoes or livers, because it held the prospect of answering deep questions, such as the relationship between mind and matter. But, also like many neurobiologists, my pragmatic need to ask answerable questions and perform doable experiments led me to focus on finer and more esoteric details, making it harder to keep sight of the big picture. So it was exciting to have this chance to step back and take a hard look at whether our scientific studies are really advancing our search for answers to the big questions. I have always tried, in my research, to find the simplest possible system which captures the essence of the problem I’m interested in, to make the experiments as easy as possible. For instance, my graduate research explored the neural basis of reproductive behaviour, but used the giant neurons of the marine mollusc Aplysia as a model system. And when I became interested in the neural basis of perception, I naively imagined that human colour vision would prove a simple model system to tackle. But it’s always essential with such reductions to keep testing that the answers we obtain apply to the original phenomena of interest, and are not just artefacts of a reduced preparation.
It was eye-opening for me to start comparing the simplifying assumptions commonly used in colour vision models with the actual properties of the real world beyond a windowless vision lab, leading me to such ‘discoveries’ as that natural lighting varies a lot more in brightness than in colour, and that the world is not really grey after all. This paper attempts to look at some of the implications for colour vision models of the actual properties of the natural environment in which it evolved. On a different level, we’re all working to expand our shared body of knowledge and understanding of the world. But here we run an analogous risk of limiting our gains to closed circles of specialists. My current challenge in working at the Exploratorium is to design and build interactive exhibits that make the process and excitement of our scientific quests accessible to a much wider public. This commonly involves ‘reverse-engineering’ into physical objects of perceptual phenomena we ordinarily study in laboratory simulations, and the ease with which this can be done is often very instructive about the relevance of those phenomena in the real world. R. O. Brown
Introduction Coloured objects normally maintain stable colour appearances across a wide range of viewing conditions, even though these changing conditions introduce large variations into the corresponding visual signals. This is the phenomenon known as ‘colour constancy’. Two major types of variation arise from changes in the light illuminating the objects, and from changes in the backgrounds against which the objects are seen. Most studies of colour constancy have focused on the challenge posed by variations in illumination, and many of the best-known models of colour constancy were developed primarily to solve this problem. Colour constancy with changing backgrounds has received relatively little attention, but is increasingly being recognized as an equally important problem. There is an interesting complementarity between these two aspects of colour constancy, in that simple mechanisms that would tend to maintain excellent colour constancy for one of these types of variation, tend to fail quite badly for the other. In particular, many colour constancy models rely on the space-averaged light from scenes to estimate illumination, but such models generally misinterpret coloured backgrounds as coloured illuminants. There may not be a general solution that achieves colour constancy with both changing illuminants and changing backgrounds. It is argued that instead of seeking a general, computational solution, colour constancy should be studied in terms of the actual properties of the visual system and of the ecological colour signals it evolved to see. 
An analysis of measured natural reflectances and illuminants has led to several hypotheses about the mechanisms involved in biological colour constancy: (1) The popular ‘Grey World’ models of colour constancy, which interpret changes in the space-averaged light as changes in illumination, will generally fail, as variations in the chromaticity of space-averaged light are at least as likely to arise from changing backgrounds as from changing illuminants; (2) linear models based on three-dimensional representations of illuminants and surfaces are inadequate to capture important variations in ecological colour signals; (3) the relative variations due to changing backgrounds and illuminants are highly asymmetric in the luminance and colour-opponent channels, with variations in the luminance channel corresponding largely to changes of illumination intensity, while variations in the chromatic channels primarily represent varying reflectances; (4) asymmetries in the known physiological and psychophysical properties of the luminance and colour-opponent channels may represent important adaptive tuning to these measured asymmetries in the corresponding ecological signals, suggesting that colour constancy is not simply ‘lightness constancy × 3’.
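The first hypothesis can be illustrated with a toy calculation. The sketch below assumes a simplified model in which each surface and illuminant is a hypothetical (L, M, S) triple, cone signals are their element-wise products, and a ‘Grey World’ scheme divides each signal by the space-averaged signal of the background. None of the numbers are measurements.

```python
import math

def scene_signals(reflectances, illuminant):
    """Cone signals for each surface: reflectance times illuminant, per cone class."""
    return [tuple(r * e for r, e in zip(surface, illuminant)) for surface in reflectances]

def grey_world_estimate(signals):
    """Space-averaged signal, taken by Grey World models as the illuminant estimate."""
    return tuple(sum(s[c] for s in signals) / len(signals) for c in range(3))

def grey_world_percept(target, background, illuminant):
    """Target signal scaled by the Grey World estimate from the background."""
    estimate = grey_world_estimate(scene_signals(background, illuminant))
    signal = scene_signals([target], illuminant)[0]
    return tuple(x / e for x, e in zip(signal, estimate))

def close(a, b):
    """True if two triples agree to within floating-point tolerance."""
    return all(math.isclose(x, y, rel_tol=1e-9) for x, y in zip(a, b))

# Hypothetical (L, M, S) values.
TARGET = (0.5, 0.5, 0.2)                     # a yellowish surface
NEUTRAL_BACKGROUND = [(0.2, 0.2, 0.2), (0.5, 0.5, 0.5), (0.8, 0.8, 0.8)]
REDDISH_BACKGROUND = [(0.6, 0.3, 0.2), (0.8, 0.4, 0.3), (0.7, 0.35, 0.25)]
WHITE_LIGHT, REDDISH_LIGHT = (1.0, 1.0, 1.0), (1.4, 1.0, 0.7)

SAME_UNDER_WHITE = grey_world_percept(TARGET, NEUTRAL_BACKGROUND, WHITE_LIGHT)
SAME_UNDER_RED = grey_world_percept(TARGET, NEUTRAL_BACKGROUND, REDDISH_LIGHT)
SHIFTED_BY_BACKGROUND = grey_world_percept(TARGET, REDDISH_BACKGROUND, WHITE_LIGHT)
```

With a neutral background the scaling removes the illuminant change exactly. With a reddish background under white light, the same scaling treats the background as a reddish illuminant and shifts the target.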
Colour vision When we open our eyes, our world of stably coloured objects seems to appear instantly and automatically. The reliability and seeming effortlessness of colour perception belies the difficulties of this achievement. It remains a major goal for perceptual science to understand how biological systems process visual signals from the external world to generate perceived colours. Our naive, direct experience of colour is that colour is simply a property of coloured objects, just as size and shape are, and reflecting this view one approach to colour science has sought to identify colour with the physical properties of objects. But since Newton’s famous spectral analysis of light, colour research has focused predominantly on the relationship between colours and the spectral characteristics of light, with objects viewed more as the modifiers of coloured light than as themselves the source of colour (see Finger 1994; Mausfeld 1998 and Chapter 13 this volume). A competing line has focused on the important role of contrast in colour vision, and the identification of colours with ratios of light signals from different parts of a visual scene (Zeki 1993; Webster, Chapter 2 this volume; Whittle, Chapter 3 this volume). A more biologically oriented approach, and the one I favour, is to identify colour with the colour sensations of seeing organisms, and to try to understand colour in terms of the relationship between the physical signals captured by eyes and the neural processes that generate colours. Studies and discussions of colour often suffer from the many ambiguities and confusion in our language for colour. J. J. Gibson (cited in Eco 1985) observed that ‘The meaning of the term colour is one of the worst muddles in the history of science!’ The term colour, and other colour words, have been used variously to describe light, objects, contrasts, and perceptions. Under ordinary circumstances, these are all so tightly linked that the confusion is minimal. 
But the scientific study of colour typically involves the deliberate dissection of these variables into competing influences, and so requires more precise terminology to distinguish all these aspects of colour. Here, the term colour will be used to refer to the sensation of colour in an organism with colour vision, while objects and lights will be described in terms of their spectral reflectances and spectral power distributions, respectively. Colour, so defined as a sensation, need not correspond to any physically measurable property of objects or lights, and in this sense there is no defined measure of the ‘veridicality’ of colour perception. Another language ambiguity arises from the frequent exclusion of blacks, whites, and greys from the domain of colour; here colour will refer to the full range of colours, including these achromatic colours. Object colours It is generally assumed that the primary purpose of colour vision is to support detection and identification of objects in the environment. Helmholtz (1867) noted that ‘Colors have their greatest significance for us in so far as they are properties of bodies and can be used as marks of identification of bodies’, and on this point Hering (1920) agreed, ‘In vision, we are not concerned with perceiving light rays as such, but with perceiving the external objects mediated by these radiations; the eye must inform us, not about the momentary intensity
or quality of the light reflected from external objects, but about these objects themselves.’ This emphasis on coloured objects as the business of colour vision makes colour constancy paramount, so the perceived colours can be reliably associated with appropriate coloured objects. But it should be kept in mind that seeing stably coloured objects need not be the only purpose of colour vision. For instance, perceiving the changing colours of the sky may be important for keeping time or predicting weather changes, and some organisms appear to use colour sensitivity primarily to detect such environmental changes. The perception of object colours is mediated by the light reflected from objects to the eye. Figure 8.1 is a very simplified schematic of the basic problem of colour vision. The banana represents a typical coloured object of interest to the visual system. For the purposes of colour vision, its surface may be approximated by a spectral reflectance function, showing the fraction of incident light at each wavelength reflected from the surface. In this case, the banana is ripe, and absorbs most of the short wavelengths of light while
Figure 8.1 A simplified view of colour vision. [Schematic panels, each plotted against wavelength (400–700 nm): Reflectance × Illuminant = Signal; the signal is sampled by the Cones to give the triplet (L, M, S).] A coloured object, such as the banana shown in this cartoon, may be roughly characterized by its surface reflectance function, which plots the proportion of incident light reflected at each wavelength. The light illuminating an object is represented by its spectral power distribution, which shows its intensity at each wavelength. The light reflected from the object at each wavelength is given by the product of these two measures. In the human visual system, the proximal light signal is sampled by three types of cones, each of which may be characterized by its spectral sensitivity to light of different wavelengths. The resulting triplet of cone responses, L, M, and S, provides the initial neural signal available for colour vision. Although these cone responses depend as much on the illuminants as on the reflecting surfaces, to achieve colour constancy the visual system must generate perceived colours which depend only on the object.
reflecting much of the middle- and long-wavelength light. The banana is illuminated by daylight, which may be represented by a spectral power distribution, showing its relative power at each wavelength. The light reflected from the banana, which provides the proximal signal available for colour vision, is determined at each wavelength by the banana’s spectral reflectance multiplied by the illuminant’s spectral power distribution. Note that the proximal light signals for colour vision depend equally on the reflectances of objects, and the incident illumination on those objects. In human eyes, these proximal light signals would normally be sensed by three types of cones, reducing a high-dimensional proximal light signal (i.e. one which may vary independently at each wavelength) to a three-dimensional neural signal. The disarmingly simple, yet elusive, goal of colour science is to understand how this set of three cone responses is processed to reliably generate the perceived colour of the banana. Colour contrast When isolated uniform spots of light are viewed against completely dark backgrounds, their perceived colours (sometimes called ‘aperture colours’) can be reliably predicted from the physical composition of light in the patch. But the same uniform patch of light may generate a very different colour appearance as soon as it is juxtaposed with, surrounded by, or preceded by, other spots of light. Hering observed, ‘In general, one and the same ray can be seen, according to the circumstances, in all possible colour hues,’ and Delacroix (see Evans 1964) boasted that he could paint the skin of Venus from the dirtiest mud, provided he could surround it with appropriate contrast colours. Such colour contrast phenomena provide compelling evidence that perceived colours are not determined locally, by just the light signals from each point in a scene, but are relativistic, and involve comparisons of light signals across space and time. 
Colour contrast effects are often treated as illusions or defects of visual processing. Kaiser and Boynton (1996), for example, refer to the aperture colour of light as its ‘objective colour’, while the colours perceived in an identical spot of light in another context are called ‘subjective colours’. On the other hand, many colour researchers consider contrast the essential mechanism for achieving colour constancy. Whether colour contrast works to favour or hinder colour constancy depends on how it relates to changes in illumination and changes in backgrounds.
Colour constancy
Colour constancy is the tendency for objects to maintain stable colour appearances, despite considerable variations in the physical and neural signals mediating colour vision. Colour constancy is just one of many perceptual constancies that allow us to recognize and maintain stable perceptual representations of the external world, despite changes in the received signals informing us about the world. Because the fixed, intrinsic properties of objects are likely to have greater behavioural significance than the fluctuating signals available for their perception at any moment, the constancies are considered essential perceptual achievements. A general definition of perceptual constancy, after Hochberg (1988), is ‘the
constancy of perception of the fixed properties of distal objects, despite variations in the proximal stimuli from the objects’. For example, a friend’s face may be uniquely recognized across a wide range of distances, positions, lighting, motions, and facial expressions, which all generate very different retinal images. Note that just as this ‘face constancy’ need not preclude us from simultaneously recognizing faces and perceiving all these variations, colour constancy need not imply that we are blind to the variations, such as changing backgrounds and illumination. Applying Hochberg’s above formulation to colour constancy generates the following general definition of object colour constancy: ‘the constancy of the perceived colours of objects, despite variations in the proximal light signals’. This differs importantly from the most common formulation, which restricts colour constancy to ‘the constancy of the perceived colors of objects despite variations in the illumination’. The general definition includes variations in the illumination, as well as many other types of variations that may pose challenges for colour constancy. These include changes of the backgrounds against which objects are seen, changes in the atmospheric conditions through which light signals travel, and other changes, such as changes in the viewing geometry and changes in the neural sensitivity to the signals. Each of these challenges will be considered in turn below, with examples of the failures of colour constancy that arise from each. Variations of illumination The best-known challenge to object colour constancy comes from changes in the light illuminating objects. The difficulties caused by changing illumination can be appreciated from the schematic of colour signals shown in Fig. 8.1. Because the light reflected from the banana depends on the reflectance multiplied by the illuminant, any change in the illuminant will cause a proportionate change in the reflected light. 
For example, the light reflected from a banana may become greenish when the banana is an unripe green under a white illuminant, or when the banana is ripe yellow but its illuminant is greenish, perhaps from being filtered through green leaves. The reflected light reaching the eye from the banana may be identical in these two cases, but determining the true state of the banana may be important to a hungry animal. The changes of illumination to be considered should include not only temporal changes in the average illumination across an entire scene, which is what most colour constancy references to changes of ‘the’ illuminant imply, but also to the spatial variations of illumination across different parts of the scene, or even across individual objects. In everyday experience, most coloured objects do seem to maintain approximate colour constancy across diverse illuminants, suggesting that our visual systems are largely successful at unscrambling objects’ reflectances from illuminants. This is generally considered the great achievement of colour constancy; Hering (1920) wrote, ‘The approximate constancy of the colors of seen objects, in spite of large quantitative or qualitative changes of the general illumination of the visual field, is one of the most noteworthy and most important facts in the field of physiological optics.’ Examples of failure of colour constancy with changing illumination can be enjoyed in the clouds at sunset, as they run through a dramatic range of
colours due entirely to changing illumination, or on a uniform white movie screen, which fortunately does not maintain the constant appearance of a large white rectangle as the projector changes its illumination.
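The ambiguity in the banana example can be made concrete with a small sketch of the image-formation scheme of Fig. 8.1. All spectra and cone sensitivities below are hypothetical values sampled at seven wavelengths, and the greenish illuminant is constructed so that the ripe banana under it returns the same light as the unripe banana under white light.

```python
import math

WAVELENGTHS = [400, 450, 500, 550, 600, 650, 700]  # nm

def reflected(reflectance, illuminant):
    """Proximal signal: reflectance times illuminant power at each wavelength."""
    return [r * e for r, e in zip(reflectance, illuminant)]

def cone_responses(signal, sensitivities):
    """Reduce a sampled spectrum to one response per cone class (L, M, S)."""
    return tuple(sum(s * c for s, c in zip(signal, sens)) for sens in sensitivities)

def illuminant_that_mimics(target_signal, reflectance):
    """Illuminant under which `reflectance` returns exactly `target_signal`."""
    return [t / r for t, r in zip(target_signal, reflectance)]

# Hypothetical reflectances and a flat 'white' illuminant.
RIPE = [0.05, 0.08, 0.20, 0.60, 0.70, 0.70, 0.70]
UNRIPE = [0.05, 0.08, 0.20, 0.55, 0.35, 0.30, 0.30]
WHITE = [1.0] * len(WAVELENGTHS)

# Hypothetical cone sensitivities, ordered (L, M, S).
CONES = [[0.0, 0.05, 0.3, 0.9, 0.9, 0.3, 0.02],
         [0.0, 0.1, 0.5, 1.0, 0.5, 0.1, 0.0],
         [0.3, 1.0, 0.3, 0.02, 0.0, 0.0, 0.0]]

UNRIPE_UNDER_WHITE = reflected(UNRIPE, WHITE)
GREENISH = illuminant_that_mimics(UNRIPE_UNDER_WHITE, RIPE)  # less long-wavelength power
RIPE_UNDER_GREENISH = reflected(RIPE, GREENISH)
RIPE_UNDER_WHITE = reflected(RIPE, WHITE)
```

The two scenes deliver the same light at every wavelength, so the cone responses alone cannot distinguish them. Any disambiguation must draw on other information in the scene.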
Variations of background
Besides changing illumination, objects may also be seen against a variety of different backgrounds. In ordinary viewing, this rarely seems to affect their perceived colours. And because the local light signals from ordinary, opaque objects depend on the object’s surface and its illuminant, as shown in Fig. 8.1, but not on its background, this phenomenon of colour constancy with variations of background may at first seem a trivial ‘achievement’. But the well-known effects of colour contrast certainly do affect colour appearances. Helson (1938) pointed out that ‘Hue, lightness, and saturation depend not only upon composition and intensity of light from an object but fully as much upon the reflectance of background and other objects.’ Whittle and Challands (1969) and Gilchrist (1979) noted that the reliance of the visual system on edges and contrasts makes constancy with changing backgrounds an important challenge for lightness constancy. Hamilton (1979) made the provocative suggestion that the evolution of trichromacy in our primate line may have been driven not by the reflectances of objects of interest, but by the need to identify them against a variety of different backgrounds. Failures of colour constancy with changing background include the well-known simultaneous contrast effects. One striking example is the moon’s appearance as bright white against the deep black background of the heavens, even though its surface is dark grey rock.
Variations of atmospheric conditions
In the usual idealizations of colour signals, such as the cartoon in Fig. 8.1, the proximal signals are equated with the light reflected from objects. In fact, the atmosphere through which light is transmitted can greatly attenuate and distort this light. These filtering effects can arise from vision through fog, haze, precipitation, smog, water, or other transparent media. In general, these atmospheric effects increase with viewing distance; the ‘aerial perspective’ of Leonardo da Vinci, in which distant objects were painted with more blue and less contrast than near objects, takes advantage of this common effect of atmospheric haze. Colour constancy with variations in atmospheric conditions has not been studied extensively, but there is evidence that visual processes provide partial compensation for the reduced contrasts in scenes viewed through ‘veiling illuminants’ (Gilchrist and Jacobsen 1983) or underwater (Emmerson and Ross 1987). Brown and MacLeod (1997) studied the dependence of colour appearance on the variance (roughly, the contrast) of surrounding colours, and suggested that the observed perceptual compensation for reduced contrast may be involved in colour constancy with variations in atmospheric conditions. A typical failure of colour constancy with atmospheric conditions occurs when objects become greyer and eventually disappear as the fog rolls in.
Other variations In addition to the above, there are a variety of other sources of variation in the colour signals from objects that pose difficulties for colour constancy. Changes in viewing geometry, such as may arise from relative motion of the object, illuminant or viewer, or from gradients of surface orientation within objects, may have important effects. Many of these effects are obscured by the canonical one-dimensional approximations of reflectances and illuminants (such as in Fig. 8.1) widely used in colour constancy models; this oversimplification of complex physical properties led Wandell (1995) to call this type of canonical model ‘a ruse’. For example, changes in the surface orientation with respect to both the direction from the illuminant, and the direction to the observer, will generally affect both the intensity and the spectral distribution of the received light signals. The changing colours of iridescent butterfly wings as they move provide one example of such a failure of colour constancy with changes in viewing geometry. And variations in the spatial distribution of illumination, such as from a spotlight to diffuse lighting, may also have dramatic effects. Specular highlights, and variations of shading with shape, are two common manifestations of these effects of viewing geometry on colour signals; note that these both also provide valuable information about the objects and their illuminants. Another effect of viewing geometry lies not in the external signals, but in anisotropies in the eye itself, such as spatial variations in its optics, resolution, and light and spectral sensitivities. These may produce large changes in object colours as they are seen in different parts of the visual field. For example, coloured objects appear decreasingly saturated and eventually turn grey as their retinal images move toward the periphery.
Coloured patterns such as stripes may also appear to change colour with viewing distance, as the colour contrast effects seen at close range become assimilation and eventually a homogeneous mixing at further distances. Even brightly coloured objects will look grey under dim illumination, as the cone signals become too weak to support colour vision, even though the physical chromatic contrasts are unaffected. Finally, dynamic variations in the sensitivity of the visual system, including changing pupil size and neural adaptations to previous stimuli, also affect perceived colours, with coloured afterimages providing a dramatic example. Although such neural sources of variation in early colour signals are not typically considered part of the domain of colour constancy, these are the proximal signals for subsequent brain processes, and so they should also be kept in mind.
Illuminants versus backgrounds While all of the above types of variations may pose difficulties for colour constancy, and must be included in any eventual comprehensive model, this chapter focuses just on the problem of achieving colour constancy when both backgrounds and illuminants may be varying. These two sources of variation have distinct but largely complementary effects on colour signals, as illustrated in Fig. 8.2. Consider a reference spot seen on a particular background, under a particular illuminant, as in the top stimulus. If only the background reflectance changes, as in the bottom left stimulus, the light received from the spot itself would be unaffected, but the contrast of the spot relative to its background may change
[Figure 8.2 panels. Bottom left: change of background (preserves luminance). Bottom right: change of illumination (preserves contrast, approximately).]
Figure 8.2 Change of background versus change of illumination. This illustrates the complementary effects on visual signals caused by two types of variation which pose challenges for colour constancy. The reference stimulus, shown above, is a middle grey spot (reflecting, say, 50% of the incident light) on a lighter grey background. If the same spot is seen under the same illuminant but against a darker background, it will still reflect 50% of the incident light, and thus have the same luminance as before, but its contrast relative to the background changes. On the other hand, if the spot is seen under a dimmer illuminant but against the original background, it will now have a proportionally lower luminance (reflecting 50% of a dimmer light), but its contrast relative to the background will not change. To achieve colour constancy for both types of change, the spot must produce the same perceived colour in all three conditions shown.
dramatically, even reversing sign, as in the example shown. On the other hand, if only the illuminant changes, as in the bottom right, this directly affects the light reflected from the test spot itself, but usually has only a minor effect on the contrast of the test spot relative to its background. Thus, if the colour of the test spot was entirely determined by just the light from the spot itself, as in a purely local model of colour vision, there would be excellent colour constancy when the background changed, but none when the illumination changed (as the spot colour would change in proportion to the illuminant change). On the other hand, if the colour of the test spot depended entirely on the contrast of the spot relative to its background, as in a number of colour constancy models, there would be good colour constancy when the illumination changed, but none when the background reflectance changed (as now the spot would change colour in proportion to the background change). Thus while constancy may be easy if only one of these can change, the challenge for colour constancy to succeed in the real world is to simultaneously handle both types of change. Brainard and Wandell (1986) remind us that ‘Human vision maintains approximate colour constancy despite variation in the spectral reflectance functions of nearby objects and despite variation in the spectral power distribution of the ambient light.’ Whittle and Challands (1969) suggested that this requires two types of constancy: ‘first with respect to changes
of background alone such as occur with relative movement of the object and its background, and second with respect to changes of illumination of the object and its background together’. In general, when surround signals change as in Fig. 8.2, the proximal signal is ambiguous, whether this change arose from a change in illumination, a change in background reflectance, or some combination of the two. In a typical colour constancy experiment, a subject is shown a fixed reference spot and surround, as in the top of Fig. 8.2, and is asked to adjust the spot in a test surround, such as those in the bottom of Fig. 8.2, until the test spot matches the reference spot. Note that although the two surrounds shown in the bottom of Fig. 8.2 provide identical proximal signals, constancy would require the subject to somehow discriminate them and make very different spot settings in each. When such experiments are simulated with emissive displays controlled by computers, not just the proximal signals but, in fact, the entire stimulus is physically identical for the two conditions; whether the change in the surround is considered a change of background reflectance or a change of illumination, and in consequence the degree of colour constancy assigned to the subject’s response, depends only on which occult software variable was changed. In practice, owing to the commonly restricted formulation of colour constancy in terms of changing illumination, such experiments usually equate colour constancy with constant contrast responses (as if compensating for changing illumination). But perhaps such experiments should best be interpreted more neutrally, in terms of spatial interactions affecting colour, rather than as measures of colour constancy.
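The complementary behaviour of the purely local and purely contrast-based accounts can be made concrete with a small numerical sketch of the three stimuli of Fig. 8.2. The reflectance and illuminant values, the ratio measure of contrast, and the function names are all illustrative assumptions, not values taken from the chapter.

```python
# Toy version of Fig. 8.2: one grey spot, two kinds of surround change.
# All numbers are illustrative assumptions, not measurements from the chapter.

def luminance(reflectance, illuminant):
    """Light returned by a matte surface: reflectance times illuminant."""
    return reflectance * illuminant

def contrast(spot, surround):
    """Ratio of spot to surround luminance (one simple contrast measure)."""
    return spot / surround

# name: (spot reflectance, background reflectance, illuminant intensity)
conditions = {
    "reference":         (0.5, 0.8, 100.0),
    "darker background": (0.5, 0.2, 100.0),
    "dimmer illuminant": (0.5, 0.8, 25.0),
}

def describe(conditions):
    """Compute the signal each model would use, for every condition."""
    rows = {}
    for name, (r_spot, r_bg, illum) in conditions.items():
        spot = luminance(r_spot, illum)
        surround = luminance(r_bg, illum)
        rows[name] = {
            "local": spot,                         # purely local model
            "contrast": contrast(spot, surround),  # purely contrast-based model
            "surround": surround,                  # proximal surround signal
        }
    return rows

rows = describe(conditions)
for name, row in rows.items():
    print(f"{name:18s} local={row['local']:6.1f} "
          f"contrast={row['contrast']:.3f} surround={row['surround']:.1f}")
```

With these values the local signal is unchanged by the darker background but falls with the dimmer illuminant, while the contrast signal does the reverse. The two test surrounds also deliver the same light to the eye, which is the ambiguity of the proximal signal noted above.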
Models of colour constancy With the paradigm shift from colour as a property of objects to colour as a response to light, the mystery of colour constancy arose. Because the intensity and spectral composition of light from objects change with their illumination, shouldn’t their perceived colours also change? So the challenge for colour constancy was formulated early in terms of invariant object colours despite varying illumination. The local model of colour vision, in which the colours seen at each point correspond just to the composition of light at the same point, has been the default model of colour vision, and probably still corresponds to the notions of most people with a modicum of scientific education. It became the reference point for colour constancy models to improve upon, as it offers zero colour constancy with changing illumination. But this same local model can provide perfect colour constancy with changing backgrounds, and so it is useful to include it in the mix of colour constancy models. Indeed, based on evidence discussed on pp. 261–4, under natural conditions it may actually be a more successful model of chromatic colour constancy than some of the more celebrated colour constancy models, which provide excellent colour constancy with changing illuminants, but at the expense of zero colour constancy with changing backgrounds. Most models are designed to first estimate the illuminant, in order to then compensate for it. Note that estimating the illuminant may be neither necessary nor sufficient for achieving colour constancy. For example, the models of Wallach (1948) and Cornsweet (1970) can
achieve lightness constancy with changing illuminants by using only signals based on the ratios of light across edges, and need not ever estimate or represent the absolute intensities of illuminants in order to compensate for them. But if estimating the illuminant is the goal, it would seem at first that the most sensible thing to do would be to simply look up at the sky, and measure the illuminant directly. But for all the varying, and often complex, indirect schemes that have been suggested for estimating the illuminant, this direct approach does not seem to be part of any serious colour constancy models. But two approaches are closely related: Land and McCann’s original retinex model (see McCann 1989) used the light reflected from white surfaces to directly estimate the illuminant, and others have used specular reflectances within the scene in the same way. Most recent models of colour constancy can be divided into two broad (but partially overlapping) classes. The first group of models takes advantage of known visual mechanisms, such as contrast and adaptation (cf. Webster, Chapter 2 this volume), to estimate or normalize to changing illuminants. The second group is based on linear models of the physical properties of illuminants and reflectances, and constructs algorithms to reconstruct the most likely surface reflectances from the proximal signals. The visual models have generally been directed toward achieving colour constancy with respect to variations in illumination, but not changes in background reflectance. Given one proximal light signal and two unknowns (background and illuminant), the easy solution is to try to eliminate one of the unknowns. So, generally a strong assumption is made which severely constrains or eliminates changes of background reflectance, leaving only illumination changes. 
The most common of these assumptions is some form of the ‘Grey World’ hypothesis: that the space-averaged reflectance of visual scenes is a neutral middle grey, and thus that the space-averaged light from each scene is proportional to its illuminant. Grey World models include those based on von Kries’s adaptation to the entire visual scene (e.g. Ives 1912; Helson 1938; Fairchild and Lennie 1992), and the more recent retinexes (Land et al. 1983). A modification of this model assumes only that the space-average reflectance from scenes is known, but is not necessarily neutral grey (Buchsbaum 1980). When the Grey World assumption holds true, adaptation to the space-average light would tend to achieve colour constancy despite changes in illumination. The critical assumption is that all deviations from the standard in the space-averaged light from scenes represent changes of illumination. If this Grey World assumption is violated, such as might happen in a forest of predominantly green surfaces, these models will automatically normalize as if the illuminant were green and the leaves grey, and consequently generate strong failures of colour constancy. Brainard and Wandell (1986) studied this problem in Land’s retinex, and concluded that ‘the algorithm is too sensitive to changes in the color of nearby objects to serve as an adequate model of human color constancy.’ The second general class of colour constancy models involves constructing linear models of the physical properties of reflectances and illuminants (cf. Maloney, Chapter 9 this volume; Maloney and Yang, Chapter 11 this volume), and devising computational solutions to estimate the most likely reflectances from these models. These models, unlike the visual models, may bear little resemblance to the known properties of biological visual systems, and their design may be oriented more toward machine vision than human vision. 
(Many are even designed to reconstruct the full spectral reflectance functions of surfaces; this seems
akin to suggesting that the goal of olfaction is to reconstruct three-dimensional molecular models of odorants.) The linear models are based on the hypothesis that low-dimensional linear models are sufficient to represent both the spectral reflectances of surfaces and the spectral power distributions of illuminants for colour vision (Cohen 1964). Such models are usually limited to two or three dimensions, corresponding to the trichromacy of human colour vision. Then, under specified conditions, it may be possible to reconstruct precisely the reflectances and illuminants of scenes. These models would not be expected to succeed if their assumptions about the low dimensionality of natural colour signals are violated. Both types of colour constancy models can perform extremely well in artificial model worlds which incorporate their assumptions. (For the visual models, these are most commonly that changes in space-averaged background reflectances are small compared to changes in space-averaged illumination; for linear models, that natural reflectances and illuminants have low dimensionality.) In fact, these models are commonly ‘tested’ only in conditions that incorporate their own assumptions, and seldom under natural viewing conditions in the ‘real world’. So how well do their assumptions represent the natural colour environments for which colour vision evolved? This becomes an empirical question, which may be addressed by analysing the actual properties of natural surfaces and illuminants.
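The dependence of the Grey World approach on its own assumption can be sketched in a few lines. The example below is a minimal illustration, not a reconstruction of any published model: the three-channel signals, the scene contents, the assumed mean reflectance of 0.5, and all function names are hypothetical.

```python
# Grey World normalisation on toy three-channel (cone-like) signals.
# Scene contents and illuminants are made-up illustrative values.

def scene_signals(reflectances, illuminant):
    """Per-surface signal: channel-wise product of reflectance and illuminant."""
    return [[r * e for r, e in zip(surface, illuminant)]
            for surface in reflectances]

def grey_world_estimate(signals, assumed_mean=0.5):
    """Estimate the illuminant as space-averaged signal / assumed mean reflectance."""
    n = len(signals)
    means = [sum(s[c] for s in signals) / n for c in range(3)]
    return [m / assumed_mean for m in means]

def recover(signals, illuminant_estimate):
    """Divide out the estimated illuminant, channel by channel."""
    return [[v / e for v, e in zip(s, illuminant_estimate)] for s in signals]

grey_patch = [0.5, 0.5, 0.5]

# Scene A: surfaces average to neutral grey, so the assumption holds.
balanced = [grey_patch, [0.8, 0.4, 0.3], [0.2, 0.6, 0.7]]
# Scene B: a "forest" dominated by green surfaces, so the assumption fails.
forest = [grey_patch, [0.2, 0.6, 0.1], [0.3, 0.7, 0.2], [0.2, 0.5, 0.1]]

reddish_light = [1.2, 1.0, 0.8]
white_light = [1.0, 1.0, 1.0]

# Assumption holds: the grey patch is recovered as grey under a coloured light.
sig_a = scene_signals(balanced, reddish_light)
est_a = grey_world_estimate(sig_a)
grey_a = recover(sig_a, est_a)[0]

# Assumption violated: under white light the green bias of the scene is
# attributed to the illuminant, and the grey patch is pushed away from grey.
sig_b = scene_signals(forest, white_light)
est_b = grey_world_estimate(sig_b)
grey_b = recover(sig_b, est_b)[0]

print("balanced scene, grey patch recovered as:", grey_a)
print("forest scene, estimated illuminant:     ", est_b)
print("forest scene, grey patch recovered as:  ", grey_b)
```

In the balanced scene the change of illuminant is discounted correctly. In the forest scene the illuminant has not changed at all, yet the normalisation treats the scene as if lit by a green light and shifts the recovered colour of the grey patch accordingly.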
Ecology of colour signals An evolutionary view of perception holds that our sensory and cognitive processes are tuned to the relevant ecological signals and challenges that affect survival and reproduction (von Uexküll 1909/1985; Vollmer 1984; Delbrück 1986). From this perspective, the purpose of colour vision would be not to solve some general computational problem, such as reconstructing all the physical reflectance functions, but to reliably inform the organism about behaviourally relevant colour signals in the environment. To understand colour vision, then, it may be more valuable to study the properties of natural colour signals than the physics of light. Measuring and analysing ecological colour signals serves two purposes here: one is to test the assumptions and constraints built into current colour constancy models against the real world; and the other is to seek insights from these colour signals into the challenges and possible solutions of colour constancy. Individual natural reflectances The interactions of a surface with light are surprisingly complex. To properly characterize surfaces requires high-dimensional functions that include not just the dependence on the wavelengths of incident light, but the incident angle, the reflected angle, and, in the case of fluorescent surfaces, the various wavelengths of emitted light as well. In most models of colour vision, surface reflectances are approximated by one-dimensional functions of reflectance versus wavelength. (Indeed, in lightness models, they are often just a scalar!) The measured surface reflectances to follow are such one-dimensional approximations, and so they presumably underestimate the diversity and dimensionality of reflectances.
Cohen (1964) analysed the spectral reflectances of 433 Munsell colour chips and found them to be well characterized by a three-dimensional basis set. But the Munsell chips are artificial surfaces, explicitly designed to be well behaved for human colour vision, and so may not represent very well the true diversity of natural surfaces. Another commonly studied set of natural reflectances is Krinov’s (1947) measurements of terrains, but, as discussed below, this should not be considered a set of individual object reflectances. A third approach to collecting natural light signals makes simultaneous spectral analyses of the light from many points in a scene (Webster and Mollon 1997; Ruderman et al. 1998). However, these light signals are not true reflectances but the products of reflectance and illumination, and there is no practical way to analyse just the reflectance components from them. Lacking an appropriate data set for natural spectral reflectances, I undertook to compile one using a portable Photo Research PR-650 spectrophotometer. By measuring the spectral power distributions of light reflected from various natural surfaces, and dividing by the light reflected from an artificial white standard held in the same location and orientation, one-dimensional approximations of surface reflectances were obtained. Ideally one hopes to study a set of surface reflectances representative of the natural surfaces that our colour vision evolved to see. For practical reasons, I settled for measuring the surface reflectances of natural objects found within a few miles of the University of California, San Diego. Also for practical reasons, I did not randomly sample points from across natural scenes, but tried to capture the gamut of natural colours by selecting a wide variety of coloured objects. 
So, brightly coloured flowers and fruits are vastly over-represented relative to their actual frequency in nature, while large expanses of sand and leaves were very undersampled relative to their spatial extent. But note that relatively uncommon but salient colour signals, particularly of coloured fruits, may have been a dominant driving force in colour vision (Polyak 1957), while the variations in background colours of leaves may have been spectral noise that our colour vision actually evolved not to see (Nagle and Osorio 1993). A total of 563 such natural objects were measured relative to the white standard to obtain a new set of natural object reflectance data (Brown 1994). These reflectances were highly diverse and contained far more spectral variation than was found in the Krinov set. To represent the chromatic range in this set, the chromaticities of these objects, under simulated illumination by CIE Source C, are plotted in Fig. 8.3. The same data are shown on both the standard CIE chromaticity plot (left), and on the more physiologically relevant MacLeod–Boynton chromaticity plot (right), in which the axes correspond to the axes of the colour-opponent channels. Interestingly, the chromaticities almost all lie to one side of a line through the white point, and are biased toward the red, yellow and green regions, with little representation of blues or turquoises. This chromatic bias in individual reflectances suggests that space-averaged backgrounds are also unlikely to be neutral grey, as the Grey World hypothesis requires. Averaged natural reflectances It is also useful to measure the space-averaged reflectances of various scenes. The best-known set of such measurements are Krinov’s (1947) spectral data from Russian terrains. These
[Figure 8.3 plot. Legend: 409 Daylights (from Judd et al. 1964); 337 Natural Terrains (Krinov 1947); 563 Natural Objects (Brown 1994). Left panel: CIE chromaticity, axes x and y. Right panel: MacLeod–Boynton chromaticity, axes r and b.]
Figure 8.3 Chromatic distributions of natural colour signals. The plots show the range of chromaticities for three sets of natural colour signals, plotted in both the standard CIE chromaticity diagram (left), and the physiologically based MacLeod−Boynton chromaticity diagram (right). To plot the chromaticities corresponding to reflectance data, diffuse illumination by CIE Source C (represented by the white cross) was assumed. Natural Objects (individual black circles): these represent the chromaticities of 563 individual natural reflectances. Data are from Brown’s (1994) measurements of the spectral reflectances of 563 natural objects in San Diego. Natural Terrains (lined region): this outlines the gamut of chromaticities spanned by 337 large, space-averaged natural backgrounds. Data are from the Russian data reported by Krinov (1947) based on wide-field measurements of spectral reflectances of natural terrains. The data in electronic form were kindly provided by Larry Maloney, and corrected against Krinov’s original published data. Daylights (black region): this outlines the gamut of chromaticities spanned by daylight illumination, from part or all of the sky. These chromaticities were taken from the data published in Judd et al. (1964). (Some of the original published points could not be resolved, but these were generally near the centre of the distribution, so their omission does not affect the gamut shown here.)
terrains have sometimes been described in colour constancy studies as natural objects, which in one sense they are, but they are certainly not small objects such as leaves and fruits, but rather very large expanses of terrain, measured while rotating the spectrophotometer, or even from an airplane. As Krinov noted, ‘Thus the data obtained refer basically to average natural backgrounds’. These spectral measurements correspond to virtual reflectances of the whole terrain. (They are not true averages of individual reflectances, because the actual illuminant may vary from point to point across the terrain, and particularly in the shadows, while this approximation assumes a uniform illuminant.) The gamut spanned by the chromaticities of these average natural backgrounds, again assuming Source C illumination, is shown in Fig. 8.3. Krinov’s set clearly spans a much smaller
gamut of chromaticities than the individual object reflectances, suggesting that treating Krinov’s set as representative of natural object reflectances leads to a great underestimate of the actual diversity of object reflectances. On the other hand, treating Krinov’s set as representative of space-averaged background reflectances, it is hardly the tight cluster around the white point that the Grey World hypothesis assumes. On average, these terrains were a dark desaturated yellow-green, demonstrating that the world is not grey. (I made similar space-averaged measurements in San Diego, using a calibrated diffuser over the spectrophotometer, and obtained a gamut of terrains similar to Krinov’s but relatively lacking in the green regions.) Individual and averaged natural illuminants Illuminants, like reflectances, are more complex than the one-dimensional spectral power distribution functions commonly used to represent them. The spatial distribution of all the sources of illumination, and the polarization, are two important factors omitted from such approximations. Moreover, the natural illuminants studied are usually just the daylight from the sky, while the actual light reflected from objects may come from a variety of other sources, including previously reflected or filtered light with significantly different spectral characteristics. There have been many studies of the spectral power distributions of natural illuminants (Henderson 1970). One of the most influential was that of Judd et al. (1964), who analysed the data from three other sets of measurements of daylight. They plotted the chromaticities of each sample, and the gamut spanned by these chromaticities is replotted in Fig. 8.3. These daylights form a rather compact range of chromaticities, somewhat smaller than the range of average natural backgrounds, and oriented along a blue-yellow axis. 
According to the Grey World hypothesis, the range of daylight variance should be larger than that of space-averaged backgrounds. If, in fact, the opposite holds, as appears to be the case here, normalizing to the average of the background will likely backfire, by misinterpreting the chromaticity of the background as that of the illuminant. The daylights studied by Judd et al. and plotted in Fig. 8.3 included both measurements of the integrated daylight from the entire sky, and measurements taken in the shade and representing essentially the chromaticity of the sky minus the sun. Such restricted daylights do often occur in the shadows of scenes, but they do not represent the space-averaged illuminants that a scene would be likely to have. Two other sets of correlated colour temperature measurements of daylight which compared these two types of daylight measurements (full sky and sky-minus-sun) were also obtained from Henderson (1970) and are replotted in Fig. 8.4. This clearly shows that almost all of the chromatic variance in mixed daylight sets (such as that in Fig. 8.3) was restricted to the sky-minus-sun measurements, reflecting the well-known fact that the sky is blue. But when sunlight is included in measurements of daylight, there is remarkably little chromatic variance in these daylights. This implies that most of the chromatic variance in daylights is found in the spatial variations between sunlit and shady regions of a scene, particularly when the sky (and thus the shadows) is blue, and not in the temporal variations of space-averaged illumination across entire scenes. Since it is the space-averaged illuminant that most colour constancy models seek to estimate or adapt to, it would seem that the problem for colour constancy of
[Figure 8.4 plot. Histogram with two series: Sky, no sun; Sky with sun. Vertical axis: number of samples in 5 mired interval (0 to 120). Horizontal axis: CCT (mireds) of daylight (0 to 240), with 20 000 K and 5 000 K marked.]
Figure 8.4 Distribution of daylights. This plot shows that almost all of the chromatic variation in measured daylight illuminants was obtained from measurements taken in shadow (grey bar), and corresponds to the varying colours of patches of sky. When the sun is included in integrated measurements of total daylight illumination (black bars), there is very little chromatic variation among measurements. Data from the daylight measurements of Henderson and Hodgkiss (Henderson 1970, p. 127f.) and Winch et al. (Henderson 1970, p. 137f.) were pooled and plotted on a mired scale. (Mireds are an inverse measure of correlated colour temperature, which provides a more perceptually uniform scale of colours than colour temperature. Data were obtained from Henderson 1970.)
varying chromaticities in natural illumination has been greatly overestimated. On the other hand, illumination does vary from point to point across scenes, and this will pose a serious challenge for colour constancy. Endler (1993) has analysed the illumination in natural forests on a fine scale, and characterized many additional contributions, such as reflections and filtering through leaves, in addition to light from the sky. So while the Judd et al. set overestimates the chromatic variance in space-averaged illuminants, it underestimates the chromatic variance in spatially varying illumination. Analysis of spectral variance The linear models approach to colour constancy relies on the assumption that natural reflectances and illuminants can be represented by low-dimensional basis sets. Notably, Maloney reported that Cohen’s three principal components derived from the reflectances of the Munsell set could account for >99% of the variance in the reflectances of Krinov’s natural terrains set. But Maloney used an unusual metric, of variance not from the mean of the set but from absolute zero reflectance (black), and consequently the bulk of the variance accounted for represented just the DC signal in these all-positive reflectance functions. It would be interesting to know just how much of the chromatic variance (roughly, the variance in the shapes of the reflectance functions) could be accounted for by such a low-dimensional set. Applying another metric, of the variance from the mean accounted for in
normalized reflectance functions, yielded starkly different results: now Cohen’s Munsell set could account for just 43% of the chromatic variance in the Krinov set, and 65% of the variance in the individual object reflectances. A new three-dimensional basis set derived from natural reflectances was also developed, and this improved these values of variance accounted for to 76% for the Krinov set, and 87% for the individual reflectances (Brown 1993b). This suggests that, at least for chromatic constancy, the low-dimensional linear models will generate much larger errors than was previously appreciated. Endler (1993) also analysed the variances in his measurements of microilluminants, and found that while three basis functions could capture most of the variance in illumination within each scene, a different set of three basis functions was generally required for each different scene. It has been suggested that the trichromacy of human colour vision might represent an optimal adaptation to an underlying three-dimensionality in natural colour signals, of surfaces and/or illuminants (Cohen 1964; Shepard 1992). The present analysis suggests that natural signals contain considerably more than three dimensions of variance. Of course, the fact that many, or most, non-mammalian vertebrate animals have evolved four or more dimensions of spectral sensitivity also provides strong evidence that there remain natural colour signals for which we humans are colourblind. In addition to the chromatic variances shown in Figs 8.3 and 8.4, reflectances and illuminants have important achromatic variances in amplitude. Figure 8.5 illustrates the range of achromatic variance for illuminants (the Judd et al. set) and reflectances (the Brown set), in comparison with the chromatic variances along the r–g and y–b colour opponent axes (corresponding to the axes of the MacLeod–Boynton chromaticity diagram in Fig. 8.3).
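Why the choice of reference point matters so much for ‘variance accounted for’ can be seen in a toy calculation. The sketch below uses invented all-positive spectra and a deliberately crude approximation that replaces every spectrum by the mean of the set; it omits the normalization step mentioned above and does not reproduce any of the published analyses. All names and values are assumptions made for illustration.

```python
# Two ways to score "variance accounted for" by an approximation of spectra.
# The spectra are invented, all-positive toy reflectances at 5 wavelengths.

spectra = [
    [0.30, 0.32, 0.35, 0.40, 0.42],
    [0.50, 0.48, 0.45, 0.47, 0.52],
    [0.20, 0.25, 0.30, 0.28, 0.22],
    [0.60, 0.58, 0.62, 0.66, 0.64],
]

def mean_spectrum(spectra):
    """Wavelength-by-wavelength mean of the whole set."""
    n = len(spectra)
    return [sum(s[i] for s in spectra) / n for i in range(len(spectra[0]))]

def sum_sq(spectra, reference):
    """Total squared deviation of every sample from a reference spectrum."""
    return sum((v - ref) ** 2 for s in spectra for v, ref in zip(s, reference))

mean = mean_spectrum(spectra)
zero = [0.0] * len(mean)

# Crude approximation: every spectrum is replaced by the set mean, so none of
# the differences between surfaces is captured.
residual = sum_sq(spectra, mean)

# Same residual, two different baselines for the total variance.
from_zero = 1.0 - residual / sum_sq(spectra, zero)  # measured from black
from_mean = 1.0 - residual / sum_sq(spectra, mean)  # measured from the set mean

print(f"accounted for, measured from zero: {from_zero:.1%}")
print(f"accounted for, measured from mean: {from_mean:.1%}")
```

Measured from zero, this approximation appears to account for roughly nine-tenths of the variance even though it captures no differences between surfaces at all; measured from the mean, it accounts for none.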
[Figure 8.5 plot. Relative range of signals for reflectances and illuminants along three axes: log I, b, and r.]
Figure 8.5 Variances of natural colour signals in the opponent channels. This illustrates the relative ranges of variation in illuminants and reflectances, for each dimension of MacLeod–Boynton colour space (corresponding to the physiological opponent-colour channels). For luminance, the variation of intensity in illuminants is many orders of magnitude greater than for reflectances, showing a dramatic difference even on the logarithmic scale of luminance. But for purely chromatic variations, after luminance variation is removed, this relation is reversed, with natural reflectances showing greater variation than natural illuminants in both the blue–yellow (b) and red−green (r) dimensions. Illuminants data from Judd et al. (1964), and reflectances data from Brown, as in Fig. 8.3.
Natural illuminants span an enormous range of intensities, roughly 10 orders of magnitude from starlight to bright sunlight, while the range of total reflectances of natural objects spans less than two orders of magnitude from deep black to white. This disparity in the range of intensities is indicated on the logarithmic scale of luminance at left. But with this intensity variance removed, the natural reflectances span a much larger range of the chromatic axes than do the natural illuminants. Apparently, the chromatic and achromatic dimensions of colour vision face very different challenges from changing illuminants versus changing backgrounds. Might the visual system have evolved distinct strategies for achieving achromatic and chromatic colour constancy, in tune with these differences in the ecological colour signals?
Asymmetries in the opponent channels
The initial human visual response to colour signals occurs in the three types of cones, which each respond to the intensity of illumination weighted by different spectral sensitivities. Each type of cone is subject to almost the same 10 billion-fold range of illuminant intensities, and to the 100-fold range of reflectances, as the luminance signal of Fig. 8.5, and so the three cone channels are practically symmetrical in their relative variance due to illuminants and backgrounds. Thus the cones do not seem well suited to take advantage of the achromatic/chromatic asymmetry in natural colour signals. The retina transforms the initial cone signals into the luminance and colour-opponent signals of retinal ganglion cells, which the brain uses for subsequent visual processing, including colour vision. This transformation places almost all of the intensity variation into the luminance channel, while the remaining chromatic signals are in the two colour-opponent channels. This suggests that the opponent transformation places almost all of the variance in illumination into the luminance channel, and leaves the two colour-opponent channels to deal with variations due primarily to reflectances. While there may be many other reasons for this retinal transformation into opponent channels (see MacLeod and von der Twer, Chapter 5 this volume), the observation that it largely separates the illuminant variation from the reflectance variation suggests that one advantage it may offer is to facilitate colour constancy with changing backgrounds and illuminants. The transformation from cone responses into opponent channels has not generally been regarded as particularly relevant for colour constancy. According to David Hubel (1988), ‘the two ways of handling color—r, g, and b on the one hand and b–w, r–g, and y–b on the other—are really equivalent’.
But there are a number of intriguing asymmetries in the properties of the luminance and colour-opponent channels which might represent adaptations to their different distributions of natural colour signals. Some of these asymmetries are discussed below, with speculations on their possible relevance for colour constancy.
Saturation of chromatic but not achromatic induction at low contrast
One of the striking, but often overlooked, differences between achromatic and chromatic colour induction is that the strength of achromatic induction increases monotonically with
increasing contrast, while the strength of chromatic induction saturates at surprisingly low chromatic contrasts, and is flat, or even decreasing, for higher contrasts. A nice demonstration of this (from Meyer, described in Helmholtz 1867) is the tissue paper effect: a grey piece of paper on a saturated colour background generally appears lightly tinged with an induced complementary colour, but desaturating the background by overlaying a piece of white tissue paper may dramatically increase the induced colour (cf. Mausfeld, Chapter 13 this volume). Kirschmann (1892) reported diminishing returns with increasing chromatic contrast in his classic studies of colour induction. De Valois et al. (1986) also reported an asymmetry between achromatic and chromatic induction, with achromatic induction continuing to increase over a much larger range of contrasts than chromatic induction. Might this asymmetry in induction relate to a corresponding difference between luminance and chromatic variance? Helmholtz suggested that Meyer’s effect could be interpreted in terms of the likelihood that the light from the inducing surround corresponds to the colour of the illuminant. Since the chromatic range of natural illuminants is quite small, it might make sense to ‘bet’ that only desaturated surrounds are likely to represent the chromaticity of the illuminant. On the other hand, for achromatic induction there would be practically no limit to the range of surround intensities corresponding to illuminant intensities. It must be noted, however, that humans also achieve colour constancy over a large range of artificial illuminants, spanning a much larger gamut of chromaticities than natural illuminants, and the simplistic assumption that illuminants rarely vary in chromaticity cannot account for this.
Chromatic but not achromatic sensitivity to diffuse stimulation
It is commonly noted that the early visual system responds primarily to local contrasts, and thus loses sensitivity to uniform changes. Whittle and Challands (1969) wrote that ‘loosely speaking, the visual system differentiates the input, and to achieve certain perceptual goals we have to integrate it’. Because of this contrast dependence, there is remarkably little effect of changing the overall intensity of illumination (Walraven et al. 1990), and this has been taken as the key to lightness constancy. But there is considerable evidence that the colour-opponent channels are not so dependent on spatial contrast. In the study of De Valois et al. (1958) on primate lateral geniculate nucleus (LGN) cells, strong responses were found to diffuse monochromatic light. The centre-surround structure of retinal ganglion cells and LGN cells also supports this distinction. Typical Type II cells, with a cone-selective centre and a complementary or non-selective antagonistic surround, provide the same cell with sensitivity to spatial variations in luminance contrast, while maintaining sensitivity to spectral variations in uniform illumination (Hubel 1988). A study by Tootell et al. (1988) of activity in the primary visual cortex of macaques, using radioactive tracer 2-deoxyglucose (2-DG) to measure neural activity, also found much stronger cortical responses to spatially diffuse chromatic variation than to spatially diffuse luminance variations, and a number of psychophysical studies have found a corresponding effect in which contrast sensitivity requires much greater spatial contrast for luminance than for chromaticity (Mullen 1985). So perhaps the visual system does not ‘differentiate’ the chromatic signals as much as it does the luminance signals. Since the space-averaged chromaticity of light from a scene is
more likely to represent surface reflectances than illuminants, allowing the DC chromatic signals to pass might be an important contributor to chromatic colour constancy, possibly accounting for why the green forest continues to look green and not a normalized grey. At the same time, the empirical observations that chromatic ganzfelds lose their perceived colour (Cohen 1964), that retinally stabilized stimuli fill in with the surrounding colour (Iarbus 1967), and that dichoptic contrast matches are determined almost entirely by contrast (Whittle, Chapter 3 this volume) indicate that under these circumstances local chromatic signals may not reach the brain.
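The trade-off discussed above, between contrast coding and passing the DC signal, can be made concrete with a small sketch. It assumes a simplified contrast signal defined relative to the space-averaged level; the function name and the reflectance values are hypothetical. The contrast signal is unchanged by a uniform change in illumination intensity, but it also carries no information about the scene average.

```python
import numpy as np

def contrast_signal(signal):
    """Signal expressed as a fractional deviation from its space average."""
    values = np.asarray(signal, dtype=float)
    mean = values.mean()
    return (values - mean) / mean

reflectances = np.array([0.05, 0.2, 0.4, 0.8])   # hypothetical surfaces
dim_scene = reflectances * 1e-2                  # weak illumination
bright_scene = reflectances * 1e6                # strong illumination

# Contrast is invariant to the overall intensity of illumination ...
same_contrast = np.allclose(contrast_signal(dim_scene),
                            contrast_signal(bright_scene))
# ... but its space average is always zero, so the DC level is discarded.
dc_component = contrast_signal(bright_scene).mean()
print(same_contrast, abs(dc_component) < 1e-12)
```

If chromatic signals were coded only in this way, a scene dominated by one chromaticity would be renormalized towards neutral. This is one way to read the suggestion that the chromatic channels let more of the DC signal through than the luminance channel does.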
Dependence of chromatic induction on luminance contrast
Another interesting asymmetry among the opponent channels lies in the interaction between chromatic induction and achromatic contrast. Kirschmann (1892) reported that maximal colour induction occurred when the inducing surround and the induced test spot had equal brightness. Others have found that induction is strongest for surrounds equal or greater in luminance, and falls off when the surround is dimmer than the test spot (Jameson and Hurvich 1959; Kinney 1962). What implications might this have for colour constancy? Recall that most of the chromatic variance in daylights was associated with the spatial changes from sun to shadow under blue skies (Fig. 8.4). If the strongest chromatic deviations in natural illuminants are associated with deep shadows, it may be valuable to link chromatic induction to luminance relations. The experiment shown in Fig. 8.6 provides a hint for how this interaction may be used to promote colour constancy with spatially varying illumination across a scene. The surround consisted of sectors varying in luminance and chromaticity. Induction was measured into test spots of variable luminance. The result was that induction into the test spot was strongest from sectors having the same luminance as the test spot, consistent with Kirschmann’s law. In other words, the dark test spots were most affected by the dark sectors of the surround, and the light test spots by the light sectors of the surround. This mechanism, if it operates similarly when viewing natural scenes, would tend to segregate chromatic interactions within shadows from those within sunlit areas.
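The logic of a sectored surround of this kind can be sketched with a toy calculation. It assumes six sectors whose areas are inversely proportional to their luminances, and a signed chromatic signal (purple positive, green negative) that scales with luminance; the luminance values and the function name are hypothetical and are not taken from the experiment. Two complementary surrounds built this way share the same space-averaged light but differ in the sign of the covariance between luminance and chromaticity.

```python
import numpy as np

def surround_statistics(luminances, chroma_signs):
    """Area-weighted mean luminance, mean chromatic signal, and covariance."""
    lum = np.asarray(luminances, dtype=float)
    sign = np.asarray(chroma_signs, dtype=float)
    area = 1.0 / lum                 # darker sectors are larger (assumption)
    area = area / area.sum()         # normalise areas to sum to one
    chroma = lum * sign              # signed purple(+)/green(-) signal
    mean_lum = float((area * lum).sum())
    mean_chroma = float((area * chroma).sum())
    covariance = float((area * (lum - mean_lum) * (chroma - mean_chroma)).sum())
    return mean_lum, mean_chroma, covariance

sector_lum = [10, 10, 10, 40, 40, 40]   # three dark and three bright sectors
# Dark-green sectors with bright-purple sectors.
green_dark = surround_statistics(sector_lum, [-1, -1, -1, +1, +1, +1])
# Dark-purple sectors with bright-green sectors.
purple_dark = surround_statistics(sector_lum, [+1, +1, +1, -1, -1, -1])

print("mean luminance:", green_dark[0], purple_dark[0])
print("mean chromatic signal:", green_dark[1], purple_dark[1])
print("covariance:", green_dark[2], purple_dark[2])
```

In this sketch any model that reads only the space-averaged light cannot tell the two surrounds apart, whereas the luminance-chromaticity covariance distinguishes them by sign.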
These results also provide another strong challenge to models based on the average light from surrounds: the surround with large dark-green sectors and small bright-purple sectors had the identical space-averaged light as the complementary surround with large dark-purple sectors and small bright-green sectors, yet they had opposite colour induction effects on the test spots. Both surrounds also had essentially the same luminance variances and chromatic variances, but differed in the covariances of luminance and chromaticity. In considering the possible contribution of the opponent channel transformation to colour constancy, it is necessary to assess the likelihood that the neural processing underlying colour constancy actually involves opponent representations. Many models of colour constancy, especially those involving von Kries’s adaptation, are based on cone-based representations. The cone signals are transformed by retinal processing to generate the luminance and colour-opponent retinal ganglion cell signals, which are sent to the LGN and thence to the primary visual cortex for further processing. Both the LGN and the primary cortex maintain opponent representations, and so it seems a reasonable assumption that post-retinal colour processing occurs largely with opponent representations. Therefore, evidence that
[Figure 8.6 graphic: net colour induction (vertical axis, –1 to 1) plotted against luminance of the test spot (dark, mean, bright), for surrounds of dark green and bright purple, or dark purple and bright green, with neutral space-averages.]
Figure 8.6 Dependence of colour induction on luminance of the induced spot. This summarizes data from an experiment by Brown (1993a). When a background varied in both luminance and chromaticity, the colour induced into test spots depended strongly on the luminance of the test spots, and was determined primarily by regions of the surround having luminances equal to or greater than the test spots. In this study, a small test spot of variable luminance was embedded in surrounds varying in both luminance and chromaticity. The two complementary surrounds used had identical space-averaged luminances and neutral chromaticity, but one consisted of three large, dark-green sectors and three small, bright-purple sectors, while the other consisted of three large, dark-purple sectors and three small, bright-green sectors. (The sizes of the sectors were inversely related to their luminances to maintain the neutral space-average. Purple and green were chosen to maximize the saturation available on a monitor.) Subjects cancelled the induced colour by adjusting the chromaticity of the test spot along a purple−green axis until it appeared neutral grey. Results from 18 subjects (nine for each surround) were pooled, and data plotted to show the dependence of chromatic induction on the luminance of the test spot. When the test spot had the same luminance as the dark sectors, its colour appeared complementary to the dark sectors. Test spots with luminances equal to either the bright sectors or the mean luminance of the disc appeared complementary to the bright sectors.
the adaptations and spatial interactions involved in colour constancy are post-retinal would support the likelihood that they occur on an opponent representation, and not in a cone space. A number of lines of evidence do suggest a cortical site for much of the processing involved in colour constancy. There is the anatomical evidence that receptive fields in precortical visual areas are generally quite small, and not likely to support the long-range interactions involved in colour constancy. Zeki’s (1983) study of colour responses to Mondrian displays under changing illumination found that cells, even in V1, were responding based primarily on the wavelengths and not the colour appearances, while cells in V4, with much larger receptive fields and long-range connections, had responses resembling colour constancy. Land et al. (1983) studied a patient with a severed corpus callosum, and found that colour induction effects were restricted to half visual fields, suggesting that the spatial interactions involved must be cortical. De Valois et al. (1986) suggested that the temporal properties of colour induction also indicate a cortical locus. Alan Gilchrist (1977) demonstrated the effects of depth, which presumably involve cortical processing, on colour induction. Olson
and Boynton (1984) found that colour induction in stimuli presented to one eye can be masked by stimuli presented to the other eye, and wrote, ‘It is concluded that the basis of the chromatic induction is largely or entirely nonretinal’. And two studies (Thompson and Latchford 1986; Webster et al. 1988) used the McCollough effect of orientation-contingent colour adaptation to demonstrate that the McCollough effect adapts to the local physical composition of light, not its perceived colour; this implied that the perceived colour was generated after the presumably cortical locus of the orientation-dependent effect. The above experiments all point to the likelihood that much of the processing involved in colour constancy has a cortical locus, and so is likely to involve opponent representations. This hypothesis suggests that the visual system evolved to take advantage of the asymmetries in the achromatic and chromatic dimensions of colour signals for changing backgrounds and illuminants by confining almost all of the achromatic variance into one channel, leaving primarily reflectance variance in the other two. The asymmetries and other idiosyncrasies of colour induction in these channels may be tuned to take advantage of this difference, and make the problem of colour constancy easier. But by no stretch does this solve colour constancy, as each dimension still has to deal with varying illuminants and backgrounds, just with different distributions of these. The luminance channel must deal with essentially the problem of ‘lightness constancy’, over a range of 10 orders of magnitude of illumination and 2 orders of magnitude of reflectance. And while the colour-opponent channels have relatively small chromatic variance in natural illumination to contend with, this can still be significant.
For example, an observation believed to be due to Maksimov is that the leaf of a dandelion in direct sunlight has approximately the same physical chromaticity as a dandelion flower in deep shade, yet the flowers look yellow and the leaves green in both sun and shade.
Summary
The seeming immediacy and reliability of colour vision belies the inherent ambiguity of colour signals, and the complexity of the neural processing involved in colour perception. The effort to understand human colour vision has been one of the major enterprises of perceptual study, engaging the efforts of generations of top researchers, including many of history’s most celebrated scientists, such as Leonardo da Vinci, Newton, Young, Helmholtz, Maxwell, Mach, Schrödinger, and Crick. Despite this enormous effort, we still lack a model of colour vision that can successfully predict the perceived colours, even in simple two-dimensional ‘Mondrians’. Much of the difficulty arises from two essential but complementary aspects of colour vision: its dependence on the spectral power distribution of light from each point in a scene, reflected in the cone quantum catches, and its dependence on comparisons between the light signals from across the scene, reflected in post-receptoral contrast signals. The local signals involve equally the surface reflectance properties of objects and the spectral power distribution of illuminants, and so make it difficult to maintain object colour constancy across varying illuminants. Contrast representations are largely invariant with changing illuminants, but introduce problematic sensitivity to changing background reflectances.
Two classes of colour constancy models make strong assumptions about the properties of natural colour signals, which were tested in an analysis of natural spectral reflectances and illuminants. The frequent assumption made by visual models of colour constancy, that the space-averaged light from scenes may be used to estimate the illuminant, was shown to be inconsistent with the natural data for chromatic variance, but to be a more plausible hypothesis for intensity variations. The assumption made by linear models of colour constancy, that natural colour signals are well represented by two- or three-dimensional basis sets, was challenged by the inability of such models to represent the chromatic variance in natural reflectances. A further analysis of the natural reflectances and illuminants found an interesting asymmetry between the luminance and chromatic dimensions of colour vision. The luminance channel must handle an enormous range of illuminant intensities, and a much smaller (though still large) range of reflectances, corresponding to the problem of lightness constancy. But the chromatic variance in the colour-opponent channels is considerably larger for natural reflectances than natural illuminants. Evidence from neuroscience and psychophysics suggests that much of the processing for colour constancy occurs at or beyond this opponent processing stage. This raises the possibility that the visual system has adopted distinct strategies for achieving colour constancy in the luminance and colour-opponent channels, rather than approaching colour constancy as simply ‘lightness constancy × 3’.
Asymmetries in the properties of luminance and colour-opponent channels, including the saturation of chromatic but not achromatic induction at low contrast, the sensitivity of chromatic but not achromatic mechanisms to diffuse stimulation, and the dependence of chromatic induction on achromatic contrast, are all possible manifestations of such adaptive responses to natural colour signals.
Acknowledgements
I am very grateful to Donald MacLeod for his kind support and encouragement of this research. I thank Rainer Mausfeld, as well as all the participants in the ZiF workshop, for their companionship and their valuable comments and criticisms. This work was funded by NIH EY01711 to DIAM.
References
Brainard, D. H. and Wandell, B. A. (1986). Analysis of the retinex theory of color vision. Journal of the Optical Society of America A 3, 1651–1661.
Brown, R. O. (1993a). Integration of nonlinear contrasts: implications for color constancy. Perception 22, 14.
Brown, R. O. (1993b). A cone-based linear model of spectral reflectances. Optical Society of America Annual Meeting Technical Digest, 252.
Brown, R. O. (1994). The world is not grey. Investigative Ophthalmology and Visual Science 35, 2165.
Brown, R. O. and MacLeod, D. I. A. (1997). Color appearance depends on the variance of surround colors. Current Biology 7, 844–849.
Buchsbaum, G. (1980). A spatial processor model for object-color perception. Journal of the Franklin Institute 310, 1–26.
Cohen, J. (1964). Dependency of the spectral reflectance curves of the Munsell color chips. Psychonomic Science 1, 369–370.
Cornsweet, T. N. (1970). Visual perception. Academic Press, New York.
Delbrück, M. (1986). Mind from matter—An essay on evolutionary epistemology. Blackwell Scientific, Palo Alto.
De Valois, R. L., Smith, C. J., Kitao, A. J., and Kita, S. (1958). Responses of single cells in different layers of the primate lateral geniculate nucleus to monochromatic light. Science 127, 238–239.
De Valois, R. L., Webster, M. A., De Valois, K. K., and Lingelbach, B. (1986). Temporal properties of brightness and color induction. Vision Research 26, 887–897.
Eco, U. (1985). How culture conditions the colours we see. In On signs: A semiotics reader, (ed. M. Blonsky), pp. 158–175. Blackwell, Oxford.
Emmerson, P. G. and Ross, H. E. (1987). Variation in colour constancy with visual information in the underwater environment. Acta Psychologica 65, 101–113.
Endler, J. A. (1993). The color of light in forests and its implications. Ecological Monographs 63, 1–27.
Evans, R. (1964). Variables of perceived color. Journal of the Optical Society of America 54, 1467–1474.
Fairchild, M. D. and Lennie, P. (1992). Chromatic adaptation to natural and incandescent illuminants. Vision Research 32, 2077–2085.
Finger, S. (1994). Origins of neuroscience, Chapter 7. Oxford University Press, Oxford.
Gilchrist, A. L. (1977). Perceived lightness depends on perceived spatial arrangement. Science 195, 185–187.
Gilchrist, A. L. (1979). Perception of surface blacks and whites. Scientific American 240, 112–124.
Gilchrist, A. L. and Jacobsen, A. (1983). Lightness constancy through a veiling luminance. Journal of Experimental Psychology: Human Perception and Performance 9, 936–944.
Hamilton, W. J. (1979). Are selection pressures different? Discussion of Snodderly. In The behavioral significance of color, (ed. E. H. Burtt Jr), pp. 282–283. Garland STPM Press, New York.
Helmholtz, H. von (1867). Handbuch der physiologischen Optik.
Helson, H. (1938). Fundamental problems in color vision. I. The principle governing changes in hue, saturation, and lightness of non-selective samples in chromatic illumination. Journal of Experimental Psychology 23, 439–476.
Henderson, S. T. (1970). Daylight and its spectrum. American Elsevier Publishing Company, New York.
Hering, E. (1920). Grundzüge der Lehre vom Lichtsinn. Springer, Berlin.
Hochberg, J. (1988). Visual perception. In Stevens’ handbook of experimental psychology, Vol. 1, Perception and motivation, (ed. R. C. Atkinson, R. J. Herrnstein, G. Lindzey, and R. D. Luce). John Wiley & Sons, New York.
Hubel, D. H. (1988). Eye, brain and vision. Scientific American, New York.
Hurlbert, A. (1986). Formal connections between lightness algorithms. Journal of the Optical Society of America A 3, 1684–1693.
Iarbus, A. L. (1967). Eye movements and vision, (transl. B. Haigh). Plenum Press, New York.
Ives, H. E. (1912). The relation between the color of the illuminant and the color of the illuminated object. Transactions of the Illuminating Engineering Society 7, 62–72.
Jameson, D. and Hurvich, L. M. (1959). Perceived color and its dependence on focal, surrounding, and preceding stimulus variables. Journal of the Optical Society of America 51, 46–53.
Judd, D. B., MacAdam, D. L., and Wyszecki, G. (1964). Spectral distribution of typical daylight as a function of correlated color temperature. Journal of the Optical Society of America 54, 1031–1040.
Kaiser, P. K. and Boynton, R. M. (1996). Human color vision, (2nd edn). OSA, Washington.
Kinney, J. A. S. (1962). Factors affecting induced color. Vision Research 2, 503–525.
Kirschmann, A. (1892). Some effects of contrast. American Journal of Psychology 4, 542–557.
Krinov, E. L. (1947). Spectral reflectance properties of natural formations (Technical Translation). National Research Council of Canada TT, 439.
Land, E. H. et al. (1983). Colour-generating interactions across the corpus callosum. Nature 303, 616–618.
Maloney, L. T. (1986). Evaluation of linear models of surface spectral reflectance with small numbers of parameters. Journal of the Optical Society of America A 3, 1673–1683.
Mausfeld, R. (1998). Color perception: From Grassman codes to dual code for object and illumination colors. In Color vision—perspectives from different disciplines (ed. W. Backhaus, R. Kliegl, and J. S. Werner), pp. 219–250. de Gruyter, Berlin.
McCann, J. J. (1989). The role of simple nonlinear operations in modeling human lightness and color sensations. International Society for Optical Engineering 10777, 355–363.
Mullen, K. T. (1985). The contrast sensitivity of human colour vision to red–green and blue–yellow chromatic gratings. Journal of Physiology 359, 381–400.
Nagle, M. G. and Osorio, D. (1993). The tuning of human photopigments may minimize red–green chromatic signals in natural conditions. Proceedings of the Royal Society of London Series B – Biological Sciences 252, 209–213.
Olson, C. X. and Boynton, R. M. (1984). Dichoptic metacontrast masking reveals a central basis for monoptic chromatic induction. Perception and Psychophysics 35, 295–300.
Polyak, S. L. (1957). In The vertebrate visual system, (ed. H. Kluver). University of Chicago Press, Chicago.
Ruderman, D. L., Cronin, T. W., and Chiao, C.-C. (1998). Statistics of cone responses to natural images: implications for visual coding. Journal of the Optical Society of America A 15, 2036–2045.
Shepard, R. N. (1992). The perceptual organization of colors: An adaptation to regularities of the terrestrial world? In The adapted mind: Evolutionary psychology and the generation of culture, (ed. J. H. Barkow et al.), pp. 495–532. Oxford University Press, New York.
Thompson, P. and Latchford, G. (1986). Colour-contingent after-effects are really wavelength-contingent. Nature 320, 525–526.
Tootell, R. B., Silverman, M. S., Hamilton, S. L., De Valois, R. L., and Switkes, E. (1988). Functional anatomy of macaque striate cortex. III. Color. Journal of Neuroscience 8, 1569–1593.
Uexküll, J. von (1909/1985). Environment and inner world of animals (Umwelt und Innenwelt der Tiere). In Foundations of comparative ethology, (ed. G. M. Burghardt), pp. 222–245. Van Nostrand Reinhold, New York.
Vollmer, G. (1984). Mesocosm and objective knowledge. In Concepts and approaches in evolutionary epistemology, (ed. F. M. Wuketits), pp. 69–121. Reidel, Amsterdam.
Wallach, H. (1948). Brightness constancy and the nature of achromatic colors. Journal of Experimental Psychology 38, 310–324.
Walraven, J., Enroth-Cugell, C., Hood, D. C., MacLeod, D. I. A., and Schnapf, J. L. (1990). The control of visual sensitivity. In Visual perception: The neurophysiological foundations, (ed. L. Spillmann and J. S. Werner), pp. 53–101. Academic Press, San Diego.
Wandell, B. A. (1995). Foundations of vision. Sinauer, Sunderland, MA.
Webster, M. A. and Mollon, J. D. (1997). Adaptation and the color statistics of natural images. Vision Research 37, 3283–3298.
Webster, W. R., Day, R. H., and Willenberg, K. (1988). Orientation-contingent color aftereffects are determined by real color, not induced color. Perception and Psychophysics 44, 43–49.
Whittle, P. and Challands, P. D. C. (1969). The effect of background luminance on the brightness of flashes. Vision Research 9, 1095–1110.
Zeki, S. (1983). Colour coding in the cerebral cortex: the responses of wavelength-selective and colour-coded cells in monkey visual cortex to changes in wavelength composition. Neuroscience 9, 767–781.
Zeki, S. (1993). A vision of the brain. Blackwell Scientific Publications, Oxford.
Commentaries on Brown
Colour construction
Don Hoffman
Over a wide range of viewing conditions, our experience of an object’s colour varies little. This ‘colour constancy’ is a striking achievement of human vision, and raises a much-debated question: What is the proper theoretical framework for understanding colour constancy? Several approaches have been considered. The most common has been to estimate the spectral composition of the illuminant, and to compensate for it, while estimating the colours of object surfaces. An example of this approach is any model based on a variant of the ‘Grey World’ hypothesis: the assumption that the average reflectance of surfaces in a visual scene is grey. This hypothesis allows one to estimate the spectral composition of the illuminant by simply computing how the space-averaged light from a scene deviates from grey. Another approach uses the assumption that illuminants and reflectances in nature can adequately be represented by linear models of low dimension, say two or three dimensions each. This assumption, when true, allows one to compute the reflectances and illuminants in a scene, thereby giving colour constancy. In his paper ‘Backgrounds and illuminants: The yin and yang of colour constancy’ Richard Brown pursues a third approach, which urges careful attention to the evolutionary constraints on perception, and therefore to the ecological properties of the environment to which perception, and in particular colour perception, might be adapted. Careful attention to ecology can uncover shortcomings of the standard approaches. Brown notes that careful study of the reflectances of natural scenes reveals that the Grey World hypothesis is, in general, false. Furthermore, careful study of reflectances and illuminants in nature reveals that they are not adequately represented by linear models of low dimension.
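The Grey World estimate described above, and the way it fails when the assumption is violated, can be sketched as follows. The example assumes a three-channel sensor, synthetic reflectances, and a hypothetical illuminant; none of these values comes from Brown's data, and the function name is illustrative. The illuminant chromaticity is estimated from the space-averaged signal, first for a scene whose average reflectance is grey by construction and then for a scene biased towards one channel.

```python
import numpy as np

def estimate_illuminant(image):
    """Grey World estimate: chromaticity of the space-averaged signal."""
    mean_signal = image.mean(axis=0)
    return mean_signal / mean_signal.sum()

rng = np.random.default_rng(1)
values = rng.uniform(0.1, 0.9, size=500)
# Scene whose three channel means are equal by construction (a grey world):
# each channel holds the same reflectance values in a different order.
grey_scene = np.stack([rng.permutation(values) for _ in range(3)], axis=1)
# Hypothetical foliage-dominated scene: the middle channel reflects most.
forest_scene = grey_scene * np.array([0.5, 1.0, 0.4])

illuminant = np.array([1.2, 1.0, 0.7])            # hypothetical illuminant
true_chromaticity = illuminant / illuminant.sum()

estimate_grey = estimate_illuminant(grey_scene * illuminant)
estimate_forest = estimate_illuminant(forest_scene * illuminant)
error_grey = float(np.abs(estimate_grey - true_chromaticity).max())
error_forest = float(np.abs(estimate_forest - true_chromaticity).max())
print(f"error, grey scene: {error_grey:.2e}; biased scene: {error_forest:.3f}")
```

The estimate is exact only when the scene average really is grey. In the biased scene the average reflectance is attributed to the illuminant, which is the kind of error at issue when natural scenes depart from the Grey World assumption.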
Although the standard approaches are attractive for their computational simplicity, that simplicity apparently derives from unrealistically simplified representations of the natural ecology of reflectances and illuminants, and therefore such approaches are unlikely to perform adequately for natural scenes. They are, in consequence, unlikely to be adequate models for human visual performance. Brown also uses his ecological approach to suggest interesting directions to explore for biologically plausible accounts of colour constancy. He notes, for instance, that for luminance, the variation of intensity in natural illuminants is much larger than in natural reflectances, whereas natural reflectances exhibit much larger variation than natural illuminants in both the blue–yellow and red–green dimensions of MacLeod–Boynton colour space. He suggests that the luminance and colour-opponent channels in human vision are adapted to these differences, and that this adaptation is helpful, though not the entire solution, for colour constancy. Thus Brown’s ecological approach is useful both in pointing out shortcomings of existing approaches to colour constancy, and in suggesting new directions of exploration for more adequate accounts. However, Brown himself does not propose such an account. I agree with Brown that an ecological approach to colour constancy is required, one that gives due attention to the properties of natural visual scenes and the perceptual problems faced by the visual system. Brown’s analysis of the chromatic and luminance properties of natural reflectances and illuminants, and their relation to human colour opponent channels, is an important step in this direction. But I also suggest, and I suspect Brown would agree, that the scope of such analyses must be made much wider. A demonstration I first saw from Jan Koenderink will help to make this point. 
The demonstration is available online at this URL: http://aris.ss.uci.edu/cogsci/personnel/hoffman/Applets/Grid/Grid.html
On the left, in this demonstration, I have arranged a set of 49 squares with systematically varying chromaticities. On the right I have simply rearranged the positions of the same 49 squares randomly. However, the two sets of squares appear different in several respects. The set of squares on the left appear to be illuminated by several different coloured light sources. The squares themselves do not appear flat, but slightly scalloped, concave or convex. Moreover each square does not appear of uniform brightness, but appears lighter on one side and darker on the other. In contrast, the set of squares on the right appear to be illuminated by a single white light source. The squares appear perfectly flat and each of uniform brightness. And there appear to be browns and tans, colours that do not appear in the set of squares on the left. This demonstration illustrates that the visual construction of colours is done in concert not only with the construction of the chromatic properties of light sources, but also in concert with the construction of the three-dimensional shapes of surfaces. And this makes sense when one considers that realistic reflectance models must, in general, take into account the angle of incidence and reflection of light with respect to the local normal of the surface, i.e. such reflectance models must incorporate the three-dimensional geometry of the surface. So the problem of colour constancy cannot be divorced from the problem of constructing three-dimensional surface shapes, and therefore the ecological considerations that motivate our theories of colour constancy must be extended to include three-dimensional shape. Not only does three-dimensional shape interact with our construction of colours, so also does apparent motion (Cicerone et al. 1995; Wollschläger et al. 2001).
A demonstration of this is available online at: <web>http://aris.ss.uci.edu/cogsci/personnel/hoffman/Applets/Outline/java.html <xweb> In this demonstration, there are many small, coloured dots scattered at random over a white background. All of the dots are one colour, say, red, except for those dots that happen to lie inside a virtual disc and are coloured, say, green. As the virtual disc moves from one frame to the next, some dots change colour, depending on whether they are now inside or outside the disc, but no dots ever move. The perceptual effect is quite striking. One sees a uniformly coloured disc gliding across the display. The chromatic properties of this disk depend on the speed of its apparent motion (Cicerone et al. 1995). Note that in this display one sees the green colour spreading over regions that a spectral photometer would reveal are white. So this demonstration illustrates that the problem of colour constancy cannot be divorced from the problem of constructing object motions, and therefore the ecological considerations that motivate our theories of colour constancy must be extended to include object motion. And this raises my last point. Once we focus on the fact that the colours we see depend not only on illuminants and reflectances, but also on three-dimensional shape, motion, and other factors as well, is it really useful to pick out colour constancy as the object of study? The perceptual phenomena we place under the rubric of colour constancy are part of a much wider range of phenomena, and perhaps not a natural part. We might do better to focus on the principles of colour construction more generally, rather than on colour constancy in particular. The phenomena now called colour constancy might then fall out as a special case.
References Cicerone, C. M., Hoffman, D. D., Gowdy, P., and Kim, J. (1995). The perception of color from motion. Perception and Psychophysics 57, 761–777. Wollschläger, D., Rodriguez, A. M., and Hoffman, D. D. (2001). Flank transparency: Transparent filters seen in dynamic two-color displays. Perception 30, 1423–1426.
Commentaries on Brown
Fitting linear models to data
Laurence T. Maloney
This chapter includes observations on a variety of topics relevant to surface colour perception and colour constancy. One topic of special relevance to me concerns the fitting of linear models to surface spectral reflectance functions, and I comment primarily on Brown’s remarks in the section titled ‘Analysis of spectral variance’ directed to my 1986 paper. That paper had two distinct goals. The first was to characterize and identify the physical processes that constrain naturally occurring surface spectral reflectance functions (SSRs), to understand why SSRs can often be described by models with a small number of parameters and why they sometimes cannot be. I argued that it was essential to go beyond simple fitting of data sets to arbitrary models and, instead, to explain why some models are appropriate and others not in terms of the basic physics of light–surface interaction (see Nassau 1983). This analysis can be framed without reference to any particular biological or computational visual system, and, indeed, must be. The second goal of the paper was to assess the consequences of the observed physical constraints for human surface colour perception. In his comments, Brown has somewhat confounded these two goals. Sorting them out is perhaps the best way to address the issues he raises. In Maloney (1986), I chose to represent each surface spectral reflectance, $S^k(\lambda)$, as the sum of a ‘model surface’, $S^k_\sigma(\lambda)$, and a ‘residual surface’, $s^k(\lambda)$,
$$S^k(\lambda) = S^k_\sigma(\lambda) + s^k(\lambda). \tag{1}$$
The model surface is, of course, the desired approximation and my goal was to be able to express it as a small number of parameters. The residual surface is simply the error in approximation and I wanted it to be as small as possible in the least-squares sense. I expressed the model surface as a truncated, generalized Fourier series,
$$S^k_\sigma(\lambda) = \sum_{j=1}^{N} \sigma^k_j S_j(\lambda). \tag{2}$$
Note that the basis functions, $\{S_1(\lambda), S_2(\lambda), \ldots, S_N(\lambda)\}$, are the same for all surface reflectance functions and that, therefore, once selected, the coordinates $(\sigma^k_1, \sigma^k_2, \ldots, \sigma^k_N)$ completely specify the model surface corresponding to any surface spectral reflectance function in the original sample. I chose to select the basis functions that allowed me to minimize squared error. The details of the procedure are described in Maloney (1986) and again in Maloney (1999). Once the $\{S_1(\lambda), S_2(\lambda), \ldots, S_N(\lambda)\}$ are selected, the fitting of each surface reflectance to a model surface is just an application of multivariate linear regression (Mardia et al. 1979). I reported the quartiles and median of the variance-accounted-for (VAF) by the linear regression fits to the individual surfaces in the sample, for each $N$. If the VAF were 1, then the model surface reproduces the original surface.1
1 This statement is not quite true since two functions can differ at a single point and their difference will still have zero quadratic norm. It is sufficient to assume that the functions are continuous or piecewise-continuous to make the statement in the main text true.
In Maloney (1986), I concluded that the ‘. . . number of parameters [N] required to model the spectral reflectances is
five to seven, not three’ (p. 1673). Consequently, a three-parameter linear model does not capture the range of variation in surface spectral reflectance functions in the samples I analysed, but a linear model with 5–7 parameters does. Of course, the question I then consider is, why? Why should a linear model with so few parameters provide a parsimonious description of two large data sets? The middle section of the paper addresses this question in detail and the hypothesis that emerged was that thermally induced molecular interactions in disordered liquids and solids effected a low-pass filtering on the surface spectral reflectances corresponding to these liquids and solids. Any reader interested in the details can find them in Maloney (1986, 1999) and Bonnardel and Maloney (2000). Up to this point, in the 1986 paper, I had avoided introducing any aspect of human vision into the analysis. This restriction was intentional: I wanted to be able to make a ‘species-independent’2 statement about the physics of surface reflectance. When $N$ is 3 (appropriate for a trichromatic visual system), any surface reflectance can be decomposed additively into its model surface and the residual,
$$S^k(\lambda) = S^k_\sigma(\lambda) + s^k(\lambda). \tag{3}$$
The residual need not be physically negligible, but we can examine whether the presence or absence of the residual has any effect on the human visual system and estimate the size of the effect. A linear model with N set to 3 is not a perfectly accurate model of the environment [this result is confirmed in Maloney (1999)]. It is an approximation and an idealization of the environment that might provide a basis for surface colour perception algorithms (Maloney, Chapter 9 this volume). In particular, if the human visual system proved to be relatively insensitive to the residual term, then this insensitivity would serve to enhance the effective fit of the model surface to the physical surface. The concluding third of Maloney (1986) shows that the broad, smooth shape of human photoreceptor spectral sensitivities serve to attenuate the impact of the residual surfaces relative to the model surfaces. The human visual system is selectively insensitive to deviations from the Linear Model representation developed in Maloney (1986). Wandell (1995, colour plate 4) allows the reader to compare model scenes for N = 1, 2, 3, 4, 5 and 6. It is evident that there is little or no change beyond N = 3 (or at least the changes are lost in the uncertainties surrounding commercial colour reproduction). Dannemiller (1992) examined human ability to discriminate model and physical surfaces for N = 3 and concluded that they should be typically indiscriminable. Dannemiller’s conclusion is likely correct and at the same time optimistic. Colour discrimination ability depends on viewing conditions (cf. Cornsweet 1970, pp. 217–219). The most rigorous test of discriminability would be to alternate rapidly two visual scenes, one composed of true surfaces, and one where a single surface is replaced by its model surface. This experiment has been carried out recently by Nascimento et al. (2001). 
They found that scenes recorded by a multispectral camera were visually indistinguishable from the originals by this flicker-criterion when N was 8 or more, not far from the physical limit (5–7) proposed by Maloney (1986), but readily discriminable when N was 3. Note that there are two issues here and I employed different metrics to answer them. The first is, What is the nature of the physical constraints on naturally occurring surface reflectances? The second is, how good is the approximation with respect to the performance of any given visual system? In addressing the first issue, it is appropriate to exclude any considerations related to any particular biological visual system. The VAF criterion that Brown objects to is completely appropriate.
2 Except that the limits of the recorded surface reflectance functions roughly coincided with the human visible spectrum.
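The fitting step and the VAF statistic discussed above can be sketched numerically. What follows is only an illustration under stated assumptions, not the procedure of Maloney (1986): the cosine basis, the sigmoid test reflectance, and the helper names `dot`, `solve`, and `fit` are all hypothetical, and VAF is assumed here to be one minus the ratio of residual energy to total energy, with no mean removed.

```python
import math

# 31 samples standing in for 400-700 nm at 10 nm steps; t is wavelength rescaled to [0, 1].
T = [i / 30 for i in range(31)]

# Hypothetical fixed basis S_1..S_5 (cosines); NOT the optimal basis of Maloney (1986).
basis = [[math.cos(j * math.pi * t) for t in T] for j in range(5)]

# Hypothetical smooth test reflectance (a sigmoid edge), values in (0, 1).
surface = [0.2 + 0.6 / (1.0 + math.exp(-12.0 * (t - 0.5))) for t in T]


def dot(u, v):
    """Discrete inner product of two sampled functions."""
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve the small system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit(s, B):
    """Return (weights, model surface, residual surface, VAF) for one reflectance s."""
    gram = [[dot(bi, bj) for bj in B] for bi in B]
    sigma = solve(gram, [dot(bi, s) for bi in B])  # least-squares weights
    model = [sum(w * b[i] for w, b in zip(sigma, B)) for i in range(len(s))]
    residual = [x - m for x, m in zip(s, model)]
    vaf = 1.0 - dot(residual, residual) / dot(s, s)  # assumed VAF definition
    return sigma, model, residual, vaf


# VAF as a function of the number of basis functions N.
for n in range(1, 6):
    print(n, round(fit(surface, basis[:n])[3], 6))
```

Because the bases are nested, the VAF in this sketch cannot decrease as N grows. How quickly it approaches 1 depends entirely on the assumed basis and test surface, so the printed values say nothing about real reflectance data.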
Brown proposes a measure he refers to as ‘chromatic variance’ that he does not fully define. What he seems to be doing is, first of all, to normalize all of the SSRs in a sample of SSRs (method of normalization unspecified) and then subtract from each the mean of the normalized SSRs. Let $\{S^1(\lambda), S^2(\lambda), \ldots, S^M(\lambda)\}$ denote a sample of $M$ SSRs and $\{\tilde{S}^1(\lambda), \tilde{S}^2(\lambda), \ldots, \tilde{S}^M(\lambda)\}$ the set of normalized SSRs with mean $S_0(\lambda)$ removed. Brown then selects a linear model to express this transformed set,
$$\tilde{S}^k_\sigma(\lambda) = \sum_{j=1}^{N} \tilde{\sigma}^k_j \tilde{S}_j(\lambda), \tag{4}$$
allowing him to write,
$$\tilde{S}^k(\lambda) = \tilde{S}^k_\sigma(\lambda) + \tilde{s}^k(\lambda), \tag{5}$$
as a sum of a model surface and a residual surface. In order to reconstruct a model surface corresponding to the original surface, he must restore the mean, and de-normalize, getting,
$$S^k_\sigma(\lambda) = \sigma^k_0 S_0(\lambda) + \sum_{j=1}^{N} \sigma^k_j S_j(\lambda) \tag{6}$$
which is identical in form to Equation 8.2 (except for indexing). The evident difference is that this method of fitting requires that the first basis function be the mean of the normalized SSRs, a choice that is unlikely to be optimal. Consequently, this fitting method results in fits that are suboptimal [see discussion in Maloney (1999), pp. 395–396]. If we wished to address the first issue described above, identification of the physical processes responsible for observed constraints on SSRs, it is hard to see why we would ever fit data in this way. But now consider the second issue, the impact of these constraints on human vision. In labelling his measure of goodness of fit ‘chromatic variance’, it is likely that Brown has this second issue in mind. I believe that what he means by ‘chromatic variance accounted for’ is the VAF in Equation 8.4, the VAF in reconstructing the normalized SSR with mean removed. But he has provided no evidence that the residual surface has any serious impact on human colour vision. Consequently, to call this measure ‘chromatic variance’ is misleading. MacLeod and Golz (Chapter 7 this volume), for example, conclude that a low chromatic-variance-accounted-for implies that a human observer will notice the discrepancies between the idealized model of the environment and the physical environment. This point is precisely what is not established by Brown: ‘chromatic variance’ seems to incorporate no aspect of the known sensitivity of human colour vision to colour differences.
References Bonnardel, V. and Maloney, L. T. (2000). Daylight, biochrome surfaces, and human chromatic response in the Fourier domain. Journal of the Optical Society of America A 17, 677–687. Cornsweet, T. (1970). Visual perception. Academic Press, New York. Dannemiller, J. L. (1992). Spectral reflectance of natural objects: How many basis functions are necessary? Journal of the Optical Society of America A 9, 507–515. Maloney, L. T. (1984). Computational approaches to color constancy. Dissertation, Stanford University. [Reprinted as (1985) Stanford Applied Psychology Laboratory Report 1985–01.] Maloney, L. T. (1986). Evaluation of linear models of surface spectral reflectance with small numbers of parameters. Journal of the Optical Society of America A 3, 1673–1683.
Maloney, L. T. (1999). Physics-based models of surface color perception. In Color vision: From genes to perception, (ed. K. R. Gegenfurtner and L. T. Sharpe), pp. 387–418. Cambridge University Press, Cambridge, UK. Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979). Multivariate analysis. Academic Press, London. Nascimento, S. M., Foster, D. H., and Amano, K. (2001). Reproduction of colors of natural scenes by low-dimensional models. Investigative Ophthalmology and Visual Science 42, Abstract 3871. Nassau, K. (1983). The physics and chemistry of color. Chichester: Wiley. Wandell, B. A. (1995). The foundations of vision. Sinauer, Sunderland, MA.
Chapter 9
SURFACE COLOUR PERCEPTION AND ENVIRONMENTAL CONSTRAINTS
Laurence T. Maloney
. . . I shall now remind you, that I did not deny, but that colour might in some sense be considered a quality residing in the body that is said to be coloured.
Robert Boyle (1663)
Preface I decided that I liked the study of colour vision when, as a beginning doctoral student, I found out there were vectors in it. Colour matching performance is readily characterized in terms of simple mathematical operations drawn from linear algebra, the same sorts of operations that describe the physical space around us in classical physics. To discover that something as difficult to describe as the experience of colour was, on occasion, so rule-governed, was amazing. The questions about visual perception that excite me the most, then and now, concern the psychological representation of properties of the visual environment and the kinds of mental operations we can perform on them. For me, these questions are intimately tied to mathematics since mathematical models of psychological representation can be specified with precision. Mathematics provides the necessary distance for us to understand explicitly things that we know too well implicitly, such as colour. The thesis of the following chapter is simply that we can model human surface colour perception as algorithms that, over certain ranges of environmental conditions, manage to assign colours to objects that are in correspondence with specific, objective properties of the object’s surface. These properties are the intrinsic colours (terminology due to Roger Shepard) and a part of the chapter is taken up with discussing what they might be. When an algorithm operates in its environment, our subjective experience of colour remains in correspondence with these intrinsic colours and a specification of the environment of any algorithm is a necessary part of any theory of colour perception along the lines proposed here. The ideas described here have accompanied me for a long time, and I find them to be enjoyable companions. There is some mathematics (see above) but I’ve tried to double the argument in words and accompany it with examples. L. T. Maloney
Figure 9.1 The bi-directional reflectance density function. The vector N is the unit normal to a specific point on a surface. The vector l is a unit vector from the same point on the surface in the direction of the light source and the vector ν is a unit vector from the same point in the direction of the viewer. The bi-directional reflectance density function specifies the proportion of light of wavelength λ arriving along l that is re-emitted in the direction ν.
Introduction If you and I were to disagree concerning the lengths of two rods, we might send out for a measuring tape or arrange to put the two rods next to each other so that they could be compared directly. In contrast, if we were to disagree about the relative redness of two surfaces, it is not at all clear what we might do to resolve the dispute. A physicist, called in for consultation, could readily provide a summary of how a small, designated patch of a surface interacts with light, its bi-directional reflectance density function (BRDF), S(λ; ν, l). This is, roughly speaking, the probability1 that a photon of wavelength λ, arriving at the surface from direction l, will be re-emitted in the direction of the viewer ν (see Fig. 9.1). It is plausible to assume that, if there is an objective correlate of the perceived colour of a surface, the intrinsic colour of the surface (Shepard 1992), then some computation applied to the BRDF of the surface should serve to measure it. But which computation, exactly? In the quotation that heads this chapter, the celebrated chemist Robert Boyle allows for the possibility that colours correspond to objective properties of surfaces, but is evidently uncertain as to how that could be. Indeed, the argument to the contrary has considerable force. There is psychophysical evidence indicating that there can be no intrinsic colours. Colour judgements depend not only on the BRDF of the surfaces involved, but also on the illumination of the scene (Helson and Judd 1936), atmospheric haze (Brown and MacLeod 1997), and the presence or absence of other surfaces not directly involved in the judgement, among other factors. These effects are not small: ‘If changes in illumination are sufficiently great, surface colors may become radically altered . . . [W]eakly or moderately selective illuminants with respect to wavelength leave surface colors relatively unchanged . . . 
but a highly selective illuminant may make two surfaces which appear different in daylight indistinguishable, and surfaces of the same daylight color widely different’ (Helson and Judd 1936, pp. 740–741). Over the range of experimental conditions considered by Helson and Judd, this lack of constancy of surface colour suggests that the relation between the BRDF of a surface patch and its perceived colour is slight: there are no intrinsic colours. Evans (1948, colour plate 13) illustrates how saturated, bright objects can change colour dramatically with a change in illuminant: yellow becomes red, green becomes grey. Nassau (1983, colour plate) includes a colour photograph of the gemstone alexandrite which can appear emerald green under daylight and ruby red under ordinary, indoor, incandescent illumination. Yet we need not visit a laboratory to observe large failures of colour and lightness2 constancy. When you attend a movie, for example, you view a flat, white surface on to which is projected a complicated, dynamic pattern of light, the illuminant. Your estimates of the surfaces in front of you likely correspond to the filmmaker’s conception of the film. You see people, cars, explosions, and so on, just as the script of the film predicted. None of these objects or their surfaces are present, and yet you ‘see’ them, occasionally forgetting about the only surface truly present, the uniform white screen. A small patch of the screen, during the course of the movie, might appear to be any and all colours in rapid succession. The failure of constancy in your perception of surface colour could not be larger. Mathematical analyses (Ives 1912; Sällström 1973; Maloney 1984) confirm this conclusion: it is simply not possible to go from the kind of information available to biological visual systems to estimates of properties of the BRDFs of surfaces without some restriction on possible illuminants and possible surfaces in scenes.
1 The values of the BRDF are probability densities, not probabilities. It is the probability that photons arriving along a narrow cone centred on the vector l will be re-emitted along a narrow cone centred on the vector ν (see Cohen and Wallace 1993).
There cannot be objective correlates of perceived surface colour (intrinsic surface colours) that a biological visual system can estimate under all possible choices of BRDFs, illuminants, and spatial layout of lights and surfaces in scenes.
The concept of environment
Yet, recent research indicates that, under certain circumstances, human observers do seem to estimate intrinsic surface colours accurately (Brainard et al. 1997; Brainard 1998). Other species are known to exhibit some degree of colour constancy (Neumeyer 1981; Werner 1990). We can reconcile the apparent impossibility of surface colour perception in ‘arbitrary’ environments with the evident competence exhibited by human observers and other animals in psychophysical experiments and much of everyday life by recognizing that we do not live in ‘arbitrary environments’. It is specific constraints present in our immediate surroundings that permit us to succeed in perceiving stable surface colours. These constraints can be thought of as a list of precise assertions concerning a visual scene. If all of the assertions on the list are true of the scene, then human colour vision will assign colours to surfaces in that scene that are the same as those it assigns to these surfaces in another scene that also satisfies these assertions.
2 Lightness, roughly speaking, refers to the light–dark dimension of surface colour perception (Gilchrist 1994). Throughout this chapter I will use the term ‘surface colour’ to refer to black, white, and grey stimuli as well as surfaces that are coloured in the everyday use of the term.
Judd (1940), for example, notes that with ‘moderate departures from
daylight in the spectral distribution of energy in the illuminant, external objects are seen . . . nearly in their natural, daylight colours’. In making this statement, Judd dichotomizes scenes into those satisfying the stated condition on the scene illumination and those that do not. In any near-daylight scene, he asserts, the human colour visual system assigns nearly the same colour to any surface patch as in any other. The converse of Judd’s assertion is probably not correct but it seems plausible that it can be extended to a description of the scenes where our colour visual system assigns intrinsic colours to surfaces. If we succeed, then we have established an operating range for the human colour visual system, which I will refer to as its environment, over which it is capable of assigning stable colours to surfaces. With no further specification of what that environment might be, this environmental hypothesis is neither falsifiable nor useful. Indeed, we run the risk of developing a deus ex machina that we trot out at the end of every experiment. We explain observed failures of colour constancy by asserting that the ‘environment was bad’, and we explain success by asserting that the ‘environment was good’. The environmental hypothesis has scientific content only to the extent that we can precisely state, in advance, what it is about an environment that permits or precludes accurate surface colour perception. For human vision, this environment does not include movie theatres or the conditions of many psychophysical experiments. It does include the conditions of much of our everyday colour experience, but, as we shall see, not all of it. In the past 20 years, a number of researchers have studied the link between environmental constraints, the mathematical possibility of accurate surface colour perception, and human performance in colour tasks in real and simulated environments. Four new areas of research have emerged. 
The first comprises development of algorithms that make use of explicit constraints in estimating properties of the BRDF (for reviews, see Hurlbert 1998; Maloney 1999). If the constraints corresponding to an algorithm are satisfied, then the algorithm can estimate intrinsic colours of surfaces. Once beyond its operating range, an algorithm will typically fail; the link between its colour estimates and the properties of surfaces is severed. In raising the environment hypothesis for human colour vision, we emphasize the analogy between human colour visual processing and these sorts of algorithms, each of which has its own specified operating range or environment. The second area of research involves study of the constraints present in actual physical environments (for a review, see Bonnardel and Maloney 2000). If we thought that we knew the operating range over which human surface colour perception could function, then we would certainly be interested in learning whether, and to what extent, that operating range resembled our everyday world. If we were uncertain what this operating range might be, it seems reasonable to look for clues in the structure of the environments we live in. The third concentrates on measured human performance in real or simulated environments, attempting to determine which environmental constraints affect human colour perception and over what ranges colour perception is stable (Brainard et al. 1997; Brainard 1998; Yang and Maloney 2001). As indicated above, I will use the term environment to refer to a collection of mathematical descriptions of constraints (Maloney 1999). This usage is unusual, but has much to
recommend it. The surface colour perception algorithms just mentioned each come with a paired environment in which the algorithm will function correctly. We are considering the hypothesis that human colour vision, confined to a specified environment, will assign colours to surfaces that are in correspondence with certain properties of surfaces that also remain to be specified. When necessary, I will contrast ideal environments (mathematical descriptions) with the real environments that they are intended to describe. By means of computer graphics it is now possible to embed human observers in simulated environments that correspond to ideal environments. That is, we can simulate a world that satisfies a specified collection of constraints, and record an observer’s colour judgements in that world, and thereby explore how specific constraints affect human colour vision. Maloney and Yang (Chapter 11 this volume) describe experiments using such stimuli. Figure 9.2 illustrates the inter-relations among environments and algorithms in modelling human colour perception. In the remainder of this chapter I will summarize what we know about environmental constraints, their relevance to surface colour perception, and the accuracy with which they describe the world around us.
Figure 9.2 Ideal environments. An ideal environment is a mathematical description of a collection of scenes. The environment associated with a computational vision algorithm specifies the range of possible scenes over which the algorithm can function correctly. We can also speak of the ideal environment of a biological visual system. If we simulate scenes that satisfy the ideal algorithm of a biological visual system, we expect that it will correctly estimate specific properties of the world, e.g. surface colours.
Flat World environments The first class of environments that we will consider is missing any information concerning the three-dimensional layout of surfaces in normal scenes. The observer, in effect, views a scene painted on a large, distant planar surface, or perhaps the inside of a large sphere centred on him or her. The scene is illuminated uniformly by a single light source (the illuminant). There is no inter-reflection (‘mutual illumination’) among surfaces nor any specularity or shadows. I’ll refer to environments that omit the three-dimensional structure of scenes as Flat World environments (Maloney 1999). Such environments are idealizations of typical experimental arrangements that have been used to measure human surface colour perception, e.g. in Mondrian displays (Land and McCann 1971). If human colour visual processing did not, in fact, make use of any information concerning the three-dimensional layout of the scene, then such environments could serve as accurate models of the world as ‘seen’ through human visual processing of colour.
Figure 9.3 Flat World environments. As its name suggests, Flat World is almost completely devoid of three-dimensional structure. The observer sees surfaces confined to a plane or very large sphere, illuminated by a single, distant light source. The precise specifications of ‘Flat World’ are given in the text.
Very recent work (Bloj et al. 1999) indicates that information concerning the three-dimensional layout of scenes does affect colour appearance, calling into question the adequacy of Flat World environments and algorithms as models of human surface colour perception. A second piece of evidence hinting that Flat World environments are not appropriate for human colour processing is that we do not yet have any Flat World algorithms that mimic human colour vision in such scenes (Hurlbert 1998; Maloney 1999). Of course, this might be due to a lack of imagination on our part. Yet it suggests that surface colour perception in Flat World is difficult. The Flat World environments are worth consideration not only because of their historical importance but also because they serve as an introduction to more sophisticated models of light and surface in the world around us.
Lights and surfaces
Let’s begin by specifying precisely the elements common to every Flat World environment: light from a single, distant, punctate light source (the ‘illuminant’) is absorbed by surfaces within a scene and re-emitted. $E(\lambda)$ will be used to denote the spectral power distribution of the incident illuminant at each wavelength $\lambda$ in the electromagnetic spectrum. It is not important to specify the location of the light source or its size and shape. Light from this single source is absorbed by surfaces and re-emitted. The re-emitted light that reaches the observer will be referred to as the colour signal. Its spectral power distribution is denoted $L(\lambda)$: we assume that
$$L(\lambda) = E(\lambda)S(\lambda), \tag{9.1}$$
where S(λ) denotes the surface spectral reflectance of the surface. There may be many surfaces in a scene, each with a distinct surface spectral reflectance function, but we will assume that the light incident at one point in the scene is exactly the same as the light at any other point. The colour signal contains the information available to the observer about illuminant and surface at each point in the scene. The Flat World constraints are illustrated in Fig. 9.3.
The colour signals reaching the observer (Fig. 9.1) are imaged on to the retina.3 We assign coordinates $xy$ to each point in the retina, and it is convenient to label the colour signal arriving at $xy$ by $L^{xy}(\lambda)$. We also denote the spectral reflectance function, $S(\lambda)$, of the surface patch imaged at point $xy$ in the retina by $S^{xy}(\lambda)$. We do not need to superscript the light arriving at the surface patch since we have assumed it is uniform across the scene. To repeat Equation 9.1 with retinal coordinates inserted:
$$L^{xy}(\lambda) = E(\lambda)S^{xy}(\lambda). \tag{9.2}$$
In environments more complex than Flat World, the function S(λ) for a surface patch depends on the viewing geometry: the location in three dimensions of the surface patch, the locations of other surfaces, the location of the observer, and that of the illuminant. We will return to this point below when we consider Shape World environments.

Photoreceptor classes

The retina is assumed to contain three distinct classes of photoreceptors, differing in their spectral sensitivities, denoted $R_k(\lambda)$, k = 1, 2, 3. These three spectral sensitivity functions are often denoted L, M, and S to reflect their differential sensitivity to long, medium, and short wavelength light. The initial visual information available to the colour system at a single retinal location is just the excitation of each of the three classes of receptor in response to incident light,

$$
\begin{aligned}
\rho_1^{xy} &= \int L^{xy}(\lambda)\,R_1(\lambda)\,d\lambda = \int E(\lambda)\,S^{xy}(\lambda)\,R_1(\lambda)\,d\lambda,\\
\rho_2^{xy} &= \int L^{xy}(\lambda)\,R_2(\lambda)\,d\lambda = \int E(\lambda)\,S^{xy}(\lambda)\,R_2(\lambda)\,d\lambda,\\
\rho_3^{xy} &= \int L^{xy}(\lambda)\,R_3(\lambda)\,d\lambda = \int E(\lambda)\,S^{xy}(\lambda)\,R_3(\lambda)\,d\lambda.
\end{aligned}
\tag{9.3}
$$
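Equations 9.2 and 9.3 can be sketched numerically by replacing each integral with a sum over sampled wavelengths. The sketch below is only illustrative: the 10 nm sampling grid, the bell-shaped receptor sensitivities in `RECEPTORS`, and the two toy illuminants are assumptions made for the example, not measured cone fundamentals or real lights.

```python
import math

# Assumed sampling grid: 400-700 nm in 10 nm steps.
WAVELENGTHS = list(range(400, 701, 10))
STEP = 10.0


def gaussian(peak, width):
    """Toy bell-shaped spectrum sampled on WAVELENGTHS (illustrative only)."""
    return [math.exp(-0.5 * ((lam - peak) / width) ** 2) for lam in WAVELENGTHS]


# Hypothetical receptor sensitivities R_k; real cone fundamentals differ.
RECEPTORS = [gaussian(560, 40), gaussian(530, 40), gaussian(430, 30)]


def colour_signal(illuminant, reflectance):
    """Equation 9.2: L(lambda) = E(lambda) * S(lambda) at each wavelength."""
    return [e * s for e, s in zip(illuminant, reflectance)]


def excitations(illuminant, reflectance):
    """Equation 9.3 as a Riemann sum over the sampled colour signal."""
    signal = colour_signal(illuminant, reflectance)
    return [STEP * sum(l * r for l, r in zip(signal, rk)) for rk in RECEPTORS]


flat_light = [1.0] * len(WAVELENGTHS)
reddish_light = [0.5 + (lam - 400) / 600 for lam in WAVELENGTHS]
surface = gaussian(600, 50)  # one fixed toy surface reflectance

rho_flat = excitations(flat_light, surface)
rho_red = excitations(reddish_light, surface)
```

The same surface yields different excitation vectors under the two lights, which is the dependence on both E(λ) and S(λ) that the text goes on to describe.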
The three numbers at each location xy form a vector $\rho^{xy} = (\rho_1^{xy}, \rho_2^{xy}, \rho_3^{xy})$. Evidently the entries of the vector $\rho^{xy}$ depend on both the spectral power distribution, E(λ), of the illuminant and on the surface spectral reflectance function, $S^{xy}(\lambda)$.

Intrinsic colours

In Flat World, we define the intrinsic colours of a surface with surface spectral reflectance S(λ) as a vector $(C_1[S(\lambda)], C_2[S(\lambda)], C_3[S(\lambda)])$, where the functions⁴ $C_i[\cdot]$ represent computations applied to the surface spectral reflectance function that return a single number. We are simply providing notation (Maloney 1984) for what was said before: intrinsic colours depend upon the surface and nothing else. A simple example of a set of intrinsic colours (Maloney 1984) is the set of photoreceptor excitations under a known reference illuminant, $E^0(\lambda)$:

$$
\begin{aligned}
C_1[S(\lambda)] &= \int E^0(\lambda)\,S(\lambda)\,R_1(\lambda)\,d\lambda,\\
C_2[S(\lambda)] &= \int E^0(\lambda)\,S(\lambda)\,R_2(\lambda)\,d\lambda,\\
C_3[S(\lambda)] &= \int E^0(\lambda)\,S(\lambda)\,R_3(\lambda)\,d\lambda.
\end{aligned}
\tag{9.4}
$$

³ We will only be concerned with monocular viewing conditions in Flat World.
⁴ Precisely, functionals whose arguments are themselves functions.
Since the reference illuminant and photoreceptor sensitivities are known, the intrinsic colours in Equation 9.4 are simply functions of the (variable) surface spectral reflectance. If we can somehow compute the photoreceptor excitations under a known reference illuminant (eqn 9.4) from the photoreceptor excitations under an arbitrary, unknown illuminant (eqn 9.3), then we have evidently ‘discounted the unknown illuminant’. Of course, there are many possible choices of intrinsic colours other than Equation 9.4, and we will meet an alternative set in the next section. What might be called the surface colour perception problem is encapsulated in Equation 9.3. The visual system has access to only the photoreceptor excitations (at all retinal locations) on the left-hand side of the equation and must somehow transform them into non-trivial⁵ intrinsic colours. Without further restrictions on the possible spectral power distributions and surface spectral reflectance functions that can appear in the integral on the right-hand side, there is simply no possibility whatsoever of succeeding, as noted above. That human observers do seem to succeed in some scenes indicates that surfaces, lights, or both are subject to constraints in these scenes, constraints that the human visual system makes use of. In the next section, we will examine one such set of constraints.

Linear models

Several authors (Yilmaz 1962; Sällström 1973; Brill 1978, 1979; Buchsbaum 1980; Maloney and Wandell 1986) have independently chosen to model spectral power distribution functions and surface spectral reflectance functions as finite-dimensional linear function spaces or, equivalently, generalized Fourier series (Apostol 1969, chapters 1–2; Strang 1988; Young 1988). Many other researchers have since made use of this representation.
Maloney and Wandell (1986) introduced the term linear model for this sort of representation and Wandell (1995) contains a clear introduction to the use of linear function spaces in vision. The basic idea of the representation is simple. I’ll begin with illuminant spectral power distributions. The model comprises illuminant spectral power distributions that are weighted mixtures of a fixed set of basis illuminants, $E_1(\lambda), E_2(\lambda), \ldots, E_M(\lambda)$. The mixture can be written as

$$E_\varepsilon(\lambda) = \sum_{i=1}^{M} \varepsilon_i\,E_i(\lambda). \tag{9.5}$$

We can think of the basis illuminants, $E_1(\lambda), E_2(\lambda), \ldots, E_M(\lambda)$, as known, fixed reference lights. The unknowns in Equation 9.5, then, are the weights $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_M)$ which determine the illuminant $E_\varepsilon(\lambda)$. An evident consequence of imposing this linear model constraint is that every ‘possible’ light can be specified by just specifying M numbers (the weights). Surface spectral reflectances are represented by a second linear model:

$$S_\sigma(\lambda) = \sum_{j=1}^{N} \sigma_j\,S_j(\lambda). \tag{9.6}$$

⁵ An example of trivial intrinsic colours: simply set the functions $C_i[\cdot]$ to all be the zero function. To be non-trivial, the choice of intrinsic colours must vary from surface to surface and also satisfy a simple independence condition if the observer is to be trichromatic; see Maloney (1984).
The basis surface reflectance functions $S_1(\lambda), S_2(\lambda), \ldots, S_N(\lambda)$ are also fixed and known. The weight vector $\sigma = (\sigma_1, \ldots, \sigma_N)$ contains N numbers that together determine the surface spectral reflectance. When N = 3 and the basis surface reflectance functions are linearly independent,⁶ the weights $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ are an example of a set of intrinsic colours. Given $S_\sigma(\lambda)$, each weight can be computed by using standard linear algebra methods (Strang 1988), allowing us to write the weights in the form of intrinsic colours: $\sigma_j = C_j[S_\sigma(\lambda)]$. The importance of the linear model approach only becomes apparent when we substitute Equations 9.5 and 9.6 with M = N = 3 into Equation 9.3. After simplifications (Maloney 1984, 1999), Equation 9.3 becomes a simple matrix equation:

$$\rho^{xy} = \Lambda_\varepsilon\,\sigma^{xy}, \tag{9.7}$$

or, rather, a set of simultaneous matrix equations, one for each location xy, that all share the same 3 × 3 matrix $\Lambda_\varepsilon$. This matrix is unknown but depends only on the unknown illuminant. The vectors $\rho^{xy}$ are, as before, the known excitations of the three photoreceptor classes across the retina, and the vectors $\sigma^{xy}$ are now the unknown intrinsic colours that we seek to recover. If we somehow could determine the illuminant (Maloney and Yang, Chapter 11 this volume), then we could solve Equation 9.7 for the $\sigma^{xy}$ at each xy by inverting the matrix $\Lambda_\varepsilon$:

$$\Lambda_\varepsilon^{-1}\,\rho^{xy} = \sigma^{xy}. \tag{9.8}$$

Hurlbert (1998) and Maloney (1999) provide detailed reviews of attempts to solve Equation 9.7 by adding additional constraints to the linear model constraints that allow estimation of the unknown illuminant; D’Zmura and Iverson (D’Zmura and Iverson 1993a, b, 1994; Iverson and D’Zmura 1995a, b) have characterized the solution conditions for Equation 9.7 and certain generalizations of Equation 9.7. In the following section, I describe evaluations of linear model descriptions of collections of real lights and surfaces. It is evident that any solution of Equation 9.7 is of relevance to computer vision only when the real environment in an application is captured by an ideal environment based on Equations 9.5 and 9.6 and the other Flat World assumptions. Such models are relevant to human surface colour perception only if we can find suitable choices of basis functions (subspaces) in Equations 9.5 and 9.6 over which human colour vision exhibits near-perfect colour constancy: a reasonable place to begin the search is by evaluating fits to various parts of the world around us, as described in the next section.

⁶ The three basis surface reflectances are linearly independent if no one of them can be mixed from the other two.
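The step from Equation 9.3 to the matrix form of Equation 9.7 can be checked numerically. Substituting the two linear models into Equation 9.3 and exchanging sum and integral gives a 3 × 3 matrix whose entry in row k and column j is the integral of the illuminant times the j-th surface basis function times the k-th receptor sensitivity. The sketch below assumes the illuminant weights are already known, which is exactly the hard part the text defers. All basis functions and sensitivities are hypothetical bell curves chosen only so that the matrix is invertible.

```python
import math

WAVELENGTHS = list(range(400, 701, 10))  # assumed 10 nm sampling
STEP = 10.0


def bump(peak, width):
    """Toy bell-shaped spectrum (illustrative only)."""
    return [math.exp(-0.5 * ((lam - peak) / width) ** 2) for lam in WAVELENGTHS]


# Hypothetical stand-ins for the basis functions and sensitivities.
E_BASIS = [[1.0] * len(WAVELENGTHS), bump(450, 80), bump(650, 80)]  # E_i
S_BASIS = [bump(450, 60), bump(550, 60), bump(650, 60)]             # S_j
RECEPTORS = [bump(560, 40), bump(530, 40), bump(430, 30)]           # R_k


def mix(weights, basis):
    """Weighted mixture of basis functions (Equations 9.5 and 9.6)."""
    return [sum(w * b[i] for w, b in zip(weights, basis))
            for i in range(len(WAVELENGTHS))]


def light_matrix(eps):
    """Matrix with entry [k][j] = sum over lambda of E_eps * S_j * R_k."""
    e = mix(eps, E_BASIS)
    return [[STEP * sum(ei * sj * rk for ei, sj, rk in zip(e, s, r))
             for s in S_BASIS] for r in RECEPTORS]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def solve(m, rhs):
    """Solve m x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(m, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


eps = [1.0, 0.3, 0.6]          # illuminant weights, assumed known here
sigma_true = [0.2, 0.5, 0.3]   # intrinsic colours of one surface patch

rho = matvec(light_matrix(eps), sigma_true)       # Equation 9.7
sigma_recovered = solve(light_matrix(eps), rho)   # Equation 9.8
```

Solving the linear system directly, rather than forming the inverse matrix explicitly, is a numerical convenience and gives the same result as Equation 9.8 whenever the matrix is invertible.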
While it is evidently of value to know which choice of basis functions optimizes human performance in surface colour perception, there has been no research on this issue. The following section is more technical than the remainder of the article and the reader may skip to the subsection titled ‘The human environment’ without losing the thread of the argument.

Evaluating linear models of surfaces and lights

Researchers have considered two different routes for evaluating the fit of linear models to surfaces and illuminants. The first, and most common, is to fit a linear model explicitly to a collection of measured surface reflectance functions or measured spectral power distributions. I will describe how this is done later in this section, and report results for previous fits. This empirical approach is often vulnerable to the criticism that any conclusions drawn cannot be generalized beyond the collection of surfaces analysed, a point we will consider in some detail below. The second approach is theoretical: if the surface spectral reflectance functions of any class of surfaces or the spectral power distributions of illuminants are constrained, the constraint can, in principle, be derived from consideration of the physics and chemistry of light and surface. There are well-developed models of the variations observed in terrestrial daylight with changing atmospheric conditions (Henderson 1977), and the surface spectral reflectance functions of non-diffractive biological colourants (non-diffractive biochrome surfaces) may share a common low-pass constraint that can be derived from consideration of the quantum chemistry of disordered solids and liquids (Maloney 1986; van Hateren 1993; Bonnardel and Maloney 2000).

Surfaces

I next describe how to fit an optimal least-squares linear model to a set of 1413 empirical surface reflectances described in Chittka et al. (1994).
They measured the surface spectral reflectance functions of flowers across the span of several months at high spectral resolution. There are no published reports of fits of linear models to this large empirical sample of surface spectral reflectance functions, although Bonnardel and Maloney (2000) studied it in the Fourier domain. Suppose that we have a set of empirically measured surface spectral reflectances,⁷ $S^{\nu}(\lambda)$, ν = 1, 2, . . . , V. For any fixed value of N, and any choice of basis functions $S_1(\lambda), S_2(\lambda), \ldots, S_N(\lambda)$, we can compute, by linear regression (Maloney 1986), the weights $\sigma = (\sigma_1, \ldots, \sigma_N)$ in Equation 9.6 that minimize the least-square error,

$$\mathcal{E}_{\nu} = \int \left[S^{\nu}(\lambda) - S_\sigma(\lambda)\right]^2 d\lambda. \tag{9.9}$$

⁷ I will treat a sampled surface reflectance function, sampled at wavelengths $\lambda_i$, as a step function that is constant between $\lambda_i$ and $\lambda_{i+1}$, and has as its constant value the sampled value at $\lambda_i$. These step functions have values defined at all wavelengths λ and their use allows me to use the same notation and conventions for empirical and theoretical surface spectral reflectance functions.
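For a fixed set of basis functions, the weights that minimize Equation 9.9 are an ordinary least-squares solution. The sketch below works on equally spaced samples, so the integral becomes a sum up to a constant factor, and solves the normal equations directly. The polynomial basis and the two target curves are assumptions made for illustration, not fitted reflectance bases.

```python
import math


def solve(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def fit_weights(target, basis):
    """Weights minimising the summed squared error (normal equations)."""
    gram = [[dot(bi, bj) for bj in basis] for bi in basis]
    proj = [dot(bi, target) for bi in basis]
    return solve(gram, proj)


def approximation(weights, basis):
    """Equation 9.6 evaluated at every sample point."""
    return [sum(w * b[i] for w, b in zip(weights, basis))
            for i in range(len(basis[0]))]


def squared_error(target, basis):
    """Discrete counterpart of Equation 9.9 at the best-fitting weights."""
    approx = approximation(fit_weights(target, basis), basis)
    return sum((t - a) ** 2 for t, a in zip(target, approx))


# Hypothetical data: 31 samples on a normalised wavelength axis.
X = [i / 30 for i in range(31)]
BASIS = [[1.0] * 31, X, [x * x for x in X]]

in_span = [0.2 + 0.5 * x - 0.1 * x * x for x in X]   # lies in the model
wavy = [0.5 + 0.4 * math.sin(6 * x) for x in X]      # does not

weights_exact = fit_weights(in_span, BASIS)
residual = [t - a for t, a in
            zip(wavy, approximation(fit_weights(wavy, BASIS), BASIS))]
```

A curve that lies inside the model is recovered with zero error, while the residual of any other curve is orthogonal to every basis function, which is the defining property of the least-squares fit.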
We are, in effect, measuring how well this choice of basis functions $S_1(\lambda), S_2(\lambda), \ldots, S_N(\lambda)$ can fit each particular empirical reflectance function $S^{\nu}(\lambda)$. We wish to choose the collection of basis functions $S_1(\lambda), S_2(\lambda), \ldots, S_N(\lambda)$ that allow us to minimize the overall error,

$$\mathcal{E} = \mathcal{E}_1 + \cdots + \mathcal{E}_V. \tag{9.10}$$

This turns out to be a simple computation using standard numerical analysis methods (see Mardia et al. 1979) and a standard measure of goodness of fit for such methods is variance accounted for (VAF). Of course there are other criteria for goodness-of-fit and other fitting methods, a point we will return to further on. To date, VAF is overwhelmingly the measure used by researchers who fit collections of lights and surfaces by means of linear models. Unfortunately, many authors fit empirical data to linear models (eqn 9.6) using a suboptimal fitting procedure. The first basis function is assumed to be the arithmetic mean⁸ of the empirical sample

$$\bar{S}(\lambda) = \frac{1}{V}\sum_{\nu=1}^{V} S^{\nu}(\lambda) \tag{9.11}$$

and the remaining basis functions are then chosen to minimize Equation 9.10. Equation 9.6 can be rewritten in this case as

$$S_\sigma(\lambda) = \sigma_1\,\bar{S}(\lambda) + \sum_{j=2}^{N} \sigma_j\,S_j(\lambda). \tag{9.12}$$
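The cost of fixing the first basis function to the sample mean can be seen in the simplest case, a model with a single basis function. The sketch below compares the mean of Equation 9.11 with the single direction that minimizes the total squared error, found by power iteration. It assumes a toy sample of five four-point reflectances and defines VAF as the share of the total squared reflectance that the fit captures. Both are assumptions for illustration, and the published procedures (principal components analysis on normalized data) differ in detail.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_function(samples):
    """Equation 9.11: pointwise arithmetic mean of the sample."""
    return [sum(column) / len(samples) for column in zip(*samples)]


def leading_direction(samples, iterations=500):
    """Power iteration for the best single basis function (unconstrained)."""
    v = mean_function(samples)
    for _ in range(iterations):
        w = [0.0] * len(v)
        for s in samples:
            c = dot(s, v)
            w = [wi + c * si for wi, si in zip(w, s)]
        norm = math.sqrt(dot(w, w))
        v = [wi / norm for wi in w]
    return v


def vaf(samples, basis):
    """Share of total squared reflectance captured by one basis function."""
    total = sum(dot(s, s) for s in samples)
    explained = sum(dot(s, basis) ** 2 / dot(basis, basis) for s in samples)
    return explained / total


# Hypothetical reflectances sampled at four wavelengths.
SAMPLES = [
    [0.10, 0.20, 0.60, 0.80],
    [0.70, 0.50, 0.20, 0.10],
    [0.30, 0.30, 0.30, 0.30],
    [0.05, 0.10, 0.10, 0.90],
    [0.90, 0.80, 0.10, 0.05],
]

vaf_mean = vaf(SAMPLES, mean_function(SAMPLES))      # constrained choice
vaf_best = vaf(SAMPLES, leading_direction(SAMPLES))  # unconstrained choice
```

In this toy sample the unconstrained direction accounts for slightly more of the total than the mean does, and it can never account for less.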
The unconstrained fit (when the first basis function need not be the mean of the empirical sample) cannot result in a lower VAF than the constrained fit, and the resulting VAF will typically be higher. Consequently, the reports of VAF when the constrained fitting procedure is used will typically understate the true VAF that can be achieved by a linear model with any fixed number of basis functions. Figure 9.4 contains plots of VAF versus the number of basis elements N for the 1413 surface reflectance functions in the Chittka et al. sample. For comparison, results for the 170 surfaces measured by Vrhel et al. (1994) and the 337 Krinov surfaces (Maloney 1986) are plotted.⁹ The results confirm the conclusions drawn in Maloney (1986, p. 1674): ‘. . . the number of parameters required to model . . . spectral reflectances is five to seven, not three’.

Several researchers have previously fit linear models to empirical collections of surface reflectance functions. Cohen (1964) analysed a subset of the Nickerson–Munsell colour reference surfaces (Kelley et al. 1943) and concluded that a linear model with as few as three or four free parameters could provide accurate approximations to the measured surface spectral reflectances (SSRs). Maloney (1986) analysed the full set of 462 Nickerson–Munsell chips and confirmed Cohen’s conclusions. As just noted, Maloney also analysed a collection of SSRs of 337 samples of natural formations collected by Krinov (1947/1953) and found that a linear model with as few as 5–7 free parameters could provide an essentially perfect fit to both sets of reflectance spectra taken together (Fig. 9.4). Parkkinen et al. (1989) measured the surface spectral reflectance functions of the current Munsell collection (1257 chips) with a finer sampling interval in the electromagnetic spectrum (5 nm instead of 10 nm as in the studies just described). They concluded that 8–12 basis functions were needed to achieve accurate approximations for all spectra. Finally, Vrhel et al. (1994) measured 170 surface reflectance functions of natural and artificial objects (with 2 nm sampling intervals) and concluded that roughly 7–8 free parameters were needed to reproduce their data (see Fig. 9.4). In all cases, the measure of goodness of fit increased rapidly with the number of basis elements: linear models with as few as three parameters provide fits to surface reflectance data with a residual least-square error less than 2% of the total variance. These studies suggest that a linear model with as few as three parameters could be used to approximate accurately a remarkably wide range of SSRs; no more than 10–12 parameters are needed to effectively reproduce each of the surface collections above considered in isolation.

[Figure 9.4: plot of proportion of variance accounted for (vertical axis, 0.95 to 1.00) against number of basis elements (horizontal axis, 2 to 9) for the 337 Krinov surfaces, the 170 Vrhel surfaces, and the 1413 Chittka surfaces.]

Figure 9.4 Proportion of variance accounted for by a linear model with 2–9 parameters. The dimensionality of the linear model (the number of basis reflectance functions) is plotted on the horizontal axis. The vertical axis is the variance accounted for (VAF) of the optimal linear model for a collection of surface reflectance functions with the specified number of basis elements. Results for three collections of surface data are shown. The Krinov data are taken from Krinov (1947/1953), the Vrhel data set is the data set described in Vrhel et al. (1994), and the Chittka data set is the data set described in Chittka et al. (1994). The fits reported for all three data sets were computed as described in the text.

⁸ See Maloney (1999) for further discussion. The authors who use the constrained fitting methods employ principal components analysis (Mardia et al. 1979) on normalized data, a plausible but suboptimal method. The resulting basis functions need not be orthogonal nor even linearly independent.
⁹ Vrhel et al. (1994) used the suboptimal procedure just discussed. The results reported in Fig. 9.4 were computed using the optimal procedure.

Illuminants

Several authors have investigated how well linear models capture collections of daylight spectral power distributions (Judd et al. 1964; Das and Sastri 1965; Sastri and Das 1966, 1968; Dixon 1978; Romero et al. 1997, 1998). All of the studies (except for those of Romero
et al.) used the suboptimal fitting procedure of Equation 9.12. The studies are in agreement that linear models with as few as 3–4 parameters account for most of the observed variation in measured spectral power distributions of daylight. The data sets corresponding to all but the two most recent studies are no longer available for re-analysis. Romero et al. (1997) sampled the spectral power distributions of daylight from 400 nm to 700 nm over a period of 4 days in Granada, Spain. They performed a principal components analysis on the resulting 99 spectral power distributions. They found that approximations to the measured spectral power distributions using three basis elements accounted for 0.9997 of the variance. The number of samples collected was not large, the time period over which they were collected was short, and it is not clear how representative the climate of Granada is of climates in other regions of the world. Nevertheless, the fit is remarkable, and their results, together with the results of earlier research, indicate that low-dimensional linear models provide very good approximations to daylight spectral power distributions.

Issues in fitting empirical data

More sophisticated methods permit simultaneous choice of optimal linear models for any sets of empirical illuminants and surfaces (Marimont and Wandell 1992). The results of this section suggest that surface reflectance functions and illuminants in ‘natural environments’ are constrained. This idea is not new. Several authors (Stiles et al. 1977; Lythgoe 1979; MacAdam 1981) have expressed the opinion that empirical surface reflectances are smooth, constrained curves. Land (1959/1961) asserts that ‘Pigments in our world have broad reflection characteristics’. Still, there are many open questions concerning the nature and importance of the constraints on ‘natural’ surfaces and light and how they might best be modelled. In the remainder of this section I raise some of them.
Non-linear models

Only linear models are considered here as candidate representations for natural surfaces and reflectances. It is certainly possible that a non-linear model with N parameters, such as

$$S_\sigma(\lambda) = \sum_{j=1}^{N} S_j(\lambda)^{\sigma_j} \tag{9.13}$$
might provide better approximations to empirical surface reflectance functions than any linear model with N parameters. There has been no systematic attempt to find non-linear models that provide better fits than linear models. Many of the algorithms below could be readily altered to take advantage of a non-linear constraint such as that embodied in Equation 9.13.

Loss functions

The reader may question the use of the least-square criterion in fitting models to data. An advantage of the least-square error measure is that it is independent of the properties of any particular visual system. Any conclusions drawn are statements about the empirically measured surface spectral reflectances themselves, and, in attempting to understand the physical bases for the empirically observed constraints on surfaces, this is desirable (Maloney 1986). Also, it is important to recognize that many of the better known error measures (‘metrics’) such as the Minkowski p-metrics,

$$\mathcal{E}_{\nu}^{p} = \int \left|S^{\nu}(\lambda) - S_\sigma(\lambda)\right|^{p} d\lambda, \tag{9.14}$$

with 1 ≤ p < ∞ are consistent with the least-square norm (p = 2) in the following sense. If a sequence of functions converges under the Minkowski p-metric for p not equal to 2, then it converges when p = 2 and vice versa. Intuitively speaking, ‘close’ in any of these norms means the same thing. Not all norms are consistent with the least-square norm. The Kolmogorov norm

$$\mathcal{E}_{\nu}^{\infty} = \max_{\lambda} \left|S^{\nu}(\lambda) - S_\sigma(\lambda)\right| \tag{9.15}$$

is not consistent with the least-square norm: two functions can disagree by an amount w at a single wavelength and be otherwise identical, and these will always be w apart according to the Kolmogorov norm. But by Equation 9.14, they will have a difference of 0, since a finite number of point discontinuities do not affect the integral. But, for precisely that reason, the Kolmogorov norm is not of use in evaluating human visual response. A different approach (Dannemiller 1992) is to measure the ability of an ideal observer to discriminate approximate from exact surface functions. It is important to consider both the nature of the physical constraint on surfaces (independent of any visual system) and also the impact of the constraint on visual performance for particular visual systems.

Theoretical approaches

It is not clear what constitutes a ‘natural environment’ for human vision or how to sample it, what surfaces should be included, and what weight each should be given. Only one of the data sets of surface reflectances discussed above [Cohen (1964) analysed a sample of the full Nickerson–Munsell data] could be considered as random samples from a larger, identifiable population to which we might generalize. When we consider other biological visual systems, the problem is scarcely less difficult. It is, therefore, desirable to consider why at least some classes of physical surface exhibit physical constraints, and what these constraints might be.
If we understood the theoretical bases for these constraints, we need not wonder whether they might vanish with the next collection of empirical surface reflectance functions. Stiles et al. (1977) and, later, Buchsbaum and Gottschalk (1984) suggested that surface spectral reflectance functions are approximately low-pass. Maloney (1984, 1986) tested this ‘low-pass hypothesis’ for the Krinov data and concluded that the Krinov reflectances contained little spectral energy above a band-limit corresponding to three samples. He suggested specific physical processes responsible for this observed ‘low-pass’ constraint for organic colourants. Further discussion of the low-pass hypothesis may be found in Bonnardel and Maloney (2000). Maloney (1986) attempts to link surface spectral reflectance functions of disordered liquids and solids and the observed low-pass constraint on these same surfaces.
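One rough way to probe a low-pass claim of this kind is to take the discrete Fourier transform of a sampled reflectance along the wavelength axis and ask how much of its non-constant energy sits at the lowest frequencies. The sketch below is an assumption-laden illustration rather than the analysis used in the cited studies: it treats the samples as one period of a periodic signal, ignores spectral leakage, and uses two synthetic curves instead of measured reflectances. The cutoff of three frequency indices is likewise arbitrary.

```python
import cmath
import math


def power_spectrum(samples):
    """Squared DFT magnitudes at frequency indices 0 .. n // 2."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def low_frequency_fraction(samples, cutoff):
    """Share of the non-constant energy at frequency indices 1 .. cutoff."""
    n = len(samples)
    spectrum = power_spectrum(samples)
    # Interior frequencies appear twice in the full DFT of a real signal.
    weights = [1.0 if 2 * k == n else 2.0 for k in range(len(spectrum))]
    energy = [w * p for w, p in zip(weights, spectrum)]
    total = sum(energy[1:])  # skip k = 0, the mean level
    return sum(energy[1:cutoff + 1]) / total


n = 32
# Synthetic curves: one slowly varying, one alternating between two values.
smooth = [0.5 + 0.3 * math.sin(2 * math.pi * i / n) for i in range(n)]
jagged = [0.2 if i % 2 == 0 else 0.8 for i in range(n)]

smooth_fraction = low_frequency_fraction(smooth, cutoff=3)
jagged_fraction = low_frequency_fraction(jagged, cutoff=3)
```

The slowly varying curve keeps essentially all of its energy below the cutoff and the alternating curve keeps essentially none, which is the kind of contrast a low-pass constraint on reflectances would predict for natural versus arbitrary curves.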
The human environment

The key advantage of the linear models approach is that it permits us to describe surfaces and illuminants by specifying only a small number of parameters. If the number of parameters needed to describe a surface (eqn 9.6) exceeds the number of classes of photoreceptors in a visual system, then Equation 9.7 cannot, in general, be solved. As we have seen, various collections of empirically measured surface reflectance functions require more than three basis elements to represent them within a linear model: the world of surfaces is not three-dimensional, though it can be well approximated by a three-parameter linear model. The primary concern of this chapter, though, is the operating range corresponding to human surface colour perception. Is there a non-trivial choice of a three-dimensional linear model for surface reflectances and non-trivial choice of linear model for illuminants¹⁰ for which, when the other Flat World assumptions are satisfied, human colour vision provides stable estimates of surface colour despite changes of illuminant? This environment, if it exists, would not correspond to the physical world we live in, but would provide good approximations for many, but not all, of the surfaces and illuminants found in it. When the scene immediately around us conformed to the ‘human environment’, we would experience stable perception of surface properties, encoded as colours. When it did not, we would perceive marked failures of colour constancy with changes in the illuminant or other aspects of a scene. The issue, then, is whether there is a slight idealization of the physical environment for which human colour vision is perfectly colour constant. We do not know whether there is such an environment to be found within Flat World. If there is, though, it would clear up some of the difficulties we encountered in discussing surface colours in the environment.
Flat World environments lack much of the structure we encounter in normal environments. In the next section, we consider a class of environments that represent explicitly the three-dimensional layout of objects and light sources in scenes.
Shape World environments

Figure 9.5 indicates some of the additional structure introduced into the environment in Shape World: shading, specularity, mutual illumination, etc. The shape-from-shading literature is a source of models for illuminant–surface interaction in three-dimensional scenes (see Horn and Brooks 1989) and the descriptive language for Shape World is drawn largely from computer graphics (Cohen and Wallace 1993). Note that we can always reduce any Shape World environment to a Flat World environment by simply ignoring the three-dimensional structure of the scene. Explicitly three-dimensional structures, such as shadows or specularity, could then be reinterpreted as a change in surface region. For example, a specular highlight can be modelled as a white surface patch.

¹⁰ Which must be either two- or three-dimensional (Maloney 1984). The term ‘non-trivial’ is used here to guarantee that surfaces corresponding to a three-dimensional colour gamut are included.

[Figure 9.5: diagram labelled with the illuminant, the surface reflectances, and the photoreceptor excitations (k = 1, 2, 3).]

Figure 9.5 Shape World environments. Shape World environments explicitly represent the locations and properties of surfaces and light sources in three dimensions, not unlike the specification of inputs to a sophisticated computer graphics rendering package (e.g. Larson and Shakespeare 1997).
Bi-directional reflectance density functions

The surface spectral reflectance function of Flat World, $S^{xy}(\lambda)$, is replaced by the bi-directional reflectance density function (BRDF) introduced in Fig. 9.1, denoted $S^{xy}(\lambda, \nu^{xy}, l^{xy})$ at location xy in the scene.¹¹ The vector $\nu^{xy}$ is a unit vector from the surface at xy in the direction of the visual system and $l^{xy}$ is the unit vector from the surface toward a punctate light source. The possible interactions between light and surface are complex (Weisskopf 1968; Nassau 1983; Cohen and Wallace 1993; Larson and Shakespeare 1997) and little understood. Research in this area is often carried out for the purposes of developing better models for computer graphics applications, such as scene rendering. Cohen and Wallace (1993) is still a good introduction to such models. Lee et al. (1990) report measurements of BRDFs of a small number of surfaces, which we will return to below. I will organize this section by considering a nested list of constraints on possible BRDFs. Before we begin, though, we must stop for a moment to decide what we mean by an intrinsic colour, now that we are free to move in three dimensions with respect to a surface. The BRDF S(λ, ν, l) can be interpreted as a spectral reflectance function with corresponding intrinsic colours $\{C_1[S(\lambda, \nu, l)], C_2[S(\lambda, \nu, l)], C_3[S(\lambda, \nu, l)]\}$. A surface can evidently have arbitrarily different intrinsic colours, as we vary the directions $\nu^{xy}$ and $l^{xy}$ to the eye and to the light source, respectively; and there are, of course, surfaces that do change colour appearance dramatically with viewing angle and position of the light source (e.g. biological colourants based on diffraction).
Much of the modelling of the BRDF, however, has assumed that there is, for each surface, a function S(λ), that captures the spectral properties of the surface. This would be the case if the BRDF could be written in the form,

$$S^{xy}(\lambda, \nu^{xy}, l^{xy}) = S^{xy}(\lambda)\,G(\nu^{xy}, l^{xy}). \tag{9.16}$$

¹¹ If we consider Shape World environments including transparent or translucent surfaces, then our retinal coordinate system is no longer sufficient and we should develop an adequate scene-based coordinate system for specifying the location of specific points on surfaces. However, the added notational complexity would add little to the discussion here. We will continue to use retinal coordinates xy as a stand-in for a better choice.
I’ll refer to BRDFs that can be written in this form as spectrogeometrically separable. Intuitively, changes in the orientation of the surface, the position of the eye, and the position of the light source serve only to scale the surface reflectance function. The Lambertian BRDF model and other models commonly employed in rendering applications (Cohen and Wallace 1993) are separable in this sense, as is the recent model of Oren and Nayar (Nayar and Oren 1995; Oren and Nayar 1995), which attempts to model the effects of local roughness on the BRDF. Radiosity methods in computer graphics are readily adapted to render environments where BRDFs are spectrogeometrically separable (Larson and Shakespeare 1997) and it is therefore possible to study human colour vision in such environments using rendered scenes. It will cause no confusion to use the same notation for the Flat World spectral reflectance function and the spectral component of a BRDF in Shape World, and there is an evident and immediate application of Equation 9.6 to Equation 9.16 that would allow us to replace S(λ) by a linear model approximation, reducing the spectral information to a small number of parameters: $S_\sigma^{xy}(\lambda, \nu^{xy}, l^{xy}) = S_\sigma^{xy}(\lambda)\,G(\nu^{xy}, l^{xy})$.

Diffuse-specular superposition

Surfaces often do not satisfy spectrogeometric separability. Shafer (Shafer 1985; Klinker et al. 1988) suggested that many surface BRDFs [corresponding to dielectric (non-conducting) surfaces such as plastics] may be represented as the sum of two surface BRDFs, each of which satisfies geometric-reflectance separability:

$$S^{xy}(\lambda, \nu^{xy}, l^{xy}) = S^{xy}(\lambda)\,G(\nu^{xy}, l^{xy}) + G'(\nu^{xy}, l^{xy}). \tag{9.17}$$
The first term in the summation is termed the diffuse component, the second, the specular component. The constraint on surfaces embodied in Equation 9.17 will be referred to as the diffuse-specular superposition property. The Neutral Interface Model of Lee et al. (1990) embodies this property. Radiosity methods in computer graphics can readily render such scenes (and indeed any scene where BRDFs can be represented as the sum of a small number of classes of spectrogeometrically separable BRDFs). Again, an application of Equation 9.6 to Equation 9.17 allows us to compress the spectral information to a small number of parameters. Lee et al. (1990) tested whether surfaces satisfied the diffuse-specular superposition property. They measured the spectral reflectance functions of nine surface materials for different viewing geometries. They found that it was satisfied for some of them (including ‘yellow plastic cylinder’, ‘green leaf ’, and ‘orange peel’) but not all (e.g. ‘blue paper’, ‘maroon bowl’). Tominaga and Wandell (1989, 1990) also report empirical tests of the property.
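The difference between Equations 9.16 and 9.17 can be made concrete with a small sketch. It assumes a Lambertian cosine term for G and a Phong-style lobe for G′. The text names the Lambertian model as separable, but the Phong lobe, the explicit surface normal argument, the `shininess` exponent, and the `k_spec` scale are all illustrative choices that do not come from the chapter.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def lambert_g(normal, to_light):
    """Geometric factor G for a Lambertian surface (cosine of incidence)."""
    return max(0.0, dot(normal, to_light))


def phong_g(normal, to_viewer, to_light, shininess=20):
    """Hypothetical specular factor G': a Phong-style lobe about the mirror direction."""
    d = dot(normal, to_light)
    if d <= 0.0:
        return 0.0
    mirror = [2.0 * d * n - l for n, l in zip(normal, to_light)]
    return max(0.0, dot(mirror, to_viewer)) ** shininess


def separable(reflectance, normal, to_viewer, to_light):
    """Equation 9.16: geometry only rescales S(lambda); the viewer is unused here."""
    g = lambert_g(normal, to_light)
    return [s * g for s in reflectance]


def diffuse_plus_specular(reflectance, normal, to_viewer, to_light, k_spec=0.3):
    """Equation 9.17: diffuse term plus a spectrally flat specular term."""
    g = lambert_g(normal, to_light)
    g_prime = k_spec * phong_g(normal, to_viewer, to_light)
    return [s * g + g_prime for s in reflectance]


REFLECTANCE = [0.2, 0.8, 0.4]      # S(lambda) at three sample wavelengths
NORMAL = [0.0, 0.0, 1.0]
LIGHT = unit([1.0, 0.0, 1.0])
VIEW_MIRROR = unit([-1.0, 0.0, 1.0])  # looking along the mirror direction
VIEW_OFF = unit([1.0, 0.0, 1.0])      # far from the highlight

matte = separable(REFLECTANCE, NORMAL, VIEW_MIRROR, LIGHT)
in_highlight = diffuse_plus_specular(REFLECTANCE, NORMAL, VIEW_MIRROR, LIGHT)
off_highlight = diffuse_plus_specular(REFLECTANCE, NORMAL, VIEW_OFF, LIGHT)
```

Under separability the ratios between wavelengths never change with geometry, so a single set of intrinsic colours describes the surface up to a scale factor. With the added specular term the ratios drift toward those of a spectrally flat surface inside the highlight, which is why a highlight reinterpreted in Flat World looks like a whiter patch.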
There is limited empirical work evaluating the properties of BRDFs or the fit of any of the proposed models to data. Dana et al. (1999) have collected 60 empirical BRDFs sampled at many combinations of ν xy and l xy which are available to other researchers.
Conclusion

We do not know if there is a specific non-trivial environment for human surface colour perception. If there were, and we knew the list of assumptions that it comprised, then, sometime in the near future, we could simulate a virtual world in which a human observer would see changes in surface colours only when the material properties of surfaces changed. Changes in illumination, redistribution of other surfaces in the scene, and so on, would not alter the link between perceived colour and the intrinsic colours of surfaces. The work reviewed in this chapter hints that such an environment must be different from the world we live in, but need not be very different. If we were to take a stroll in this virtual world, we might have difficulty telling it from the real one. We might overlook a lack of alexandrite (Nassau 1983) and only gradually come to realize that diffraction processes are excluded: oil slicks and peacocks are less glorious than they are in reality. In that world, though, perceived colour would simply be the encoding of specific surface properties. We could devise a machine to predict human colour perception. There is an impressive array of philosophical work arguing that colour is qualitatively different from, say, shape, and not a property of things in the world in the same way as, say, shape (Hardin 1982; Thompson 1995; Byrne and Hilbert 1997; Stroud 2000). If we cannot identify the hypothetical measurable properties of the BRDF that are objective correlates of colour appearance, the intrinsic colours of surfaces (Shepard 1992), then evidently it is not. The philosophers are correct. Colour is not a property of surfaces, but of scenes. Yet, in a different, idealized environment, very close to our own, but with a somewhat simplified physics and chemistry, our colour vision might be able to perceive stable surface colours accurately and reliably.
If we can determine exactly what that environment is, we will certainly have a better understanding of human colour vision and its limitations, and we will also be in a better position to explain Robert Boyle’s discomfort in ascribing colours to surfaces. And, if there are philosophers in this idealized world, they would be wrong to consider colour as anything other than a perceptual correlate of an objective surface property, an intrinsic colour.
Acknowledgements
The initial quote is taken from Nicholas Wade’s wonderful book (Wade 1998, p. 123). Grant EY08266 from the National Institutes of Health, National Eye Institute provided partial support for much of the work described here. I thank Lars Chittka for access to the surface reflectance data of Chittka et al. (1994). Several people were kind enough to read this chapter in earlier drafts and comment on it. I especially want to thank Michael Landy for detailed comments and criticisms. Last of all, I am grateful to the Computer Science Institute at Hebrew University, Jerusalem for support as Forchheimer Professor while writing this chapter.
References Apostol, T. M. (1969). Calculus, Vol. II, (2nd edn). Xerox, Waltham, Massachusetts. Bloj, M. G., Kersten, D., and Hurlbert, A. C. (1999). Perception of three dimensional shape influences colour perception through mutual illumination. Nature 402, 877–879. Bonnardel, V. and Maloney, L. T. (2000). Daylight, biochrome surfaces, and human chromatic response in the Fourier domain. Journal of the Optical Society of America A 17, 677–687. Brill, M. H. (1978). A device performing illuminant-invariant assessment of chromatic relations. Journal of Theoretical Biology 71, 473. Brill, M. H. (1979). Further features of the illuminant-invariant trichromatic photosensor. Journal of Theoretical Biology 78, 305. Brainard, D. H. (1998). Color constancy in the nearly natural image. 2. Achromatic loci. Journal of the Optical Society of America A 15, 307–325. Brainard, D. H., Brunt, W. A., and Speigle, J. M. (1997). Color constancy in the nearly natural image. 1. Asymmetric matches. Journal of the Optical Society of America A 14, 2091–2110. Brown, R. and MacLeod, D. I. A. (1997). Color appearance depends on the variance of surround colors. Current Biology 7, 844–849. Buchsbaum, G. (1980). A spatial processor model for object color perception. Journal of the Franklin Institute 310, 1–26. Buchsbaum, G. and Gottschalk, A. (1984). Chromaticity coordinates of frequency-limited functions. Journal of the Optical Society of America 1, 885–887. Byrne, A. and Hilbert, D. R. (1997). Readings on color; Vol. 1: The philosophy of color. MIT Press, Cambridge, MA. Chittka, L., Shmida, A., Troje, N., and Menzel, R. (1994). Ultraviolets as a component of flower reflections, and the color perception of hymenoptera. Vision Research 34, 1489–1508. Cohen, J. (1964). Dependency of the spectral reflectance curves of the Munsell color chips. Psychonomic Science l, 369. Cohen, M. F. and Wallace, J. R. (1993). Radiosity and realistic image synthesis. Morgan Kaufmann, San Francisco. Dana, K. 
J, van Ginneken, B., Nayar, S. K., and Koenderink, J. J. (1999). Reflectance and texture of real world surfaces. ACM Transactions on Graphics 18, 1–34. Dannemiller, J. L. (1992). Spectral reflectance of natural objects: how many basis functions are necessary? Journal of the Optical Society of America A 9, 507–515. Das, S. R. and Sastri, V. D. P. (1965). Spectral distribution and color of tropical daylight. Journal of the Optical Society of America 55, 319. Dixon, E. R. (1978). Spectral distribution of Australian daylight. Journal of the Optical Society of America 68, 437–450. D’Zmura, M. and Iverson, G. (1993a). Color constancy: I. Basic theory of two-stage linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 10, 2148–2165. D’Zmura, M. and Iverson, G. (1993b). Color Constancy: II. Results for two-stage linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 10, 2166–2180. D’Zmura, M. and Iverson, G. (1994). Color Constancy: III. General linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 11, 2389–2400.
Evans, R. M. (1948). An introduction to color. Wiley, New York. Gilchrist, A. L. (ed.) (1994). Lightness, brightness, and transparency. Lawrence Erlbaum Associates, Hillsdale, NJ. Hardin, C. L. (1982). Color for philosophers; Unweaving the rainbow. Hackett, Indianapolis, Indiana. Hateren, J. H. van (1993). Spatial, temporal and spectral pre-processing for color vision. Proceedings of the Royal Society of London Series B 251, 61–68. Helson, H. and Judd, D. B. (1936). An experimental and theoretical study of changes in surface colors under changing illuminations. Psychological Bulletin 33, 740–741. Henderson, S. T. (1977). Daylight and its spectrum, (2nd edn). Wiley, New York. Horn, B. K. P. and Brooks, M. J. (ed.) (1989). Shape from shading. The MIT Press, Cambridge, MA. Hurlbert, A. (1998). Computational models of color constancy. In Perceptual constancies, (ed. V. Walsh and J. Kulikowski). Cambridge University Press, Cambridge. Iverson, G. and D’Zmura, M. (1995a). Criteria for color constancy in trichromatic bilinear models. Journal of the Optical Society of America A 11, 1970–1975. Iverson, G. and D’Zmura, M. (1995b). Color constancy: Spectral recovery using trichromatic bilinear models. In Geometric representations of perceptual phenomena; Papers in honor of Tarow Indow on his 70th birthday, (ed. R. D. Luce, M. D’Zmura, D. Hoffman, G. J. Iverson, and A. K. Romney), pp. 169–185. Lawrence Erlbaum Associates, Mahwah, NJ. Ives, H. E. (1912). The relation between the color of the illuminant and the color of the illuminated object. Transactions of the Illuminating Engineering Society 7, 62–72. Judd, D. B. (1940). Hue saturation and lightness of surface colors with chromatic illumination. Journal of the Optical Society of America 30, 2. Judd, D. B., MacAdam, D. L., and Wyszecki, G. (1964). Spectral distribution of typical daylight as a function of correlated color temperature. Journal of the Optical Society of America 54, 1031. Kelley, K. L., Gibson, K. 
S., and Nickerson, D. (1943). Tristimulus specification of the Munsell Book of Color from spectrophotometric measurements. Journal of the Optical Society of America 33, 355–376. Klinker, G. J., Shafer, S. A., and Kanade, T. (1988). The measurement of highlight in color images. International Journal of Computer Vision 2, 7–32. Krinov, E. L. (1947/1953). Spectral’naye otrazhatel’naya sposobnost’prirodnykh obrazovanii. Izd. Akad. Nauk USSR (Proceedings of the Academy of Sciences USSR). [Translated by G. Belkov, Spectral reflectance properties of natural formations; Technical translation: TT-439. National Research Council of Canada, Ottawa, Canada.] Land, E. H. (1959/1961). Experiments in color vision. Scientific American 201, 84–99. [Reprinted in Color vision; An enduring problem in psychology, (ed. R. C. Teevan and R. C. Birney) (1961). Van Nostrand. Toronto.] Land, E. H. and McCann, J. J. (1971). Lightness and retinex theory. Journal of the Optical Society of America 61, 1–11. Larson, G. W. and Shakespeare, R. (1997). Rendering with radiance. Morgan Kaufmann, San Francisco. Lee, H.-C., Breneman, E. J., and Schulte, C. P. (1990). Modeling light reflection for computer color vision. IEEE Transactions on Pattern Analysis and Machine Intelligence 12, 402–409. Lythgoe, J. N. (1979). The ecology of vision. Clarendon, Oxford. MacAdam, D. L. (1981). Color measurement. theme and variations. Springer-Verlag, Berlin. Maloney, L. T. (1984). Computational approaches to color constancy. Dissertation, Stanford University. [Reprinted as (1985) Stanford Applied Psychology Laboratory Report 1985–01.]
Maloney, L. T. (1986). Evaluation of linear models of surface spectral reflectance with small numbers of parameters. Journal of the Optical Society of America A 3, 1673–1683. Maloney, L. T. (1999). Physics-based models of surface color perception. In Color vision: From genes to perception, (ed. K. R. Gegenfurtner and L. T. Sharpe), pp. 387–418. Cambridge University Press, Cambridge. Maloney, L. T. and Wandell, B. A. (1986). Color constancy: A method for recovering surface spectral reflectance. Journal of the Optical Society of America A 3, 29–33. Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979). Multivariate analysis. Academic Press, London. Marimont, D. and Wandell, B. A. (1992). Linear models of surface and illuminant spectra. Journal of the Optical Society of America A 9, 1905–1913. Nassau, K. (1983). The physics and chemistry of color: The fifteen causes of color. Wiley, New York. Nayar, S. K. and Oren, M. (1995). Visual appearance of matte surfaces. Science 267, 1153–1156. Neumeyer, C. (1981). Chromatic adaptation in the honey bee: Successive color contrast and color constancy. Journal of Comparative Physiology 144, 543–553. Oren, M. and Nayar, S. K. (1995). Generalization of the Lambertian model and implications for machine vision. International Journal of Computer Vision 14, 227–251. Parkkinen, J. P. S., Hallikainen, J., and Jaaskelainen, T. (1989). Characteristic spectra of Munsell colors. Journal of the Optical Society of America A 6, 318–322. Romero, J., García-Beltrán, A., and Hernández-Andrés, J. (1997). Linear bases for representation of natural and artificial illuminants. Journal of the Optical Society of America A 14, 1007–1014. Sällström, P. (1973). Colour and physics: Some remarks concerning the physical aspects of human colour vision. University of Stockholm: Institute of Physics Report, 73–09. Sastri, V. D. P. and Das, S. R. (1966). Spectral distribution and color of north sky at Delhi. Journal of the Optical Society of America 56, 829.
Sastri, V. D. P. and Das, S. R. (1968). Typical spectra distributions and color for tropical daylight. Journal of the Optical Society of America, 58, 391. Shafer, S. A. (1985). Using color to separate reflectance components. Color Research and Applications 10, 210–218. Shepard, R. N. (1992). The perceptual organization of colors: An adaptation to regularities of the terrestrial world? In The adapted mind; Evolutionary psychology and the generation of culture, (ed. J. H. Barkow, L. Cosmides, and J. Tooby), pp. 495–531. Oxford University Press, New York. Stiles, W. S., Wyszecki, G., and Ohta, N. (1977). Counting metameric object-color stimuli using frequency-limited spectral reflectance functions. Journal of the Optical Society of America 67, 779. Strang, G. (1988). Linear algebra and its applications. Harcourt, Brace, Jovanovich, New York. Stroud, B. (2000). The quest for reality. Subjectivism and the metaphysics of colour. Oxford University Press, Oxford. Thompson, E. (1995). Colour vision. A study in cognitive science and the philosophy of perception. Routledge, London. Tominaga, S. and Wandell, B. A. (1989). The standard surface reflectance model and illuminant estimation. Journal of the Optical Society of America A 6, 576–584. Tominaga, S. and Wandell, B. A. (1990). Component estimation of surface spectral reflectance. Journal of the Optical Society of America A 7, 312–317. Vrhel, M. J., Gershon, R., and Iwan, L. S. (1994). Measurement and analysis of object reflectance spectra. Color Research and Applications 19, 4–9. Wade, N. J. (1998). A natural history of vision. MIT Press, Cambridge, MA.
Wandell, B. A. (1995). Foundations of vision. Sinauer and Associates, Sunderland, MA. Weisskopf, V. F. (1968). How light interacts with matter. Scientific American 219, 59–71. Werner, A. (1990). Farbkonstanz bei der Honigbiene, Apis Mellifera. Doctoral dissertation, Fachbereich Biologie, Freie Universität Berlin. Yang, J. N. and Maloney, L. T. (2001). Illuminant cues and surface color perception: Tests of three candidate cues. Vision Research 41, 2581–2600. Yilmaz, H. (1962). Color vision and a new approach to color perception. In Biological prototypes and synthetic systems, Vol. 1. Plenum, New York. Young, N. (1988). An introduction to Hilbert space. Cambridge University Press, Cambridge.
commentary: surface colour perception constraints
Commentaries on Maloney
On the functions of colour vision
Gary Hatfield
Maloney’s chapter represents one of the major research traditions in the visual science of colour. An important feature of this tradition is its conception of the property of colour itself. Maloney elaborates a notion of ‘intrinsic colour’, adapted from Shepard (1992). As these authors define it, an intrinsic colour is a mind- or perceiver-independent physical property of the surface of an object. Maloney thinks of intrinsic colour in terms of a physical characteristic known as a bi-directional reflectance density function. This function gives the probability density (over directions of reflection) that a photon of given wavelength, arriving at a surface from a given direction, will be re-emitted in a specified direction. For present purposes, we can simply think of a graph of the percentage of light reflected at each wavelength, known as a surface spectral reflectance function (SSR). On the conception of colour vision developed by Maloney, the ideal performance of the human visual system would be to recover, from the light arriving at the retina, the precise value of the SSR. In his view, this would be the definitive act of surface colour perception. If it were found that conditions cannot be specified in which such acts would be at least possible, there would be consequences for whether colour was even a property of surfaces (relative to human vision, one must suppose). Accordingly, Maloney would conclude that ‘colour is not a property of surfaces’ (p. 296). Since human performance, in fact, does not meet this ideal in many circumstances, one of his goals is the specification of environments in which human performance could match the ideal. Environments for surface colour perception can be characterized by the available classes of illuminants (the characteristics of the light that falls on surfaces) and classes of SSRs.
If there are no constraints on these two types of class, then the problem of colour constancy appears insoluble. The light arriving at the retina is determined as a combined function of the illuminant and the SSR. The problem would be one of solving for two unknowns given only a single value (the light received at the retina). In normal human observers, there are three types of light receivers at the retina, or three cone types. (Biologically, there is variety within even the normal types, but Maloney restricts himself to ideal cone-types.) Maloney and other investigators have shown that if it can be assumed that there are restrictions on illuminant types and environmental SSRs, then the human colour system might approach, or perhaps achieve, perfect constancy—in conditions where those restrictions hold. Part of the interest of his project is to establish whether there are any (non-trivial) environments in which an idealized human visual system would exhibit perfect constancy, that is, ‘reliably perceive stable surface colours’ or perceive ‘intrinsic colour’ (pp. 000–00). Even if there are no ideal environments of this type, his investigations can determine expected failures of constancy in various specified environments. Maloney’s work fits into the framework of visual science described by Marr (1982), which stems from a blend of computer-vision work, physiological investigations, and Gibson’s (1966, 1979) ecological approach (see Hatfield 2002). As developed by Marr (1982), this approach sees the problem of vision as that of specifying a computational task for the visual system, and then asking how that task might possibly be carried out if it can be assumed that the system has been engineered to operate in a specific environment or environments. For example, the spatial ambiguity of the bi-dimensional images that are received sequentially on the retinas is greatly reduced if it can be assumed that objects are rigid and have comparatively smooth surfaces. 
If we then designed an ideal visual system to operate in an environment meeting those assumptions, and discover that it could fully recover surface spatial structure, we would have a model for a visual system exhibiting various spatial constancies. We could then ask whether the human visual system is similar to, or even identical with, the ideal model. In this sort of approach, the specification of the task is very important. Maloney adopts one particular conception of the task of colour vision, that of recovering a physical property associated with the technical notion of ‘intrinsic colour’. However, other conceptions of the function of colour vision have been formulated, which do not agree with this ‘physical instruments’ conception (Hatfield
1992). Comparative biologists, for instance, have emphasized that constancy is only one function that colour vision might have (Jacobs 1993, 1996). If ‘colour’ is thought of as an attribute by which objects might be discriminated even if they do not vary in lightness or brightness, then colour vision—even in the absence of near perfect constancy—might (for example) serve for the detection of small objects in areas of dappled illumination, or for visually integrating visible objects divided by occlusion (e.g. seen through leaves). These conceptions permit a property of colour to be ascribed to object surfaces distinct from the technical notion of ‘intrinsic colour’. For instance, on a relational view of the colour property, surfaces of objects would possess colours that depend on the colour responses they produce in various types of perceivers (see Hatfield 1992 and Chapter 6 this volume; Thompson 1995). Maloney, in his brief discussion of philosophers in his concluding section, groups Thompson (1995) together with Hardin (1988) and others who deny that colour is a property of objects. Some philosophers, including Hardin (1988), have indeed argued from the variability of the colour response for real perceivers in actual environments to the conclusion that colour experience is a systematic and useful illusion. That is not, however, the only philosophical position available to accommodate the actual variability of colour responses. Such variability includes differing responses to the same SSR under the same illumination and under differing illuminations, and sameness of response to differing SSRs under the same or differing illumination. It has at least been argued that these facts are consistent with a relational view that ascribes colour to objects as a property. This commentary is not the place to argue for one or another conception of the function of colour vision, or of colour as a property of objects. 
The fact that differing conceptions exist is, however, relevant to understanding Maloney’s results. Even if it were discovered that under daylight ecological conditions a Maloney-type ideal visual system similar to the human system can achieve near-perfect colour constancy, that would not show that the (or a) primary function of human colour vision is to achieve constancy. Maloney has developed an important line of work by adopting the philosophical and scientific assumption that constancy is the goal. Further philosophical and theoretical discussion will be needed to sort out whether constancy is the goal, or even a goal, of colour vision. To repeat, even if further discussion should decide against the centrality of constancy in the functional conception of trichromatic colour vision, that would not reduce the interest of Maloney’s work as an exploration of the functioning of ideal visual systems in relation to environments.
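The underdetermination described earlier in this commentary, in which the light reaching the retina depends jointly on the illuminant and the SSR, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the five sample wavelengths, the cone sensitivities, and the two scenes are made-up values, and the function name `cone_excitations` is hypothetical. It shows only that, without constraints, two different illuminant and SSR pairs can yield the same three cone excitations.

```python
# Illustrative sketch only: all spectra and sensitivities are made-up values
# sampled at five wavelengths (short to long), not measured data.

def cone_excitations(illuminant, reflectance, sensitivities):
    """Return one excitation per cone class.

    The light reaching the eye at each wavelength is illuminant * reflectance;
    each cone class sums that light weighted by its own sensitivity.
    """
    light = [e * s for e, s in zip(illuminant, reflectance)]
    return [sum(l * r for l, r in zip(light, sens)) for sens in sensitivities]


# Hypothetical sensitivities for three idealized cone classes.
SENSITIVITIES = [
    [0.0, 0.1, 0.4, 0.9, 0.6],  # long-wavelength class
    [0.1, 0.3, 0.9, 0.6, 0.1],  # medium-wavelength class
    [0.9, 0.6, 0.1, 0.0, 0.0],  # short-wavelength class
]

# Scene A: a spectrally flat illuminant on one surface.
illuminant_a = [2.0, 2.0, 2.0, 2.0, 2.0]
reflectance_a = [0.1, 0.2, 0.3, 0.4, 0.2]

# Scene B: a spectrally uneven illuminant on a different surface, chosen so
# that illuminant * reflectance matches scene A at every wavelength.
illuminant_b = [4.0, 1.0, 2.0, 1.0, 4.0]
reflectance_b = [0.05, 0.4, 0.3, 0.8, 0.1]

excitations_a = cone_excitations(illuminant_a, reflectance_a, SENSITIVITIES)
excitations_b = cone_excitations(illuminant_b, reflectance_b, SENSITIVITIES)
```

The two scenes differ in both illuminant and SSR, yet the cone excitations agree, which is why some restriction on the admissible illuminants and SSRs is needed before constancy can even be posed as a solvable task.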
References Gibson, J. J. (1966). Senses considered as perceptual systems. Houghton Mifflin, Boston. Gibson, J. J. (1979). The ecological approach to visual perception. Houghton Mifflin, Boston. Hardin, C. L. (1988). Color for philosophers: Unweaving the rainbow. Hackett, Indianapolis. Hatfield, G. (1992). Color perception and neural encoding: Does metameric matching entail a loss of information? In PSA 1992, Vol. 1, (ed. D. Hull and M. Forbes), pp. 492–504. Philosophy of Science Association, East Lansing, MI. Hatfield, G. (2002). Psychology, philosophy, and cognitive science: Reflections on the history and philosophy of experimental psychology. Mind and Language 17, 207–232. Jacobs, G. H. (1993). Distribution and nature of colour vision among the mammals. Biological Review, 68, 413–471. Jacobs, G. H. (1996). Primate photopigments and primate color vision. Proceedings of the National Academy of Science USA, 93, 577–81. Marr, D. (1982). Vision. W. H. Freeman, San Francisco. Shepard, R. (1992). The perceptual organization of colors: An adaptation to regularities of the terrestrial world? In The adapted mind: Evolutionary psychology and the generation of culture, (ed. J. H. Barkow, L. Cosmides, and J. Tooby), pp. 495–532. Oxford University Press, New York. Thompson, E. (1995). Colour vision. Routledge, London.
Commentaries on Maloney
Intrinsic colours—and what it is like to see them
Zoltán Jakab
In this brief commentary, I shall defend two related points, one about colours, the other about colour appearances. Maloney defines intrinsic colour in two non-equivalent ways: first, in terms of photoreceptor excitations, and, second, as a kind of reflectance property. As we shall see, the definition in terms of photoreceptor excitations (eqn 9.4) faces more than one problem. Maloney’s definition of intrinsic colours as reflectance properties that correspond to the linear-models-weights representation of surface reflectances fares much better, though, as we shall see, it faces problems of its own. Consider the first proposal. Photoreceptor excitations are not intrinsic properties (here meaning local properties; properties that are not relations to perceivers) of distal surfaces, nor do they represent any such property. Instead, photoreceptor excitations represent sensor quantum catches (Maloney and Wandell 1986, p. 29). Sensor quantum catches are not intrinsic but perceiver-dependent properties of perceived objects. The reason for this is that sensor quantum catches require the existence of perceivers. Were there no perceivers, there would be no sensor quantum catches. Colour objectivism is the view that the existence of object colours does not depend on the existence of perceivers. So a colour objectivist cannot maintain that object colours are sensor quantum catches, on pain of inconsistency. Thus, this first proposal is incompatible with Maloney’s espoused colour objectivism. The proposal faces additional problems. First, even if we keep the illuminant constant as Maloney suggests (p. 285), we can only do so on arbitrary grounds, for there are many different illuminants that we might equally well choose as the reference illuminant.
Secondly, according to this proposal, intrinsic colour depends on photoreceptor sensitivity profiles,1 and such profiles are known to vary substantially from one trichromat human to another (Hardin 1988, pp. 76–82; Lutze et al. 1990; Neitz and Neitz 1998). So even if we were to decide, for mathematical purposes, to keep them constant, the intrinsic colour of any particular object in any particular fixed circumstance (reference illuminant, surround, etc.) will vary between normal trichromat human perceivers (Jakab 2001; Kuehni 2001; see also Block 1999). To summarize, if intrinsic colour is identified with photoreceptor excitations, then intrinsic colour depends on properties of observers in such a way that particular colours cannot be specified without mentioning some characteristics of observers (i.e. their photoreceptor excitations), nor can they be physically instantiated in the absence of observers. Since, intuitively, intrinsic colours should be local properties of the distal objects of perception, this is a controversial result.
1 For this reason it is not correct to say that the matrix Λ_ε depends only on the unknown illuminant (p. 286). It does also depend on photoreceptor sensitivities (e.g. Wandell 1995, p. 307, eqn 9.12) and this, too, makes a difference to intrinsic colour (according to the notion under evaluation).
2 One such characteristic is that reflectance is a smooth, slowly varying function of wavelength (Maloney 1986; Westland and Thomson 1999). This is not to say that there exists an unambiguous one-to-one correspondence between a determinate list of reflectance characteristics and any particular selection of linear model basis functions. As I understand it, that is not, in fact, the case.
The second proposal for intrinsic colour (representation of surface reflectances by linear models: basis functions and weights; p. 286) fares better. The idea here is that the basis functions of linear models mirror some fundamental, universal reflectance characteristics of terrestrial surfaces.2 These fundamental reflectance characteristics derive, in turn, from some general physical and chemical properties of those surfaces (Maloney 1986, pp. 1677–1678). Colour vision represents particular reflectances by linear combinations of a small set of basis functions. If we identify intrinsic colours with the fundamental reflectance characteristics that human colour vision is sensitive to, we avoid the above controversy about perceiver dependence. Notice, however, that individual differences in colour perception still introduce a problem for this approach. For if, as a matter of fact, one and the same surface in the same circumstances can look one colour to one normal observer and another colour to another, then that surface cannot have an absolute colour, only a relative one. However, as McLaughlin’s analysis shows (Chapter 16 this volume), even though colours have to be perceiver-relative, they can still be intrinsic, that is, perceiver-independent properties of surfaces. One and the same surface reflectance is one colour for one observer (in one circumstance, etc.), and another colour for another observer (in another circumstance). Still, (1) we can specify particular colours in terms that do not make reference to any parameter of observers (e.g. redness = surface reflectance such-and-such); and (2) correspondingly, particular colours remain instantiated in the absence of observers. Neither of the latter two conditions is satisfied if colour is thought to be photoreceptor excitation. According to both conceptions offered by Maloney, intrinsic colours depend, for their identity, on properties of, or relations to, observers. This is because individual differences in colour perception make the notion of absolute colour untenable (see McLaughlin, Chapter 16 this volume). This looks like a retreat of some sort since, intuitively, intrinsic colours are supposed not to depend on properties of (or relations to) observers; they are supposed to depend only on local properties of the distal object of perception; that is what ‘intrinsic’ is meant to emphasize. This notion of intrinsic colour falls with colour absolutism. Still, Maloney’s second conception is compatible with a modified notion of intrinsic colour: criteria (1) and (2).
For, as McLaughlin (Chapter 16 this volume) has argued, colour objectivism does not require colour absolutism.
Let us turn to the distinction between colours and colour appearances. In another paper, Maloney (1999, pp. 409–414) discusses how the linear models framework relates to the opponent processing model of colour perception. Briefly, the idea is that, following Stiles (1961, p. 264; Maloney 1999, p. 410), for purposes of theoretical analysis, colour vision can be divided into two very general stages: (1) adaptational states of the pathways of chromatic processing, and (2) the processes that adjust and modify these adaptational states. Colour processing consists of a number of transformations of retinal signals, including multiplicative scaling, additive shifts, and opponent recombination. The outcome of all these transformations is colour appearance. These transformations contain certain parameters (coefficients for multiplicative scaling, constants for additive shift, and so on) that are systematically modified by some characteristics of visual stimulation. The general schema is: transformations on receptor inputs at a given retinal location are influenced by previous retinal input and simultaneous input at other parts of the retina. This information about retinal surround determines the parameters for transformation of the cone signals at the retinal point under consideration. Now, the linear-models-based algorithms of surface reflectance estimation figure in adaptational control: they are part of the transformations by which colour appearance is reached from retinal input (Maloney 1999, p. 413). The first transformation of photoreceptor excitations is their multiplication by the inverse of the lighting matrix Λ_ε (p. 287; see also Wandell 1995, p. 307; Maloney 1999, p. 413). The lighting matrix is illumination-dependent, and this transformation has the function of discounting the effect of illuminant changes, thereby achieving (approximate) colour constancy.
The result of this transformation is the visual representation of surface reflectance by linear-models weights. This representation then undergoes a further transformation that determines colour appearance. This further transformation (function F in Maloney 1999, p. 413) is arbitrary in the sense that, in principle, some species with trichromat colour vision and photoreceptors of the same kind as ours could discriminate the same reflectance types as can trichromat humans, form the same linear-models-weights representations of them, yet still apply some different F function (second-site multiplicative attenuation, opponent recombination: Maloney 1999, p. 410) to them, so that, despite the fact that such organisms
discriminate the same reflectance ranges by their colour experiences as we do, their colour space (unique-binary division, similarity metrics) would be substantially different from ours. As Maloney says (1999, p. 413), in principle any one-to-one transformation of the linear-models-weights representation would serve equally well to determine colour appearance; constraints on this transformation should come from further assumptions about how this second stage of colour processing operates in humans. That is, particularities of surface reflectance estimation by colour vision do not alone determine colour appearance. Colour appearance crucially depends on further transformations in the visual system that are independent of information about surface reflectance, but play a key role in shaping our colour space. This observation is extremely relevant to the evaluation of so-called representational externalist theories of colour experience (Dretske 1995; Tye 1995, 2000). Dretske and Tye claim, in effect, that the phenomenal character of colour experience (roughly the same as colour appearance) is determined straightforwardly by information about surface reflectance represented in colour vision. Colours, in their view, are types of surface reflectances. Moreover, the representational content of colour perceptions arises from the information that these perceptions carry about colours,3 and the phenomenal character of colour experiences is the same as their colour content. Object colour figures as the key component in colour content, and colour content just is colour phenomenology: this means that object colours crucially determine what it is like to see them, i.e. the phenomenal characters of colour experiences. Dretske’s and Tye’s views thus constitute the most straightforward denial of Lockean secondary quality theories. However, as we have seen, Maloney’s model has the consequence that intrinsic colours (surface reflectances) do not determine what it is like to see them. 
Therefore, if his general approach to colour vision is right, then representational externalism about colour phenomenology is wrong, and some internalist approach to phenomenal colour experience has to be correct. (Internalism is the view that the phenomenal character of colour experiences, that is, what it is like to see colours, is determined by what happens in the nervous system).4 Finally, note that nothing in what I have said questions the idea that colour experience reliably tracks types of surface reflectance.
3 Tye endorses a non-teleological notion of content that is very close to Fodor’s account (Fodor 1990). For Fodor, content is essentially the same as information. Dretske’s notion of content is teleological; still, it is very close to that of information (see Dretske 1981, 1988; McLaughlin, Chapter 16 this volume).
4 For a defence of internalism, see McLaughlin (2003). My inclination is to side with McLaughlin (and Maloney) regarding the determinants of colour experience, and break out of Atherton’s dilemma (Commentary on Chapter 16 this volume) by saying that revelation simply is a mistaken intuition that is far from being untouchable by scientific development. Just as science once taught us that our intuitive views about intrinsic inclination to fall were plain wrong (i.e. a rather unreasonable way of thinking about free fall), it now is teaching us that the idea of revelation is wrong in much the same way. Therefore we are entitled to separate colours from what it is like to see them.
References
Block, N. (1999). Sexism, racism, ageism, and the nature of consciousness. Philosophical Topics 26, 39–70.
Dretske, F. (1981). Knowledge and the flow of information. MIT Press, Cambridge, MA.
Dretske, F. (1988). Explaining behavior: Reasons in a world of causes. MIT Press, Cambridge, MA.
Dretske, F. (1995). Naturalizing the mind. MIT Press, Cambridge, MA.
Fodor, J. A. (1990). A theory of content II: the theory. In A theory of content and other essays, (ed. J. Fodor). MIT Press, Cambridge, MA.
Hardin, C. L. (1988). Color for philosophers: Unweaving the rainbow. Hackett, Indianapolis.
Jakab, Z. (2001). Color experience: empirical evidence against representational externalism. Ph.D. thesis, Carleton University, Ottawa. Available at http://www.carleton.ca/iis/TechReports
Kuehni, R. G. (2001). Determination of unique hues using Munsell color chips. Color Research and Application 26, 61–66.
Lutze, M., Cox, J., Smith, V. C., and Pokorny, J. (1990). Genetic studies of variation in Rayleigh and photometric matches in normal trichromats. Vision Research 30, 149–162.
Maloney, L. T. (1986). Evaluation of linear models of surface spectral reflectance with small numbers of parameters. Journal of the Optical Society of America A 3, 1673–1683.
Maloney, L. T. (1999). Physics-based models of surface color perception. In Color vision: From genes to perception, (ed. K. R. Gegenfurtner and L. T. Sharpe), pp. 387–418. Cambridge University Press, Cambridge, UK.
Maloney, L. T. and Wandell, B. A. (1986). Color constancy: a method for recovering surface reflectance. Journal of the Optical Society of America A 3, 29–33.
McLaughlin, B. P. (2003). Color, consciousness, and color consciousness. In Consciousness: New philosophical perspectives, (ed. Q. Smith and A. Jokic). Oxford University Press, Oxford.
Neitz, M. and Neitz, J. (1998). Molecular genetics and the biological basis of color vision. In Color vision: Perspectives from different disciplines, (ed. W. Backhaus, R. Kliegl, and J. S. Werner). de Gruyter, Berlin.
Stiles, W. S. (1961). Adaptation, chromatic adaptation, colour transformation. Anales de la Real Sociedad Espanola de Fisica y Quimica Seria A-Fisica 57, 149–175.
Tye, M. (1995). Ten problems of consciousness. MIT Press, Cambridge, MA.
Tye, M. (2000). Consciousness, color, and content. MIT Press, Cambridge, MA.
Wandell, B. A. (1995). Foundations of vision. Sinauer and Associates, Sunderland, MA.
Westland, S. and Thomson, M. (1999). Spectral colour statistics of surfaces: Recovery and representation. Colour and Imaging Institute, Derby University, Derby.
Chapter 10

COLOUR CONSTANCY: DEVELOPING EMPIRICAL TESTS OF COMPUTATIONAL MODELS

David H. Brainard, James M. Kraft, and Philippe Longère

Preface

I first became interested in studying vision when, as an undergraduate, I read the first chapter of David Marr’s book Vision (Marr 1982). In that chapter, he articulates the view that vision can be understood as a system that extracts an explicit representation of the world from the retinal image, and that our understanding of human vision is usefully informed by consideration of machine vision algorithms that accomplish the same task. My studies, to that point, had focused on physics and computer science, and this was my first exposure to the notion that psychological questions (e.g. How does vision work?) could be connected to physics (e.g. image formation) and computer science (e.g. image processing). I found the idea sufficiently exciting that I pursued study in psychology. Subsequently, I ended up studying colour constancy because I viewed it as a relatively simple model problem that embodies the general processing task faced by vision: how can the visual system create a useful representation of surface properties (e.g. colour appearance) from a retinal image that confounds the physical properties of surfaces with those of the illuminant?

An attractive feature of colour constancy is that there has been substantial progress both in our understanding of human performance and also in our understanding of how to achieve constancy in computer vision systems. In principle, though, these two lines can stand separately: one need not model human performance by drawing on the computational work, and computational solutions to colour constancy have application in digital image processing whether or not they connect to human performance. Indeed, in much of the literature the promise of connections between computation and performance has not been explicitly pursued.
The idea that an understanding of the computational requirements of colour constancy can inform our study of human performance has, however, remained tantalizing. In my own work, I have both pursued quantitative measurements of human constancy and considered the computer vision problem presented by colour constancy. My hope remains that the two lines of research can indeed be brought together in a satisfactory fashion. In the present chapter, we review the current state of this enterprise, with particular emphasis on how psychophysical experiments can be structured so that the results speak directly to whether a particular computational theory is a good model of human colour vision.

David H. Brainard
Introduction

Object recognition is difficult because there is no simple relation between an object’s properties and the retinal image. Where the object is located, how it is oriented, and how it is illuminated also affect the image. Moreover, the relation is under-determined: multiple physical configurations can give rise to the same retinal image.

In the case of object colour, the spectral power distribution of the light reflected from an object depends not only on the object’s intrinsic surface reflectance, but also on factors extrinsic to the object, such as the illumination. The relation between intrinsic reflectance, extrinsic illumination, and the colour signal reflected to the eye is shown schematically in Fig. 10.1. The light incident on a surface is characterized by its spectral power distribution, E(λ). A small surface element reflects a fraction of the incident illuminant to the eye. The surface reflectance function, S(λ), specifies this fraction as a function of wavelength. The spectrum of the light reaching the eye is called the colour signal and is given by C(λ) = E(λ)S(λ). Information about C(λ) is encoded by three classes of cone photoreceptors, the L, M, and S cones.

Figure 10.1 Effect of changing the illuminant on light reflected to the eye. The light incident on a surface is characterized by its spectral power distribution E(λ). A small surface element reflects a fraction of the incident illuminant to the eye. The surface reflectance function S(λ) specifies this fraction as a function of wavelength. The spectrum of light reaching the eye is called the colour signal, and is given by C(λ) = E(λ)S(λ). Information about C(λ) is encoded by three classes of cone photoreceptors, the L, M, and S cones. Note that this is a simplified imaging model. In general, the function S(λ) depends on the geometry of the observer, illuminant, and object.

The top two patches rendered in Fig. 10.2 illustrate the large effect that a typical change in natural illumination (see Wyszecki and Stiles 1982) can have on the colour signal. This effect might lead us to expect that the colour appearance of objects should vary radically, depending as much on the current conditions of illumination as on the object’s surface reflectance. Yet the very fact that we can sensibly refer to objects as having a colour indicates
otherwise. Somehow our visual system stabilizes the colour appearance of objects against changes in illumination, a perceptual effect that is referred to as colour constancy.

Because the illumination is the most salient object-extrinsic factor that affects the colour signal, it is natural that emphasis has been placed on understanding how changing the illumination affects object colour appearance. In a typical colour constancy experiment, the independent variable is the illumination and the dependent variable is a measure of colour appearance (Helson 1938; Helson and Jeffers 1940; Helson and Michels 1948; Hunt 1950; Burnham et al. 1957; McCann et al. 1976; Arend and Reeves 1986; Valberg and Lange-Malecki 1990; Arend et al. 1991; Brainard and Wandell 1992; Lucassen and Walraven 1993, 1996; Bauml 1994, 1995; Brainard et al. 1997; Brainard 1998). These various experiments employ different stimulus configurations and psychophysical tasks, but taken as a whole they support the view that human vision exhibits a reasonable degree of colour constancy.

Figure 10.2 Renderings of two surfaces under two illuminants. The top row shows the same surface rendered under two different illuminants. Each rendering was obtained using an illuminant spectral power distribution and surface reflectance function to compute the spectrum of the colour signal. From this the Smith–Pokorny estimates (Smith and Pokorny 1975; DeMarco et al. 1992) of the L, M and S cone spectral sensitivities were used to obtain the quantal absorption rates of each cone class in response to the colour signal. These, in turn, were used, together with typical red, green, and blue phosphor emission spectra and monitor gamma curves, to compute RGB coordinates for the rendering. The RGB coordinates were chosen using standard methods (e.g. Brainard 1995) so that the light they cause to be emitted from the monitor has the same effect on the cones as the colour signal being rendered. The RGB coordinates were used to produce the figure by methods outside of the authors’ control. The spectral plots show the surface reflectance functions and illuminant spectral power distributions used for this example. (See also colour Plate 28 in the centre of this book.)

Recall that the top two patches of Fig. 10.2 illustrate the limiting case, where a single surface reflectance is seen under multiple illuminations. Although this case illustrates the effect of the illuminant, it fails to capture an essential feature of the computational problem
faced by a visual system that attempts to achieve colour constancy. This is the ambiguity created because of the interaction between illuminant and surface reflectance, an ambiguity illustrated if we consider Fig. 10.2 in its entirety. The rendered patches in the second row of the plate show the effect of the same illuminant change on the information encoded about an additional surface. Note that when seen under the first illuminant, this second surface presents the same spectral signature as does the first surface under the second illuminant. When we consider both illuminant and surface variation, the essential ambiguity underlying colour constancy emerges: how can the visual system determine which object is present in the world if the information reaching the eye is identical for two different object– illuminant configurations? Clearly colour constancy is not possible in general, since the visual system cannot distinguish the two simple scenes rendered in the top right and bottom left patches of Fig. 10.2. Given that colour constancy is not possible in general, it makes little sense to provide a simple answer to the question of how colour constant human vision is. It is more sensible to investigate constancy for some specified ensemble of scenes (Maloney 1999). Of particular interest are ensembles that are representative of scenes we encounter in daily viewing. In this chapter, our aim is to link two lines of research. The first is theoretical work on the computational problem of colour constancy. The goal of computational theories is to define particular ensembles of scenes in which some degree of colour constancy is possible, and to express algorithms that achieve constancy for these ensembles. Computational theories of colour constancy stand independent of their relevance to human vision. None the less, we have found that the computational work provides useful guidance for a research programme designed to understand human colour vision. 
Our treatment of the computational work is intended primarily to clarify how computational models can be elaborated to make predictions about human performance. The second line of research is empirical measurements of human colour constancy made in our laboratory. Here the emphasis is on studies of performance for stimulus conditions closely related to natural viewing, and on measurements that connect to computational theory.
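The ambiguity described above, in which two different object–illuminant configurations send the same colour signal to the eye, can be made concrete with a small numerical sketch. The spectra below are invented linear ramps chosen only for illustration (they are not the spectra used for Fig. 10.2), and the 31-sample wavelength grid is likewise an assumption.

```python
import numpy as np

# Hypothetical smooth spectra sampled at 31 wavelengths (400-700 nm).
wavelengths = np.linspace(400.0, 700.0, 31)
t = (wavelengths - 400.0) / 300.0      # 0..1 across the sampled range

illuminant_1 = 1.0 + 0.5 * t           # E1: more power at long wavelengths
illuminant_2 = 1.5 - 0.5 * t           # E2: more power at short wavelengths
surface_1 = 0.2 + 0.3 * t              # S1: reflectance between 0.2 and 0.5

# Choose S2 so that E1 * S2 equals E2 * S1 at every sample wavelength.
surface_2 = surface_1 * illuminant_2 / illuminant_1

# Colour signals C = E * S for the two configurations being compared.
signal_s1_under_e2 = illuminant_2 * surface_1
signal_s2_under_e1 = illuminant_1 * surface_2

same_signal = bool(np.allclose(signal_s1_under_e2, signal_s2_under_e1))
surfaces_differ = not np.allclose(surface_1, surface_2)
# A reflectance is a fraction, so it should stay within [0, 1].
physically_valid = bool(np.all((surface_2 >= 0.0) & (surface_2 <= 1.0)))
```

Because the two colour signals agree at every wavelength, no set of photoreceptors could tell the configurations apart from a single patch; any resolution has to come from constraints on the ensemble of scenes.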
Computational theory

Most computational theories of colour constancy (e.g. Buchsbaum 1980; D’Zmura and Lennie 1986; Lee 1986; Maloney and Wandell 1986; Trussell and Vrhel 1991; D’Zmura and Iverson 1993; Funt and Drew 1993; D’Zmura et al. 1995; Brainard and Freeman 1997; Finlayson et al. 1997) share the same basic two-step framework. In the first step, the image is analysed to yield an estimate of illuminant properties. In the second step, this estimate is used to process the light reflected to the eye from each surface. The second step produces a description of surface properties that is approximately independent of the actual illuminant. Within this two-step framework, individual theories are distinguished by the ensemble of scenes to which they are meant to apply and by how they accomplish each step.

To illustrate how computational work can provide a basis for developing statements about human performance, it is useful to consider one theory in some detail. For this illustrative purpose we have chosen Buchsbaum’s classic (1980) theory, expressed with respect to the human visual system.
As emphasized above, any computational theory must define a restricted ensemble of scenes to which it applies. In the case of Buchsbaum’s theory, a single scene in the ensemble consists of a collection of flat matte surfaces arranged in a single plane and illuminated diffusely by spatially uniform illumination. Light from each surface in the scene is reflected to the eye. The eye contains three classes of cone photoreceptors (L, M, and S cones) that encode the spectral properties of the light reflected from each surface to the eye. Thus the image may be specified by the quantal absorption rates of the L, M, and S cones at each image location. This simplified ensemble of visual scenes is sometimes referred to as the Mondrian World because of the resemblance of its individual scenes to paintings by the Dutch artist Piet Mondrian (Land and McCann 1971; see also Maloney 1999).

For any scene from the Mondrian World, we can describe the spectral power distribution of the illuminant by a function of wavelength $E(\lambda)$ and the spectral reflectance of each surface by a function $S_j(\lambda)$. The light reflected from the $j$th surface to the eye then has spectral power distribution $C_j(\lambda) = E(\lambda)S_j(\lambda)$. It is convenient to discretize these spectral quantities and express them as vectors (e.g. Wandell 1987; Brainard 1995). Thus we can use the vector $\mathbf{e}$ to describe $E(\lambda)$, where $\mathbf{e}$ is an $N_\lambda$-dimensional column vector. The entries of $\mathbf{e}$ represent the power of the illuminant at $N_\lambda$ sample wavelengths $\lambda_n$ spaced evenly across the visible spectrum. Similarly, we can represent the surface reflectance functions by the $N_\lambda$-dimensional column vector $\mathbf{s}_j$, where the $n$th entry of $\mathbf{s}_j$ is $S_j(\lambda_n)$. Given this representation, the spectral power distribution reflected to the eye from the $j$th surface is

$$\mathbf{c}_j = \operatorname{diag}(\mathbf{e})\,\mathbf{s}_j = \operatorname{diag}(\mathbf{s}_j)\,\mathbf{e}, \tag{10.1}$$
where the function $\operatorname{diag}()$ creates a diagonal matrix with the entries of its argument on the diagonal.

The information about the spectrum of light encoded by a single class of cones is the rate at which photons are absorbed by the photopigment contained within the cone. This rate may be computed from the cone’s spectral sensitivity. Let $L(\lambda)$ be the spectral sensitivity of the L cones, $M(\lambda)$ the spectral sensitivity of the M cones, and $S(\lambda)$ the spectral sensitivity of the S cones. Form the $3 \times N_\lambda$ matrix $\mathbf{R}$, where the $n$th entry of the first row of $\mathbf{R}$ is $L(\lambda_n)$, the $n$th entry of the second row is $M(\lambda_n)$, and the $n$th entry of the third row is $S(\lambda_n)$. We can then compute the quantal absorption rates of the three classes of cones in response to a spectral power distribution $\mathbf{c}_j$ through the equation

$$\mathbf{r}_j = \mathbf{R}\,\mathbf{c}_j \tag{10.2}$$
where $\mathbf{r}_j$ is a three-dimensional column vector whose entries are the quantal absorption rates for the L, M, and S cones respectively.

A feature of the Mondrian World is that the minimal spatial structure of the images does not carry information about the illuminant. Thus we can summarize the information available from the image about the illuminant by the list of quantal absorption rates $\{\mathbf{r}_j\}$. In addition, the ordering of the elements in the list is not important. We refer to the list $\{\mathbf{r}_j\}$ as the colour statistics of the image.

It is straightforward to show that even when we restrict attention to the Mondrian World, colour constancy remains an under-determined computational problem. It is possible to choose two illuminants and two collections of surfaces that produce identical colour statistics. Thus Buchsbaum added additional constraints to the ensemble of scenes to which his theory applies.

The first constraint concerned the spectral form of individual illuminants and surfaces. Rather than allowing arbitrary choices of $\mathbf{e}$ and the $\mathbf{s}_j$, Buchsbaum assumed that both illuminants and surfaces were constrained to lie within three-dimensional linear models. For illuminants, this assumption is that the illuminant $\mathbf{e}$ can be written as $\mathbf{e} = \mathbf{B}_e \mathbf{w}_e$, where $\mathbf{B}_e$ is an $N_\lambda \times 3$ matrix and $\mathbf{w}_e$ is a three-dimensional column vector. The columns of the matrix $\mathbf{B}_e$ are referred to as the basis vectors for the model, while the entries of the vector $\mathbf{w}_e$ are referred to as the model weights for the particular illuminant $\mathbf{e}$. For surfaces, the linear model assumption is similar. In this case we write $\mathbf{s}_j = \mathbf{B}_s \mathbf{w}_{sj}$, where $\mathbf{B}_s$ is an $N_\lambda \times 3$ matrix and $\mathbf{w}_{sj}$ is a three-dimensional column vector. We can combine the linear model constraints with Equations 10.1 and 10.2 to obtain

$$\mathbf{r}_j = \mathbf{R}\,\mathbf{c}_j = \mathbf{R}\,\operatorname{diag}(\mathbf{e})\,\mathbf{B}_s \mathbf{w}_{sj} = \mathbf{R}\,\operatorname{diag}(\mathbf{s}_j)\,\mathbf{B}_e \mathbf{w}_e. \tag{10.3}$$
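The image-formation model of Equations 10.1 to 10.3 can be checked numerically. The following is a minimal sketch, assuming NumPy and using random matrices as stand-ins for the cone sensitivities and the linear-model bases; none of these numbers are measured data, and the choice of 31 wavelength samples is an assumption made only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_wavelengths = 31                      # N_lambda, an assumed sampling density

# Hypothetical stand-ins: random numbers, not physically meaningful spectra.
R = rng.random((3, n_wavelengths))      # rows: L, M, S cone sensitivities
B_e = rng.random((n_wavelengths, 3))    # illuminant basis vectors (columns)
B_s = rng.random((n_wavelengths, 3))    # surface basis vectors (columns)
w_e = rng.random(3)                     # illuminant weights
w_s = rng.random(3)                     # weights for one surface j

e = B_e @ w_e                           # illuminant spectrum, e = B_e w_e
s = B_s @ w_s                           # surface reflectance, s_j = B_s w_sj

c = np.diag(e) @ s                      # Eq. 10.1: colour signal
r = R @ c                               # Eq. 10.2: cone absorption rates

# Eq. 10.3: the same r, written as linear in w_s or as linear in w_e.
r_via_surface_weights = R @ np.diag(e) @ B_s @ w_s
r_via_illuminant_weights = R @ np.diag(s) @ B_e @ w_e
```

The two final expressions agree because the element-wise product of illuminant and reflectance is symmetric in its arguments; this symmetry is what lets the same image data be treated as linear in either set of weights once the other is fixed.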
There is considerable evidence that small-dimensional linear models provide a reasonable description of many illuminants and surfaces (e.g. Cohen 1964; Judd et al. 1964; Maloney 1986; Parkkinen et al. 1989; Jaaskelainen et al. 1990; Romero et al. 1997; see Maloney 1992).

A second constraint on the scenes was that the spatial average of the surfaces in any particular scene is constant across scenes. This is often referred to as the Grey World assumption. To understand how colour constancy is possible in a Mondrian World with scenes constrained as described above, let $\bar{\mathbf{s}}$ be the spatial average of the $\mathbf{s}_j$ and $\bar{\mathbf{r}}$ be the spatial average of the corresponding $\mathbf{r}_j$. Then we can write

$$\bar{\mathbf{r}} = \mathbf{R}\,\operatorname{diag}(\bar{\mathbf{s}})\,\mathbf{B}_e \mathbf{w}_e. \tag{10.4}$$
This follows because the spatial averaging operation commutes with the linear process of image formation described by Equation 10.3. If the spatial average of the surface reflectance is known, then Equation 10.4 may be inverted to solve for the illuminant:

$$\hat{\mathbf{e}} = \mathbf{B}_e \mathbf{M}_s^{-1}\,\bar{\mathbf{r}} \tag{10.5}$$
where $\mathbf{M}_s$ is the three-by-three matrix given by $[\mathbf{R}\,\operatorname{diag}(\bar{\mathbf{s}})\,\mathbf{B}_e]$. The matrix $\mathbf{M}_s$ is invertible because the dimension of the linear model for illuminants (3) is matched to the number of human cone types (L, M, and S). Given the illuminant estimate $\hat{\mathbf{e}}$, the individual $\mathbf{s}_j$ are obtained through

$$\mathbf{s}_j = \mathbf{B}_s \mathbf{M}_e^{-1}\,\mathbf{r}_j \tag{10.6}$$

where $\mathbf{M}_e = [\mathbf{R}\,\operatorname{diag}(\hat{\mathbf{e}})\,\mathbf{B}_s]$. The matrix $\mathbf{M}_e$ is invertible because the dimension of the linear model for surfaces (3) is also matched to the number of human cone types (L, M, and S).

Equation 10.5 is the key to Buchsbaum’s algorithm. By assuming that the spatial average of surface reflectances in the scene, $\bar{\mathbf{s}}$, is known, it is possible to form the matrix $\mathbf{M}_s$ and apply Equation 10.5 to estimate the illuminant. Although Buchsbaum’s theory is designed for the Mondrian World with linear model constraints, the estimation procedure may be
applied to any set of image data. The estimate will be accurate to the extent that (1) the scene conforms to the Mondrian World assumptions; (2) the linear models $\mathbf{B}_e$ and $\mathbf{B}_s$ describe the illuminant and surfaces that comprise the scene; and (3) the actual spatial average of surfaces matches the assumed $\bar{\mathbf{s}}$. Note that in Equation 10.5 the illuminant estimate depends on the scene only through the spatial average of the receptor responses, $\bar{\mathbf{r}}$. In this sense, the spatial average summarizes the scene with respect to the illuminant estimate obtained by Buchsbaum’s algorithm.

Several other theories (e.g. Maloney and Wandell 1986; Forsyth 1990; Trussell and Vrhel 1991; D’Zmura et al. 1995; Brainard and Freeman 1997; Finlayson et al. 1997) are also designed for the Mondrian World. As with Buchsbaum’s theory, the algorithms associated with these theories work in two steps, first estimating the illuminant and then using the illuminant estimate to obtain surface reflectance estimates. These theories differ from Buchsbaum’s primarily in what information is used to make the illuminant estimate. For example, the illuminant estimate returned by Maloney and Wandell’s (1986) algorithm depends on the colour statistics only through their covariance matrix, while that returned by Forsyth’s (1990) algorithm depends only on the convex hull of the colour statistics. As we will see below, understanding which properties of the colour statistics affect an algorithm’s estimate makes possible empirical tests of the algorithm’s usefulness as a model of human performance.

Although we will not consider them further in this chapter, it is worth noting that there is a growing literature on theories that operate for richer scenes than those within the Mondrian World (D’Zmura and Lennie 1986; Hurlbert 1986; Lee 1986; Tominaga and Wandell 1989; D’Zmura and Iverson 1993; Funt and Drew 1993; see Hurlbert 1998; Maloney 1999).
The algorithms associated with these theories generally estimate the illuminant using both information contained in the colour statistics and information contained in the spatial structure of the image.

Linking computation and performance

How can we employ Buchsbaum’s (1980) theory (or any computational algorithm) as a model of human performance? It is not entirely obvious how to proceed. For example, the algorithm produces estimates of the illuminant spectral power distributions and surface reflectance functions, whereas human observers make psychophysical judgements. Such judgements do not address the spectral functions directly but rather assess, in one way or another, the colour appearance of illuminants and surfaces in the scene. Thus the algorithm output and human judgements are not commensurate.

To develop an algorithm into a model requires additional linking theory. Suppose that $\boldsymbol{\sigma}$ is a vector whose entries describe the perceptual experience of colour. To connect an algorithm such as Buchsbaum’s to human performance, we can suppose that $\boldsymbol{\sigma}$ is related to estimated surface reflectance $\hat{\mathbf{s}}$ by some unknown but fixed function $f()$, so that $\boldsymbol{\sigma} = f(\hat{\mathbf{s}})$. Although the form of $f()$ is unknown, we will assume that it does not depend on context and that it is one-to-one. This simple linking assumption does not allow us to predict colour names from algorithm output. But it does allow the following general prediction to be made about the relation between human performance and algorithm output: two
surfaces seen in the context of different images should appear the same if, and only if, the algorithm estimates the same surface reflectance for each surface. We will refer to this idea as the match-prediction linking hypothesis.

If we accept the match-prediction linking hypothesis, we can make predictions about human performance. Using a psychophysical procedure, we establish pairs of stimuli that, when seen in the context of different images, appear the same. A typical procedure would be asymmetric colour matching (e.g. Burnham et al. 1957; Stiles 1967; Arend and Reeves 1986; Brainard and Wandell 1992; Brainard et al. 1997). Given pairs of stimuli that match across contexts, we ask whether the surface reflectances estimated by an algorithm for these stimuli also match. To the extent that they do, the algorithm provides a good description of human performance.

The difficulty with taking this approach is that an algorithm’s specific estimates depend on a number of parameter choices. For example, in Buchsbaum’s algorithm the choice of linear models $\mathbf{B}_e$ and $\mathbf{B}_s$ will affect the estimated surface reflectances. These would have to be set either through parameter search or by clever guess. Although this is not necessarily prohibitive, it seems desirable to investigate more directly whether the core principles of a computational theory can be used to understand human performance.

For Buchsbaum’s algorithm, Equation 10.5 shows that the illuminant estimate it returns depends on the image only through the spatial average $\bar{\mathbf{r}}$; if we have two different images with the same spatial average $\bar{\mathbf{r}}$, the algorithm will return the same illuminant estimate. In addition, the surface reflectance function estimated at a location depends on the image only through the light reflected from the surface at that location ($\mathbf{r}_j$) and the illuminant estimate (see Equation 10.6). Thus if two images have the same spatial average and we embed a surface that reflects the same light to the eye in each image, Buchsbaum’s algorithm is a candidate model for human performance only if the two surfaces appear the same. This prediction holds independent of the choice of linear models $\mathbf{B}_e$ and $\mathbf{B}_s$.

In the next section we consider experiments that measure human colour constancy, with the goal of connecting the experiments to the ideas discussed above.
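The estimation steps of Equations 10.5 and 10.6, and the prediction that two images with the same spatial average yield the same estimates, can be illustrated as follows. This is a sketch under stated assumptions rather than Buchsbaum’s implementation: the function names `estimate_illuminant` and `estimate_surface` are hypothetical, the sensitivities and bases are random stand-ins, and the assumed spatial average is set equal to the true one so that the Grey World assumption holds exactly.

```python
import numpy as np

def estimate_illuminant(r_mean, R, B_e, s_mean):
    """Eq. 10.5: illuminant estimate from the spatial mean of cone responses."""
    M_s = R @ np.diag(s_mean) @ B_e            # 3 x 3 matrix
    return B_e @ np.linalg.solve(M_s, r_mean)

def estimate_surface(r_j, R, B_s, e_hat):
    """Eq. 10.6: surface estimate from one response and the illuminant estimate."""
    M_e = R @ np.diag(e_hat) @ B_s             # 3 x 3 matrix
    return B_s @ np.linalg.solve(M_e, r_j)

rng = np.random.default_rng(1)
n_wavelengths, n_surfaces = 31, 20

# Hypothetical stand-ins for cone sensitivities and linear-model bases.
R = rng.random((3, n_wavelengths))
B_e = rng.random((n_wavelengths, 3))
B_s = rng.random((n_wavelengths, 3))

# A synthetic Mondrian scene that satisfies the linear-model constraints.
S = B_s @ rng.random((3, n_surfaces))          # one reflectance per column
e = B_e @ rng.random(3)                        # true illuminant
responses = R @ (e[:, None] * S)               # Eqs 10.1 and 10.2, per surface

s_mean = S.mean(axis=1)                        # assumed average = true average
r_mean = responses.mean(axis=1)

e_hat = estimate_illuminant(r_mean, R, B_e, s_mean)
s_hat_0 = estimate_surface(responses[:, 0], R, B_s, e_hat)

# A second image: responses pulled halfway towards their mean, so the
# spatial average is unchanged while the individual responses differ.
responses_2 = r_mean[:, None] + 0.5 * (responses - r_mean[:, None])
e_hat_2 = estimate_illuminant(responses_2.mean(axis=1), R, B_e, s_mean)

# The same test response embedded in either image gets the same estimate.
r_test = responses[:, 0]
s_test_1 = estimate_surface(r_test, R, B_s, e_hat)
s_test_2 = estimate_surface(r_test, R, B_s, e_hat_2)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical convenience and does not change the result. In this idealized case the illuminant and surface are recovered exactly; with real images the accuracy would depend on how well the three conditions listed earlier are met.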
Colour constancy in the nearly natural image

The effect of the illuminant

To allow precise stimulus specification and control, many experiments that attempt to quantify colour constancy employ rather simple stimuli. One configuration that has been used extensively in recent years is a computer simulation of a scene consisting of flat matte surfaces seen under diffuse illumination (e.g. Arend and Reeves 1986; Troost and de Weert 1991; Brainard and Wandell 1992; Arend 1993; Bauml 1994, 1995; Lucassen and Walraven 1996). These stimuli are essentially instantiations of scenes from the Mondrian World. Recent experiments on colour appearance also employ closely related stimuli (e.g. Wesner and Shevell 1992; Singer and D’Zmura 1994; Jenness and Shevell 1995; Delahunt and Brainard 2000).

Figure 10.3 Room apparatus. Schematic of the experimental room. The dimensions of the experimental room were approximately 4 m × 3 m. Four triads of computer-controlled lights provided the ambient illumination. A projection colorimeter allowed adjustment of the colour appearance of a test patch located on the far wall of the room. (Adapted from Figure 1 of Brainard 1998.)

When Mondrian World scenes are simulated on monitors, however, they appear somewhat artificial. This is probably not due to problems of the simulation but rather to the fact that the scenes that match the Mondrian World assumptions are rare in nature and the
visual system may not treat them in the same way as it does natural images. Indeed, one can argue that seemingly simple scenes are very difficult for the visual system to parse. We might expect that before using colour statistics to estimate the illuminant, the visual system attempts to determine which regions are objects and which are light sources, which image variations represent illumination boundaries, and which represent variations in reflected light due to geometric factors (see Adelson 1999; Gilchrist et al. 1999). If this is the case, the processes that normally make these determinations may produce unstable or conflicting results when presented with impoverished stimuli. As a result, performance measured for simple stimuli could be much more difficult to understand than performance for stimuli which provide a rich set of cues. These considerations motivated us to study colour constancy using stimuli consisting of actual illuminated surfaces, configured in three dimensions. By doing so, we hoped to study constancy as it operates in natural viewing. In the work reported here, however, we focus on results obtained using scenes that are (approximately) uniformly illuminated. This simplifies the comparison of human and algorithmic performance, since it is not necessary to consider processes that segment the image into distinctly illuminated regions. The apparatus used in the first set of experiments is an entire room, shown schematically in Fig. 10.3 and described in detail elsewhere (Speigle and Brainard 1996; Brainard et al. 1997; Brainard 1998). The ambient illumination of the room is produced by three sets of computer-controlled stage lamps arranged in four triads. One set has red filters, one has
green filters, and one has blue filters. The light from each triad passes through a diffuser to minimize coloured shadows. By varying the intensities of the three sets of lamps, we can vary the spectral power distribution of the ambient illumination. A test surface on the far wall of the room is located so that it can be illuminated by a projection colorimeter. The illumination from the colorimeter consists of a mixture of red, green, and blue primaries. This illumination is focused and aligned so that it is spatially coincident with the test surface: it is not explicitly visible to the observer. The overall light reflected to the observer from the test surface thus consists of two components. The first is the normal reflection of the ambient illumination, while the second is generated by the colorimeter. Varying the intensity of the colorimeter primaries has the perceptual effect of changing the colour appearance of the test surface. Essentially, we have taken the stimulus configuration exploited by Gelb (Gelb 1950; see also Katz 1935; Koffka, 1935) and brought it under computer control (see also Uchikawa et al. 1989; Valberg and Lange-Malecki 1990; Kuriki and Uchikawa 1996, 1998). As noted above, asymmetric colour matching provides a convenient and natural experimental method for linking computational theory and human performance. This procedure is particularly well suited to studying colour constancy when there is a spatial change in the illumination (simultaneous colour constancy) so that the matches can be made between two surfaces that are viewed at the same time (e.g. Arend and Reeves 1986; Brainard 1997). It is also possible to use asymmetric matching to study colour constancy for the situation of interest here, uniformly illuminated scenes where the illuminant varies from one time to another (successive colour constancy; Brainard and Wandell 1991, 1992; Bauml 1995; Jin and Shevell 1996). 
In this case, however, the matches typically involve a memory component and are more difficult for observers. A simpler experimental task is to measure the achromatic locus by having observers adjust the chromaticity of a surface (or image region) until it appears achromatic (Helson and Michels 1948; Werner and Walraven 1982; Fairchild and Lennie 1992; Arend 1993; Bäuml 1994; Chichilnisky and Wandell 1996; Maloney and Yang, Chapter 11 this volume). This task is performed easily and reliably by even the most naive of observers. A direct comparison of asymmetric matching and achromatic adjustment in a simultaneous colour constancy experiment indicates that the two tasks tap the performance of the same visual mechanisms (Speigle and Brainard 1999).

We measured how the achromatic locus depends on changes in illumination. Figure 10.4 shows typical results. Each of the open circles shows the chromaticity of an experimental illuminant. Each of the corresponding closed circles shows the chromaticity of the achromatic locus, measured for one observer, under the corresponding illuminant. The achromatic loci were determined by averaging loci determined in separate sessions. The x and y standard errors of measurement for each locus are smaller than the plotted points.¹

¹ We verified that for our conditions the chromaticity of observers’ achromatic adjustments does not depend on luminance (Brainard 1998). This invariance does not hold in general (Helson and Michels 1948; Werner and Walraven 1982; Chichilnisky and Wandell 1996; see also Mausfeld and Niederee 1993; Mausfeld 1998; Delahunt and Brainard 2000) but is obeyed for decrements seen against uniform surrounds (Chichilnisky and Wandell 1996).
[Figure 10.4: plot of CIE y chromaticity against CIE x chromaticity (both axes 0.2–0.6) for Observer PW; legend: Achromatic, Illuminant.]
Figure 10.4 Basic achromatic results. The figure shows the CIE 1931 chromaticities of the achromatic loci (solid circles) measured under two experimental illuminants (chromaticity shown by open circles) for one observer. The between-session standard error of the mean is smaller than the plotted points. The maximum within-session standard deviation of the individual achromatic settings is indicated by the crosses at the upper left of the figure. (Adapted from Figure 3 of Brainard 1998.)
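The averaging across sessions and the between-session standard error mentioned in the text and caption can be sketched as follows. This is a minimal illustration only: the function name `summarize_locus` and the session values are hypothetical and are not taken from Brainard (1998).

```python
from math import sqrt
from statistics import mean, stdev

def summarize_locus(session_xy):
    """Return (mean_x, mean_y, sem_x, sem_y) for per-session (x, y) chromaticities."""
    xs = [x for x, _ in session_xy]
    ys = [y for _, y in session_xy]
    n = len(session_xy)
    # Between-session standard error of the mean: sample SD divided by sqrt(n).
    sem_x = stdev(xs) / sqrt(n)
    sem_y = stdev(ys) / sqrt(n)
    return mean(xs), mean(ys), sem_x, sem_y

# Hypothetical per-session achromatic settings, invented for illustration.
sessions = [(0.310, 0.330), (0.314, 0.326), (0.312, 0.328)]
mx, my, sx, sy = summarize_locus(sessions)
print(f"achromatic locus: x={mx:.3f} y={my:.3f} (SEM x={sx:.4f}, y={sy:.4f})")
```

With settings this tightly clustered, the standard errors are on the order of 0.001 in chromaticity, which is the kind of scale at which error bars would be hidden by the plotted symbols.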
The achromatic loci plotted are the chromaticities of the light reflected to the eye that appeared achromatic (i.e. the chromaticities of the proximal stimulus). To interpret the data in terms of colour constancy, consider the chromaticity of the light reflected from a surface that appears white under typical daylight. Such a surface has a reflectance spectrum that is nearly constant across wavelength, and thus the light reflected from it always has a chromaticity close to that of the illuminant. Figure 10.5 plots the chromaticity of the light reflected from a Munsell N 9.5/ surface under two illuminants. This surface appears achromatic when seen under the standard viewing conditions for which the Munsell system is defined, and for a colour-constant visual system it will continue to appear achromatic under other viewing conditions. Thus for a colour-constant visual system, the chromaticity of the achromatic locus should coincide with the chromaticity of the light reflected from this surface. We conclude that colour constancy is indicated when the chromaticities of the achromatic loci lie near those of the illuminants (see Fig. 10.5). This pattern is roughly what is seen in the data shown in Fig. 10.4.

It is possible to go from the data shown in Fig. 10.4 to a constancy index. The calculations are described in detail elsewhere (Brainard 1998). The index takes on a value of 0 for the case when the achromatic loci are unaffected by the illuminant (no constancy) and 1 when the achromatic loci track the illuminant perfectly (complete constancy). For intermediate cases, the index may be thought of as describing the extent to which the achromatic loci track the illuminant change. The value of the index for the data shown in Fig. 10.4 is 0.80, and the mean value across a wide range of conditions (different objects in the room,
[Figure 10.5: plot of CIE y chromaticity against CIE x chromaticity (both axes 0.2–0.6); legend: ‘White’ surface, Illuminant.]
Figure 10.5 Data expected for a colour-constant visual system. The figure plots the chromaticity of the light reflected from a Munsell N 9.5/ surface (solid circles) under two illuminants. The chromaticities of the illuminants are indicated by the open circles.
different illuminant changes) was 0.82 (Brainard 1998). Interestingly, this is more constancy than is typically seen in studies conducted with monitor displays. (Comparable indices are generally in the range 0.50–0.60; see Brainard and Wandell 1991; Fairchild and Lennie 1992; Brainard et al. 1993.) The relatively high constancy index shown by observers in our experiments is consistent with everyday experience: object colours do not change much with changes in illuminant. We believe that laboratory experiments employing the sort of nearly natural stimuli described above assess constancy as it operates in the real world.

Testing computational models

The experiment described above quantifies colour constancy across changes of illumination. It does not, however, tell us much about how the visual system achieves the measured constancy. In the experiment, the surfaces that make up the scene remain constant as the illuminant is varied. Such a design, almost ubiquitous in studies of colour constancy, eliminates from the stimulus ensemble the illuminant–surface ambiguity, described in the introduction, which makes constancy a difficult computational task. Indeed, most computational theories can predict good constancy under circumstances where the same collection of surfaces is viewed under an unknown illuminant. To test these theories it is necessary to conduct experiments where both the surfaces in the scene and the illuminants are varied.

To do so, we (Kraft and Brainard 1999) had observers look into a small (approximately 1 m × 1 m) chamber in which the spectrum of the illuminant and the spectral reflectance of all visible surfaces could be controlled independently. Figure 10.6 shows images of the chamber in two different configurations. Between the two, some of the objects in the chamber were changed, so that the mean surface reflectance (s̄) in the scene is quite different in the
Figure 10.6 Pictures of the experimental chamber when the spatial average has been equated. This plate shows pictures of the experimental chamber used by Kraft and Brainard (1999). Across the two images, both the illuminant and the surfaces in the scene have been changed. The two changes have a reciprocal effect, so that the spatial average of the L, M, and S cone quantal absorption rates is the same in both images. The images shown are rendered versions of hyperspectral images taken of the stimuli. The hyperspectral imaging system (Longère and Brainard 2001) provided 31 narrow-band (approximately 10 nm bandwidth at 10 nm spacing between 400 and 700 nm) images of the scene. The hyperspectral images were also used to determine the spatial average of the cone quantal absorption rates. (Adapted from Figure 1 of Kraft and Brainard 1999.) (See colour Plate 29 in the centre of this book.)
two cases. In addition, the illumination in the two configurations is different. The combined effect of the surface and illuminant manipulations is to make the spatial mean of the two images (r̄) identical. As with the experiments in the full room, the appearance of a test patch in the chamber could be adjusted through the use of the projection colorimeter. The observers’ task was again to adjust the chromaticity of the test patch until it appeared achromatic.

The prediction of Buchsbaum’s algorithm for our experimental situation is straightforward. Given that the spatial average of the two images is the same, the match-prediction hypothesis says that when two test patches seen in the respective images match in appearance, the light reflected to the eye should be the same. Achromatic adjustments do not establish complete perceptual matches. But it is plausible that each point on the achromatic locus measured in one image matches some point on the achromatic locus measured in the other image. Given that we find that the chromaticity of light that appears achromatic is independent of test luminance (see Footnote 1), we arrive at the prediction that the achromatic locus should have the same chromaticity when measured in the two images.

Figure 10.7 plots the achromatic loci measured for one observer in this experiment. The achromatic loci are significantly different from each other, as they were for three other observers (keep in mind that the standard errors for the achromatic loci are smaller than the plotted points; see Kraft and Brainard 1999). From this fact, we can conclude directly that the spatial average of the image is not the only statistic governing colour appearance. This,
[Figure 10.7: plot of CIE y chromaticity against CIE x chromaticity for Observer DHB; legend: Achromatic, Illuminant.]
Figure 10.7 Achromatic settings with spatial average equated. The format of the figure is the same as for Fig. 10.4. Here the achromatic settings were made in the context of two images where the illuminant differed (open circles) but the spatial average of the image was held constant. The between-session standard error of the mean is smaller than the plotted points. (Data are replotted from Figure 2 of Kraft and Brainard 1999.)
in turn, says that Buchsbaum’s algorithm cannot completely describe human performance. Perhaps it is worth noting that this result does not rule out the possibility that the algorithm would describe performance if the stimuli conformed strictly to the assumptions of the Mondrian World. The constancy index for the data shown in Fig. 10.7 is 0.29. The mean index for four observers in the same experiment was 0.39. These indices are considerably lower than the value of 0.82 found for the experiments conducted in the full room. The reduction is not due to the fact that observers were looking into a chamber rather than sitting in an entire room: control experiments with the chamber, where only the illuminant was varied, yielded constancy indices of about 0.83.
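The logic of this test can be made concrete with a small numerical sketch. It rests on loud simplifying assumptions that are not part of the original experiments: a toy diagonal model in which each cone response is an illuminant gain times a surface factor, invented scene values, and a simple projection-based `constancy_index` that merely stands in for the detailed calculation of Brainard (1998). All function names are hypothetical.

```python
from math import isclose

def render(illuminant, surfaces):
    """Toy diagonal model (assumption): response = illuminant gain * surface factor."""
    return [tuple(e * s for e, s in zip(illuminant, surf)) for surf in surfaces]

def spatial_mean(responses):
    """Equally weighted mean of the (L, M, S) responses over all pixels."""
    n = len(responses)
    return tuple(sum(p[k] for p in responses) / n for k in range(3))

def constancy_index(illum_1, illum_2, achro_1, achro_2):
    """Project the achromatic-locus shift onto the illuminant shift in (x, y).

    Returns 0 when the locus does not move and 1 when it moves exactly as
    far as the illuminant. A simplified stand-in, not the published index.
    """
    de = (illum_2[0] - illum_1[0], illum_2[1] - illum_1[1])
    da = (achro_2[0] - achro_1[0], achro_2[1] - achro_1[1])
    return (da[0] * de[0] + da[1] * de[1]) / (de[0] ** 2 + de[1] ** 2)

# Hypothetical scenes: illuminant and surfaces change reciprocally.
illum_a, surfaces_a = (1.0, 1.0, 1.0), [(0.2, 0.4, 0.6), (0.6, 0.4, 0.2)]
illum_b, surfaces_b = (2.0, 1.0, 0.5), [(0.1, 0.3, 0.9), (0.3, 0.5, 0.7)]
mean_a = spatial_mean(render(illum_a, surfaces_a))
mean_b = spatial_mean(render(illum_b, surfaces_b))
# The two images differ pixel by pixel, yet their spatial means coincide.
same_mean = all(isclose(a, b) for a, b in zip(mean_a, mean_b))

# Hypothetical chromaticities for the two illuminants and achromatic loci.
illum_1, illum_2 = (0.31, 0.33), (0.41, 0.40)
no_shift = constancy_index(illum_1, illum_2, (0.32, 0.33), (0.32, 0.33))
partial = constancy_index(illum_1, illum_2, (0.32, 0.33), (0.40, 0.386))
print(same_mean, no_shift, round(partial, 3))
```

In this sketch, a mechanism driven only by the spatial mean receives the same input from both scenes, so it predicts an unmoved achromatic locus and an index of 0. The measured indices of 0.29 and 0.39 reported above are what rule out the spatial mean as the sole governing statistic.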
Discussion

In this chapter we have emphasized the link between computational theories of colour constancy and human performance. In doing so, we have implicitly endorsed what Maloney refers to as the illuminant estimation hypothesis (Maloney and Yang, Chapter 11 this volume). This is the idea, central to the motivation here, that the visual system estimates the illuminant and that the estimate is used to govern the perception of surface colour (see also Speigle and Brainard 1996; Brainard et al. 1997; Mausfeld 1998; Gilchrist et al. 1999). The work reviewed here does not directly test the illuminant estimation hypothesis, since observers do not make any judgements of perceived illumination. Recent work (Rutherford 2000) suggests that the illuminant estimation hypothesis is at best an approximation (see also Beck 1959, 1961; Oyama 1968; Kozaki and Noguchi 1976; Noguchi and Kozaki 1985; Logvinenko and Menshikova 1994). Even if human surface colour appearance does not depend on an explicit illuminant estimate, we need not refrain from using computational
theory to develop and test models of which image statistics influence the perception of surface colour. Indeed, the models we have elaborated are designed to make predictions about asymmetric surface colour matches (or closely related measures of appearance). In this sense, they are agnostic about whether the visual system computes an estimate of illuminant or whether such an estimate plays a governing role in surface colour perception.

Our experimental logic can be used to show that a particular theory does not provide a complete description of human performance. In the case of Buchsbaum’s algorithm, we learn that something other than the spatial average of the cone responses in the image contributes to how the visual system processes colour information.² The experiments do not, however, rule out a role for the spatial average. Indeed, the fact that the constancy index is greatly reduced when the spatial average is held constant suggests that this statistic may play an important role in colour perception. A more definitive statement is not possible based on our experiments, since by silencing the spatial average we also affected other image statistics. Yang and Maloney (Yang 1999; Maloney and Yang, Chapter 11 this volume) have recently taken an empirical approach complementary to ours, where they make small perturbations to one image statistic while holding others constant. Experiments of this sort can be used to establish that particular statistics are used by the visual system.

A crucial feature of our experimental design is that we manipulate both the illuminant and surfaces in the scene. Without doing so, we could not match the spatial average in the image while at the same time changing the illuminant. This is a point of wide applicability. Most computational theories derive their estimate of the illuminant from specific scene statistics.
To test whether a particular theory provides a complete description of human performance, we can proceed by silencing the statistics used by that algorithm. To do so in a non-trivial manner, it is necessary to vary both the surfaces in the scene and the illuminant. To date, only a few other experimentalists have explored conditions where both the surfaces and illuminants varied (Gilchrist and Jacobsen 1984; McCann 1994; Kuriki and Uchikawa 1998; see also Gilchrist 1988). It is our opinion that further experiments where only the illuminant is varied are unlikely to advance our knowledge of constancy much beyond its current state. More experiments are needed where the essential ambiguity between surfaces and illuminants is restored to the experimental situation.

In addition to conducting the experiment described above, where the spatial average of the image was held constant across a change of illuminant, we have measured achromatic loci in a variety of other images where surfaces in the scene were varied across an illuminant

² We should note that theories that postulate that the spatial average is the statistic that sets the visual system’s effective estimate of the illuminant vary in terms of exactly how the average is computed. In our experiment, we matched the spatial average taken over image pixels, equally weighted. One can consider variants that weight distinct image regions identically (e.g. Gershon and Jepson 1989), that take a spatially weighted average for each local image region (e.g. Land 1986; see Brainard and Wandell 1986), and that use the geometric rather than the arithmetic average of the L-, M-, and S-cone responses (again Brainard and Wandell 1986; Land 1986). Strictly speaking, additional experiments would be needed to rule out all of these variants for the class of rich stimulus configurations we used. There are, however, a growing number of results for simpler laboratory images that make it difficult to adhere to any of these variants (Singer and D’Zmura 1994; Jenness and Shevell 1995; Brown and MacLeod 1997).

change. We will not review the particulars of these manipulations here; most are described in Kraft and Brainard (1999). Across the conditions we studied, constancy indices (mean across observers) varied considerably, ranging from 0.06 to 0.83. The lowest indices corresponded to spatially simple scenes where the surfaces were changed to reduce information about the illuminant change. The highest indices were obtained when the surfaces in the scene were held constant across an illuminant change. The variation of constancy index with experimental conditions emphasizes the fact that how well the visual system adjusts to a change of illuminant depends on the stimulus ensemble: when little information is available about the illuminant change, the visual system is not very colour constant.

We find it encouraging that we have found stimulus manipulations that cause the constancy indices to vary widely. This indicates that we have brought into the laboratory a set of factors that operate in rich images and that have a substantial impact on human performance. Identifying these factors more precisely and bringing them under parametric control should allow more systematic investigation of how colour appearance is governed in complex natural scenes.

Although our stimuli consisted of real illuminated three-dimensional objects, we did not manipulate the spatial structure of the scenes. The spatial structure (either actual or perceived) of a scene can affect colour appearance even when the colour statistics of the image are held fixed (Gilchrist 1977, 1980; Knill and Kersten 1991; Bloj et al. 1999). Such effects are not captured by the experiments and models described here. It is possible that for our stimulus configurations, the visual system takes advantage of cues such as specular highlights (Lee 1986; D’Zmura and Lennie 1986; Tominaga and Wandell 1989; see Yang 1999; Maloney and Yang, Chapter 11 this volume) and mutual illumination (Funt et al. 
1991; Funt and Drew 1993; see Bloj et al. 1999). Whether this is the case, or whether for our scenes the colour statistics alone provide most of the information used by the visual system, is an interesting and open question. Another simplified aspect of our scenes is that the illumination was close to spatially uniform. Thus the task of segmenting the image according to different illuminants has a particularly simple solution for our images. How such segmentation operates in images with multiple illuminants (simultaneous constancy) remains a central unsolved problem that is not addressed by our work. Recent theories (Adelson 1999; Gilchrist et al. 1999) have identified a number of heuristics that might guide the segmentation process. These theories also suggest that once the image has been segmented into separate regions, visual processing within regions is guided by the colour statistics or some summary of them. Our work focuses on exactly how the image statistics are used within uniformly illuminated regions. Within the context of these recent theories, our work is complementary to explorations of how the segmentation processes operate. It may be possible to quantify the relation between human performance and the information about the illuminant change that is actually available in a pair of images. Up to this point, we have considered computational theories as potential models for human performance. But computational models can also be used to provide a benchmark against which to compare human performance. This sort of analysis has been very successful in understanding data obtained from experiments that measure performance on objective psychophysical tasks such as detection and discrimination (e.g. Green and Swets 1966; Geisler 1989). In
[Figure 10.8: scatter plot of Human Index against Bayes Index (both axes –0.2 to 1.0).]
Figure 10.8 Comparison of human performance with Bayesian algorithm. The figure plots the constancy indices obtained for human observers against constancy indices obtained for the Bayesian algorithm of Brainard and Freeman (1997). The algorithm was run using points selected at random from calibrated LMS images of the stimulus. The image acquisition procedure is described in the caption for Fig. 10.6. The prior distribution for illuminants was constructed to match the range of illuminants that our apparatus could produce. The prior distribution of surfaces was obtained by analysing measurements of Munsell papers, as described in Brainard and Freeman (1997). The small negative constancy indices obtained in some cases occur because the illuminant estimate shifts slightly in a direction opposite to the actual illuminant change.
such applications, one predicts the performance of an ideal observer that uses all of the information in the stimulus optimally to perform some task. An ideal observer benchmark provides a principled method for evaluating how efficiently a real observer performs a particular task, and thus for identifying sites of information loss in visual processing.

Brainard and Freeman (1997) used Bayesian decision theory to develop an ideal observer for colour constancy in the Mondrian World. Their work assumes that in any scene, the surface and illuminant spectra are drawn at random from a population whose distribution is known. When the prior assumptions are met, the algorithm returns an estimate of the illuminant that is optimal, in the sense that it minimizes the expected illuminant estimation error.³ The Brainard and Freeman algorithm may be applied to each image for which Kraft and Brainard (1999) measured achromatic loci. We can compute a constancy index for the algorithm by treating the chromaticity of its illuminant estimates in the same way that we treat the achromatic loci measured for human observers. Figure 10.8 shows the constancy indices obtained for human observers plotted against the constancy indices obtained for the Bayesian algorithm. What is apparent in the plot is that there is a strong correlation between the human and Bayesian indices. If we take the performance of the Bayesian algorithm as

³ See Brainard and Freeman (1997) for a detailed description of exactly what error is minimized.

a measure of how much information is available for an ideal observer to estimate the illuminant, we see that the variation in human performance across the conditions is well explained by information differences between the various conditions. The slope of the regression line between the human and Bayes indices is 0.77. This could be taken as a measure of the degree of human constancy, relative to ideal performance, across the whole set of image manipulations. We do not wish to claim that the Brainard and Freeman (1997) algorithm provides a good model of human performance, even for stimulus configurations where the colour statistics alone drive the visual system’s estimate of the illuminant. A strong test of the particular algorithm requires that we apply the same logic as we developed earlier in the chapter: find two images for which the algorithm predicts the same illuminant estimate and then measure colour appearance for these two images. Doing so will require development of more sophisticated stimulus control techniques than we currently have at our disposal. The algorithm does, however, measure the information available from the colour statistics about the illumination change across a pair of images. It is therefore intriguing that the algorithm is able to make accurate predictions of how human performance varies across a wide range of experimental conditions.
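The regression slope used above as a summary of human constancy relative to the ideal observer is an ordinary least-squares fit of the human indices on the Bayes indices. The sketch below shows that computation only. The index pairs are synthetic, chosen to lie exactly on a line, and are not the data plotted in Fig. 10.8; the function name `regression_line` is hypothetical.

```python
def regression_line(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic (Bayes index, human index) pairs, one per hypothetical condition.
bayes_index = [0.0, 0.25, 0.5, 0.75, 1.0]
human_index = [0.1, 0.225, 0.35, 0.475, 0.6]  # constructed as 0.5 * bayes + 0.1
slope, intercept = regression_line(bayes_index, human_index)
print(f"slope={slope:.2f} intercept={intercept:.2f}")
```

A slope below 1 in such a fit means that human indices change less than the ideal observer's indices across conditions, which is the sense in which the reported slope of 0.77 can be read as a relative measure of constancy.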
Acknowledgements

Supported by NIE EY 10016. We thank E. Adelson, P. Delahunt, W. Freeman, and L. Maloney for useful discussions.
References

Adelson, E. H. (1999). Lightness perception and lightness illusions. In The New Cognitive Neurosciences (ed. M. Gazzaniga) (2nd edn), pp. 339–351. MIT Press, Cambridge, MA.
Arend, L. E. (1993). How much does illuminant color affect unattributed colors? Journal of the Optical Society of America A 10, 2134–2147.
Arend, L. E. and Reeves, A. (1986). Simultaneous color constancy. Journal of the Optical Society of America A 3, 1743–1751.
Arend, L. E., Reeves, A., Schirillo, J., and Goldstein, R. (1991). Simultaneous color constancy: papers with diverse Munsell values. Journal of the Optical Society of America A 8, 661–672.
Bäuml, K. H. (1994). Color appearance: effects of illuminant changes under different surface collections. Journal of the Optical Society of America A 11, 531–542.
Bäuml, K. H. (1995). Illuminant changes under different surface collections: examining some principles of color appearance. Journal of the Optical Society of America A 12, 261–271.
Beck, J. (1959). Stimulus correlates for the judged illumination of a surface. Journal of Experimental Psychology 58, 267–274.
Beck, J. (1961). Judgments of surface illumination and lightness. Journal of Experimental Psychology 61, 368–373.
Bloj, M., Kersten, D., and Hurlbert, A. C. (1999). Perception of three-dimensional shape influences colour perception through mutual illumination. Nature 402, 877–879.
Brainard, D. H. (1995). Colorimetry. In Handbook of optics: Vol. 1. Fundamentals, techniques, and design (ed. M. Bass), pp. 26.1–26.54. McGraw-Hill, New York.
Brainard, D. H. (1998). Color constancy in the nearly natural image. 2. achromatic loci. Journal of the Optical Society of America A 15, 307–325.
Brainard, D. H. and Freeman, W. T. (1997). Bayesian color constancy. Journal of the Optical Society of America A 14, 1393–1411.
Brainard, D. H. and Wandell, B. A. (1986). Analysis of the retinex theory of color vision. Journal of the Optical Society of America A 3, 1651–1661.
Brainard, D. H. and Wandell, B. A. (1991). A bilinear model of the illuminant’s effect on color appearance. In Computational models of visual processing (ed. M. S. Landy and J. A. Movshon). MIT Press, Cambridge, MA.
Brainard, D. H. and Wandell, B. A. (1992). Asymmetric color-matching: how color appearance depends on the illuminant. Journal of the Optical Society of America A 9, 1433–1448.
Brainard, D. H., Wandell, B. A., and Chichilnisky, E.-J. (1993). Color constancy: From physics to appearance. Current Directions in Psychological Science 2, 165–170.
Brainard, D. H., Brunt, W. A., and Speigle, J. M. (1997). Color constancy in the nearly natural image. 1. asymmetric matches. Journal of the Optical Society of America A 14, 2091–2110.
Brown, R. O. and MacLeod, D. I. A. (1997). Color appearance depends on the variance of surround colors. Current Biology 7, 844–849.
Buchsbaum, G. (1980). A spatial processor model for object colour perception. Journal of the Franklin Institute 310, 1–26.
Burnham, R. W., Evans, R. M., and Newhall, S. M. (1957). Prediction of color appearance with different adaptation illuminations. Journal of the Optical Society of America 47, 35–42.
Chichilnisky, E. J. and Wandell, B. A. (1996). Seeing gray through the on and off pathways. Visual Neuroscience 13, 591–596.
Cohen, J. (1964). Dependency of the spectral reflectance curves of the Munsell color chips. Psychonomic Science 1, 369–370.
D’Zmura, M. and Iverson, G. (1993). Color constancy. I. Basic theory of two-stage linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 10, 2148–2165.
D’Zmura, M. and Lennie, P. (1986). Mechanisms of color constancy. Journal of the Optical Society of America A 3, 1662–1672.
D’Zmura, M., Iverson, G., and Singer, B. (1995). Probabilistic color constancy. In Geometric representations of perceptual phenomena: Papers in honor of Tarow Indow’s 70th birthday (ed. R. D. Luce, M. D’Zmura, D. Hoffman, G. Iverson, and A. K. Romney), pp. 187–202. Lawrence Erlbaum Associates, Mahwah, NJ.
Delahunt, P. B. and Brainard, D. H. (2000). Control of chromatic adaptation: Signals from separate cone classes interact. Vision Research 40, 2885–2903.
DeMarco, P., Pokorny, J., and Smith, V. C. (1992). Full-spectrum cone sensitivity functions for X-chromosome-linked anomalous trichromats. Journal of the Optical Society of America A 9, 1465–1476.
Fairchild, M. D. and Lennie, P. (1992). Chromatic adaptation to natural and incandescent illuminants. Vision Research 32, 2077–2085.
Finlayson, G. D., Hubel, P. H., and Hordley, S. (1997). Color by correlation. Proceedings of the IS&T/SID Fifth Color Imaging Conference: Color Science, Systems, and Applications, Scottsdale, AZ, pp. 6–11.
Forsyth, D. A. (1990). A novel algorithm for color constancy. International Journal of Computer Vision 5, 5–36.
Funt, B. V. and Drew, M. S. (1993). Color space analysis of mutual illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 1319–1326.
Funt, B. V., Drew, M. S., and Ho, J. (1991). Color constancy from mutual reflection. International Journal of Computer Vision 6, 5–24.
Geisler, W. S. (1989). Sequential ideal-observer analysis of visual discriminations. Psychological Review 96, 267–314.
Gelb, A. (1950). Colour constancy. In Source book of gestalt psychology (ed. W. D. Ellis), pp. 196–209. The Humanities Press, New York.
Gershon, R. and Jepson, A. D. (1989). The computation of color constant descriptors in chromatic images. Color Research and Application 14, 325–334.
Gilchrist, A. L. (1977). Perceived lightness depends on perceived spatial arrangement. Science 195, 185.
Gilchrist, A. L. (1980). When does perceived lightness depend on perceived spatial arrangement? Perception and Psychophysics 28, 527–538.
Gilchrist, A. L. (1988). Lightness contrast and failures of constancy: A common explanation. Perception and Psychophysics 43, 415–424.
Gilchrist, A. and Jacobsen, A. (1984). Perception of lightness and illumination in a world of one reflectance. Perception 13, 5–19.
Gilchrist, A., Kossyfidis, C., Bonato, F., Agostini, T., Cataliotti, J. et al. (1999). An anchoring theory of lightness perception. Psychological Review 106, 795–834.
Green, D. M. and Swets, J. A. (1966). Signal detection theory and psychophysics. John Wiley and Sons, New York.
Helson, H. (1938). Fundamental problems in color vision. I. The principle governing changes in hue, saturation and lightness of non-selective samples in chromatic illumination. Journal of Experimental Psychology 23, 439–476.
Helson, H. and Jeffers, V. B. (1940). Fundamental problems in color vision. II. Hue, lightness, and saturation of selective samples in chromatic illumination. Journal of Experimental Psychology 26, 1–27.
Helson, H. and Michels, W. C. (1948). The effect of chromatic adaptation on achromaticity. Journal of the Optical Society of America 38, 1025–1032.
Hunt, R. W. G. (1950). The effects of daylight and tungsten light-adaptation on color perception. Journal of the Optical Society of America 40, 336–371.
Hurlbert, A. (1986). Formal connections between lightness algorithms. Journal of the Optical Society of America A 3, 1684–1694.
Hurlbert, A. C. (1998). Computational models of color constancy. In Perceptual constancy: Why things look as they do (ed. V. Walsh and J. Kulikowski), pp. 283–322. Cambridge University Press, Cambridge.
Jaaskelainen, T., Parkkinen, J., and Toyooka, S. (1990). A vector-subspace model for color representation. Journal of the Optical Society of America A 7, 725–730.
Jenness, J. W. and Shevell, S. K. (1995). Color appearance with sparse chromatic context. Vision Research 35, 797–805.
Jin, E. W. and Shevell, S. K. (1996). Color memory and color constancy. Journal of the Optical Society of America A 13, 1981–1991.
Judd, D. B., MacAdam, D. L., and Wyszecki, G. W. (1964). Spectral distribution of typical daylight as a function of correlated color temperature. Journal of the Optical Society of America 54, 1031–1040.
Katz, D. (1935). The world of colour. Kegan Paul, Trench, Trubner, London.
Knill, D. C. and Kersten, D. (1991). Apparent surface curvature affects lightness perception. Nature 351, 228–230. Koffka, K. (1935). Principles of gestalt psychology. Harcourt, Brace, New York. Kozaki, A. and Noguchi, K. (1976). The relationship between perceived surface-lightness and perceived illumination. Psychological Research 39, 1–16. Kraft, J. M. and Brainard, D. H. (1999). Mechanisms of color constancy under nearly natural viewing. Proceedings of the National Academy of Sciences USA 96, 307–312. Kuriki, I. and Uchikawa, K. (1996). Limitations of surface-color and apparent-color constancy. Journal of the Optical Society of America A 13, 1622–1636. Kuriki, I. and Uchikawa, K. (1998). Adaptive shift of visual sensitivity balance under ambient illuminant change. Journal of the Optical Society of America A 15, 2263–2274. Land, E. H. (1986). Recent advances in retinex theory. Vision Research 26, 7–21. Land, E. H. and McCann, J. J. (1971). Lightness and retinex theory. Journal of the Optical Society of America 61, 1–11. Lee, H. (1986). Method for computing the scene-illuminant chromaticity from specular highlights. Journal of the Optical Society of America A 3, 1694–1699. Logvinenko, A. and Menshikova, G. (1994). Trade-off between achromatic colour and perceived illumination as revealed by the use of pseudoscopic inversion of apparent depth. Perception 23, 1007–1023. Longère, P. and Brainard, D. H. (2001). Simulation of digital camera images from hyperspectral input. In Vision models and applications to image and video processing (ed. C. van den Branden Lambrecht), pp. 123–150. Kluwer, Dordrecht. Lucassen, M. P. and Walraven, J. (1993). Quantifying color constancy: evidence for nonlinear processing of cone-specific contrast. Vision Research 33, 739–757. Lucassen, M. P. and Walraven, J. (1996). Color constancy under natural and artificial illumination. Vision Research 36, 2699–2711. Maloney, L. T. (1986). 
Evaluation of linear models of surface spectral reflectance with small numbers of parameters. Journal of the Optical Society of America A 3, 1673–1683. Maloney, L. T. (1992). Color constancy and color perception: The linear models framework. In Attention and performance XIV: Synergies in experimental psychology, artificial intelligence, and cognitive neuroscience (ed. D. E. Meyer and S. E. Kornblum), MIT Press, Cambridge, MA. Maloney, L. T. (1999). Physics-based approaches to modeling surface color perception. In Color vision: From genes to perception (ed. K. R. Gegenfurtner and L. T. Sharpe), pp. 387–416. Cambridge University Press, Cambridge. Maloney, L. T. and Wandell, B. A. (1986). Color constancy: A method for recovering surface spectral reflectances. Journal of the Optical Society of America A 3, 29–33. Marr, D. (1982). Vision. W. H. Freeman, San Francisco. Mausfeld, R. (1998). Color perception: from Grassmann codes to a dual code for object and illumination colors. In Color vision: Perspectives from different disciplines (ed. W. G. K. Backhaus, R. Kliegl, and J. S. Werner), pp. 219–250. Walter de Gruyter, Berlin. Mausfeld, R. and Niederée, R. (1993). An inquiry into relational concepts of colour, based on incremental principles of colour coding for minimal relational stimuli. Perception 22, 427–462. McCann, J. J. (1994). Psychophysical experiments in search of adaptation and the gray world. Proceedings of IS&T’s 47th Annual Conference, Rochester, pp. 397–401.
McCann, J. J., McKee, S. P., and Taylor, T. H. (1976). Quantitative studies in retinex theory: A comparison between theoretical predictions and observer responses to the ‘Color Mondrian’ experiments. Vision Research 16, 445–458. Noguchi, K. and Kozaki, A. (1985). Perceptual scission of surface lightness and illumination: An examination of the Gelb effect. Psychological Research 47, 19–25. Oyama, T. (1968). Stimulus determinants of brightness constancy and the perception of illumination. Japanese Psychological Research 10, 146–155. Parkkinen, J. P. S., Hallikainen, J., and Jaaskelainen, T. (1989). Characteristic spectra of Munsell colors. Journal of the Optical Society of America A 6, 318–322. Romero, J., Garcia-Beltran, A., and Hernandez-Andres, J. (1997). Linear bases for representation of natural and artificial illuminants. Journal of the Optical Society of America A 14, 1007–1014. Rutherford, M. D. (2000). The role of illumination perception in color constancy. Ph.D. thesis, Department of Psychology. University of California, Santa Barbara. Singer, B. and D’Zmura, M. (1994). Color contrast induction. Vision Research 34, 3111–3126. Smith, V. and Pokorny, J. (1975). Spectral sensitivity of the foveal cone photopigments between 400 and 500 nm. Vision Research 15, 161–171. Speigle, J. M. and Brainard, D. H. (1996). Luminosity thresholds: Effects of test chromaticity and ambient illumination. Journal of the Optical Society of America A 13, 436–451. Speigle, J. M. and Brainard, D. H. (1999). Predicting color from gray: The relationship between achromatic adjustment and asymmetric matching. Journal of the Optical Society of America A 16, 2370–2376. Stiles, W. S. (1967). Mechanism concepts in colour theory. Journal of the Colour Group 11, 106–123. Tominaga, S. and Wandell, B. A. (1989). The standard surface reflectance model and illuminant estimation. Journal of the Optical Society of America A 6, 576–584. Troost, J. M. and de Weert, C. M. (1991).
Naming versus matching in color constancy. Perception and Psychophysics 50, 591–602. Trussell, H. J. and Vrhel, M. J. (1991). Estimation of illumination for color correction. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing Society, 2513–2516. Toronto, Canada May 14–17, 1991. Uchikawa, K., Uchikawa, H., and Boynton, R. M. (1989). Partial color constancy of isolated surface colors examined by a color naming method. Perception 18, 83–91. Valberg, A. and Lange-Malecki, B. (1990). ‘Color constancy’ in mondrian patterns: A partial cancellation of physical chromaticity shifts by simultaneous contrast. Vision Research 30, 371–380. Wandell, B. A. (1987). The synthesis and analysis of color images. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-9, 2–13. Werner, J. S. and Walraven, J. (1982). Effect of chromatic adaptation on the achromatic locus: The role of contrast, luminance and background color. Vision Research 22, 929–944. Wesner, M. F. and Shevell, S. K. (1992). Color perception within a chromatic context: changes in red/green equilibria caused by noncontiguous light. Vision Research 32, 1623–1634. Wyszecki, G. and Stiles, W. S. (1982). Color science: Concepts and methods, quantitative data and formulae. John Wiley & Sons, New York. Yang, J. N. (1999). Illuminant estimation in surface color perception. Ph.D. thesis, Department of Psychology, New York University.
commentary: colour constancy: developing empirical tests
Commentaries on Brainard, Kraft, and Longère
Surface colour perception and its environments
Laurence T. Maloney
The chapter by Brainard and colleagues begins with a brief summary of the computational literature on colour constancy and ends with a partial summary of the elegant and important experimental work on colour perception undertaken by Brainard and his colleagues over the past decade. The intent of the chapter is not simply to juxtapose model and experiment, but to emphasize the importance of the interplay between experimentation and theory in colour science, and it fulfils its intention very well. In comparison with many other subdivisions of visual perception, colour science is in particular need of theory. There are (at least) two reasons why this is so. First of all, when we study how well human observers judge properties of the environment, such as shape, depth, and separation (length), we usually know, or can determine, the correct answer to any question that we pose to the observer. Put a bit more formally, we have agreed-upon measurement procedures (e.g. a ruler) that allow us to determine which of two lengths is longer, or which of two objects is further away. We do not expect to disagree with other observers concerning judgements of length (Asch 1956) and we readily resolve differences between what we see and what we measure in favour of the measurements, explaining the discrepancy as due to a visual illusion (Coren and Girgus 1978). In such studies, theory still plays a role, and an important one, but its role is primarily to explain the patterns of deviation from what our measurement procedures tell us is ‘ground truth’. For colour perception, we typically don’t know what counts as the right answer. We don’t have measuring devices to tell us the (true) colour of an object; some researchers (e.g.
Brown, Chapter 8 this volume) even reject the possibility that we could ever identify measurable, physical properties of objects that correspond to the subjective experience of colour. Consequently, the study of colour is typically framed in terms of invariances or constancies: the experimenter doesn’t know what colour a homogeneous object ‘should’ be, but has the intuition that whatever it might be, it should remain the same under changes of illumination in the scene. If we were challenged to justify the claim that colour is invariant under a scene transformation, we could not do so. We would reply to the same challenge in the case of length by simply verifying that measured length remained invariant under the specified transformation. It is interesting, then, that many of the computational theories of colour constancy that Brainard and colleagues mention begin with explicit models of light–surface interaction that include parametric descriptions of surface spectral reflectance functions. The ensuing colour constancy algorithm is a recipe for estimating these parametric surface descriptors, wholly or in part. These descriptors are, of course, measurable properties of the hypothetical surfaces postulated within the framework of each model, and the analogue of colour perception for these models is the explicit estimation of properties of surface in the environment. While such estimation theories noisily compete to describe human colour judgements, they quietly agree that colour is the subjective correlate of unspecified physical properties of surfaces (for a review, see Maloney 1999). The first contribution of theory to the study of colour perception, then, is development of explicit models of what might count as the physical properties corresponding to colour. Implicit in the structure of such a theory is a claim that there is no fundamental difference between colour, on the one hand, and length or shape, on the other. 
We are simply less familiar with the rules governing colour in our world. If we eventually conclude that no estimation theory is an adequate description of human colour perception, then we will likely gain insight into the radical difference between perceptual attributes, such as length, that have agreed-upon measurement procedures, and perceptual attributes, such as colour, that do not. The study of colour vision is in need of theory for a second reason. The geometric structure of the environment around us is extremely well described as a Euclidean geometry, inside the laboratory and out. Three numbers characterize a location, and relations such as collinearity, orthogonality, parallelism, and so on are so well mirrored by ordinary geometry that we can pass from computational
description to physical measurement and back with confidence. In contrast, we know of no accurate parametric descriptions of lights and surfaces in the natural environment that require as few as three parameters to characterize a light, and three parameters to characterize a surface. That is not to say that there are no models of lights and of surfaces that provide excellent approximations (Maloney 1999), but that a critical observer will likely be able to detect the difference between a natural environment and an approximation based on any three-parameter models of lights and surfaces. Further, the computational theories we have so far require that descriptions of light and surface use no more than three free parameters (Maloney 1999; MacLeod and Golz, Chapter 7 this volume). Consequently, the models are designed to operate in abstractions or idealizations of the natural environment that a human observer can discriminate from the natural environment, at least under some circumstances. I term these abstractions environments (Maloney, Chapter 9 this volume). If we wish to test such a theory as a model of human performance, we can usefully divide our task into two. First, we examine the human observer’s performance when placed in the idealized environment assumed by the theory. Typically, the prediction is that the observer will have perfect colour constancy across the range allowed by the environment. The model of MacLeod and Golz (Chapter 7 this volume) is an exception, in that it predicts failures of lightness constancy even within its environment. Then we can examine the behaviour of the model outside of its environment and compare it with the performance of the observer in the same altered environment. The key in both cases is to simulate accurately an environment composed of idealizations of lights and surfaces specified mathematically. This sort of experiment is described in Maloney and Yang (Chapter 11 this volume).
I argue that this approach is the correct way to test the kinds of computational models that have been developed in the past two decades. In conclusion, then, our lack of understanding of the physics of light–surface interaction in the environment requires a tighter link between theory and experiment than in other areas of perception. While Brainard and colleagues would likely agree with this general conclusion, they have taken a different tack in dealing with the uncertainty surrounding the proper environment for the study of human colour perception. They have developed a series of carefully controlled three-dimensional environments that they refer to as ‘nearly natural’. These approximations, constructed of known surface materials and illuminants, allow them to measure human colour constancy performance under something like natural viewing conditions. Since the idealized environments accompanying computational models of colour constancy are invariably intended as approximations of the natural environment, the nearly natural environment cannot be too far away from their normal ‘operating range’. As Brainard and colleagues note, the very high degree of colour constancy they find under some experimental conditions, and the very low colour constancy they find under others, is a strong indication that they have built an environment appropriate for the study of surface colour perception. And yet I would argue that the future belongs to accurate simulations of arbitrary ‘unnatural’ environments, environments chosen to match the assumptions of a particular theory under test. The technology needed to do this sort of experiment is complex: it includes high-intensity binocular display devices with a wide spectral gamut, as well as accurate computer graphics, rendering software for simulating light–surface interactions in complex, three-dimensional scenes. 
Once this sort of equipment is readily available, it should be possible to explore the match between human colour performance and computational theories systematically.
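The idea of an idealized environment, in which every light and every surface is specified by three parameters, can be made concrete with a small sketch. The code below is an illustration only, not a reconstruction of any particular model discussed in the commentary: the five sample wavelengths, the basis functions, and the cone sensitivities are made-up toy numbers, and the function names are hypothetical. It shows how cone responses follow from a three-parameter light and a three-parameter surface.

```python
# Toy "environment": lights and surfaces are weighted sums of three fixed
# basis functions, sampled at five wavelengths. All numbers are illustrative
# assumptions, not measured spectra.
LIGHT_BASIS = [
    [1.0, 1.0, 1.0, 1.0, 1.0],      # flat component
    [-0.4, -0.2, 0.0, 0.2, 0.4],    # blue-to-red tilt
    [0.2, -0.1, -0.2, -0.1, 0.2],   # curvature
]
SURFACE_BASIS = [
    [0.5, 0.5, 0.5, 0.5, 0.5],
    [-0.2, -0.1, 0.0, 0.1, 0.2],
    [0.1, -0.05, -0.1, -0.05, 0.1],
]
# Hypothetical long-, medium-, and short-wavelength cone sensitivities.
CONE_SENSITIVITIES = [
    [0.0, 0.2, 0.9, 0.8, 0.2],
    [0.1, 0.6, 1.0, 0.4, 0.0],
    [1.0, 0.3, 0.0, 0.0, 0.0],
]


def weighted_sum(weights, basis):
    """Spectrum obtained by mixing the basis functions with the given weights."""
    n = len(basis[0])
    return [sum(w * b[i] for w, b in zip(weights, basis)) for i in range(n)]


def cone_responses(light_weights, surface_weights):
    """Cone responses to a surface (3 parameters) under a light (3 parameters)."""
    light = weighted_sum(light_weights, LIGHT_BASIS)
    reflectance = weighted_sum(surface_weights, SURFACE_BASIS)
    return [
        sum(e * s * r for e, s, r in zip(light, reflectance, sens))
        for sens in CONE_SENSITIVITIES
    ]


if __name__ == "__main__":
    # The same surface under two different three-parameter lights.
    surface = (1.0, 0.5, 0.0)
    print(cone_responses((1.0, 0.0, 0.0), surface))
    print(cone_responses((1.0, 0.5, 0.0), surface))
```

An experiment of the first kind described above would present stimuli generated entirely inside such a model; an experiment of the second kind would add components that the three basis functions cannot represent.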
References
Asch, S. E. (1956). Studies of independence and conformity. A minority of one against a uniform majority. Psychological Monographs 70 (9). Coren, S. and Girgus, J. S. (1978). Seeing is deceiving: The psychology of visual illusions. Lawrence Erlbaum, Hillsdale, NJ. Maloney, L. T. (1999). Physics-based approaches to modeling surface color perception. In Color vision: From genes to perception (ed. K. R. Gegenfurtner and L. T. Sharpe), pp. 387–418. Cambridge University Press, Cambridge.
Commentaries on Brainard, Kraft, and Longère
Comparing the behaviour of machine vision algorithms and human observers
Vebjørn Ekroll and Jürgen Golz
The indefiniteness in the concept of colour lies, above all, in the indefiniteness of the concept of the sameness of colours, i.e. of the method of comparing colours. Wittgenstein (1977)
The inner coherence of the previously published work of Brainard and his research associates1 makes it tempting to speak of a concentrated research effort which seems to be less well defined by basic assumptions than by its dedication to tracking them down, being explicit about them, and subjecting them to empirical test. Their present contribution is no exception in this regard. In this case Brainard and colleagues isolate an assumption which is implicit in a large body of research on colour constancy, formalize it, and discuss its implications. Although this is not always recognized, interdisciplinary cross-talk between psychologists and psychophysicists, on the one hand, and computer vision scientists and physicists, on the other, is not only impeded by the fact that we speak different academic languages, but also by the fact that these disciplines are not separate due to mere historical chance. They are separate mainly because they have different realms of phenomena as their subject, and any vaguely promising attempt to bridge the gap between them is worthy of our attention. The match-linking proposition hypothesis proposed by Brainard and colleagues represents a conceptual clarification which eases the comparison of the predictions of colour constancy algorithms and human performance. Interesting implications of this linking proposition are deduced in an elegant and conclusive manner, and it is clear that the linking proposition makes a large number of interesting theoretical questions accessible to empirical investigation.
The highly commendable explicitness of the analysis by Brainard and colleagues also makes it a very interesting target for critical discussion. Due to the important implications of the match-linking proposition hypothesis, it should be considered carefully whether the assumptions upon which it is based can be regarded as correct in any given experimental situation. The basis for the linking proposition of Brainard et al. (see p. 314) is the assumption that there exists a function f() which relates estimated surface reflectances ŝ (the algorithm output) to perceived colours, σ. This ensures that if the estimated surface reflectances ŝ(A) and ŝ(B) corresponding to two surfaces A and B are equal, their perceived colours σ(A) and σ(B) must also be equal. (The assumption that the function f() is one-to-one ensures that the converse is also true.) It is assumed implicitly that the perceived colours, σ, which are the basis for human matching behaviour, are the appropriate counterparts to estimated surface reflectances ŝ. This implicit assumption seems to be a very plausible one, but it is by no means certain that it is correct. And if it is not, application of the match-linking proposition hypothesis may lead to erroneous or misleading conclusions, as will become clear in the following example. A priori, it is not clear whether a colour match between two patches made by a human observer was made on the basis of perceived surface colour or some other variable. An obvious alternative is unasserted colour, a term introduced by Arend (1994) and defined as the chromatic counterpart of brightness, that is, an aspect of perceived colour which is presumed to be more elementary and prior to any parsing into surface and illumination performed by the visual system.
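The linking proposition discussed above can be written compactly. The following is only a restatement in the commentary's own symbols, with \hat{s} denoting the estimated surface reflectance and \sigma the perceived colour; the layout as three numbered lines is our choice, not that of Brainard et al.

```latex
\begin{align}
  \sigma(X) &= f\bigl(\hat{s}(X)\bigr) \quad \text{for every surface } X, \\
  \hat{s}(A) = \hat{s}(B) \;&\Longrightarrow\; \sigma(A) = \sigma(B), \\
  f \text{ one-to-one:} \quad \sigma(A) = \sigma(B) \;&\Longrightarrow\; \hat{s}(A) = \hat{s}(B).
\end{align}
```

The second line follows from the first simply because f is a function; the third requires the additional assumption that f is one-to-one. Both implications presuppose that the quantity observers actually match on is \sigma.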
1 Among the more recent work: Brainard et al. (1997); Brainard (1998); Kraft and Brainard (1999); Speigle and Brainard (1999); Delahunt and Brainard (2000); Kraft et al. (2002); Rutherford and Brainard (2002).
In order to ascertain that subjects actually base their matches on perceived surface colour, and not on any other aspects which are not intended in investigations of colour constancy, they may be appropriately instructed. This is known to have a substantial effect on the matches made (Arend 1994). However, even when clear
instructions are given, asymmetric colour matches are difficult to make. The subjective difficulties associated with making asymmetric colour matches are well known (Katz 1911; Gelb 1929; Whittle 1994a,b), although it is less well understood why they appear. An immediately plausible explanation would be that these difficulties are due to the subjective uncertainties which may be associated with the comparably impoverished and artificial stimuli which are typically employed in such experiments (see p. 315). As Brainard et al. note, there are many good reasons for studying more natural images than the typical cathode ray tube (CRT) displays. The subjective difficulty of making asymmetric matches gives a further good reason; if the subjective matching problems are due to the artificiality of the stimulus, they ought to disappear when more realistic stimuli are used, since they would be more likely to trigger a unique perceptual parsing into illumination and surface reflectance components. Interestingly, this is not the case, as is clear from the comments made by Brainard et al. (1997) in their study using a nearly naturalistic stimulus set-up. They state: ‘The observers were able to set reliably what they regarded as the best match. At this match point, however, the test and the match surfaces looked different, and the observers felt as if further adjustments of the match surface should produce a better correspondence. Yet turning any of the knobs or combinations of knobs only increased the perceptual difference. We verified that the observers’ adjustments near the best match were not limited by the gamut of our apparatus.’ They suggest the following explanation for this phenomenon: ‘One intriguing possibility is that our color experience at a location is described by more than three variables. 
This is possible if the influence of the illuminant (or, more generally, of the viewing context) has the effect of changing the perceptual representation of color in a way that cannot be compensated for simply by varying the tristimulus coordinates at a single location. Such an effect might be expected if the visual system uses color to code both surface and illuminant identity.’ This interpretation is supported by a recent analysis made by Niederée (1998), in which it is deduced from standard assumptions that a perceptually complete colour code for stimuli as simple as infield-surround configurations must be at least four-dimensional. If the visual system ‘uses color to code both surface and illuminant identity’ the function f() relating algorithm output to perceived colour would probably be more appropriately assumed to depend on both the estimated surface reflectance ŝ and the estimated illumination î, thus challenging the rationale for the match-linking proposition hypothesis. In this case, equal estimated surface reflectances obviously do not imply equal perceived colours. The subjective matching problem and the presumably related fact that the perceived colour cannot be adequately represented by a three-dimensional colour code, even in simple stimulus configurations (Katz 1911; Evans 1949, 1964, 1974; Niederée 1998; see also Ekroll et al. 2002a,b, for a related phenomenon), should be taken seriously, not only as interesting phenomena in their own right, but also because our failure to understand them deprives us of a fuller understanding of results from the very promising research programme suggested and pursued by Brainard et al. A preliminary strategy that could contribute to our understanding of these phenomena would be to investigate under which experimental conditions the subjective matching problems are particularly prominent, and under which conditions they are less prominent or even absent, as suggested by Bäuml (1999).
It is, for instance, interesting to note that they have been reported to be absent in experiments with haploscopically superimposed displays (Whittle 1994a, b). However, it also seems imperative to develop ideas about which functional role the higher dimensionality of perceived colour might play. Some ideas about this can be found in Mausfeld (Chapter 13 this volume) and Niederée (1998). Although we have focused on a problematic aspect of the match-linking hypothesis proposed by Brainard and colleagues, there is every reason to pay close attention to their research. The explicitness and empirical rigour of the approach is, in our opinion, bound to enhance our understanding of colour perception one way or another. An aspect of Brainard and colleagues’ research which we have not addressed, but are particularly enthusiastic about, is the study of more naturalistic scenes under laboratory conditions. This line of research is likely to further our understanding of colour perception
for several reasons, some of them mentioned by the authors themselves. A simple point demonstrating the value of this research is that, as far as results from artificial displays and naturalistic scenes differ, this discrepancy may draw our attention to factors and cues influencing colour perception which have been overlooked by present theory (Kraft et al. 2002; Logvinenko et al. 2002). And, as already noted, the experimental study of naturalistic displays will ultimately show whether problems and phenomena often attributed to the artificiality of the experimental stimulus may be discarded as such, or rather reflect problems of our theoretical concepts. Of course, naturalness of stimuli does not imply that the tasks subjects are asked to perform (e.g. asymmetric matching), and which seem natural on the basis of our—potentially misleading—theoretical preconceptions about what the visual system does or aims to do (e.g. estimate reflectances), are in fact natural with respect to the internal structure of the visual system (cf. Mausfeld, Chapter 13 this volume).
References
Arend, L. (1994). Surface colors, illumination, and surface geometry: Intrinsic-image models of human color perception. In Lightness, brightness and transparency (ed. A. L. Gilchrist), pp. 159–213. Lawrence Erlbaum Associates, Hillsdale, NJ. Bäuml, K. H. (1999). Simultaneous color constancy: How surface color perception varies with the illuminant. Vision Research 39(8), 1531–1550. Brainard, D. H. (1998). Color constancy in the nearly natural image. 2. Achromatic loci. Journal of the Optical Society of America A 15(2), 307–325. Brainard, D. H., Brunt, W. A., and Speigle, J. M. (1997). Color constancy in the nearly natural image. I. Asymmetric matches. Journal of the Optical Society of America A 14(9), 2091–2110. Delahunt, P. B. and Brainard, D. H. (2000). Control of chromatic adaptation: signals from separate cone classes interact. Vision Research 40(21), 2885–2903. Ekroll, V., Faul, F., Niederée, R., and Richter, E. (2002a). Das natürliche Zentrum der Chromatizitätsebene ist nicht immer achromatisch: Ein neuer methodischer Zugang zur Untersuchung relationaler Farbkodierung. Poster presented at the 5th Tübinger Wahrnehmungs-Konferenz. Available online at http://www.psychologie.uni-kiel.de/~vekroll/research.htm. Ekroll, V., Faul, F., Niederée, R., and Richter, E. (2002b). The natural centre of chromaticity space is not always achromatic: A new look on colour induction. Proceedings of the National Academy of Sciences USA 99(20), 13352–6. Evans, R. M. (1949). On some aspects of white, gray and black. Journal of the Optical Society of America 39(9), 774–779. Evans, R. M. (1964). Variables of perceived colour. Journal of the Optical Society of America 54(12), 1467–1474. Evans, R. M. (1974). The perception of color. Wiley, New York. Gelb, A. (1929). Die ‘Farbenkonstanz’ der Sehdinge. In Handbuch der normalen und pathologischen Physiologie (ed. A. Bethe, G. v. Bergmann, G. Embden, and A. Ellinger), pp.
594–687. Springer, Berlin. Katz, D. (1911). Die Erscheinungsweisen der Farben und ihre Beeinflussung durch die individuelle Erfahrung. Barth, Leipzig. (English translation of later, revised, edition available as Katz, D. (1935). The world of colour. Kegan Paul, Trench, Trubner, London.) Kraft, J. M. and Brainard, D. H. (1999). Mechanisms of color constancy under nearly natural viewing. Proceedings of the National Academy of Sciences USA 96(1), 307–312.
Kraft, J. M., Maloney, S. I., and Brainard, D. H. (2002). Surface-illuminant ambiguity and color constancy: effects of scene complexity and depth cues. Perception 31, 247–263. Logvinenko, A. S., Kane, J., and Ross, D. A. (2002). Is lightness induction a pictorial illusion? Perception 31, 73–82. Niederée, R. (1998). Die Erscheinungsweisen der Farben und ihre stetigen Übergangsformen: Theoretische und Experimentelle Untersuchungen zur relationalen Farbkodierung und zur Dimensionalität vollständiger perzeptueller Farbcodes. Postdoctoral thesis, Philosophische Fakultät der Christian-Albrechts-Universität, Kiel. Rutherford, M. D. and Brainard, D. H. (2002). Lightness constancy: A direct test of the illumination estimation hypothesis. Psychological Science 13, 142–149. Speigle, J. M. and Brainard, D. H. (1999). Predicting color from gray: The relationship between achromatic adjustment and asymmetric matching. Journal of the Optical Society of America A 16(10), 2370–2376. Whittle, P. (1994a). Contrast brightness and ordinary seeing. In Lightness, brightness and transparency (ed. A. L. Gilchrist), pp. 111–158. Lawrence Erlbaum Associates, Hillsdale, NJ. Whittle, P. (1994b). The psychophysics of contrast brightness. In Lightness, brightness and transparency (ed. A. L. Gilchrist), pp. 35–110. Lawrence Erlbaum Associates, Hillsdale, NJ. Wittgenstein, L. (1977). Remarks on colour. Basil Blackwell, Oxford.
chapter 11
THE ILLUMINANT ESTIMATION HYPOTHESIS AND SURFACE COLOUR PERCEPTION
Laurence T. Maloney and Joong Nam Yang
Preface
This chapter, and the work described in it, is the direct consequence of writing an earlier chapter (Maloney 1999) concerning approaches to surface colour perception. There are a number of algorithms, the goal of which is to compute surface descriptors analogous to colour and to be able to do so despite changes in scene illumination and layout. Any of these algorithms could, in principle, be taken as a model for human colour processing and, long ago, it seemed worthwhile to organize them for presentation to a community of researchers interested primarily in biological colour vision. For the most part, the algorithms shared a common structure: (1) estimate the spectral properties of the illuminant across the scene; (2) correct the retinal information corresponding to each surface patch in the scene. This sort of approach is familiar to colour scientists. The notion that we estimate the chromaticity of the illuminant and then discount it from apparent surface colour dates at least to Helmholtz. The algorithms differed primarily in their approaches to estimating the chromaticity of the illuminant. Each employed a different ‘trick’, based on assumptions about the physics of light and surface, to estimate illuminant chromaticity. One could imagine standing in a scene and pointing successively to the illuminant cues employed by each of the algorithms: specular highlights, shadow edges, inter-reflections in corners, and, of course, the light source itself, if it were visible. It’s very natural to ask which of these cues (if any) affect human colour perception. When Nam and I first went through the literature, there was no conclusive evidence that human colour visual processing made use of any specific cue to the illuminant. Nam chose to pick a few candidate cues and test whether they could influence human colour perception, as described here.
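The two-step structure shared by these algorithms, (1) estimate the illuminant and (2) correct the retinal information for each patch, can be sketched in a few lines. The sketch below is an assumption-laden illustration, not any specific algorithm from the chapter: step 1 uses the familiar 'gray world' trick (the scene average stands in for the illuminant), step 2 uses a diagonal, von Kries-style scaling, and the function names and cone-response triples are hypothetical.

```python
def estimate_illuminant(patches):
    """Step 1 (assumed 'gray world' trick): use the mean response in each
    cone class across the scene as the illuminant estimate."""
    n = len(patches)
    return tuple(sum(p[k] for p in patches) / n for k in range(3))


def discount_illuminant(patches, illuminant):
    """Step 2 (assumed von Kries-style correction): divide each patch's
    cone responses by the illuminant estimate, class by class."""
    return [tuple(p[k] / illuminant[k] for k in range(3)) for p in patches]


def surface_descriptors(patches):
    """Run both steps and return one descriptor triple per patch."""
    return discount_illuminant(patches, estimate_illuminant(patches))


if __name__ == "__main__":
    # Hypothetical cone responses (L, M, S) for three patches under light 1.
    scene_1 = [(0.2, 0.4, 0.6), (0.8, 0.4, 0.2), (0.5, 0.6, 0.1)]
    # Under light 2 each cone class is scaled by a different gain.
    gains = (1.5, 1.0, 0.5)
    scene_2 = [tuple(p[k] * gains[k] for k in range(3)) for p in scene_1]
    print(surface_descriptors(scene_1))
    print(surface_descriptors(scene_2))  # same descriptors up to rounding
```

In this toy case the descriptors are unchanged by the illuminant change because the change is a pure per-class scaling and the scene average is a perfect cue. Replacing `estimate_illuminant` with an estimate based on another cue (specular highlights, say) leaves step 2 untouched, which is the sense in which the algorithms differ mainly in their first step.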
Determining what illuminant cues are employed in human vision is evidently a central problem for the field. It’s likely that the visual system employs more than one and, as a consequence, the estimation of illuminant chromaticity is a cue combination problem, open to investigation using the sort of models and experimental methods employed in studying depth and shape cue combination.
L. T. Maloney and J. Nam Yang
Introduction
In experiments concerning depth perception, the experimenter typically knows the correct answer for every trial. Real or simulated objects are placed at a known distance from the experimental observer, and he or she is asked to estimate absolute depth or judge relative depth. A summary of the observer’s performance begins with a description of how accurate the observer’s judgements were; how close the observer came to the correct response. We know that depth perception is a complex process, that the observer makes use of multiple depth cues (Kaufman 1974), and even that the observer may use different depth cues in different scenes (Landy et al. 1995). In contrast, in studying surface colour perception, we still have relatively little idea of how human observers estimate the surface properties that correspond to colour (Maloney 1999), or what these surface properties might be (Maloney, Chapter 9 this volume). Previous research indicates that observers make roughly the same colour judgements when they view the same surfaces in different contexts, a phenomenon known as colour constancy. Reports of colour constancy lead us to suspect that human observers are estimating surface properties that are just as objective as the depth or dimensions of objects in a scene, but we do not yet know how we achieve the degree of colour constancy that we do. The degree of surface colour constancy that we experience depends on viewing conditions: under some circumstances, we have essentially none (Helson and Judd 1936), and under others, we show a remarkable, nearly perfect, degree of constancy (Brainard et al. 1997; Brainard 1998). Unqualified assertions that we have ‘approximate colour constancy’ (e.g. Hurvich 1981, p. 199) are misleading.
If we are to understand colour vision under circumstances where the colours assigned to surfaces are little affected by changes in illumination, then we need to examine why we succeed at assigning invariant colour descriptors to surfaces under some conditions and fail dramatically under others. What do some scenes have, that other scenes don’t, that enhances colour constancy? If we asked an analogous question concerning depth vision, we could answer it with some confidence: in scenes with few or no depth cues, human perception of depth will fail. Even in scenes with useful depth cues, human observers will still fail if early visual processing misinterprets them or fails to use them, two sorts of errors that lead to visual illusions (Coren and Girgus 1984). In this chapter, we consider an analogous explanation for failures (and successes) of surface colour perception, based on a model of surface colour perception proposed by Maloney (1999). A key step in this model is estimation of the colour of the illuminant (or equivalent information) at each point in a scene. This idea is scarcely new: we find it in embryo in Helmholtz (1909/1962, Vol. 2, p. 287), and its clearest modern expression is the ‘dual-code’ hypothesis of Mausfeld (1998; Chapter 13 this volume). Maloney (1999) goes on to propose an explicit mechanism for estimating the illuminant by combining multiple illuminant cues, by analogy to depth cue combination. He describes possible illuminant cues taken from the computational literature (two of which we will describe in detail below) but leaves open the question of which cues are used in human vision.
An evident implication of this illuminant estimation hypothesis is that the number and strength of illuminant cues present in a scene limit the degree of colour constancy possible: little colour constancy is possible in scenes devoid of illuminant cues. If a colour visual system fails to make use of the cues available, we would also expect errors in surface colour perception as a consequence. In this chapter, we will first describe the illuminant estimation hypothesis in detail and discuss some of the candidate cues to the illuminant found in the computational literature. Then we will describe recent empirical tests of the illuminant estimation hypothesis that lead to the conclusion that the human visual system makes use of multiple illuminant cues, not all of which are present in every scene. We will also present evidence suggesting that the visual system does not always make use of illuminant cues that are present in a scene.
The illuminant estimation hypothesis

Notation

The colour signal that comes to the eye contains information about light and surface reflectance in the scene. The initial data available to the visual system are simply the excitations of photoreceptors at each location, xy, in the retina:

$$\rho_k^{xy} = \int E(\lambda)\, S^{xy}(\lambda)\, R_k(\lambda)\, d\lambda, \qquad k = 1, 2, 3. \tag{11.1}$$
Here, $S^{xy}(\lambda)$ is used to denote the surface spectral reflectance function of a surface patch imaged on retinal location xy, $E(\lambda)$ is the spectral power distribution of the light incident on the surface patch, and $R_k(\lambda)$, k = 1, 2, 3 are the photoreceptor sensitivities, all indexed by wavelength $\lambda$ in the electromagnetic spectrum.¹ The visual system is assumed to contain photoreceptors with three distinct sensitivities (k = 1, 2, 3), although, of course, at most one photoreceptor can be present at a single retinal location. $E(\lambda)$ and $S^{xy}(\lambda)$ are, in general, unknown, while the $R_k(\lambda)$, k = 1, 2, 3 are taken to be known. Figure 11.1 illustrates this simplified model of surface colour perception. Any visual system that is perfectly colour constant (Fig. 11.1) must somehow invert Equation 11.1, transforming photoreceptor excitations into surface colour descriptors that depend only on $S^{xy}(\lambda)$. Any visual system that is nearly colour constant must compute an accurate approximation to this inverse.

¹ Equation 11.1 is a simplification of the physics of light–surface interaction. We are ignoring the effects of changes in the positions of light source or sources, the surface normal to the patch, and the position of the eye (see Maloney 1999).

Environments and algorithms

Without further constraints on the problem, Equation 11.1 cannot be inverted, and the problem cannot be solved, even approximately (Ives 1912; Sällström 1973). How, then,
Figure 11.1 A simplified model of surface colour perception. [Diagram labels: illuminant $E(\lambda)$; surface reflectances $S^{xy}(\lambda)$; photoreceptor excitations, $R_k(\lambda)$, k = 1, 2, 3.] $S^{xy}(\lambda)$ is used to denote the surface spectral reflectance function of a surface patch imaged on retinal location xy, $E(\lambda)$ is the spectral power distribution of the light incident on the surface patch, and $R_k(\lambda)$, k = 1, 2, 3 are the photoreceptor sensitivities, all indexed by wavelength $\lambda$ in the electromagnetic spectrum. Light is absorbed and re-emitted by the surface toward the eye, where the retinal image is sampled spatially and spectrally. Under conditions of colour constancy, the eventual perception of surface colour must depend primarily on the surface $S^{xy}(\lambda)$ and not on the illuminant $E(\lambda)$.
is colour constancy, approximate or exact, ever possible for a visual system like ours? In the past 20 years, a number of researchers have sought to develop models of biologically plausible, colour constant visual systems (for reviews, see Hurlbert 1998; Maloney 1999). For our purposes, we can think of each model as containing: (1) a mathematical description of an idealized world (referred to as an environment by Maloney 1999) and (2) an algorithm that can be used to compute invariant surface colour descriptors within the specified environment. The statement of the environment, of course, comprises the constraints that make it possible to invert Equation 11.1, and the algorithm is a recipe for doing just that. Given an algorithm embodied in a visual system, biological or artificial, and viewing conditions that satisfy the environmental assumptions of the algorithm, we would expect that the surface colour estimates returned by the visual system would be colour constant. Once removed from its environment, the algorithm may fail partially or completely (of course, as noted above, human colour constancy also fails dramatically under some viewing conditions). An active area of research concerns the match, or lack of match, between mathematically described environments, and particular subsets of the terrestrial environment where we suspect that human surface colour perception is constant, or nearly so (Maloney 1986; Parkkinen et al. 1989; van Hateren 1993; Vrhel et al. 1994; Romero et al. 1997; Bonnardel and Maloney 2000; for a review, see Maloney, Chapter 9 this volume). This article is less concerned with environments than with the algorithms corresponding to them.
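As a minimal numerical sketch of Equations 11.1 and 11.2, the integral can be approximated by a sum over sampled wavelengths. Everything specific below is an assumption made for illustration: the bell-shaped sensitivities, the two illuminants, and the ramp reflectance are made-up curves rather than measured data, and the function names are hypothetical.

```python
import math

# Wavelength samples in nm; a coarse 10 nm grid is an illustrative choice.
WAVELENGTHS = list(range(400, 701, 10))
STEP_NM = 10.0

def bell(lam, peak, width=40.0):
    """Toy bell-shaped curve standing in for a receptor sensitivity."""
    return math.exp(-0.5 * ((lam - peak) / width) ** 2)

# Hypothetical sensitivities R_k(lambda), k = 1, 2, 3 (not measured cone data).
RECEPTORS = [[bell(lam, peak) for lam in WAVELENGTHS] for peak in (560, 530, 430)]

def excitations(illuminant, reflectance):
    """Riemann-sum approximation to Equation 11.1 for one surface patch."""
    return [
        STEP_NM * sum(e * s * r for e, s, r in zip(illuminant, reflectance, rk))
        for rk in RECEPTORS
    ]

# Made-up spectra sampled on WAVELENGTHS.
flat_light = [1.0 for _ in WAVELENGTHS]
reddish_light = [0.5 + (lam - 400) / 600.0 for lam in WAVELENGTHS]
surface = [0.2 + 0.6 * (lam - 400) / 300.0 for lam in WAVELENGTHS]
perfect_reflector = [1.0 for _ in WAVELENGTHS]

rho_flat = excitations(flat_light, surface)
rho_reddish = excitations(reddish_light, surface)
# Equation 11.2: viewing the illuminant directly corresponds to S(lambda) = 1.
rho_E = excitations(reddish_light, perfect_reflector)

print("same surface, flat light:   ", [round(x, 2) for x in rho_flat])
print("same surface, reddish light:", [round(x, 2) for x in rho_reddish])
print("illuminant viewed directly: ", [round(x, 2) for x in rho_E])
```

The same surface yields different excitation triples under the two lights, which is the ambiguity that any colour constant system has to resolve.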
Two-stage algorithms

Many recent algorithms have a common structure: first,² information concerning the illuminant spectral power distribution is estimated. This information is usually equivalent to knowing how photoreceptors would respond if directly stimulated by the illuminant without an intervening surface (Maloney 1999). This illuminant estimate is then used to invert Equation 11.1 to obtain invariant surface colour descriptors, typically by using a method developed by Buchsbaum (1980). The algorithms differ from one another primarily in how they get information about the illumination. There are currently algorithms that make use of surface specularity³ (D’Zmura and Lennie 1986; Lee 1986), shadows (D’Zmura 1992), mutual illumination (Drew and Funt 1990), reference surfaces (Brill 1978; Buchsbaum 1980), subspace constraints (Maloney and Wandell 1986; D’Zmura and Iverson 1993a,b), scene averages (Buchsbaum 1980), and more (Maloney 1999). An evident conclusion is that there are many potential cues to the illuminant in everyday, three-dimensional scenes.

² Some of the algorithms compute estimates of illuminant information and surface colour descriptors simultaneously, rather than successively. It does no violence to these algorithms to describe them as sequential.

³ Specularity is defined more precisely below. It is the ‘shiny’ component of certain surfaces, surfaces that can have highlights.

Cue combination

Of course, many of the cues just listed may be absent from a particular scene, or very weak. A scene without specular objects, for example, provides no specular information concerning the illuminant. Given that there are several possible cues to the illuminant, not all of which need be present in every scene, it is natural to consider illuminant estimation as a cue combination problem, analogous to cue combination in depth/shape vision (see Landy et al. 1995 for a review of depth/shape cue combination). This idea did not originate with Maloney (1999): Kaiser and Boynton (1996, p. 521), for example, suggest that illuminant estimation is best thought of as combination of information from multiple illuminant cues. Brainard and colleagues (Brainard et al. 1997; Brainard 1998) note that the patterns of errors in surface colour estimation are those to be expected if the observer incorrectly estimates scene illumination and then discounts the illuminant using the incorrect estimate (‘the equivalent illuminant’ in their terms). Their observation supports the hypothesis that the observer is explicitly estimating the illuminant at each point of the scene. As we noted before, Mausfeld and colleagues (Mausfeld 1998) advanced the hypothesis that the visual system explicitly estimates illuminant and surface colour at each point in a scene, their ‘dual code hypothesis’. Many of the linear model algorithms reviewed by Maloney (1999), taken as models of human vision, presuppose this ‘dual code hypothesis’.

In this chapter, we examine in detail how the human visual system might form illuminant estimates. The goal is to develop a plausible model of human surface colour perception as a process that develops an estimate of the ambient illuminant at each point in a scene by combining multiple cues to the illuminant. Of course, it is important to test this model and to determine which cues are significant in human vision. We view the current state of this model by analogy with depth and shape vision in the middle of the nineteenth century,
where researchers were certainly aware that there were multiple, possible depth cues, but also uncertain as to which were used in human visual processing.
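The two-stage structure described above can be sketched in a few lines. This is a deliberately simplified illustration and not the method of Buchsbaum (1980): stage 1 uses a scene-average cue under the assumption that surfaces average to a known grey, stage 2 swaps in plain per-channel (diagonal) scaling to discount the illuminant, and the toy `render` function assumes excitations are a per-channel product of surface and illuminant. All names and numbers are hypothetical.

```python
def estimate_illuminant(scene, assumed_mean=0.5):
    """Stage 1: scene-average cue, assuming surfaces average to a known grey."""
    n = len(scene)
    return [sum(patch[k] for patch in scene) / (n * assumed_mean) for k in range(3)]

def discount(scene, illuminant_hat):
    """Stage 2: divide each channel by the illuminant estimate (diagonal scaling)."""
    return [[patch[k] / illuminant_hat[k] for k in range(3)] for patch in scene]

def render(surfaces, illuminant):
    """Toy image formation: per-channel product of surface and illuminant."""
    return [[s[k] * illuminant[k] for k in range(3)] for s in surfaces]

# Three toy surfaces whose channel means are all 0.5, so the grey assumption holds.
surfaces = [[0.2, 0.5, 0.8], [0.8, 0.5, 0.2], [0.5, 0.5, 0.5]]

scene_a = render(surfaces, [1.0, 1.0, 1.0])
scene_b = render(surfaces, [1.4, 1.0, 0.6])

descriptors_a = discount(scene_a, estimate_illuminant(scene_a))
descriptors_b = discount(scene_b, estimate_illuminant(scene_b))

print(descriptors_a)
print(descriptors_b)
```

Within this toy environment the descriptors agree across the two illuminants. Replacing the surfaces with a set whose average is not grey breaks the stage 1 assumption and the descriptors drift, which mirrors the point that an algorithm may fail once removed from its environment.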
Illuminant estimation as cue combination

Preliminaries

Consider the following simple model of illuminant estimation: each of several cues (specularity, etc.) is used to estimate the illuminant parameters, which we denote as $\rho^E = (\rho_1^E, \rho_2^E, \rho_3^E)$, where

$$\rho_k^E = \int E(\lambda)\, R_k(\lambda)\, d\lambda, \qquad k = 1, 2, 3, \tag{11.2}$$
are the photoreceptor excitations for each class of photoreceptor when directly viewing the illuminant, referred to as the chromaticity of the illuminant. One obvious way to gain information about illuminant chromaticity is to look directly at the light sources in a scene. Correct use of this direct viewing cue presupposes that the visual system can determine that particular items in the visual field are sources of illumination, and that it can also sort out which surfaces are illuminated by which illuminants, no easy task. Bloj et al. (1999) report evidence suggesting that the visual system has some representation of how light ‘flows’ from surface to surface in a three-dimensional scene. We do not yet know whether a direct viewing cue is employed in human vision under any circumstances. We denote an estimate of the illuminant based on a direct viewing cue by $\hat\rho^{DV} = (\hat\rho_1^{DV}, \hat\rho_2^{DV}, \hat\rho_3^{DV})$. The hat (‘∧’) symbol is commonly used in statistics to denote ‘an estimate of’, and we’ll use it in that sense.

If a visual system cannot obtain a direct view of the light sources, then it must develop an estimate, $\hat\rho^E = (\hat\rho_1^E, \hat\rho_2^E, \hat\rho_3^E)$, of these parameters⁴ indirectly. The various algorithms above are methods for computing an estimate $\hat\rho^E = (\hat\rho_1^E, \hat\rho_2^E, \hat\rho_3^E)$ when certain assumptions about the scene are satisfied (the environment). In this chapter we will report experimental tests of two candidate cues based on specularity, one we refer to as the specular highlight cue, the other, as the full surface specularity cue. The illuminant estimates based on these cues are denoted $\hat\rho^{SH} = (\hat\rho_1^{SH}, \hat\rho_2^{SH}, \hat\rho_3^{SH})$ and $\hat\rho^{FSS} = (\hat\rho_1^{FSS}, \hat\rho_2^{FSS}, \hat\rho_3^{FSS})$ respectively. We postpone defining the latter until later in the chapter and discuss only the former here.

⁴ Many of the algorithms estimate parameters that are equivalent to these ‘direct-view’ parameters, in the sense that one set of parameters can be computed from the other. The choice of parameterization is not important here, but is important when we come to the linear combination rule described further on.

Of course, we want to estimate the illuminant at each point in a scene, and it may vary from point to point. For our purposes, though, we can imagine that, for the remainder of this chapter, we are interested in one specific point in a scene and are trying to estimate the illumination impinging on it. In many scenes there are ‘highlights’ on curved surfaces that typically correspond to illuminants present in the scene. If we trust that a particular highlight is not distorting the colour of the light source, and that the reflected light source is the source of the illumination
of a part of the scene, we can readily imagine that the photoreceptor excitations of the highlight, $\tilde\rho^{SH} = (\tilde\rho_1^{SH}, \tilde\rho_2^{SH}, \tilde\rho_3^{SH})$, are a useful estimate of $\rho^E$, the illuminant parameters (we will explain why the estimate is marked with a tilde ‘∼’, and not a hat ‘∧’, in just a moment).

The illuminant estimation hypothesis

Figure 11.2 is a diagram illustrating the cue combination process. It is similar to a model of depth and shape cue combination proposed by Maloney and Landy (Maloney and Landy 1989; Landy et al. 1995). Explicit cues to the illuminant are derived from the visual scene and, eventually, combined by a weighted average at the extreme right, after two intervening stages, labelled promotion and dynamic reweighting, explained next. The final rule of combination can be written as

$$\hat\rho^E = \alpha_{DV}\,\hat\rho^{DV} + \alpha_{SH}\,\hat\rho^{SH} + \alpha_{FSS}\,\hat\rho^{FSS} + \cdots \tag{11.3}$$
The $\alpha$s are scalar weights, between 0 and 1, that express the importance of each of the cues in the estimation process. The cue estimates shown correspond to the hypothetical cues discussed above: direct viewing (DV), specular highlights (SH), and full surface specularity (FSS). If, for example, the direct viewing cue is not used in human vision, then $\alpha_{DV} = 0$. Experimental tests of the hypothesis $\alpha_{DV} = 0$, and similar hypotheses for other cues, serve as a formalism that allows us to decide whether a cue is, in fact, used in human vision ($\alpha_{DV} > 0$). Of course, there may be other cues to the illuminant, beyond these (Maloney 1999). We are by no means claiming that any of these cues are active in human vision. Before describing
Figure 11.2 Illuminant cue combination. [Diagram labels: Scene; Uniform background; Specular highlights; Full surface specularity; Cue promotion; Dynamic reweighting; Illuminant estimate.] In the illuminant cue combination model of Maloney (1999), distinct illuminant cues are extracted from the scene via illuminant estimation modules, analogous to depth modules in depth perception. The different sources of information concerning the illuminant are promoted to a common format (see text) and then combined by a weighted average, whose weights may vary from scene to scene as the availability and quality of ‘illuminant cues’ vary.
how we carry out such tests, we need to say a bit about dynamic reweighting and promotion in Fig. 11.2.

Dynamic reweighting

There may be no shadows, no specularity, or no mutual illumination between objects in any specific scene. The illuminant may be in the current visual field (directly viewed), or not. We may not bother to look around and find it in a given scene. In the psychophysical laboratory, we can guarantee that any, or all, of the cues above are absent or present as we choose. If human colour vision made use of only one cue to the illuminant, then, when that cue was present in a scene, we would expect a high degree of colour constancy and, when that cue was absent, a catastrophic failure of colour constancy. Based on past research, it seems unlikely that there is any single cue whose presence or absence determines whether colour vision is colour constant. An implication for surface colour perception is that the human visual system may make use of multiple cues and different cues in different scenes. The relative weight assigned to different estimates of the illuminant from different cue types may also change. Landy et al. (1995) report empirical tests of this claim, which imply that depth cue weights do change in readily interpretable ways.

In particular, consider the sort of experiment where almost all cues to the illuminant are missing. The observer views a large, uniform surround (Fig. 11.3a) with a single test region superimposed. The observer will set the apparent colour of the test region under instruction from the experimenter, and it is plausible that the only cue to the illuminant available is the uniform chromaticity of the surround. In very simple scenes, observers typically behave as if the chromaticity of the surround were the chromaticity of the illuminant (for discussion, see Maloney 1999). If we rewrite Equation 11.3, this time explicitly including the uniform background cue (UB),

$$\hat\rho^E = \alpha_{DV}\,\hat\rho^{DV} + \alpha_{SH}\,\hat\rho^{SH} + \alpha_{FSS}\,\hat\rho^{FSS} + \alpha_{UB}\,\hat\rho^{UB} + \cdots, \tag{11.4}$$
then an intelligent choice of weights for the scene of Fig. 11.3a is $\alpha_{DV} = \alpha_{SH} = \alpha_{FSS} = 0$ and $\alpha_{UB} = 1$, consistent with the behaviour of the human visual system. Consider, in contrast, the more complicated scene in Fig. 11.3b. There is still a large, uniform background, but there are other potential cues to the illuminant as well, notably the specular highlights on the small spheres. Will the observer continue to use only the chromaticity of the uniform background, or will he or she also make use of the chromaticity of the specular highlights as well? Will the influence of the uniform background on colour appearance decrease when a second cue is available? Will $\alpha_{SH}$ or $\alpha_{FSS}$ be greater than 0 and $\alpha_{UB}$ less than 1?

Cue promotion

A second, and surprising, analogy between depth cue combination and illuminant estimation, is that not all cues to the illuminant provide full information about the illuminant parameters $\rho^E = (\rho_1^E, \rho_2^E, \rho_3^E)$. Some of the methods lead to estimates of $\rho^E$ up to an
Figure 11.3 Dynamic reweighting. (a) A scene with only one illuminant cue (uniform background). (b) A scene with two illuminant cues, the second based on surface specularity. [Diagram labels in both panels: Scene; Background; Highlights; Others...; Illuminant estimate.]
unknown multiplicative scale factor. The same is, of course, true of depth cue combination, where certain depth cues (such as relative size) provide depth information up to an unknown multiplicative scale factor. By analogy with Maloney and Landy (1989), we refer to such cues as illuminant cues with missing parameters. A cue that provides an estimate of $\rho^E$ up to an unknown scale factor is an illuminant cue missing one parameter, the scale factor. If the missing parameter or parameters can be estimated from other sources, the illuminant cue with missing parameters can be promoted to an estimate of the illuminant parameters, $\rho^E$. The problem of combining depth cues, some of which have missing parameters, is termed cue promotion by Maloney and Landy (1989) and is treated further by Landy et al. (1995). In terms of notation, variables with a ‘tilde’ ($\tilde\rho^E$) denote unpromoted estimates of the illuminant, variables with a ‘hat’ ($\hat\rho^E$) denote the same estimate after promotion. In this chapter, we will not be further concerned with cue promotion.

We began with the question, what do some scenes have, that others don’t have, that enhances colour constancy? The answer we propose, in the spirit of the illuminant estimation hypothesis, is that some scenes are rich in accurate illuminant cues, and the visual system makes use of them, leading to accurate estimates of illuminant chromaticity and a high degree of colour constancy. Other scenes, including the sort of scene represented in Fig. 11.3a, contain few cues to the illuminant and we would not expect that the visual system could arrive at accurate estimates of illuminant chromaticity or surface colour. In fact, Yang and Shevell (2003) showed that scenes with fewer illuminant cues led to poorer colour constancy. They compared two scenes: one with fewer cues to the illuminant than the other, but both with the same amount of illumination. The scene with more cues to the illuminant had a higher degree of colour constancy.
Many of the algorithms described by Maloney (1999) can be identified with potential cues to the illuminant as noted above. What the cues to the illuminant employed in human vision are, and how they are combined, remain open questions. In the following sections of this chapter we first define two cues based on surface specularity and then describe novel experimental methods that allow one to measure which illuminant cues are influencing human surface colour perception.
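The combination rule of Equation 11.4, together with cue promotion and dynamic reweighting, can be sketched as follows. This is an illustration under stated assumptions rather than a model of human vision: the base weights are hypothetical numbers, an absent cue is represented by `None`, the weights of the cues that remain are renormalized to sum to 1, and the missing scale factor of an unpromoted estimate is borrowed from the total excitation of another, fully specified estimate.

```python
def promote(rho_tilde, reference):
    """Fill in the missing scale of rho_tilde using a fully specified estimate."""
    scale = sum(reference) / sum(rho_tilde)
    return [scale * x for x in rho_tilde]

def combine(estimates, weights):
    """Weighted average in the spirit of Equation 11.4.

    Cues absent from the scene (value None) receive weight 0, and the
    remaining weights are renormalized to sum to 1 (an assumed rule).
    """
    usable = {
        cue: w for cue, w in weights.items()
        if estimates.get(cue) is not None and w > 0
    }
    total = sum(usable.values())
    if total == 0:
        raise ValueError("no usable illuminant cue in this scene")
    return [
        sum(w / total * estimates[cue][k] for cue, w in usable.items())
        for k in range(3)
    ]

BASE_WEIGHTS = {"UB": 0.5, "SH": 0.3, "FSS": 0.2}  # hypothetical values

rho_hat_ub = [2.0, 2.0, 2.0]    # uniform background estimate
rho_tilde_sh = [0.7, 0.5, 0.3]  # highlight estimate, known only up to scale

# Scene like Fig. 11.3a: the uniform background is the only cue.
scene_a = {"UB": rho_hat_ub, "SH": None, "FSS": None}
# Scene like Fig. 11.3b: a specular highlight cue is also available.
scene_b = {"UB": rho_hat_ub, "SH": promote(rho_tilde_sh, rho_hat_ub), "FSS": None}

estimate_a = combine(scene_a, BASE_WEIGHTS)
estimate_b = combine(scene_b, BASE_WEIGHTS)
print(estimate_a)
print(estimate_b)
```

In the single-cue scene the effective weight on the uniform background is 1, matching the ‘intelligent choice’ of weights discussed for Fig. 11.3a; when the highlight cue is added, the estimate moves part of the way toward the highlight chromaticity.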
Two cues based on surface specularity

The two cues make use of surface specularity as, in effect, a mirror that can be used to view the illuminant directly. The first cue, specular highlight, uses the photoreceptor excitations corresponding to one or more specular highlights in a scene as an estimate of the photoreceptor excitations to be expected if the visual system could view the illuminant directly, the illuminant chromaticity, $\rho^E = (\rho_1^E, \rho_2^E, \rho_3^E)$, defined above. Of course, if the visual system is to use this cue it must, explicitly or implicitly, determine which parts of a scene, if any, contain specular highlights. Mistaking a red stop light for a specular highlight can lead to remarkable failures of colour constancy (and a traffic citation). There are clues, however, that help discriminate a specularity from a light source or just a light-coloured surface patch. If the reflecting surface is convex and the observer moves his head, a specularity will ‘follow’ the surface. Specularities, unlike light-coloured surface patches, are virtual images of light sources and, viewed binocularly, do not have
the disparities corresponding to the surface on which they are formed (Blake and Bülthoff 1990). If you gaze at any convenient specularity and close first one eye and then the other, you will see the specularity jump with respect to the surface underlying it. Its disparity corresponds to the full optical path from the eye to the light source by way of the surface, not to just the distance to the surface. In summary, the visual system can gather some information that helps it to identify specular highlights in a scene. Yet not all specularities serve as spectrally neutral mirrors: gold and copper, for example, are highly specular when polished, but the photoreceptor excitations corresponding to specularities in a golden mirror are not those described by Equation 11.2 (Nassau 1982; Wyszecki and Stiles 1982, pp. 55–56). If a visual system makes use of a specular highlight cue, we might suspect that it can also detect some of the boundary conditions where specular highlights are not useful cues to the illuminant, and suppress the cue accordingly by setting the corresponding weight to 0. The second cue considered, full-surface specularity, was proposed independently by Lee (1986) and D’Zmura and Lennie (1986). It makes use of surface specularity information concerning the illuminant, but does not restrict attention to specular highlights or require that specular highlights be perfect mirrors reflecting the illuminant. It is particularly useful when applied to surfaces that are only slightly specular (e.g. human skin) and where the specular information in a highlight is contaminated by the non-specular components of the surface. The algorithm is an elegant method for separating specular information from non-specular information by comparison across multiple surfaces. One of the environmental assumptions underlying the D’Zmura–Lennie–Lee cue is that the spectral characteristics of surfaces are accurately described by a model due to Shafer (1985). 
In the Shafer model, a surface reflectance is a superposition of an idealized matte surface (‘matte’) and a perfect neutral mirror (‘specular’):

$$S(\lambda) = \alpha S^{*}(\lambda) + \beta \tag{11.5}$$
where α and β are non-negative ‘geometric’ scale factors that vary with the relative position of the light source and the eye, and S ∗ (λ) is the surface spectral reflectance function of the matte surface for some fixed choice of viewing geometry (see D’Zmura and Lennie 1986; Maloney 1999). The geometric scale factors are further constrained so that S(λ) is a valid surface reflectance function with values between 0 and 1 at every wavelength (when α is large relative to β, the surface will look like a piece of coloured blotting paper, when β is large relative to α, the surface will look like a mirror). The key idea in the algorithm proposed by Lee and D’Zmura and Lennie is that, for any extended surface under near-punctate illumination, α and β will naturally vary as the angles from the eye and from the light source to different points on the surface patch vary. This variation is enough to allow estimation of the contribution of the specular component uncontaminated by the matte component (indicated in Fig. 11.4), by, in effect, constructing a virtual mirror in which the eye may view the illuminant. The full-surface specularity cue is available even for objects that are only slightly specular, such as human skin. The most specular point on a face may still be an evident mixture of the colour of the illuminant and the colour of the underlying matte component of the face.
Figure 11.4 The full surface specularity cue of D’Zmura–Lennie–Lee. [Diagram labels: Matte 1; Matte 2; Specular; Surface 1; Surface 2.] The elegant algorithm developed independently by Lee (1986) and D’Zmura and Lennie (1986) makes use of two surfaces with distinct matte components and appreciable degrees of specularity to arrive at an estimate of the colour of the illuminant not contaminated by the colour of either matte component. All of the chromaticities for a single surface fall on a plane determined by the chromaticity of the matte component and the chromaticity of the specular component. Two surfaces with distinct matte components determine planes that intersect in a line containing the chromaticity of the specular component common to both. This specular component is the desired information about the illuminant.
The D’Zmura–Lennie–Lee approach can be used to estimate photoreceptor excitations corresponding to the illuminant in conditions where the specular highlight cue would give a seriously misleading estimate. One peculiarity of the D’Zmura–Lennie–Lee algorithm is that there must be at least two surfaces available in the scene with matte surface reflectance functions that are distinct (specifically, not proportional). In a scene with many Shafer objects, all with the same ‘colour’ (matte surface reflectance function), the D’Zmura–Lennie–Lee cue is not available. We will use this fact in designing the experiments reported below. The Shafer model is inaccurate as a description of certain naturally occurring surfaces (Lee et al. 1990) but it is not known how well it approximates surfaces in the everyday environment. It is, however, an accurate approximation of a large class of surfaces known as dielectrics, that includes plastics. This is the model that we use in rendering all of the objects used as stimuli. Consequently, our stimuli satisfied the environmental assumptions for both specularity cues.
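The geometric idea behind the full surface specularity cue can be sketched with noise-free toy data. Under the Shafer model (Equation 11.5), substituting into Equation 11.1 gives excitations of the form α·m + β·ρ^E for each surface, where m is the excitation vector of the matte component; these lie on a plane through the origin, and two such planes intersect along ρ^E. The sketch below assumes exactly two noise-free samples per surface and uses cross products; it is not the published algorithm, which must cope with noisy data, and all names and numbers are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]

def shafer_excitations(matte, specular, geometry):
    """Excitations alpha * matte + beta * specular for each (alpha, beta) pair."""
    return [[a * matte[k] + b * specular[k] for k in range(3)] for a, b in geometry]

def illuminant_chromaticity(samples_1, samples_2):
    """Intersect the two surface planes; return the direction normalized to sum 1."""
    normal_1 = cross(samples_1[0], samples_1[1])
    normal_2 = cross(samples_2[0], samples_2[1])
    direction = cross(normal_1, normal_2)
    if max(abs(x) for x in direction) < 1e-12:
        raise ValueError("planes coincide: matte components are not distinct")
    total = sum(direction)  # dividing by the sum also removes the sign ambiguity
    return [x / total for x in direction]

rho_E = [1.4, 1.0, 0.6]               # hypothetical illuminant excitations
matte_1 = [0.6, 0.2, 0.1]             # hypothetical matte excitations, surface 1
matte_2 = [0.1, 0.3, 0.5]             # hypothetical matte excitations, surface 2
geometry = [(1.0, 0.05), (0.6, 0.4)]  # two (alpha, beta) viewing geometries

samples_1 = shafer_excitations(matte_1, rho_E, geometry)
samples_2 = shafer_excitations(matte_2, rho_E, geometry)

estimate = illuminant_chromaticity(samples_1, samples_2)
print(estimate)
```

Two points follow from the sketch. The estimate recovers the illuminant only up to a scale factor, so in the chapter's terms it is a cue with a missing parameter. And passing the same surface twice raises the error, which corresponds to the observation that the cue is unavailable when all objects share one matte colour.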
Cue perturbation methods

We measured the influence of each of the three candidate cues to the illuminant using a cue perturbation approach analogous to that described by Landy et al. (1995). The key
idea underlying the approach is easily explained. Suppose that we are viewing a three-dimensional scene filled with objects, textures, lines, shadows, specularities, and more. We want to measure the extent to which a particular physical cue (texture) contributes to a particular psychophysical judgement (the shape of a textured object). Is texture an important cue to shape in this particular scene? We could as readily ask whether specularity influences our perception of surface colour, and, if so, to what extent. How can we evaluate the importance of potential sources of information (‘cues’) in very complicated, realistic scenes? In such scenes, there are many potential sources of information concerning shape, surface colour, position, and so forth, and presumably all of the visual cues are signalling the same shape, surface colour, etc. Just because texture is a useful depth cue doesn’t mean that it is used when other cues are available. What we would like to do is to perturb the shape information signalled by texture, or the illuminant information signalled by specularity, while holding everything else in the scene constant. If the perturbation affects perceived shape or surface colour, we have evidence that the cue is being used by the visual system, and the magnitude of the effect, compared to the magnitude of the perturbation, allows us to quantify the influence of the cue in a particular scene.

We next describe in more detail how to perturb illuminant cues and measure their influences. First of all, we simulate binocular scenes where multiple candidate cues to the illuminant are available. We next measure the observer’s achromatic setting for two different illuminants (illuminants $I_1$ and $I_2$) applied to the scene; these achromatic settings are plotted in a standard colour space, as shown in Fig. 11.5 (marked $I_1$ and $I_2$).
The direction and magnitude of any observer change in achromatic setting, in response to changes in the illuminant, are useful measures of the observer’s degree of colour constancy, which is not of immediate concern to us. We are content to discover that the chromaticity of the surface the observer considers to be achromatic changes when we change the illuminant, presumably because of information about the illuminant signalled by illuminant cues available to the observer. However, so far, we can conclude nothing about the relative importance of any of the illuminant cues present, since all signal precisely the same illuminant in both rendered scenes. We next ask the observer to make a third achromatic setting in a scene where the illuminant information for one cue is set to signal illuminant I2 , while all other cues are set to signal illuminant I1 (this sort of cue manipulation is not difficult with simulated scenes, but would be difficult or impossible to do in a real scene. The methods used are described in Yang and Maloney 2001). The experimental data we now have comprises three achromatic settings: under illuminant I1 , under illuminant I2 , and under illuminant I1 with one cue perturbed to signal illuminant I2 . We wish to determine whether the visual system is ‘paying attention’ to the perturbed cue, that is, whether the perturbed cue has a measurable influence on colour perception as measured by achromatic matching. What might happen? One possibility is that the observer’s setting in the scene with one cue perturbed to signal illuminant I2 is at the point labelled α in Fig. 11.5, identical to the setting that he or she chose when all cues signalled illuminant I1 . We would conclude that the perturbed cue had no effect whatsoever on surface colour perception—it is not a cue to the illuminant, at least in the scene we are considering.
[Figure 11.5: panels (a) and (b); horizontal axis u′, vertical axis v′; the settings I1 and I2 are marked in each panel.]
Figure 11.5 Data from a perturbation experiment. The point marked I1 is the achromatic setting of a hypothetical observer when the test patch is embedded in a scene illuminated by reference illuminant 1. The point marked I2 , is, similarly, the achromatic setting when the same scene is illuminated by reference illuminant 2. The remaining points correspond to hypothetical achromatic settings when one illuminant cue signals I2 , and the remainder signal I1 . The setting α is consistent with the assertion that the perturbed cue has no effect (an influence of 0). The setting β is consistent with the assertion that the perturbed cue is the only cue that has influence (an influence of 1). The setting γ is consistent with an influence of 0.5 as it falls at the midpoint of the line joining I1 and I2 .
Suppose, on the other hand, that the observer’s achromatic setting in the scene with one cue perturbed to signal illuminant I2 (and all others set to signal illuminant I1) is at the point marked β in Fig. 11.5, the same as it was when all cues signalled illuminant I2. This would suggest that the observer is using only the manipulated cue, ignoring the others. A third possibility is that the observer chooses a setting somewhere between his or her settings for the two illuminants (point γ in Fig. 11.5), along the line joining them. Let δ be the change in setting when only the perturbed cue signals illuminant 2 (the distance from I1 to γ) and let Δ be the change in setting when all cues signal illuminant 2 (the distance from I1 to I2). We define the influence of the perturbed cue to be:

$$ I = \frac{\delta}{\Delta} = \frac{\lVert \gamma - I_1 \rVert}{\lVert I_2 - I_1 \rVert} \tag{11.6} $$
The value I should fall between 0 and 1. A value of 0 implies that the perturbed cue is not used (point α); a value of 1 implies that only the perturbed cue is used (point β).⁵ Point γ corresponds to an influence of 0.5 as it falls at the midpoint of the line joining I1 to I2.

A critical factor in illuminant estimation studies, such as those described here, is that the images displayed on a computer monitor must be rendered correctly. Human colour constancy with simulated images is markedly less than that obtained with real scenes (Arend et al. 1991; Brainard 1998; Kurichi and Uchikawa 1998). With real scenes, the colour constancy index reaches an average of 0.84 (Brainard 1998) while typical results with rendered scenes lead to values of 0.5 or less. In Yang and Maloney (2001), we have taken several steps to ensure that the scenes we present are as accurate as possible. In describing the apparatus we will touch on some of them.

⁵ Of course, the idealized results just described are not what we expect to obtain experimentally. In the perturbed scenes, the observer is free to make achromatic settings that do not fall on the line joining the settings in the two unperturbed scenes. We expect such an outcome, if only as a consequence of measurement errors. The computation of influence we actually employ is described in more detail in Yang and Maloney (2001) and Brainard (1998, Fig. 4B).
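The influence measure of Eq. 11.6 can be sketched in a few lines of Python. This is a minimal illustration, not the computation used in Yang and Maloney (2001): it assumes that a perturbed setting lying off the line from I1 to I2 is handled by orthogonal projection onto that line (the caption of Fig. 11.8 mentions such a projection), and the function name `influence` and all chromaticity values are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]  # an achromatic setting as (u', v') chromaticity


def influence(i1: Point, i2: Point, perturbed: Point) -> float:
    """Project the perturbed setting onto the line I1 -> I2 and return the
    fraction of the distance from I1 to I2 covered by that projection."""
    dx, dy = i2[0] - i1[0], i2[1] - i1[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        raise ValueError("I1 and I2 coincide; influence is undefined")
    px, py = perturbed[0] - i1[0], perturbed[1] - i1[1]
    return (px * dx + py * dy) / length_sq


# Hypothetical settings, chosen only for illustration.
i1 = (0.24, 0.52)
i2 = (0.20, 0.46)
alpha = i1                                             # perturbed cue ignored
beta = i2                                              # only the perturbed cue used
gamma = ((i1[0] + i2[0]) / 2.0, (i1[1] + i2[1]) / 2.0)  # midpoint of the line

influence_alpha = influence(i1, i2, alpha)
influence_beta = influence(i1, i2, beta)
influence_gamma = influence(i1, i2, gamma)
```

For settings that lie exactly on the line between I1 and I2, the projection reduces to the ratio of distances in Eq. 11.6; for settings off the line it discards the component orthogonal to the line, which is one reasonable way to treat measurement error.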
The apparatus

Yang and Maloney (2001) built a large, high-resolution, stereoscopic display (Fig. 11.6), described in detail there. The observer sat at the open side of a large box, positioned in a chin rest, gazing into the box. Its interior was lined with black, felt-like paper. Small mirrors directly in front of the observer’s eyes permitted him or her to fuse the left and right images of a stereo pair displayed on computer monitors positioned to either side. An example of a stimulus (image pair) is shown in Fig. 11.7. The stimulus as reproduced there is achromatic (shades of grey). However, none of the image pairs used by Yang and Maloney were achromatic; all contained saturated colours. Once an image was displayed, the observer pressed keys that altered the colour of a small test patch until it appeared achromatic. The observer could adjust the colour of the patch in two dimensions of colour space but could not change its luminance (roughly speaking, brightness).
[Figure 11.6: schematic of the apparatus. Labelled parts: wooden box (1.24 × 1.24 × 1.24 m³), fused image, mirrors, shutters, eyes, left and right CRTs, left and right CPUs, control computer; dimensions shown: 60.25 cm and 10 cm.]

Figure 11.6 The experimental apparatus. The observer viewed simulated scenes through a computer-controlled Wheatstone stereoscope. Two computers served to display the two images of a stereo image pair to the left and right eyes of an observer, respectively. A third computer controlled the experiment and recorded the observer settings in an achromatic matching task.
Figure 11.7 An example of a stimulus (binocular image pair). The figure shows a stereo image pair (for crossed fusion) similar to those employed in the experiments. The actual image pairs were strongly chromatic, not achromatic as this figure might suggest. All image pairs were produced by a novel rendering method described by Yang and Maloney (2001), based on the RADIANCE package.
We used the physics-based rendering package RADIANCE (Larson and Shakespeare 1997) to render each of the images in a stereo pair, simulating the appearance of a specified three-dimensional layout of spheres tangent to a plane perpendicular to the observer’s Cyclopean line of sight, as shown in Fig. 11.7. The objects within the scene were rendered as if they were, on average, the same distance in front of the observer as the optical distance from each of the observer’s eyes to the corresponding display screen (70 cm). The matte component of each rendered surface (background, spheres) was rendered⁶ so as to match it to a particular Munsell colour reference chip from the Nickerson–Munsell collection (Kelley et al. 1943). The entire scene was illuminated by a combination of a punctate and a diffuse light. The spectral power distribution of the diffuse light was always that of either standard illuminant D65 or standard illuminant A (Wyszecki and Stiles 1982, p. 8). The punctate illuminant was always positioned behind, to the right of, and above the observer in the rendered scene. The square test patch (0.5° of visual angle on a side) was tangent to the front surface of one of the spheres.
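To give a sense of scale, the side length of the test patch can be worked out from its visual angle. The sketch below assumes, purely for illustration, that the patch lies at the 70 cm optical distance quoted above (the text says only that objects were at that distance on average); the function name is hypothetical.

```python
import math


def size_from_visual_angle(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) of a frontoparallel object subtending angle_deg
    at a viewing distance of distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Assumed viewing distance of 70 cm; 0.5 degrees of visual angle per side.
patch_side_cm = size_from_visual_angle(0.5, 70.0)
print(f"test patch side: {patch_side_cm:.2f} cm")
```

Under this assumption the patch is roughly 0.61 cm on a side, that is, small relative to the spheres on which it is placed.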
Experiment 1

In the first experiment, we used scenes similar to that illustrated in Fig. 11.7, scenes with evident and numerous specular highlights. The spheres were highly specular, the background slightly specular, and the matte components of all of the spheres were homogeneous and identical. If the observer makes use of the background and any single sphere, then he or she may calculate an estimate of the illuminant using the full surface specularity cue. Since the background is only slightly specular, the information from this cue may not be reliable.

⁶ Computer graphics rendering does not correctly model the spectral effects of light–surface interaction (Ives 1912; Evans 1948; Maloney 1999). We modified the rendering package so as to render colours exactly and to allow us to specify the full surface reflectance functions of surfaces in the scene (see Yang and Maloney 2001).
Of course, the specular highlight cue is very much available. We are studying whether any specular cue influences surface colour perception in these scenes. We hoped to detect the visual system making use of these highlights and, as we will see next, the results suggest that we succeeded. It is worth noting here that, had we failed, we could not exclude the possibility that in some other scene, the visual system would make use of specular information, even if it did not in the scene we used in this experiment. The logic of dynamic reweighting and the influence measure might require some acclimatization: we can show conclusively that a particular cue has influence in a particular scene, but can never show that a particular cue has no influence in all possible scenes, simply because we will never measure its influence in all possible scenes. Yet, if specularity had little or no influence in scenes similar to Fig. 11.7, it’s difficult to imagine scenes in which it would.

Figure 11.8a shows the achromatic settings for four observers. The horizontal and vertical axes are the u′ and v′ coordinates of the CIE chromaticity diagram, as in the hypothetical data of Fig. 11.5. Open shapes represent mean achromatic settings when the scene was rendered under illuminant A and illuminant D65, in turn, and the mean achromatic setting when the specular highlight illuminant cue alone signalled illuminant D65, all other cues signalling illuminant A. Standard deviations for each setting are shown as vertical and horizontal bars at the centre of each shape. Figure 11.8b shows the effect of perturbing the specular highlight cue toward A, when all of the other illuminant cues signal D65. The observers’ achromatic settings for the two consistent images are clearly different. The observer is responding to changes in the illuminant.
The changes in response are qualitatively similar to the changes in lighting and the observed changes are similar to those found in previous studies (Arend et al. 1991; Brainard 1998). The setting points for the perturbed cue fall near the line joining the setting points for the two unperturbed scenes. Note that the influence is asymmetric, in that the cue perturbation from illuminant A in the direction of illuminant D65 has a much greater influence than that from illuminant D65 in the direction of illuminant A. For the former settings, specular information had significant influence on achromatic settings: the measured influence ranged from 0.3 to 0.83. We repeated this experiment with a different choice of Munsell surface for the objects and the background (10GY 5/6 for the objects and 10P 4/6 for the background). When the colours of the objects and background were changed, the achromatic settings changed little, consistent with results reported in previous studies (Brainard 1998; Kurichi and Uchikawa 1998). The influence measures changed very little as well, and there was still a marked asymmetry in influence between the two illuminant conditions. The outcome of this experiment indicates that the illuminant information conveyed by specularity can affect the apparent colours of surfaces in a scene.
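For readers who want to place the two illuminants in the u′v′ diagram used for these plots, the CIE 1976 coordinates can be obtained from CIE 1931 (x, y) chromaticities. The sketch below uses the standard conversion; the (x, y) values for illuminants D65 and A are the usual rounded figures for the 2° standard observer and are not taken from this chapter, and the function name is hypothetical.

```python
from typing import Tuple


def xy_to_uv_prime(x: float, y: float) -> Tuple[float, float]:
    """Convert CIE 1931 (x, y) chromaticity to CIE 1976 (u', v')."""
    denom = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / denom, 9.0 * y / denom


# Rounded chromaticities of the two standard illuminants (2-degree observer).
D65_XY = (0.3127, 0.3290)
A_XY = (0.4476, 0.4074)

d65_uv = xy_to_uv_prime(*D65_XY)
a_uv = xy_to_uv_prime(*A_XY)
print(f"D65: u'={d65_uv[0]:.4f}, v'={d65_uv[1]:.4f}")
print(f"A:   u'={a_uv[0]:.4f}, v'={a_uv[1]:.4f}")
```

These are the chromaticities of the illuminants themselves; an observer’s achromatic settings need not coincide with them, which is exactly what a constancy or influence measure quantifies.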
Experiment 2

We next varied the number of specular objects from 1 to 11 (in Experiment 1, there were always 11 specular spheres in each stereoscopic scene). The stimuli were otherwise identical to those used in the previous experiment.
[Figure 11.8: panels (a) and (b), each with four sub-plots for observers CHF, GT, EC, and BRM; horizontal axis u′ (0.16 to 0.24), vertical axis v′ (0.45 to 0.52).]
Figure 11.8 Specular illuminant cues: results of Experiment 1. The achromatic settings for four observers are shown, plotted in the u′v′ coordinates of CIE chromaticity space. In each small plot, a white circle marks the mean of multiple settings by one observer for the illuminant D65 consistent-cue condition, a black circle marks the mean of multiple settings by the same observer for the illuminant A consistent-cue condition, and the centre of the head of the vector marks the mean of multiple settings for the perturbed-cue condition. The base of the vector is connected to the consistent-cue setting corresponding to the illuminant signalled by the non-perturbed cues. Horizontal and vertical bars indicate one SE for each setting. The projection of the perturbed setting on to the line joining the unperturbed settings is marked. For all observers, the perturbation from A to D65 led to a strong measured influence on the achromatic settings, while the perturbation from D65 to A led to little or none. (a) The perturbed cue signalled D65, all others, A. (b) The perturbed cue signalled A, all others, D65. (From Yang and Maloney 2001, Experiment 1.)
[Figure 11.9: vertical axis Influence (0.0 to 1.0); horizontal axis Number of objects (1, 2, 6, 9, 11).]
Figure 11.9 Influence versus the number of specular objects. We varied the number of objects in the scenes of Experiment 1 and measured the influence of the specularity cue. Influence is plotted versus number of objects. Measured influence was not measurably different from 0 for 1–6 objects, but evidently non-zero for 9 or 11 objects. Different symbols correspond to different observers. A heavy line joins the means of the observers’ influence measures. (From Yang and Maloney 2001, Experiment 2.)
How might the visual system react? One possibility is that, as soon as a single specular highlight is present in the scene, the visual system would use it to estimate the chromaticity of the illuminant. Adding more specular highlights would not lead to a change in influence since, after all, the specular highlights all signal the same piece of information. Figure 11.9 summarizes the results for varying the number of objects in the scene. Hurlbert (1989) found little influence of perturbation when there was only one big ball available in the scene; the results in Fig. 11.9 are in agreement with her results. Specularity exerted little or no influence until the number of objects exceeded 6. With 9 or 11 objects we found a markedly non-zero influence of specularity. The overall plot of influence versus number of identical specular objects is evidently non-linear, with an accelerating slope (convex).
Experiment 3

In the two experiments just described, the influence of specularity could be due to the full surface specularity cue (D’Zmura and Lennie 1986; Lee 1986), but the design of the stimulus rendered that cue relatively weak. Experiment 3 is identical to Experiment 1 except that the 11 specular spheres, which all shared a common matte component in Experiment 1, now had 11 distinct matte surface spectral reflectance functions. The stereo pairs used resembled Fig. 11.7 in spatial layout. In these scenes, we might expect that both the specular highlight cue and the full surface specularity cue are used by the visual system. If so, we might see a net increase in the influence of specularity, since altering the illuminant information conveyed by specularity now affects both cues: the observed influence should be the sum of the influences of the two cues. If only the specular highlight cue is used, we would expect the same results as in Experiment 1.

[Figure 11.10: panels (a) and (b), each with four sub-plots for observers CHF, GT, EC, and JA; horizontal axis u′ (0.16 to 0.24), vertical axis v′ (0.45 to 0.52).]

Figure 11.10 A test of the full surface specularity cue (D’Zmura–Lennie–Lee). In Experiment 3, we tested whether the full surface specularity cue exerts a significant influence on achromatic settings. The data presentation format is identical to that of Fig. 11.8. The vertical and horizontal bars associated with each mean measurement correspond to plus or minus one standard error of the mean. For all observers the perturbation from A to D65 or from D65 to A led to little influence on the achromatic settings. (a) The perturbed cue signalled D65, all others, A. (b) The perturbed cue signalled A, all others, D65. (From Yang and Maloney 2001, Experiment 4.)

The results, shown in Figs 11.10a and 11.10b, are therefore surprising. For the majority of the observers, perturbation of the illuminant information signalled by specularities had no significant influence, not even the degree of influence we observed in Experiment 1. We conclude that the visual system failed to use the full surface specularity cue in scenes that would seem to make it maximally available and, in these same scenes, that the specular highlight cue also had little influence.
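The additive prediction tested in Experiment 3 follows from a weighted linear model of cue combination. The sketch below illustrates that prediction only; it is not a model fitted to the data. It assumes that the achromatic setting tracks a weighted average of the illuminant chromaticities signalled by the individual cues, and the cue names, weights, and chromaticities are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float]  # (u', v') chromaticity


def combined_estimate(signals: Dict[str, Point], weights: Dict[str, float]) -> Point:
    """Weighted average of the illuminant chromaticities signalled by each cue."""
    total = sum(weights[name] for name in signals)
    u = sum(weights[name] * signals[name][0] for name in signals) / total
    v = sum(weights[name] * signals[name][1] for name in signals) / total
    return (u, v)


def influence(i1: Point, i2: Point, setting: Point) -> float:
    """Fraction of the way from I1 to I2, measured along the line joining them."""
    dx, dy = i2[0] - i1[0], i2[1] - i1[1]
    px, py = setting[0] - i1[0], setting[1] - i1[1]
    return (px * dx + py * dy) / (dx * dx + dy * dy)


# Hypothetical chromaticities and weights, chosen only for illustration.
I1: Point = (0.24, 0.52)
I2: Point = (0.20, 0.46)
WEIGHTS = {"specular_highlight": 0.3, "full_surface_specularity": 0.2, "other_cues": 0.5}


def perturbed_influence(perturbed_cues: Iterable[str]) -> float:
    """Influence when the named cues signal I2 and all remaining cues signal I1."""
    perturbed = set(perturbed_cues)
    signals = {name: (I2 if name in perturbed else I1) for name in WEIGHTS}
    return influence(I1, I2, combined_estimate(signals, WEIGHTS))


highlight_only = perturbed_influence(["specular_highlight"])
full_surface_only = perturbed_influence(["full_surface_specularity"])
both = perturbed_influence(["specular_highlight", "full_surface_specularity"])
```

In this model the influence of each cue equals its normalized weight, so perturbing two cues together yields the sum of their separate influences. A drop in influence, such as that reported for Experiment 3, is therefore not what fixed weights would predict.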
Discussion

In perturbing the specularity cue in Experiments 1–3, we are changing the average scene colour. It is natural to ask whether a change of this magnitude in the average chromaticity of a scene could explain the apparent influence of specularity observed here. Putting aside our results, for the moment, we note that the evidence in favour of a major influence of average scene colour on achromatic settings in any but simple centre-surround scenes is weak. The results of Jenness and Shevell (1995) reject the hypothesis in centre-surround scenes flecked with white [although Brenner and Cornelissen (1998) challenged their interpretation of the experimental outcomes]. Brown and MacLeod (1997) and Hahn and Geisler (1995) reject the same hypothesis concerning average scene colour in simple two-dimensional scenes. The results of Yang and Maloney (2001) flatly reject the hypothesis for the sorts of scenes we have used. We will mention briefly just one of these results. The changes in average scene colour effected by perturbing stimuli in Experiment 1 and in Experiment 3 were identical: the stimuli differed only in matte components, and the perturbation involves only the specular components. Yet the large changes of achromatic setting by observers in Experiment 1 were not observed in Experiment 3.

The results reported here, together with previous research, suggest that there are at least two cues to the illuminant used in human vision (see Yang and Shevell 2002, for the stereo disparity cue). The first, the uniform background cue, is known to affect surface colour perception in very simple scenes, as described above. For another demonstration of this cue in action, see Yang and Shevell (2003) under two explicit illuminant conditions. Our results suggest that there is a second cue, based on specularity, that we tentatively identify with the specular highlight cue.
Our results suggest that the influence of this cue can vary dramatically as the number of specular objects present in the scene (or alternatively, the density of specular objects) is varied. This result is consistent with the claim that the weights given to different illuminant cues change from scene to scene (‘dynamic reweighting’). It is not obvious how to interpret the results of Experiment 2. Perhaps the visual system requires a certain minimum redundancy in specularities before deciding that a bright spot in the scene is a legitimate cue to the illuminant. The asymmetry observed in Experiment 1 is also intriguing. Possibly the visual system gives very little weight to specular cues that are far from neutral. After all, some surfaces have non-neutral specular components (e.g. gold). The visual system may be organized so as to discard specularities that are intensely coloured,
simply to avoid errors due to such surfaces. As a consequence, a specularity signalling a neutral D65 illuminant is given much higher weight than a specularity signalling a ‘reddish’ illuminant A, leading to the observed asymmetry.

A brief summary of the research reported here would be to say that we have applied methods and ideas current in depth and shape vision to an analogous problem in colour vision. Whether it is useful to model one step in surface colour perception as illuminant cue combination remains to be seen. It is the case that the illuminant estimation hypothesis underlying this work provides a very natural vehicle for linking work in computational colour vision with work in psychophysics, an achievement of some value. In addition, however, it is interesting to consider how these experiments illustrate certain unspoken assumptions in the study of depth, shape, and colour. In a coloured (not grey-level) version of Fig. 11.7, each sphere, and even the background, exhibits a wide range of discriminable colours in both of the stereo images, even though each is ‘made’ of a single surface material. The stimulus can be described parsimoniously in terms of surfaces and illuminants and their relative locations, in essentially something like the graphical language we employed in specifying the scenes to the rendering package we used. The resulting pair of retinal images is (superficially) much more complex. Shading, shadows, inter-reflections, specularity, and the like, have conspired to produce very complex stimuli, if we insist on describing them retinally. If, however, we wish to study surface colour perception, the estimation of objective surface properties through human colour vision, then it would make sense to describe the stimuli and their manipulation as independent variables with reference to the external environment, and not to an arbitrary, intermediate, retinal stage in colour processing.
Acknowledgements

All of the research described here was supported in part by Grant EY08266 from the National Institutes of Health, National Eye Institute. The ATR Corporation, Kyoto, Japan kindly provided support and facilities to the first author during the preparation of the manuscript. He is grateful to Dr Shigeru Akamatsu of ATR for support and encouragement. We especially thank Michael Landy for advice and comments during both research and writing.
References

Arend, L. E., Reeves, A., Shirillo, J., and Goldstein, R. (1991). Simultaneous color constancy: Papers with diverse Munsell values. Journal of the Optical Society of America A 8, 661–672. Blake, A. and Bülthoff, H. (1990). Does the brain know the physics of specular reflection? Nature 343, 165–168. Bloj, M. G., Kersten, D., and Hurlbert, A. C. (1999). Perception of three dimensional shape influences colour perception through mutual illumination. Nature 402, 877–879. Bonnardel, V. and Maloney, L. T. (2000). Daylight, biochrome surfaces, and human chromatic response in the Fourier domain. Journal of the Optical Society of America A 17, 677–687. Brainard, D. H. (1998). Color constancy in the nearly natural image. 2. Achromatic loci. Journal of the Optical Society of America A 15, 307–325.
Brainard, D. H., Brunt, W. A., and Speigle, J. M. (1997). Color constancy in the nearly natural image. 1. Asymmetric matches. Journal of the Optical Society of America A 14, 2091–2110. Brenner, E. and Cornelissen, F. W. (1998). When is a background equivalent? Sparse chromatic context revisited. Vision Research 38, 1789–1793. Brill, M. H. (1978). A device performing illuminant-invariant assessment of chromatic relations. Journal of Theoretical Biology 71, 473. Brown, R. O. and MacLeod, D. I. A. (1997). Color appearance depends on the variance of surround colors. Current Biology 7, 844–849. Buchsbaum, G. (1980). A spatial processor model for object colour perception. Journal of the Franklin Institute 310, 1–26. Coren, S. and Girgus, J. S. (1984). Seeing is deceiving; The psychology of visual illusions. Erlbaum, Hillsdale. Drew, M. S. and Funt, B. V. (1990). Calculating surface reflectance using a single-bounce model of mutual reflection. Proceedings of the Third International Conference on Computer Vision, Osaka, Japan, December 4–7. IEEE Computer Society, Washington, DC. D’Zmura, M. (1992). Color constancy: Surface color from changing illumination. Journal of the Optical Society of America A 9, 490–493. D’Zmura, M. and Iverson, G. (1993a). Color constancy: I. Basic theory of two-stage linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 10, 2148–2165. D’Zmura, M. and Iverson, G. (1993b). Color constancy: II. Results for two-stage linear recovery of spectral descriptions for lights and surfaces. Journal of the Optical Society of America A 10, 2166–2180. D’Zmura, M. and Lennie, P. (1986). Mechanisms of color constancy. Journal of the Optical Society of America A 3, 1662–1672. Evans, R. M. (1948). An Introduction to Color. Wiley, New York. Hahn, L. W. and Geisler, W. S. (1995). Adaptation mechanisms in spatial vision: I. Bleaches and backgrounds. Vision Research 35, 1585–1594. Hateren, J. H. van (1993). 
Spatial, temporal and spectral pre-processing for color vision. Proceedings of the Royal Society of London Series B 251, 61–68. Helmholtz, H. von (1909/1962). Helmholtz’s treatise on physiological optics (ed. J. P. C. Southall) (3rd edn). Dover, New York. Helson, H. and Judd, D. B. (1936). An experimental and theoretical study of changes in surface colors under changing illuminations. Psychological Bulletin 33, 740–741. Hurlbert, A. (1989). The computation of color. Ph.D. dissertation, The Massachusetts Institute of Technology. Hurlbert, A. (1998). Computational models of color constancy. In Perceptual constancies; Why things look as they do (ed. V. Walsh and J. Kulikowski), pp. 283–322. Cambridge University Press, Cambridge, UK. Hurvich, L. M. (1981). Color vision. Sinauer, Sunderland, MA. Ives, H. E. (1912). The relation between the color of the illuminant and the color of the illuminated object. Transactions of the Illuminating Engineering Society 7, 62–72. Jenness, J. W. and Shevell, S. K. (1995). Color appearance with sparse chromatic context. Vision Research 35, 797–805. Kaiser, P. K. and Boynton, R. M. (1996). Human color vision (2nd edn). Optical Society of America, Washington, DC. Kaufman, L. (1974). Sight and mind: An introduction to visual perception. Oxford University Press, Oxford, UK.
Kelley, K. L., Gibson, K. S., and Nickerson, D. (1943). Tristimulus specification of the Munsell Book of Color from spectrophotometric measurements. Journal of the Optical Society of America 33, 355–376. Kurichi, I. and Uchikawa, K. (1998). Adaptive shift of visual sensitivity balance under ambient illuminant change. Journal of the Optical Society of America A 15, 2263–2274. Landy, M. S., Maloney, L. T., Johnston, E. B., and Young, M. (1995). Measurement and modeling of depth cue combination: In defense of weak fusion. Vision Research 35, 389–412. Larson, G. W. and Shakespeare, R. (1997). Rendering with radiance. Morgan Kaufmann, San Francisco. Lee, H.-C. (1986). Method for computing the scene-illuminant chromaticity from specular highlights. Journal of the Optical Society of America A 3, 1694–1699. Lee, H.-C., Breneman, E. J., and Schulte, C. P. (1990). Modeling light reflection for computer color vision. IEEE Transactions on Pattern Analysis and Machine Intelligence 12, 402–409. Maloney, L. T. (1986). Evaluation of linear models of surface spectral reflectance with small numbers of parameters. Journal of the Optical Society of America A 3, 1673–83. Maloney, L. T. (1999). Physics-based models of surface color perception. In Color vision: From genes to perception (ed. K. R. Gegenfurtner and L. T. Sharpe), pp. 387–418. Cambridge University Press, Cambridge, UK. Maloney, L. T. and Landy, M. S. (1989). A statistical framework for robust fusion of depth information. In Visual communications and image processing IV. Proceedings of the SPIE, 1199 (ed. W. A. Pearlman), pp. 1154–1163. Maloney, L. T. and Wandell, B. A. (1986). Color constancy: A method for recovering surface spectral reflectance. Journal of the Optical Society of America A 3, 29–33. Mausfeld, R. (1998). Colour perception: From Grassmann codes to a dual code for object and illuminant colors. In Color vision: Perspectives from different disciplines (ed. W. Backhaus, R. Kliegl, and J. S. Werner), pp. 219–250. 
de Gruyter, Berlin. Nassau, K. (1982). The physics and chemistry of color: The fifteen causes of color. Wiley, New York. Parkkinen, J. P. S., Hallikainen, J., and Jaaskelainen, T. (1989). Characteristic spectra of Munsell colors. Journal of the Optical Society of America A 6, 318–322. Romero, J., Garcia-Beltran, A., and Hernandez-Andres, J. (1997). Linear bases for representation of natural and artificial illuminants. Journal of the Optical Society of America A 14, 1007–1014. Sällström, P. (1973). Colour and physics: Some remarks concerning the physical aspects of human colour vision. University of Stockholm, Institute of Physics Report, 73–09. Shafer, S. A. (1985). Using color to separate reflectance components. Color Research and Applications 4, 210–218. Vrhel, M. J., Gershon, R., and Iwan, L. S. (1994). Measurement and analysis of object reflectance spectra. Color Research and Applications 19, 4–9. Wyszecki, G. and Stiles, W. S. (1982). Color science; concepts and methods, quantitative data and formulas (2nd edn). Wiley, New York. Yang, J. N. and Maloney, L. T. (2001). Illuminant cues and surface color perception: Tests of three candidate cues. Vision Research 41, 2581–2600. Yang, J. N. and Shevell, K. S. (2002). Stereo disparity improves color constancy. Vision Research 42, 1979–1989. Yang, J. N. and Shevell, K. S. (2003). Surface color perception under two illuminants: The second illuminant reduces color constancy. Journal of Vision (in press).
Commentary on Maloney and Yang

Surface colour appearance in nearly natural images

David H. Brainard

A pervasive theme in the literature on computational colour constancy is that constancy is not possible in general, and that it may only be achieved within a restricted class of scenes (see, for example, Maloney 1999). It is not surprising, then, that empirical studies reveal both viewing conditions where human constancy is very good and others where it is essentially non-existent (Brainard et al. Chapter 10 this volume). It follows that the goal of empirical research should not be to determine whether human constancy is good or bad. Rather, we should either try to establish which viewing conditions support good constancy and which do not, or try to study constancy within a set of viewing conditions that are themselves intrinsically well motivated (e.g. approximations to natural viewing). Maloney and Yang rightly emphasize this theme in the introduction to their chapter.

The difficulty in exploring how appearance varies with viewing conditions is the astronomical number of possible scenes available for study. It is not feasible to measure how the colour appearance of a surface varies as it is embedded in arbitrary scenes. A critical aspect of any research programme is the choice of scene attributes to be varied in the experiments. This choice differentiates work done in various labs and traditions within the fields of colour appearance and colour constancy. The classic approach is to use geometrically simple images, often consisting of a uniform test region superimposed on a uniform background (e.g. Helson and Michels 1948; Burnham et al. 1957; Stiles 1967). Within this tradition, parametric study of the effect of background spectrum on appearance is possible, but there is no guarantee that results obtained allow prediction of appearance when tests are presented in more complex images.
It is, however, possible to formulate hypotheses about how test-on-background results might generalize. For example, perhaps a complex image has the same effect on the appearance of a test as a uniform background whose spectrum is the same as the spatial average of the individual spectra in the complex image. This is the equivalent background hypothesis discussed by Maloney and Yang (this chapter; Yang and Maloney 2001). Other authors have discussed a more general version of the hypothesis, where each complex image would be equivalent to some uniform background, but not necessarily the one obtained by computing the spatial average. One way to test the equivalent background hypothesis is to extend the test-on-background configuration by adding chromatic contrast to the background region. Experiments of this sort reject the equivalent background hypothesis (Brown and MacLeod 1997; Shevell and Wei 1998) and tell us that more work is required.

Although it offers parametric precision, the path of gradually elaborating the test-on-background configuration may not provide the most efficient way to reach understanding of surface appearance for the images we encounter in day-to-day viewing. Maloney and Yang have chosen to work directly with stimuli that are more like natural images. Although they are not the first authors to do so (see, for example, McCann et al. 1976; Gilchrist 1977; Breneman 1987; Brainard et al. Chapter 10 this volume), their work provides a novel approach to the central dilemma outlined above: by explicitly formulating and testing hypotheses about what cues in the image are used by the visual system, Maloney and Yang seek a stimulus parameterization that will allow systematic study of surface colour appearance in nearly natural images. In particular, three aspects of the work should be noted. First, the specific cues studied by Maloney and Yang are well motivated by consideration of computational theory (see also Brainard et al.
Chapter 10 this volume). Secondly, the experimental stimuli are designed to be similar to natural images, but are sufficiently simplified that software control, and thus parametric manipulation of the relevant stimulus attributes,
is possible. This method is appealing, and we have recently employed similar procedures (Delahunt 2001). Thirdly, the methods provide enough power both to show specific cues in action and to show circumstances where the cues have no effect. Recent work in my lab may also be viewed within the cue-combination framework outlined by Maloney and Yang. In Maloney and Yang’s experiments (see also Yang and Maloney 2001), comparisons are made that assess the effect of varying a single cue while the information carried by a set of other cues remains (more or less) fixed. This sort of experiment directly tests whether the information provided by the single cue is used in some fashion by the visual system. In the experiments done by Kraft and myself (Kraft and Brainard 1999; see Brainard et al. Chapter 10 this volume), we picked a particular cue (e.g. the spatial mean of the image) and compared performance across conditions where its action was silenced. This sort of experiment directly tests whether the information provided by a single cue is the only information used by the visual system. In essence, Maloney and Yang held many cues fixed and varied one, while we varied many cues and held one fixed. Both approaches are special cases of a more general approach where multiple cues are varied systematically in combination, and it would be surprising not to see such experiments from both labs in the not too distant future.
References
Breneman, E. J. (1987). Corresponding chromaticities for different states of adaptation to complex visual fields. Journal of the Optical Society of America A 4, 1115–1129.
Brown, R. O. and MacLeod, D. I. A. (1997). Color appearance depends on the variance of surround colors. Current Biology 7, 844–849.
Burnham, R. W., Evans, R. M., and Newhall, S. M. (1957). Prediction of color appearance with different adaptation illuminations. Journal of the Optical Society of America 47, 35–42.
Delahunt, P. B. (2001). Prediction of color appearance with different adaptation illuminations. Ph.D. thesis, Department of Psychology, UC Santa Barbara. Available for download from http://color.psych.upenn.edu/brainard/papers/DelahuntThesis.pdf.
Gilchrist, A. L. (1977). Perceived lightness depends on perceived spatial arrangement. Science 195, 185.
Helson, H. and Michels, W. C. (1948). The effect of chromatic adaptation on achromaticity. Journal of the Optical Society of America 38, 1025–1032.
Kraft, J. M. and Brainard, D. H. (1999). Mechanisms of color constancy under nearly natural viewing. Proceedings of the National Academy of Sciences USA 96, 307–312.
Maloney, L. T. (1999). Physics-based approaches to modeling surface color perception. In Color vision: From genes to perception, (ed. K. T. Gegenfurtner and L. T. Sharpe), pp. 387–416. Cambridge University Press, Cambridge, UK.
McCann, J. J., McKee, S. P., and Taylor, T. H. (1976). Quantitative studies in retinex theory: A comparison between theoretical predictions and observer responses to the ‘Color Mondrian’ experiments. Vision Research 16, 445–458.
Shevell, S. K. and Wei, J. (1998). Chromatic induction: border contrast or adaptation to surrounding light? Vision Research 38, 1561–1566.
Stiles, W. S. (1967). Mechanism concepts in colour theory. Journal of the Colour Group 11, 106–123.
Yang, J. N. and Maloney, L. T. (2001). Illuminant cues in surface color perception: Tests of three candidate cues. Vision Research 41, 2581–2600.
Chapter 12
THE INTERACTION OF COLOUR AND MOTION
Donald D. Hoffman

Preface

Human vision constructs the experiences of colour and motion in coordination. In this chapter I discuss recent experiments and computational theories which show, in the case of ‘dynamic colour spreading’, how this coordination can occur.

The idea that started this chapter occurred to me in the summer of 1991. I was studying the perception of visual motion, and I was fascinated by the phenomenon of neon colour spreading, and I wondered if somehow the two might be made to interact. There was reason to think otherwise at the time, since there was good evidence that colour and motion were processed by separate neural pathways, and influential papers suggested that these pathways were fairly independent. Nevertheless, during lunch one day it occurred to me that the perception of neon-like colour spreading might be evoked by moving dots that changed colours systematically when they entered or exited a specifically shaped region of space, such as a virtual disc. This was relatively easy to program in Mathematica, so I quickly created the display that afternoon. The first results were disappointing, but just a few minutes of playing yielded displays that gave impressive colour spreading when there was apparent motion, and almost no spreading when the motion stopped. Here was a great tool that could be used to explore psychophysically the interaction of colour and motion. I began to explore, and a few weeks later was joined by Carol Cicerone. Over the next few years we and our students studied various aspects of these displays. Some of the results are reported in this chapter.

More recently I was introduced by Daniel Wollschlaeger to the phenomenon of flank transparency, a technique apparently quite familiar to old map-makers, who would draw the outlines of countries in black ink, and then put a thin flanking line of colour alongside the black outline.
This colour then spread to the entire interior of the country. By outlining different countries in different colours, the different countries appeared differently coloured throughout their interiors, and were easily distinguished. And all with minimal use of coloured ink. I immediately wondered again if this flank colour spreading could be enhanced by apparent motion. A series of experiments done with Daniel Wollschlaeger and Tony Rodriguez have shown that it can, and that its properties are similar in most respects to the colour spreading induced with moving dots. A new and interesting result has emerged from the dynamic flank displays. If the flanks are the same colour as the lines they flank, one still gets the perception of transparency and of neon-like spreading. However, these displays contain only two distinct colours: the white of the background, and then whatever colour is chosen for the lines and flanks. All extant theories of transparency require at least three, and usually four, differently coloured regions in an image to induce transparency. So these flank-transparency displays, which only require two different colours to induce transparent filters, indicate that new theories of transparency are needed, theories which incorporate apparent motion into the generation of transparent filters. Clearly, there is much to be learned by studying the interaction of colour and apparent motion. This chapter gives hints of the possibilities. D. D. Hoffman
Introduction

On 2 January 1986, Jonathan I. had a car accident and suffered a concussion. He recovered within a few days, except in one respect. He lost all ability to perceive, imagine, or dream in colour (Sacks and Wasserman 1987; Sacks 1995, pp. 3–41). This would be a difficult loss for anyone, but it was particularly poignant for Mr I. since he was an artist and, at age 65, had made his living for decades working with colour. His loss of colour was due to damage to the cerebral cortex, not to selective loss of retinal cones as in typical cases of colour blindness, and was therefore diagnosed as cerebral achromatopsia. The condition is rare, but documented cases of what appears to be cerebral achromatopsia go back several centuries (Boyle 1688; Collins 1925). Louis Verrey discovered in 1888 that the regions of cortex affected in cerebral achromatopsia are the lingual and fusiform gyri of the inferior occipital lobe. Later work has confirmed Verrey’s finding. There is now substantial evidence that area V4 of the inferior occipital cortex is critical to the perception of colour (Zeki 1973, 1980, 1983a,b, 1985; Desimone et al. 1985; Desimone and Schein 1987; Lueck et al. 1989; Dufort and Lumsden 1991; Zeki et al. 1991; Heywood et al. 1992; Motter 1994; Yoshioka and Dow 1996; Yoshioka et al. 1996), that magnetic stimulation of V4 in normal subjects can cause colour experiences called ‘chromatophenes’ (Zeki 1993, p. 279; Sacks 1995, p. 28), and that magnetic inhibition of V4 in normal subjects can cause temporary achromatopsia (Sacks 1995, p. 34, but see Hadjikhani et al. 1998 regarding V8). In short, without V4 you can’t construct colour. You might still discriminate different wavelengths of light, but you won’t experience different hues.

In October of 1978, L. M. entered a hospital after suffering for 3 days with headaches and vomiting.
A series of tests indicated a stroke that had damaged the lateral border between the occipital and temporal lobes of the cortex in each hemisphere. She recovered from the stroke and was, in most respects, normal, with one notable exception. She could not see motion. She could see objects and colours, and otherwise had normal vision. But, as Zihl reported in 1983, ‘She had difficulty, for example, in pouring tea or coffee into a cup because the fluid appeared to be frozen, like a glacier. In addition, she could not stop pouring at the right time since she was unable to perceive the movement in the cup (or a pot) when the fluid rose’ (Zihl et al. 1983, p. 315). Her condition is an instance of cerebral akinetopsia (Zeki 1991). There is now substantial evidence that cerebral akinetopsia results from damage to area V5 of the cortex (Zeki 1991), and that V5 is critical for the perception of motion in monkeys (Newsome et al. 1985; Newsome and Pare 1988; Salzman et al. 1990) and in humans (Riddoch 1917; Zihl et al. 1983, 1991; Baker et al. 1991; Zeki et al. 1991). Moreover it has been found that magnetic inhibition of V5 in normal subjects by transcranial magnetic stimulation can cause temporary akinetopsia (Beckers and Hömberg 1992; Beckers and Zeki 1995; Ffytche et al. 1995). In short, without V5 you can’t construct much motion. It is tempting to interpret these findings, and a wealth of related anatomical and physiological findings, as evidence for independent processing of colour and motion in human vision (Livingstone and Hubel 1987). There is, for instance, neuroanatomical and neurophysiological evidence for segregated processing of colour and motion by, respectively, distinct parvocellular and magnocellular pathways. These separate pathways are evident
as early as the retina, and continue well into prestriate cortex (Zeki 1974; Maunsell and van Essen 1983; Albright 1984; van Essen 1985; Siegel and Andersen 1986; DeYoe and van Essen 1988; Newsome and Pare 1988); and there are psychophysical data to suggest their segregated processing. It was noted in 1911 by Stumpf, for instance, that the perception of motion in colour displays is greatly reduced at isoluminance (Stumpf 1911; Cavanagh et al. 1984; Todorovic 1996).

A natural question is, how independent are colour and motion? Do we construct motion and colour separately, or does our construction of one affect our construction of the other? That the two interact, at least in part, has been known since the French monk Benedict Prevost, in 1826, observed colours near his fingers when he waved his hands in the dimness of the cloisters (Cohen and Gordon 1949; Gregory 1987). Similar interactions between motion and colour were subsequently rediscovered many times, including rediscoveries by Gustav Fechner and Sir David Brewster (Cohen and Gordon 1949). Perhaps the most famous rediscovery was by Benham, who, in 1894, marketed a popular disc with the black and white pattern shown in Fig. 12.1. This ‘Benham’s top’ is mounted on a spindle so that it can spin about its centre. If you spin it counterclockwise at modest speed, you see an artificial spectrum: the innermost arcs form dark violet rings, the next arcs form pale blue rings, the next green, and the outermost red. If you spin it clockwise, the sequence of colours reverses, from dark violet at the outermost to red at the innermost.

Another interaction between colour and motion was discovered by Bidwell in 1896 and called ‘Bidwell’s ghost’. In one instance of his demonstration, you see a spinning disc illuminated by an incandescent lamp. The surface of the spinning disc looks to be bluish green.
But when the disc slows down, you see that its surface is half black and half white, with a slit through which a red lamp flashes. The bluish green that you see when the disc spins rapidly is, roughly, the colour complement to the flashing red. Wallach discovered an interaction between colour and motion in 1935, while studying the aperture problem. He used a pattern of lines seen through a rectangular aperture (Fig. 12.2). A still view of Wallach’s display gives a faint impression of red colour spreading in the upper
Figure 12.1 The Benham top.
Figure 12.2 Wallach’s (1935) neon colour display. Red lines are depicted as grey.
half of the rectangle. There also appears to be a faint illusory contour passing horizontally through the middle of the rectangle. Wallach found that the colour spreading could be enhanced if he put the slanted lines in motion, and if the motion was perceived in a certain way. He rigidly translated the lines horizontally, say to the left. Sometimes observers perceived the lines as translating to the left and sometimes, due to the ambiguity induced by the rectangular aperture, as translating upwards. When they saw the lines translating upwards, subjects reported that the red colour spreading was greatly enhanced. They also reported that the lines looked uniformly black, even in the upper half of the display, but seemed to be sliding under a red filter. Wallach’s display, then, is notable in at least three respects. First, it is the first published example of neon colour spreading in a static display. Secondly, it clearly demonstrates that motion can enhance colour spreading. Thirdly, it shows that motion can alter the perceived colour of image features such as lines (e.g. changing them from red to black). Cortese and Andersen (1991) created a display in which apparent motion in an achromatic display leads to the perception of brightness spreading and illusory contours. Their display consists of a black background on which are scattered a few hundred small, white dots. The dots never move from frame to frame of the display. But some dots turn on or turn off according to the following algorithm. They simulate the rotation of a rigid (but invisible) ellipsoid that floats in front of the field of white dots. Any dots that are occluded by the ellipsoid are turned off, the rest are turned on. As the ellipsoid rotates from frame to frame, some dots switch on and others switch off, all near the boundary of the simulated ellipsoid. Literally, then, the display just consists of white unmoving dots switching on and off against a black background. 
But what observers perceive is an ellipsoid in three dimensions whose surface appears ‘blacker than black’, that is a black darker and more striking than that of the background. The ellipsoid is bounded by a clear illusory contour. Cortese and Andersen’s display is notable in at least two respects. First, it demonstrates that perceived motion can lead to perceived brightness (or darkness) spreading. In this regard it is like Wallach’s display, but without any hues. Secondly, it shows that this induced brightness spreading can be seen as three dimensional, for instance as the surface of an ellipsoid which curves in three dimensions. This clearly shows an interaction between motion, brightness, and the visual construction of surfaces (for related displays and results, see also Gibson et al. 1969; Stappers 1989; Shipley and Kellman 1993).
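The on/off rule behind the Cortese and Andersen display can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not their implementation: the function names, the orthographic projection, the rotation about the vertical axis, and all numerical values (dot count, semi-axes, rotation step) are hypothetical choices made for the sketch.

```python
import math
import random


def make_dots(n, seed=0):
    """Scatter n static dots uniformly over the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    return [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n)]


def silhouette_axes(a, b, c, theta):
    """Semi-axes (horizontal, vertical) of the orthographic silhouette of an
    ellipsoid with semi-axes a, b, c after a rotation by theta radians about
    the vertical axis (an assumed viewing geometry)."""
    horizontal = math.hypot(a * math.cos(theta), c * math.sin(theta))
    return horizontal, b


def dot_states(dots, a, b, c, theta):
    """True where a dot is switched on, False where the ellipsoid occludes it."""
    sx, sy = silhouette_axes(a, b, c, theta)
    return [(x / sx) ** 2 + (y / sy) ** 2 > 1.0 for x, y in dots]


# The dot positions are generated once and never change between frames.
dots = make_dots(300)
frames = [dot_states(dots, 0.8, 0.5, 0.3, math.radians(10 * k)) for k in range(18)]

# Dots that change state between successive frames lie near the silhouette boundary.
switched = [sum(p != q for p, q in zip(f0, f1)) for f0, f1 in zip(frames, frames[1:])]
```

Each entry of `frames` is a list of on/off flags for the same fixed dots, so the only thing that differs from frame to frame is which dots are lit.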
Figure 12.3 The stimulus of Dobkins and Albright (1993), shown as three successive frames (1, 2, 3). Green is depicted as light grey, red as dark grey.
The demonstrations of Benham, Wallach, and Cortese and Andersen suggest that visual motion can affect perceived colour. The converse is also true: colour can affect perceived motion. Some evidence for this comes from isoluminant displays. Although it is true, as Stumpf discovered, that perceived motion is greatly reduced at isoluminance, it does not completely disappear (Cavanagh and Favreau 1985; Derrington and Badcock 1985; Cavanagh and Anstis 1991). Subjects can reliably discriminate direction of motion in coloured displays at isoluminance (Sato 1988; Lindsey and Teller 1990; Dobkins and Albright 1993). Dobkins and Albright, for instance, have shown that colour affects perceived motion at isoluminance in displays such as that shown in Fig. 12.3. This figure depicts three frames from a movie. Each frame has a band of red and green patches, all of equal luminance. From one frame to the next this band translates horizontally by a precise amount: half the width of a patch. Subjects view the display through an aperture, so that they can’t see the left and right ends of the band. The question is, Which way will the band appear to move, left or right? The answer is that subjects prefer to see it move to the right. By so doing they match green patches with green and red with red. In the process of constructing objects and motion, we prefer to construct objects that don’t change colour. But this suggests that colour affects our construction of motion, even if there are no luminance differences around. And this further suggests that the parvocellular pathway, which processes information about colour, can affect the magnocellular pathway, which processes information about motion.
A useful display

In the summer of 1991 I was considering the interaction of colour and motion, and wondered if I could construct another display that would demonstrate this interaction. I found a straightforward extension of the displays of Cortese and Andersen that did the trick. An example is shown in Fig. 12.4. Here are two frames from a movie. The frame on the left has 900 dots placed at random according to a uniform distribution. The frame on the right has the same dots placed at exactly the same locations. So no dots move at all from frame to frame. The only difference between frames is in the assignment of colours to dots: a slightly different set of dots is coloured green on the right than on the left.

This movie was a pleasant surprise. I saw a green disc, much like a spotlight or a green filter, moving over the field of red dots. The green disc has a ghostly glow, and a well-defined subjective border surrounding it. You can get some idea of the effect by cross fusing the two frames of the figure. You’ll see a faint disc floating above the field of dots. The colour
and border, however, are much more striking in the movie, which can be seen online at: http://www.socsci.uci.edu/cogsci/personnel/hoffman/dcs-demo.html. This display can be varied through almost limitless combinations of colours and virtual shapes. In place of glowing green discs I have seen, for instance, glowing red squares, glowing blue stripes, and even glowing shapes in three dimensions. My favourite is a glowing blue cigar rotating in space. Sometimes, in these displays, instead of seeing the green disc (or other coloured shape) in front, observers see it behind. The screen is perceived as an off-white sheet of paper, and all dots are perceived as holes punched in this paper. Through the holes observers see a red sheet of paper behind the white one, and sandwiched between these two sheets they see a moving green disc. This is an elaborate construction from static dots changing colour. But there’s more. When observers see the disc in front, they see its surface as transparent, glowing, and a desaturated green. But when they see it behind, they see its surface as opaque, not glowing, and a saturated green. We coordinate the quality of the surfaces we construct with the depth at which we place them. This motion-induced spread of colour is called dynamic colour spreading. What we do to create it is impressive. We create motion, even though all dots in the display never move. We create an object and give it a shape, either in two dimensions or in three. We often, though not always, endow that object with a border, sometimes smooth and sometimes with sharp corners. We further endow that object with a surface of a definite quality, either opaque or transparent, either saturated or desaturated. We place that object in space, either in front of a white sheet or behind it. And we move that object in space, either rotating it or translating it, or both. And all this from a few dots that change colour but don’t move. It takes very little to trigger our creative genius.
Figure 12.4 Two frames from a display of colour from motion. Green dots are depicted as smaller, red dots as larger.
By the way, it doesn’t much matter whether one tracks the moving disc or keeps one’s eye fixated at one point of the display. The moving green disc is about equally compelling in either case. This eliminates simple optical smearing as the explanation for the spread of colour.
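The construction of a dynamic colour spreading display, as described above, can be sketched as follows. This is a minimal Python illustration rather than the original Mathematica program: the function names, the display size, and the disc centre and radius are assumptions made for the sketch. Only the 900 uniformly placed, stationary dots and the rule that colour alone changes between frames come from the text.

```python
import random


def make_dots(n=900, size=5.0, seed=1):
    """Place n dots uniformly at random in a size x size square (fixed for all frames)."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, size), rng.uniform(0.0, size)) for _ in range(n)]


def colour_frame(dots, centre, radius):
    """Colour a dot green if it falls inside the virtual disc, red otherwise."""
    cx, cy = centre
    return [
        "green" if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else "red"
        for x, y in dots
    ]


dots = make_dots()

# Two successive frames: same dot positions, virtual disc shifted slightly upwards.
frame_a = colour_frame(dots, centre=(2.5, 2.0), radius=0.6)
frame_b = colour_frame(dots, centre=(2.5, 2.125), radius=0.6)

# Only the dots near the disc's leading and trailing edges change colour.
changed = sum(a != b for a, b in zip(frame_a, frame_b))
```

Cycling through a sequence of such frames, each with the disc centre shifted a little further, would give the movie; a single frame on its own is just a scatter of red and green dots.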
Psychophysical studies

Dynamic colour spreading is an engaging effect. Everyone who sees it is surprised and intrigued. They see, say, a moving green disc, and yet also see that there is no green disc and no motion. This paradoxical perception demands further exploration. But there is an even more compelling reason to explore dynamic colour spreading systematically. It is not merely a perceptual curiosity, like a mirage or an afterimage. It is rather a window into one of the central processes of vision: the construction of objects and their properties.

The visual world does not come to us prepackaged into objects and their properties. Objects are an achievement, the product of a sophisticated and active process of construction. The shower of photons hitting each retina does not come with objects prelabelled. Photons are not tagged as ‘I’m a photon that came from the cat over there’ or ‘I’m the photon from that brown desk’. Their tags are of a quite different nature: position, wavelength, time, and polarization. Anyone who has tried to build a computer vision program that converts showers of photons into a description of a world of objects can only be struck by the complexity of the task and the facility with which human vision pulls it off.

The shower of photons is discrete. The retina captures, at any moment, an integer number of photons, say 8013 or 12 359; and the photons are captured at a discrete set of locations at the retina. There are roughly 6 or 7 million cones and 120 million rods in each eye. So human vision must work, at any given time, with a discrete number of photons captured at a discrete set of locations. Yet the objects we construct have, often enough, continuous surfaces. The top of a table, the screen of a television, a sheet of paper, all have surfaces that appear to us continuous, not discrete.
This means that we must not only carve the world into objects, we must also endow these objects with continuous surfaces, even though the information available to us from photons is discrete. Constructing continuous surfaces from discrete information is central to our visual construction of objects (Shipley and Kellman 1993, 1994), and it is precisely this process that is exaggerated and highlighted by the displays of dynamic colour spreading. The discrete nature of the information is exaggerated by the wide spacing between the dots. The construction of continuous surfaces is also highlighted: we clearly see a coloured and continuous surface in the gaps between the dots. What becomes strikingly obvious in these displays is what is true all the time. Every continuous surface we see is something we construct from information that is discrete and has gaps. We effortlessly fill in these gaps. If we didn’t we would never see continuous objects. So displays of dynamic colour spreading give us a method to probe one of the central processes of vision, the construction of objects, their surfaces, and their other properties; and one of the first points that becomes clear from these displays is that motion can greatly facilitate this construction. If you look at a single, static frame of a display, you see no
motion, no coloured disc, no filling in of colour, only a scattered set of dots, some green and some red. But put the display in motion, and the moving green disc appears. These are the extreme cases. What happens if you systematically vary the amount of motion that is seen? Will the perception of the constructed green surface vary proportionately? I was soon joined by Carol Cicerone in studying dynamic colour spreading, and this was one of the first questions we tried to answer (Cicerone and Hoffman 1991, 1992).

Effects of motion

We created displays consisting of 12 frames of red and green dots, similar to the frames shown above. Each frame was 5° tall and wide as viewed at a distance of 42 inches (107 cm). Each had 900 dots placed at random according to a uniform distribution, and each dot subtended 3 minutes of arc. The centre of the region in which dots were coloured green was translated vertically by 0.125° on each successive frame. This region started 0.75° below the centre of the display on frame 1, and reached 0.75° above the centre on frame 12. These same 12 frames were shown at different speeds on different trials. The nine different speeds used were 0.063, 0.125, 0.300, 0.675, 1.08, 1.50, 1.88, 2.53, and 2.93° per second. Subjects fixated the centre of the display, and on each trial rated the perceived motion and colour spreading of the green disc. They also rated the perceived difference in depth between the green dots and the red dots. The rating scale went from 0 to 4, where 0 meant that the observer was absolutely certain that the stimulus attribute was absent; 1, that the observer was moderately certain it was absent; 2, uncertain whether it was present or absent; 3, moderately certain it was present; and 4, absolutely certain it was present. Four different diameters of green disc were used: 0.30, 0.60, 1.2, and 2.4° of visual angle. The total design of the experiment was 9 speeds × 4 diameters × 20 repetitions.
The trials were presented over four experimental sessions. Within each session five repetitions of each combination of speed and diameter were presented in pseudorandom order.

The results showed that as the speed of the display increases, so too does the perception of apparent motion and colour spreading of the disc. The construction of motion and the construction of colour go hand in hand. The results also illustrated another interesting aspect of the perception. In a static frame of the display, the red dots and green dots seem to be at slightly different depths, with the green dots slightly in front of the red. As the display speeds up, this difference in apparent depths decreases, until all dots appear to be in a single plane. Concomitantly, the green dots cease to look green, and instead are perceived as red, just like all the surrounding dots. The green of the green dots is somehow detached from them and reattached, in modified form, to the newly created disc, and the entire field of dots is then made of uniform depth and colour.

The diameter of the disc matters. The ratings of colour spreading were strongest for the 1.2° diameter, suggesting that the effectiveness of the process for constructing colour spreading depends on the size of the region over which colour must spread. Experiments conducted by Fidopiastis et al. (2000) now suggest that it also depends on the number and placement of dots within this region.
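The factorial design described above (9 speeds × 4 diameters × 20 repetitions, run as four sessions of five repetitions each) can be laid out as a trial schedule. The Python sketch below is illustrative only. The seeded shuffle stands in for the unspecified pseudorandom ordering, and `frame_duration` rests on the assumption that display speed equals the 0.125° step divided by the time each frame is shown; neither detail is stated in the text.

```python
import itertools
import random

SPEEDS = [0.063, 0.125, 0.300, 0.675, 1.08, 1.50, 1.88, 2.53, 2.93]  # degrees per second
DIAMETERS = [0.30, 0.60, 1.2, 2.4]  # degrees of visual angle
SESSIONS = 4
REPS_PER_SESSION = 5
STEP = 0.125  # vertical displacement of the green region per frame, in degrees


def session_trials(rng):
    """All speed x diameter combinations, five repetitions each, in shuffled order."""
    trials = [
        condition
        for condition in itertools.product(SPEEDS, DIAMETERS)
        for _ in range(REPS_PER_SESSION)
    ]
    rng.shuffle(trials)  # stand-in for the unspecified pseudorandom order
    return trials


def frame_duration(speed):
    """Seconds per frame, assuming speed = step size / frame duration."""
    return STEP / speed


rng = random.Random(0)
schedule = [session_trials(rng) for _ in range(SESSIONS)]
total_trials = sum(len(session) for session in schedule)  # 9 * 4 * 20 = 720
```

Under these assumptions each session holds 180 trials, and the slowest and fastest speeds correspond to frame durations of roughly 2 s and 0.04 s respectively.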
Effects of dot density and dot placement

Fidopiastis et al. (2000) varied the number of dots in each display: 100, 400, or 900 dots. They also varied the way in which dots were placed: random, pseudorandom, and aligned. The random condition was as before, with dots placed at random, according to a uniform distribution, within the square. In the pseudorandom condition the square was tessellated into an array of smaller squares. The arrays were composed of either 10 × 10, 20 × 20, or 30 × 30 squares, depending on the total number of dots in the display. Within each small square was placed one dot at random, according to a uniform distribution within that small square. In the aligned conditions, the dots were evenly spaced in rows or columns, again either in arrays of 10 × 10, 20 × 20, or 30 × 30 dots. Examples of the three types of dot placement for the 400-dot displays are shown in Fig. 12.5.

Fidopiastis et al. found that ratings of colour spreading and boundary clarity increase as the number of dots increases. This is no surprise. A higher density of dots means more information from which to construct motion and colour, and less area over which that motion and colour must be spread. A similar result is reported by Shipley and Kellman (1993) in an achromatic display. Instead of changing dot colours, Shipley and Kellman gave their dots a small displacement as a virtual object passed by. This leads to no colour spreading, but it does lead to the perception of a shape with a clear boundary. Instead of collecting ratings, they had subjects discriminate among 10 different shapes in a forced-choice procedure. They found that subjects’ accuracy in discrimination increased significantly as the number of dots increased from 50 to 400.

Fidopiastis et al. also found that the ratings of colour spreading increase with increasing regularity in placement of the dots.
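The three placement schemes can be made concrete with a short Python sketch. It is an illustration under assumptions, not the procedure of Fidopiastis et al.: the function names are hypothetical, the display is normalized to a unit square, and the aligned dots are placed at cell centres, which is only one way of spacing them evenly.

```python
import random


def random_dots(n, rng):
    """Random condition: n dots drawn uniformly over the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def pseudorandom_dots(k, rng):
    """Pseudorandom condition: one dot drawn uniformly inside each cell of a k x k grid."""
    return [
        ((i + rng.random()) / k, (j + rng.random()) / k)
        for i in range(k)
        for j in range(k)
    ]


def aligned_dots(k):
    """Aligned condition: a regular k x k lattice (cell centres, by assumption)."""
    return [((i + 0.5) / k, (j + 0.5) / k) for i in range(k) for j in range(k)]


rng = random.Random(0)
# The 400-dot displays: a 20 x 20 grid for the two structured conditions.
displays = {
    "random": random_dots(400, rng),
    "pseudorandom": pseudorandom_dots(20, rng),
    "aligned": aligned_dots(20),
}
```

All three displays contain the same number of dots; they differ only in how evenly those dots cover the square, which is the variable of interest here.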
The perfectly aligned dots give by far the most compelling colour spreading, followed by the pseudorandom placement, with the random placement giving by far the weakest colour spreading. They found that this result holds for green discs and green squares, and so probably doesn’t depend on the precise shape of the virtual object that is constructed. Why does dot placement matter? We don’t know yet. One possibility is that the more regular the spacing of the dots, the less likely there are to be big gaps which must be filled during the process of constructing the colour spreading. For perfectly aligned dots, the gaps
Figure 12.5 Stimuli used by Fidopiastis et al. (1998). Panels show aligned, pseudorandom, and random dot placement. Green dots are depicted as smaller, red dots as larger.
are consistently the same modest size everywhere. For the randomly placed dots there are regions with large gaps and other regions in which dots crowd together closely. If the efficiency of the spreading process is limited by the largest gap that must be filled, then the randomly placed dots would, in general, lead to less efficient spreading.

A pilot experiment by Fidopiastis et al. also found that dot placement affects performance in a shape detection task. In this pilot experiment, they replaced the green disc with a green square, the corners of which could be either sharp or slightly rounded. The subjects’ task was to decide whether the corners in a given trial were sharp or rounded. Detection, as measured by d′, increased significantly as the dots were more regularly placed, confirming the results obtained by ratings judgements. More careful studies of the effects of dot placement must be done to determine what exactly is responsible for the changes in strength of colour spreading. This will give us some insight into the precise processes that create colour spreading.

Effects of dot colour

So far, I have discussed displays that use green and red dots, but these colours are not special. Many different colour combinations give clear perceptions of motion, colour spreading, and boundaries. One can, for instance, use blue dots rather than green, black dots instead of red, and see striking colour spreading in the form of a blue disc (Shipley and Kellman 1994). Systematic experiments have not yet been run to compare the relative effectiveness of different colours in producing colour spreading and subjective boundaries in displays of dynamic colour spreading. Casual observations suggest that blue is more effective than green in producing convincing colour spreading, but less effective than green in producing subjective boundaries. Casual observation also suggests that green is more effective than red in producing colour spreading.
These would be interesting observations to follow up, given that the ratio of L to M cones is about 2 : 1 (Nerger and Cicerone 1992), and that S cones are even less densely distributed than M cones. There may be a relationship between effectiveness of colour spreading and cone density, with lower-density cones yielding better colour spreading. Lower-density cones may also yield poorer subjective boundaries. The colour of the spreading within the disc depends primarily on the colour of the dots within the disc, and little, if at all, on the colour of the dots in the surround, according to Miyahara and Cicerone (1997). They obtained this result with a colour-matching task. Subjects viewed displays of colour-spreading discs and adjusted the hue, saturation, and brightness of two solidly coloured test discs until they matched the colour-spreading disc. Miyahara and Cicerone used red and green dots of various luminances in their experiment. If the dots inside the spreading disc were red, then the disc itself was red, and didn’t vary with changes in the luminance of the dots in the surround. Similarly, mutatis mutandis, if the dots inside the spreading disc were green. It remains to be seen if this result extends to other combinations of hues. If so, then it suggests that colour contrast is not the mechanism that drives dynamic colour spreading. Miyahara and Cicerone (1997) also report that luminance differences between the dots within and without the disc are not required to obtain colour spreading. They used 12-frame displays of dynamic colour spreading, as described above, with red and green dots, the luminance relations of which were systematically varied. Six colour-normal observers rated
the apparent motion, colour spreading, and subjective boundary on a five-point scale. They found that the best colour spreading is obtained if the dots within the disc are more luminant than those without. However, colour spreading is obtained even near isoluminance. This suggests that differences in chromaticities alone, without differences in luminance, are sufficient to drive dynamic colour spreading. Near isoluminance, subjective boundaries almost disappear, suggesting that subjective boundaries are not required for the perception of dynamic colour spreading. Similar results hold for spreading and contour in static neon colour spreading (Redies et al. 1984). Again, the results of Miyahara and Cicerone have been obtained using only red and green dots. Other chromaticities need to be explored to see whether their results hold more generally. In the displays discussed so far, the colour spreading is homogeneous if it is seen at all. However, it is possible to alter the displays so that the spreading is not homogeneous. Consider, for instance, a display in which the dots inside the disc are green and those outside are red. On each frame of the display, one can make a certain fraction of dots inside the disc some colour other than green. Suppose the fraction is 10%. Then, on each frame, each of the dots that should be green has a 10% chance of being another colour, say red. Which dots are actually flipped to the other colour varies randomly from frame to frame. When this display is viewed, one still sees a disc-shaped unit moving, and this disc is primarily filled with green colour spreading. However, little holes appear in this colour spreading around the dots that have flipped to another colour. Since these flipped dots change from frame to frame, one sees a dynamic pattern of holes appearing and disappearing in the green colour spreading. Nevertheless one sees a coherent unit moving, even if the fraction of dots that are flipped to red is as much as 50%. 
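The flipped-dot display just described can be sketched in a few lines. What follows is a minimal illustration, not the original stimulus code: the function name `make_frames`, the colour labels, the grid of dots, the disc radius, and the seeded random generator are all assumptions introduced here. Dot positions never change; only the colour assigned to each dot varies from frame to frame.

```python
import random

def make_frames(dots, centres, radius, flip_prob=0.1, seed=0):
    """Return one list of colour labels per frame (hypothetical helper).

    dots      -- fixed (x, y) positions; no dot ever moves
    centres   -- centre (cx, cy) of the virtual disc on each frame
    radius    -- radius of the virtual disc
    flip_prob -- chance that a dot inside the disc is shown red, not green
    """
    rng = random.Random(seed)
    frames = []
    for cx, cy in centres:
        colours = []
        for x, y in dots:
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            # Inside the disc a dot is green unless it is flipped on this frame;
            # which dots are flipped is redrawn independently on every frame.
            if inside and rng.random() >= flip_prob:
                colours.append("green")
            else:
                colours.append("red")
        frames.append(colours)
    return frames

# Illustrative values only: a 20 x 20 grid of dots and a disc drifting
# rightwards over 12 frames, with 10% of the inside dots flipped per frame.
dots = [(x, y) for x in range(20) for y in range(20)]
centres = [(4 + k, 10) for k in range(12)]
frames = make_frames(dots, centres, radius=4, flip_prob=0.1)
```

Setting `flip_prob` to 0 gives the homogeneous display discussed earlier, while raising it towards 0.5 gives the heavily perforated display in which a coherent moving unit is reportedly still seen.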
Shipley et al. (1993) have obtained a similar result in dynamic displays using black and white dots. They used a static field of black and white dots, randomly intermixed, against a black background. As they moved an invisible virtual shape over this display, they simply changed the colours of the dots (from black to white or vice versa) within the boundaries of the virtual shape. These changes of dot colours were the only information available to observers for judging the virtual shape. In a ten-alternative forced-choice experiment, subjects were well above chance in discriminating the virtual shapes.
Effects of stereo disparity
It has long been known that subjective contours can be fused to produce subjective surfaces in three dimensions. Lee, for instance, found that subjective contours obtained in motion displays by accretion and deletion of texture elements can be fused to create the perception of an object in depth (Gibson et al. 1969; Lee 1970; Shipley et al. 1993). Static subjective contours may also be fused (Bloomfield 1973; Gregory and Harris 1974; Lawson et al. 1974; Ramachandran and Cavanagh 1985; Nakayama et al. 1990). With crossed disparities the resulting subjective surfaces appear in front of the inducing elements; with uncrossed disparities they appear behind. The phenomenal appearance of the subjective surface and contours can change dramatically with a simple shift between crossed and uncrossed disparities (Nakayama et al. 1990), as illustrated by Fig. 12.6. By fusing this figure you can see both the crossed and uncrossed cases. In the crossed case the surface appears to be a
Figure 12.6 Subjective surfaces from stereo.
diaphanous film; in the uncrossed case it appears to be opaque (a distinction in surface qualities clearly described by Katz in 1935).
Displays of dynamic colour spreading can also be viewed in stereo. The trick is simple. All dots remain at zero disparity throughout the display. The only disparity is in the assignment of colour to dots. This technique is illustrated above, with two frames from a display placed side by side. When fused with crossed disparity, the two frames in static view lead to the perception of a transparent filter floating above the field of dots. When fused with uncrossed disparity, they sometimes lead to a weak perception of an opaque disc floating behind the field of dots. Pilot studies by Elisabeth Luntz indicate that these effects are dramatically enhanced when the display is put in motion. In the crossed case, ratings of transparent colour spreading are very high, whereas in the uncrossed case ratings of an opaque surface are very high. In these displays we have strong evidence of motion, disparity, and colour all interacting in our construction of objects and their surfaces.
Effects of dichoptic presentation
The stereo experiments just described show that dynamic colour spreading can be affected by stereo disparity, thus indicating that at least part of the colour-spreading effect can take place in the visual system at or beyond the point of binocular combination of the inputs from the two eyes. A minor modification of these stereo displays provides further evidence for the role of more central neural processing in the construction of dynamic colour spreading. The modification turns the stereo display into a dichoptic display as follows. On every odd frame of the stereo movie, simply remove all green dots from the left side of the frame, and leave the right side untouched. On every even frame remove all green dots from the right side of the frame, and leave the left side untouched.
When the resulting movie is shown, the green dots defining the disc region are shown first to the right eye alone, then the left eye alone, and so on. If the display is viewed in stereo, these alternating presentations of green dots can still be fused to produce compelling dynamic colour spreading (Cicerone and Hoffman 1997). It is possible to find frame presentation rates for which the display viewed monoptically produces no dynamic colour spreading, but which when viewed dichoptically produces strong dynamic colour spreading (Cicerone and Hoffman 1997). This again suggests that central neural mechanisms are involved in dynamic colour spreading. (Dichoptic displays have been used before to study apparent motion; see Carney and Shadlen (1993), and critiques of the approach by Georgeson and Shackleton (1992). Physiological evidence also suggests that apparent motion may be achieved by neural mechanisms at or beyond the site of binocular combination.)
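The construction of the stereo and dichoptic movies can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used for the published displays: the helper names (`colour_half`, `stereo_frame`, `make_dichoptic`), the tuple layout for dots, and the numerical values are hypothetical. Whether a given sign of `disparity` corresponds to crossed or uncrossed disparity depends on how the two half-images are presented to the eyes, and is not fixed by this sketch.

```python
def colour_half(dots, centre, radius):
    """Colour each fixed dot green inside the virtual disc, red outside."""
    cx, cy = centre
    return [
        (x, y, "green" if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else "red")
        for x, y in dots
    ]

def stereo_frame(dots, centre, radius, disparity):
    """Build the left and right half-images of one frame.

    Both halves use identical dot positions (zero dot disparity); only the
    disc used to assign colours is shifted horizontally between the halves.
    """
    cx, cy = centre
    left = colour_half(dots, (cx - disparity / 2, cy), radius)
    right = colour_half(dots, (cx + disparity / 2, cy), radius)
    return left, right

def make_dichoptic(stereo_frames):
    """Remove green dots from the left half on odd frames (1, 3, ...) and
    from the right half on even frames, leaving the other half untouched."""
    out = []
    for number, (left, right) in enumerate(stereo_frames, start=1):
        if number % 2 == 1:
            left = [d for d in left if d[2] != "green"]
        else:
            right = [d for d in right if d[2] != "green"]
        out.append((left, right))
    return out

# Illustrative values only: a 10 x 10 grid and a disc drifting over 4 frames.
dots = [(x, y) for x in range(10) for y in range(10)]
movie = [stereo_frame(dots, (3 + k, 5), radius=2, disparity=1) for k in range(4)]
dichoptic_movie = make_dichoptic(movie)
```

In the resulting movie no single frame contains green dots in both half-images, so any colour spreading seen when the halves are fused has to draw on signals combined across the two eyes and across frames.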
Computational theories
There is as yet no satisfactory computational theory to account for our perception of surfaces and contours in displays of dynamic colour spreading. We have just surveyed some of the perceptual phenomena that such a theory must account for. But a brief review of key points is in order:
1. A sparse field of dots in which no dot ever moves, but in which individual dots change hue and brightness, can trigger the perception of subjective contours and of homogeneous colour spreading through regions in which there are no dots.
2. The subjective contours and colour spreading can be seen as defining a flat surface in two dimensions, or as defining a curved surface in three dimensions.
3. The subjective contours are usually smooth, but can have clear and sharp corners.
4. The subjective contours and colour spreading can deform smoothly over time. They are not restricted to rigid motion in two dimensions.
5. The clarity of the subjective contours and colour spreading depends on the density of the dots and on the precise placement of the dots. Dots placed in a rectangular array yield better contours and spreading than dots placed at random.
6. Crossed disparity in the assignment of colours to dots can make the colour spreading appear transparent. Uncrossed disparity can make it appear opaque.
7. Colour spreading can occur, near isoluminance, without an accompanying subjective contour.
This is not an exhaustive list, but a summary of some main points to be faced by computational theories. No theory to date can account for all these points. But there are a few theories that go part way.
Perhaps the most comprehensive theory is Grossberg and Mingolla’s (1993) FACADE neural network, updated to incorporate motion (see also Grossberg 1994). This update includes adding an eight-level motion-oriented contrast filter, which allows the system to detect and outline moving objects. To account for dynamic colour spreading, this network would need to be expanded to detect apparent motion and create subjective boundaries from changes in colour of static features (like sparse arrays of dots).
Shipley and Kellman (Shipley and Kellman 1997; Cunningham et al. 1998) have investigated this problem and taken an interesting step to solve it. They have found that, in principle, it is possible, in displays of dynamic colour spreading, to compute the orientation of a straight-line subjective boundary from the colour changes in three non-colinear dots. By piecing together many such line segments, it may be possible to compute a global subjective boundary.
Prophet et al. (1998) have also investigated this issue. Their algorithm assigns three-dimensional coordinates to each dot in each frame of the display. The first two coordinates of a dot are its x and y coordinates in the display, which never change for any given dot. The third coordinate, z, is the frame number. They then save the three-dimensional coordinates of those dots that change colour from one frame to the next. After accumulating the three-dimensional coordinates of such dots over several frames, they use these coordinates as control points for interpolating a surface (over space and time). The intersection of this surface with the plane z = t gives the subjective boundary of the virtual shape at time t.
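To make the geometry of these two proposals concrete, here is a deliberately simplified sketch. It is not the algorithm of Shipley and Kellman or of Prophet et al.; it assumes, purely for illustration, a straight edge translating at constant velocity, so that the frame t at which a dot at (x, y) changes colour satisfies t = ax + by + c. Three non-colinear change events then fix a, b, and c. The plane through them is the simplest possible space-time surface through the control points, and slicing it at z = t gives the edge at time t. The function names and the example events are hypothetical.

```python
import math

def plane_through_events(events):
    """Fit t = a*x + b*y + c exactly through three (x, y, t) change events.

    Each event records a dot's fixed position and the frame on which it
    changed colour. Solved with Cramer's rule on the differences.
    """
    (x1, y1, t1), (x2, y2, t2), (x3, y3, t3) = events
    dx2, dy2, dt2 = x2 - x1, y2 - y1, t2 - t1
    dx3, dy3, dt3 = x3 - x1, y3 - y1, t3 - t1
    det = dx2 * dy3 - dx3 * dy2
    if abs(det) < 1e-12:
        raise ValueError("dots are colinear; edge orientation is undetermined")
    a = (dt2 * dy3 - dt3 * dy2) / det
    b = (dx2 * dt3 - dx3 * dt2) / det
    c = t1 - a * x1 - b * y1
    return a, b, c

def edge_at_time(a, b, c, t):
    """Describe the line a*x + b*y + c = t, i.e. the slice of the plane at z = t.

    Returns (direction of travel in degrees, speed in distance per frame,
    signed distance of the edge from the origin along that direction).
    """
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("all dots changed at once; edge motion is undetermined")
    normal_angle = math.degrees(math.atan2(b, a))
    speed = 1.0 / norm
    offset = (t - c) / norm
    return normal_angle, speed, offset

# Hypothetical events for a vertical edge sweeping rightwards at 2 units per
# frame, so a dot at horizontal position x changes colour on frame x / 2.
events = [(0, 0, 0), (4, 1, 2), (2, 5, 1)]
a, b, c = plane_through_events(events)
print(edge_at_time(a, b, c, t=3))  # (0.0, 2.0, 6.0): edge at x = 6 on frame 3
```

With more than three events, or with curved and deforming boundaries, a plane is no longer adequate, and some richer interpolating surface over the accumulated control points would be needed, which is where the approach of Prophet et al. goes beyond this sketch.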
Conclusion
Colour is not simply surface reflectance, or triples consisting of surface reflectances as filtered through cone sensitivity functions. Colour is a complex construction of human vision. It is a construction not carried out in isolation, independent of other visual constructions. Instead it is a construction carefully coordinated with the construction of visual motion, surfaces, depths, transparency, and light sources. The nature and complexity of these coordinated constructions has barely been sampled by psychophysics to date. And no existing computational theories are yet adequate to what little of that complexity has been sampled. Displays of dynamic colour spreading provide a fertile area for psychophysical study of our coordinated construction of colour, surfaces, motion, and lights. They also provide a challenging arena for testing out computational theories of these constructions. The interaction and convergence of psychophysical and computational studies of colour should lead to a more profound understanding of the sophistication and complexity of the processes by which we construct colour, an understanding which should be a great aid to certain discussions in the philosophy of mind which turn on theories of colour vision.
Acknowledgement
This work was supported by US National Science Foundation grant 0090833.
References
Albright, T. D. (1984). Direction and orientation selectivity of neurons in visual area MT of the macaque. Journal of Neurophysiology 52, 1106–1130. Baker, C. L., Hess, R. F., and Zihl, J. (1991). Residual motion perception in a ‘motion-blind’ patient, assessed with limited-lifetime random dot stimuli. Journal of Neuroscience 11, 454–461. Beckers, G. and Hömberg, V. (1992). Cerebral visual motion blindness: Transitory akinetopsia induced by transcranial magnetic stimulation of human area V5. Proceedings of the Royal Society of London B 249, 173–178. Beckers, G. and Zeki, S. (1995). The consequences of inactivating areas V1 and V5 on visual motion perception. Brain 118, 49–60. Bidwell, S. (1896). On subjective colour phenomena attending sudden changes in illumination. Proceedings of the Royal Society 60, 368–377. Bloomfield, S. (1973). Implicit features and stereoscopy. Nature 245, 256–257. Boyle, R. (1688). Some uncommon observations about vitiated sight. Taylor, London. Carney, T. and Shadlen, M. N. (1993). Dichoptic activation of the early motion system. Vision Research 33, 1977–1995. Cavanagh, P. and Anstis, S. M. (1991). The contribution of color to motion in normal and color-deficient observers. Vision Research 31, 2109–2148.
Cavanagh, P. and Favreau, O. E. (1985). Color and luminance share a common motion pathway. Vision Research 25, 1595–1601. Cavanagh, P., Tyler, C. W., and Favreau, O. E. (1984). Perceived velocity of moving chromatic gratings. Journal of the Optical Society of America A 1, 893–899. Cicerone, C. M. and Hoffman, D. D. (1991). Dynamic neon colors: Perceptual evidence for parallel visual pathways. Mathematical Behavior Sciences Memo 91–22. University of California, Irvine. Cicerone, C. M. and Hoffman, D. D. (1992). Dynamic neon colors: Perceptual evidence for parallel visual pathways. Advances in Color Vision Technical Digest, 4, 66–68. Cicerone, C. M. and Hoffman, D. D. (1997). Color from motion: Dichoptic activation and a possible role in breaking camouflage. Perception 26, 1367–1380. Cohen, J. and Gordon, D. A. (1949). The Prévost–Fechner–Benham subjective colors. Psychological Bulletin 46, 97–136. Collins, M. (1925). Colour-blindness. Harcourt, Brace and Co., New York. Cortese, J. M. and Andersen, G. J. (1991). Recovery of 3-D shape from deforming contours. Perception and Psychophysics 49, 315–327. Cunningham, D. W., Shipley, T. F., and Kellman, P. J. (1998). Interactions between spatial and spatiotemporal information in spatiotemporal boundary formation. Perception and Psychophysics 60, 839–851. Derrington, A. M. and Badcock, D. R. (1985). The low level motion system has both chromatic and luminance inputs. Vision Research 25, 1879–1884. Desimone, R. and Schein, S. J. (1987). Visual properties of neurons in area V4 of the macaque: Sensitivity to stimulus form. Journal of Neurophysiology 57, 835–868. Desimone, R., Schein, S. J., Moran, J., and Ungerleider, L. G. (1985). Contour, color and shape analysis beyond the striate cortex. Vision Research 25, 441–452. DeYoe, E. A. and van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neuroscience 11, 219–226. Dobkins, K. R. and Albright, T. D. (1993). 
What happens if it changes color when it moves? Psychophysical experiments on the nature of chromatic input to motion detectors. Vision Research 33, 1019–1036. Dufort, P. A. and Lumsden, C. J. (1991). Color categorization and color constancy in a neural network model of V4. Biological Cybernetics 65, 293–303. Ffytche, D. H., Guy, C. N., and Zeki, S. (1995). The parallel visual motion inputs into areas V1 and V5 of human cerebral cortex. Brain 118, 1375–1394. Fidopiastis, C., Hoffman, D. D., Prophet, W., and Singh, M. (2000). Constructing surfaces and contours in displays of color from motion: The role of nearest neighbors and maximal disks. Perception 29, 567–580. Georgeson, M. A. and Shackleton, T. M. (1992). No evidence for dichoptic motion sensing: A reply to Carney and Shadlen. Vision Research 32, 193–198. Gibson, J. J., Kaplan, G. A., Reynolds, H. N., and Wheeler, K. (1969). A study of optical transitions. Perception and Psychophysics 5, 113–116. Gregory, R. L. (1987). Oxford companion to the mind, pp. 78–79. Oxford University Press, Oxford. Gregory, R. L. and Harris, J. P. (1974). Illusory contours and stereo depth. Perception and Psychophysics 15, 411–416. Grossberg, S. (1994). 3-D vision and figure-ground separation by visual cortex. Perception and Psychophysics 55, 48–120. Grossberg, S. and Mingolla, E. (1993). Neural dynamics of motion perception: Direction fields, apertures, and resonant grouping. Perception and Psychophysics 53, 243–278. Hadjikhani, N., Liu, A. K., Dale, A. M., Cavanagh, P., and Tootell, R. B. H. (1998). Retinotopy and color sensitivity in human visual cortical area V8. Nature Neuroscience 1, 235–241.
Heywood, C. A., Gadotti, A., and Cowey, A. (1992). Cortical area V4 and its role in the perception of color. Journal of Neuroscience 12, 4056–4065. Katz, D. (1935). The world of colour (Translated from German by R. B. Macleod and C. W. Fox). Kegan Paul, Trench, Trubner and Co., London. Lawson, R. B., Cowan, E., Gibbs, T. D., and Whitmore, C. D. (1974). Stereoscopic enhancement and erasure of subjective contours. Journal of Experimental Psychology 103, 1142–1146. Lee, D. N. (1970). Binocular stereopsis without spatial disparity. Perception and Psychophysics 9, 219–221. Lindsey, D. T. and Teller, D. (1990). Motion at isoluminance: Discrimination/detection ratios for moving isoluminant gratings. Vision Research 30, 1751–1761. Livingstone, M. S. and Hubel, D. H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience 7, 3416–3468. Lueck, C. J., Zeki, S., Friston, K. J., Deiber, M.-P., Cope, P., Cunningham, V. J. et al. (1989). The colour centre in the cerebral cortex of man. Nature 340, 386–389. Maunsell, J. H. R. and Van Essen, D. C. (1983). Functional properties of neurons in middle temporal visual area of the macaque monkey. I. Selectivity for stimulus direction, speed and orientation. Journal of Neurophysiology 49, 1127–1147. Miyahara, E. and Cicerone, C. M. (1997). Color from motion: Separate contributions of chromaticity and luminance. Perception 26, 1381–1396. Motter, B. C. (1994). Neural correlates of attentive selection for color or luminance in extrastriate area V4. Journal of Neuroscience 14, 2178–2189. Nakayama, K., Shimojo, S., and Ramachandran, V. S. (1990). Transparency: Relation to depth, subjective contours, luminance, and neon color spreading. Perception 19, 497–513. Nerger, J. L. and Cicerone, C. M. (1992). The ratio of L cones to M cones in the human parafoveal retina. Vision Research 32, 879–888. Newsome, W. T. and Paré, E. B. (1988). 
A selective impairment of motion perception following ibotenic acid lesions of the middle temporal visual area of the macaque monkey. Journal of Neuroscience 8, 2201–2211. Newsome, W. T., Wurtz, R. H., Dürsteler, M. R., and Mikami, A. (1985). Deficits in visual motion processing following ibotenic acid lesions of the middle temporal visual area of the macaque monkey. Journal of Neuroscience 5, 825–840. Prophet, W., Hoffman, D. D., and Cicerone, C. M. (1998). Contours from apparent motion: A computational theory. In From fragments to objects (ed. P. Kellman and T. Shipley), pp. 509–30. Elsevier, Amsterdam. Ramachandran, V. S. and Cavanagh, P. (1985). Subjective contours capture stereopsis. Nature 317, 527–530. Redies, C., Spillmann, L., and Kunz, K. (1984). Colored neon flanks and line gap enhancement. Vision Research 24, 1301–1309. Riddoch, G. (1917). Dissociation of visual perception due to occipital injuries, with especial reference to appreciation of movement. Brain 40, 15–57. Sacks, O. (1995). An anthropologist on Mars. Vintage Books, New York. Sacks, O. and Wasserman, R. (1987). The case of the colorblind painter. New York Review of Books, 19 November. Salzman, C. D., Britten, K. H., and Newsome, W. T. (1990). Cortical microstimulation influences perceptual judgements of motion direction. Nature 346, 174–177. Sato, T. (1988). Direction discrimination and pattern segregation with isoluminant chromatic random-dot cinematograms (RDC). Investigative Ophthalmological Visual Science 29, 449. Shipley, T. F. and Kellman, P. J. (1993). Optical tearing in spatiotemporal boundary formation: When do local element motions produce boundaries, form, and global motion? Spatial Vision 7, 323–339.
Shipley, T. F. and Kellman, P. J. (1994). Spatiotemporal boundary formation: Boundary, form, and motion perception from transformations of surface elements. Journal of Experimental Psychology: General 123, 3–20. Shipley, T. F. and Kellman, P. J. (1997). Spatiotemporal boundary formation: The role of local motion signals in boundary perception. Vision Research 37, 1281–1293. Shipley, T. F., Cunningham, D. W., and Kellman, P. J. (1993). Spatiotemporal stereopsis. In Studies in perception and action II (ed. S. S. Valenti and J. B. Pittenger), pp. 279–283. Erlbaum, New Jersey. Siegel, R. M. and Andersen, R. A. (1986). Motion perceptual deficits following ibotenic acid lesions of the middle temporal area (MT) in the behaving rhesus monkey. Society for Neuroscience Abstracts 12, 1183. Stappers, P. J. (1989). Forms can be recognized from dynamic occlusion alone. Perceptual Motor Skills 68, 243–251. Stumpf, P. (1911). Über die Abhängigkeit der visuellen Bewegungsempfindung und ihres negativen Nachbildes von den Reizvorgängen auf der Netzhaut. Zeitschrift für Psychologie 59, 321–330. Todorović, D. (1996). A gem from the past: Pleikart Stumpf’s (1911) anticipation of the aperture problem, Reichardt detectors, and perceived motion loss at equiluminance. Report 25/96 of the Center for Interdisciplinary Research, Bielefeld, Germany. Van Essen, D. C. (1985). Functional organization of primate visual cortex. In Cerebral cortex (ed. A. Peters and E. G. Jones), pp. 259–327. Plenum, New York. Wallach, H. (1935). Über visuell wahrgenommene Bewegungsrichtung. Psychologische Forschung 20, 325–380. Yoshioka, T. and Dow, B. M. (1996). Color, orientation and cytochrome oxidase reactivity in areas V1, V2, and V4 of macaque monkey visual cortex. Behavioural Brain Research 76, 71–88. Yoshioka, T., Dow, B. M., and Vautin, R. G. (1996). Neuronal mechanisms of color categorization in areas V1, V2, and V4 of macaque monkey visual cortex. Behavioural Brain Research 76, 51–70. Zeki, S. M. 
(1973). Colour coding in rhesus monkey prestriate cortex. Brain Research 53, 422–427. Zeki, S. M. (1974). Functional organization of a visual area in the posterior bank of the superior temporal sulcus of the rhesus monkey. Journal of Physiology 236, 549–573. Zeki, S. M. (1980). The representation of colours in the cerebral cortex. Nature 284, 412–418. Zeki, S. M. (1983a). Colour coding in the cerebral cortex: The reaction of cells in monkey visual cortex to wavelengths and colours. Neuroscience 9, 741–765. Zeki, S. M. (1983b). Colour coding in the cerebral cortex: The responses of wavelength-sensitive cells in monkey visual cortex to changes in wavelength composition. Neuroscience 9, 767–781. Zeki, S. M. (1985). Colour pathways and hierarchies in the cerebral cortex. In Central and peripheral mechanisms of colour vision (ed. D. Ottoson and S. Zeki). Macmillan, London. Zeki, S. M. (1991). Cerebral akinetopsia (cerebral visual motion blindness). Brain 114, 811–824. Zeki, S. M. (1993). A vision of the brain. Blackwell, Boston. Zeki, S. M., Watson, J. D. G., Lueck, C. J., Friston, K. J., Kennard, C., and Frackowiak, R. S. J. (1991). A direct demonstration of functional specialization in human visual cortex. Journal of Neuroscience 11, 641–649. Zihl, J., Cramon, D. von, and Mai, N. (1983). Selective disturbance of movement vision after bilateral brain damage. Brain 106, 313–340. Zihl, J., Cramon, D. von, Mai, N., and Schmid, C. H. (1991). Disturbance of movement vision after bilateral posterior brain damage: Further evidence and follow up observations. Brain 114, 2235–2252.
Commentary on Hoffman
The interaction of perceived colour and perceived motion?
Richard Brown
In this chapter Donald Hoffman asks the question, how independent are colour and motion? He is careful to treat both colour and motion as products, not causes, of visual perception, in the process raising, for motion, the same realism/subjectivism issues which defined so much of the ZiF group’s discussions of colour. Don seems to take the subjectivism of motion for granted here, rather than tackling these philosophical issues head-on. But it’s hard for me to think about the central question of whether colour and motion, so defined, are constructed independently, without making some distinction between the properties of the stimuli and the resulting perceptions. In many of the cases discussed in this chapter, it remains unclear to me whether the reported interactions are truly between perceived motion and perceived colour, rather than both the perceived motion and the perceived colour being affected by common, lower-level aspects of the stimuli.
The paper begins by relating compelling case studies of cerebral achromatopsia and cerebral akinetopsia, which appear to offer a neuroanatomical double dissociation of colour perception from motion perception. Don also presents neuroscientific evidence for distinct motion and colour areas in the visual system, although it should be noted that while there is general agreement that some degree of functional specialization is found in anatomically distinct visual cortical areas, the actual extent of specialization and its correspondence to basic modalities such as colour and motion remains highly contentious (cf. Schiller 1996). It also remains a matter of controversy whether the human colour and motion centres correspond directly to the anatomical areas V4 and V5 described in monkeys.
Don then moves to consideration of various psychophysical results suggesting interactions between colour and motion, and here is where special caution may be required. His first two examples, Benham’s disc and Bidwell’s disc, both involve spinning discs, the motion of which leads to surprising colours that depend on the speed and direction of the discs’ motions. This is offered as evidence that motion affects colour perception, but in both of these cases the spinning motion of the discs is not necessary for the colour phenomena, but merely provided a convenient mechanism, especially in the nineteenth century when they were first created, to produce the rapid temporal signals which are necessary for the colour phenomena. Today both effects can be achieved on computer monitors with stationary flickering patterns that produce neither stimulus motion nor perceived motion. Much of the paper concerns observations and experiments on ‘dynamic colour spreading’. Don and his collaborators have developed and elaborated engaging demonstrations in which sparse patterns of stationary dots change colours to generate the perception of extended, coloured, moving discs (and other shapes). A series of experiments has helped define the relations between the physical parameters of the stimuli and the resulting perceptions. These effects do seem highly relevant to the question of interactions between colour and motion, although it might be valuable to know how much of the colour spreading depends directly on the perceived motion, rather than on the dynamic activity of the dots which generates the perceived motions. (Perhaps a dynamic display in which the dots blink but aren’t seen to move would help resolve this.) 
An additional caveat is that while V5 has been implicated in basic motion perception, there are tantalizing indications that the perception of shape from motion may not involve V5, as people with lesions in this area suffer serious deficits in basic motion perception, but remain relatively unimpaired in perceiving shape from motion or even ‘biological motion’. It seems possible that the moving illusory objects seen in these dot displays may represent a type of shape-from-motion perception, in which case, the resulting colour spreading
may not be such strong evidence for an influence on colour perception of the motion perceptions generated in V5. For me, the most intriguing part of Don’s paper was his discussion of the relation between constructing illusory coloured surfaces from the sparse dots in his demonstrations, and the visual system’s more general problem of constructing objects extended in space and time from discrete photon events. The analogy here is clear, but his conclusion that the same process may be involved in solving both of these problems seems highly speculative. One important difference is that the individual dots in the dynamic colour spreading displays are clearly resolved, meaning that there is positive evidence for black space between the dots, and the paradox is that extended illusory surfaces appear despite this. It is the visibility of the spaces between the dots which makes these displays interestingly different from other types of moving coloured areas seen on computer displays, which, after all, are also composed of discrete stationary dots which can only change colour, but in which these dots cannot normally be resolved. On the other hand, in the case of light sampling by discrete photoreceptors, there is no evidence to distinguish whether the retinal image itself is continuous or composed of discontinuous dots. Applying the principle of genericity here would seem to predict the perception of a continuous surface being discretely sampled, rather than a discrete collection of dots which happen to be aligned precisely with discrete photoreceptors. But any leads into this deep issue of how visual systems construct extended things from finite, noisy clues seem worthy of further exploration.
Reference
Schiller, P. H. (1996). On the specificity of neurons and visual areas. Behavioural Brain Research 76, 21–35.
chapter 13
‘COLOUR’ AS PART OF THE FORMAT OF DIFFERENT PERCEPTUAL PRIMITIVES: THE DUAL CODING OF COLOUR
Rainer Mausfeld
Preface
In recent years the field of colour perception has often been praised as a paradigm of cognitive science. While this certainly has some validity, it contrasts with the fact that the field makes very little contact with the sort of enquiries into mental representations to be found elsewhere in cognitive science (understood as the naturalistic study of the mind/brain). I find this quite puzzling, because in the earlier literature of the field it was clearly recognized (by Bühler, Gelb, Kardos, and Koffka, for instance) that ‘colour’ could only be understood as part of the general problem of perceptual representations. Their insights could not, of course, take advantage of the theoretical language provided by what has been called the ‘cognitive revolution’. For that reason, and also because they were overshadowed by the success of more technical fields, they fell almost entirely into oblivion. The technical fields, successful with respect to their own specific goals, were colorimetry, neurophysiological investigations into peripheral colour coding, and, more recently, functionalist-computational approaches that emphasize certain pre-given performance criteria. The success of these fields has not been hampered by the fact that they share certain common-sense conceptions of colour, particularly the idea that colour is an autonomous attribute that can be studied almost in isolation from other perceptual attributes. Because such common-sense conceptions of colour appear to be, by and large, innocuous to advances in these fields, no need has arisen, so far, to relinquish them. However, precisely because of the successes of these fields, enquiries into colour perception, understood as the endeavour to develop explanatory frameworks for the role of ‘colour’ within our perceptual and cognitive architecture, have suffered a less fortunate fate.
The conceptual vocabulary that enquiries into colour perception borrowed from fields such as neurophysiology, which pursue different explanatory purposes, has remained alien to its intrinsic structure and has veiled its core problems. My interest in colour perception (which, a long time ago, was incited by Russell’s Problems of philosophy) has been motivated by the question of how we can, within naturalistic enquiry, describe the conceptual structure with which our perceptual system is biologically endowed. Such questions have long been pursued in ethology and have yielded intriguing results. The theoretical picture that is emerging has gained further support from other fields of enquiry, ranging from phenomenological observations to studies with newborns. Although our understanding of the principles of perception is still quite thin, it is, I believe, to a large extent the result of ethological enquiries that unifying principles seem to be appearing on the horizon. In the following chapter, I tentatively explore a line of thinking, inspired by ethology, that tries to break away from common-sense conceptions of colour that, in the context of scientific enquiry, appear unmotivated. I argue that ‘colour’ is not a homogeneous and autonomous attribute, but rather plays different roles in different representational primitives, in line with what current research seems increasingly to be (re)discovering. Of course, the theoretical picture of the role of colour within perceptual architecture that is emerging from an ethology-inspired approach (emphasizing the conceptual structure of perception) is still faint and inevitably speculative. This holds, however, for most attempts to formulate overarching principles of perception. What makes the ethological approach particularly attractive to me is that it provides some glimmers of the sort of fruitful falling-into-place of a variety of important ideas, observations, and findings.
R. Mausfeld
Introduction
Colour is one of the most conspicuous aspects of visual experiences. Together with shape it gives objects their individual distinctiveness and is a salient characteristic of the appearance of objects. Whereas shape is a property of physical objects that seems to be intrinsic to them, apparently a necessary part of their physical description, the nature of colour seems to be much more enigmatic. On the one hand, colour experiences are, by and large, tied in a lawful way to physical properties of the ‘external world’; on the other hand, colour experiences have a peculiarly subjective nature. Although the structure of our entire phenomenal world of perception is, in a sense, brought forth by the internal conceptual structure of the brain, we tend to ascribe different degrees of objective and subjective origins to its different aspects as a consequence of this conceptual structure. Colours fall right on the boundary that we have drawn by bifurcating the world into the physical and the psychological; more than other perceptual attributes, they seem to be Janus-faced. This is also mirrored in the incoherent and vacillating linguistic usage of colour expressions in everyday language (for instance, we can speak of an object as looking purple though being blue, or as having lost its colour). Our everyday usage of colour concepts hovers between two quite different meanings of colour, i.e. colour patches and colour experiences (which has given rise to tremendous philosophical confusion). This ambiguity, with respect to the entities colours are ascribed to, does not, however, prevent ‘colour’ being conceived as a kind of autonomous and independent attribute in common-sense taxonomies. However, scientific enquiry has to go beyond common-sense taxonomies (here as elsewhere in the natural sciences) and pursue lines of enquiry that are dictated by attempts to develop explanatory frameworks of interesting range and depth.
In scientific investigations ‘colour’ does not demark a single field of rational inquiry or a unitary explanatory domain. Questions centring around colour phenomena can, for instance, refer to abstract theories of perception, to the minutiae of neurophysiological coding, to the evolutionary history and functional role of colour perception, to the role of colours in animal communication, to dyeing techniques in arts and industry, to aesthetical or emotional effects, or, more generally, to common-sense psychology and common-sense physics. Each of these domains has its own specific goals and prompts different questions to be asked. Detached from specific domains of enquiry, attempts to ascertain what the essence or ‘quidditas’ of ‘colour’ is, are thus pointless and of no relevance for any of these domains. Notwithstanding that scientific enquiry ultimately strives, wherever possible, toward explanatory unification over different domains, jumbling up different explanatory goals and different levels of analysis in colour perception may veil problems of theoretical importance and hinder a theoretical understanding of the perceptual principles on which it is based. If we, more specifically, turn to a more narrowly defined domain of enquiry and try to develop abstract theories that describe the role ‘colour’ plays within the basic architecture of the human perceptual system, we are again tempted by common-sense taxonomies to regard ‘colour’ as a kind of autonomous and independent attribute that can be investigated more or less in isolation. However, a proper acknowledgement of relevant facts and observations leads to a quite different theoretical picture: contrary to what common-sense taxonomies
suggest, ‘colour’ is not an autonomous attribute and cannot be studied detached from other aspects of our perceptual architecture. The corresponding pre-conception—still highly influential in colour science—that, with respect to the human perceptual system, ‘colour’ is a single and autonomous attribute, has greatly impeded the development of appropriate explanatory accounts of perception.
Technology-shaped refinements of common-sense taxonomies
Among the biggest obstacles for theoretical enquiries into the internal perceptual structure underlying colour perceptions are what Evans (1974, p. 197) called the ‘errors of the application of colorimetric thinking to perception’, i.e. inappropriate use of abstractions and concepts that were developed, as refinements of common-sense taxonomies, to serve purposes of colour technology. Because these abstractions, particularly those that are presumed to capture ‘basic attributes’ of colour, seem quite natural from the point of view of our ordinary way of talking about colour (which itself has been modified by a technology-shaped progression toward an increasingly abstract colour vocabulary), they were also considered as the natural and almost compulsory point of departure for dealing with colour within perception theory. Their apparent cogency was augmented by selecting specific types of colour phenomena and experimental settings that seem to speak in favour of the corresponding abstractions being particularly revealing for the nature of colour perception. As a result, these conceptual frameworks have impeded the identification of types of phenomena that mirror core colour-related aspects of the structure of internal representations.
The apparent cogency of these conceptual frameworks, which were taken as a matter of course in perception theory, was furthermore fed by a widespread general misconception of the nature of perception that fits perfectly within these frameworks, namely the measurement-device misconception of perception (which, in turn, is connected intimately with empiricist preconceptions about the structure of the mind). According to this conception, the core of which is itself part of common-sense reasoning about perception, the perceptual system is some kind of measurement-device that has to inform us about elementary physical quantities.1
1 Corresponding ideas have been highly influential since the beginning of systematic enquiries into the nature of perception. They come in many guises and are rarely spelled out as explicitly as, for instance, by Granit (1955, p. 9), who characterized psychophysics as the ‘systematic investigation of our private measuring instruments with the aid of public measuring instruments’.
Due to these ways of conceptualizing perception, attempts to understand theoretically the role of colour within the structure of perceptual representations have been severely hindered by the merging of two lines of thinking that have their roots in common-sense conceptions, namely abstractions derived from technology-shaped refinements of common-sense taxonomies and the measurement-device misconception of perception. Approaches based on these lines of thinking have become, despite their utter inadequacy, the dominant paradigm in perceptual research on colour. This is due to the fact that they appear, from the perspective of our everyday way of dealing with colour, intuitively plausible, and that they
provide, together with suitably selected phenomena and experimental procedures, a framework that appears to be quite coherent when the focus is primarily on colorimetry and the neurophysiology of early coding. The fact that this apparent coherence has been bought by concealing core aspects of the role of colour within internal representations becomes obvious as soon as the vast theoretical distortions that accompany these lines of thinking, when dealing with core perceptual phenomena, are recognized. Before I delve into these in more detail below, a simple example may serve as an illustration, namely the issue of so-called object colours, such as brown. As a typical quote from the perception literature, Boynton (1975, p. 316) remarked that ‘the sensation of brown arises de novo by induction from the surrounding field’; obviously colours like brown are regarded as less ‘original’ than the ‘primordial colours’, such as red, orange, yellow, or blue, which are considered to be closely tied to the wavelength composition of the light and thus, as suggested by this formulation, do not arise de novo. This way of distinguishing between ‘original colours’ and colours that ‘arise de novo’ reflects a variant of the measurement-device misconception of perception, according to which ‘the visual system is concerned with estimating the spectral functional shape of the incoming color stimulus’ (Buchsbaum and Gottschalk 1983). In the case of brown, the ‘original colour’ is taken to be a dark orange, which, due to its surround, is ‘modified’ to yield the ‘dark orange that we call “brown” ’ (Boynton 1971, p. 368): a rather odd formulation which provides evidence of the theoretical distortions produced by the underlying conceptual framework. Since these enigmatic modifications, which are assumed to produce new kinds of colour de novo from ‘original colours’, cannot be accommodated within this framework, one has to retreat, as for instance Judd (1960, p. 
257), to unspecified ‘different modes of processing’ of retinal colour signals ‘in the central nervous system’. In contrast, current functionalist-computational approaches and their philosophical aftermath are often accompanied by a distal variant of this misconception, according to which ‘the goal of colour vision is to recover the invariant spectral reflectance of objects (surfaces)’ (Poggio 1990, p. 147).2 Those colours are, accordingly, regarded as ‘original colours’ that are closely tied to surface reflectance characteristics. Thus, brown is regarded as an ‘original colour’ rather than arising de novo because, like other colours, it is to be identified with spectral reflectances of surfaces that exhibit this property.
2 A similar claim in, say, olfactory perception, that the olfactory system is concerned with estimating the atomic structure of molecules would duly be rejected as absurd.
‘Colour’ and the structure of representational primitives
In this chapter I will approach ‘colour’ from the perspective of cognitive science, which has, in several of its subfields, marshalled convincing evidence that our mental apparatus is, as part of our biological endowment, equipped with a rich internal structure pertaining to, for example, structural knowledge about properties of the physical world, distinguishing between physical and biological objects, or imputing mental states to oneself and to others. With respect to perception theory, this evidence indicates that the structure of internal coding is built up in terms of a rich set of representational primitives. Rather than asking what colour really is, or making presuppositions about its ‘proper causal antecedents’ or
about the ‘proper intentional objects’ of colour, I will focus on how it figures within the structure of representational primitives of perception. Notwithstanding that we are still far from having a clear theoretical picture about the kind of primitives that underlie perceptual representations, primitives that refer to classes of internal entities such as ‘surfaces’, ‘three-dimensional objects’, or ‘events’ (to be understood as internal, and not as physical concepts) suggest themselves as fundamental pillars of the internal representational structure of perception. These primitives determine the data format, as it were, of internal coding. Each primitive has its own proprietary types of parameters, relations, and transformations, which define its internal structure and govern its relation to other primitives. While colour, as such, is a biologically given part of the form of our experience, the role colour plays within the conceptual structure of the perceptual system, and within perceptual architecture, is open to rational enquiry. The evidence bearing on the role of colour within the structure of perceptual representations is enormously rich. Experimental observations and findings, phenomenological observations3 on the interplay of surfaces and (chromatic) illumination, as well as corresponding physical considerations, provide a rich source for theoretical conjectures about this role.
3 Although I will regularly draw on phenomenological observations that appear to be revealing for the structure of perceptual representations, phenomenological observations as such do not necessarily have a particular relevance for perception theory, nor do they carry a kind of ‘epistemological superiority’. Phenomenological observations do not provide ‘direct access’ to the nature of representational primitives; rather, they result from an interplay of various faculties, including those of linguistics and interpretation. Thus they are, within a naturalistic enquiry into the principles of perception, on a par with many other sources that provide relevant facts and observations.
Current thinking in perceptual psychology has focused predominantly on processes of information flow, and has paid little attention explicitly to addressing the problem of the structural format within which the internal coding processes take place, or to identifying the primitives on which complex perceptual representations are built (rather, corresponding questions have often been trivialized by preferences for thin sets of quite elementary primitives). A similar diagnosis holds for cognitive psychology in general, where ‘one typically finds rather perfunctory discussion of information structure only as a prelude or postlude to extensive treatment of processing’ (Jackendoff 1987). An essential task of perceptual psychology thus continues to be the identification of the primitives of the internal conceptual structure of perception, of their ‘data structure’, and of the associated proprietary types of transformations that operate on these primitives. While not much is presently known about the structure of the representational primitives, evidence has been accumulated supporting the idea that quite different representational primitives include free parameters that can be characterized as pertaining to the attribute ‘colour’. If ‘colour’ figures in different kinds of representational primitives, one can hardly expect to understand its internal structure by investigating it in isolation. ‘Colour’ is not a ‘natural kind’, as it were, of internal processing, i.e. it is not a class of explanatory importance of internal states or processes that are held together by the same set of properties. In common-sense taxonomies, in contrast, we have come to regard ‘colour’ as a kind of autonomous and independent attribute. A major obstacle to gaining a deeper understanding of the role of ‘colour’ in the internal conceptual structure of perception is that we illegitimately transfer common-sense reasoning about colour to scientific enquiry of perception. I will, consequently, argue—in line
with Koffka’s insight that ‘colour, localization, shape and size must be regarded as different aspects of one and the same process of organization’ (Koffka 1936, p. 134)—that attempts to identify the representational primitives of the structure of perception and their ‘data structure’ by investigating attributes such as colour (or depths, etc.) in isolation are doomed to fail (apart from lucky coincidences). This is just as problematic as trying to determine an n-dimensional manifold from a random sample of one-dimensional projections. Rather, questions about colour perception can only be formulated within theoretical frameworks that explicitly address the nature and structural relations of the primitives of perceptual representations in which colour figures. A general theoretical approach that I believe to be well founded in its general conceptions, and which has already yielded intriguing explanatory frameworks of promising range and depth, notably when couched in computational terms, is one of ethology and internalism. Corresponding approaches attempt to provide explanatory accounts of the perceptual system in terms of its internal functioning; they employ, with respect to visual perception, a level of analysis that focuses on how structural properties of the physico-geometrical light pattern reaching the eye (which can have been causally generated by quite different physical processes) are exploited by the visual system in terms of its primitives. No notions of reference to the environment, ‘proper function’, etc., figure in these approaches, which consider notions such as ‘perceptual error’ or ‘veridicality’ to be of little relevance for understanding the internal structure and functioning of the perceptual system (although they are an indispensable part of ordinary or metatheoretical discourse).4 The general approach to colour that I pursue here has, in its core elements, a long history in perception (cf. the Appendix in Mausfeld 2002). 
However, apart from a few exceptions in the early twentieth century, research perspectives in colour science have followed different routes of thinking. The driving forces in the field have been attempts to understand the (early) neurophysiological coding of colour and issues of colorimetry (cf. Koenderink and van Doorn, Chapter 1 this volume). The influences of these fields resulted, in perceptual psychology, in an extremely elementaristic perspective on colour that allied itself with a measurement-device misconception of perception. Both the elementaristic perspective and the measurement-device misconception of perception (a variant of which also showed up in functionalist-computational approaches) have hampered the general approach pursued here from being applied to colour. Since I have dealt with these issues elsewhere (Mausfeld 1998, 2002), I will restrict myself to addressing two specific consequences of these general obstacles, namely misconceptions about the ‘basic attributes’ of colour and the neglect of illumination-related issues in colour research; furthermore, I will address a third obstacle that lies in the conflation of different levels of analysis. What I intend to point out can be summarized as follows.
4 Regarding levels of analysis that pertain to, for example, evolutionary history or ‘proper function’ as external to an explanatory account of the nature of perception and as belonging to metatheoretical discourse, does not, of course, amount to denying any dependencies. The question is not how things are related to each other in reality; perception is related to, and dependent on, various aspects of reality, such as phylogenetic development, metabolism, the immune system, or the physics of the brain. The question is rather, ‘What constitutes an appropriate level of idealization for successful explanatory frameworks of perception?’
Obstacles to an appropriate account of the role of ‘colour’ within perceptual architecture
1. The alleged basic attributes of colour, usually referred to as hue, saturation, and brightness, as well as associated notions of a three-dimensional colour space, are theoretical notions that arose as abstractions from technology-driven refinements of common-sense taxonomies. Their usefulness is confined to the purposes for which they were developed, namely colour technology and colorimetry.
2. With respect to perceptual psychology and its aim to understand the internal structure of colour representations, these theoretical notions, and the general perspective underlying them, have prevented the right questions being asked and impeded the development of appropriate explanatory frameworks for colour perception. In particular, they are responsible for issues of illumination perception largely being neglected (or trivialized by what may be called the adaptational perspective), and subsequently being addressed, in a mis-idealized way, as the problem of colour constancy.
3. The properties of the external world that causally give rise to the physico-geometrical structure of the sensory input, on the one hand, and the relations between properties of the sensory input and the internal outputs or percepts of the visual system, on the other hand, are two utterly different problems that need to be distinguished carefully. Therefore, the core question of perception theory, viz. how are structural properties of the incoming light array exploited by the visual system in terms of its primitives, must not be conflated with the question, What properties of the environment give rise to perceptually relevant properties of the incoming physico-geometrical light array? Because of this, notions of ‘reference’ or ‘veridicality’ do not figure in perception theory proper but pertain to a different level of analysis (and are also part of ordinary and metatheoretical discourse about perception).
Summary of main theses
I feel that a useful step would be to deal with these obstacles in some detail in introductory sections before turning to a general ethological and internalist approach to perception. After having introduced this general framework, I will deal with some specific questions about the role ‘colour’ plays as a constituent of the representational format of perceptual primitives. The main theses I shall argue for in this chapter can be summarized as follows.
1. Within an ethological and internalist account of perception, a categorical distinction is made between a sensory system and a perceptual system. The sensory system deals with the transduction of physical energy into neural codes and their subsequent transformations into codes that are ‘readable’ by the perceptual system and fulfil its structural and computational needs; its internal concepts are entirely definable in the same physico-geometrical language that we use to describe the sensory input. The perceptual system, on the other hand, contains, as part of our biological endowment, the rich perceptual vocabulary in terms of which we perceive the ‘external world’; this vocabulary is based on primitives that cannot be defined in terms of the primitives of the sensory system. Furthermore, the perceptual system provides the computational means to make these perceptual concepts accessible to higher-order cognitive systems, where meanings are assigned in terms of ‘external world’ properties.
2. The sensory codes serve a dual function: first, they provide triggering cues for representational primitives and thus they determine the potential data formats in terms of which input properties are to be exploited. Secondly, they are used by the activated primitives to determine the values of their free parameters.
3. Colour figures as a free parameter in the structure of (at least) two different representational primitives that, from a metatheoretical perspective, can be regarded as pertaining to the representation of ‘surfaces’, and the representation of ambient and local illuminations (note that within an ethological and internalist account, the term ‘representation’ only refers to postulated elements of internal structure and does not involve any notion of reference to the external world). Consequently, ‘colour’ does not constitute, as common-sense taxonomies suggest and as most of current research presupposes, a single domain of an autonomous attribute, but is rather a constituent of the format of different representational primitives.
4. The interdependencies in the data structure of representational primitives do not simply mirror corresponding physical regularities, but rather are co-determined by internal aspects, such as internal functional constraints and internal architectural constraints. Because of this, internal concepts, such as ‘surface colour’, defy definition in terms of a corresponding physical concept (even in the sense of the latter providing necessary and sufficient conditions for the former). Rather, as corresponding empirical evidence indicates, ‘colour’ is dependent on the entire structure of the types of representational primitives in which it figures and on their interrelations, and cannot be studied independently of them.
5. The sensory system pre-processes the retinal colour code for the structural and computational demands of the relevant representational primitives. It provides a variety of relations between, and transformations of, retinal colour codes, on which decompositions of the retinal colour code into a dual colour code can be based that fulfil the demands of the representational primitives involved.
First obstacle: Misconceptions about attributes of colour and ‘modes of appearance’
I will first draw attention to some of the factors that have so greatly impeded the asking of appropriate questions about the role of ‘colour’ within the structure of perceptual representations, questions that had been clearly identified at the time of the gestaltists, within the limits of the conceptual apparatus available at that time. Although, in the earlier literature, there was an awareness that colour does not mark a homogeneous domain with respect to core internal structure, this has almost been forgotten in approaches that have dominated the field since then. The extent to which we have lost sight of these previous insights is quite surprising. The main reasons for this appear to me to lie in the following facts: first, in line with empiricist approaches to the mind, perceptual psychology predominantly pursues an
elementary data-processing approach, and is still loath to address issues of representational primitives and the ‘internal semantics’ of the perceptual system. Secondly, investigations into colour perception tend to employ conceptual frameworks that have been established for technological purposes. I will begin by recalling a few basic facts about the laws governing matches of small spots of light in otherwise dark surrounds. These matches can be described by the well-known linearity laws of additive colour mixture, often referred to as Grassmann laws. Because of the validity of these laws, equivalence classes of lights that cannot be distinguished perceptually can be represented numerically by a three-dimensional vector space. Such numerical representations of metameric matches do not say anything about the colour appearances (except about the distinguishability–indistinguishability aspect) of the points of this space, which represent equivalence classes of metameric lights. In other words, there is no natural way of assigning colours to the points of this space. In particular, this vector space does not represent equality or inequality of colour attributes such as hue, saturation, and brightness. The ratio of the length of two vectors does not correspond to a ratio of brightnesses, and a line in this space does not necessarily correspond to a constant hue. The empirical fact of trichromacy, on which the three-dimensionality of the representing vector space is based, only means that no more than three degrees of freedom are needed to match the colour of an isolated light patch; it does not, however, say anything about whether a coordinatisation of this vector space exists that corresponds to a set of ‘basic attributes’ of colours that can be described in a natural way. 
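As an illustration of what such a numerical representation does and does not record, the following minimal sketch may help. It is not taken from the chapter: the three rows of SENSORS and the sampled spectra light_a, light_b, and light_c are invented numbers, chosen only so that two physically different lights yield the same triple of sensor responses. Under these assumptions the sketch shows that a metameric match survives additive mixture and intensity scaling, which is all that the linearity laws assert.

```python
# Toy illustration of metameric matching and its linearity.
# All numbers are hypothetical; they are not measured sensitivities.

# Sensitivities of three sensor classes at five sample wavelengths.
SENSORS = [
    [1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 0, 1, 2, 1],
]


def tristimulus(spectrum):
    """Map a sampled spectral power distribution to three sensor responses."""
    return tuple(sum(s * p for s, p in zip(row, spectrum)) for row in SENSORS)


def mix(a, b):
    """Additive mixture: superimpose two lights wavelength by wavelength."""
    return [x + y for x, y in zip(a, b)]


def scale(k, a):
    """Change the intensity of a light by the factor k."""
    return [k * x for x in a]


light_a = [3, 1, 2, 3, 1]
light_b = [1, 2, 2, 2, 3]  # physically different from light_a
light_c = [0, 1, 0, 2, 0]  # an arbitrary third light

# Different spectra, identical responses: the two lights are metamers.
is_metameric = light_a != light_b and tristimulus(light_a) == tristimulus(light_b)

# The match is preserved when the same light is added to both sides ...
match_survives_mixture = (
    tristimulus(mix(light_a, light_c)) == tristimulus(mix(light_b, light_c))
)
# ... and when both sides are scaled by the same factor.
match_survives_scaling = (
    tristimulus(scale(2, light_a)) == tristimulus(scale(2, light_b))
)

print(tristimulus(light_a), tristimulus(light_b))  # (7, 8, 9) (7, 8, 9)
print(is_metameric, match_survives_mixture, match_survives_scaling)
```

In this toy setting the triple (7, 8, 9) merely labels an equivalence class of indistinguishable lights. Nothing in the computation assigns an appearance to that triple, which is the point made in the text: any reading of the coordinates as hue, saturation, or brightness requires assumptions that the matching data themselves do not supply.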
Hue, saturation, and brightness
Because the geometrical representations associated with these numerical representations of metameric matches exhibited a certain similarity to the geometrical representations of colours in colour order systems, such as the Munsell system, it was apparently tempting to describe them in terms of special coordinates that are assumed to capture basic colour attributes. The attractiveness of this way of linking Grassmann representations of metameric lights with geometrical representations of appearance in colour order systems is further enhanced if the alleged basic colour attribute can be operationally defined by simple physical operations. This explains why, since Helmholtz, hue, brightness, and saturation, which can be derived from the corresponding physical operations of selecting a wavelength, increasing light intensity, and diluting a light stimulus with white light, have been chosen as basic colour attributes.5
5 Ideas of regarding hue, saturation, and brightness as the ‘natural kinds’ of colour appearance, as it were, also found their way into corresponding philosophical enquiries into the nature of colour. For instance, Thompson et al. (1992) based their concept of the ‘phenomenal structure of colour space’ on these attributes. According to their account, phenomenal colour space is placed at the top of a hierarchy of colour spaces, which are vaguely related to levels of neural organization. Ironically enough, human phenomenal colour space is identified with CIE-space and the corresponding tristimulus values. These coordinates, however, are based on colour matching experiments with respect to small spots of light (aperture colours) and have been chosen from a family of linearly related colour codes that includes those that are commonly interpreted as receptoral colour codes. Thus, at the top of the hierarchy of colour spaces, i.e. phenomenal colour space, we find ourselves back at the level of receptoral colour coding.
These attributes, which are usually regarded as a natural, unique, and
complete classification for describing colour appearances (see, for example, Judd 1951, p. 837; Palmer 1999, p. 97), are typically defined as:
• brightness: ‘the attribute of a visual sensation according to which a given stimulus appears to be more or less intense’ (note the ambiguity of the concept ‘intense’ in this description);
• hue: ‘the attribute of a color perception denoted by blue, green, yellow, red, purple, and so on’;
• saturation: ‘the attribute of a visual sensation which permits a judgment to be made of the degree to which a chromatic stimulus differs from an achromatic stimulus regardless of their brightness’ (Wyszecki and Stiles 1982, p. 487).
Helmholtz and von Kries, who basically introduced this description, were aware that it is completely arbitrary, in terms of essentially physical categories. However, for example, von Kries preferred to trade psychological arbitrariness for an apparent precision of colour concepts that results from their strong tie to physical operations. He remarked that a division of colour appearances in terms of hue, saturation, and brightness ‘does not claim to be a natural one; without much ado we can regard it as a completely arbitrary one. Such a description is, however, a completely rigorous one, since it only refers to objective properties of the light that causes the corresponding appearances’ (von Kries 1882, p. 6). In the early literature many writers clearly recognized the problems that arose from using elementary physical categories as a surrogate for perceptual ones (e.g. Stumpf 1917, p. 86; Hering 1920, p. 40). From the time of Helmholtz to the present day, controversies have raged about how to choose ‘basic colour attributes’ appropriately, and how many of them are needed to capture essential aspects of colour. These controversies are not simply about terminology, but, rather, have to do with intricate theoretical issues and differences in theoretical perspectives. Evans (1948, p. 39) spoke of ‘chaos in this matter’ and went on to say that ‘the beginning reader in the subject can have little idea of how confused the subject has been in the past’.
If colour experiences could be carved up into basic attributes of hue, saturation, and brightness in a way that is as conspicuous and obvious as it is often presumed to be today, such chaos would hardly be understandable. I will mention only a few examples of these controversies about how to abstract what can be regarded as basic attributes, in the context of certain aims and purposes. According to Evans (1948, p. 39), ‘the most confusing word which will be encountered is brightness’. Although such an abstraction does not seem problematic for isolated colour patches viewed in a dark surround, its inadequacy becomes obvious in what Evans called the ‘simplest configuration’ for capturing essential qualities of colour, namely centre-surround situations. Observations in these cases led Evans (1974) to claim that five independent variables of perceived colour are needed to capture basic attributes of colours; among these he considered ‘brilliance’ an essential attribute, which he understood as the surround-dependent amount of positive or negative greyness, the latter also being described as apparent fluorescence or ‘fluorence’ (Evans 1974, p. 99).6
6 For centre-surround situations, Niederée (1998) provided, on the basis of a set of straightforward and empirically innocuous assumptions (if one is willing to accept the topological assumptions which, at least implicitly,
Centre-surround situations suffice to yield appearances such as luminous grey. Aspects of ‘brightness’ and ‘greyness’ are thus phenomenally dissociated, which in itself is a phenomenon of great theoretical relevance. It has been known since Hering that one needs at least two independent variables to capture aspects of achromatic colours. In reflections on art, the difference between a ‘brightish white’ and a ‘whitish bright’ is crucial and has been recognized as such ever since painters became interested in representing the effects of light (Schöne 1954, p. 203). These examples indicate the importance of specifying the theoretical context within which one intends to develop abstractions that are suited to capture the ‘non-chromatic intensity’ aspect of colour experiences. Without such a specification, there are no criteria to decide whether ‘brightness’ is to be conceived as an attribute pertaining, for example, to a colour patch itself, i.e. a local property, or as an attribute pertaining to a colour patch within an entire configuration, i.e. a relational property, or, referred to as ‘lightness’, as an ‘attribute of a visual sensation according to which the area in which the visual stimulus is presented appears to emit more or less light’ (Wyszecki and Stiles 1982, p. 487). While for Evans brightness is the most problematic concept, others consider saturation as the most inappropriate concept of the standard set of alleged basic colour attributes. According to Wyszecki (1986, p. 9–5), ‘the concepts, terms, and definitions of chroma and saturation are perhaps the most controversial in the literature of colour appearance’. Hering (1920, p. 40) rejected the concept of saturation altogether, as a mixing-up of perceptual and physical aspects (he preferred the concept of veiling, ‘Verhüllung’, of colour). Stumpf (1917, p. 86) also dismissed ‘saturation’ as a colour attribute completely. 
He conceived saturation to be ‘a cognitive abstraction and a cognitively added relation capturing the approximation of a colour to its ideal’. In a similar vein the concept of saturation was rejected by many others, among them Katz, G. E. Müller, and K. Bühler. Hunt (1977), at that time chairman of the CIE Colorimetry Committee, introduced the concept ‘colourfulness’, because judgements of saturation also refer to the brightness and thus do not capture, in certain situations, the qualitative aspect that a hue may be exhibited weakly or strongly. The issues underlying these controversies are not merely terminological in nature but, rather, mirror crucial differences in underlying purposes and theoretical perspectives. However, this is veiled by the fact that these kinds of basic attributes, no matter how they may be defined in detail, seem to describe roughly what appears, within our present-day ordinary way of dealing with colour, as qualitative ‘dimensions’ of colour. When we are called upon to describe differences in colours in our visual world by abstracting from all other aspects of spatial and temporal context and psychological attitude, and confining our judgement to ‘pure colour aspects’, it seems to be natural to distinguish roughly variations in the kind of hue (‘the main quality factor in colour’; Evans 1948, p. 118), in the ‘intensity’ of the patch, and in the amount of its chromatic vividness. Still, this kind of taxonomy is yielded by an abstraction that requires a proper mental attitude and rests itself on conceptions that were shaped by developments of colour technology; sensory qualities do not come with a tag
underlie almost all models of colour), a rigorous proof that the dimensionality of colour codes must be greater than three. As to the question why colour orthodoxy settled on three ‘basic attributes’ of colour, contrary to what should be obvious from the rich evidence available, Evans (1974, p.
137) suspected that ‘only a persistent desire to keep the system three-dimensional (so it can be visualized?) can explain the circumlocutions that have been resorted to, to make it so appear’.
indicating how to slice them in a certain way into ‘basic qualities’ (cf. Aubert 1865, p. 186). The specification of basic colour attributes is brought forth, within certain theoretical and practical contexts, by corresponding abstractions, as has been emphasized repeatedly in the literature. Stumpf (1917, p. 8), for instance, insisted that a specification of colour attributes is based on the ‘ability and the conditions for an isolating abstraction’; and Burnham et al. (1963, p. 5), in a report on behalf of the Inter-Society Color Council, regarded these ‘visually abstractable dimensions’ as representing ‘an abstraction from a total visual experience’ and emphasized that they ‘represent a cultural development upon which there is reasonably general agreement’. Concepts of basic colour attributes, such as hue, saturation, and brightness, are theoretical terms that have been developed and abstracted from colour experiences for certain purposes. Although they have become part of our ordinary language, they are still artificial abstractions (which, of course, are based on and exploit certain perceptual capacities). However, for perception theory, a proper understanding of colour will most likely be impeded by confusing these theoretical terms with basic structural ‘dimensions’ of the internal organization of colour. Modes of appearance The problems caused by the ‘errors of the application of colorimetric thinking to perception’ (Evans 1974, p. 197) become particularly obvious when reference to so-called ‘modes of appearance’ is made. Introduced, within the context of perceptual psychology, in Katz’s (1911) ground-breaking work, observations on these modes of appearances yielded subtle conceptual distinctions (e.g. Martin 1922; Evans 1948, 1974; Beck 1972) that are of great theoretical interest to perceptual psychology. 
It is important to note that the corresponding concepts have a purely descriptive status and are themselves in need of an explanation in terms of some abstract principles of the internal coding of colour. In the context of colorimetry, the concept of a ‘mode of appearance’ turned, however, into a pseudo-explanatory one that was called upon to alleviate the obvious inadequacies of the ‘basic attributes’ of colorimetry in situations other than small, decontextualized colour patches; although in the latter situation these attributes indeed suffice to describe completely the colour appearance, they are all too obviously inadequate for more complex situations. In order to accommodate corresponding observations, it became common in colour science to invoke a ‘switch in the mode of appearance’ (in such usage the concept of ‘mode’ wavers in its meaning between denoting, in the sense of Katz, colour appearances, or judgemental modes, or attentional modes). Such a move made it possible to simply bypass the theoretical problems encountered, by declaring that modes of appearance merely modify the ‘original colour’, which is the colour as produced by the aperture mode.7 It was Katz himself who prepared the way for this concept because he held the view that the ‘same colour’, given in its ‘pure form’ by the aperture mode, may have different modes of appearance and that its different
7 An example can be found in Judd (1960, p. 257), who, in his attempts to provide an explanation for certain phenomena, referred to an object mode in addition to mechanisms of chromatic adaptation. Thus it was only natural that the reigning orthodoxy in colour science confined itself to studying the aperture mode (e.g. Boynton 1979, p. 28), while the ‘modes of appearance’ became the epicycles of theorizing within an adaptational perspective.
modes of appearance are all based on the same retinal process (Katz 1911, p. 38).8 Many controversies were spawned by the question, whether different modes of appearance have to count as different colours or simply as different modes of appearance of the same colour.9 Within perspectives on colour perception that were determined by a neurophysiologically oriented elementaristic approach to colour, as well as by colorimetric purposes, the modes of appearance have an enigmatic and peculiar ad hoc character. According to these elementaristic perspectives, there are some kinds of ‘raw colours’ or ‘original colours’ that are directly tied to the receptor excitations elicited by the local incoming light stimulus, and that are transformed and modified in subsequent stages of processing in order to fulfil certain requirements, such as sensitivity regulation (or, according to more recent variants, optimal and efficient coding or invariance requirements). In the wake of these approaches it became a matter of course to conceive decontextualized small colour patches (that have virtually no localization or orientation), such as the ones underlying Commission Internationale de l’Éclairage (CIE) colour space, as the building blocks of colour perception. Perceptual representations of, say, surface colours, are, according to this view, built up by ‘secondary’ or ‘higher’ processes, in a locally atomistic way from these raw colours, and the modes of perceptions are merely modifications of the ‘original colours’ by context-dependent factors. Consequently, the interesting theoretical problems that lie beneath their surface were, within such perspectives, not taken seriously or not even recognized. Ideas from the field of colorimetry, which invested great efforts into developing standard procedures for capturing colour appearances, thus became a major obstacle to approaching issues of colour within perception theory in an appropriate manner.
The cultural development of colour terms
The process of standardizing colour, an issue that is of vital concern for a great variety of practical and industrial purposes, and largely divorced from perception theory, has in turn influenced our ordinary way of dealing with colour. It is, though, not a singular process in the culturally driven process of developing abstractions for dealing with perceptual experiences. I will briefly mention a few observations that provide evidence that, from the
8 The approach of Katz (which in this regard follows that of Helmholtz and Hering) is, as Gelb (1929, p. 656) criticized, ‘basically rooted in a distinction between “lower” (so-called retinal based) and “higher” (modified by experience) visual achievements’ and thus rests on an inappropriate ‘segregation of lower, primary processes and higher accessory processes’.
9 In line with Katz, Jones (1953), in his report as chairman of the Committee on Colorimetry of the Optical Society of America, expressed the view that ‘the mode of appearance does not change colour per se’. In a similar vein, Krantz, using topological arguments (and specifically making the assumption that the existence of an asymmetric match is stable for small perturbations of colour appearance) concluded that ‘changes in viewing conditions do not introduce new dimensions, rather, they at most create some new combinations of values (e.g. brown) in a fixed set of dimensions’ (Suppes et al. 1989, p. 254). In contrast, Evans (1974, p. 137) called into question the unjustified ‘assumption that these changes must occur in the same perceptual variables that are controlled by an isolated stimulus’. Previously, Troland, who had chaired a Committee on Colorimetry which attempted to set forth a clear terminology in the field of colorimetry, considered the modes of appearance to count as different colours. He argued that ‘hue, saturation and brilliance do not exhaust all possible attributes of colours, since it is possible for them to vary in dimensions distinct from any of these three’ (Troland 1929, I, 254). Because of this, he assumed ‘seven different modes of colour appearance’, which he considered to be ‘not reducible to physical terms’.
very beginning of human culture, the building up of a colour terminology has mirrored not only the significance of certain biologically important objects, but, to an increasing extent, the invention and cultural role of colouration techniques and dyeing processes, the cultural context, and the degree of linguistic abstraction achieved. My reasons for dealing with these issues are twofold. First, these observations are further evidence—in addition to the fierce controversies within colorimetry about what the ‘basic attributes’ of colour are— that the alleged basic attributes of hue, saturation, and brightness are abstractions rather than ‘natural kinds’ of colour experiences. Secondly, these observations of the cultural development of colour terms exhibit a regularity that seems to me to be of theoretical interest in its own right with respect to the perspective pursued here, namely a shift from ‘forms of light’ to object properties. This shift is consonant with the idea that the internal concept of ‘colour’ is not a unitary one but rather figures in the data format of two different representational primitives, and indicates that the way in which we exploit these primitives linguistically has changed. In our common-sense perceptual taxonomies, our conscious awareness is of objects and their material character, whereas colour appearances only seem to be a kind of medium we are reading through, as it were, in the visual system’s attempts to functionally attain the biologically significant object. 
People at earlier stages of cultural evolution had no grounds for abstracting away from concrete experiences and for assigning names to ‘pure sensations’.10 Colour itself was not the primary distinguishing feature of objects, and for most natural objects the name alone was sufficient to describe the colour.11 Thus, any vocabulary that referred to the domain of colour was accommodated exactly to the respective demands of daily needs and cultural practices.12 Along with these needs and practices, the way we talk about colour is continuously changing. From Homer’s emphasis on forms of light, such as brightness, lustre, and the changeability of colours13 to the subsequent and continuing interest in the proper colour of objects and in colour as such, there has been a culturally shaped progression toward an increasingly abstract colour vocabulary. The cognitive bases for this progression in the linguistic description of colour experiences are cognitive processes of similarity classification and abstractive categorization. When we talk today about colour we refer to abstracta such as ‘red’, ‘green’, ‘brown’ or ‘purple’. Usually we do not understand these terms as referring to a specific external world object, but rather as descriptions of perceptual qualities as such. We have thus abstracted away from any object 10 As Evans (1974, p. 199) noted, ‘In everyday life the colors of objects are not stable and there is no point in trying to assign an exact color to an object’. Our ability to discriminate colours, which exceeds our ability to identify colours by a factor of 1000–10 000, is apparently primarily exploited by mechanisms that subserve achievements such as surface segmentation, rather than being mirrored in corresponding phenomenal categorizations. 11 As an example from the extensive literature, Allen (1892, p. 254) (cf. also Rivers 1901, p. 63) concluded that ‘abstract colour terms are the names of concretes, whose original signification has been forgotten’ (cf. 
also Marty 1879; Hochegger 1884). 12 For example, as Hochegger (1884, p. 57), Allen (1892, p. 271), or Rivers (1901, p. 63) observed, the ancient languages under scrutiny did not have colour names for flowers. 13 Compare, for example, Rowe (1974) and Maxwell-Stuart (1981). Hochegger (1884, p. 36) found it ‘remarkable that etymological investigations on abstract colour names always find the roots in words that mean shiny, glowing, burning, shimmering, dingy, burnt, etc. Even the expressions for colours which seem to be abstract are, in fact, not primordial but rather emerged from paleness, brightness, glossy, matt, dingy etc.’
of perceptual reference and have assigned a meaning to a sensation.14 Yet, this process of increasing abstraction that we can observe in the development of a colour vocabulary, seems to exhibit an interesting regularity: namely a shift from an emphasis on forms of light, such as brightness, lustre, and the changeability of colours, to an emphasis on hue as an object property.15,16 The occurrence of such a shift can, in principle, be accommodated in a natural way within the general perspective that I argue for below, namely that the internal concept of ‘colour’ is not a unitary one but rather figures in the data format of two different representational primitives. The shift from ‘forms of light’ to object properties indicates that the way in which we linguistically exploit representational capacities of the perceptual system has changed due to cultural and technological factors. Cultural processes have favoured an increasing linguistic apprehension of ‘colour’ as part of the internal data format of surface representations, while, at the same time, lessening the importance of ‘colour’ as part of the internal data format of the transmission medium.
Second obstacle: Neglect of illumination perception, and the predominance of an adaptational perspective The neglect of illumination-related issues in perception theory can be traced back to the work of Helmholtz and Hering. Although phenomena such as coloured shadows, transparency and veiling, Meyer’s tissue contrast, etc. played an important part in their controversies, and although both clearly recognized the challenge that ensued from so-called constancy phenomena, they did not arrive at a proper account for the role of the internal representation of the illumination. In Helmholtz’s account, there are some traces of an internal representation of the ambient illumination, but he made short work of the illumination by simply deriving it from the entirety of colours in a visual scene and taking the mean of all colours in a visual scene as a kind of measure for a comparison process by which the concept of white is redefined (Helmholtz 1896). In line with elementaristic perspectives on colour perception, theoretical accounts of colour constancy have tended to treat variations in the ambient illumination as a kind of ‘context effect’, i.e. as an effect that modifies and distorts the ‘true’ or ‘original’ focal colour, which thus has to be internally restored by compensating processes. In other words, the ‘primary elements’ of colour perception are constituted on the level at which a stable correspondence between local properties of the sensory input and the neural reaction can 14 It is an interesting observation in its own right that we are endowed with the cognitive capacity to segregate perceptually and conceptually, in a long cultural and intellectual process, pure sensational qualities, and abstractions based on them, from the immediate perceptual experiences of the external world (conspicuous examples of the ways in which we take advantage of this capacity are geometry, or music and the theory of harmony). 
15 This can be illustrated by the vocabulary of ancient languages, such as Greek, where aspects of light and shadow, and the changeability of the appearance of objects, were of much greater importance than object colours, in our modern sense of appearances, that are correlated with invariant spectral remission properties. For the English language Casson (1997, p. 238) showed that colour terms evolved ‘as a response to an increasingly complex colour world in the Middle English period (1150–1500)’ by a shift from brightness aspects to hue aspects. He pointed out that ‘the eight Old English terms that evolved into basic color terms were predominantly brightness terms that had minor hue sense (except red, which had a dominant hue sense)’ (Casson 1997, p. 226).
16 See, however, van Brakel (2002) for an alternative perspective on these matters.
be observed, and are then further processed and transformed, modified, or supplemented by ‘secondary’, ‘higher order’ processes to yield perceptual achievements or appearances. The local connection between these ‘original’ colours and colour appearances is considered to be the ‘normal case’, and thus the so-called constancy phenomena are regarded as more surprising and in greater need of explanation than the ‘normal case’. Such a view, like corresponding views elsewhere in perception that derive from folk physics a priori kinds of classification of perceptual effects into basic or primary ones, and secondary or contextual ones, again mirrors a measurement-device misconception of perception. In fact, however, which phenomena are to be considered ‘basic’ and which ‘secondary modifications’ depends entirely on the theory of the representational primitives underlying colour perception.17 Within the elementaristic perspective on colour, a natural way of dealing with corresponding phenomena has been to treat them under the heading of adaptation. Adaptational perspectives, which were abetted by ideas from neurophysiology, emphasize the role of simple elementary mechanisms that neutralize the effects of changes of the illumination. The most prominent of these is a von Kries-type normalization of the receptor output by an illumination-dependent factor, which allows any effects of adaptation to be translated back into physics and to be described as if only the effective local physical stimulus had changed. Within functionalist perspectives, it had been observed as early as the beginning of the twentieth century (e.g. Ives 1912) that von Kries-type multiplicative processes were able to compensate in large part for the effects of illumination changes. Accordingly, various rescaling schemes have been proposed that normalize the colour signals with respect to the prevailing illumination (e.g. 
Koffka 1932).18 Due to the great success of the elementaristic research paradigm, both in revealing the nature of elementary neural coding of colour and in providing colorimetric formulae that allowed the perceived colours to be predicted under a variety of circumstances (e.g. Judd 1940), the deeper perceptual problems associated with illumination-related phenomena, such as the so-called problem of colour constancy, were consigned to oblivion for the decades to follow. The two authoritative texts in which the then-reigning research perspectives culminated gave colour constancy short shrift: under the heading of chromatic adaptation, they only devoted a few sentences to it (Boynton 1979, p. 183f.; Wyszecki and Stiles 1982, p. 440f.). It is important to be aware of what the perceptual achievement that needs to be explained in situations of chromatically illuminated objects actually is. There is no perfect colour constancy, even under favourable natural conditions, in the sense that two locations of the same spectral reflectance have an identical appearance under two different illuminations. 17 MacLeod (1947) clearly recognized how ‘misleading’ such a separation into primary and secondary determinants is, as it serves the purpose of avoiding enquiries into the structure of perceptual representations underlying colour perception; he considered it a futile attempt ‘to explain the behaviour of organised fields in terms of laws generalised from the behaviour of supposedly unorganized fields’, whereas, in fact, ‘some degree of field organization’ has to be presupposed in order to account for corresponding phenomena. 18 For recent developments along these lines that also address issues of coding efficiency and constraints derived from the statistics of natural images, see Webster (Chapter 2 this volume), and MacLeod and Golz (Chapter 7 this volume).
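The von Kries-type normalization referred to above can be made concrete with a small numerical sketch. It is only an illustration under strong simplifying assumptions: a toy world with three non-overlapping receptor classes, made-up reflectance and illuminant values, and hypothetical function names. Under these assumptions the rescaling compensates exactly for the change of illumination; with realistically overlapping receptor sensitivities it would do so only in part.

```python
# Toy three-band sketch of a von Kries-type normalization.
# All numbers and names are made up for illustration; each list holds one
# value per receptor class (long, middle, short), and the three classes are
# assumed not to overlap.

def receptor_output(reflectance, illuminant):
    """Receptor signal per class: illuminant power times surface reflectance."""
    return [r * e for r, e in zip(reflectance, illuminant)]


def von_kries(output, illuminant_signal):
    """Rescale each receptor class by an illumination-dependent factor."""
    return [o / w for o, w in zip(output, illuminant_signal)]


surface = [0.6, 0.3, 0.1]   # hypothetical reflectance per band
neutral = [1.0, 1.0, 1.0]   # hypothetical neutral illuminant
reddish = [1.4, 1.0, 0.5]   # hypothetical reddish illuminant

raw_neutral = receptor_output(surface, neutral)
raw_reddish = receptor_output(surface, reddish)

# The illuminant is handed to the function here; how the visual system
# would obtain it is exactly what this sketch leaves out.
norm_neutral = von_kries(raw_neutral, neutral)
norm_reddish = von_kries(raw_reddish, reddish)

print(raw_neutral, raw_reddish)      # raw signals differ between illuminants
print(norm_neutral, norm_reddish)    # rescaled signals agree up to rounding
```

Note that the sketch describes the outcome as if only the local stimulus had changed, and that the illuminant enters as a given quantity. Both features are the ones the surrounding argument treats as limitations of the adaptational perspective.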
What is actually achieved by the visual system is not an illumination-invariant transformation of retinal colour codes, nor an estimation of spectral reflectance functions, but rather the percept ‘colour of an object’, which is more stable than could be expected on the basis of the local sensory input alone. In this sense, the percept ‘colour of an object’ seems to be more strongly tied to the spectral reflectance characteristics of the object than to the wavelength composition of the local sensory input. There is, however, no colour constancy in the strict sense that two locations of the same spectral reflectance ‘look the same’ in all respects under two different illuminations. One can see the ‘same colour’ but yet have a different colour experience by seeing it under a different illumination. As Gelb (1929, p. 672) stated tersely: ‘Given this state of affairs, can one raise the question in the usual sense, why things keep their appearance with respect to colour in spite of changes in the intensity and kind of illumination? Obviously not.’ The phenomena concerning the interplay of surfaces and illumination in colour perception point to much deeper principles of the visual system than those of some re-normalization of the local colour code (or, as in functionalist-computational approaches, those of an alleged propensity of the visual system to keep its colour equivalence classes congruent with the physical structure of ‘reflectances of surfaces’). Because elementaristic perspectives on colour perception are based on a theoretical language that has no room for ‘semantic’ perceptual units, they have to invoke various case-dependent ad hoc assumptions, referring to spatial or temporal context, or to attitudes of the observer, in order to ‘explain’, for the phenomenon in question, how the raw colours are transformed. 
This finally led to a theoretical picture according to which ‘chromatic adaptation is, in fact, one of the greatest mysteries of colour science today’ (Billmeyer and Saltzman 1981, p. 21). From their initial conception, such ideas of taking normalizing transformations of primary colour signals as a central mechanism subserving colour constancy have been accompanied by corresponding objections emphasizing the inadequacy of such approaches. For instance, Jaensch put forth an ambitious programme that attempted to identify structural similarities between contrast phenomena and constancy phenomena (Jaensch and Müller 1920; Jaensch 1921). His, and similar, attempts were sharply attacked by several authors, notably Gelb (1929), Koffka (1932), and Kardos (1934). In particular, it was emphasized that one cannot, on the basis of adaptational concepts, arrive at suitable theoretical concepts for dealing with illumination perception. Evans (1974, p. 197) succinctly stated that ‘one of the major errors of the application of colorimetric thinking to perception is the assumption (usually unconscious) that what is seen must be explicable by a simple combination of a single stimulus and an eye sensitivity modified by colour adaptation’. Earlier writers, such as Gelb or Kardos, were not willing to sacrifice their insights into essential aspects of colour perception for an explanatory scheme that can, in a deflationary way, accommodate almost all kinds of changes of colour appearance by suitable ‘colorimetric formulae’ of chromatic adaptation (e.g. Judd 1940).19
19 Faul (see comments in Chapter 2 this volume), in a similar context, speaks of the ‘obvious danger of ending in a “Ptolemaic theory” of the visual system that is descriptively satisfying but theoretically unfruitful’.
Kardos (1934, p. 173) recognized how strongly adaptational concepts are tied to elementaristic and locally atomistic (mis-)conceptions of colour coding; he concluded from his analyses that ‘the psychophysical processes that result in a perception of an object colour, cannot be understood as a response to the local stimulus by a sense organ that is adapted and re-tuned to some illumination’ but rather considered it as an ‘immediate reaction’ to a specific input configuration. Gelb (1929, p. 672) insisted ‘that the problem of colour constancy, rather than being a problem of an alleged discrepancy between ‘stimulus’ and ‘perceived colour’, has to do with the general problem of the constitution and structure of our perceptual visual world. The phenomenal segregation into illumination and illuminated object (i.e. the correlate of the percept ‘object colour’) reveals a propensity of our sensorium and is nothing but the expression of a certain structural form of our perceptual visual world’. In the same vein, Cassirer (1929, p. 155) considers the phenomena that can be observed under chromatic illumination not to result from some additional processing, but rather as an expression of the ‘very primordial format of organisation’. Since at that time these writers did not have the conceptual apparatus provided by computational approaches at their disposal, they had to retreat to circumlocutions in order to express their insights into the structural role of colour within perceptual representations. Still, these insights were far from being mere speculations, but rather were, even at that time, strongly suggested by the theoretical and empirical evidence available. Yet, they were almost completely ignored in subsequent approaches. The problem of colour constancy came to be regarded as a problem confined to ‘pure’ colour perception, where transformations of some ‘raw colours’ result in a discounting of the illuminant. 
As a result of this way of idealizing away the perception of the illumination, the problem of colour constancy came to be mis-idealized and misrepresented. Whereas elementaristic approaches to colour perception dispense with the problem of illumination perception by treating it as a problem of context-specific modifications of ‘original colours’, current functionalist-computational approaches, which attempt to derive structural properties of colour perception from relevant physical constraints of the external world, tend to trivialize it by conflating perceptual and physical categories (cf. Mausfeld 2002). Corresponding ideas that the structure of internal colour representations is determined by the computational goal of recovering from the sensory input a function that depends only on the surface reflectance properties of objects, together with a related philosophical position, called ‘colour physicalism’, according to which colours are to be identified with sets of reflectances,20 express a distal variant of the measurement-device misconception of perception, and also reveal, again, an empiricist preconception of perception. As this way of referring to spectral remission functions illustrates, functionalist-computational approaches to colour perception tend to throw together two different levels of analysis. One level concerns which properties of the environment give rise to perceptually relevant properties of the incoming light array; a second, completely different, level concerns how structural properties of the incoming light array are exploited by the visual system in terms of its primitives.
20 The assertion that the ‘objective basis’ of ‘colours’ was spectral reflectance, or that ‘colours’ were even to be identified with spectral reflectances, is anthropocentric and attests to an abiological orientation in the face of the available ethological facts (e.g. on colour coding of different directions with respect to the sun in the celestial navigation of birds, or with respect to the water surface in the directional orientation of fish). Such assertions seem to be based on the illegitimate transfer of common-sense colour taxonomies and common-sense reasoning about colour to scientific enquiry. Likewise, philosophical attempts to justify the realism or other aspects of common-sense reasoning on colour are of no particular interest and relevance for biological enquiries into the role ‘colours’ play within cognitive architecture.
Third obstacle: Conflating levels of analysis
In enquiries into the nature of representational primitives we can, and indeed should when taking a specific subsystem of the organism as the unit of analysis, avoid any notions of the ‘proper’ object of perception and the ‘true’ antecedents of the sensory input among the infinite set of potential causal antecedents (though such notions are, of course, an indispensable part of both ordinary and metatheoretical discourse). The same characteristics of a light array reaching the eye can be produced physically in many different ways. The percept as such, say of a cube, does not testify to its origin; it can equally result from a distal object, from certain properties of the incoming light array, or from a neural stimulation at various levels of the visual system. There is no way to assign different degrees of ‘reality’ to these percepts, depending on the way they have been physically caused.21 With respect to the percept ‘surface under chromatic illumination’, the same spatiotemporal light pattern that is caused by a certain interaction of physical surfaces and light sources, and that elicits corresponding percepts, can be produced by light sources alone [using, for example, a slide or a cathode ray tube (CRT) screen]. The visual system cannot distinguish these cases: it simply doesn’t know whether the causal chain giving rise to this pattern arises from surfaces and light, or lights alone. A goal of perceptual psychology is to identify the equivalence classes of input patterns that give rise to the same internal representations or percepts, and thus to provide an abstract explanatory framework for the structure of perceptual representations. A description of such equivalence classes in the language of physics will very likely lead to very abstract mathematical entities that are quite unnatural from the point of view of both theoretical physics and folk physics.
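The many-to-one relation between physical causes and the light array can be made concrete with a toy calculation. The following is a minimal sketch under loudly stated assumptions: spectra are reduced to three hypothetical bands, all numerical values are invented for illustration, and the function names (`reflected_light`, `emitted_light`, `equivalence_classes`) are not taken from the text. It groups physically different scenes by the light they send to the eye; it does not model the perceptual equivalence classes themselves, which the text argues are held together by the perceptual system rather than by physics.

```python
from collections import defaultdict

# Hypothetical three-band spectra (short, middle, long); values are illustrative only.

def reflected_light(reflectance, illuminant):
    """Light reaching the eye from an illuminated surface: band-wise product."""
    return tuple(round(r * e, 6) for r, e in zip(reflectance, illuminant))

def emitted_light(emission):
    """Light reaching the eye from a self-luminous source such as a CRT patch."""
    return tuple(round(e, 6) for e in emission)

scenes = {
    "grey surface under reddish light": reflected_light((0.5, 0.5, 0.5), (0.4, 0.6, 1.0)),
    "reddish surface under white light": reflected_light((0.2, 0.3, 0.5), (1.0, 1.0, 1.0)),
    "CRT patch, no surface at all": emitted_light((0.2, 0.3, 0.5)),
    "bluish surface under white light": reflected_light((0.5, 0.3, 0.2), (1.0, 1.0, 1.0)),
}

def equivalence_classes(scenes):
    """Group scene descriptions by the light array they produce at the eye."""
    classes = defaultdict(list)
    for description, light in scenes.items():
        classes[light].append(description)
    return dict(classes)

classes = equivalence_classes(scenes)
for light, members in classes.items():
    print(light, "<-", members)
```

In this sketch the first three scenes fall into one class although their physical causes differ, which is the sense in which the input alone cannot identify its distal origin.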
This again highlights the futility of attempting to provide a description of the equivalence classes of colour codes in terms of their possible physical causes: colours do not constitute a well-formed physical kind. Because the equivalence classes are ‘held together’ by the structure of our perceptual system, rather than by the structure of the physical environment itself, any reference to the potential distal causes of the incoming light array is extrinsic to a formal theory of colour perception. Again, no notions of reference to the environment figure in formal theories that provide explanatory frameworks for our understanding of the internal structure of colour. The question of whether colours ‘represent’ what they normally stem from in our environment is of little relevance to our formal theories of perception, although corresponding considerations are an indispensable part of our metatheoretical talk about colours. 21 We can, however, introduce a notion of ‘reality’ that is not tied to a notion of ‘reference to the external world’ but refers solely to an internal attribute. This has been emphasized by Michotte, who conceived of ‘phenomenal reality’ as a ‘dimension of our visual experience’, which is closely linked with ‘the potential for being manipulated’ (Michotte 1948/1991, p. 181).
The only physics of the external world that figures in a formal theory of visual perception is that of the physico-geometric properties of the incoming light array. In terms of these properties, we can completely characterize the relation of representational primitives to the sensory input, and thus their ‘proximal semantics’, as it were, which can be understood as the equivalence classes of the physical input situations by which these primitives are triggered. The ‘proximal semantics’ of the perceptual system is, in other words, defined by its relation to the sensory system.22 Perceptual psychology aims, within the conceptual framework of the natural sciences, to provide, at a suitable level of description, explanatory frameworks for a specific subsystem of the brain. The functioning of these systems is essentially determined by the way physico-geometrical properties of the sensory input are exploited by the perceptual system in terms of its primitives. Questions as to which distal physical situations are the potential causal antecedents of the values of certain sensory codes, as well as questions of evolutionary history, pertain, aside from heuristic purposes, to different levels of analysis that are extrinsic, although they may supplement perception theory proper. Corresponding methodological principles are routinely employed in other domains of the natural sciences with respect to other ‘natural objects’, and there is no reason to deviate from them in the case of perceptual systems. They are considered uncontroversial, for instance, in scientific enquiries into the digestion system and the stomach, where no one would maintain that in order to understand its function one has to take into account its evolutionary history, or physical or chemical regularities of food composition in a certain environment. 
An explanatory account of its function will most likely refer to various types of internal constraints that result from its interplay with other systems, such as the circulatory system or the immune system, and would not change even if the organism lived under circumstances where the necessary nutrients were provided in an entirely artificial way. With respect to colour, the structure of relevant internal representations cannot simply be revealed by referring to physical properties, such as surface reflectance characteristics, from the outset, because there are no such things in the incoming light array. They cannot even be assumed to be necessary causes for the corresponding categories. Internal concepts, such as ‘surface colours’, are not constituted by the corresponding categories of physics, or tied to them, for example, in the sense of the latter being necessary and sufficient conditions for the former. Rather, they are constituted not only by regularities of the external physical world but also by biological regularities that are contingent with respect to physics, by internal physical and architectural constraints, and by contingent properties of internal coding, constraints about which nothing much is presently known. Current functionalist-computational approaches to colour perception tend to substantially (rather than merely heuristically) base their physical descriptions of the sensory input on categories of the yet-to-be-explained perceptual output, such as ‘surface’, ‘shadow’, or ‘illumination’, and to tacitly presuppose the perceptual concepts and categories which they profess to produce as a result of the computational procedures. By conflating different levels of analysis in this way, more specifically by conflating propositions about the physical world as such with those about the world as structured by the yet-to-be-explained perceptual system of an observer, they dodge an essential task of perceptual research, viz. the identification of the internal conceptual structure of perception.
22 Note that the ‘proximal semantics’ denotes a feature that is defined purely syntactically; the ‘proximal semantics’, as well as the structural relations among representational primitives, are given by design and are thus essentially impervious to change by experience. (What is modifiable by experience are the values of certain parameters, the latitude of which is determined in a highly specific way that is proprietary to a structure of perceptual representations.)
Triggering and parameter setting: The dual function of sensory codes with respect to representational primitives
The elementaristic perspective in colour perception, the conceptual framework of which rests fundamentally on the measurement-device misconception of perception and is shaped by concepts from neurophysiology and colorimetry, is obviously ill-equipped to deal in a theoretically fruitful way with the complex role ‘colour’ plays within cognitive architecture. As pointed out frequently in the earlier literature, enquiries into colour perception, if divorced from general enquiries into the structure of representational primitives, will fail to capture appropriately the relevant aspects of this role, and almost inevitably result in a distorted theoretical picture. Theoretical frameworks appropriate for colour perception must be general enough to also be appropriate for dealing with the structure of representational primitives. The theoretical perspective from which I will approach colour perception is derived basically from two kinds of sources, which are intimately connected in some of their core ideas. First, an ethological approach, as pioneered (taking the entire organism as the level of analysis) by von Uexküll, Lorenz, and Tinbergen, and couched, with respect to specific subsystems, in computational terms by, for example, Hassenstein and Reichardt (1956), and extended to richer and more complex biological functions by, for example, Wehner (1987), Marler (1999), or Gallistel (1998). Secondly, an internalist line of thinking, as described above, which found its most elaborate expression in Chomsky’s internalist enquiries into the nature of language and mind (e.g. Chomsky 2000).
A cardinal feature of an ethology-inspired internalist approach, which in its basic conceptions is in line with deep conceptual clarifications of the nature of perception that have been achieved in the history of the field, notably in the seventeenth century,23 is that it focuses attention on the rich internal conceptual structure with which the perceptual system is biologically endowed. In specific domains, such an approach has already yielded intriguing explanatory frameworks of promising range and depth. In perceptual psychology, its basic tenets are receiving support from a wealth of empirical and theoretical evidence that has been marshalled by gestalt psychology, Michotte’s ‘experimental phenomenology’, studies with newborns and young children, and computational analyses: this evidence indicates that the structure of internal coding is built up in terms of a rich set of representational primitives.24 23 Among these seventeenth-century achievements (cf. for example, Yolton 1984, 2000; Wilson 1990), which Chomsky (1997) referred to as the first cognitive revolution, the work of Arnauld (1683/1990; see also Nadler 1989) and Cudworth (1731; see also Passmore 1951) is of particular relevance for perception theory (cf. Mausfeld 2002a, Appendix). 24 Michotte was particularly sensitive to the problem of meaning in perceptual theory, which he regarded as being intrinsic to the structure of primitives that underlie perceptual organization and that ‘prefigure’ the phenomenal world.
The relation between the sensory input and the representational primitives
The theoretical picture that has emerged from corresponding studies can be condensed abstractly in this way: perception cannot be understood as the ‘recovery’ of physical world structure from sensory structure by input-based computational processes. Rather, the sensory input serves as a kind of sign for biologically relevant aspects of the external world that elicits internal representations on the basis of given representational primitives.25 Although the sensory input is a causally necessary requirement for perceptual representations, the perceptual computations triggered are under the control of an internal programme based on a set of representational primitives; they are representation-driven rather than stimulus-driven. These primitives determine the data format, as it were, of internal coding. Each primitive has its own proprietary types of parameters, relations, and transformations that govern its relation to other primitives. The data structure for the internal representational primitive ‘surface’, for instance, can be expected to include a set of free parameters, which refer to attributes such as ‘colour’, ‘stability’, ‘tenacity’, ‘ruggedness’, ‘orientation’, etc. (again to be understood as internal, and not as physical attributes), as well as parameters for ‘ambient illumination’ and ‘local illumination’.
Note again that within an ethological and internalist approach the use of the term ‘surface representation’ serves only as a convenient abbreviation for an element of postulated internal structure (whose nature we presently only poorly understand), whose core properties seem to be describable, at a meta-theoretical level, in terms of perceptual achievements that are related to actual surfaces; it is not, in any meaningful sense, to be understood as a representation of physical surfaces, and neither involves any particular ontological commitments about mental entities nor, on this level of analysis, any reference to the external world. The values of the free parameters, which lie in a specific region of the corresponding parameter space, have to be determined by the sensory input (and are probably modulated by factors such as ‘attentional weight’). The sensory codes thus serve a dual function: first, they provide triggering cues for representational primitives and thus they determine the potential data formats in terms of which input properties are to be exploited. Secondly, they are used by the activated primitives to determine the values of their free parameters. The activation of a representational primitive and the determination of the free parameters have to be interlocked dynamically. On the one hand, values can only be assigned to free parameters once the data format has been determined; on the other hand, the activation of a specific data format requires that the values assigned to the free parameters be in a permissible range and lie in a specific region of the corresponding parameter space (if certain types of parameters belong to more than one representational primitive, their values are very likely constrained differently).
25 Thus, even ‘highly impoverished’ sensory inputs can trigger perceptual representations, the ‘complexity’ of which far exceeds that of the triggering stimulus, and the relation of which to the sensory input can be contingent from the point of physics or geometry.
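The dual function of sensory codes described above (triggering a data format, then setting its free parameters within a permissible region) can be illustrated with a toy data structure. This is a minimal sketch, not a model proposed in the text: the class `Primitive`, the cue name `closed_contour`, the parameter names, and all numerical ranges are hypothetical, and the dynamic interlocking is reduced to a single all-or-nothing check.

```python
from dataclasses import dataclass, field

@dataclass
class Primitive:
    """Hypothetical stand-in for a representational primitive such as 'surface'."""
    name: str
    trigger_cues: set   # cue types that must be present to trigger the primitive
    permissible: dict   # free parameter -> (low, high) permissible range
    values: dict = field(default_factory=dict)

    def try_activate(self, sensory_codes):
        """Activate and fix parameter values only if both conditions hold."""
        # First function of the sensory codes: triggering the data format.
        if not self.trigger_cues.issubset(sensory_codes):
            return False
        # Second function: setting the free parameters, which must all
        # lie in the permissible region for the activation to stand.
        proposed = {}
        for param, (low, high) in self.permissible.items():
            value = sensory_codes.get(param)
            if value is None or not (low <= value <= high):
                return False  # activation fails, so no values are assigned
            proposed[param] = value
        self.values = proposed
        return True

def make_surface():
    # Ranges are invented for illustration only.
    return Primitive(
        name="surface",
        trigger_cues={"closed_contour"},
        permissible={
            "colour": (0.0, 1.0),
            "orientation": (-90.0, 90.0),
            "ambient_illumination": (0.0, 1.0),
        },
    )

codes_ok = {"closed_contour": 1.0, "colour": 0.4,
            "orientation": 30.0, "ambient_illumination": 0.7}
codes_bad = dict(codes_ok, orientation=120.0)  # outside the permissible range

accepted = make_surface()
rejected = make_surface()
print(accepted.try_activate(codes_ok), accepted.values)
print(rejected.try_activate(codes_bad), rejected.values)
```

The design choice to return without assigning any values when one parameter falls outside its range is one simple way of expressing that activation and parameter setting depend on each other; a more faithful sketch would let several primitives constrain shared parameters differently.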
Although the properties and interdependencies of the free parameters of representational primitives have to mirror, with respect to the perceptual system as an entirety, biologically relevant structural properties of the external world, empirical evidence strongly suggests that they are co-determined by internal aspects, such as internal functional constraints or internal architectural constraints, such as legibility requirements at interfaces. The complex, and up-to-now poorly understood, interdependencies of free parameters, which do not simply mirror external physical regularities, contribute to the fact that representational primitives defy definition in terms of a corresponding physical concept (even in the sense of the latter providing necessary and sufficient conditions for the former); rather, they have their own peculiar and yet-to-be identified relation to the sensory input and may also depend intrinsically on other representational primitives, in a way that cannot simply be derived from considerations of external regularities, however appropriately we have chosen our vocabulary for describing the external world.
Non-reducible primitives of the perceptual system
When dealing with perceptual systems as complex as ours, this general theoretical picture requires, in my view, a refinement by distinguishing in a specific way between a sensory system and a perceptual system. Before I characterize this distinction, I will try to motivate it. In the earliest evolved sensory systems, such as those confined to phototaxis, the function of the sensory input is to control movement of the organisms with respect to external objects, and thus is, in a sense, completely exhausted by the way it interfaces with the motor system. In the course of evolution, sensory systems of increasing complexity have evolved which exploit and integrate different kinds of input properties for the purpose of the same output function, such as prey catching, and, at even higher levels of complexity, exploit the same input property independently for the purposes of several different output functions, such as feeding and spatial orientation.26,27 In even more complex sensory systems that have to subserve a great variety of tasks simultaneously, the outputs of many subsystems must be integrated into a common representational structure and made available internally for purposes of a great variety of higher-order representations, such as those that perceptually exploit the behaviour of conspecifics. Architectural complexity increased further when perceptual systems came to evolve that ‘are not linked to specific motor outputs but to cognitive systems involving memory, semantics, planning, and communication’ (Goodale 1995, p. 175), in other words,
26 In bees, for instance, colour vision proper and wavelength-dependent behaviour coexist and subserve independent functions (cf. Goldsmith 1990). The action spectra for wavelength-dependent behaviour underlying bees’ celestial orientation and navigation, depend on more than one pigment, without exhibiting metameric classes, whereas trichromatic colour vision is exclusively employed in feeding and recognition of the hive. For a related dissociation of wavelength processing and colour perception proper in the human case, see Heywood et al. (1991). Cf. also D’Zmura (Chapter 4 this volume). 27 The corresponding sensory–motor subsystems can be organized, functionally as well as neurally, quite independently (e.g. Ingle 1983), without resulting, beyond some internal co-ordination, in some kind of common representing structure whose internal function goes beyond those of the single subsystems.
representational systems that provide the means to assign ‘meanings’ in terms of ‘external world’ objects and properties.28 Along with increasing computational demands on perceptual architecture, and various kinds of internal constraints associated with it, a system of internal perceptual representation has emerged (by processes that are still not understood), which extends far beyond physical aspects of the external world. The rich conceptual structure of the perceptual system cannot simply be understood as mirroring physical categories of the external world. Rather, an adequate explanation is tantamount to apprehending the ‘internal semantics’ of the system. The ‘internal semantics’ of the perceptual and the cognitive system includes, as had already been clearly recognized by Cudworth (1731, p. 155), ‘intelligible ideas of cause, effect, means, end, priority and posteriority, equality and inequality, order and proportion, symmetry and asymmetry, aptitude and ineptitude, sign and thing signified, whole and part’ as well as other ‘ideas of the mind which were not stamped or imprinted upon it from the sensible objects without, and therefore must needs arise from the innate vigor and activity of the mind it self’. Because the complex conceptual structure of the perceptual system cannot be derived or inductively inferred from the structure of the sensory input, it is, I believe, necessary to distinguish a sensory system from a perceptual system in enquiries into human perceptual capacities. In line with empiricist preconceptions about the conceptual structure of the mind, there have been many highly influential attempts to deny, or call into question, the need for such a distinction. Such conceptions regard it as desirable to explain the properties of a system entirely in terms of observables.
This is, first of all, a perplexing postulate, since it is entirely alien to the methodological principles normally employed in the natural sciences, where we impute existence, subject to empirical verification, to whatever increases the explanatory range and depth of frameworks that account for the relevant observations and facts. Still, conceptions that presume that the conceptual structure underlying perception can be derived from ‘sensory information’ prevail, in various guises, in perception theory. According to such preconceptions, sensory concepts are ‘fundamental’ and are given as part of our biological endowment, whereas non-sensory or non-observational concepts have to be defined in terms of sensory concepts or built up from them inductively. It is well known from the history of epistemology that corresponding programmes of founding non-observational terms entirely in sensory ones foundered, even in their most sophisticated variants. In perception theory, sophisticated research programmes along these lines, such as Marr’s influential approach or Shepard’s ideas about evolutionarily internalized regularities, have enriched the structure of the sensory system by a rich set of internal assumptions and heuristics about the physical world or internalized physico-mathematical regularities that cannot, by themselves, be derived from the sensory input but rather have to be regarded as part of the biological endowment of the system. However, as mentioned above, the conceptual structure underlying human perception extends far beyond concepts that refer to physical properties of the world. Unless one belittles and grossly underestimates the richness of the conceptual structure of our perceptual system, an appropriate explanatory account of it cannot be derived from the conceptual structure of the sensory system, as empiricist theories of the mind purport to be the case. The sensory system, as understood in the present distinction, deals with the transduction of physical energy into neural codes and their subsequent transformations into codes that are ‘readable’ by, and fulfil, the structural and computational needs of the perceptual system; we can refer to these codes as ‘cues’ or ‘signs’. Its internal concepts are definable in the same physico-geometrical language that we use in psychophysics to describe the sensory input, and its operations are purely sensory-based transformations, such as filtering and convolutions, calculation of certain derivatives of luminance distributions, gain control operations, or any other mathematical operation on the sensory input or on codes obtained from other such operations.29 Although the conceptual structure of the sensory system can be described in terms of the physico-geometrical language used for a description of the sensory input, we cannot simply give a direct physical explanation of its properties. Rather we need an additional, more abstract level of analysis, often referred to as ‘computational level’. The reason for this is that even the sensory system is representation-driven (with respect to its internal conceptual structure) rather than input-driven, i.e. the sensory system can generate the same information from a variety of physically different input signals and make it accessible in a highly versatile way for a variety of more complex representations.
28 Since the evolution of more complex structures apparently takes, in a manner of speaking, advantage of already existing older ones, it is partly mirrored in the functional and neural organization of the primate brain. For instance, Goodale and Milner (cf. Goodale 1995), elaborating on a distinction proposed earlier by Schneider, and by Ungerleider and Mishkin, distinguished a dorsal and a ventral cortical stream, which they associated with different transformations of the sensory information, namely transformations that relate it to the entirety of visual information in the case of the ventral stream, and transformation into egocentric frameworks for motorial purposes in the case of the dorsal stream.
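The kinds of sensory-based transformations named above (convolution, derivatives of luminance distributions, gain control) can be sketched in a few lines. The following is only an illustration under stated assumptions: luminance is a one-dimensional list of invented values, gain control is reduced to division by the mean level, and the function names are hypothetical. It shows one narrow sense in which two physically different inputs can yield the same sensory code.

```python
def convolve_valid(signal, kernel):
    """Discrete 1-D convolution, keeping only positions where the kernel fits."""
    flipped = kernel[::-1]
    width = len(kernel)
    return [
        sum(signal[i + j] * flipped[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]

def luminance_derivative(luminance):
    """Central-difference derivative, written as a convolution."""
    return convolve_valid(luminance, [0.5, 0.0, -0.5])

def gain_control(luminance):
    """Divide by the mean level (a deliberately crude form of gain control)."""
    mean = sum(luminance) / len(luminance)
    if mean == 0:
        return [0.0 for _ in luminance]
    return [value / mean for value in luminance]

def sensory_code(luminance):
    """Hypothetical two-stage code: gain control followed by a derivative."""
    return luminance_derivative(gain_control(luminance))

dim = [10, 10, 10, 20, 30, 40]      # invented luminance profile
bright = [3 * v for v in dim]       # same profile at three times the intensity

print(luminance_derivative(dim), luminance_derivative(bright))
print(sensory_code(dim), sensory_code(bright))
```

The raw derivatives of the two profiles differ, whereas the codes after gain control coincide, so in this toy case the code is determined by the relative structure of the input and not by its absolute level.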
The sensory system, according to the distinction made here, pre-processes the sensory input (in a way that is dynamically interlocked with the specific requirements of the representational primitives involved) in terms of a rich set of input-based concepts that are tailored for the structural and computational demands of the perceptual system. The perceptual system, on the other hand, contains, as part of our biological endowment, the exceedingly rich perceptual vocabulary in terms of which we perceive the ‘external world’, such as ‘surface’, ‘physical object’, ‘intentional object’, ‘potential actors’, ‘self’, ‘other person’, or ‘event’ (with respect to a great variety of different categories and time scales), with their appropriate attributes such as ‘colour’, ‘shape’, ‘depth’, or ‘emotional state’, and their appropriate relations such as ‘causation’ or ‘intention’. Thus, its representational primitives, which not only pertain to physical and biological aspects but also to mental states of others, cannot be defined in terms of the primitives of the sensory system: the ability to interact mentally with others rests on representational primitives (the nature of which is still at the boundary of scientific elucidation) which have their proprietary ways of exploiting the sensory input. It is an essential characteristic of the way these primitives exploit the sensory input that they go ‘beyond’ those physico-geometrical properties of the sensory input that are exploited by primitives dealing with the physical world; for instance in perceiving mental states of others, they go beyond what may be called physical surface characteristics of the situation encountered.
A core phenomenon of perception, which is so pervading and fundamental that it is almost overlooked as still being in need of explanation, is what is called ‘figure-ground’ segmentation, correctly regarded as a ‘major obstacle in developing computational theories’ by Weisstein and Wong (1987, p. 61). The occurrence and the specific properties of figure-ground segmentations30 directly mirror the conceptual structure of the perceptual system, and cannot be derived from sensory-based concepts. The phenomenon of figure-ground segmentation is the result of the way representational primitives, notably those dealing with surface interpretations, interact by virtue of their internal structure. Thus, an explanatory account of figure-ground segmentations is tantamount to an explanatory account of the structure and interplay of representational primitives. Already, this apparently simple phenomenon shows that the representational primitives of the perceptual system, and the concepts expressed by them, cannot be understood in the same physico-geometrical language that we use to describe the input, nor in the language that we use to describe the functioning of the sensory system. Although they could, in principle, be described in such a language, understanding them presupposes an understanding of the internal conceptual structure of the entire system under scrutiny, i.e. of the ‘internal semantics’ of the system.31 Whereas the relation between the ‘internal concepts’ of the sensory system can be described in terms of causation within the language of physics, the internal relations between the representational primitives of the perceptual system require a level of description that, without thereby implying any specific ontological commitments, we can refer to as ‘semantic causation’ and describe, purely syntactically, by computational processes. The same holds for the physical causation at the interface of the sensory system and the perceptual system; with respect to this ‘semantic causation’ we speak of the representational primitives of the perceptual system as being triggered by the signs provided by the sensory system.32 The perceptual system thus comprises the rich perceptual vocabulary, in terms of which the signs delivered by the sensory system are exploited. Furthermore, it provides the computational means to make these perceptual concepts accessible to higher-order cognitive systems, where meanings are assigned in terms of ‘external world’ properties. From an ethological perspective, there is no reason to suspect that there is a fundamental difference, with respect to the architecture and functioning of the perceptual system, between perceiving aspects of the physical world and aspects of the mental states of others. In either case, the sensory input serves as a sign for biologically relevant aspects of the external world that elicits internal representations on the basis of given representational primitives.
29 The research programme pioneered by Marr has shown how surprisingly rich and sophisticated is the class of concepts that can be achieved on the basis of sensory-based transformations under suitable assumptions about relevant aspects of the physical world.
30 Figure-ground segmentations can refer to different abstract relations between a medium and a perceptual object that it ‘carries’, such as, in the usual understanding of the concept, to its figural aspects, or, in the perceptually important ‘object’ versus ‘stuff’ distinction, to its material aspects.
31 Even extremely empiricist accounts of perception, such as Gibson’s, have to permit a level of description in terms of biological and perceptually meaningful concepts, in order to account for even the simplest kinds of observations in perception. In Gibson’s case the need to refer to a given conceptual structure of the perceptual system is camouflaged in his concept of ‘affordances’ (such as ‘obstacle’, ‘terrain’, ‘places to hide’, or ‘manoeuvrable objects’); by introducing these non-mental, adaptively significant properties of the physical world, Gibson attempts to externalize meaning, as it were.
32 The distinction between a sensory system and a perceptual system proposed here is different in character from widely made distinctions between so-called earlier or lower-level systems and higher-level systems. The latter basically correspond to the sensation–perception distinction as used by Spencer, James, Wundt, or Helmholtz, which refers to an alleged hierarchy of processing stages within the same vocabulary, by which the sensory input is transformed into ‘perceptions’. In contrast, the present distinction, which is more in line with corresponding distinctions by Descartes, Arnauld, or Cudworth (cf. Mausfeld 2002, Appendix), conceives of the perceptual system as a structure whose primitives cannot be defined in terms of the primitives of the sensory system.
Competing conjoint representations
In sufficiently complex perceptual systems with a high degree of representational versatility the same type of input code can be exploited by several representational primitives of the same type, or by different types of representational primitives with overlapping parameter spaces. When different aspects of the visual input are exploited by the same type of representational primitives, for example ‘surface’ representations, we can encounter situations involving competing interlocked parameters, say for size and distance, orientation and form, or motion direction and form (which can be mirrored phenomenally in multi-stable or vague percepts). A change in the value of one type of parameter, say for coding depth, can, even in cases of otherwise identical stimulus conditions, require considerable changes in other types of parameters, say for coding motion direction or three-dimensional form. The demonstration by Hornbostel (1922) is a particularly striking classical example, showing that a change in parameters for motion direction (and a concomitant change in depth parameters) constrains form parameters in a way that is only compatible with non-rigid transformations of form. Similar observations have been made with respect to other attributes (e.g. Schwartz and Sperling 1983; Dosher et al. 1986; Kersten et al. 1992).
For instance, motion can co-determine colour in various ways (Nijhawan 1997; Hoffman, Chapter 12 this volume), and Nakayama et al. (1990, p. 497) observed that ‘If perceived transparency is triggered, a number of seemingly more elemental perceptual primitives such as colour, contour, and depth can be radically altered.’ However, we still have only a poor understanding of which types of representational primitives are involved in these situations. Of particular interest in the present context is a type of architecture, where the same input can be exploited by several different but interlocked representational primitives, and consequently gives rise to multiple simultaneous layers of representations. These types of representations require special mechanisms and computational means to handle the interlocked way in which they exploit the same input, and give rise to exceedingly complex perceptual achievements. We can refer to these types of representations as conjoint representations over the same input (cf. Mausfeld 2003). The existence of conjoint representations is a pervading property of highly versatile and complex perceptual systems. Colour perception appears to be a particularly conspicuous case of conjoint representations. Because the same characteristics, with respect to colour or brightness, of a light array reaching the eye can be produced physically in many different ways (e.g. by different combinations of physical surfaces and light sources or, using a slide or a CRT screen, by light sources alone), representational primitives that subserve different distal interpretations, as it were, compete, on the basis of relevant cues, for the same input. This is an issue that I will address in more detail in the next section. Another related example (observed by
Turhan 1937, p. 46) is that brightness gradients can give rise to two incompatible percepts simultaneously, one of a curved surface (as would result from an ‘interpretation’ of the sensory input in terms of a specific non-homogeneously illuminated surface) and the other of a slanted flat surface (as would result from an ‘interpretation’ of the same sensory input in terms of a homogeneously illuminated surface). However, the triggering strength of the sensory input does not suffice to secure an unambiguous ‘interpretation’ in terms of either of the representational primitives involved. The internal vagueness with respect to the representational primitives involved is, as Turhan noted, perceptually mirrored in a peculiar impression of perceptual vagueness and indeterminacy. More complex examples of conjoint representations are pretence play, or watching a theatrical performance. In both of these cases, two types of representational structures are simultaneously activated on the basis of the same input signals, yielding two layers of competing interpretations. Michotte (1960/1991, p. 191f.) aptly described the perceptual achievement: ‘[in the] duplication of space and time that occurs in theatrical representation the space of the scene seems to be the space in which the represented events are actually taking, or have taken, place and yet it is also continuous with the space of the theatre itself. Similarly for time also, instants, intervals, and successions for the spectators belong primarily to the events they are watching, but they are left nevertheless in their own present.’ As mentioned, conjoint representations require special computational means to handle the way in which different representational primitives compete for the same output of the sensory system.
In line with empirical evidence, we have to assume that the equivalence classes of physical situations or output codes of the sensory system by which representational primitives are triggered yield, in general, smooth and robust triggering characteristics, both with respect to the relation of a single representational primitive to its triggering class of inputs and with respect to transitions between representational primitives that exploit the same input.33 Usually, in a given input situation (which can also include dynamic sequences of inputs), there is some latitude as to which representational primitives could be triggered and which values could be assigned to their free parameters; latitude that corresponds to an ambiguity about which of a set of potential external situations could have given rise to the sensory input. The extent of this latitude is determined by the structure of the joint parameter spaces involved. In such cases, the visual system often exhibits a preference for some ‘default interpretations’. These preferences can be expected to partly mirror different probabilities of external scenes by which a certain sensory input can be caused under ‘normal’ ecological conditions. However, such ecological probabilities do not solely, or even predominantly, determine ‘default interpretations’, as can be illustrated by the case of the Ames room, or by perceived non-rigid transformations of rotating rigid objects, due to ‘a coupled assignment of motion (direction of rotation) and form’ (Dosher et al. 1986, p. 973). Rather, in cases where different combinations of values can be assigned to the free parameters, internal constraints that result from various kinds of stability requirements are very likely to play a crucial role in singling out ‘default interpretations’. Global stability of super-ordinate representations could be maintained, following small variations in the input, by a strategy in which global changes in the representational primitives triggered, and in the values of their free parameters, are, intuitively speaking, kept to a minimum (particularly at the interfaces of the perceptual system with the motor system and with higher cognitive systems). Such a strategy would protect the system from settling, in ‘impoverished’ situations, on some definite interpretation that would have to be changed to an entirely different interpretation following a small variation in the input.34, 35
33 Since triggering a representational primitive is tantamount to exploiting the sensory input (or the output of the sensory system) in terms of a specific data format with a specific set of free parameters, corresponding ‘smoothness’ requirements apply, as a rule, to the mapping of physical input features to values of the free parameters.
The conceptual structure of the perceptual system provides a pillar for the conceptual structure of higher-order cognitive systems. Furthermore, it has to suit the ‘conceptual structure’ or schemata of the motor system (the sensory system also interfaces, as plenty of evidence suggests, directly with the motor system; this interface is an old one in evolutionary terms). Consequently, the representational primitives of the perceptual system and their internal structure have to ensure an optimal fit of data formats at the corresponding interfaces. As the essential conceptual tie between the sensory system and higher-order cognitive systems, the perceptual system links the signs provided by the sensory system to the conceptual structure of language and of other cognitive systems.
Interestingly, but hardly surprisingly, the conceptual structure of the perceptual system seems, in humans, to resemble the structure of language (more precisely, the structure of the lexicon of I-language), where ‘notions like actor, recipient of action, instrument, event, intention, causation and others are pervasive elements of lexical structure, with their specific properties and interrelations’ (Chomsky 2000, p. 62), more than it resembles the structure of the sensory system.
The theoretical framework boldly outlined and tentatively explored here, whose overarching methodological elements are taken from ethology and internalist approaches to the study of the mind, is, needless to say, sketchy and in want of precision and specification. However, in comparison to currently prevailing approaches to perception, which focus predominantly on aspects of processing, much has already been gained if one takes seriously the besetting foundational questions that any successful explanatory account of perception eventually has to answer, viz. the questions as to the conceptual structure of the perceptual system and the nature of the representational primitives giving rise to it.
With respect to ‘colour’, a great variety of evidence has been accumulated since the beginning of systematic investigations into the nature of colour perception, suggesting that ‘colour’ cannot be considered as a kind of independent or homogeneous attribute, but, rather, serves different roles and obeys different principles in different conceptual substructures of the perceptual system.36
34 In input situations where properties are compatible with various combinations of values of the free parameters (of representational primitives of the same, or of different, types), transitions between different interpretations often appear to be, to some extent, receptive to modulations by attentional mechanisms.
35 Phenomena of multistability seem to be due primarily to properties of processes that exploit the visual system’s outputs for the purposes of other cognitive structures (cf. Leopold and Logothetis 1999).
36 This likely holds for other attributes based on common-sense taxonomies as well; for instance ‘transparency’ seems to figure in different ways in different conceptual substructures pertaining to occlusion and containment events, as developmental data indicate (Baillargeon and Wang 2002).
In the following section, I will deal with the role of ‘colour’ within the conceptual structure of the perceptual system and, more specifically, address observations that suggest directly that there are (at least) two different types of representational primitives in which ‘colour’ figures as a free parameter.
‘Colour’ as two different kinds of free parameters in the structure of representational primitives
The general and abstract theoretical framework boldly outlined above binds together, on the basis of conventional methodological meta-principles pertaining to the study of complex biological systems, a few very general principles of perception that appear to me both well motivated and empirically well supported. The theoretical perspective based on these principles is inevitably conjectural and vague, in the light of what is currently understood of the principles underlying perception. While it can, all the same, serve as a guiding line for enquiry and ways of posing questions, it turns into an explanatory framework for a certain domain only after its blanks have been appropriately specified, to render possible specific, testable predictions. Since colour perception has been approached predominantly from quite different research perspectives, little experimental evidence is available that addresses the issues involved directly and with sufficient resolution. Fortunately, classical works, apart from some isolated cases in more recent years, paid great heed to questions of illumination perception, and consequently provide a wealth of qualitative observations in the light of which the proposed framework can be evaluated. In order to facilitate such an evaluation, I will derive, from the more general proposal that ‘colour’ figures in different ways in two different representational primitives pertaining to ‘surfaces’ and ‘illumination’,37,38 some qualitative predictions, which then can be evaluated with respect to the available empirical evidence. So let us assume, as a basis, a kind of architecture and functioning of the perceptual system along the general lines described in the previous section.
Let us further assume that among the primitives underlying visual perception, there is a class that pertains to perceptual entities, such as ‘surfaces’, that are usually also potential objects of manipulation, and a class that pertains to the medium, as it were, by which these objects can be attained perceptually, notably ‘ambient illumination’. Each of these different classes of primitives can be characterized by its proprietary type of logical structure or data format; thus, each type has its own proprietary types of parameters, relations and transformations that govern its relation to the sensory input as well as to other primitives. It then seems natural to assume that each of these two classes has a free parameter pertaining to ‘colour’.39 Both representational primitives consequently form a conjoint representation with respect to the free parameters ‘colour’ (as well as ‘brightness’). The corresponding regions in the parameter spaces for ‘colour’ of these two representational primitives overlap with respect to both the required input from the sensory system and the outputs that feed into a corresponding parameter in super-ordinate representations. The question then arises, how these two different sets of parameters of the same type are interlocked with respect to the common higher-order representation which they subserve and in which they figure. Without further specification, we can conceive a great variety of potential architectures in which different properties of these two kinds of parameters pertaining to ‘colour’ can be recognized with respect to specific achievements that can vary greatly with the type of architecture assumed. For instance, it is, in principle, conceivable that the difference between these kinds of parameters is not mirrored in any corresponding differences in appearance, but that they feed in a phenomenally silent way, as it were, into certain processes that are in the service of specific functional achievements. With this cautionary note in mind, we can still formulate a few qualitative properties, each of which has some plausibility on the basis of the assumptions made. While none of these properties can be deduced, in a proper sense, from these assumptions, they would, taken together, provide a sufficiently distinctive set of evidence in favour of the proposed framework. The following (interrelated) qualitative properties, with which I will deal in the sequel, seem to me particularly natural on the basis of the assumptions made.
1. A significant indication for the existence of two different kinds of colour-related parameters involved in situations of perceived surfaces under chromatic illumination would be provided by evidence that the corresponding colour appearances are simultaneously present at the same ‘location’ as distinctive aspects of the percept. Furthermore, such evidence would gain weight if it proved impossible to compensate for phenomenal changes in one of these aspects by changes with respect to the other.
2. The existence of the two different kinds of parameters pertaining to ‘colour’ should also be reflected in corresponding phenomenal differences of ‘colours as such’. More specifically, there should be two phenomenal realms of ‘colour’, each characterized by specific attributes, depending on the primitive in which they figure. The occurrence of such categorical phenomenal differences does not require that either a surface or an illumination is phenomenally present as a perceptually discernible entity.
3. For functional reasons, it is to be expected that, in general, the two kinds of representational primitives involved break down the sensory colour signal and accordingly assign values to their respective ‘colour’ parameter in a smooth way. Any evidence for conditions under which small changes in certain aspects of the sensory input or changes in figure-ground segmentation result in abrupt switches of the assignment of the sensory colour signal from one kind of colour parameter to the other, and thus in a corresponding reorganization of the percept, would provide significant evidence that the sensory colour signal is split up by two categorically different kinds of primitives.
4. The values of the two different kinds of colour-related parameters can be assumed to be subject to different types of internal and external constraints, which also would result in different coding properties. Corresponding evidence would be particularly compelling for seemingly elementary colour codes pertaining, according to the received view, to levels of the sensory system where a distinction into ‘surface’ and ‘illumination’ properties has not yet been established, such as thresholds and other properties of colour and brightness discrimination.
5. The corresponding ‘colour’ parameter of each of the representational primitives involved can be expected to be intrinsically interwoven with other free parameters of the respective primitive. In particular, the attribute ‘colour of a surface’ is not autonomous, as it were, but rather depends on other attributes pertaining to this representational primitive and to its relation to other primitives of the same, or of different, types.
37 At the risk of being repetitious, I will again recall that, within the general approach pursued here, terms such as ‘surface representation’ are not, in any meaningful sense, to be understood as a representation of physical surfaces, but only serve as a convenient abbreviation for an element of postulated internal structure that is entirely determined syntactically, i.e. by its data structure and the kind of transformations and relations that operate on it.
38 ‘Colour’ presumably also figures as a free parameter in a variety of super-ordinate primitives that pertain to more complex biologically relevant aspects of the external world, such as those pertaining to ‘dangers’, ‘edible things’ or to ‘emotional states of others’.
39 More precisely, the two different parameters involved can be regarded as pertaining to the same attribute, if they are based on the same type of input codes of the sensory system and figure as parameters of the same type in some super-ordinate structures and computations. Again, a label such as ‘colour’ serves only as a convenient meta-theoretical characterization of a certain type of parameter.
There have not been many experimental studies in colour science designed specifically to address any of these issues. Many other studies, notably those involving centre-surround stimuli, conducted within very different theoretical approaches, have sometimes provided indirect and partial evidence bearing on these issues. Since the evaluation of this evidence is difficult and requires many further assumptions, I will draw primarily on qualitative studies that are more directly relevant to these questions. As mentioned above, many of these studies come from a period when the problem of illumination perception and the dependency of ‘colour’ on the entire ‘structural organization’ of the percept received greater attention than in more recent periods.
Phenomenal observations on surfaces under chromatic illumination
As to the first type of qualitative properties based on the assumptions made above, it has been regularly reported, particularly in the classical literature, that in certain situations, even as simple as centre-surround configurations, no satisfactory match between two test fields under different context conditions can be achieved by varying the colour codes of the test field. The way to systematically investigate corresponding issues was prepared by Katz, who, in his careful phenomenological observations, noted that colour appearances under chromatic illumination have a peculiar character of a kind that cannot be encountered under normal illumination. As a consequence, ‘attempts to establish colour appearances within a field of view under qualitatively normal illumination that in all respects are equal to colour appearances that can be encountered in fields of view under chromatical illumination are prone to fail’ (Katz 1911, p. 274). Bocksch (1927, p. 373) and Gelb (1929, p. 613, 626) made precisely the same observations, which Gelb regarded as ‘intriguing and theoretically important’. Recently, corresponding observations have been made by Brainard et al. (1997, p. 2098). In general, however, the subtle phenomenal differences in colour appearances under chromatic illumination have often escaped appreciation.
Of particular theoretical interest is that small chromatic deviations from a normal illumination are not perceived as chromatic changes in the illumination but rather as a change in an additional quality that cannot be assigned specifically to either surfaces or illumination, but, rather, pertains to the interplay of illumination and surfaces themselves, namely the warm–cold dimension (cf. Koenderink and van Doorn, Chapter 1 this volume).40 Brainard et al. (1997, p. 2098), in their matching experiments, also recognized that they were compelled to resort to a different dimension in describing the subtle differences that impeded a satisfactory match between test fields under different illuminations: ‘. . . the test surface (seen under a bluish illuminant) has something of a cool cast about it, whereas the match surface (seen under a yellowish illuminant) has a warm cast. To the observer it seems therefore as if the match surface should be adjusted to more bluish. But this adjustment does not change the warmth of the match surface. Rather, it has the effect of changing (say) a warm grey to a warm blue, which then still fails to match the cool grey test surface’. We also encountered a similar effect in simple centre-surround configurations, where under certain conditions subjects were not able to completely compensate for the surround-dependent colour appearance at the location of the test spot (Mausfeld 1998, p. 244; see also Ekroll et al. 2002). The problem of descriptively inadequate accounts of the phenomenal interplay of surfaces and chromatic illuminations is aggravated by the impact that the colorimetric tradition has had on our colour vocabulary. The kinds of concepts provided by the colorimetric traditions veil the subtle differences that are crucial here, where in corresponding matching experiments ‘some difference remains, although our language has no specific words to designate it’ (Koffka 1935, p. 258). 
The non-matchability of test fields under different chromatic illuminations indicates that different types of colour codes are simultaneously active, between which only a partial trade-off is possible. These classical findings suggested the construction of experiments in which the Grassmann codes of the area surrounding the test field were held constant, but the ‘interpretation’ of the surround colour, as being due to a chromatic illumination or to the surface characteristics of the surround, was varied. Kroh (1921), for instance, observed that the ‘hole colour’ in a white reduction screen that is illuminated by reddish light exhibits a larger shift toward green than a ‘hole colour’ in a reduction screen with a reddish surface of the same colour co-ordinates, and that ‘an infield undergoes a stronger change in appearance under the condition of a chromatic illumination than under the conditions of a chromatic surround of exactly the same retinal colour codes’ (Kroh 1921, p. 181ff.). In an important experimental study using Hering’s Nüancierungsapparat, Gelb (1932) found that ‘the colour as such of the surround does not result in a contrast effect’ and that two surrounds that yielded exactly the same colour codes had different effects if seen as a chromatically illuminated surround or a surround of a corresponding surface characteristic. He concluded from his experiments that the segmentation of the visual field into surface and illumination characteristics is a primordial act that is due to the ‘structural form of our perceptual visual world’, rather than being the result of contrast-dependent transformations of the retinal colour signal. Such a conjecture about a dual organization of colour codes, between which no complete trade-off at the location of the test field is possible, is in line with the phenomenal peculiarities that are characteristic of colour appearances under (chromatic) illumination. Among these phenomena, of particular interest is the effect Helmholtz (1867, p. 407) called seeing two colours ‘at the same location of the visual field one behind the other’, and what Bühler (1922) referred to as ‘locating colours in perceptual space one behind the other’. Similar observations have been made by many others (e.g. Fuchs 1923a; Brunswik and Kardos 1929, p. 316, who attributed them to the ‘dual nature of the underlying psychophysical processes’; or Koffka 1935, p. 261f., who spoke of a ‘double representation’).41 In simple, everyday situations of, say, a white wall in a room illuminated by a reddish light, we can ‘see’ both the colour of the object (e.g. ‘white’ wall) and the colour of the illumination, though there is, as Katz (1911, p. 274) observed, a ‘curious lability of colours under chromatic illumination’. Gelb (1929, p. 678) noted in such situations that ‘the solidness and tightness of the segmentation of the visual field undergoes a loss, even at a moderately chromatic illumination’ and that ‘the concepts of “proper” colour of surfaces and “normal” illumination intimately correspond with each other’ (cf. also Kardos 1929, p. 50).
40 Corresponding perceptual principles, according to which small deviations from a quantitatively specified internal reference system are not simply treated as deviations or noise but rather give rise to a new perceptual quality, can be found in various other perceptual domains (for instance, small temporal delays at a single ear between identical auditory patterns affect the voluminousness of the percept).
Activating illumination-related primitives by simple centre-surround configurations
The observations just mentioned refer to experimental situations in which actual illuminations are used for setting up the physical stimulus configuration.
However, it is, as mentioned above, of no relevance whether the physico-geometrical characteristics of the incoming sensory input that activate an illumination interpretation within the visual system are physically caused by an actual illumination, or by other ways of establishing the relevant characteristics. We can, thus, expect to find other experimental observations in colour science that bear directly on these issues, although they were not constructed by using actual illuminations. In particular, certain bi-segmentations of the visual field, as, for example, in centre-surround configurations, seem likely to activate internal mechanisms that have to do with internally handling the interplay of ‘surface’ and ‘illumination’ interpretations.42 Many observations in these situations can be understood in terms of such achievements (cf. Mausfeld and Niederée 1993). Among these observations are two classical effects, which played a prominent role in the controversy between Helmholtz and Hering, namely the so-called ‘tissue contrast’ effect (Meyersche Florkontrastversuch, Helmholtz 1896, p. 547) and the observation made with a half-mirror by Ragona Scina (Spiegelkontrastversuch, Helmholtz 1896, p. 557; cf. Graham and Brown 1965, p. 462). They appear innocent enough, but are actually still in want of a satisfactory explanation. If analysed in terms of the visual system’s attempts to pre-process the incoming colour signal in terms of a dual colour code, they can, however, suggest some conjectures about potential mechanisms underlying a laminar segmentation of the sensory signal into a dual colour code, to which I will turn in the subsequent section.
41 While certain situations for activating representational primitives pertaining to ‘surface’ and ‘illumination’ result in percepts of having a surface-related and an illumination-related colour phenomenally present simultaneously, there are also situations that activate two surface representations with their proprietary types of parameters, so as to yield the almost paradoxical percept of seeing two surfaces at the same ‘location’ of the visual field simultaneously. Think, for instance, of looking out of a train window at dusk, and simultaneously seeing a red hat on the hat rack and a green tree at the same location in the window. In experimental settings phenomenal transitions have been found between transparent and opaque representations of surfaces (Faul 1997) or even conditions under which surfaces are simultaneously opaque and transparent (Cavanagh 1987).
42 This has already been emphasized by Bühler (1922, p. 131), who conceived of the phenomenon of simultaneous contrast in such situations as a ‘degenerate marginal phenomenon’, attesting to the visual system’s attempt to preserve colours under changes of illumination, and by Kardos (1929, p. 44). Naturally, the relevant representational primitives involved do not unfold to their fully fledged structure under these conditions, in the sense that an overwhelming part of their free parameters remains undetermined, such as, in the case of surface representations, ‘depth’, ‘orientation’, or ‘texture’.
The ‘tissue contrast’ effect can be described as follows. If a small piece of grey paper, which we can refer to as a test spot, is placed on the centre of a large piece of coloured paper and a piece of tissue paper is then placed over these pieces of paper, the test spot has a colour appearance roughly complementary to the colour of the surrounding piece of paper (while an induced colour is absent or much weaker without the tissue paper). Often, as was also noticed by Helmholtz, the complementary colour of the test spot is much more vivid than the weak colour of the surrounding piece of paper. Furthermore, the effect is strongest when test spot and surround are of approximately the same luminance; in particular, the effect is much weaker for a white test spot than for a medium grey test spot.
The effect disappears if a small piece of paper is placed on top of the tissue paper, even if it is only placed over a small part of the area of the grey patch.43 The effect is increased if the tissue paper is moved back and forth, which facilitates a spatial segmentation into depth layers.44 The tissue paper phenomenon behaves as if the chromatic content of the surround is captured by the spatial layer of the tissue and then interpreted as a chromatic illumination. These different types of empirical observations bear on the qualitative prediction that both kinds of colour-related parameters are simultaneously present phenomenologically, and furthermore that it proves impossible to compensate for phenomenal changes in one kind of parameter by changes in the other kind of parameter. This seems to me to be a particularly revealing class of evidence supporting the idea that there are different representational primitives in which ‘colour’ figures, and that consequently we cannot deal with ‘colour’, as such, detached from enquires into the structure of these primitives. Modes of appearance revisited The second qualitative prediction also directly bears on this issue. Relevant phenomenological observations are provided by various types of ‘material colours’, such as the appearances of metal, soil, stones, water surfaces, skin, etc. These indicate that the colour parameters 43 Without introducing further ad hoc assumptions, such as presumed suitable non-linearities somewhere in the system, these observations cannot simply be subdued to the idea that effects of the ambient illumination can be accounted for by adaptational modifications of ‘original colours’ or primary colour codes. Even Walls (1960, p. 
34), who maintained that phenomena such as those of Land’s two-colour projections can be explained entirely by elementary sensory mechanisms such as ‘spatial induction’, ‘general and local adaptation’, and ‘colour conversion’, seemed to be less confident in the case of the tissue paper contrast: ‘Tongue in cheek, one tells students that this blurs the contour, and that this facilitates induction across it’. 44 The importance of depth segmentation for a segregation of surface and illumination colour has been emphasized by Hering, and more explicitly by Bühler (1922).
in surface primitives are intrinsically interlocked with other types of parameters and thus comprise aspects that go beyond ‘colorimetric colour’. Traditionally, part of the relevant empirical evidence has been classified under the heading ‘modes of appearances’, a concept which I have already mentioned in a previous section. According to this purely descriptive concept, which itself still requires explanation, the appearances of colour segregate phenomenally into mutually (almost) exclusive categories, which gave rise to the conjecture that these categories mirror internal processes or states of essentially different nature. Many subtle observations and conceptual distinctions have been made that centre around the notion of ‘modes of appearances’. In the present context, I will only refer to some rather coarse and well-established observations that seem to me of particular relevance for the issues under scrutiny. The most fundamental dichotomy seems to be the distinction between what are called ‘aperture colours’ or ‘film colours’, which are obtained under ‘complete reduction’ of the visual field, on the one hand, and ‘surface colours’ on the other hand.45 Katz characterized aperture colours as appearing fronto-parallel and having no orientation in space, appearing spatially two-dimensional but still rendering it possible ‘to visually dive into them to different depths’ (Katz 1911, p. 7).46 Surface colours, on the other hand, can exhibit any kind of afrontal orientation, as well as granularity of structure and texture. Only surface colours can appear as having a separate ‘illumination value’, as being illuminated. For colours that appear ‘matterless’ or ‘objectless’ ‘the possibility to segregate an illumination aspect from them is absent’. If, however, ‘they manifest a distinct surface character, the impression of an illumination becomes cogent’ (Katz 1911, p. 374). Katz (1911, p. 9; see, however, Martin 1922, p. 
479) was convinced that ‘between surface colours and aperture colours all kinds of transitions’ can be perceived.47 Wallach (1976, 45
Particularly with respect to the first type, there is considerable variation in the vocabulary found in the literature; Katz spoke of Flächenfarben, Martin (1922, p. 452) of ‘film colours’. 46 The isolated colour patches underlying the colorimetric traditions, e.g. those used for the determination of colour-matching functions, also belong to the class of aperture colours. With the theoretical framework pursued here, aperture colours correspond to in-between stages of internal vagueness—which is not to be confused with perceptual vagueness (there is no perceptual vagueness in these cases)—where the system has not yet been able to settle on a data structure in terms of the representational primitives involved. 47 The existence of continuous transitions between surface and aperture colour, and the even further-reaching observation that there are colours of the same kind, as it were, in both classes, i.e. for example, a green light and an olive-green surface exhibit some phenomenological similarity, are themselves of great importance. Although in principle these two ‘worlds’ of colour appearances could have been completely divorced from each other phenomenologically, the adaptive requirement of colour constancy necessitates the possibility of at least a partial compensation between the two. An important consequence of the requirement of ensuring smooth transitions between conjoint representations is the existence of what is called a ‘proximal mode’ in perception (cf. Mausfeld 2003). Evidently, once we have attained the ability to exercise a suitable ‘mental attitude’, we can perceptually detach certain attributes from their ‘frame of reference’ as given by a specific representational primitive in which these attributes figure. Attributes that figure in both types of conjoined representations involved can, apparently, be dissociated from aspects that are proprietary to each of the representations involved. 
Thus, the existence of a proximal mode helps to protect the system from adopting a behaviour where small continuous changes in the input result in abrupt changes in internal representations. The small decontextualized colour patches underlying colorimetry are, with respect to the representational primitives involved, a degenerate situation that is closely related to the ‘proximal mode’. The percept yielded by the ‘proximal mode’ is sometimes referred to as the ‘local colour quale’. In many situations, one can focus attention on the ‘local colour quale’ as such, or on colour as a property of surfaces (cf. Landauer and Rodger 1964; Arend and Goldstein 1987); for instance, a spot appearing
p. 18) regarded ‘intensity relations’ as a main factor driving different modes of appearance, and accordingly held the dichotomy between a ‘surface mode’ and a ‘luminous mode’ to be fundamental. He observed (Wallach 1976, p. 8) that continuous transitions between a grey and a luminous appearance exist, which, for instance, can be produced experimentally by using a half-ring as surround for the test field. Under such conditions, Wallach also found situations in which ‘the ring is simultaneously grey and luminous’ and pointed out that ‘the existence of a luminous grey is of great importance’. The phenomenal dissociation of brightness and greyness, and the possibility to elicit both at the same time, also suggest that there are different representational primitives in which ‘brightness’ figures as a parameter.48 It is worth noting that although the concepts ‘surface colour’ and ‘illumination’ are intimately tied together under these accounts, this does not necessitate that the illumination is also phenomenally present as a distinct separate impression. The activation of some mechanism that internally represents the ambient illumination is not necessarily mirrored as an illumination component in the phenomenal impression.49 Rather, without being phenomenally represented, it can affect the structure of the percept, which is internally yielded by processing in terms of representational primitives for ‘surface’ and ‘ambient illumination’ or ‘local illumination’. Such a dissociation can be found regularly under many other experimental conditions, such as Adelson’s corrugated plaid configuration, where often a shadow is not perceptually present, or in Todorovic’s version thereof, where an impression of a local illumination is even more lacking,50 but yet the stimulus configuration is presumably internally processed in these terms. 
The conception of ‘colour’ as a parameter of the data structure of representational primitives for ‘surfaces’ also gains support from a clinical observation, following certain brain lesions, of a dissociation of perception of the colour of a surface from the perception of the surface itself. Gelb (1920) reported a case where the patient was no longer able to see surface colours, i.e. all colours had the appearance of film colours. They lacked the object colour’s property of being dense and opaque and instead looked ethereal and detached from the corresponding objects and their texture; they seemed to be floating in front of the objects
grey when seen in the first mode of attention may appear as a shadowed part of a white object or an illuminated part of a black one in the second mode. Situations like these, in which it is possible to produce, by slight changes in the mode of attention, transitions where the ‘surface gains in whiteness to the same extent that the illumination loses brightness’ are, as Gelb (1929, p. 600) rightly noted, of ‘particular theoretical importance’.
48 Wallach also ventured some conjectures about the way the two types of parameters involved compete for the same input signal: ‘We may consider luminosity as the result of that part of the neural representation of stimulation in a given area which does not participate in a “surface color process” ’. An area elicits the appearance of an illuminant or self-luminous object if the value of the relevant sensory signal that is used to specify a ‘brightness’ parameter in a surface representation exceeds the permissible range of this parameter. In Wallach’s words, if the value of the relevant sensory signal is ‘insufficient to involve in a color surface process all of the process that represents the stimulation in that area, leaving some of it free to function as a luminous process’, this furnishes, according to Wallach (1976, p. 10), the explanation for ‘when we seem to perceive illumination’.
49 Corresponding assumptions were not only made by Katz, Gelb, and Wallach, for example, but also underlie interpretations of findings in the centre-surround type of situations, in terms of functional illumination-related achievements, as reported regularly in the literature (e.g. Jenness and Shevell 1995; Schirillo and Shevell 2000; MacLeod and Golz, Chapter 7 this volume). 50 See Adelson (1999).
and looked fronto-parallel. Although the patient did not see the colour as attached to the surfaces of objects, he showed approximate colour constancy. On the other hand, he was no longer able to see a shadow as such, but rather saw it as a dark spot. Although such observations concerning lesions (the minutiae of which are unknown) are notoriously hard to interpret, they support the idea that the concept ‘colour of an object’ requires a separate representational format to be available. It is of particular interest that in this patient the assignment of the sensory colour signal to a ‘surface’-type representational primitive, and thus the internal concept of an object colour, was not established, but that, none the less, there was a pre-processing of the sensory signal, yielding what Gelb refers to as approximate colour constancy; thus, a sensory process was still active, which was suited to the demands of a normally functioning perceptual system. I will deal with corresponding issues in the subsequent section.
Transitions and switches in the activation of different types of primitives
There is a dense accumulation of particulate experimental evidence that bears on the third qualitative prediction, that of abrupt re-organizations of the percept in the sense of a switch between a surface- and an illumination-related appearance following apparently slight changes in the input pattern. Pertinent and compelling evidence can be found in the classical literature, where Katz, Gelb, Wallach, and many others described an abundance of situations in which ‘very small changes in external stimulus conditions or in internal modes of perceiving’ are accompanied by quite abrupt transitions between internal states that are, as Gelb (1929, p. 600) put it with respect to colour, ‘of essentially different nature’.
Hering’s ‘stain-shadow’ demonstration (Fleck-Schatten-Versuch) is a prototypical example of phenomena that demonstrate how certain attributes can modulate the relation between different representational primitives that exploit a given sensory input. In Hering’s demonstrations, slight changes in figural characteristics of the incoming light array, namely masking of the penumbra of a shadow by a dark line, are sufficient to induce a switch to a ‘surface’ representation that completely exhausts the information related to brightness.51,52 There are a great many other relevant observations; for more recent instances see, for example, Knill and Kersten (1991), Adelson (1993), or Buckley et al. (1994) and Bloj et al. (1999).
Different properties of different kinds of ‘colour’ parameters
Direct empirical evidence is still meagre, and more difficult to evaluate, for the fourth type of qualitative prediction made, namely that, conditional upon this categorical distinction, two different kinds of colour-related parameters exhibit different properties that are mirrored by corresponding differences in properties of colour codes for various other kinds of achievements. But there are a few indications in the presumed direction. For instance,
51 This is the case even when the physical construction of the situation (i.e. light source, shadow-casting object, and the process of drawing the boundary) is completely transparent to the subject. The available perceptual and cognitive ‘interpretations’ are completely overruled by a single geometric characteristic.
52 Furthermore, Metzger (1932; cf. also Heider 1933) observed that if a larger screen is introduced between the light source and the shadow, which is surrounded by a dark line, such that the shadow is itself covered by the larger shadow of the screen, it suddenly lightens up and appears much brighter than the surrounding area (sometimes exhibiting a metallic appearance).
Krüger (1925), among others, observed that the differential sensitivity for detecting brightness differences is much lower for brightness changes of the illumination than for brightness changes of surfaces (see also Gelb and Granit 1923).53 Further important evidence that is likely to bear on corresponding issues comes from an apparently quite different domain of enquiry, namely from qualitative observations of the way ‘colour’ behaves with respect to figure-ground segmentations. It seems natural to expect that the coding properties pertaining to a representational primitive ‘ambient illumination’ (or, more abstractly, transmission medium) resemble, and are probably related to, coding properties of the ‘ground’ in figure-ground segmentations. Observations of figure-ground asymmetries in elementary colour properties by, for example, Rubin (1921), Fuchs (1923a,b), or Wolff (1935), particularly the observations that a fixed area in a stimulus configuration exhibits stronger colour constancy if perceived as figure than if perceived as ground, forcibly indicate that colour properties are conditional upon the representational primitive in which ‘colour’ figures.54
Interdependence of different types of parameters
As to the fifth type of qualitative prediction made above, there is a wealth of experimental evidence that the attribute ‘colour of a surface’ is not autonomous, as it were, but, rather, depends intrinsically on other attributes as well, and can, in turn, influence other attributes. As rich and as variegated as the corresponding qualitative observations are, it is difficult to derive more specific theoretical constraints from them.
They extend from the dependence of colour appearance on various aspects of form, as demonstrated in Fuchs’ pioneering study (Fuchs 1923a,b), to phenomena such as the Munker–White phenomenon, neon-colour spreading, ‘colour from motion’ (Hoffman, Chapter 12 this volume) to the interdependence of colour and aspects of depth and spatial organization. The fact that the organization of ‘colour’ in terms of the internal interplay of surface- and illumination-related aspects is tied intrinsically to the perceptual organization of space was particularly emphasized by Hering, Bühler, Kardos, and Gelb, who provided rich corresponding empirical evidence. Furthermore, Krauss (1928) observed that rooms perceptually shrink in depth under intense chromatic illumination. The idea that colour is not an autonomous attribute, as alleged in much of current research, was almost commonplace in the classical literature, as expressed, for instance, by Koffka and Harrower (1931, p. 215), who concluded from their extensive studies, that ‘the psychophysical processes, occurring in acts of perception, instead of being separable into colour-, space- (local sign), and form-processes are processes of field organization; colour, place and form being three interdependent aspects of this general event’. A discussion of interdependencies between ‘colour’ and other internal attributes, in terms of stimulus variables such as form, motion, depth, texture, etc., may lead one erroneously to conceive of these interdependences in terms of physical input aspects, rather than in terms of internal attributes. However, among the internal attributes that are part of the structural 53 In line with corresponding observations on figure-ground organization, it might reasonably be conjectured that the parameter values for ‘illumination’ colours are less fine-grained than values for free parameters for surface colours. 
54 Examples of other corresponding findings, which can only be understood by conceiving of ‘colour’ as an aspect of the structural form of perception, are that the colour of the afterimage can, for identical sensory inputs, depend on the figure-ground segmentation (Fuchs 1923b, p. 291), or that, according to Oyama (1960; Weisstein and Wong 1987, p. 32), red regions tend to be seen as figures more than blue regions.
format of representational primitives for, say, ‘surface’ can be attributes that do not have a simple physical correlate in the sensory signal; for example, attributes that we can, as a makeshift, describe as ‘stability’, ‘tenacity’, ‘ruggedness’, or as ‘ripe’, ‘juicy’, ‘dry’, etc.55 The empirical evidence on which I have drawn so far is taken from quite different domains, and refers to different types of observations and data. It constitutes a particularly distinctive basis for evaluating how explanatory frameworks for colour perception fare as explanatory accounts for a significant range of facts. Naturally, in colour science, as well as in perception science in general, there is considerable disagreement about what should be regarded as significant facts to be explained, and what should count as an adequate explanation. But whichever way one looks at it, the facts and observations referred to above can be discerned as significant facts that have to be explained under any kind of successful explanatory framework. Taken individually, none of them directly provides compelling evidence in favour of the theoretical assumptions that gave rise to the above qualitative predictions. Taken as a whole, however, these facts and observations fit, or so it appears to me, in an organic way into the general theoretical framework boldly outlined above. They thus give added credence to this theoretical perspective, which, in its basic contentions, rests on well-founded theoretical bases in various domains of scientific enquiry. The evaluations of empirical findings in this section have centred around the question of the role ‘colour’ plays within the conceptual structure of the perceptual system.
While we are still far from formulating appropriate specific conjectures about the structural form and interdependences of representational primitives underlying perception, available evidence suggests that ‘colour’ figures as a free parameter in two different types of representational primitives, which form a conjoint representation with respect to this parameter. Because of this, the corresponding regions in the parameter spaces for ‘colour’ of these two representational primitives overlap with respect to the required input from the sensory system, and the visual system has to provide computational means to deal with sensory inputs that are compatible with different parameter combinations in this joint region. The question then arises how the codes provided by the sensory system are exploited by the representational primitives under scrutiny, that is, how the sensory input is pre-processed in order to be compatible with, and fulfil the demands of, the representational primitives involved.
Further qualitative observations on the pre-processing of the sensory colour codes into two components
The sensory system has to provide, at its interface with the perceptual system, a set of codes that optimally fulfil the computational and structural demands of the activated representational primitives. With respect to ‘colour’, the sensory system has to pre-process, within its theoretical vocabulary, the retinal colour codes in a way that allows a specification of the
55 ‘Colour’ as part of a representational primitive ‘surface’ is deeply anchored in the entire structure of this primitive. Accordingly, its phenomenal ‘dimensions’ will likely mirror these structural interdependencies and, linguistically, comprise all sorts of aspects, such as warm/cold, stirring, calm, fresh, dry, juicy, etc., which refer, in common-sense terms, both to colour ‘as such’ and to surface properties, affordances, emotional connotations, etc.
Because of this, using Munsell chips in order to attempt to understand the role of colour within the structure of perceptual representations must unavoidably result in a vastly distorted theoretical picture (see Wierzbicka 1990, p. 119, for an illustrative example that refers to the ‘juicy’ aspect).
corresponding kinds of free parameters. In particular, the sensory system has to pre-process the retinal colour codes such that they can be segregated into two components that provide a basis for a dual colour code. A great variety of relations on, and transformations of, retinal colour codes have been found that are potential candidates for such purposes and could act as corresponding cues for the perceptual system. Various schemes have been proposed to explain how these cues can be integrated and used for a segregation into a dual colour code. Among these are averages of the colour codes of the incoming light array, maximum values of certain codes, ratios of colour codes, various rescaling and normalization schemes, correlations between luminance and chromaticity, or the properties of the covariance matrix of colour codes (cf. Webster, Chapter 2 this volume; MacLeod and Golz, Chapter 7 this volume; Maloney, Chapter 9 this volume). Similar to what has been found by Marr in other contexts, this shows how surprisingly rich and sophisticated is the class of sensory concepts that can be achieved on the basis of sensory-based transformations under suitable assumptions about relevant aspects of the physical world. This class of concepts is greatly enriched if other sensory codes that can act as potential cues to the illuminant are also taken into account, particularly ones that capture relevant aspects of the three-dimensional geometry of the scene. The relations and transformations just mentioned have been derived predominantly from considerations that refer to actual physical surfaces under chromatic illumination.
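To make the kinds of candidate cues listed above more concrete, the following minimal sketch computes three of them (a space-averaged code, channel-wise maxima, and a luminance-chromaticity correlation) for a small set of three-channel colour codes. It is only an illustration under stated assumptions: the function names, the use of a channel sum as a luminance proxy, and the use of the first channel's share as a chromaticity proxy are all hypothetical choices made here, and none of them is taken from the studies cited in the text.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def candidate_cues(codes):
    """Summarise three-channel colour codes (hypothetical stand-ins for
    retinal colour codes; each code must have a positive channel sum)."""
    channels = list(zip(*codes))
    luminance = [sum(code) for code in codes]           # assumed luminance proxy
    redness = [code[0] / sum(code) for code in codes]   # assumed chromaticity proxy
    return {
        "mean": tuple(mean(ch) for ch in channels),     # space-averaged code
        "max": tuple(max(ch) for ch in channels),       # channel-wise maxima
        "lum_red_corr": pearson(luminance, redness),    # luminance-chromaticity correlation
    }


scene = [(2.0, 1.0, 1.0), (4.0, 2.0, 2.0), (1.0, 1.0, 2.0)]
cues = candidate_cues(scene)
print(cues["mean"], cues["max"], cues["lum_red_corr"])
```

Each entry of the returned dictionary is one possible cue; how such cues would be weighted or combined is exactly what the schemes mentioned above disagree about, and the sketch takes no position on it.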
On the basis of the above-mentioned and empirically supported assumption that centre-surround configurations can already partially activate corresponding primitives, and thus elicit processes that subserve the establishment of a dual colour coding, further insights can be achieved about factors that determine or modulate this splitting-up of the retinal colour codes. I will use again the ‘tissue contrast’ effect to address a qualitative observation that seems to me of relevance in the present context. Note that placing a piece of tissue over the centre-surround configuration changes several aspects of the stimulus situation: for instance, it blurs the contours, introduces a depth segmentation between tissue paper and centre-surround configuration, introduces texture, reduces the contrast between centre and surround, and increases the whiteness of the coloured surround. While a change in any of these and other variables can be expected to influence the establishment of a dual colour code, evidence from other observations, made under a variety of conditions, indicates that the effect of what might be described in terms of a homogeneous whitening of the surround56 particularly facilitates a laminar segmentation of the incoming colour signal into a dual colour code in terms of an ‘illumination’-related component and a ‘surface’-related component. It appears as if a high component of common whiteness of surround and centre increases the tendency of the visual system to interpret the surround colour as caused by an ambient illumination, and thus to correct for this illumination colour at the location of the test spot.57 This also holds for simple centre-surround configurations which do not
56 Describing construction variables of such effects in these terms is not meant to imply that these are the relevant internal variables responsible for these effects, because the same situation can also be described equivalently in terms of other parameters.
57 As Gelb (1929, p.
627) summarized the corresponding observations, ‘if one wants to elicit pronounced phenomena of colour constancy, one should not use illumination colours that are too saturated’.
elicit some segmentation into depth layers. Helmholtz had already noted that decreasing the saturation of the surround increases the strength of the so-called simultaneous contrast phenomenon, an observation that fits poorly with any ideas of mechanisms of laterally induced contrast. Many corresponding observations have been made since, such as the one made by Walls (1960, p. 34), who projected a disc of white light on a screen, which was surrounded by a broad annulus of coloured light from a second projector: ‘If now a flood of dim light is put over the screen with a third projector and gradually increased in intensity, one finds that the colored annulus is quickly washed out . . . but the colored spot is as saturated as ever, Kirschmann’s laws to the contrary. Specifically, if the annulus is blue the spot is yellow, and when the white wash has completely desaturated the blue the spot still glows like a sun. The durability of the induced color has to be seen to be believed – the white wash cannot wash it out.’ Walls’s conditions were similar to the ones used in producing coloured shadows, where a wash of white light is placed on the entire scene, particularly on the shadowed region with respect to the chromatic illumination. In the experimental set-ups just mentioned, whitening, in the sense of increasing the common whiteness component of infield and surround (or some other descriptively equivalent parameterization that adequately captures the relevant internal aspects), seems, in the absence of other relevant cues, to facilitate an internal interpretation in terms of a chromatically illuminated scene. This relation may find its counterpart in corresponding phenomenological observations of colours under chromatic illumination. As regularly reported in the literature, a chromatic illumination produces a phenomenal ‘whitening’ of the surface colours viewed.
Thus, a red surface under a reddish illumination appears somehow as if the red has been washed out, is less pronounced, as if a part of the redness of the incoming colour signal is ascribed to the illumination and thus not available for an assignment to the colour of the surface.58 If centre-surround configurations suffice to (partially) trigger in the perceptual system conjoint representational primitives that handle them internally in terms of a centre surface that is chromatically illuminated by a surround-dependent illumination, whereby the chromaticity of the illumination is determined from the surround colour, it does not come as a surprise that in many investigations based on such stimulus configurations—from Bühler (1922) to Walraven (1976), Wesner and Shevell (1992), Jenness and Shevell (1995), Schirillo and Shevell (2000), Mausfeld and Andres (2002), and many others—it has been observed that regularities found with centre-surround configurations can be better understood if an interpretation in terms of an illuminated scene is employed.59 Such an interpretation cannot, and does not, refer to actual surfaces or illuminations, but rather to corresponding internal representations. Evidently, there are infinitely many different potential distal scenes, and thus different combinations of surfaces and illuminations or lights alone, that may have 58 For instance, Bocksch (1927, p. 369/376) reported from his experiments, ‘Red and colours in its vicinity were not simply seen as red or reddish. Rather the perceived colour is brightened in a peculiar way, glowing, of a spatial character and most of all diluted with white.’ Furthermore it loses its surface character and appears ‘in a peculiar way foggish and dissolved’. 59 This connection is made more explicit in Mausfeld and Niederée (1993), where centre-surround configurations are regarded, in an ethological sense, as ‘minimal configuration’ or sign stimuli for these functional achievements.
caused the physico-geometrical proximal pattern of a centre-surround configuration. From a functional point of view, it would not make sense for the visual system to single out any of these potential external world interpretations in a situation that is meagre with respect to the demands of the representational primitives involved. It is even more surprising that the visual system nevertheless exhibits some dispositions to pre-process such configurations as if it favours a certain type of decomposition of the incoming light pattern in such situations. Thus, it is precisely because centre-surround configurations are impoverished with respect to the demands of the representational primitives involved that they can be used to reveal predispositions and ‘default assumptions’ in splitting-up the retinal colour code into two components. The ‘colour’ parameters of the representational primitives involved are intrinsically interwoven with other free parameters of these primitives, as the experimental evidence mentioned above indicates. Because of this, it is highly unlikely that an assignment of values to the respective ‘colour’ parameters can be made on the basis of relations on, or transformations of, retinal colour codes alone (as computational schemes based entirely on colour codes presume). Rather, these relations and transformations within the sensory system can only yield some solution space for permissible pairs of values for the free parameters involved. Various other types of sensory codes (for instance ones pertaining to spatial and figural aspects) modulate which pair of values of free parameters is singled out from the solution space.
It seems reasonable to conjecture that the sensory transformations of retinal colour codes that give rise to a solution space for pairs of ‘colour’ parameters are based on procedures that, under ‘physically friendly’ conditions, exploit structural regularities that different kinds of situations have in common, and thus approximate sufficiently well a variety of situations in which light and surface colour properties are entangled. These can be as diverse as viewing surfaces under chromatic illumination, viewing surfaces through interposed chromatic filters, light scattering, specular transparency or other situations of additive transparency.60 The above observation on the effect of a homogeneous whitening of the surround in centre-surround configurations now suggests that the size of the space of permissible pairs of values for the free parameters involved decreases with increasing saturation of the surround. It seems to converge on a solution where almost the entire value of a local colour code of the surround is assigned to the ‘surface colour’ parameter, whereas the parameter for the ‘illumination colour’ is assigned a value that corresponds to an internal attribute ‘neutral illumination’. For spatially inhomogeneous surrounds, this effect can better be described in more general terms by referring to first- and second-order statistics of retinal colour codes. Mausfeld and Andres (2002) found evidence that second-order statistics of chromatic codes of the incoming light array co-determine the decompositions of the retinal colour codes into a dual code and differentially modulate the relation of the two kinds of representational primitives involved. Roughly, large variances of colour codes in the surround reduce the solution space to values of ‘illumination colour’ that correspond to a ‘neutral illumination’. Small variances of colour codes, on the other hand, which likely yield larger solution
60 cf. Faul (1997), D’Zmura et al. (1997), Faul and Ekroll (2002).
424
colour perception
spaces, reveal a predisposition of the perceptual system to assign the space-averaged colour code of the scene to the value of the ‘illumination colour’ parameter. This is in line with corresponding everyday observations of surfaces viewed under chromatic illumination. It is also in line with an experimental observation by Metzger and Zöller (1969), who set up, in a viewing box, a scene made up entirely of objects of roughly the same chromaticity and not too different in lightness, separated in depth, and neutrally illuminated by a hidden light source. They found that ‘the colour detaches itself from the objects and seems to fill the room with a chromatic illumination, whereas the objects themselves appear achromatic’, i.e. that the chromaticity of the scene was being attributed predominantly to a corresponding illumination.61 A wealth of studies on mechanisms of ‘colour constancy’ has unearthed a rich variety of transformations of retinal colour codes that mirror relevant colour-related ecological constraints and potentially co-determine a segmentation into a dual code. The findings just mentioned provide further constraints on computational procedures by which the sensory system pre-processes the retinal colour code in terms of potential values for a dual colour code. However, as many other factors beyond ‘colour’ co-determine the solution in a given situation, transformations based on colour codes alone do not suffice, as a rule, to single-out an assignment of parameters, but can only yield some solution space of permissible pairs of values. Specific values of pairs of parameters can only be singled out by taking into account other types of codes provided by the sensory system in a given situation. With respect to our cognitive architecture, ‘colour’ is not an autonomous attribute, but rather is determined by the structure of representational primitives in which it figures. 
This sharply contrasts with ideas based on measurement-device conceptions of perception, which attempt to achieve an understanding of how the visual system disentangles illumination colour and surface colour almost entirely within the domain of colour.
Acknowledgement I wish to thank Franz Faul and Johannes Andres for valuable comments on an earlier draft of this chapter.
References Adelson, E. H. (1993). Perceptual organization and the judgement of brightness. Science 262, 2042–2044. Adelson, E. H. (1999). Lightness perception and lightness. In The new cognitive neurosciences, (2nd edn), (ed. M. Gazzaniga), pp. 339–351. MIT Press, Cambridge, MA. Allen, G. (1892). The colour-sense. Kegan Paul, London. 61 This effect is reduced if (1) all objects are at the same depth, (2) they are of the same form, (3) they are not distributed over the entire scene, but rather cluster at some location. A black object or some objects of different chromaticity, particularly if not placed in the centre of the scene, do not exercise a strong influence on the effect. However, if a white object is placed into the scene, the effect vanishes.
Arend, L. and Goldstein, R. (1987). Simultaneous constancy, lightness and brightness. Journal of the Optical Society of America A 4, 2281–2285.
Arnauld, A. (1683/1990). On true and false ideas. Edwin Mellen Press, Lampeter.
Aubert, H. (1865). Physiologie der Netzhaut. Morgenstern, Breslau.
Baillargeon, R. and Wang, S. (2002). Event categorization in infancy. Trends in Cognitive Science 6, 85–93.
Beck, J. (1972). Surface color perception. Cornell University Press, Ithaca.
Billmeyer, F. W. and Saltzman, M. (1981). Principles of color technology, (2nd edn). Wiley, New York.
Bloj, M. G., Kersten, D., and Hurlbert, A. C. (1999). Perception of three-dimensional shape influences colour perception through mutual illumination. Nature 402, 877–879.
Bocksch, H. (1927). Duplizitätstheorie und Farbenkonstanz. Zeitschrift für Psychologie 102, 338–449.
Boynton, R. M. (1971). Color vision. In Woodworth and Schlosberg’s experimental psychology, (ed. J. W. Kling and L. A. Riggs), pp. 315–368. Methuen, London.
Boynton, R. M. (1975). Color, hue, and wavelength. In Handbook of perception, Vol. V, Seeing, (ed. E. C. Carterette and M. P. Friedman), pp. 301–345. Academic Press, New York.
Boynton, R. M. (1979). Human color vision. Holt, Rinehart and Winston, New York.
Brainard, D., Brunt, A., and Speigle, M. (1997). Color constancy in the nearly natural image. I. Asymmetric matches. Journal of the Optical Society of America 14, 2091–2110.
Brakel, J. van (2002). Chromatic language games and their congeners. In Theories, technologies, and instrumentalities of color. Anthropological and historiographic perspectives, (ed. B. Saunders and J. van Brakel), pp. 147–168. University Press of America, Lanham.
Brunswik, E. and Kardos, L. (1929). Das Duplizitätsprinzip in der Theorie der Farbenwahrnehmung. Zeitschrift für Psychologie 111, 307–320.
Buchsbaum, G. and Gottschalk, A. (1983). Trichromacy, opponent colours coding and optimum colour information transmission in the retina. Proceedings of the Royal Society London B220, 89–113.
Buckley, D., Frisby, J. P., and Freeman, J. (1994). Lightness perception can be affected by surface curvature from stereopsis. Perception 23, 869–881.
Bühler, K. (1922). Die Erscheinungsweisen der Farben. In Handbuch der Psychologie. Part I. Die Struktur der Wahrnehmungen, (ed. K. Bühler), pp. 1–201. Fischer, Jena.
Burnham, R. W., Hanes, R. M., and Bartleson, C. J. (1963). Color: A guide to basic facts and concepts. Wiley, New York.
Cassirer, E. (1929). Philosophie der symbolischen Formen. Part 3: Phänomenologie der Erkenntnis. Bruno Cassirer, Berlin.
Casson, R. W. (1997). Color shift: Evolution of English color terms from brightness to hue. In Color categories in thought and language, (ed. C. L. Hardin and L. Maffi), pp. 224–239. Cambridge University Press, Cambridge.
Cavanagh, P. (1987). Reconstructing the third dimension: Interactions between color, texture, motion, binocular disparity, and shape. Computer Vision, Graphics, and Image Processing 37, 171–195.
Chomsky, N. (1997). Language and cognition. In The future of the cognitive revolution (ed. D. M. Johnson and C. E. Erneling), pp. 15–31. Oxford University Press, Oxford.
Chomsky, N. (2000). New horizons in the study of language and mind. Cambridge University Press, Cambridge.
Cudworth, R. (1731). A treatise concerning eternal and immutable morality. James and John Knapton, London (Reprinted 1976 by Garland, New York).
Dosher, B. A., Sperling, G., and Wurst, S. A. (1986). Tradeoffs between stereopsis and proximity luminance covariance as determinants of perceived 3D structure. Vision Research 26, 973–990.
D’Zmura, M., Colantoni, P., Knoblauch, P., and Laget, B. (1997). Color transparency. Perception 26, 471–492.
Ekroll, V., Faul, F., Niederée, R., and Richter, E. (2002). The natural center of chromaticity space is not always achromatic: A new look at color induction. Proceedings of the National Academy of Sciences USA 99, 13352–13356.
Evans, R. M. (1948). An introduction to color. Wiley, New York.
Evans, R. M. (1974). The perception of color. Wiley, New York.
Faul, F. (1997). Theoretische und experimentelle Untersuchungen chromatischer Determinanten perzeptueller Transparenz. Dissertation, Christian-Albrechts-University Kiel.
Faul, F. and Ekroll, V. (2002). Psychophysical model of chromatic perceptual transparency based on subtractive color mixture. Perception and Psychophysics 19, 1084–1095.
Fuchs, W. (1923a). Experimentelle Untersuchungen über das simultane Hintereinandersehen auf derselben Sehrichtung. Zeitschrift für Psychologie 91, 145–235.
Fuchs, W. (1923b). Experimentelle Untersuchungen über die Änderung von Farben unter dem Einfluß von Gestalten (‘Angleichungserscheinungen’). Zeitschrift für Psychologie 92, 249–325.
Gallistel, C. R. (1998). Symbolic processes in the brain: the case of insect navigation. In Methods, models and conceptual issues. An invitation to cognitive science, Vol. 4, (ed. D. Scarborough and S. Sternberg), pp. 1–51. MIT Press, Cambridge, MA.
Gelb, A. (1920). Über den Wegfall der Wahrnehmung von Oberflächenfarben. Zeitschrift für Psychologie 84, 193–257.
Gelb, A. (1929). Die ‘Farbenkonstanz’ der Sehdinge. In Handbuch der normalen und pathologischen Physiologie, Vol. 12, 1. Hälfte. Receptionsorgane II, (ed. A. Bethe, G. v. Bergmann, G. Embden, and A. Ellinger), pp. 594–678. Springer, Berlin.
Gelb, A. (1932). Die Erscheinungen des simultanen Kontrastes und der Eindruck der Feldbeleuchtung. Zeitschrift für Psychologie 127, 42–59.
Gelb, A. and Granit, R. (1923). Die Bedeutung von ‘Figur’ und ‘Grund’ für die Farbenschwelle. Zeitschrift für Psychologie 93, 83–118.
Goldsmith, T. H. (1990). Optimization, constraint, and history in the evolution of eyes. The Quarterly Review of Biology 65, 281–322.
Goodale, M. (1995). The cortical organization of visual perception and visuomotor control. In An invitation to cognitive science: Visual cognition, (ed. S. Kosslyn and D. Osherson), pp. 167–213. MIT Press, London.
Graham, C. H. and Brown, J. L. (1965). Color contrast and color appearances: Brightness constancy and color constancy. In Vision and visual perception (ed. C. H. Graham), pp. 452–478. Wiley, New York.
Granit, R. (1955). Receptors and sensory perception. Yale University Press, New Haven.
Hassenstein, B. and Reichardt, W. (1956). Systemtheoretische Analyse der Zeit, Reihenfolge und Vorzeichenauswertung bei der Bewegungsperzeption des Rüsselkäfers Chlorophanus. Zeitschrift für Naturforschung 11b, 513–524.
Heider, F. (1933). Remarks on the brightness paradox described by Metzger. Psychologische Forschung 17, 121–129.
Helmholtz, H. von (1867). Handbuch der Physiologischen Optik. Voss, Hamburg.
Helmholtz, H. von (1896). Handbuch der physiologischen Optik, (2nd edn). Voss, Hamburg.
Hering, E. (1920). Grundzüge der Lehre vom Lichtsinn. Springer, Berlin.
Heywood, C. A., Cowey, A., and Newcombe, F. (1991). Chromatic discrimination in a cortically colour blind observer. European Journal of Neuroscience 3, 802–812.
Hochegger, R. (1884). Die geschichtliche Entwicklung des Farbensinnes. Verlag der Wagner’schen Universitätsbuchhandlung, Innsbruck.
Hornbostel, E. M. von (1922). Über optische Inversion. Psychologische Forschung 1, 130–156.
Hunt, R. W. G. (1977). The specification of color appearance. I. Concepts and terms. Color Research and Application 2, 53–66.
Ingle, D. (1983). Brain mechanisms of localization in frogs and toads. In Advances in vertebrate neuroethology, (ed. J. P. Ewert, R. R. Capranica, and D. J. Ingle), pp. 177–226. Plenum, New York.
Ives, H. E. (1912). The relation between the color of the illuminant and the color of the illuminated object. Transactions of the Illuminating Engineering Society 7, 62–72.
Jackendoff, R. (1987). Consciousness and the computational mind. MIT Press, Cambridge, MA.
Jaensch, E. R. (1921). Über den Farbenkontrast und die sog. Berücksichtigung der farbigen Beleuchtung. Zeitschrift für Sinnesphysiologie 52, 165–180.
Jaensch, E. R. and Müller, E. A. (1920). Über die Wahrnehmung farbloser Helligkeiten und den Helligkeitskontrast. Zeitschrift für Psychologie 83, 266–341.
Jenness, J. W. and Shevell, S. K. (1995). Color appearance with sparse chromatic context. Vision Research 35, 797–805.
Jones, L. A. (1953). The historical background and evolution of the colorimetry report. In The science of color, (ed. Committee on Colorimetry), pp. 3–15. Optical Society of America, Washington, DC.
Judd, D. B. (1940). Hue saturation and lightness of surface colors with chromatic illumination. Journal of the Optical Society of America 30, 2–32.
Judd, D. B. (1951). Basic correlates of the visual stimulus. In Handbook of experimental psychology, (ed. S. S. Stevens), pp. 811–867. Wiley, New York.
Judd, D. B. (1960). Appraisal of Land’s work on two-primary color projections. Journal of the Optical Society of America 50, 254–268.
Kardos, L. (1929). Die ‘Konstanz’ phänomenaler Dingmomente. In Beiträge zur Problemgeschichte der Psychologie, pp. 1–77. Fischer, Jena.
Kardos, L. (1934). Ding und Schatten. Eine experimentelle Untersuchung über die Grundlagen des Farbensehens. Barth, Leipzig.
Katz, D. (1911). Die Erscheinungsweisen der Farben und ihre Beeinflussung durch die individuelle Erfahrung. Barth, Leipzig.
Kersten, D., Bülthoff, H. H., Schwartz, B. L., and Kurtz, K. J. (1992). Interaction between transparency and structure from motion. Neural Computation 4, 573–589.
Knill, D. and Kersten, D. (1991). Apparent surface curvature affects lightness perception. Nature 351, 228–230.
Koffka, K. (1932). Some remarks on the theory of colour constancy. Psychologische Forschung 16, 329–354.
Koffka, K. (1935). Principles of Gestalt psychology. Harcourt, Brace, and World, New York.
Koffka, K. (1936). On problems of colour-perception. Acta Psychologica 1, 129–134.
Koffka, K. and Harrower, M. R. (1931). Colour and organization I. Psychologische Forschung 15, 145–192.
Krauss, S. (1928). Tatsachen und Probleme zu einer psychologischen Beleuchtungslehre auf Grundlage der Phänomenologie. Archiv für die gesamte Psychologie 62, 179–225.
Kries, J. von (1882). Die Gesichtsempfindungen und ihre Analyse. Veit, Leipzig.
Kroh, O. (1921). Über Farbenkonstanz und Farbentransformation. Zeitschrift für Sinnesphysiologie 52, 181–216, 235–273.
Krüger, H. (1925). Über die Unterschiedsempfindlichkeit für Beleuchtungseindrücke. Zeitschrift für Psychologie 96, 58–67.
Landauer, A. A. and Rodger, R. S. (1964). Effect of ‘apparent’ instructions on brightness judgments. Journal of Experimental Psychology 68, 80–84.
Leopold, D. A. and Logothetis, N. K. (1999). Multistable phenomena: Changing views in perception. Trends in Cognitive Science 3, 254–264.
MacLeod, R. B. (1947). The effects of ‘artificial penumbrae’ on the brightness of included areas. Miscellanea Psychologica Albert Michotte, pp. 138–154. Institut Supérieur de Philosophie, Louvain.
Marler, P. (1999). On innateness: Are sparrow songs ‘learned’ or ‘innate’? In The design of animal communication, (ed. M. D. Hauser and M. Konishi), pp. 293–318. MIT Press, Cambridge, MA.
Martin, M. F. (1922). Film, surface, and bulky colors and their intermediates. The American Journal of Psychology 33, 451–480.
Marty, A. (1879). Die Frage nach der geschichtlichen Entwicklung des Farbensinnes. Gerold, Wien.
Mausfeld, R. (1998). Color perception: From Grassmann codes to a dual code for object and illumination colors. In Color vision, (ed. W. Backhaus, R. Kliegl, and J. Werner), pp. 219–250. De Gruyter, Berlin/New York.
Mausfeld, R. (2002). The physicalistic trap in perception theory. In Perception and the physical world, (ed. D. Heyer and R. Mausfeld), pp. 75–112. Wiley, Chichester.
Mausfeld, R. (2003). Competing representations and the mental capacity for conjoint perspectives. In Inside pictures: An interdisciplinary approach to picture perception, (ed. H. Hecht, B. Schwartz, and M. Atherton), pp. 17–60. MIT Press, Cambridge, MA.
Mausfeld, R. and Andres, J. (2002). Second order statistics of colour codes modulate transformations that effectuate varying degrees of scene invariance and illumination invariance. Perception 31, 209–224.
Mausfeld, R. and Niederée, R. (1993). Inquiries into relational concepts of colour based on an incremental principle of colour coding for minimal relational stimuli. Perception 22, 427–462.
Maxwell-Stuart, P. G. (1981). Studies in Greek colour terminology, Vol. I, ΓΛΑΥΚΟΣ. Brill, Leiden.
Metzger, W. (1932). Eine paradoxe Helligkeitserscheinung. Psychologische Forschung 16, 373–375.
Metzger, W. and Zöller, W. (1969). Simulierung einer buntfarbigen Beleuchtung durch Gegenstände gleicher Oberflächenfarbe. In Contemporary research in psychology of perception (ed. A. Lehtovaara and J. Järvinen), pp. 93–96. Söderström Osakeyhtiö, Helsinki.
Michotte, A. (1948/1991). L’énigme psychologique de la perspective dans le dessin linéaire. Bulletin de la Classe des Lettres de l’Académie Royale de Belgique 34, 268–288. (The psychological enigma of perspective in outline pictures. In Michotte’s experimental phenomenology of perception, (ed. G. Thinès, A. Costall, and G. Butterworth), pp. 187–197. Erlbaum, Hillsdale, NJ, 1991.)
Michotte, A. (1960/1991). Le réel et l’irréel dans l’image. Bulletin de la Classe des Lettres de l’Académie Royale de Belgique 46, 330–344. (The real and the unreal in the image. In Michotte’s experimental phenomenology of perception, (ed. G. Thinès, A. Costall, and G. Butterworth). Erlbaum, Hillsdale, NJ, 1991.)
Nadler, S. M. (1989). Arnauld and the Cartesian philosophy of ideas. Manchester University Press, Manchester.
Nakayama, K., Shimojo, S., and Ramachandran, V. S. (1990). Transparency: relation to depth, subjective contours, luminance, and neon color spreading. Perception 19, 497–513.
Niederée, R. (1998). Die Erscheinungsweisen der Farben und ihre stetigen Übergangsformen. Thesis, Christian-Albrechts-University Kiel.
Nijhawan, R. (1997). Visual decomposition of colour through motion extrapolation. Nature 386, 66–69.
Oyama, T. (1960). Figure-ground dominance as a function of sector angle, brightness, hue, and orientation. Journal of Experimental Psychology 60, 299–305.
Palmer, S. E. (1999). Vision science. Photons to phenomenology. MIT Press, Cambridge, MA.
Passmore, J. A. (1951). Ralph Cudworth: An interpretation. Cambridge University Press, Cambridge.
Poggio, T. (1990). Vision: The ‘other’ face of AI. In Modelling the mind, (ed. K. A. Mohyeldin Said, W. H. Newton-Smith, R. Viale, and K. V. Wilkes), pp. 139–154. Clarendon Press, Oxford.
Rivers, W. H. R. (1901). Vision. In Reports of the Cambridge anthropological expedition to Torres Straits, Vol. 2, (ed. A. C. Haddon), pp. 1–132. Cambridge University Press, Cambridge.
Rowe, Ch. (1974). Conceptions of colour and colour symbolism in the ancient world. In The realms of colour (ed. A. Portmann and R. Ritsema), Eranos Yearbook 1972, Vol. 41. Brill, Leiden.
Rubin, E. (1921). Visuell wahrgenommene Figuren. Gyldendalske Boghandel, Copenhagen.
Schirillo, J. and Shevell, S. (2000). Role of perceptual organization in chromatic induction. Journal of the Optical Society of America 17, 244–254.
Schöne, W. (1954). Über das Licht in der Malerei. Gebrüder Mann, Berlin.
Schwartz, B. J. and Sperling, G. (1983). Luminance controls the perceived 3-D structure of dynamic 2-D displays. Bulletin of the Psychonomic Society 21, 456–458.
Stumpf, C. (1917). Die Attribute der Gesichtsempfindungen. Abhandlungen der königlich preussischen Akademie der Wissenschaften. Philosophisch-historische Klasse, 8.
Suppes, P., Krantz, D. H., Luce, R. D., and Tversky, A. (1989). Foundations of measurement, Vol. II. Academic Press, New York.
Thompson, E., Palacios, A., and Varela, F. J. (1992). Ways of coloring. Behavioral and Brain Sciences 15, 1–26.
Troland, L. T. (1929). The principles of psychophysiology. A survey of modern scientific psychology. Greenwood, New York.
Turhan, M. (1937). Über räumliche Wirkungen von Helligkeitsgefällen. Psychologische Forschung 21, 1–49.
Wallach, H. (1976). On perception. Quadrangle, New York.
Walls, G. L. (1960). ‘Land! Land!’. Psychological Bulletin 57, 29–48.
Walraven, J. (1976). Discounting the background—the missing link in the explanation of chromatic induction. Vision Research 16, 289–295.
Wehner, R. (1987). ‘Matched filters’—neural models of the external world. Journal of Comparative Physiology A 161, 511–531.
Weisstein, N. and Wong, E. (1987). Figure-ground organization and the spatial and temporal responses of the visual system. In Pattern recognition by humans and machines, Vol. 2, Visual perception (ed. E. C. Schwab and H. C. Nusbaum), pp. 31–64. Academic Press, Orlando.
Wesner, M. F. and Shevell, S. K. (1992). Color perception within a chromatic context: Changes in red/green equilibria caused by non-contiguous light. Vision Research 32, 1623–1634.
Wierzbicka, A. (1990). The meanings of color terms: semantics, culture and cognition. Cognitive Linguistics 1, 99–150.
Wilson, M. D. (1990). Descartes on the representationality of sensations. In Central themes in early modern philosophy, (ed. J. A. Cover and M. Kulstad), pp. 1–22. Hackett, Indianapolis.
Wolff, W. (1935). Induzierte Helligkeitsveränderungen. Psychologische Forschung 20, 158–194.
Wyszecki, G. (1986). Color appearance. In Handbook of perception and human performance, Vol. 1, Sensory processes and perception, (ed. K. R. Boff, L. Kaufman, and J. P. Thomas). Wiley, New York.
Wyszecki, G. and Stiles, W. S. (1982). Color science. Concepts and methods, quantitative data and formulae, (2nd edn). Wiley, New York.
Yolton, J. W. (1984). Perceptual acquaintance from Descartes to Reid. University of Minnesota Press, Minneapolis.
Yolton, J. W. (2000). Realism and appearance. Cambridge University Press, Cambridge.
Commentaries on Mausfeld
Phenomenology and mechanism
Don MacLeod
Mausfeld’s case for the dual coding of colour is delivered as part of a large and weighty package of arguments against prevalent ways of thinking about perception. His critique is persuasive enough and consequential enough to call for some response from the majority who have (like me) become contented representatives of the dominant paradigm. Mausfeld characterizes the currently dominant standpoint, fairly but narrowly, as heavily influenced by the measuring-device conception of perception. A broader alternative view of the dominant paradigm is that it is a mechanistic one that treats the process of perception as a causal chain; this can subsume a range of styles of explanation of various degrees of explicitness, simplicity, and concreteness, ranging from the narrow conceptions of a measurement device or transmission of a sensory signal, to Mausfeld’s own suggestion that perceptual experiences with complex inherent structure are released or triggered by appropriate releasing stimuli. In conceding the need for more careful phenomenology, we need not turn away from the more process-oriented goals of the dominant paradigm. The mechanistic standpoint, in some appropriate form, should ultimately complement rather than compete with the phenomenological one. So if the measurement-device conception is founded on (and in turn encourages) a generally inadequate notion of the phenomenology of perception, we should consider what mechanistic conception might replace it as part of a more adequate account of vision. The two aspects of current thinking that Mausfeld singles out for criticism (the reduction of colour experience to three putative attributes that are poorly defined and poorly supported, and the neglect of perception of illumination) are, as he makes clear, serious inadequacies, and well illustrate the general need for more careful attention to the phenomenal structure of colour experience.
Mausfeld’s discussion of the attributes of colour perception reveals the great confusion that the standard account of colour phenomenology both sustains and conceals. The standard account may well have originated from a desire to generalize the conception of a univariate measurement device to the many dimensions (putatively, three) of colour perception. Besides its applicability in the world of technology, such a generalization is surely valid for the representation of colour at the level of the cone photoreceptors. This clarity of organization at the initial sensory level invites extension of the idea to more central physiological processes that are presumably more directly linked to perceptual experience. Perhaps an omniscient physiologist could identify some univariant signal somewhere in the brain as the physical substrate for any given phenomenal aspect of perceived colour? But this hope, or article of faith, is attended by at least two serious problems. First, since the relevant attributes are not well defined phenomenally, and still less well defined in terms of their stimulus correlates, it is quite unclear how the envisaged physiological signals should depend on the colour stimulus, or even, at the outset, how many of them there should be. Secondly, it is not clear which attributes, if any, are permanent components of the physiological representation, and which are created when the subject is called on to make a particular type of perceptual judgement. The physiologizing perceptionist might therefore retreat to a far less specific article of faith: that the activations of the brain’s interconnected neurons collectively determine perception. This leads to ‘connectionist’ network models in the tradition of Hayek. 
If states of the network have many degrees of freedom, the idea of a network substrate is consistent with the view that phenomenal colour space could be of very high dimensionality (indeed it has to be, if the associations that colours elicit—such as the warm/cool distinction discussed by Koenderink and van Doorn (Chapter 1 this volume)—are admitted as an influence on, or aspect of, the colours’ phenomenal identity). If the central neurons do not fall into a
small number of discrete classes in the way that the photoreceptors do, the resulting lack of obvious privileged directions in the physiological representation may capture the vagueness associated with any claimed phenomenal attributes. And if the neurons are coupled by many-to-many connections in multiple stages to the input, it is no surprise that the relations of the phenomenal attributes to physical quantities can be more or less inscrutable. Besides its immediate value as a precaution of conceptual hygiene, Mausfeld’s conceptual ground-clearing could ultimately be helpful as preparation for building a mechanistic model of this sort. But that is admittedly a remote and ambitious goal, and its fulfilment would surely take us a long way from the measurement-device conception with which physiologists began and to which they still tend to cling. The scenario of a complex cortical network model of the world, updated by sensory input, might help convince mechanistically oriented theorists that Mausfeld’s sensation/perception distinction is not artificial. Especially attractive in this context are network instantiations of generative models (Schrater and Kersten, 2002) in which intermediate (roughly, ‘feature-level’) neural representations are derived either in response to the sensory stimulus or by top-down connections from neurons that specify a perceptual model of the world. The goodness of fit to the feature-level signals allows selection, and parametric specification, of a good model to account for the current input. The primitives of perception find their place at the top level of this architecture. Current experimental neurophysiology likewise often reveals cortical signals that are more easily correlated with phenomenal experience than with proximal stimulus variables. Those signals can reasonably be considered as embodiments or precursors of the perceptual representation, rather than as responses to the sensory input. 
Neurons credited with providing the basis for ‘subjective contours’ are one relatively simple example. By associating these with perception, rather than with sensation, we acknowledge the essentially constructive role of the part of the system to which they belong. Clearly, the measurement-device conception cannot advance our understanding of such signals. The analysis of the earlier sensory stages of the visual process is one area where the measurement-device conception can still be useful, but even there it usually needs to be generalized a little, to accommodate transformations of the sensory input by, for instance, spatial and temporal filtering. The core of Mausfeld’s dual-coding thesis addresses a particularly consequential error in the analysis of the attributes of colour: neglect of the perception of illumination. This neglect has led to models of colour constancy in which illumination is discounted, and an illumination-invariant representation of a scene is derived from its image. Rather than assume that information about the relative reflectances of surfaces is derived in this way, and that perception of illumination is introduced afterwards (a possible, if not attractive, ‘fix’ for such models), Mausfeld suggests that cues present in the image can trigger a ‘laminar segmentation’, in which locally varying surface quality and illumination are both represented in separate parameters of the perceptual representation. The triggering metaphor usefully supplements and bridges the metaphors of passive response (filtering, attenuation, association, combination lock) on the one hand and of active cognition (cueing, inference, hypothesis, calculation, diagnosis) on the other. Like the latter, the trigger metaphor emphasizes the active role of the observer in responding to cues by a selection of already-made interpretations, but it does not exaggerate that role, as the other metaphors of this type tend to do.
Nor does it inappropriately represent automatic processes, inaccessible to introspection, by processes of deliberate reflection and choice. On the contrary, the triggering metaphor, like the ethological releasing stimulus concept, encourages a mechanistic conception of the active part of perception. The challenge of giving substance to the triggering metaphor may be a productive one, as it calls for a merging of active and passive conceptions of perception within a more or less explicit and definite mechanistic framework. Neural networks, especially those of the ‘generative model’ type, have the potential to meet this challenge. Network models can reflect the unity of the phenomenal world with almost no taint of the elementarism to which Mausfeld takes exception. As noted above, they take
us a long way from the measurement-device conception, although they retain the notion of passively propagating univariant signals or activations as the basis of the selection of the appropriate model to account for the current input. The triggering metaphor also captures the apparent immediacy of perception, including colour constancy. At the same time, it reminds us that even phenomena that unfold rapidly, or practically instantaneously, can nevertheless be the products of a long and perhaps complex causal chain, making that apparent immediacy misleading. In his discussion of conflation of levels Mausfeld draws attention to a mildly troublesome pathology that has developed through carelessness in conceptual hygiene. His prescribed remedy, however, is the drastic one of amputation by cutting into the causal chain at the input to the sense organ: his favoured theoretical framework incorporates ‘no notions of reference to the environment’. Perhaps a healthier theoretical organism could result from milder treatment, where conflation is avoided by appropriate integration rather than amputation. Here Mausfeld draws an analogy with the study of the digestive system, where ‘no one would maintain that in order to understand its function one has to take into account its evolutionary history, or physical or chemical regularities of food composition in a certain environment’. But the digestive system does not have the function of creating a representational model of the sources of the ingested food. To make the analogy a closer one, suppose that eating a hamburger triggers a colour change, the eater taking on the colour of the cow from which the meat was taken. 
In trying to understand this digestive achievement, we would indeed want to consider what mechanisms make it possible for the appropriate ‘inference’ or response to ensue from the ingestion of the meat, and what particular characteristics of the ground meat provide the basis for our digestive system’s inferences as to cow colour. But we would also want to know why the proximal cues are effective for providing the obtained information about the cow—indeed, if the transmission of such information was the primary purpose of digestion, the hamburger would acquire a significance, derived, as with all ethological releasing stimuli, from its association with a certain class of distal object (in this case a cow). How does cow colour influence the particular hamburger meat cues that we are sensitive to? Are there other possible cues that we neglect? Is the context in which the hamburger is eaten important to the chromatic ‘inference’ we make from it? How closely does the accuracy of our recovery of cow colour approach the limit of what is physically possible? In vision, physical and physiological optics (epipolar geometry, diffraction, ocular aberrations, photoreceptor sampling, quantum fluctuations, spectral filtering) are obviously key constraints on perception. The study of perception of material quality through the bi-directional reflectance function (reflected phenomenally in such things as gloss) is one example of how subjective phenomena are being linked to distal (not just proximal) physics in interesting ways. It is indeed possible to define the internal semantics of the visual system with reference to the proximal stimulus only, excluding all such influences and environmental associations. But it is not clear what is gained, and surely something important is lost, by the amputation. 
For the perceptual model to be useful, its parameters must be systematically related to counterparts (physical, but not necessarily with a simple physical definition) in the external world that the model represents. When surface reflectance, for instance, is estimated by heuristics such as taking ratios of retinal stimulus intensities, we may talk of accurate or inaccurate ‘recovery’ or estimation of a physical characteristic of the perceived object. Mausfeld objects to this terminology, which does invite certain misinterpretations. The term ‘recovery’ might encourage non-scientists to commit a Rylean category mistake; the ‘estimated’ quantity may have no simple and well-defined physical referent, and the perceptual ‘estimate’ need not have a simple or well-defined functional dependence on its physical counterpart, beyond approximate monotonicity. But these misinterpretations are less tempting in treatments of the visual process as a causal chain extending from object to perception, so in that context loose talk of ‘recovery’ may be arguably innocuous.
The importance of the proximal stimulus in Mausfeld’s view of vision is reflected in his suggestion that the primary task of perception theory (or at least of ‘formal theories of perception’) is the determination of ‘the equivalence classes of input patterns that give rise to the same internal representations or percepts’. This sounds very restrictive, but it does not exclude mechanistic analysis if the internal representations considered include not just those that are introspectively or behaviorally accessible—and are in that sense ‘perceptual’ representations—but intermediate neural representations as well. Thus Mausfeld’s chapter need not encourage mechanistically minded readers to abandon that orientation. But it may well induce them to adjust their intellectual priorities or research priorities in a productive direction. Perception of illumination, including the stimulus conditions for its occurrence, is at present almost completely neglected, though ripe for investigation. Haze and transparency, the role of shape-from-shading computations in colour perception, and the categories and attributes of surface colour, including the perception of material characteristics from visible texture, are other examples of phenomena that have been neglected partly as a result of insufficiently careful analysis of what colour vision is like. These are only beginning to receive the attention they deserve. Elucidation of the frequently complex stimulus-dependence of these aspects of perception will doubtless lead to mechanistic models, and perhaps these can eventually pass from neuromythological status to neurophysiological confirmation. The difficulty of this final step will surely be great, and may be greater than we easily imagine. But even without models, without neurophysiological data, and without neuromythology, the attempt to analyse properly what is phenomenally given and to relate that to the optical stimulus is a useful enterprise and an immediately feasible one.
References
Mausfeld, R. (2002). The physicalistic trap in perception. In Perception and the physical world (ed. D. Heyer and R. Mausfeld), pp. 75–112. Wiley, Chichester.
Schrater, P. and Kersten, D. (2002). Vision, psychophysics, and Bayes. In Statistical theories of the brain (ed. R. P. N. Rao, B. A. Olshausen, and M. S. Lewicki). MIT Press, Cambridge, Massachusetts.
Commentaries on Mausfeld An internalist account of colour Don Hoffman Colour pervades our visual experience, and has even seemed to some to be essential to any visual experience. Socrates, in Plato’s Charmides, remarks, ‘And sight also, my excellent friend, if it sees itself must see a colour, for sight cannot see that which has no colour.’ Yet despite its pervasive influence on our most dominant sense, colour remains an enigma whose proper scientific and philosophical enquiry remains a point of much debate. Colour has been identified in scientific theories with wavelengths of light, and with reflectance functions of surfaces. Colour has been identified in philosophical theories with objective properties of a mind-independent world, and with subjective perceptions of observers. The range of such theories does not suggest any convergence by scientists or philosophers to a commonly accepted framework for investigating colour. This might seem remarkable in light of modern technological advances that allow us, with high fidelity, to record, transmit, and display colour for television, cinema, and virtual reality. How could we achieve such technology without a commonly accepted framework? In his chapter, Rainer Mausfeld proposes that the simplified representations of colour that have been developed for technological purposes are in part responsible for retarding the development of an adequate account of the full range of colour phenomena. He goes on to propose an ethological–internalist framework for investigating colour that holds promise for developing an adequate account. I agree that the pointillist approach to colour representation that serves well for technology can be an impediment to an adequate account of colour, if taken seriously as the proper framework for developing such an account. Technological devices transmit and display colours one pixel at a time, and the representations of colour required for this purpose are three-dimensional, e.g. 
red-green-blue (RGB) or hue-saturation-brightness (HSB) representations. But this representation, which is adequate at the pixel level, is inadequate to account for the richer and higher-dimensional range of colour experiences that arise as soon as one looks more globally than the pixel level. And I agree with Mausfeld that to assume that the pointillist representation is somehow original or foundational, and that the richer colour experiences that arise more globally are secondary, is to get started in exactly the wrong direction for developing an account of colour. Instead, the global level is the correct starting point, and the colours experienced with pointillist displays should fall out as special or degenerate cases of the more global theory. Mausfeld’s internalism is the point of greatest interest for me. It places emphasis on the internal representations that human vision constructs from retinal images, and the role of colour in those representations. In particular, Mausfeld proposes that human vision builds representations for two distinct categories of visual entities: surfaces and illuminations. Colour is one of the free parameters that must be specified in both of these representations. Internalism is a subjectivist approach to colour. Colours are not identified with objective properties of a mind-independent world, such as wavelength or reflectance. They are instead firmly identified with the internal representations constructed by the viewer. Moreover, the causal connections that might normally obtain between objective properties of an external world and the internal representations that are constructed are not a primary concern of the internalist. Instead, the internalist studies human visual experiences of colour, and builds an account of the representations that underlie those experiences.
The only aspect of the external world that Mausfeld feels obliged to include in the internalist analysis is the physico-geometric properties of the light incident at the retina. These allow us to understand the relationships between the internal representations of the viewer and the equivalence classes of the physical inputs by which they were triggered. This is where I would like to suggest that Mausfeld’s internalism could be made even more thorough. Mausfeld is steadfast in distinguishing reference to higher-level entities of the physical world, such as ‘surface’, ‘physical object’, and ‘event’, from their internal representational counterparts constructed by the human visual system. He refuses, I think properly, to mix these categories. I propose that this mixing should also be refused for the lowest-level physico-geometric properties of light. Just as human vision builds internal representations of surfaces and objects and events, it also builds internal representations of the low-level geometric properties of light. The internalist does not need to abandon internalism to speak of these geometric properties, or to build theories of their relationship to the other, higher-level, representations constructed by human vision. Indeed it is problematic, and might not even be possible, to be a consistent internalist and yet continue to refer to mind-independent objects and their properties with any degree of confidence. For, according to the internalist view, all visual experience of the world can be ascribed to the creation of internal representations, and these representations need not bear any particular causal or resemblance relations to any supposed mind-independent realm. What is true of vision is, presumably, true of all other sensory modalities as well. 
So it then becomes difficult to get any independent access to the properties of any presumed mind-independent realm, and therefore difficult to compare even the most basic of these presumed properties, such as the physico-geometric properties of light, to the internal representations of the observer. What the internalist can do consistently is to compare different levels of internal representations with each other, and then theorize about the causal and semantic relations between them. The internalist can do this quite consistently even for internal representations of the most basic of the geometric properties of light and their relationship to internal representations of surfaces, objects, events, and their many properties. But if the internalist wants to make contact with any presumed mind-independent properties of a presumed mind-independent world, there is much work to be done to show how this is, in principle, possible, given internalist assumptions about perception. I buy the internalist assumptions, and I am happy to abandon claims to confident knowledge of a mind-independent realm. Psychophysical, neurophysical, and computational investigations of visual perception can continue in their current forms quite successfully without such assumptions of confident knowledge of a mind-independent realm, and restricting themselves only to internalist principles. And I think they should.
Chapter 14
THE IMPORTANCE OF ERRORS IN PERCEPTION
Alan Gilchrist
Preface
When I began to work on lightness perception in the late 1960s, I focused on veridicality. I was not unaware that we make lightness errors. But I felt that the greatest challenge was to account for the impressive veridicality achieved by the visual system. I also felt that the scope of the errors we make had been exaggerated by laboratory experiments. Thus I deferred the question of errors. I guessed that, once veridicality could be explained, the ‘heavy lifting’ would be done and one would be in a good position to explain the errors. I was confident that the source of the errors would be found in one or more of the components required to achieve veridicality. But that expectation was not fulfilled.
Veridicality seemed to require several essential components. First, luminance ratios at edges and gradients must be encoded. Secondly, these edges must be classified in terms of whether each represents a change of reflectance or illumination. Finally, once the edges are classified, they must be integrated within classes. This yields two maps of the visual field, one a map of the reflectance array, and one a map of the pattern of illumination projected on to that reflectance array.
I first began to deal with the problem of errors about 15 years ago. I tried to model the errors by postulating successively failures in edge encoding, edge classification, and edge integration. Some errors could be explained by each of these, but the results were disappointing. As I thought more about errors, I came to recognize a compelling logic. The errors are systematic, not random. They can only come from the visual system. Therefore the pattern of errors we make must constitute a kind of signature of visual processing.
A. Gilchrist
Introduction To err is human. Lucky for visual scientists. But the errors are usually small and our perceptions of the world correspond very well to the world itself. Lucky for humans. Without a high degree of veridicality in visual perception we would not be here. Both veridicality and error have played important roles in the scientific study of vision. My thesis in this chapter is that, although veridicality has been used systematically in the modelling of visual perception, errors have been consulted only by fits and starts, and the pattern of errors in visual perception has not been exploited systematically. The analysis will focus on the visual perception of surface lightness, that is, the perceived whiteness, greyness, or blackness of object surfaces in the visual world. Ironically, the very success by which the visual system represents the world to us prevents us from appreciating this accomplishment. Newcomers to the study of visual perception
become drawn in only when the challenges to veridical perception are laid out concretely. The same pattern can be seen in the history of visual science. The modern era of research in visual perception could not begin without some concrete mechanism, and this was provided by the discovery of the retinal image by Kepler (Lindberg 1976) and its demonstration by Descartes in the eye of the slain ox. At first, the discovery of a picture in the eye appeared to solve the problem of vision. But further thought soon showed that this discovery only reveals the profound problems confronting a theory of vision. How can veridical perception be derived from an image whose properties seem to be so different from the distal world we see?
Theoretical advances inspired by veridicality
Since the discovery of the retinal image, we find four moments during which clear-cut advances were made in lightness theory. These are:
1. the classic work of Helmholtz and Hering during the nineteenth century;
2. David Katz in the period 1911–1935;
3. gestalt theory during the 1920s and 1930s;
4. computational models of the 1970s.
Each of these advances was associated with an emphasis on veridicality. Note that I have not included the discovery of lateral inhibition in this list. That discovery provided important clues concerning the mechanism by which luminance ratios are encoded, but did not advance lightness theory per se. Despite widespread expectation, physiological work has yet to drive any important advance in lightness theory. Veridicality, by comparison, has not merely inspired theories of lightness; it has inspired the entire field. The recognition of veridicality in the nineteenth century Helmholtz (1867), Hering (1874), and Mach (1959) founded the modern era of scientific work on lightness perception by drawing attention to the veridicality of lightness. To the layperson there is nothing surprising in the observation that white objects tend to appear white and black objects tend to appear black, despite the conditions of illumination. This is surprising only in the theoretical context inherited by Helmholtz and Hering. At that time a one-to-one relationship between local stimulation and visual response was almost universally assumed. Helmholtz and Hering noted that in fact, our percepts correspond more closely to the distal stimulus than to the proximal stimulus. This fundamental challenge has served as the framework within which the enterprise of lightness theory has been conducted since their time. There were, to be sure, many attempts to deny the problem, to dissolve it rather than solve it. The structuralists insisted that the percept does indeed correspond to the proximal stimulus, even if one must be trained to see things this way. But the clock could not be turned back.
Measuring the degree of veridicality: David Katz
The next important figure we meet in lightness history is David Katz (1911, 1935). Katz did not break new ground theoretically. But he took the question of distal/percept correspondence into the laboratory, developing the basic psychophysical methods for measuring lightness constancy. The Katz method most widely used begins with side-by-side lighted and shadowed fields. This is accomplished by means of a vertical screen that shields half of some visual space from the light emanating from a window or lamp. A disc of a given reflectance, often white, is mounted in the centre of the shadowed field and serves as the standard disc. A second, spinning disc with adjustable black and white sectors is mounted in the centre of the lighted field. The reflectance of the adjustable disc is chosen using a reduction screen. A large piece of cardboard with two holes cut in it is interposed between the observer and the stimulus display, so that each hole is homogeneously filled with the image of one of the target discs. The reflectance of the adjustable disc is set by an observer so that the two holes in the reduction screen appear absolutely identical in brightness. This occurs when the two discs become equal in luminance, equal, that is, in proximal stimulation. When the reduction screen is now removed, the discs appear far from equal in lightness. The observer is now asked to make the two discs appear equal in lightness by making a new setting for the adjustable disc. If the observer performs this task perfectly, the reflectance (as measured by the size of the black and white sectors) of the two discs will be equal. However, once the observer has identified this point of subjective lightness equality, there is always some discrepancy between the reflectance of the adjustable disc and the reflectance of the standard disc. 
This discrepancy (called a B-quotient by Katz) gives us a measure of the absolute amount of error in setting the discs to equal reflectance. By this measure, the error increases in proportion to the difference in illumination of the two fields. But a more useful measure is the amount of error relative to the illumination difference. The amount of relative error, measured in this way (and called a Q-quotient by Katz), actually decreases as the illumination difference is increased. Brunswik (1929) transformed Katz’s measure into a standardized formula by which the degree of constancy in any domain is given a value between zero and one. The formula for Brunswik’s ratio is: BR = (R − S)/(A − S), where A stands for the actual value of the object, S stands for a value corresponding to the luminance of the target, and R stands for the value of the match made by an observer. Despite some flaws, the Brunswik ratio has been used since then as a basic measure of the strength of constancy, not just for lightness, but for size and other constancies. In effect, the Brunswik ratio tells us whether the percept corresponds more closely with the proximal stimulus (0%) or with the distal (100%). Brunswik ratios for lightness have ranged widely from very low to very high. Katz himself obtained disappointingly low values, in the range of 20–45%. In part, this reveals a weakness in Brunswik’s formula. Because the visual response to luminance corresponds to the log of luminance, as shown by Weber and Fechner, Thouless (1931) modified Brunswik’s formula by using log values for its terms. Using the Thouless formula, Katz’s constancy values are in the range 73–86%. But even these values may understate the degree of constancy under
typical conditions. Some of Katz’s conditions are now known to be associated with poor constancy performance. The main factor is articulation. Katz’s displays were quite impoverished, visually. His own student Burzlaff (1931) showed that the strength of constancy jumps to slightly over 100% when both the lighted and the shadowed fields are highly articulated, that is, they contain papers of many shades of grey.
Gestalt theory: focus on constancy
The Gestalt theorists (Koffka 1935; Köhler 1971) took a more radical position, rejecting the concept of sensation. The Gestalt theorists went to great lengths to show that the percept does not correspond to local stimulation, and they declared visual constancy to be the central problem in the field. They in turn were eclipsed, not so much by a psychological school as by the events surrounding the Second World War. After the war, the focus in lightness perception moved away from veridicality. Indeed, brightness, which is our perception of proximal stimulation, became as important as lightness. One might say that the contrast theories were motivated by error as revealed in the simultaneous lightness contrast illusion. But, more importantly, these theories were motivated by physiological findings in the form of lateral inhibition. In any case, veridicality was not central to work in lightness from the Second World War until the end of the 1960s, and as a consequence, little headway was made.
Back to reality: computational models in the 1970s
The situation changed dramatically with the coming of the computer and machine vision. If military considerations in the larger world played the key role in this development, it nevertheless reinvigorated the field of lightness. Once again the spotlight returned to veridicality. Machine vision seeks to produce an output that corresponds to reality. Errors are to be avoided and lightness errors made by human observers are of no interest. 
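The constancy measures introduced in the discussion of Katz's experiments can be made concrete with a short calculation. The Python sketch below computes the Brunswik ratio, BR = (R − S)/(A − S), and Thouless's variant, which applies the same formula to log values. The function names, the sample reflectances, and the 10:1 illumination ratio are illustrative assumptions, not values taken from Katz's data.

```python
import math

def brunswik_ratio(r, s, a):
    """Brunswik ratio (R - S) / (A - S).

    0 means the match equals the luminance (proximal) match,
    1 means the match equals the actual (distal) value.
    """
    return (r - s) / (a - s)

def thouless_ratio(r, s, a):
    """Thouless's variant: the Brunswik formula applied to log values."""
    return (math.log10(r) - math.log10(s)) / (math.log10(a) - math.log10(s))

# Hypothetical Katz-style setup (all numbers are assumptions for illustration).
a = 0.90            # A: reflectance of the standard disc in the shadowed field
illum_ratio = 10.0  # lighted-field illumination / shadowed-field illumination
s = a / illum_ratio # S: adjustable-disc reflectance that equates luminance
r = 0.30            # R: reflectance the observer sets for a lightness match

print(round(brunswik_ratio(r, s, a), 3))  # 0.259
print(round(thouless_ratio(r, s, a), 3))  # 0.523
```

With these assumed numbers the same observer setting yields a noticeably higher constancy value under the log formula, which is the direction of the change the text reports for Katz's results.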
The machine vision focus on veridicality injected new life into the psychology of lightness constancy. A new breed of computational models of lightness emerged. These models talked of recovering reflectance (Horn 1974; Marr 1982). Illumination becomes entangled with reflectance in the projection of the physical environment on to the retina, and computational models often propose algorithms by which these properties are then disentangled by the visual system. This approach to the problem is known as inverse optics. Several such decomposition models were produced (Bergström 1977; Barrow and Tenenbaum 1978; Gilchrist 1979; Adelson 1990; Arend 1994). In a kind of decomposition model called an intrinsic image model (Barrow and Tenenbaum 1978; Gilchrist 1979), the retinal image is somehow parsed into layers, each layer representing a property of the distal world. One of these layers represents reflectance, and, if the model is a good one, the reflectance intrinsic image will stand in close correspondence to the pattern of surface greys in the visual environment. My own contribution to this enterprise (Gilchrist 1979, Gilchrist et al. 1983) was a model in which luminance edges and gradients within the image are first encoded at the retina, then classified as either changes in reflectance or changes in illumination. The edges are then mathematically integrated within separate classes to produce two intrinsic images: one for
reflectance and one for illuminance. If the edges are classified correctly and integrated correctly, the resulting intrinsic images should be veridical. But herein lies the problem. The model is too good. It doesn’t account for errors in lightness. Even as I set out to construct a model of veridical lightness, I was aware that lightness is not completely veridical. But I deferred that problem until after the ‘heavy-lifting’ of veridicality. I gambled that if the essential components necessary for veridicality could be identified, one or more of them would provide the key to errors. In this I have been disappointed. The key to lightness errors has now been sought, by myself and by others, in each of these components: edge encoding (Cornsweet 1970; Arend and Goldstein 1987); edge classification (Gilchrist 1988); and edge integration (Rock 1983; Kingdom 1999; Ross and Pessoa 2000); and these efforts continue. But so far such explanations have been able to account for only a limited range of lightness errors, and often only qualitatively.
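The three components named above (edge encoding, edge classification, and integration within classes) can be illustrated with a one-dimensional toy scene. This is only a minimal sketch and not the published model: the classification step is simply assumed to succeed, with the edge labels supplied by hand, and every name and number below is hypothetical.

```python
import math
from itertools import accumulate

# Hypothetical 1-D scene: five adjacent patches (values are assumptions).
reflectance = [0.9, 0.3, 0.3, 0.03, 0.9]
illumination = [100.0, 100.0, 10.0, 10.0, 10.0]
luminance = [r * e for r, e in zip(reflectance, illumination)]

# 1. Edge encoding: log luminance ratio at each border between patches.
edges = [math.log10(b / a) for a, b in zip(luminance, luminance[1:])]

# 2. Edge classification: assumed correct and given by hand here.
labels = ["reflectance", "illumination", "reflectance", "reflectance"]

# 3. Edge integration within one class: sum only the edges of that class.
def integrate(edges, labels, wanted):
    steps = [e if lab == wanted else 0.0 for e, lab in zip(edges, labels)]
    return [0.0] + list(accumulate(steps))

log_reflectance_map = integrate(edges, labels, "reflectance")
log_illumination_map = integrate(edges, labels, "illumination")

# Both maps are relative to the first patch (0 log units there).
relative_reflectance = [10 ** v for v in log_reflectance_map]
relative_illumination = [10 ** v for v in log_illumination_map]

print([round(v, 3) for v in relative_reflectance])
print([round(v, 3) for v in relative_illumination])
```

When the labels are correct, the two recovered maps match the scene's reflectance and illumination up to a common scale factor, which is the sense in which such a model is veridical. Because only ratios are integrated, the sketch yields relative values; assigning absolute greys would need a further assumption.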
The logic of errors We have seen that veridicality provides a strong constraint to models of lightness perception. But there are strong reasons to study errors in lightness as well. The basic argument is simple but compelling: the pattern of lightness errors must be the signature of the system by which vision determines the lightness values of seen objects. Lightness errors are systematic, not random. Logically, it seems that errors can come only from the visual system itself. Does it make any sense to say that errors come from the physical world? If perception is the product of an environmental contribution and an organismic contribution, the pattern of errors must be a reflection of the organismic contribution. Error constrains more than veridicality This implies that the pattern of lightness errors constitutes a powerful tool for characterizing the nature of the lightness system, much as fingerprints can be used to identify a guilty party. Indeed, the study of errors might even provide a stronger constraint to models of lightness than does veridicality. Veridicality, by itself, cannot be uniquely identified with a visual process. One can easily imagine that the same degree of veridicality is achieved by different models. But a given pattern of errors is unique to a certain kind of visual software. Lightness errors exist almost everywhere one looks. When we see objects and surfaces in the real world, we assume that our perceptions are veridical. But we never know that for sure because we do not walk around with a set of Munsell chips for testing our percepts. Brunswik (1944), it will be recalled, followed human observers around, asking them to judge the size of various objects and then measuring the objective size of those objects. Tiziano Agostini and I have made analogous tests of lightness perception and we found lightness errors everywhere.
Definition of errors In general, I regard an error as a discrepancy between the percept and the distal stimulus. To be more operational, I will define a lightness error as the difference between the reflectance
of the physical target surface within the stimulus display and the reflectance of the chip selected from the Munsell chart as matching the target. Role of the matching chart In my laboratory, we collect most of our data using a matching task with a Munsell chart. The chart is composed of 16 Munsell chips on a white background, all under reasonably bright illumination. Does our use of the Munsell chart imply that perception of the chips on the chart is totally without error? No. All that is necessary, at least for the moment, is that the Munsell chips are seen with very little error, that is, error that is small relative to the error one is trying to measure. The same constraint applies in all measurement. No measuring device is perfect. It is only necessary that the errors in the measuring device are small relative to the errors one wants to measure. To this end, the matching chart embodies those conditions under which perception of lightness is currently known to be optimal. These conditions include: (1) a large number of patches seen under the same illumination; (2) bright illumination; and (3) a large white background. If and when further research reveals additional factors not optimized in this chart, the chart should be changed accordingly. Indeed, research has shown that lightness constancy is better for targets on a white background than for targets on a grey or black background (Helson 1943). Thus the grey background used to define the standard Munsell conditions should now be replaced by a white background. This makes the measuring device stronger and allows it to be used for measuring ever smaller errors. We need not assume that the chart represents perfect lightness perception, only perception good enough to allow continued theoretical development. For our study of locus of error in the simultaneous contrast display (Gilchrist et al. 1999), it was necessary to modify the chart. 
We wanted to determine which of the two grey targets, the one on the white background or the one on the black, shows the greater error. Obviously the use of chips on a white background is unfair here. For this test then, we placed the matching chips on a black and white checkerboard background, so that each chip bordered white and black in equal degrees. In addition to finding that most of the error occurs for the chip on the black background, we also found that, in general, the checkerboard background produces the same pattern of data as that produced by the white background, but not the same as that produced by the black background. This further reinforces the important role of white in anchoring.
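The operational definition of a lightness error given above, and the comparison used to locate the error in the simultaneous contrast display, can be sketched as follows. The reflectance values are invented to mirror the qualitative pattern described in the text and are not data from the cited study; the function name and the sign convention (matched chip minus target) are likewise assumptions.

```python
def lightness_error(target_reflectance, matched_reflectance):
    """Signed lightness error: reflectance of the matched Munsell chip
    minus reflectance of the physical target (sign convention assumed)."""
    return matched_reflectance - target_reflectance

# Two physically identical grey targets (hypothetical reflectance 0.20).
target = 0.20

# Hypothetical reflectances of the chips chosen as matches for each target.
matches = {
    "on white background": 0.18,
    "on black background": 0.30,
}

errors = {name: lightness_error(target, m) for name, m in matches.items()}

# The target whose match deviates most from its true reflectance.
largest = max(errors, key=lambda name: abs(errors[name]))
print(largest)
```

Because both terms are reflectances, the comparison stays within physical quantities, as the definition in the text intends.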
Philosophical problems with the concept of perceptual errors
The concept of errors given here, with its attendant implication that phenomenal and physical states can be compared directly, activates various alarm bells. I do not intend to address these issues in depth. I will merely explain why I believe that none of these points poses a serious challenge to my proposed agenda concerning errors. Here are several such criticisms, together with my responses:
1. The visual system does nothing other than what it was evolved to do. Thus it is misleading to call its performance erroneous. My focus on errors is not meant to portray the visual system as poor in general, or highly error-full. Most errors are probably quite small. But that’s a separate matter. The ability of errors to constrain models is orthogonal to the magnitude of those errors. It is the pattern that counts, not the size. Fingerprints at a crime scene are often invisible to the naked eye and a white powder is used to raise their contrast. It is only necessary that the contrast rise above some threshold that allows the pattern to be identified. I am proposing merely that the pattern of errors offers a strong constraint to theories of lightness, even when the errors are slight. And yet, it seems artificial to deny perceptual errors by definition. If an animal dies because it misperceives the width of a chasm, and thus fails to put enough effort into the jump across it, it seems reasonable to call this an error in perception.
2. Comparing a phenomenal state with a physical state is like comparing apples to oranges. This is a challenge, not especially to my proposal, but to the very foundation of psychophysics itself, the history of which shows that we can indeed make such comparisons, and very fruitfully. Nevertheless, in part to deter such wrangling, the definition of error given above is constructed so that both terms of the discrepancy fall wholly within the physical universe.
3. We have no direct way to measure phenomenal states. While I agree that we cannot measure phenomenal states directly, I do believe that the range of ingenious psychophysical methods that have been developed, such as matching and nulling, provide a reasonable representation of the percept, certainly enough to support the continued development of perceptual theory. This is all that is required because, as our understanding of the processes of perception grows, so will our ability to create increasingly valid measures of the percept.
4. There is no single, privileged way to define the physical world. Thus, a percept ruled in error relative to one definition of the physical world may not be in error by another definition. Gibson (1966) has noted that the physical world can be described at various scales, and the scale of description most relevant for physicists may not be the scale most relevant for visual perception. In my view, it is consistent with the spirit of Gibson’s point to say that lightness corresponds to reflectance much as perceived size corresponds to physical size. This need not imply that an observer is either familiar with the physical definition of reflectance or is consciously trying to equate such a quantity. A more meaningful concept for the observer may simply be the differential ability of surfaces to reflect light, but this concept is very close to the more precise concept of reflectance.
Legitimate and illegitimate errors
In certain stimulus situations, when the stimulus is locally impoverished or when it contains coincidental arrangements, we might find it misleading to attribute an error to the visual system. Such cases seem to require a distinction between two classes of error. I will call these legitimate and illegitimate errors. Consider the Gelb (1929) effect. A piece of black paper appears white when it is suspended in midair and exclusively illuminated by a spotlight. The paper is physically black, but it appears white. Even Schwartz must agree that this constitutes an error. But it is a legitimate
error. The test is this. If the black paper and the spotlight were taken away and replaced with a piece of white paper, the retinal image would be completely unchanged. This might be called total equivalence of the stimulus. No one could expect the visual system to detect the fact that one target is black and the other is white when the two images are identical, both locally and globally. An ideal observer would still make this mistake. Technically this is still an error; however, one that tells us how the visual system works. It shows that the system relies merely on luminances from the image. If the system had ‘reflectance detectors’, for example, such an error would not occur. Now consider the staircase Gelb effect (Cataliotti and Gilchrist 1995). A row of five squares, ranging from black to white, is presented within a spotlight. The perceptual result is a dramatic compression of the range. All the squares appear light grey; even the black square appears roughly middle grey. In this case it is hard to argue that the stimulus is impoverished. The entire range of grey shades is represented within the spotlight. Indeed, even the illumination difference between the spotlight and the prevailing illumination is available (in principle) by comparing the luminance of the brightest square to that of some white surface outside the spotlight. According to my intrinsic image model of lightness (Gilchrist 1979), this stimulus should be ideal. First, there is nothing about the stimulus to prevent the visual system from correctly encoding luminance ratios at the borders. Secondly, the edges should be easily classified. The luminance range within the overall laboratory scene is much larger than the standard range (30:1) between black and white. This suggests that somewhere the image must be segmented into at least two regions of different illumination. 
The borders between squares are sharp and they divide co-planar, adjacent surfaces, suggesting strongly that they are reflectance borders. The occluding border surrounding the entire group of squares is almost certainly an illumination border. Indeed, except by coincidence, occlusion borders always contain at least a component of illumination change. Finally, an integration of the reflectance edges should reveal that the darkest square of the group is 30 times darker than the brightest. On the plausible assumption that the highest value in the reflectance intrinsic image is white, the darkest reflectance should be computed to be black. But despite the apparent fact that the stimulus contains adequate, indeed rich, information, a major error occurs: the black square is perceived as middle grey. I consider this to be an illegitimate error. The criterion of total stimulus equivalence does not apply here. If the five squares ranging from black to white were replaced by a group of squares ranging from middle grey to white, the retinal image would be different, whether or not the spotlight were removed or reduced! Consider a second, familiar, example: simultaneous lightness contrast. A grey square on a black background appears lighter than an identical grey square on a white background. Here again, there is no sleight of hand, no coincidental arrangement, no local impoverishment of the image. The system merely must integrate the edges between the two targets to determine that they are identical in reflectance. But an error occurs none the less: an illegitimate error. Legitimate errors hold little interest for theory. Visual software can hardly be blamed if it fails to compute different lightness values for identical targets embedded in identical surrounds.
Illegitimate errors, however, provide a chance to move lightness theory forward. These are errors that should not occur, given our theoretical expectations. And they tell us much about what information the visual system is using.
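The two tests described in this section can be made concrete with a little arithmetic. The sketch below assumes the simplest possible image model, in which luminance is reflectance multiplied by illumination. The reflectances (3% for black, 90% for white, giving the 30:1 range mentioned in the text), the two illumination levels, and the function names are all illustrative assumptions, not values or procedures taken from the experiments cited. It shows why the Gelb display passes the test of total stimulus equivalence, why the staircase display does not, and what an edge-integration account with a highest-value-is-white assumption would be expected to compute.

```python
# Assumed, illustrative values; none are measurements from the studies cited.
BLACK, WHITE = 0.03, 0.90           # reflectances spanning the 30:1 range
ROOM_LIGHT, SPOTLIGHT = 1.0, 30.0   # hypothetical illumination levels


def luminance(reflectance, illumination):
    """Simplest image model for a matte surface."""
    return reflectance * illumination


def luminance_range(reflectances, illumination):
    """Ratio of highest to lowest luminance; illumination cancels out."""
    lums = [luminance(r, illumination) for r in reflectances]
    return max(lums) / min(lums)


def integrate_edges(luminances, anchor=WHITE):
    """Assign reflectances from luminance ratios, taking the highest
    luminance to be white (the assumption stated in the text)."""
    top = max(luminances)
    return [anchor * lum / top for lum in luminances]


# Gelb effect: black paper in the spotlight vs. white paper in room light.
gelb_black = luminance(BLACK, SPOTLIGHT)
plain_white = luminance(WHITE, ROOM_LIGHT)
totally_equivalent = abs(gelb_black - plain_white) < 1e-9

# Staircase Gelb: five squares from black to white inside the spotlight.
staircase = [0.03, 0.07, 0.16, 0.38, 0.90]     # assumed reflectances
grey_series = [0.16, 0.25, 0.38, 0.59, 0.90]   # middle grey to white
predicted = integrate_edges([luminance(r, SPOTLIGHT) for r in staircase])

staircase_range = luminance_range(staircase, SPOTLIGHT)   # about 30:1
grey_range = luminance_range(grey_series, ROOM_LIGHT)     # about 5.6:1
```

Because the luminance range within a group of coplanar squares does not depend on the level of illumination, no adjustment of the spotlight can make the grey-to-white series produce the same image as the black-to-white series. On these assumptions the darkest square should be computed as black, which is why its appearance as middle grey counts as an illegitimate error.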
Survey of errors in lightness
It might be argued that the foregoing rationale for studying errors merely states the obvious. Just as the study of errors made by brain-damaged patients is a central method in neuroscience, is not the study of errors already deeply embedded in the psychophysical study of lightness? Have not errors already motivated many experiments in lightness? My answer is both yes and no. Certainly the study of illusions has long fascinated students of lightness. But while there is no obvious difference between lightness illusions and lightness errors, only certain kinds of lightness error have traditionally been called illusions. For example, the failure of constancy that occurs under a change of illumination is not usually treated as an illusion. Placing a grey paper on a white background makes it appear darker than it appears on a black background, and this is called an illusion. But placing a grey paper in a shadowed region makes it appear darker than it appears in a lighted region, and this is called a failure of constancy. Why should the former be called an illusion, but not the latter? Lightness errors that are called illusions typically have certain properties. They have an entertaining quality and their illusory nature is quite accessible. One can easily ascertain that the two grey targets in simultaneous lightness contrast are physically equal, despite their different appearances. But the distinction between illusions and errors is probably best understood in relation to the prevailing theoretical climate. From the beginning, and indeed until quite recently, it was assumed that the basic lightness response at a given location in the visual field corresponds to the local stimulation there. I have called this the photometer metaphor (Gilchrist 1994). From this perspective it is completely unremarkable that placing a grey paper in shadow makes it appear darker, because its luminance value is lowered.
But if two targets of equal luminance appear different because they lie on different backgrounds, this is newsworthy. The fundamental problem is that the study of lightness errors has been piecemeal, not systematic. Even within the category of errors traditionally labelled illusions, we have seen no general theory of lightness illusions. Often research has sought merely to find an explanation for a single illusion. More recently we have seen attempts to explain a cluster of illusions (Logvinenko 1999; Ross and Pessoa 2000). We need a theory not only of all lightness illusions, but of the entire panoply of lightness errors. The potential value of lightness errors has not been appreciated. Evidence for this can be seen in the dramatic mismatch between errors predicted by models of lightness, and empirical findings on errors. Space does not permit a list of such mismatches, but perhaps three examples will suffice:
1. Helmholtz’s (1867) theory predicts that children and animals will show less constancy than adults.
2. Wallach’s (1948) ratio theory predicts that a grey disc on a white background will appear the same as a black disc on a grey background.
3. Contrast theories based on lateral inhibition (Jameson and Hurvich 1964; Cornsweet 1970) predict that homogeneous regions will not appear homogeneous, especially near borders.
None of these predictions are consistent with the facts. And many more examples could be given. If lightness errors were used systematically to constrain theories, one would find somewhere in the literature a survey of empirical findings on lightness errors. Yet no such survey exists. Here I will try only to describe a starting point for such a survey. A fuller treatment can be found in Gilchrist et al. (1999). The empirical literature on lightness errors and illusions is vast. How can we begin to organize the findings? A first step is to divide lightness errors into two broad classes. These correspond to the two broad types of lightness constancy: illumination-independent constancy and background-independent constancy. Lightness is roughly constant in the face of both changes in illumination level and changes in background. But neither constancy is complete. Because these two kinds of constancy make such apparently different demands on lightness theory, the first challenge for a model of lightness errors is to account for both kinds of error with a single model. This is a rather new challenge for lightness theories. Traditionally, theories have sought a unified explanation for illumination-independent successes and background-dependent failures (simultaneous lightness contrast). Illumination-dependent failures and background-independent successes have been ignored. Intrinsic image models seek to explain both illumination-independent and background-independent successes, while ignoring failures of both. An errors project would seek a common explanation for both illumination-dependent and background-dependent failures. Such a theory would not ignore the successes but would explain them only to the extent that they exist.
An anchoring theory of errors
My colleagues and I have tried to take a systematic approach to the problem of lightness errors, and this work has resulted in an anchoring model of lightness (Gilchrist et al. 1999). I will give only a simple sketch of this model here and describe how the model approaches both illumination-dependent and background-dependent failures. Perhaps the simplest way to present the anchoring model is to describe how it applies to simple (that is, single-framework) images, and then expand the analysis to complex images. I argue that the simplest conditions for studying lightness consist of two surfaces that fill the observer’s entire visual field (excluding, for example, any dark background). We satisfied this requirement by conducting a series of experiments (Li and Gilchrist 1999) in which the observer’s head is placed inside an opaque hemisphere, the diffusely illuminated interior of which is painted with two different shades of grey. Even if one assumes that the luminance difference (I refer to a difference on a log scale, which is equivalent to a ratio) is correctly encoded, a question arises as to what specific shades of grey should be assigned to the two regions by the visual system. If the luminance difference is less than the difference between black and white (that is, the log of 30:1), there is an infinite family of pairs of grey
shades that are consistent with that difference. I have called this problem the anchoring problem. Two anchoring rules can be found in the literature. Wallach (1948) and Land and McCann (1971) rely on a highest luminance rule. In this case, the lighter region appears white and the lightness of the darker region depends, in turn, on the difference between the two regions. Others, such as Helson (1964) and Buchsbaum (1980), have invoked an average luminance rule. According to this rule, an average of the two luminances defines middle grey, and the lightness of the two regions then depends on the luminance difference between each region and that average. Li and I (Li and Gilchrist 1999) tested these rules using a dome divided into two equal halves, painted black and middle grey, respectively. Our results clearly favour the highest luminance rule. The middle grey half appeared white and the black half, middle grey. According to the average luminance rule, no white or black should appear, and the perceived greys of the two halves should be equally distant from middle grey. This and other results (Bruno 1992; Brown 1994; McCann 1994; Schirillo and Shevell 1996) have supported the highest luminance rule. Two additional rules appear to exhaust the anchoring rules for such simple stimuli: an area rule and a scale normalization rule. Li and I have reported a strong effect of relative area on lightness. The rule is this: when the darker region is perceived to have a larger area than the lighter region, the lightness of the darker region varies directly with its area. As the darker region approaches 100% of the area of the dome, its lightness approaches white. When the area of the lighter region becomes very small, it comes to appear self-luminous. Scale normalization involves the luminance range within the image. When this range is less than the standard black–white range (30:1) some perceptual expansion of the range occurs. 
When the range is greater than the black–white range, some compression occurs. Scale normalization is similar to simultaneous anchoring on black and white, but the effect is never a complete effect and is typically rather weak (though proportional to the physical range). In generalizing this model to complex images, three assumptions are made. First, it is assumed that the image can be segmented into local frameworks or perceptual groups based on gestalt and gestalt-like grouping principles. The entire image is also treated as a framework: the global framework. Secondly, it is assumed that the rules of anchoring found in simple images (that is, domes) can be applied directly to each framework in a complex image. This implies that multiple lightness values are computed for any given target surface, one for each framework to which it belongs: the global framework and one or more local frameworks. Thirdly, it is assumed that perceived lightness is a weighted average of each of these values computed for a given target. Two key factors in this weighting are the area of each group and the number of elements it contains, historically called articulation (Katz 1935). Here, in a nutshell, is how the model accounts for both illumination-dependent and background-dependent errors. When different fields of illumination exist side by side in an image, veridicality would be achieved by anchoring each target exclusively within its own field of illumination (assuming a white surface exists within each field). But some global anchoring occurs and this causes the failures of constancy. Kardos (1934) proposed just
such an explanation of illumination-dependent errors. When targets stand on backgrounds of different luminance, such as in the simultaneous contrast display, veridicality would be achieved by anchoring each target globally (taking the whole display itself as the global framework). But some local anchoring occurs. This causes the grey on the black background to rise toward white (as the highest luminance within its local framework), producing the illusion. Application of the anchoring model to a whole array of empirical lightness errors can be found in Gilchrist et al. (1999). Here my goal is less to argue for this particular theory of lightness than to argue for a more systematic use of the errors constraint.
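A small numerical sketch may help fix the ideas in this section. It is not the published model: the reflectance values, the use of a geometric mean for the average luminance rule, the blending of local and global values in log units, and the weight of 0.3 are all assumptions made for illustration, and the function names are hypothetical. The sketch applies the two candidate anchoring rules to a dome painted black and middle grey, then applies a local/global compromise to the simultaneous contrast display.

```python
import math

# Assumed, illustrative reflectances spanning the 30:1 range.
WHITE, BLACK = 0.90, 0.03
MIDDLE_GREY = math.sqrt(WHITE * BLACK)   # midpoint on a log scale, about 0.16


def highest_luminance_rule(luminances):
    """The highest luminance is white; the rest scale by their ratio to it."""
    top = max(luminances)
    return [WHITE * lum / top for lum in luminances]


def average_luminance_rule(luminances):
    """The average luminance is middle grey. A geometric mean is assumed
    here because differences are taken on a log scale."""
    log_mean = sum(math.log(lum) for lum in luminances) / len(luminances)
    mean = math.exp(log_mean)
    return [MIDDLE_GREY * lum / mean for lum in luminances]


def blended_lightness(local_value, global_value, local_weight):
    """Weighted average, in log units, of local and global values.
    local_weight is a hypothetical free parameter, not a fitted value."""
    log_mix = (local_weight * math.log(local_value)
               + (1.0 - local_weight) * math.log(global_value))
    return math.exp(log_mix)


# Dome painted black and middle grey under uniform illumination.
dome = [BLACK, MIDDLE_GREY]
by_highest = highest_luminance_rule(dome)   # grey -> white, black -> middle grey
by_average = average_luminance_rule(dome)   # neither white nor black appears

# Simultaneous contrast: equal grey targets on black and white backgrounds.
target = MIDDLE_GREY
global_value = highest_luminance_rule([target, BLACK, WHITE])[0]
local_on_black = highest_luminance_rule([target, BLACK])[0]   # local maximum
local_on_white = highest_luminance_rule([target, WHITE])[0]
seen_on_black = blended_lightness(local_on_black, global_value, 0.3)
seen_on_white = blended_lightness(local_on_white, global_value, 0.3)
```

Under these assumptions the highest luminance rule reproduces the dome result reported above (the grey half is assigned white and the black half middle grey), while the average rule assigns two greys equally distant from middle grey. In the contrast display, global anchoring alone would give both targets the same value; any non-zero local weight lifts the target on the black background, where it is the local maximum, and leaves the other unchanged.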
Why errors?
Why are there errors in perceived lightness at all, after millions of years of evolution? Several possibilities have already been suggested. The errors may be the price paid by the visual system in a trade-off. Wallach (1948), for example, suggested that simultaneous lightness contrast reflects the tendency to base the computation of lightness simply on the local centre/surround luminance ratio, a tendency that leads to lightness constancy when targets stand in different levels of illumination. Another possibility is tied more closely with the process of evolution. It is argued that, in evolutionary terms, the purpose of vision is merely to promote the propagation of the species and this does not require a completely veridical representation of the physical world. Thus the lightness system has evolved merely to be as good as necessary to achieve this goal.
A complete model of veridicality is not possible
But in addition to these factors, there is a further constraining factor that should be noted. The option of complete veridicality is simply not available to the visual system, however well-evolved. Stimulus support for veridicality is frequently missing from the retinal image. Even if one accepts, as I do, Gibson’s (1966) recognition that earlier analyses of the proximal stimulus overstated the ambiguity of the stimulus by ignoring information contained in higher-order variables, and even if we agree that, as the scene becomes more complex, the ambiguity of the stimulus is, in general, reduced, there is a further problem apparently missed by Gibson. Complex scenes, regardless of their richness, contain pockets of ambiguity. Consider one such case involving an aperture.
The aperture problem in lightness
Imagine you are sitting in a richly articulated room. The far wall contains an aperture through which one may glimpse a small part of an adjacent room. Only two surfaces are visible through the aperture, one lighter and one darker. How should the visual system anchor these two surfaces? One possibility is to use whatever anchor is established in the near room. Thus, if the lighter of the two surfaces is five times darker than a white surface in the near room, it would be seen as middle grey. But this strategy implies that the illumination in the two rooms is equal, and that is not known. The alternative possibility is to ignore
the near room and treat the aperture as an independent framework. The two surfaces can be treated as if they were painted on to the inside of a dome that filled the observer’s visual field. The lighter surface would be seen as white and the darker surface seen relative to that. But this strategy implies that the lighter of the two surfaces is in fact a white surface, and this is not known. In this case there is no algorithm that can guarantee veridicality because there is simply not enough information within the aperture. Such pockets of ambiguity, which need not be apertures, are sprinkled throughout complex scenes. They occur whenever some planar region, such as the underside of a square archway, receives a unique level of illumination and contains only one or two reflectance values. In these cases there is no absolutely correct way to anchor lightness values. Both local and global anchoring carry separate risks. Thus, adopting a compromise between local and global anchoring makes good practical sense. But what about cases that contain enough information for veridicality, such as the staircase Gelb display? Why use a compromise between local and global anchoring in such cases? The answer may lie in overall economy. It is uneconomical to have two lightness systems, one for scenes with adequate stimulus information and one for scenes without.
The distal–proximal–percept triad
For many years following the discovery of the retinal image it was customary to focus on the relationship between the proximal stimulus and the percept. How could the richness of the percept be derived from the impoverished proximal stimulus? Gibson was influential in shifting the focus on to the relationship between the distal stimulus and the proximal stimulus. He noted that a closer look at the distal–proximal relationship shows that the proximal stimulus is not as impoverished as had been thought.
I am proposing a renewed emphasis on yet another pair within the triad: the relationship between the distal stimulus and the percept. I argue that distal–percept discrepancies provide the strongest constraints for theory. The fact that these lie in separate spheres is not a big problem. Ultimately any model of lightness must still account for the proximal–percept relationship; it must still explain how the proximal stimulus is transformed into the percept. But the most potent tool for revealing the proximal-to-percept software may just be contained in the distal–percept relationship.
References
Adelson, E. H. and Pentland, A. P. (1990). The perception of shading and reflectance. Vision and Modeling Technical Report 140. MIT Media Laboratory, Cambridge, MA. Arend, L. (1994). Surface colors, illumination, and surface geometry: Intrinsic-image models of human color perception. In Lightness, brightness, and transparency, (ed. A. Gilchrist), pp. 159–213. Erlbaum, Hillsdale. Arend, L. and Goldstein, R. (1987). Lightness models, gradient illusions, and curl. Perception and Psychophysics 42, 65–80. Barrow, H. G. and Tenenbaum, J. (1978). Recovering intrinsic scene characteristics from images. In Computer vision systems, (ed. A. R. Hanson and E. M. Riseman), pp. 3–26. Academic Press, Orlando.
Bergström, S. S. (1977). Common and relative components of reflected light as information about the illumination, colour, and three-dimensional form of objects. Scandinavian Journal of Psychology 18, 180–186. Brown, R. O. (1994). The world is not grey. Investigative Ophthalmology and Visual Science 35, 2165. Bruno, N. (1992). Lightness, equivalent backgrounds, and the spatial integration of luminance. Perception supplement 21, 80. Brunswik, E. (1929). Zur Entwicklung der Albedowahrnehmung [On the development of the perception of albedo]. Zeitschrift für Psychologie 109, 40–115. Brunswik, E. (1944). Distal focusing of perception: Size constancy in a representative sample of situations. Psychological Monographs 56, 1–49. Burzlaff, W. (1931). Methodologische Beiträge zum Problem der Farbenkonstanz [Methodological notes on the problem of color constancy]. Zeitschrift für Psychologie 119, 117–235. Cataliotti, J. and Gilchrist, A. L. (1995). Local and global processes in lightness perception. Perception and Psychophysics 57, 125–135. Cornsweet, T. N. (1970). Visual perception. Academic Press, New York. Gelb, A. (1929). Die ‘Farbenkonstanz’ der Sehdinge [Color constancy of visual objects]. In Handbuch der normalen und pathologischen Physiologie, (ed. W. A. von Bethe), (12th edn), pp. 594–678. Springer, Berlin. Gibson, J. J. (1966). The senses considered as perceptual systems. Houghton Mifflin, Boston. Gilchrist, A. (1979). The perception of surface blacks and whites. Scientific American 240, 112–123. Gilchrist, A. (1988). Lightness contrast and failures of constancy: a common explanation. Perception and Psychophysics 43, 415–424. Gilchrist, A. (1994). Absolute versus relative theories of lightness perception. In Lightness, brightness, and transparency, (ed. A. Gilchrist), pp. 1–33. Erlbaum, Hillsdale. Gilchrist, A., Delman, S., and Jacobsen, A. (1983). The classification and integration of edges as critical to the perception of reflectance and illumination. 
Perception and Psychophysics 33, 425–436. Gilchrist, A., Kossyfidis, C., Bonato, F., Agostini, T., Cataliotti, J., Li, X., et al. (1999). An anchoring theory of lightness perception. Psychological Review 106, 795–834. Helmholtz, H. von (1867). Handbuch der Physiologischen Optik. Voss, Hamburg. Helson, H. (1943). Some factors and implications of color constancy. Journal of the Optical Society of America 33, 555–567. Helson, H. (1964). Adaptation-level theory. Harper & Row, New York. Hering, E. (1874, 1964). Outlines of a theory of the light sense, (transl. L. M. Hurvich and D. Jameson). Harvard University Press, Cambridge, MA. Horn, B. K. P. (1974). Determining lightness from an image. Computer Graphics and Image Processing 3, 277–299. Jameson, D. and Hurvich, L. M. (1964). Theory of brightness and color contrast in human vision. Vision Research 4, 135–154. Kardos, L. (1934). Ding und Schatten [Thing and Shadow]. Zeitschrift für Psychologie 23. Katz, D. (1911). Die Erscheinungsweisen der Farben und ihre Beeinflussung durch die individuelle Erfahrung [The modes of appearance of colors and the influence of individual experience on them]. Zeitschrift für Psychologie 7, 1–425. Katz, D. (1935). The world of colour. Kegan Paul, Trench, Trubner & Co., London.
Kingdom, F. (1999). Old wine in new bottles? Some thoughts on Logvinenko’s ‘Lightness induction revisited’. Perception 28, 929–1054. Koffka, K. (1935). Principles of gestalt psychology. Harcourt, Brace, and World, New York. Köhler, W. (1971). On unnoticed sensations and errors of judgment. In The selected papers of Wolfgang Köhler, (ed. M. Henle). Liveright, New York. Land, E. H. and McCann, J. J. (1971). Lightness and retinex theory. Journal of the Optical Society of America 61, 1–11. Li, X. and Gilchrist, A. (1999). Relative area and relative luminance combine to anchor surface lightness values. Perception and Psychophysics 61, 771–785. Lindberg, D. (1976). Theories of vision: From Al-Kindi to Kepler. University of Chicago Press, Chicago. Logvinenko, A. (1999). Lightness induction revisited. Perception 28, 803–816. Mach, E. (1959). The analysis of sensations. [English Translation of Die Analyse der Empfindungen, 1922]. Dover, New York. Marr, D. (1982). Vision. Freeman, San Francisco. McCann, J. (1994). Psychophysical experiments in search of adaptation and the gray world. Paper presented at the IS&T Annual Meeting, Rochester, NY. Rock, I. (1983). The logic of perception. MIT Press, Cambridge, MA. Ross, W. and Pessoa, L. (2000). Lightness from contrast: Selective integration by a neural network model. Perception and Psychophysics 62, 1160–1181. Schirillo, J. and Shevell, S. (1996). Brightness contrast from inhomogeneous surrounds. Vision Research 36, 1783–1796. Thouless, R. H. (1931). Phenomenal regression to the ‘real’ object. British Journal of Psychology 22, 1–30. Wallach, H. (1948). Brightness constancy and the nature of achromatic colors. Journal of Experimental Psychology 38, 310–324.
Chapter 15
AVOIDING ERRORS ABOUT ERROR
Robert Schwartz
Preface
This study began in collaboration with Alan Gilchrist. Alan was working on a book on lightness perception. He was developing a new model, one based, in no small part, on a notion of ‘error’. Alan’s project, however, met resistance from various visual scientists in the ZiF group. A major reason was their unwillingness to countenance Alan’s appeal to ‘error’. Indeed, many maintained there could be no such thing as ‘error’, at least not when it came to perceiving colour. On the face of it, this criticism was puzzling. No one doubted, for example, that on occasion we ‘mistakenly’ put on socks that do not match. And everyone agreed they might refuse to pay a house painter who used light grey paint when the contract called for dark. Moreover, Paul Whittle noted that those who recoiled at the notion of ‘error’ were often content to talk about vision being ‘veridical’. In an effort to clarify issues, Alan and I decided to write a joint paper on ‘error’. We would attempt to spell out a sound psychophysical concept of ‘error’, untangle assorted confusions plaguing the group’s discussions, and possibly defuse some of the criticisms of Alan’s perceptual model. Now it seemed to one of the co-authors, me, that Alan’s own working definition of error had problems and needed further explication. Our collaboration began with my proposing alternative ways to specify a precise notion of ‘error’ and Alan challenging the suitability of my formulations. In the end, none of the options I offered met with Alan’s approval, and our joint enterprise was abandoned. I, then, pursued the topic on my own. My aim was neither to put forth nor defend any particular account of ‘error’. Instead, I wished to delineate the space of options available and characterize, in a very general way, the advantages and difficulties facing each approach.
I came to believe, in fact, that there was room in the study of both achromatic and chromatic colour for alternative accounts of ‘error’, each perhaps useful in different contexts and for different tasks. My explorations also suggested that a few of the conceptions elaborated could make room for, if not actually capture, the intuitions of proponents of the ‘no errors in psychophysics’ thesis. Ongoing discussion with members of the ZiF group indicated, however, that this attempt to reconcile opposing positions was not making much headway. Eventually, I became convinced that my proposed rapprochement was being thwarted by unexpressed metaphysical/ontological assumptions that both sides were bringing to the table. So what started for me as a technical problem in psychophysics (helping formulate an adequate definition of ‘error’ for a new model of achromatic perception) led back to longstanding controversies in philosophy about the ‘nature’ or ‘essence’ of colour. Perhaps this convergence was to have been expected. At the heart of many of these older philosophical debates, and most of the current ones (Byrne and Hilbert 1997), is the goal of finding out ‘What colors really are’. It is usually assumed that this has a single correct answer, which a ‘philosophical theory of colour’ should provide or incorporate. Settling this metaphysical/ontological issue is thought to have important implications. Without an idea of what colours really are, we do not know what it means for colour experiences and judgements to hook up to the world, to ‘correspond to reality’. Hence we cannot insightfully explain whether, or how, colour perception allows us to get things right about the physical environment. From this perspective, it is natural to suppose that a fully developed psychophysics must, in the end, deal with the same problem. In order to determine whether, how, and the extent to which perception is accurate, or veridical, it is necessary to have a proper
understanding of what colours really are, what ‘true’ colour perception is right about. At the same time, only with respect to a standard or norm of correctness does the idea of ‘error’ itself make clear sense. Unfortunately, conflicting convictions about the ‘essence’ of colour remained a serious impediment to reconciliation among the members of the ZiF group. Those who preferred subjectivist accounts of ‘what colours really are,’ could find no place for the kind of physicalistically anchored notion of ‘error’ Gilchrist was pushing. On the other hand, Alan’s own position made it difficult to accommodate the subjectivist sentiments his critics harboured. This failure of accord provided, and provides me, additional reason to urge distancing psychophysics from the traditional philosophical problematic. The dilemmas philosophical theories of colour are meant to solve have, I believe, less to do with colour and colour science than with particular commitments to questionable philosophical doctrines of physicalism, realism, objectivity, and mind. What’s more, my study of ‘error’ in achromatic colour perception casts doubt on the very idea of a unique essence for colour. If my analysis is on target, there are different ways to get things wrong, along with alternative conceptions of what it is to get things right. I also see no substantive grounds for assuming that any one, or only one, of these conceptions specifies what colours ‘really’ are. For there really are several empirically and theoretically satisfactory ways to conceive of lightness and colour.
R. Schwartz
Introduction That we make errors in perception seems all too obvious. Less obvious is that we are often mistaken about the nature of perceptual error. A major reason for this latter confusion is failure to pay proper attention to the fact that error is a relative matter—relative to an understanding or specification of what it is to get things right. Independent of a standard of correctness, claims of error are otiose. This chapter focuses on accounts of error in the perception of achromatic colours, that is, the perception of white, black, and the greys. These ‘colours’ are said to lack hue; they constitute what is known as the ‘grey-scale’.1 As investigation will show, the idea of perceptual ‘error’ is often understood in different and conflicting ways, and there is no reason to assume that one account is privileged. Moreover, there is reason to treat various purported cases of error not as error, but as discordances among competing ways of organizing and ordering our world. Until near the end of this chapter, such qualms will be kept in abeyance. If along the way use of the term ‘error’ jars intuitions, consider it a technical term of service in psychophysics. This may not be far from the position it is best to adopt, in any case.
Terminology
Not all light striking the surface of an object is reflected. Black surfaces reflect very little, white surfaces almost all, and grey ones, varying amounts in between. The ratio of reflected light to the incident light is called ‘reflectance’. ‘Lightness’ and ‘lightness perception’ are the terms used to talk about the experiential correlates of surface reflectance, our experience of the grey scale. Lightness constancy is the ability to perceive a surface as having the same lightness when viewed under different conditions. (For technical details, see Wyszecki and Stiles 1982, and the glossary of Gilchrist 1994.)
Anyone perusing an introductory psychology text will probably run into a demonstration of a popular illusion in achromatic colour perception. This ‘simultaneous contrast illusion’, as it is called, is easy to duplicate on one’s own. Take two small squares of paper of the exact same shade of grey, place one on a black background and the other on a white background. Under these conditions the squares do not look alike. The square on the black background appears lighter than the one on the white surround. Thus our perception of lightness is said to be in error. Lightness constancy fails. Two objects of physically identical material do not look the same; they do not match perceptually.
Matching tasks are the preferred method for studying errors in lightness constancy. A standard paradigm is to have a subject select or adjust the reflectance of a surface viewed in good light so as to match a given target surface. The target may be viewed in shadow, against a special background, or under some other condition of experimental interest. The subject’s matching judgements are then compared with the physical reflectance properties of the surfaces (for details and variations on the paradigm, see Wyszecki and Stiles 1982).
1 Although limited to the achromatic case, I believe the analysis has implications for the study of chromatic colours as well.
To simplify discussion of the logic of these studies and the ideas of ‘error’ employed, it will be helpful to introduce some notational abbreviations:
(1) x, y, z . . . : are surfaces having uniform, physically defined reflectance values, x, y, z . . . ;
(2) x = y: if and only if the reflectance values of the surfaces are the same;
(3) Ci . . . Cn: are viewing conditions (i.e. lighting, background, distance, and angle of regard);
(4) Ci = Cj: if, and only if, the viewing conditions are the same;
(5) Ci x: is the perceived lightness of a given surface of reflectance, x, under viewing condition, Ci;
(6) Ci x = Cj y: if, and only if, the subject judges them to be the same or to match perceptually.2
Reflectance errors
The most straightforward notion of ‘error’ found in lightness constancy studies is specified with respect to reflectance. For example, Gilchrist et al. (1995) offer this ‘precise definition of a lightness error: any difference between the physical reflectance of the target surface and the physical reflectance of the perceptually matching surface’. This sort of error will be called ‘R-error’. Thus, S makes an R-error, if x ≠ y and S judges Ci x = Cj y, or x = y and S judges Ci x ≠ Cj y.
Notice that this definition of R-error is completely general; there are no restrictions on the viewing conditions (see Gilchrist et al. 1999, p. 809).3 The conditions Ci and Cj may be the same or vastly different, and one or both may be conditions no one would think reasonable for evaluating or comparing lightness. They could be conditions in which lightness discrimination is essentially absent. Also, Ci may be daylight with the target on a neutral grey, while Cj is coloured light and the target resting on a glowing self-luminant surface. In all cases, whether the viewing conditions are ‘ideal’ or perverse, alike or very dissimilar, S is mistaken if S judges x and y to match when they differ in reflectance, or not to match when they are of the same reflectance.
It is possible to extend the notion of ‘reflectance error’ to include aspects of ordering. S could be asked to judge if x looks lighter than y. Thus, suppose x > y, and S judges they do not match. S has not made an ordinary R-error. If, however, S judges y is lighter than x, then S makes an ordering error with respect to reflectance. One could attempt to push issues further by placing richer demands on S’s evaluations. S might be asked to judge if x is twice the lightness of y, or if the difference between x and y is equal to that between y and z.
2 The symbol ‘=’ is used throughout not for numerical identity, but for sameness of stimuli, conditions, or experiences, as understood in studies of lightness perception.
3 In Gilchrist et al. (1999) the notion of ‘error’ is not general but is relative to Munsell viewing conditions. I discuss this matter below.
S might then be claimed to make errors if the judgements do not correspond to the simple ratios or differences of the physical reflectances. Of course, many will balk at considering such discrepancies perceptual error, since they are only to be expected. It is a general feature of sensory systems that as intensities of stimuli increase, differences in intensities are harder to discern. Achromatic colour perception is no exception. There is a compression of the scale of subjective lightness experience as the intensity of the reflected light increases.
Decisions about the treatment of discriminatory thresholds and scale compression, however, intrude at the very start, with the initial, austere notion of ‘R-error’. Discrimination of reflectance is not perfect. No instrument, let alone a human perceiver, can detect every physical difference in reflectance. Still, one could hold firm and maintain any failure to discriminate between two surfaces of different reflectance is an R-error. Another option is to define R-error in terms of a spread of reflectance values rather than a unique point. On this account, failure to perceive a difference between x and y is not an R-error, if the difference in reflectance is less than a specified threshold. Again, in spelling out criteria for error there is some leeway. It will simplify matters to assume for now that a satisfactory decision has been made.4
For our concerns, too, it will make things easier to limit consideration to judgements of matching and not to worry about perceptual errors involving judgements of order. The structure of matching-type errors has quite enough complexity. For example, where x = y ≠ z, and S judges Ci x ≠ Cj y, S makes an R-error. Nevertheless, if S judges Ci x ≠ Ck z and Cj y ≠ Ck z, S is free of R-error. That Ci and Cj lead to R-errors in some cases is perfectly compatible with these viewing conditions yielding accurate matching judgements in other comparison tasks involving x and y.
Or consider a set-up, Cl, involving coloured or ultraviolet light. Discrimination between various targets under Cl may be quite good; so there is no R-error. None the less, in this light the items may not look very much like they do in normal daylight against a neutral background.
Other results, perhaps more in conflict with ideas about the nature of lightness constancy, also follow from the definition of ‘R-error’. Suppose x and y differ in reflectance by a minuscule amount, well below any plausible discrimination threshold. Put x on a black background, y on a white one, and view them in daylight. We know, from contrast illusion studies, x will appear lighter than y. Therefore, they will be discriminated, and there is no R-error. S’s judgement is not only correct, it is more accurate than when the targets are both viewed against an ‘ideal’ neutral background. Finally, we have no hesitation claiming S makes an R-error, if x = y and S judges Ci x ≠ Cj y. Less appealing is the result S gets things right, makes no R-error, if S judges these same x and y match when the illumination is too poor to discriminate most differences in reflectance.
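The definition of ‘R-error’ used in this section can be restated as a short Python sketch. It is offered only as an illustration of the logic, not as a procedure taken from the studies cited. The names `Trial`, `is_r_error`, and `threshold` are assumptions introduced here, reflectance is represented as a number between 0 and 1, and the threshold variant follows the option on which only a failure to discriminate a sub-threshold difference is excused. Viewing conditions are deliberately absent from the function, since the definition places no restriction on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One matching judgement: target x under Ci, comparison y under Cj."""
    x: float             # physical reflectance of the target (assumed 0..1)
    y: float             # physical reflectance of the comparison surface
    judged_match: bool   # True if S reports that Ci x = Cj y


def is_r_error(trial: Trial, threshold: float = 0.0) -> bool:
    """Return True if the judgement disagrees with the physical comparison.

    threshold = 0.0 gives the austere, point-value definition.
    A positive threshold (an assumption of this sketch) excuses judged
    matches between surfaces whose reflectances differ by no more than it.
    """
    difference = abs(trial.x - trial.y)
    if trial.judged_match:
        # Judging a match is wrong only if the surfaces differ (beyond the threshold).
        return difference > threshold
    # Judging a mismatch is wrong only if the reflectances are the same.
    return difference == 0.0


# Contrast illusion: same grey on black and on white, judged different.
contrast = Trial(x=0.4, y=0.4, judged_match=False)
# Minuscule difference, discriminated thanks to the black and white backgrounds.
tiny_gap = Trial(x=0.400, y=0.401, judged_match=False)
# Same surfaces in very poor light, judged to match.
dim_light = Trial(x=0.4, y=0.4, judged_match=True)

for name, trial in [("contrast", contrast), ("tiny_gap", tiny_gap), ("dim_light", dim_light)]:
    print(name, is_r_error(trial))
```

On this sketch the contrast case counts as an R-error, while the tiny-gap and dim-light cases do not, which is exactly the pattern of results the text goes on to call unappealing.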
Errors of look
To claim that S perceives matters correctly, especially in the latter cases, will strike many as perverse. The fact S does not make an R-error in such circumstances seems to point to a flaw
4 x, y, z . . . will be understood to represent either point values or, where appropriate to the discussion, an agreed upon spread of reflectance values.
in this conception of perceptual error. Surely, S does not see things properly in the contrast illusion situation or when the viewing conditions are so deficient that almost everything appears to have the same lightness. Under illusion provoking or impoverished conditions, although certain matching judgements do jibe with the comparative reflectance values of the surfaces, the perceptual experiences are not right. The targets do not look the way they ‘really’ are. Such purported failures of perception will be called ‘look-errors,’ or ‘L-errors’ for short.
Intuitions related to L-error underlie various discussions of lightness perception. In particular, it is often thought important, and makes good sense, to determine which of two perceptual experiences is responsible for an R-error. In a variety of studies, subjects are shown a target, x, under an experimental condition of interest, Ci. They are then presented a chart of achromatic chips from a Munsell (1976) book of colours and asked to choose a chip that matches x.5 The Munsell chips are presented, not under the experimental condition but under a condition thought particularly conducive to lightness discrimination. This condition, call it CM, is spelled out precisely in the Munsell book. It includes a specific white illuminant, a specific medium grey background, etc.
If, in such a test situation, S chooses Munsell chip y, and x ≠ y, S makes an R-error. There is, however, the tendency to think that the source of the error can be pinned on the perception of x under Ci. Ci x, it is claimed, is not the right or correct look of x. Ci x is an instance of an L-error, and this L-error is used to explain the R-error. The faulty Ci x misleads S to choose a chip, from the Munsell chart, whose reflectance differs from x. A comparable distinction between L-error perceptions and those free of L-error shows up elsewhere in lightness constancy discussions.
It is commonplace to be told that certain viewing conditions prevent subjects from seeing things with their true colour. ‘Failures of lightness constancy that occur in the presence of different levels of illumination take a fundamental form. Surfaces in the brightly illuminated regions tend to appear lighter gray than they really are and surfaces in shadowed regions tend to appear darker gray than they really are’ (Gilchrist et al. 1995).6 True, if Ci and Cj are alike, except that the illumination in Ci is higher, Ci x will look lighter than Cj x. This, though, is a fact about comparative appearances and says nothing about the looks of surfaces being as they ‘really are’ (see Gilchrist et al. 1999, p. 811). Similarly, in everyday conversation it is assumed that things do not look the way they really are when the lighting is very dim. Although backed in this way by intuitions, the idea that achromatic colours sometimes appear right, and at other times wrong, needs careful explication. As with all notions of
5 The Munsell book, a widely used reference work, provides colour samples organized according to a well-specified system of colour ordering. (For a discussion of the Munsell system and others, see Wyszecki and Stiles 1982.)
6 This claim cannot be taken to mean targets in bright illumination match surfaces with higher reflectance than themselves. Sometimes they will; sometimes they will not. An x in bright illumination will match a y of lower reflectance, if y is in even brighter illumination or if y is displayed against an appreciably darker background.
error, to make sense of L-error we must specify an appropriate standard of correctness. With respect to what is an appearance to be judged incorrect? How are we to understand the claim that something does or does not appear with its appropriate lightness? What is it for an object to look to have its true value, to be perceived as it should be? Until these questions are answered, common intuitions about errors of look lack firm foundations.
One obvious way to settle such matters is to specify that the correct or ‘right’ look for a surface is the way it appears when viewed under some ‘ideal’ condition, CI. There is L-error, then, whenever a target surface looks different from how it does in this special set-up. Ci x looks right, if x = y and Ci x = CI y. Alternatively, Ci x is an L-error, if x = y and Ci x ≠ CI y.
This account of L-error can be used to support those intuitions and distinctions not handled within the conceptual confines of R-error. Suppose, for example, the assumed ideal viewing condition is the one specified in the Munsell book, that is, CM = CI. Perception of a surface under this condition defines its correct look.7 Previously, when x ≠ y and Ci x = CM y, there was no established basis for assigning blame for the R-error. Now, relative to the choice of CM as standard, there is a justification for pinning the mistake on one appearance rather than the other. Ci x is an L-error.
Choosing a standard also gives purchase on cases where neither of the samples is under the ideal condition. If x = y, Ci x ≠ Cj y and neither Ci nor Cj are CI, it still is possible to pin the error on one of the perceptions. L-error lies with the appearance that fails to match the perception of its target reflectance under CI. If both Ci x and Cj y fail to match the perception of the given reflectance value under CI, then there is an L-error in both, and the R-error is due to each.
Intuitions about the ‘true look’ of a particular target reflectance are given similar treatment. A surface in shadow does not appear as it should, since its appearance does not match the way it looks under CI . A target in very bright light appears lighter than it really is, since it appears lighter than it would in CI . Or what amounts to the same thing, it matches a target of higher reflectance viewed under CI . The ‘accidental’ success in reflectance judgements in illusory contrast conditions and in extremely poor illumination can also be explained. Although S’s matching judgements are not R-errors, the targets do not have the correct look that goes with their reflectance values. It is an accident that S makes no R-error, since the perceptions the matching judgements rely on are themselves L-errors.
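The account of L-error relative to an ideal condition CI, and its use in assigning blame for an R-error, can be sketched in Python as follows. Everything numerical here is an assumption made for illustration: a ‘look’ is represented as a number, a subject as a function from a condition and a reflectance to a look, and `toy_subject` shifts the look by a fixed amount per condition. None of this is a claim about how lightness perception actually works; the sketch only shows the structure of the definitions.

```python
from typing import Callable, List

# Hypothetical stand-ins: a look is a number, and a subject maps
# (viewing condition, reflectance) to the look that surface has for S.
Look = float
Subject = Callable[[str, float], Look]

IDEAL = "CI"  # label for the stipulated ideal viewing condition


def matches(a: Look, b: Look, jnd: float = 0.02) -> bool:
    """S judges two looks to match when they differ by no more than jnd (assumed)."""
    return abs(a - b) <= jnd


def is_l_error(subject: Subject, condition: str, x: float) -> bool:
    """Ci x is an L-error when it fails to match the look of x under CI."""
    return not matches(subject(condition, x), subject(IDEAL, x))


def blame(subject: Subject, ci: str, x: float, cj: str, y: float) -> List[str]:
    """List which of the two appearances, Ci x and Cj y, are L-errors."""
    appearances = (("Ci x", ci, x), ("Cj y", cj, y))
    return [name for name, cond, refl in appearances if is_l_error(subject, cond, refl)]


def toy_subject(condition: str, reflectance: float) -> Look:
    """Toy model: each condition adds a fixed shift to the ideal look."""
    shift = {"CI": 0.0, "black_bg": 0.10, "white_bg": -0.10, "shadow": -0.15}
    return reflectance + shift[condition]


# Contrast illusion with x = y: both appearances depart from the look under CI.
print(blame(toy_subject, "black_bg", 0.4, "white_bg", 0.4))
# Target in shadow compared with a chip under CI: only the shadowed look is at fault.
print(blame(toy_subject, "shadow", 0.4, "CI", 0.4))
```

Note that `is_l_error` consults only the single subject passed in, so the standard of correctness is relative to that subject's own looks under CI.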
Some complications
It is important to keep in mind in stipulating, say, CM as standard, that it is only the Munsell viewing condition that is being privileged. The definitions of ‘correct look’ and ‘L-error’ are in no way constrained by the selection of chips and their associated ordering in the Munsell book. Any target of any reflectance can be assigned its correct look relative to the chosen
7 The appropriateness of choosing the Munsell condition as standard will be discussed later.
CI. Also, the experimenter is given no more accurate information about how x looks to S when the match involves a Munsell chip under CM, than when the matching judgements S makes do not involve Munsell chips or conditions. Nor can it be assumed when S judges Cj x = CM y, S is assigning the particular reflectance value of the Munsell chip y to the target x, rather than assigning the value of x to the Munsell chip y. By themselves, the definitions do not sanction these additional claims.8
Nothing said so far challenges the idea that a standard viewing condition, such as CM, can be chosen, and the look targets have, under this condition, deemed the right one. Still, setting a standard of correctness in terms of a designated CI leaves important issues to be resolved. I begin with a problem that might require technical finessing, although it may not be central to an account of L-error. Suppose x = y = z, Ci x = CI z, Cj y = CI z, but Ci x ≠ Cj y. By definition both Ci x and Cj y look correct, there is no L-error in the way either appears. Yet they do not match each other, so there is R-error. The proposed link between L-error and R-error, therefore, breaks down. One solution is to alter the definition of L-error. Another is to assume such matching judgements will not occur frequently enough to bother with. For simplicity I make this assumption.
A related difficulty cannot be dismissed as readily. When x > y > z, and the differences straddle threshold borders, subjects will often report CI x = CI y, CI y = CI z, and CI x ≠ CI z. Since the targets are always under the ideal condition, they must always look correct. So, again, there is unexplained R-error. What’s more, the lack of transitivity of matching, even under CI, puts strain on the very idea a target has a singular, true look. An interesting, little explored, approach to these kinds of puzzles is to distinguish perceptual matching (our ‘=’) from perceptual identity.
Matching is non-transitive, while identity is transitive. For CI x to be phenomenally identical with CI y, it is not enough they match each other. They must each match everything the other does (Goodman 1951; Clark 1993). Adopting this analysis of look identity has some nice advantages. It enables construction of an ordering of perceptual lightness based solely on judgements of matching. Subjects are not required to provide explicit ordering judgements. It would, however, complicate analysis of L-error to trace out the implications of employing this account of appearance identity, and I will not pursue the issue here (see Schwartz 1996). More pressing problems lie ahead.
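The distinction between matching and identity described above can be illustrated with a small Python sketch. The four targets and the table of judged matches are invented for the example, and the function names are assumptions of the sketch. Two looks count as identical only if they match exactly the same things, which makes identity transitive even though matching is not.

```python
# Hypothetical matching judgements, all made under CI.
# x matches y and y matches z, but x does not match z.
JUDGED_MATCHES = {
    frozenset(pair)
    for pair in [("x", "y"), ("y", "z"), ("v", "x"), ("v", "y")]
}
TARGETS = ["v", "x", "y", "z"]


def matches(a: str, b: str) -> bool:
    """Perceptual matching: reflexive and symmetric, but not transitive."""
    return a == b or frozenset((a, b)) in JUDGED_MATCHES


def match_set(a: str) -> frozenset:
    """Everything in TARGETS that a is judged to match."""
    return frozenset(t for t in TARGETS if matches(a, t))


def identical(a: str, b: str) -> bool:
    """Look identity: a and b match exactly the same targets."""
    return match_set(a) == match_set(b)


print(matches("x", "y"), matches("y", "z"), matches("x", "z"))
print(identical("x", "y"))  # y also matches z, which x does not
print(identical("x", "v"))  # x and v match the same targets
```

Because `identical` compares sets for equality, it is an equivalence relation, so it can serve as a basis for ordering looks using matching judgements alone.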
Solipsism
Suppose x ≠ y, the difference is quite small, and S judges CI x = CI y. Once more there is R-error with no L-error. Altering the definitions of ‘look identity’ and ‘correct look,’ though, does not seem the only or easiest way to avert this anomaly. Weakening the demand giving rise to R-error would seem a simpler solution. Stipulate that R-error occurs only when the reflectance difference exceeds a specified range. If the difference between the targets is less
8 It is, at times, assumed that the chart of Munsell chips serves as a measuring device, on analogy with the use of the standard meter stick to measure length. Exploring the pros and cons of this analogy requires more attention than the matter can be given here.
than the threshold, there is no R-error, and hence no need to appeal to L-error to explain the mistake. This sort of response can only be taken so far. The problem is the current definition of L-error is ‘solipsistic’. The notion of ‘correct look’ is specified solely with respect to judgements of how things look to an individual subject under CI . And this individualistic conception of ‘looking right’ leads to trouble. For suppose two surfaces differ enough in reflectance so that under ideal conditions they are easily discriminated by the average perceiver. If S cannot tell such targets apart under CI , it seems clear S makes an error. But an error of what kind? There is no problem attributing R-error; S fails to discriminate between reflectance differences beyond the allowable range. S’s R-error, nevertheless, cannot be attributed to L-error, since it occurs under CI . Were the deficiencies with S’s judgements confined to small threshold-type cases, the failure of L-error to underpin R-error might not be very bothersome. Unfortunately, the issue runs deeper. For all intents and purposes, S could be ‘lightness blind’. Under ideal conditions, S might perceive most achromatic reflectances as the same medium grey, or perceive them as a single dark grey up to some reflectance value and a single white for higher values. And if such radical lightness blindness is too farfetched to consider, the basic solipsistic point can be made assuming only that some people are significantly deficient in lightness discrimination. The comparable case of colour blindness is well known. While technical repairs might be sufficient to patch up earlier difficulties, the present problem is one of principle, requiring a major shift in perspective. As things stand, a subject’s lightness perception can be vastly deficient, but things will still be said to ‘look correct’. Accordingly, a subject may lack constancy on a grand scale, yet remain L-error free. 
It should be noted that these solipsistic difficulties are not due to general sceptical or philosophical worries about the contents of other minds—worries over whether we can ever know how things ‘really’ look subjectively to someone else. The failure of the lightness deficient to discriminate, where the rest of us do, is enough to show something amiss in how things look to them. The situation is not at all like the paradoxical case of spectrum inversion. With spectrum inversion, subjects make all the discriminations the rest of us do, but the supposition is that things look differently to them. Lightness deficiency raises no like issue of an, in principle, impossibility of testing.
Abandoning looks?
Does the case of lightness deficiency mean that the notion of a ‘correct look’ should be abandoned, and with it the idea of L-type error? Right off, that would seem an overly hasty conclusion. If claims are limited to normal perceivers, it might still be possible to say something useful about errors of appearance. The definition of ‘L-error’ need not be changed; only its application is restricted to persons with non-defective vision. The correct look of a surface for a normal subject, S, is the look it gives S under CI. ‘Correct look’ and ‘L-error’ remain individualistic notions, that is, specified relative to a given perceiver. Although, again, there is no need to assume it is possible to determine whether the subjective experiences of different people are subjectively identical. The restriction to
normal perceivers merely serves to avoid the difficulties posed by the lightness deficient. It is not meant to resolve, or to depend on, resolution of inverted spectrum type quandaries. The initial limitation to the normal sighted does not preclude attributing some errors of appearance to those with defective vision. Many of the judgements of a lightness deficient perceiver, S, will be R-errors with respect to the standards set for normal persons. S has matching perceptions where normal perceivers experience the targets as non-matching. In these cases, it may seem reasonable to make the minimal claim that S’s appearances can not both be right. Then again, it is not clear what is gained by extending the notion of ‘L-error’ to the lightness deficient. It is, after all, the pattern of R-errors that is relied on to determine if a subject’s lightness perception is defective. And since the notion of ‘R-error’ is thoroughly general, it can be used to explore S’s achromatic colour constancy for any pair of reflectances, under any set of conditions. It might seem possible, then, to say most everything worth saying of the deficient perceiver’s visual competence without appeal to the more troublesome idea of an ‘L-error’. Such considerations, in fact, raise questions about the importance of having a notion like ‘L-error’ on hand. For what was just said about the lightness deficient holds largely for normal perceivers. Before S can be certified to be a normal perceiver, S’s R-errors must be examined. But once we have mapped out S’s successes and failures in matching reflectances, is there really a need for the concept of ‘L-error’ in the study of perceptual constancy? The prospect of not having to deal with L-error and the question, ‘How do things look to subjects?’, will strike many as a welcome relief. By so doing, psychophysics is nicely externalized, if not behaviouralized. On one side, there is lightness difference defined solely in terms of physical reflectance. 
On the other side, there are people’s overt judgements of matching. Nowhere does concern about the qualitative aspects of subjective experience obtrude. The problem with abandoning ‘looks talk’, however, is that along with gains in simplicity and methodological purity there are seeming losses. Recall the felt need to say something richer about S’s perceptual experience in order to pin down the source of R-errors, or to determine whether the target appears as it really is, or to indicate when S’s matching judgements were right by accident. Setting standards for both CI and normal vision appeared to provide the wherewithal to account for many of these aspects of achromatic colour constancy. Nevertheless, talk of ‘how things look’ to individual perceivers seems to introduce an additional subjective element into the study of lightness. And the need to relativize the specification of the ‘correct’ look, to normal perceivers and ideal conditions, may strike many as too high a price to pay in order to make invidious distinctions among perceptual appearances.
Reliable methods
Given these worries, an alternative approach may be worth exploring. Much of the explanatory mileage achieved from the ‘correct look’ concept can be obtained by other means, by appealing to a notion of ‘reliability’. Consider the cases of ‘accidentally’ correct matching
judgements. Although S’s judgements are sometimes accurate in contrast illusion conditions and in poor illumination, these comparison conditions are generally not good ones for lightness evaluation. In both examples, S gets things right using an unreliable comparison procedure. And therein may lie good reason for calling these judgements ‘accidents’. At the same time, the correct judgements a perceiver makes when both targets are under CI are not accidental, since this set-up is, by and large, reliable. A similar approach may be taken to the task of pinning down blame for R-errors. If x = y and Ci x ≠ CI y, fault can reasonably be attributed to Ci x, as long as comparing targets under CI is a reliable method for making lightness discriminations, and Ci is not.
The need to appeal to the notion of a ‘reliable method,’ nevertheless, does raise serious doubts about the whole idea of a ‘correct look’. For suppose lightness discrimination were at a maximum under two different sets of conditions, CI and CI∗. Both methods would be reliable, yet targets of the same reflectance might not match under these conditions (i.e. if x = y, CI x ≠ CI∗ y). In these circumstances, there would be no basis for claiming CI x versus CI∗ y is the correct look, and no basis for assigning the R-error in their failure to match.9
Similar considerations serve to loosen intuitions about the connection between accidental successes and the notion of the ‘correct look’. For suppose x = y and Ci x = CI y, but the reason they match is an ‘accident’. Ci involves two non-optimal conditions that, in this case, happen to cancel each other out; for example, an unnaturally intense illumination and a background reflectance much greater than x. Since Ci x = CI y, Ci x has the correct look. If, though, matching judgements in general are not accurate when targets are under Ci, the method is not reliable. Success under Ci is an accident, albeit, everything may look ‘as it should’.
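The notion of a reliable method, and of an ‘accidental’ success, can be given a rough operational reading in Python. This is a sketch under stated assumptions: a set-up is summarized by the matching trials run under it, reliability is read as an R-error rate at or below a cut-off (`max_error_rate`, a hypothetical parameter, since the text says only that a set-up is reliable ‘by and large’), and the trial data are invented.

```python
from typing import List, Tuple

# (reflectance x, reflectance y, did S judge them to match?)
Trial = Tuple[float, float, bool]


def is_correct(trial: Trial) -> bool:
    """A judgement is free of R-error when it agrees with the reflectances."""
    x, y, judged_match = trial
    return judged_match == (x == y)


def r_error_rate(trials: List[Trial]) -> float:
    """Fraction of the trials under one set-up that are R-errors."""
    return sum(not is_correct(t) for t in trials) / len(trials)


def is_reliable(trials: List[Trial], max_error_rate: float = 0.1) -> bool:
    """Assumed reading of 'reliable': the error rate stays under a chosen cut-off."""
    return r_error_rate(trials) <= max_error_rate


def is_accidental_success(trial: Trial, same_setup: List[Trial],
                          max_error_rate: float = 0.1) -> bool:
    """A correct judgement reached by a set-up that is not reliable."""
    return is_correct(trial) and not is_reliable(same_setup, max_error_rate)


# Invented data: in very dim light S reports a match every time.
DIM = [(0.2, 0.2, True), (0.2, 0.6, True), (0.3, 0.8, True), (0.5, 0.5, True)]
# Invented data: under CI the same pairs are judged accurately.
IDEAL = [(0.2, 0.2, True), (0.2, 0.6, False), (0.3, 0.8, False), (0.5, 0.5, True)]

print(r_error_rate(DIM), r_error_rate(IDEAL))
print(is_accidental_success(DIM[0], DIM))      # correct, but from an unreliable set-up
print(is_accidental_success(IDEAL[0], IDEAL))  # correct, and from a reliable set-up
```

Nothing in these functions mentions how the targets look, which is the sense in which reliability can do some of the work previously assigned to the ‘correct look’.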
Ideal conditions
Until now the assumption that the Munsell condition, CM, may be an ideal condition for perception has gone unexamined. Justification for this claim needs further examination, for the notion of an ‘ideal’ viewing condition is not all that clear. The simplest explication might seem to be in terms of reliable methods and R-error. A condition is ideal if it is optimally reliable for lightness discrimination. There is no other condition under which normal perceivers make fewer R-errors.
So understood, optimal reliability depends on the chosen allowable threshold for R-error. For example, two different conditions may both satisfy the criterion when the range for error is x ± n, but when the range is narrowed to x, only one may meet the specification. To deal with this possibility, it might be preferable to define ‘optimal’ in terms of yielding the fewest R-errors within the narrowest appropriate reflectance range. The situation, though, could be more complicated. One condition may lead to fewer errors when x ± n is the allowed range, while resulting in more error when the range is narrowed to x. At the same time, the error rate for both methods could be considerably
9 Consideration of phenomena like the ‘crispening effect’ (Whittle 1992) in enhancing discrimination, although important, would further complicate issues and cannot be explored here.
higher than it is with the wider range x ±n. There are trade-offs between error reduction and precision. Thus there may not be a unique characterization of optimality, and there may be more than a single condition meeting any optimality standard adopted (see Helson 1943). Justification of a particular viewing condition as ideal, depends, therefore, both on the criterion of optimality selected and on empirical findings about how well the condition fares in competition with other viewing conditions. And no condition may be unique in meeting these demands. Leaving final resolution of these matters aside, is it reasonable to assume the Munsell condition will qualify? One problem with this assumption is that lightness discrimination is thought to be somewhat better when the illumination is higher than it is under the Munsell condition. And this possible ‘flaw’ with CM raises an interesting question about the policy of identifying ideal conditions with those optimal for lightness discrimination. Discrimination might turn out to be best when the level of illumination is well beyond that ordinarily encountered in daylight or in typical artificial light. Or R-error could be least when targets are viewed in some specially prepared non-white light or against a specially prepared background. Were this the case, the optimal and hence ideal condition would be a condition seldom, if ever, found in everyday perceptual tasks. The Munsell condition might still emerge as ideal if ‘typicality’ considerations are taken into account. Those viewing conditions securing better discrimination than CM may be unusual enough to be eliminated from consideration. For practical everyday use, there may be no point in specifying as ideal a condition hardly ever encountered in everyday lightness judgement tasks. Justifying CM as ideal would, nevertheless, remain problematic, but now for a different reason. It is hard to claim that the Munsell condition is itself very typical. 
The precise lighting, background, viewing distance, and viewing angle specified are not those in which we usually find ourselves. Perhaps it could be argued that CM , although not ecologically prevalent, is a good representative of more ordinary conditions. Incorporating this idea would, of course, considerably complicate the analysis of ideal conditions and further relativize an account of error.
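The earlier observation in this section, that which condition counts as optimal can change with the allowed range for R-error, can be made concrete with a small Python sketch. The two conditions `A` and `B`, their trial data, and the function names are all invented for illustration; the only point is that the ranking by error count can flip when the range x ± n is narrowed to x.

```python
from typing import Dict, List, Tuple

# (reflectance x, reflectance y, did S judge them to match?)
Trial = Tuple[float, float, bool]


def count_r_errors(trials: List[Trial], allowed_range: float) -> int:
    """Count R-errors, excusing judged matches within the allowed range."""
    errors = 0
    for x, y, judged_match in trials:
        difference = abs(x - y)
        if judged_match and difference > allowed_range:
            errors += 1   # matched surfaces that differ too much
        elif not judged_match and difference == 0.0:
            errors += 1   # distinguished surfaces of the same reflectance
    return errors


def best_conditions(results: Dict[str, List[Trial]], allowed_range: float) -> List[str]:
    """Names of the conditions with the fewest R-errors at this range."""
    counts = {name: count_r_errors(trials, allowed_range)
              for name, trials in results.items()}
    fewest = min(counts.values())
    return sorted(name for name, count in counts.items() if count == fewest)


# Invented data. Under A, S blurs small differences but never large ones.
# Under B, S catches small differences but once confuses a larger one.
RESULTS = {
    "A": [(0.40, 0.41, True), (0.50, 0.51, True), (0.30, 0.60, False)],
    "B": [(0.40, 0.41, False), (0.50, 0.51, False), (0.30, 0.35, True)],
}

print(best_conditions(RESULTS, allowed_range=0.02))  # generous range
print(best_conditions(RESULTS, allowed_range=0.0))   # range narrowed to a point
```

With the generous range only A is optimal; with the point range only B is, so neither condition can be called ideal without first fixing the criterion.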
Standards
The distinction between reliable and unreliable methods was introduced to handle intuitions about lightness perception while avoiding various difficulties with the notion ‘correct look’. Adopting this approach, however, does not eliminate the need to appeal to perceiver-relative standards. Criteria for an ideal or reliable condition make reference to normal perceivers and optimal set-ups (which may depend on notions of ‘typicality’ or ‘representativeness’). Would not perceptual theory be better off if even these vestiges of relativity or ‘non-objectivity’ were expunged from psychophysics?
Although this goal of purifying the study of lightness of any appeal to standards or norms of perception may sound attractive, it is misguided. The underlying rationale for the perceptual study of lightness relies on such considerations. The physical property of reflectance is a concern of psychological investigation, because differences in reflectance normally result in different lightness experiences for normal subjects. And useful talk of
perceptual error presupposes standards of correctness, standards that take account of these norms of subjectivity. Anchoring evaluation of grey-scale error to reflectance is an empirically constrained choice, depending on both the nature of human visual capacities and the interests we have in describing them. It is a reasonable practice, because there is a fairly robust correlation between levels of reflectance and the achromatic colour experiences of normal perceivers. Study of grey-scale perception, however, does not require nor presuppose commitment to the idea that achromatic colour perception is a function of any single dimension of surfaces. Experience of white, black, and grey, like the experience of chromatic colour, might have evolved so as to depend on more complex or gerrymandered sets of physical properties. Or the normal visual system could have been such that it simply split the physical reflectance scale in two. Reflectances above a certain level are experienced as white and below that level as black. Or the experienced order could have been circular, with reflectances at the high and low ends matching one another. If normal perceivers responded to reflectance in these ways, it would be pointless to define error in achromatic constancy in terms of deviations from simple reflectance values. The centrality of reflectance in evaluations of lightness perception is only a fact in retrospect. It emerges from considerations about the standard ways normal visual systems experience differences in reflectance under assorted viewing conditions.10
10 As it is, the simple one-dimensional account of grey-scale experience is the result of a certain amount of abstraction. If grey-scale phenomena are treated more like other colours, and in matching tests chromatic near-grey surfaces or coloured lights are used, the picture of what is involved in achromatic judgement and error might be quite different.
11 The positions and arguments merely sketched in this section are developed more fully in Schwartz (1996).
Is R-error error?11 Throughout our discussion, R-error has been treated as a comparatively straightforward case of error, although warnings were issued about this assumption at the start. The just-concluded section should serve to remind us that perceptually based standards of correctness obtrude even here. What’s more, R-error is different from many ordinary cases of error. Subjects in matching experiments are not in any obvious sense trying to measure or compare physical reflectance values per se. They may have no idea what the term ‘reflectance’ means. Usually subjects are asked only if the targets match or match in colour, or if the targets are both made of the same material. R-errors, therefore, need not be errors in terms of the subjects’ own avowed aims. If failure of lightness judgements to accord with reflectance is to be taken as R-error, it must be with regard to considerations the experimenter brings to the task, not ones subjects may be likely to articulate. The firm conviction that R-error is unconditionally and indisputably error has its root, I believe, in the widespread acceptance of what has come to be called the ‘measuring device metaphor’. According to this metaphor, the visual system is a device for measuring physical properties of the environment. More specifically, the function of lightness perception is to determine or measure reflectance. Subjective grey-scale experience is the imperfect device evolution has given us to measure this physical property. Matching judgements that do not
correspond to sameness or difference in reflectance are failures to meet the goal or function of lightness perception. It is with respect to this evolutionarily established standard that subjects make R-errors. Now I find it difficult to make good sense of claims about the purpose Nature has written into our experiences of the grey-scale, especially when this supposed aim is assigned normative status. But even if a case could be made for claims about the ‘real’ goal or function of lightness perception, nothing precludes taking alternative stances to error evaluation as well. For other purposes and projects it may be useful to evaluate performance with respect to a different standard of correctness than reflectance. In fact, there may be no pressing reason to think of the perceptual experiences involved in R-error as being faulty or erroneous. There is another option, and it is one I find appealing. Discrepancies between matching judgements and reflectance values of surfaces may be better understood as discordances between different ways of organizing our world, in particular, discordances between phenomenal and physical orderings. Neither way must be conceived as providing the complete and uniquely true story. R-error might then be understood to result from discrepancies between two acceptable versions of our world, one in terms of perceptually based categories and the other in terms of concepts like reflectance, fashioned primarily for physical theories of the environment. Hesitance to adopt such a pluralistic attitude may be traced, I believe, to residual essentialist metaphysical commitments. Ontologically speaking, it is presumed, achromatic colour is, and has to be, some physical property, like reflectance. Reflectance is an ‘objective’ feature of nature, and grouping surfaces according to reflectance serves to carve the world at its ‘natural joints’. More phenomenally based ideas of achromatic colour are shams or metaphysically second-rate. 
They do not tell us what achromatic colour ‘really’ is, and, from a scientific standpoint, they should, in principle, be eliminable. As the measuring device model maintains, grey-scale experience is merely a fallible ‘subjective’ means for finding out about ‘how things really are’. Therefore, any discrepancy between matching judgements and reflectance values is a mistake, since reflectance is the correct or true way to categorize surfaces. Although such metaphysical intuitions are pervasive, I do not think they should bother the psychophysicist, or, for that matter, anyone else. For there is no reason to assume that there can be but one ultimately correct organization of the world, or that the physicist’s analysis of achromatic colour is ontologically privileged. The notion or notions of ‘achromatic colour’ needed for physics may differ from those that best serve the needs of psychophysics or optometry. These, in turn, may be different from those most suited to meet the requirements of a carpet manufacturer, a lighting expert, or a museum restorer. Such concepts will flourish or fade on the basis of the work they do in the areas they were designed to serve. The most the physicist, engineer, or design specialist can do is develop useful ways for categorizing the varied phenomena of achromatic colour that prove to be of intellectual or practical interest. What else could or should be expected? Claims that only one account of achromatic colour can capture its essential nature and specify what black, white, and grey ‘really’ are, hinge largely on preferred philosophical doctrines of ‘essences and reality’, rather than on substantive empirical considerations concerning perception. However, these doctrines have no priority or pride of place in telling us
what Is or is not Real. Nor do they provide a higher or superior vantage point to rule on the number or adequacy of alternative conceptions of our world. Indeed, if such philosophical theories occupy any place, it will only be that of another kind of enquiry, epistemological or metaphysical, with its own constraints, interests, and focus (see Schwartz 2000).
Conclusion In this chapter I have attempted to explore the structure and complexity of claims about perceptual error in a limited domain. I have zeroed in on a few notions of ‘error’ that seem to play a role in studies of achromatic colour constancy. I have further limited the analysis to matching tasks that do not explicitly raise issues of ordering. Even so, it seemed possible to talk about error in quite different ways (e.g. R-error and L-error). And within each of these types there were competing definitions, yielding conflicting decisions as to whether a matching judgement is or is not an error. Although I have explored some of the strengths and weaknesses of various accounts of error, I have made no attempt to come down in favour of one, or to dismiss any of the others. There are several reasons for my reluctance to do so. First, at several places in the analysis there were choice points. For example, it was left an open question how best to handle appearance identity when faced with the non-transitivity of matching judgements, or how best to conceive of reliable methods. Resolution of such issues will have an effect on any precise specification of error. Secondly, all the notions of ‘error’ examined have their difficulties. Each is at odds with some of our convictions, and no conception is likely to capture all of our intuitions. Thirdly, I see no reason to assume there is, or should be, either a single kind of error or a unique characterization of error within a single kind. Any notion of error must earn its keep by the service it performs in helping describe, systematize, and explain the facts of interest in grey-scale perception. This will depend importantly on the task at hand. Such a proliferation of ‘error’ concepts will strike many as unsatisfactory. It might seem bad enough to have to deal with errors of look in addition to errors of reflectance. 
It would seem all the more untenable if the very same judgement is classified an error on one account and correct on another. To alleviate some of these qualms I have proposed, but not developed, the idea that phenomenal and physical accounts of achromatic colour may both have a role to play in enquiry. In turn, discrepancies between these versions need not always be thought of as errors. Adopting this more pluralistic approach, I believe, can help deflate or avoid needless controversy and debate. Perhaps, though, the most important point to emerge from our present study is that when it comes to questions of perceptual error, things are not black and white.
Acknowledgements In addition to my discussions with Alan Gilchrist, I have benefited from the comments of Larry Arend, Margaret Atherton, Dieter Heyer, Dejan Todorovic, and Paul Whittle.
References
Byrne, A. and Hilbert, D. R. (eds.) (1997). Readings on color. Vol. 1. The philosophy of color. MIT Press, Cambridge, MA.
Clark, A. (1993). Sensory qualities. Oxford University Press, Oxford.
Gilchrist, A. (ed.) (1994). Lightness, brightness, and transparency. Erlbaum, Hillsdale.
Gilchrist, A., Kossyfidis, C., Bonato, F., Agostini, T., Cataliotti, J., Li, X., et al. (1995). A new theory of lightness perception (draft).
Gilchrist, A., Kossyfidis, C., Bonato, F., Agostini, T., Cataliotti, J., Li, X., et al. (1999). An anchoring theory of lightness perception. Psychological Review 106, 795–834.
Goodman, N. (1951). The structure of appearance. Harvard University Press, Cambridge.
Helson, H. (1943). Some factors and implications of color constancy. Journal of the Optical Society of America 33, 555–567.
Munsell Color Company (1976). Munsell book of color. Munsell Color, Baltimore.
Schwartz, R. (1996). Pluralist perspectives on perceptual error. In Pluralism: theory of knowledge, ethics, and politics, (ed. G. Abel and H. J. Sandkühler). Meiner Publisher, Hamburg.
Schwartz, R. (2000). Starting from scratch: Making worlds. Erkenntnis 52, 151–159.
Whittle, P. (1992). Brightness, discriminability, and the ‘Crispening Effect’. Vision Research 32, 1493–1507.
Wyszecki, G. and Stiles, W. S. (1982). Color science: Concepts and methods, quantitative data and formulae, 2nd edn. John Wiley & Sons, New York.
commentary: avoiding errors about error
Commentaries on Schwartz
Deconstructing the concept of error?
Alan Gilchrist
Robert Schwartz has written a very thoughtful piece about the logical pitfalls of talking about errors in the perception of surface lightness. In his article, written very much in response to my own claims about errors, Schwartz has taken a constructive and largely pragmatic approach. He begins by wrestling with the status of the following troublesome examples of lightness ‘errors’:
1. Observers fail to discriminate very small reflectance differences.
2. Under impoverished viewing conditions, otherwise detectable differences in reflectance will not be seen.
3. An individual observer may be deficient (lightness-blind in the extreme case) in the discrimination of reflectance.
To our relief, Schwartz goes on to describe practical steps that keep these difficulties in check, including:
1. Limiting the testing to normal perceivers.
2. Allowing a region of uncertainty around threshold.
3. Invoking the notion of ‘reliable methods’.
These steps make sense practically. Yet at a theoretical level, these limitations can still be considered errors, and these errors, consistent with my basic thesis, provide insight into visual processing. The failure of observers to discriminate very small reflectance differences tells us about the resolving power of the human visual system. In a similar fashion, our inability to make discriminations under low lighting conditions provides important clues as to how reflectance levels are computed. These errors remind us that the visual system does not have ‘reflectance detectors’, and thus must rely on some kind of transformation of the pattern of light entering the eye. Schwartz raises the finding that observers, when asked to equate luminance differences, actually equate luminance ratios. Here we had the wrong expectation: that observers would match differences rather than ratios. But again, there is no problem in defining this as a perceptual error. The fact is that humans are unable to match luminance differences. Nevertheless, this fact reveals something important about visual functioning. Biological sensory systems are set up to respond to proportions, not differences. Looking back, this makes good sense. When the illumination level changes, it is the luminance ratio between adjacent regions that remains constant, not the luminance difference. Historically, we arrived at this insight by first measuring an ‘error’ in the matching task. The fact that this error was later shifted from the visual system to our theoretical expectations does not seem deeply troubling. A further problem raised by Schwartz involves those cases in which two errors of opposite direction cancel out one another. This is indeed a situation of which one must be aware. The experimenter might easily conclude that no errors exist when in fact two errors exist. I believe that the lesson here is that no method for measuring errors can be followed blindly. 
The results must always be interpreted in the context of what else is known about error-producing conditions. In the final analysis, of course, we are looking, not merely for a laundry list of errors, but for a coherent pattern. Schwartz introduces a distinction between what he calls R-errors (reflectance errors) and L-errors (errors of look). These seem to correspond roughly to relative and absolute errors. He is more ready
to accept the concept of R-errors. In the conventional simultaneous lightness contrast display, for example, two squares of equal reflectance are perceived as unequal. He acknowledges that it is difficult to avoid the conclusion that a perceptual error occurs in this case. Yet he seems not to be convinced, as he writes: ‘If failure of lightness judgements to accord with reflectance is to be taken as an R-error, it must be with regard to considerations the experimenter brings to the task, not ones subjects are likely to articulate.’ If I understand what Schwartz is saying, it seems wrong. When the white and black backgrounds in simultaneous contrast are concealed in order to reveal that the two grey squares are identical, observers readily acknowledge that their initial perception of difference was in error. More bothersome to Schwartz is my contention that we can go farther and raise the question of which of the two squares is seen more in error. This additional step, involving what Schwartz calls an L-error, does entail further assumptions and dangers. Most importantly, it seems to require a standard of correct perception. I agree. But I am not convinced that the physical reflectance of a surface cannot serve as this standard. As Schwartz acknowledges, this need not presuppose knowledge of an underlying physical property. Every surface reflects more or less of the light illuminating it, and every surface is perceived to reflect more or less of that light. Or to use even more intuitive terms, every surface is darker than a white surface in the same illumination by some degree and every surface is perceived to be darker than white by some degree. Take perceived size. Can the physical height of an object not be the standard by which its perceived height is evaluated? At least in cases such as perceived size and perceived lightness, it seems to me that the perceptual dimension has a clear referent in the physical world. 
When, as in the Gelb effect, a black paper is suspended in midair and illuminated by a spotlight that illuminates only the black paper, it appears white. When the illusion is revealed to a subject, either by adding a white background within the spotlight, or by turning off the spotlight, every subject expresses surprise, and readily acknowledges that their initial perception was in error. I do worry that resistance to the concept of perceptual errors sometimes implies an underlying resistance to the notion of an independent physical reality. This, after all, is the danger in the current deconstructionist trend. What does Schwartz mean when he suggests that we ‘treat various purported cases of error not as error, but as discordances among competing ways of organizing and ordering our world’? Does he mean that every percept is as valid as every other percept? When two pieces of the same grey paper are placed on white and black backgrounds, respectively, the grey on the white background appears darker than the grey on the black background. Physics tells us that the two greys are identical. Does the apparent difference between the greys tell us something else about the physical world? Or does it merely tell us something about the visual system? If it does the latter, this is just my point. I have argued that errors in lightness are systematic, not random, and thus they must constitute a signature of the software employed by the visual system. As Schwartz notes: ‘Any notion of error must earn its keep by the service it performs in helping describe, systematize, and explain the facts of interest in grey-scale perception.’
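Gilchrist's remark above that luminance ratios, not luminance differences, stay constant when the illumination level changes can be illustrated with a minimal sketch. It assumes the common simplification that the luminance of a matte surface is the product of its reflectance and the illumination falling on it. The function names and the numerical values are hypothetical, chosen only for illustration.

```python
def luminance(reflectance, illuminance):
    """Luminance of a matte surface in arbitrary units (simplifying assumption)."""
    return reflectance * illuminance


def ratio_and_difference(reflectance_a, reflectance_b, illuminance):
    """Return (ratio, difference) of the luminances of two adjacent surfaces."""
    lum_a = luminance(reflectance_a, illuminance)
    lum_b = luminance(reflectance_b, illuminance)
    return lum_a / lum_b, lum_a - lum_b


# Hypothetical reflectances for a lighter and a darker grey paper.
LIGHT_GREY, DARK_GREY = 0.5, 0.25

dim = ratio_and_difference(LIGHT_GREY, DARK_GREY, illuminance=100.0)
bright = ratio_and_difference(LIGHT_GREY, DARK_GREY, illuminance=1000.0)

print(dim)     # ratio 2.0, difference 25.0
print(bright)  # ratio 2.0, difference 250.0
```

A tenfold increase in illumination leaves the ratio untouched and multiplies the difference by ten, which is one way to see why a system tuned to proportions would fail a task that asks for matched differences.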
Commentaries on Schwartz
Talking across the divide
Paul Whittle
He thought he saw an albatross
That fluttered round the lamp
He looked again and found it was
A Penny-Postage-Stamp . . .
(Lewis Carroll)
This comment relates to the two chapters by Gilchrist and by Schwartz. They grew out of a joint project on errors in lightness perception. The joint project foundered, but the chapters still speak to one another, and I try here to take up the conversation and also to look at some reasons why it broke down. I look at the breakdown first. University disciplines, though little more than a century old in their present form, have a strong grip on us. The initial socialization lasts 7 or 8 years, and thereafter the structure of goads and rewards, of peer expectations and attitudes, amplifies the differences. These are particularly acute across the arts–science divide. We can still talk to each other, but only just. Science is anti-intellectual. Nature is to be made to yield up her secrets by experiment and observation. The invention of new apparatus and methods is particularly highly prized. Experimenting is hard, creative, time-consuming work. One moves rapidly on from one experiment to the next, caught up in the excitement of the chase; a somewhat manic mode of life that is currently strongly reinforced by the twin pressures to publish and to acquire ever more and larger research grants. Most scientists feel guilty if they are caught reading books in the daytime. Philosophers, on the other hand, live in a world of books and ideas. They prize thinking, analysis, argument, and the large view. The scientist looks at them and says, ‘Well, you won’t get very far that way; that was tried for centuries without great success.’ But a philosopher looking at experimental psychology, the field in question here, can still deeply agree with Wittgenstein’s remark of many decades ago, that ‘psychology has experimental method but conceptual confusion’.
That is part of the background to why this conversation faltered. It should have been lively. Gilchrist contributes the phenomena, thought-provoking demonstrations and experiments and the ‘rules’ that describe them, and suggests some explanations. These together constitute the topic, the field of ‘lightness perception’, a subdomain of colour vision. Schwartz offers a careful analysis of the basis of Gilchrist’s work: ‘errors’ of perception. Error exists only relative to norms of correctness, and those are decided by the researcher as an aspect of the subjects’ task. This directs attention to the variety of tasks that could be investigated as ‘lightness perception’—an indispensable context. It shows up Gilchrist’s particular choices as both motivated and arbitrary, depending on point of view. It allows us to see alternatives and, more broadly, could contribute to placing contemporary colour science in its historical specificity. Gilchrist lays his cards on the table at once. Perception must be largely veridical. Otherwise we wouldn’t be here. The notions of veridicality and its complement, error, seem to him obvious and basic. Just as in the lines above from Lewis Carroll it is obvious that one or both perceptions must be wrong. Therefore, to Gilchrist, philosophical analysis of the concepts feels abstract, unnecessary, academic (and a distraction from the excitement of the chase). It has the wrong feel to it. He finds it difficult to engage with it. But to a philosopher the notion of errors in perception at once sets alarm bells ringing, particularly when presented as absolute. ‘Would we not have to know the things-in-themselves?’ Error
is inherently normative—it requires a standard of correctness. And it is clear that there could be many such standards, relative to different goals, and therefore, in the laboratory, to the tasks set by the experimenter. Gilchrist doesn’t see the importance of this point, and this, I think, shows the first of the philosophical rocks on which the joint project foundered. Gilchrist is locked into what is charitably called ‘a robust realism’. Even when he specifies what his standard of correctness is, viz. that the subject selects a Munsell grey of the same reflectance as the target from an array of greys under a standard illumination (the ‘reference conditions’), he does not admit what seems glaringly obvious to an outsider, namely that this is a wholly artificial standard of correctness, chosen for pragmatic reasons. He writes: ‘the Munsell chips are seen with very little error (my italics). . . To this end, the matching chart embodies those conditions under which perception of lightness is currently known to be optimal.’ That is, although he does give pragmatic reasons (‘conditions optimal’) for his choice of reference conditions, he sees them as reasons for believing that lightness is (almost) correctly perceived under such conditions. How pragmatic the choice is, is shown in the very next paragraph where he modifies the reference conditions by inserting a chequered background because the situation in this experiment would make the usual conditions difficult or ‘unfair’. This is not inconsistency: it is quite logical to adhere to an absolute notion of truth and be pragmatic about how to study it. But if you change your criterion of correctness, your reference conditions, from experiment to experiment, can you generalize across them? 
That Gilchrist very much wants to generalize, to produce a unified theory of errors in lightness perception, is one reason why it is important to him to believe he is measuring the same thing, a cross-situational veridicality or error, in the different experiments.1 Now, the realism that Gilchrist adheres to (and twice over: perception getting the world right, and psychology getting perception right), and its associated notion of ‘truth’, is well known to be a philosophical quagmire. Adhering to it, rather than to the clear and modest notion of task-relative correctness, is a matter of faith, a faith common among scientists.2 But does it matter if it’s a philosophical quagmire? Has not good science been erected before on philosophical quagmires? Probably. However, I’ve just suggested one reason why it may matter (dubious validity of generalizing across different tasks, so that the unified theory will have weak joints), and I shall point below to some advantages of pragmatism. But before that, I want to turn to what I see as the second philosophical rock on which the conversation foundered. This is indicated by Schwartz’s notion of ‘look errors’, which Gilchrist describes as ‘absolute’ errors: situations in which an object looks ‘not as it really is’. A paradigm case is the Gelb effect, in which a black paper lit by a concealed beam looks white.3 Schwartz argues that these are troublesome because solipsistic: the subject is the only authority, and furthermore that most work on lightness perception could do without them.
1 The alternative is not to abandon the idea of cross-situational veridicality, but to check it carefully in each case, as in fact Gilchrist tries to do: ‘we also found that, in general, the checkerboard background produces the same pattern of data as that produced by the white background’.
2 But we might note that even Helmholtz, the giant in whose shadow visual science is still conducted, wrote ‘In my opinion. . . there can be no possible sense in speaking of any other truth of our ideas except of a practical truth’ (Helmholtz 1909/1962, p. 19). By ‘ideas’ there he was referring to our perceptions. Gilchrist nicely combines the two notions of truth in one sentence in his introductory paragraph: ‘Without a high degree of veridicality in visual perception we would not be here.’
3 The moon in the night sky, lit by the hidden sun, is a natural Gelb effect. Yet we do not regard its normal appearance as an error. Does this not suggest that the ‘error’ in the Gelb effect is more a matter of the context of expectations and conventions than is usually thought?
Work on colour defects offers an instructive comparison.
It consists largely of a body of reliable psychophysics based on discrimination and matching experiments, while the question of appearances, of the subjective experiences of the colour-deficient, is pushed to one side, remaining peripheral and unresolved. Gilchrist would, I think, say that this is just why this work is of little interest to him. He is committed to a science of how the world looks, not just of what discriminations can be made. Behind his work lies Koffka’s disarmingly simple question, ‘Why do things look as they do?’ This question provokes, and I suspect that, like scientists’ robust realism, its main value may be motivational, so that to question the metaphysics behind it is, to that extent, beside the point. Nevertheless there is a large relevant critical literature, organized around key phrases like ‘the myth of the given’, ‘the Cartesian theatre’, and so on. Two points from this literature are directly relevant to the present context: that the trouble with concentrating on ‘looks’ or appearances, is that they are taken as too immutable (too ‘given’), and that they are too disconnected from action. These points are made in one way or another by many philosophers, and in psychology, famously, by James Gibson. The point about immutability seems to me of general relevance in psychophysics. The fact that subjects are in an unfamiliar situation and are learning and, what’s more, could be deliberately trained, is rarely attended to. The appearances they report are ‘naturalized’, taken as given by the nature of their ‘visual systems’, rather than as being relatively transient states that are also a function of expectations and of the present level of skill. For instance, it is a common experience of visual scientists working even with initially very strong ‘visual illusions’ that after a while they cease themselves to see them, or see them differently. This is rarely taken seriously, and I suggest that it should be. 
The point about disconnection from action is again (as with norms of correctness) to direct attention to the task that the subject is engaged in. And in Gilchrist’s domain—the perception of lightness (degrees of greyness from black to white)— can there not be many such tasks? Consider black-and-white photography, the manufacture of paint, the evaluation of photocopiers, the judgement of X-rays. In each domain there is a multiplicity of tasks involving judgements of greyness, explicitly or implicitly, and the standards of correctness are equally various. Gilchrist assumes that explicit judgements of reflectance are a good probe to investigate the system. But explicit comparison of surface colours in different contexts is only one special use of our colour vision. Quite other uses of it may be more responsible for ‘our still being here’. As Schwartz says, Gilchrist’s chosen tool belongs in the ‘measurement device’ concept of perception, which reminds us of its specific historical and technological context. I am pointing out here that the conversation in these two chapters raises practical suggestions for a working scientist like Gilchrist. I have suggested two: take more seriously, first, that subjects are learning and changing during an experiment, and secondly that the tasks employed are only a small sample of those for which our colour vision, achromatic or not, can be used. My point is not so much that these things should be attended to here and now (they are perhaps the sort of suggestion whose value, if any, shows mainly when a scientist feels at an impasse), but rather that a conversation of this sort between psychology and philosophy is not a matter of academic philosophical tidying or carping, but has, potentially, great practical value. I felt while writing this comment that it opened up an extraordinary number of questions of both intellectual and practical importance.
Reference Helmholtz, H. von (1909/1962). Treatise on physiological optics, (3rd edn), (ed. J. P. C. Southall). Dover, New York.
Commentaries on Schwartz
On the veridicality of lightness perception
Richard Brown
I would like to make two brief comments on these twin papers by Alan Gilchrist and Bob Schwartz, both on the topic of errors in lightness perception. I have been a long-time admirer of Alan and his ingenious experiments, and his research has had a large influence on my own. I think many of his findings have important and yet-unappreciated implications for understanding human colour perception and its underlying neuronal mechanisms. At the same time, I have never found the concept of veridicality championed by Alan to be terribly useful in my own work, and I would largely agree with Bob that, in general, there is no privileged ‘correct’ measure of lightness against which errors can be measured. As Bob points out, one consequence of this multiplicity of measuring sticks is that lightness judgements that are correct by one legitimate measure may be errors by another. But this doesn’t trouble me, because I am primarily interested in understanding the mechanisms of perception, rather than rating its success. To this end, it is valuable to learn the patterns of perceptual responses generated by various stimuli, but not necessarily whether some of those responses are called errors or not. Of course, there are the ultimate biological measures of success, such as survival and reproduction, which presumably drive the evolution of perceptual mechanisms, but I doubt these are captured in any particular measure of the veridicality of lightness perception. My second comment concerns the applicability of lightness models to colour perception in the real world. Both Alan and Bob have limited their analyses to lightness perception, and, at first glance, it seems a useful simplification of colour perception to isolate just one aspect of it for intensive analysis.
But perhaps this is an instance in which the simplifying assumptions needed to isolate lightness have, at the same time, removed something essential to the real issue at hand, namely seeing colours in the real world. I suggest that the implications of their different approaches to errors in lightness perception might be clarified by considering the requirements of veridical lightness perception in the real world, with its coloured lights and surfaces. In an idealized achromatic world, both illuminants and reflectances may be reasonably well approximated by scalar quantities, allowing for a relatively straightforward analysis of errors, particularly in the rank ordering of lightnesses. (I say reasonably well, because there would still be complications such as spectral components of reflectances, and polarization of illuminants, that could not be represented by scalar quantities.) But allowing for the reflectances and the illuminants to take on different values at each wavelength raises serious problems for the task of assigning correct, one-dimensional ‘lightnesses’. Lightness reflectance, according to Gilchrist, is given by a function which depends on the spectral distribution of both the surface and its illuminant, and thus is no longer a stable property of just the surface. (In Don MacLeod’s words, ‘as the light gets redder, the reds get lighter’.) Lightness reflectance also depends on the luminosity function, and thus incorporates the somewhat arbitrary and idiosyncratic properties of the visual system of the particular organism viewing the surface. Even the human standard observer has significantly different luminosity functions for bright (photopic) and dim (scotopic) viewing, reflecting roughly the greater sensitivity to blue light of the rods relative to the cones, forcing us to choose whether the rods or the cones make systematic errors in lightness judgements. 
Finally, it’s notoriously much harder to make heterochromatic lightness judgements, such as whether a green shirt is lighter or darker than an orange one, than to compare shades of grey, raising the question of whether lightness is even a particularly salient aspect of colour experience.
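The dependence described in this commentary, of a luminosity-weighted reflectance on the surface, the illuminant, and the observer's luminosity function, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the bell-shaped `toy_luminosity` curve is a stand-in rather than measured data, the surface and light spectra are invented, and the weighted-average formula is one common way to write such a quantity, not necessarily the exact definition Gilchrist uses.

```python
import math

# Wavelength samples in nanometres across the visible range (illustrative grid).
WAVELENGTHS = list(range(400, 701, 10))


def toy_luminosity(wl, peak=555.0, width=45.0):
    """Hypothetical bell-shaped stand-in for a luminosity function (not measured data)."""
    return math.exp(-0.5 * ((wl - peak) / width) ** 2)


def red_surface(wl):
    """Invented reflectance that rises smoothly towards long wavelengths."""
    return 0.05 + 0.75 / (1.0 + math.exp(-(wl - 600.0) / 15.0))


def grey_surface(wl):
    """Invented spectrally flat reflectance."""
    return 0.5


def flat_light(wl):
    """Invented illuminant with equal power at every wavelength."""
    return 1.0


def reddish_light(wl):
    """Invented illuminant whose power increases with wavelength."""
    return 0.2 + 0.8 * (wl - 400.0) / 300.0


def weighted_reflectance(surface, light, luminosity=toy_luminosity):
    """Average of the surface reflectance, weighted by illuminant power times luminosity."""
    num = sum(surface(wl) * light(wl) * luminosity(wl) for wl in WAVELENGTHS)
    den = sum(light(wl) * luminosity(wl) for wl in WAVELENGTHS)
    return num / den


if __name__ == "__main__":
    # The red surface scores higher under the reddish light than under the flat light.
    print(weighted_reflectance(red_surface, flat_light))
    print(weighted_reflectance(red_surface, reddish_light))
    # Shifting the luminosity peak towards shorter wavelengths (a rod-like curve,
    # peak value assumed) lowers the score of the same red surface.
    print(weighted_reflectance(red_surface, flat_light,
                               lambda wl: toy_luminosity(wl, peak=507.0)))
```

In this toy model the grey surface receives the same value under either light, whereas the red surface's value moves with both the illuminant and the choice of luminosity curve, which is the sense in which the quantity is no longer a property of the surface alone.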
Chapter 16
THE PLACE OF COLOUR IN NATURE
Brian P. McLaughlin
Preface
I am interested in colour because I am interested in visual consciousness. I am interested in visual consciousness because I am interested in how we acquire knowledge about reality, about the way the world actually is. And I am interested in how we acquire knowledge about reality because I want to understand the nature of reality. My desire to understand the nature of reality is non-instrumental: it is something I desire for its own sake, rather than for the sake of some further end. Understanding the nature of reality is intrinsically valuable. Visual consciousness is one of our primary modes of access to the world around us: it is, we intuitively feel, our window to the world. I find the topic of colour especially fascinating because there is a serious question of whether colours are really ‘out there’ in the environmental scenes before our eyes, or whether they are instead projected into scenes by our visual consciousness. If the latter, then rather than our visual experiences presenting the environmental scenes before our eyes as those scenes actually are, our visual experiences are systematically illusory: the world isn’t the way it looks. Our visual experiences present environmental scenes as filled with colour. If, in reality, such scenes are devoid of colour, then our visual experiences present scenes as being a way that they in fact are not, and so in having visual experiences we are subject to a pervasive systematic illusion. It is hard to believe that what we take to be one of our primary modes of access to the world around us invariably misleads us in this way. But science has no need of the hypothesis that there are colours in order to explain why we have the visual experiences we have when we view the scenes before our eyes. Why we have the visual experiences we have when we view the scenes before our eyes can, in principle, be explained by facts about how electromagnetic radiation affects our neurobiology.
On the evidence, colours are certainly not fundamental aspects of reality. They are not like properties such as mass, charge, spin, or charm. None the less, I persist in the common-sense belief that colours are aspects of reality, albeit derivative ones. While I am unable to refute projectivism (the view that our visual experiences project colours into environmental scenes), I persist in the common-sense belief that the world around us is indeed a world of colour, and so resist the idea that in having visual experiences we are subject to a pervasive illusion. I believe that colours are really ‘out there’. Colours are mind-independent properties of things in the physical world: they are objective properties and our visual experiences put us in touch with them. However, which objective properties are colours essentially depends, I believe, on visual consciousness, and so on our subjectivity. I am optimistic that we can locate colours among objective properties involving the interaction of matter and electromagnetic radiation. But to so locate them, we must appeal to our visual consciousness. The reason is that what makes an objective property a colour (or, more precisely, a colour for us) is the causal role it plays vis-à-vis our visual consciousness. To borrow a phrase of Frank Jackson, I seek in this chapter to provide a subjectivists’ guide to objectivism about colour.
B. P. McLaughlin
Colours versus what it’s like to see them
Some years ago, Bertrand Russell remarked:
It is obvious that a man who can see knows things which a blind man cannot know; but a blind man can know the whole of physics. Thus the knowledge which other men have and he has not is not a part of physics. (Russell 1927, p. 389)
Among the things Russell believed that the sighted know, and the (congenitally) blind do not, is the nature of colours, for he held that the only way to come to know the nature of a colour is by seeing it. He tells us:
The particular shade of colour that I am seeing . . . may have many things to be said about it . . . But such statements, though they make me know truths about the colour, do not make me know the colour itself better than I did before: so far as concerns knowledge of the colour itself, as opposed to knowledge of truths about it, I know the colour perfectly and completely when I see it and no further knowledge of it itself is even theoretically possible. (Russell 1912, p. 47)
Frank Jackson (1982) has recently described the case of Mary, who knows all the physical facts about colour, but has spent her life in a black and white room never seeing (chromatic) colours. Upon leaving the room and encountering something red, she can learn something about red, namely, what it is like to see it. So, Jackson claims, she can come to know a fact about redness that she did not know before.1 Jackson does not tie the idea of what it is like to see a colour to the nature of the colour, but consider the following claim of Galen Strawson:
Colour words are words for properties which are of such a kind that their whole and essential nature as properties can be and is fully revealed in sensory-quality experience given only the qualitative character that that experience has. (Strawson 1989, p. 224)
1 Broad (1925, pp. 71–72) once asked: ‘Would there be any theoretical limit to the deduction of the properties of chemical elements and compounds if a mechanistic theory of chemistry were true? Yes. Take . . . , e.g., “Nitrogen and Hydrogen combine when an electric discharge is passed through a mixture of the two. The resulting compound contains three atoms of Hydrogen to one of Nitrogen; it is a gas readily soluble in water, and possessed of a pungent and characteristic smell”. If the mechanistic theory be true . . . a mathematical archangel could deduce from his knowledge of the microscopic structure of atoms all these facts but the last. He would know exactly what the microscopic structure of ammonia must be; but he would be totally unable to predict that a substance with this structure must smell as ammonia does when it gets into the human nose. The utmost that he could predict on this subject would be that certain changes would take place in the mucous membrane, the olfactory nerves and so on. But he could not possibly know that these changes would be accompanied by the appearance of a smell in general or of the peculiar smell of ammonia in particular, unless someone told him so or he has smelled it for himself.’ These considerations succeed in refuting the view Broad called ‘mechanism’. Jackson takes the case of Mary to refute physicalism. In McLaughlin (2003) I argue that it doesn’t.
By ‘the qualitative character of an experience’, he means what it is like for a subject of the experience to have the experience. His claim is that the whole and essential nature of a colour is revealed in what it is like to visually experience it. Mark Johnston (1992) has aptly labelled this doctrine ‘Revelation’. The doctrine is that the nature of a colour is revealed to us in our visual experience of it. According to proponents of Revelation, as concerns the nature of colours, experience is not merely the best teacher, it is the only teacher, for only experience itself will lead us to knowledge of what it is like to experience colours. If our eyes can be trusted, colours pervade physical surfaces, fill volumes of air and liquid, and are manifest at light sources. There is much that we’ve learned about light and how it interacts with matter. We’ve investigated how the chemical properties of surfaces and volumes made up of various materials dispose them to interact with light; how rods and cones respond to light; how electrochemical impulses are propagated along the optic nerve to the visual centres of the brain, and resulting patterns of neural activity therein. From an information-theoretic perspective, we’ve investigated how the visual system might process information about the scenes before our eyes; how, for instance, neural nets might compute the spectral reflectance distributions of the surfaces in such scenes from the spectral power distributions of the radiant fluxes stimulating our retinas. While there is much that we still don’t know, we’ve learned an enormous amount; and research in vision science is moving apace. It is a consequence of the doctrine of Revelation, however, that all we’ve learned and, indeed, all we can ever hope to learn by scientific investigation will contribute not one whit to our knowledge of the nature of colours themselves. 
For Revelation entails that there is nothing more that we can learn about the nature of colours than what visual experience teaches us. While scientific investigation can uncover the underlying causal conditions for seeing such things as the redness of the surface of a ripe tomato, the greenness of a traffic light, the yellowness of a volume of beer, and the blueness of an expanse of sky, such investigation will reveal nothing about the nature of these colour qualities themselves. As concerns knowledge of them, there is no substitute for experience. In considering Revelation, it is important to note that colours are one thing, the ‘qualitative’ or ‘phenomenal’ characters of colour experiences, another. Red, for instance, is not what it’s like to see red. Redness is a property of surfaces and volumes and, thereby, of the objects and materials of which they are surfaces and volumes. What it’s like to see red is an aspect of visual experiences of red; it is a property of such experiences. The doctrine of Revelation for colours should be distinguished from the doctrine of Revelation for what it’s like to see colours—the doctrine that the nature of the phenomenal character of a colour experience is revealed to us when we have the experience. It’s one question whether Revelation is true for what it’s like to see colours, it’s another whether Revelation is true for colours themselves.
Whatever intuitive appeal the doctrine of Revelation for colours enjoys is, I believe, due to two factors: our recognition that knowledge of what it’s like to see a colour figures centrally in our understanding of what the colour itself as such is, and the powerful intuitive appeal of the doctrine of Revelation for what it’s like to see colours. In what follows, I’ll offer an account of colour that will reflect the fact that knowledge of what it’s like to see a colour figures centrally in our understanding of it. The account will be compatible with the doctrine of Revelation for what it’s like to see colours, so it will be open to a proponent of that doctrine to embrace the account. The doctrine of Revelation for colours is not, however, entailed by the doctrine of Revelation for what it’s like to see colours and the fact that knowledge of what it’s like to see a colour figures centrally in our understanding of it. And I’ll reject Revelation for colours. I’ll argue, on empirical grounds, that if there are colours, they are ways things might be that involve the interaction or potential interaction of matter and electromagnetic radiation, even though that isn’t revealed in our colour experience. The powerful intuitive appeal of the doctrine of Revelation for what it’s like to see colours—for the phenomenal characters of colour experiences—seems to me undeniable. Still, I believe it, too, is mistaken, for I think that the phenomenal characters of our experiences are neuroscientific properties, even though that isn’t revealed to us when we have the experiences. I won’t attempt to argue that here, however. Nor will I assume it. My focus here is on colour; I leave the nature of colour experiences for another occasion. The question I want to pursue is: What is the nature of colour?
A functional analysis of colour
My view of colour is anticipated in the following passage from Thomas Reid:2
That idea which we have called the appearance of colour, suggests the conception and belief of some unknown quality, in the body which occasions the idea; and it is to this quality, and not to the idea, that we give the name colour. The various colours, although in their nature equally unknown, are easily distinguished when we think or speak of them, by being associated with the ideas which they excite . . . When we think or speak of any particular colour, however simple the notion may seem to be, which is presented to the imagination, it is really in some sort compounded. It involves an unknown cause, and a known effect. The name of colour belongs indeed to the cause only, and not to the effect. But as the cause is unknown, we can form no distinct conception of it, but by its relation to the known effect. (Reid, Inquiry into the human mind, in Stewart 1822, p. 205)
Colours are properties of bodies (and the like); ordinary folks are ignorant of their nature; but we form a conception of them by their relations to our colour experiences.
2 Consider, also, the following earlier passage from Descartes: ‘When we say we perceive colors in objects, it is really just the same as though we said that we perceive in objects something as to whose nature we are ignorant, but which produces in us a very manifest and obvious sensation, called the sensation of color’ (Descartes 1954, p. 195).
For simplicity of exposition, I’ll frame my basic proposal for a specific colour, redness. But my account is intended to hold for all colours, chromatic and achromatic (white, black, and shades of grey).3 Here, then, is my basic proposal (refinements will come later):
Basic proposal: redness is that property which disposes its bearers to look red to standard visual perceivers in standard conditions of visual observation, and which must (as a matter of nomological necessity) be had by everything so disposed.
Redness is a visual property in that it plays a certain role vis-à-vis visual consciousness: namely, the role of being the property that disposes its bearers to look red to standard visual perceivers in standard conditions of visual observation and that (nomologically) must be possessed by everything so disposed.4 Call this role ‘the redness-role’. According to the basic proposal, then, redness is just that property, whatever it is, that occupies the redness-role. The proposal is intended as a functional or topic neutral analysis of the concept of redness. The role-description ‘that property which . . .’ is intended not only to fix the referent of the concept, but also to express a condition that is necessary and sufficient for satisfying it.5 Thus, if the proposal is correct, then all that it takes for a property to be redness is just for it to fill the redness-role; filling the role is necessary and sufficient for a property to be redness; for being redness just consists in filling that role. Thus, being red just consists in having a property that fills the role.6
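As a reading aid only, the logical form of the basic proposal can be set out schematically. The notation is introduced here and is not the author's: $D_R(x)$ abbreviates "x is disposed to look red to standard visual perceivers in standard conditions of visual observation", $\mathrm{Basis}(P, D_R)$ abbreviates "P disposes its bearers to be $D_R$", and $\Box_N$ marks nomological necessity. The sketch also assumes that "that property" is read as a definite description.

```latex
% Schematic rendering of the redness-role (notation assumed, not from the text).
\[
  \mathrm{Role}_R(P) \;:\equiv\;
  \mathrm{Basis}(P, D_R) \;\wedge\; \Box_N\, \forall x\, \bigl( D_R(x) \rightarrow P(x) \bigr)
\]
% Redness is the occupant of the role, if there is one.
\[
  \text{redness} \;=\; \iota P\, \mathrm{Role}_R(P)
\]
% Being red consists in having a property that fills the role.
\[
  \mathrm{Red}(x) \;\leftrightarrow\; \exists P\, \bigl( \mathrm{Role}_R(P) \wedge P(x) \bigr)
\]
```

On this rendering, whether anything is red turns on whether some property satisfies $\mathrm{Role}_R$, which matches the claim that filling the role is necessary and sufficient for a property to be redness.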
Our functional analysis versus dispositionalism
It is a consequence of our functional analysis that redness is not the disposition to look red to standard perceivers in standard circumstances, but rather a basis of the disposition: a property that endows something with the disposition.7
3 It is also intended to hold for visual properties such as highlighting, glaring, glowing, gleaming, glinting, glistening, glittering, and the like. Moreover, it can be extended to sensory qualities in the aural, gustatory, olfactory, and tactual modalities, to qualities such as the loudness, pitch, and timbre of sounds, the sweetness, saltiness, sourness, and bitterness of tastes, the putridness of odours, the roughness of tangible surfaces, etc. The account can thus be generalized as an account of sensory qualities.
4 Hereafter, I shall drop ‘(nomologically) must’ and simply speak of the property being possessed by everything so disposed, where ‘everything’ is to be understood as quantifying over all nomologically possible things.
5 Saul Kripke (1980, p. 140n) once suggested that we could use a related description to fix the reference of colour terms: ‘The reference of “yellowness” is fixed by the description “that (manifest) property of objects which causes them, under normal circumstances, to be seen as yellow (i.e. to be sensed by certain visual impressions)”.’
6 The idea of topic neutral analysis of colour can be found in Smart (1961, 1963) and Armstrong (1968). My basic proposal appears in McLaughlin (2003); and the basic proposal, minus the second conjunct, appeared in McLaughlin (2000); that earlier proposal of mine is defended in Cohen (2000); see also Cohen (2001).
7 Being composed of sodium chloride is a basis for the disposition of water-solubility; being composed of threads woven together in a certain way (so that there are pockets for water to collect in) is a basis for the disposition of water-absorbency; having a certain crystalline structure is a basis for the disposition of fragility. These bases are all structural properties, but a disposition can be a basis for another disposition, for possession of one disposition can endow something with another. Perhaps every disposition must ultimately have a structural basis, but it is none the less the case that a disposition can also have a basis that is itself a disposition. A structural property may be a basis for a disposition that is, in turn, a basis for another disposition. I am not denying that colours are, for instance, light dispositions; in fact, as will become clear, that seems to me plausible indeed. (A light disposition might endow something with the disposition to look coloured, so that the former disposition is a basis for the latter.) I’m denying only that colours are dispositions to look coloured.
On the dispositional analysis of
colour, something is red if, and only if, it is disposed to look red to standard perceivers in standard circumstances.8 Our analysis agrees with the dispositional analysis in one direction: something is red only if it is disposed to look red to standard perceivers in standard circumstances. But our analysis offers a different explanation of why this is so from the one the dispositional analysis offers. According to the dispositional analysis, the claim is true because redness = the disposition to look red (to standard perceivers in standard circumstances). According to our functional analysis, it’s true for a different reason: red things will be disposed to look red because they have the property of redness, which so disposes them. In implying the falsity of the claim that redness is the disposition to look red, our functional analysis is, I believe, faithful to our common conception of redness, for it is part of that conception that redness disposes its bearers to look red. The disposition to look red doesn’t do that. Nor does the second-order property of being a basis for the disposition. Only a basis for the disposition does. The dispositional analysis implies that if something is disposed to look red (to standard perceivers in standard circumstances), then it is red. So, according to it, if there are things that are so disposed, then there are red things. Our analysis doesn’t have that implication. Indeed, our analysis fails to have that implication even when conjoined with the assumption that the disposition in question has bases. For, on our analysis, being a basis for the disposition to look red doesn’t suffice for being redness; to be redness, a property must be a basis that is common to all things so disposed. Of course, if everything so disposed has a basis that so disposes it, then, if there are disjunctive properties,9 there will be a property shared by everything so disposed, namely, the disjunction of all the bases. 
But from the fact that each disjunct disposes its bearers to look red, it does not follow that the disjunctive property does; the disjunctive property may very well fail to be a basis for the disposition, even though each disjunct is. So, even if the disposition to look red has bases and there are disjunctive properties, that would not settle whether anything is really red, as opposed to merely being disposed to so look. To be red, something must have the property of redness. Redness exists if, and only if, there is an occupant of the redness-role.10
8 Dispositional analyses of colour go back at least to John Locke (1690/1975), but a related idea can be found in, for example, Aristotle’s De Anima, III, ii, 426a, and in Metaphysics, IV, v, 1010b. Locke is sometimes interpreted as holding that dispositionalism captures our common conception of colour; thus see Bennett (1968). But see Mackie (1976) for an interpretation of Locke according to which Locke offered a dispositional conception of colour as a replacement for our common conception of colour, a conception to which he thought nothing in fact answers. In recent years, dispositional analyses can be found in, among other places, McGinn (1983), Peacocke (1984), McDowell (1985), Wiggins (1987), and Johnston (1992). In Johnston (1992), it is acknowledged that the dispositional conception of colour diverges from our common conception, a conception to which Johnston thinks, ever so strictly speaking, nothing in fact answers.
9 It is controversial whether there are disjunctive properties in a metaphysical (and so non-pleonastic) sense of ‘properties’; see Armstrong (1979). I’m using ‘properties’ in a metaphysical sense.
10 ‘Redness’ here expresses the concept of redness, a concept that purports to denote a certain property. The question is whether it succeeds in denoting a property. On our analysis, it will so succeed if, and only if, there is some property that answers to its descriptive content. (On a pleonastic notion of property, there is a property of F-ness if, and only if, there is a concept of F. As I said, I’m using ‘property’ in a metaphysical, not in a pleonastic sense.)
Whether anything is red thus depends on
whether there is an occupant of that role. That’s an empirical question that isn’t settled by the fact that there are things disposed to look red to standard perceivers in standard circumstances. For there to be an occupant of the redness-role, first, the disposition to look red to standard perceivers in standard circumstances must have bases and, secondly, there must be a basis that is common to everything so disposed. If the second condition fails to be met, then while our experiences of red purport to present a single property, they fail to do so. Unless both conditions are met, the visual experiences of red had by standard perceivers in standard circumstances will fail to ‘track’ any property of the things disposed to produce those experiences in them in such circumstances (save, trivially, that of being so disposed and properties entailed by that property). If our basic proposal is correct as it stands, then should there fail to be occupants of the colour-roles (the redness-role, the blueness-role, etc.), colour irrealism would be true: nothing would be coloured. Visible objects are disposed to look coloured to us and such dispositions are activated when we see them. Thus, if nothing were really coloured, projectivism would be true: our colour experiences would be in systematic error since they would purport to present properties of things in the scenes before our eyes that nothing actually has.11 Whether colour realism or colour projectivism is true is an issue that I shall not attempt to resolve here. The reason is that its resolution turns, in part, on empirical issues that remain unsettled. However, in due course, I will sketch what I think is the most promising line of defence of colour realism.
Will the circle be unbroken?
A disposition to look red is always a disposition to look red to a certain sort of visual perceiver in certain circumstances of visual observation. It’s well known that how something looks in colour can vary with variations in the visual perceiver (such as the kinds of pigments in the observer’s cones, the state of adaptation of the cones, conditions of the perceiver’s visual cortex, and so on) and with variations in the environmental circumstances of visual observation (the kind of incident light, the surround of the thing observed, the distance and angle of the thing observed, conditions of the intervening medium, and so on). Since our functional analysis, like the dispositional one, appeals to the notions of a standard visual perceiver and standard circumstances of visual observation, it invites the questions: What kind of visual perceiver is standard? What kind of circumstance of visual observation is standard?
11 Projectivism goes back at least to Galileo and Descartes, and related ideas can be found as far back as Democritus. In recent years, defences of projectivism can be found in the philosophical literature in, among other places, Campbell (1969, 1993), Mackie (1976), Hardin (1988, 1990), Boghossian and Velleman (1989, 1991), Baldwin (1992), and Maund (1995). Defences can be found in the psychological literature in, among many other places, Cosmides and Tooby (1995) and Kuehni (1997). For a defence of the view that colour irrealism is incoherent, see Stroud (2000).
In reply to these questions, one might say that the kind of visual perceiver that is standard is the kind to whom things that were a certain colour, C, would look C in standard circumstances of visual observation, and that the kind of circumstance of visual observation that is standard is the kind in which things that were C would look C to standard visual perceivers. This reply amounts to a functional or topic neutral analysis of the notions of a standard visual perceiver and standard circumstances of visual observation. While such an analysis is compatible with the truth of our claim that redness is the property that occupies the redness-role, it is unavailable to us since we intend this claim as a functional analysis of the concept of redness. Our basic proposal appeals to the notions of a standard perceiver and standard circumstance in saying what redness is, and, so, the reply in question would take us in a circle. According to our analysis, being redness just consists in occupying the redness-role, a role specified in terms of the notions in question. So, if our analysis is correct, then it can’t be the case that being a standard visual perceiver just consists in being a visual perceiver to whom red things would be disposed to look red in standard conditions of visual observation; similarly for being a standard circumstance of visual observation. Since we propose to identify which property (if any) redness is by appeal to the notions of a standard perceiver and standard circumstances, we need to indicate how to identify such kinds of perceivers and circumstances, at least in principle, without identifying redness or having to determine whether anything is red. There is, of course, a Euthyphro question here: Is what makes an experience an experience of red the relationship it bears to redness, or is what makes a property redness the relationship it bears to experiences of red? If our functional analysis is correct, the answer is the latter. 
(Our functional analysis shares this consequence with dispositionalism.) Our common conception of colour seems to involve notions of a normal visual perceiver and normal circumstances of visual observation. We thus speak of ‘colour blindness’, an abnormality in colour vision, and count visual perceivers suffering from it as undergoing colour illusions in some normal circumstances. And normal perceivers are said to undergo colour illusions in some circumstances of visual observation; we thus count the wall as white, despite the fact that it looks pink to a normal perceiver when bathed in red light.12 In reply to the questions at issue, let us, just as a first-pass, appeal to our actual ordinary standards of normality, those we in fact tacitly invoke in everyday discourse.
12 Here, as in our functional analysis, I use ‘looks’ in the phenomenal sense. To someone in the know, a white wall in red light will, of course, not look as if it is pink; it will look as if it is white in red light; similarly, it won’t look to be pink, it will look to be white in red light. (This illustrates one thing that is sometimes meant by ‘colour constancy’.) Still even to such a perceiver, assuming that he or she is normal, the white wall will look pink in the phenomenal sense of ‘looks’. ‘Looks’ is used epistemically, rather than phenomenally, in ‘looks as if’ and ‘looks to be’ locutions. One difference between the phenomenal and epistemic uses of ‘looks’ is that while something can look as if it is F (or look to be F) only to a subject that has the concept of F, something can look F (in the phenomenal sense) to a subject that lacks the concept of F. For instance, something can (phenomenally) look right-angled to a subject that lacks the concept of a right angle. Moreover, there is a vast number of specific shades of colour such that something can (phenomenally) look that shade of colour to us, even though we lack a concept of it (see Raffman 1995). The locus classicus for discussion of the phenomenal, epistemic, and comparative (‘looks like a(n)’, e.g. looks like a cat) uses of ‘looks’ is Chisholm (1957); but see also Jackson (1977). [Jackson (1977) coined the term ‘phenomenal use of looks’.]
Our actual everyday notions of a normal perceiver and normal circumstances are, of course, considerably vague,
and hence considerably semantically indeterminate: borderline cases of standard perceivers and standard circumstances abound. But we can recognize clear-cut cases of each. Our actual everyday standards are, I believe, somewhat arbitrary, to some extent a matter of tacit convention rather than standards that were once somehow discovered.13 It may even be that the standards can vary somewhat with conversational context.14 Still, appeal to these notions will serve for now to fix our ideas. Thus, for the time being (refinements will come later), let us mean by ‘standard visual perceivers’ visual perceivers of the kind that count as normal by our actual ordinary standards, and by ‘standard circumstances of visual observation’, circumstances of the kind that count as normal by our actual ordinary standards. This first-pass reply doesn’t result in circularity since (semantically indeterminate cases aside) we can, in principle, determine whether a type of perceiver or observational circumstance counts as normal in this sense without determining whether anything is red or what property (if any) redness in fact is. In due course, I’ll revise this first-pass proposal in a way that removes any concern whatsoever about circularity. Are there circularity worries as concerns ‘looks red’? From the fact that something looks red, it does not follow that the something in question is red; indeed it does not even follow that there are (were or will be) any red things. Still, there would be a circularity worry if something looks red to a subject only if an experience of the subject bears a relationship to the property redness. The fact that something looks red to a subject does not, however, entail that the subject has an experience that bears a relationship to redness, for it doesn’t even entail that redness exists. (Something can look demon-possessed, even though there is actually no such property as being demon-possessed.)
It is one question whether redness exists; it’s another whether anything ever looks red to us, whether we ever have visual experiences of something red. If our functional analysis is correct, then redness exists if, and only if, some property actually plays the redness-role. That there is such a property doesn’t follow from the fact that some things look red to us. Still a circularity worry of sorts can be seen to arise when we address such questions as this: What is it for something to look red? A natural response is that it is for the something in question to look the way red things would look in colour to standard perceivers in standard circumstances. This natural response does not imply that there actually are (were or will be) any red things, or even that redness exists. We can say what would be the case were someone demon-possessed, even though there is no such property as being demon-possessed. But consider that the natural response to the question what it is for something to look blue is this: it’s for it to look the way blue things would look in colour to standard perceivers in standard circumstances. And there are parallel responses to the parallel questions concerning every other way that something might look in colour. How, then, do we understand the different claims made by the responses in question?15 It is, of course, not well understood how we understand any claims at all. Understanding is not well understood. But it seems fair to say that, for those of us blessed with normal sight, 13
See Hardin (1988).
14 See Chisholm (1957). For present purposes, it won’t matter if that is so; we can take the conversational parameters to be fixed.
15 I owe this way of spelling out the circularity worry to Lewis (1997).
our knowledge of the phenomenal characters of our colour experiences contributes to our understanding of the claims in question.16 We know what it’s like for something to look red to us and what it’s like for something to look blue to us. This knowledge contributes to our understanding of the claims about what it is for something to look red and what it is for something to look blue. Moreover, insofar as this knowledge so contributes, it contributes to our understanding of what redness itself as such is, and what blueness itself as such is, for it contributes to our understanding of the redness-role and the blueness-role. This contribution to understanding is one that the congenitally blind are missing, and, for this reason, among others, they lack full possession of colour concepts.17
Might redness not have been redness?
Suppose that some property, φ, fills the redness-role. Then, according to our functional analysis, φ is redness. But if there is in fact such a property, φ, it might not have filled the role. Had some of the laws of nature been different from what they in fact are, it would not have disposed its bearers to look red to standard perceivers in standard circumstances, or there would have been another property that also disposes the things that have it to look red to standard perceivers in standard circumstances in its absence, but that is sometimes possessed without φ being possessed.18 Either way, φ would not have played the redness-role. Does it follow, then, that redness might not have been redness? It is, of course, nonsense to say that something might not have been identical with itself. Still the property that plays the redness-role might not have; and so, in a sense, redness might not have been redness. Descriptions admit of scope distinctions. The description ‘the property that plays the redness-role’ takes narrow scope in: ‘It might be the case that the property that plays the redness-role is not the property that plays the redness-role’. And this claim is necessarily false. The fact that the description is sometimes used with narrow scope explains why there is a sense in which redness could not have failed to play the redness-role. The description takes wide scope in: ‘The property that plays the redness-role is such that it might not have been the property that plays the redness-role’. And this claim is true. The fact that the description is sometimes used with wide scope explains why there is a sense in which redness might not have been redness.
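The two readings can be set side by side in a rough formal sketch. The notation is an assumption introduced here for illustration and is not the chapter's own: R(x) abbreviates ‘x plays the redness-role’, U(x) says that x is the one and only property that does so, and the definite description is unpacked in Russell's manner.

```latex
% Assumed notation (not in the original text):
%   R(x): x plays the redness-role
%   U(x): x is the unique property that plays the redness-role
\[
  U(x) \;:=\; \forall y\,\bigl(R(y) \leftrightarrow y = x\bigr)
\]
% Narrow scope: the description is evaluated inside the possibility operator.
\[
  \Diamond\,\exists x\,\bigl(U(x) \land \neg U(x)\bigr)
  \qquad \text{(false: no possible world verifies a contradiction)}
\]
% Wide scope: the description first picks out its bearer; the modal claim is then about that bearer.
\[
  \exists x\,\bigl(U(x) \land \Diamond\,\neg U(x)\bigr)
  \qquad \text{(true, if the actual occupant might not have filled the role)}
\]
```

On this rendering the narrow-scope sentence fails for purely logical reasons, whereas the wide-scope sentence is true provided some property occupies the role and might have failed to occupy it.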
16 Lewis (1997) offers a different answer to the circularity worry, one that I could also accept.
17 They can, of course, possess colour concepts, even though they can’t be in full possession of them. Helen Keller, for instance, had colour concepts. For a debate over the extent to which the blind can possess vision-related concepts, see Magee and Milligan (1995), a fascinating correspondence between a sighted philosopher, Bryan Magee, and a blind philosopher, Martin Milligan, who lost his eyes to cancer at age 18 months.
18 Two unrelated points. First, on the understanding of ‘standard perceivers’ and ‘standard conditions of visual observation’, suggested as a first-pass in the previous sections, one or both of these alternatives will hold, provided that at least some of the laws of nature required for a property to play the redness-role are not included in what count as standard circumstances and/or standard perceivers by our actual ordinary criteria. Secondly, I assume here, not uncontroversially (see Shoemaker 1998), that properties do not have their causal powers essentially. I should note, however, that my views about colour could be recast in a way that makes them compatible with the view that properties have their powers essentially. Nothing essential to my view of colour turns on this issue.
We can achieve further illumination here by appealing to the distinction between rigid and non-rigid descriptions.19 The descriptions ‘the property that actually plays the redness-role’ and ‘the property that in fact plays the redness-role’ are rigid: they pick out the same property in every possible world in which they pick out anything. In contrast, the description ‘the property that plays the redness-role’ is non-rigid. It is impossible that the property that plays the redness-role does not play that role; and so, in a sense, redness could not have failed to be redness. Still, the property that actually plays it might not have; and so, in another sense, redness could have failed to be redness. For instance, the property that is in fact redness might not have disposed things to look red. It seems to me that in ordinary discourse, we sometimes use ‘redness’ rigidly and sometimes use it non-rigidly. I see no decisive reason for thinking either use is primary. The description ‘that property which disposes . . .’ is used in a content-giving sense, rather than merely in a reference-fixing sense, in our functional analysis of redness. But that leaves open whether ‘redness’ is used rigidly or non-rigidly in the proposal. As concerns our basic proposal, if ‘redness’ is used rigidly in it, then while the proposal is a priori, it is contingent. If, however, ‘redness’ is used non-rigidly in it, then our proposal purports to be both a priori and necessary.20 To fix our ideas, let us understand ‘redness’ to be used non-rigidly in the basic proposal. And let us hereafter use ‘redness’ unqualified by a rigidifying expression such as ‘actually’ or ‘in fact’ in a non-rigid way, and use either the rigidified description ‘the property that is actually redness’ or the rigidified description ‘the property that is in fact redness’, when we want to pick out the same property in every possible world. 
Since ‘redness’ is used non-rigidly in our analysis, it entails that what property counts as redness can vary from one possible world to another. Thus, a statement of the form ‘φ is redness’ would, like the statement ‘Benjamin Franklin is the inventor of bifocals’, be a contingent statement of identification. But, even so, it is none the less necessary that if a property is redness, then it disposes its bearers to look red to standard perceivers in standard circumstances and is had by everything so disposed. The necessity here is de dicto, not de re, for the truth is a truth de conceptu, not de re. A property answers to our concept of redness only if it disposes its bearers to look red to standard perceivers in standard circumstances and is had by everything so disposed. It is not, however, an essential feature of the property (if any) that is in fact redness that it so disposes its bearers. On the contrary, that is a contingent feature of it, for the property (if any) that answers to our (unrigidified) concept of redness might not have.
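The contrast between the de dicto and the de re claim can be displayed schematically. The predicate letters are assumptions made for illustration only: Red(P) says that P answers to our (non-rigid) concept of redness, and D(P) says that P disposes its bearers to look red to standard perceivers in standard circumstances and is had by everything so disposed.

```latex
% Assumed predicates (not in the original text):
%   Red(P): P answers to our non-rigid concept of redness
%   D(P):   P disposes its bearers to look red to standard perceivers in
%           standard circumstances and is had by everything so disposed
% De dicto necessity (affirmed): the box governs the whole conditional.
\[
  \Box\,\forall P\,\bigl(\mathrm{Red}(P) \rightarrow D(P)\bigr)
\]
% De re necessity (denied): the box would attach to the property itself.
\[
  \forall P\,\bigl(\mathrm{Red}(P) \rightarrow \Box\,D(P)\bigr)
\]
% What is affirmed instead: whatever is in fact redness has D only contingently.
\[
  \forall P\,\bigl(\mathrm{Red}(P) \rightarrow \Diamond\,\neg D(P)\bigr)
\]
```

The first and third formulas are jointly consistent. This is the formal counterpart of the claim that the truth is one about the concept rather than about the nature of the property that happens to satisfy it.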
The role of vision science
It is compatible with our functional analysis, generalized to all colours, that colours are ontologically emergent properties, that is, fundamental, irreducible properties that emerge from microphysical properties of certain complex physical wholes.21 There are, however,
19 Kripke (1980). As Kripke points out, the narrow/wide scope distinction can’t do all of the work of the rigid/non-rigid distinction.
20 To be precise, what purports to be a priori and necessary is the proposal that if redness exists, then redness is that property which disposes its bearers to look red to standard perceivers in standard circumstances and which is had by everything so disposed.
21 The British Emergentists Alexander (1920) and Broad (1925) held such a position (see McLaughlin 1991). More recently, Cornman held this view (Cornman 1974, 1975).
compelling, and often rehearsed, empirical reasons to deny that colours are fundamental properties. To note one such reason: as a matter of fact, to dispose its bearers to look some colour, a property will have to dispose them to causally produce visual experiences in a way that involves the behaviour of light. As physics is, among other things, the science of the behaviour of light, the fundamental properties that dispose things to produce effects by means of light are the concern of physics. But to explain the behaviour of light, physics has no need of the hypothesis that things are coloured; colours figure in no law of physics.22 It remains open, however, that rather than fundamental, emergent properties of certain physical wholes, colours are, in some sense, derivative from microphysical properties of certain physical wholes. Indeed on the evidence, if there are colours, they are so derivative, for the properties that affect the behaviour of light are microphysical properties or properties derivative from them. Let us call properties that are derivative from microphysical properties, ‘physical properties’. Given that redness, if it exists, is such a derivative property, it is a job for vision science to identify the physical property in question and, thereby, to locate the place of redness in nature.23 Vision science will do this by identifying the physical property that plays the redness-role. It is also a job for vision science to explain how the property in question fills the role. The explanation will include a description of the underlying mechanisms by means of which the property disposes its bearers to look red. It is thus a job for vision science to say (1) what physical property, if any, performs the redness-role, and, if there is such a property, (2) how it performs it. The tasks are related. For the explanation of how the property performs the role will figure in the justification for the theoretical identification of redness as that property. 
Subtleties aside, if and when a filler of the redness-role is found, we can reason as follows: redness is the property that plays the redness-role. Physical property φ is the property that plays the redness-role. Thus, redness is physical property φ. Redness will have, thereby, been located in nature.24 It is thus, in the end, vision science that should tell us what property, if any, redness in fact is. The value of the armchair functional analysis is that it tells us where to look to locate redness in the story about colour vision that science hopes ultimately to tell. The property, if any, cast in the redness-role is the property that is, in fact, redness; likewise for the other colours.
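The identification argument has a simple logical shape, which may be worth writing out. Two pieces of notation are assumed here for illustration: the iota-term stands for ‘the property that plays the redness-role’, with R(P) read as ‘P plays the redness-role’, and φ is a placeholder for whatever physical property vision science turns out to identify.

```latex
% Requires the amsmath package for align*.
% Assumed notation: \iota P\,R(P) = "the property that plays the redness-role";
% \varphi = placeholder for the physical property identified by vision science.
\begin{align*}
  \text{(1)}\quad & \text{redness} = \iota P\,R(P)
      && \text{functional analysis (a priori)} \\
  \text{(2)}\quad & \varphi = \iota P\,R(P)
      && \text{finding of vision science (empirical)} \\
  \text{(3)}\quad & \text{redness} = \varphi
      && \text{from (1) and (2), by symmetry and transitivity of identity}
\end{align*}
```

Since ‘redness’ is being used non-rigidly, (3) is a contingent statement of identification, and it inherits its empirical standing from premise (2).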
22 About that, Galileo and Descartes were, of course, right.
23 See the discussion of ‘location problems’ in Jackson (1998).
24 Cf. Jackson and Pargetter (1987).
Relations among the colours
According to our functional analysis, all it takes for a property to be redness is for it to play the redness-role. Some would object that this is not all that it takes, for colours participate in certain essential similarity and difference relationships. The relations in question include these:
• Red is more similar to orange than it is to blue.
• Blue is more similar to purple than it is to yellow.
• Yellow is more similar to orange than it is to green.
• Purple is reddish-blue.
• Lime is yellowish-green.
• Cyan is greenish-blue.
• Some oranges are reddish-yellow, and some are yellowish-red.
• There are no reddish-greens or greenish-reds.
• There are no yellowish-blues or bluish-yellows.
Such relationships are, it is claimed, essential to the colours in question.25 If these relationships are indeed essential to the colours, then our functional analysis fails to state a sufficient condition for being redness. For it is possible for properties R, O, and B to play, respectively, the redness-role, the orangeness-role, and the blueness-role, and yet it not be the case that R is essentially more similar to O than it is to B. Thus, if these relationships are essential to the colours, then even if vision science succeeds in identifying an occupant of the redness role, it would not follow that it has, thereby, succeeded in locating redness in nature. For the question would remain whether the property in question is essentially more similar to orangeness than it is to blueness.
Perhaps the leading philosophical objection to colour-physicalism is that no physical properties that are even remotely plausible candidates for being the colours essentially participate in these patterns of relationships.26 The problem of whether there are physical properties that so participate is called ‘the problem of unity’.27
The alleged problem is, I maintain, no problem whatsoever for colour-physicalism. And our functional analysis will do as it stands. The important point for our purposes is that the relational facts in question are, in the first instance, about what it is like for things to look the colours in question, and only derivatively about the colours themselves. These derivative facts about the colours are, moreover, captured by our functional analysis. Consider, once again, the fact that:
• Red is more similar to orange than it is to blue.
25 Meinong took such relations to be ‘internal relationships’ among the colours (see Mulligan 1991), as did Wittgenstein (1921/1961) following him. Thus, in 4.123 of the Tractatus Wittgenstein tells us: ‘A Property is internal if it is unthinkable that its object should not possess it;’ and he gives an example involving colour: ‘This shade of blue and that one stand, eo ipso, in the internal relation of lighter to darker. It is unthinkable that these objects should not stand in that relation’ (Wittgenstein 1921/1961, p. 27). Similarly, Moritz Schlick asserts: ‘relations which hold between the elements of the systems of colours are, obviously internal relations, for it is customary to call a relation internal if it relates two (or more) terms in such a way that the terms cannot possibly exist without the relation existing between them—in other words, if the relation is necessarily implied by the very nature of the terms’ (Schlick 1979, pp. 293–294). (Wittgenstein and Schlick are quoted in Thompson 1995, pp. 270–271.) More recent authors who seem to take the relationships to be essential include Campbell (1969), Hardin (1988, 1990), Boghossian and Velleman (1991), Johnston (1992), Maund (1995), and Thompson (1995).
26 See Hardin (1988, 1990), Boghossian and Velleman (1991), Johnston (1992), Maund (1995), and Thompson (1995).
27 See Campbell (1969) and Johnston (1992).
This comparative fact holds in virtue of the fact that:
• Red is more similar to orange than it is to blue in respect of what it is like for something to look the ways in question.
All else being equal (i.e. all other dimensions of visual experience held equal), what it is like for something to look red is more similar to what it is like for something to look orange than it is to what it is like for something to look blue. The comparative claim about red, orange, and blue is thus true in virtue of a comparative fact about the visual experiences in question. Colours themselves participate in the similarity and difference relationships derivatively—in virtue of the participation of the visual experiences that they dispose their bearers to produce. All else being equal, what it is like for something to look red is more similar to what it is like for something to look orange than it is to what it is like for something to look blue. That is to say, all else equal, visual experiences of red are more similar to visual experiences of orange than they are to visual experiences of blue—more similar, that is, in their phenomenal characters. If this similarity is an essential similarity, then it is necessarily the case that the kind of experience that red would dispose its bearers to produce (in standard perceivers in standard circumstances of observation) is more similar to the kind of experience orange would dispose its bearers to produce than it is to the kind blue would dispose its bearers to produce. But the necessity of this claim about red, orange, and blue would be de dicto, not de re.28 This de dicto necessary truth (assuming that it is a necessary truth) does not entail that the property that is, in fact, redness essentially participates in this comparative relation. Rather, it entails that to be redness, a property must be such that the kind of experience it disposes its bearers to produce is more similar in phenomenal character to the kind of experience orange would dispose its bearers to produce than it is to the kind blue would dispose its bearers to produce. 
A physical property will meet this condition if it plays the redness-role, for it will, then, dispose its bearers to look red. If it is de dicto necessary that the kind of experience that red would dispose its bearers to produce is more similar in phenomenal character to the kind of experience orange would dispose its bearers to produce than it is to the kind blue would dispose its bearers to produce, that would be because the truth is de conceptu: properties would answer to our concepts of redness, orangeness, and blueness only if they met this condition. That fact would be captured by the claim that redness, orangeness, and blueness must play, respectively, the redness-role, the orangeness-role, and the blueness-role.
Consider the unique/binary distinction. There are of course four ‘unique hues’: pure red (red that is not at all bluish or yellowish), pure green (green that is not at all bluish or yellowish), pure yellow (yellow that is not at all greenish or reddish), and pure blue (blue that is not at all greenish or reddish). The other hues (purple, orange, etc.) are ‘binary’
28 I’m assuming in this chapter that properties don’t have their causal roles essentially. If properties have their causal roles essentially, then the comparative claim in question is de re necessary, for the colour properties would be essentially similar in respect of certain of their causal roles.
since each is a ‘mixture’ or ‘blend’ of two unique hues.29 For instance, shades of purple are reddish-blue or bluish-red, and shades of orange are reddish-yellow or yellowish-red. ‘Balanced’ orange is characterized as a mixture of 50% red and 50% yellow.30 In contrast, unique red, for instance, contains no percentage of any other hues. Since there are no reddish-greens or greenish-reds, red and green are called ‘opponent colours’; and, similarly, since there are no yellowish-blues or bluish-yellows, yellow and blue are also opponent colours. The unique/binary distinction is drawn phenomenologically; the relevant notion of ‘mixing’ or ‘blending’ is entirely phenomenological.31 As is well known, the distinction has nothing to do, for instance, with mixing pigments or lights. Even though a mixture of blue and yellow pigments will make a green pigment, blue and yellow are opponent colours because nothing looks bluish-yellow or yellowish-blue. Similarly, even though the mixing of red light that is somewhat yellowish with green light that is somewhat yellowish will produce yellow light,32 no shade of yellow appears reddish-green or greenish-red—indeed it seems that nothing appears reddish-green or greenish-red. It is because there is a shade of yellow that things can look without looking at all greenish or reddish (or any other combination of hues), that that shade, pure yellow, counts as a unique hue. Orange is a binary hue in that things that look orange look somewhat reddish and somewhat yellowish; orange is characterized as a ‘perceptual mixture’ of red and yellow. Our functional analysis captures facts of unique–binary relations as it stands. To be orange, for example, a property must meet the following condition: it must be such that the things that have it are disposed to look somewhat similar in hue to the way red things would look, and somewhat similar in hue to the way yellow things would look. 
This does not entail that it is an essential property of orange itself that it be such. The necessity is de dicto, not de re, for the truth is de conceptu, not de re. The condition is one that a physical property must meet to satisfy our concept of orangeness. A property will meet this condition if it plays the orange-role, for to play that role it must dispose its bearers to look orange; and that will guarantee that it meets the condition. As concerns opponent colours, if, for instance, red and green are indeed opponent colours, and so there is no reddish-green or greenish-red, then that fact would be captured by the fact that no property disposes its bearers to look reddish-green or greenish-red. And if, controversially, nothing could look reddish-green or greenish-red, then there couldn’t be a property that disposes its bearers
29 Hardin (1988, p. xxviii) reports: ‘The classic experiment is by Sternheim and Boynton (1966), in which it is demonstrated that subjects who are forbidden the use of the label “orange” are able fully to describe orange-looking stimuli by using the labels “yellow” and “red”, whereas denying them the use of, say, “yellow” while permitting “red”, “orange”, and “green” results in a deficiency in the description of the stimuli that look yellow. Subsequent studies using the Sternheim–Boynton method showed that red and blue are also elemental. However, Fuld, Werner, and Wooten (1983) concluded that brown might be an elemental color . . . although this conclusion was subsequently re-examined, retested, and rejected by Quinn, Rosano, and Wooten (1988).’
30 Or so, at any rate, it is classified in the Swedish Natural Color System.
31 This phenomenological distinction is, of course, due to Hering (1878/1964).
32 If the red light and green light are both neutral, then the mixture will be desaturated red, desaturated green, or white, depending on the mixture ratio.
to look reddish-green or greenish-red, and so it would be impossible for reddish-green or greenish-red to exist.33 It may be that some of the claims we’ve been discussing (e.g. that there is no reddish-green, that nothing looks reddish-green) will be disproved by experience. But should it prove to be the case that the properties that are in fact the colours do not essentially participate in the similarity and difference relationships in question, that alone would give us no reason to reject any of the comparative claims about the colours. Nothing we could learn about the properties that are in fact the colours that is independent of how they dispose things to look would refute any of these claims. The reason is simply that if the comparative claims are true, they are so in virtue of what it is like for things to look the colours in question. Of course, it could turn out that the property that is, in fact, redness is essentially overall more intrinsically similar to the property that is, in fact, orangeness than it is to the property that is, in fact, blueness; likewise, it could turn out that the property that is, in fact, orangeness is in some sense a mixture or blend of the properties that are, in fact, yellowness and redness. There could be a structure of essential relationships among the properties that are, in fact, the colours, which is isomorphic to the structure of essential relationships among colour experiences. But if that proved to be the case, it would be a happy coincidence. The fate of colour-physicalism does not depend on it.
33 The claim that nothing appears reddish-green is controversial, and the claim that nothing could appear reddish-green, even more so. Crane and Piantanida (1983) conducted an experiment in which they claim the boundary between two bars, one red the other green, could be made to appear reddish-green by stabilizing the eyes in a certain way. But their results are controversial. Hardin (1988, p.
xxix) reports: ‘At an Optical Society conference on color appearance . . . I was able to interview four visual scientists, including Piantanida, who had looked at the boundary between the red and green bars as it was stabilized by means of the eye-tracker. I asked them to describe, in as much detail as they could, the color appearance they had seen. Piantanida’s reply was unequivocal: It was unproblematically a texture-free red-green. The answers from the other three were considerably more guarded. One said that it had a muddy, brownish quality, looking rather like certain regions of a post-Christmas poinsettia leaf in transition from red to green. (This is very similar in appearance to the autumn leaf my wife produced one day when I asked her if there could be reddish green. The surface of such leaves is of course an intermingling of tiny red spots and green spots.) Two others agreed that it was a color, and not quite like any they had seen before. However, they were not immediately inclined to label it “reddish green”, as they would have been immediately inclined to label an orange “reddish yellow”, or a cyan “greenish-blue”. Rather, when they were pressed to attach a color name to it, “reddish green” seemed to them to be more appropriate than any other label, since the quality region seemed to resemble the red bar that was on the red side but also seemed to resemble the green bar that was on the other. More than one person saw it as dark and muddy and remarked that the hues of dark and muddy colors are typically difficult to judge. Although nobody believed that what is known of the workings of the human color-vision system absolutely rules out the possibility of such visual experience as the one Piantanida claimed to have had, none of the three seemed sufficiently impressed by what he or she had seen to want to bother replicating the experiment, even though eye-trackers are now more common than they were at the time. 
There have been no published follow-up studies by Piantanida or by anyone else.’ There has recently been a follow-up study by Billock et al. (2001) that supports Crane and Piantanida’s findings. But the issue is not yet definitively settled. I’ve not seen the bars myself. (I have, however, examined poinsettia leaves in transition from red to green and while they indeed have an intermingling of red and green spots, I can say with complete confidence that they don’t look reddish-green or greenish-red.) I’ve spoken to one vision scientist who has seen the bars, and he reported that the boundary between them appeared dark and muddy and was difficult to classify. He also said it produced ‘a surprising visual sensation,’ but added that it was not the most surprising sensation he had ever had. When I asked him what was the most surprising sensation he had ever had, he replied ‘My first orgasm’.
In summary, then, the comparative facts about colours listed above pose no problem whatsoever for the view that colours are physical properties. And that is all to the good since, as we’ll now see, colour-physicalism faces problems enough.
The problem of standard variation
Empirical investigation has led some colour scientists to colour irrealism.34 The reason, I submit, is that their investigations have led them to despair of finding physical properties that play the colour-roles. If our basic proposal is on the right track, then there are three basic problems that a defence of colour-realism faces. I’ll call the first ‘the problem of standard variation’, the second ‘the problem of common ground’, and the third ‘the problem of multiple grounds’. Let us turn to the first. (The other problems will be discussed in the next section.) Recall our first-pass understanding of standard visual perceivers and standard circumstances of visual observation as, respectively, the kinds of visual perceivers and circumstances that count as normal by actual everyday standards. There are variations in the state of adaptation of eyes and/or in lighting conditions, surrounds, intervening media, and the like, that count as within the normal range by our actual everyday standards, yet which will affect certain ways things look in colour. The problem, then, is that for those ways things look in colour, there will be no physical property that disposes its bearers to look that way in colour to all perceivers that count as normal in every circumstance that counts as normal by actual everyday standards. To illustrate, consider first of all that often things will look certain ways in colour by looking other ways in colour. For instance, something will look red by looking some shade of red. By a basic way something looks in colour, let’s mean a way something looks in colour but not by means of looking some other way in colour. The point to note for present purposes, then, is that, as concerns these basic ways things look in colour, there will be no physical properties that play the relevant colour-roles.
The reason is that for no such basic way something might look in colour will there be a physical property that disposes its bearers to look that way in colour to all perceivers that count as normal in every circumstance that counts as normal by actual ordinary standards. A wide variety of lighting conditions fall within the normal range by actual everyday standards. As lighting conditions change over the course of a typical sunny summer afternoon, a particular tomato may look, perhaps, a dozen different basic shades of red due to shifts in lighting that fall within the normal range.35 (A basic shade of colour is a shade of which there are no more determinate shades.) Suppose that red12 is one of the basic shades in question. Then, there is no property that disposes its bearers to look red12 to all perceivers that count as normal in every circumstance that counts as normal by our actual ordinary standards. The same is true of most non-basic shades. Unique blue (blue that is not at all greenish or reddish) isn’t a basic shade since there are shades of unique blue that vary in saturation (the amount of blue in them) and lightness (how light or dark they are).
34 See, for example, Kuehni (1997).
35 This is true in the phenomenal use of ‘looks’, despite the facts of colour constancy.
The dress may look one shade of unique blue in the lighting of the shop, another in the bright sunlight of the street, and yet another by the light inside the restaurant, even though each lighting condition falls within what counts as the normal range by actual ordinary standards. Even in circumstances that count as normal by everyday standards, lighting conditions are continually varying and, as a result, so are the states of adaptation of the eyes of perceivers. The result is subtle shifts in the colours things look. Consider also the effects of surround on colour appearance called ‘simultaneous contrast effects’. Various surrounds that would count as normal by ordinary standards can affect the way a surface looks in colour. For those ways a surface can look in colour, there will be no physical property that disposes its bearers to look that way in colour to all normal perceivers in every normal circumstance by everyday standards, for how a physical property will dispose its bearers to look to such normal perceivers in such normal circumstances will depend on what surrounds are present. It should be noted that it won’t do to count all colour experiences that result, in part, from contrast effects as illusory. Black, white, brown, navy blue, and olive are ‘contrast colours.’ Nothing looks any of these colours when seen in isolation. Were we to reject colour appearances that result from contrast as illusory, colour illusion would be rampant. It might be thought that the problem of standard variation arises only when we consider normal perceivers and normal circumstances by everyday standards. After all, while our ordinary colour talk presupposes notions of a normal perceiver and normal circumstances, folks don’t normally assume that there is some circumstance that is revelatory of the true colours of things to certain perceivers whose experiences are wholly objectively authoritative. 
Indeed, upon reflection, folks can recognize that our everyday notion of a standard observer and standard circumstance of observation is at least to some extent arbitrary, a matter of choice rather than discovery. Moreover, ordinary standards of normality can vary somewhat from one conversation context to another. It is important to see, however, that the problem of standard variation will arise however we pick out normal perceivers and normal circumstances, so long as what counts as the range of normality allows for shifts in colour appearance. No matter how the standards of normality are determined, either normal perceivers and normal circumstances can vary in ways that affect what colour something is disposed to look, or else not. If there can be such variation, then the problem of standard variation will arise. As concerns the specific variable ways something can look in colour to a normal perceiver in normal circumstances, there will be no physical property that disposes its bearers to look that way in colour to all normal perceivers in every normal circumstance of visual observation. To avoid this result, the notions of normal perceiver and normal circumstances would have to be such that no such variation is possible. But for that to be the case, the notions would have to be so specific as to exclude any variation in such observers or circumstances that could affect the way things can look in colour to them in such circumstances. It would almost certainly be the case, then, that at the very most only one human being would be a normal perceiver (and then, only at a time) and the normal circumstance would almost certainly be one that none of us has ever been in. Suffice it to note that it seems absurd to think that the colours things look to such completely specific kinds of perceivers in such completely specific kinds of circumstances are objectively true colours of things simpliciter, and that the rest of us are merely experiencing colour illusions.
Given how natural it is to think of colours as relativized to kinds of perceivers and kinds of circumstances, it seems appropriate to say that how things look in colour to perceivers of a certain completely specific kind in circumstances of a certain completely specific kind is how things are in colour for them in such circumstances. This is a radical relativization of colour. It seems to me the right response to the problem of standard variation.36 Rather than embracing colour irrealism, we should handle the problem of standard variation by radically relativizing the colours; that is, we should relativize them to kinds of perceivers and circumstances so specific as to leave no room for variation in colour appearance.37 Thus, we can offer the following refined functional analysis: Relativized colours: redness for a visual perceiver of type P in circumstances of visual observation C is that property which disposes its bearers to look red to P in C, and which is had by everything so disposed.
Here the type of perceiver is to be specified not only in terms of the kind of visual system possessed, but also in terms of very specific states of the system, including, for instance, the state of adaptation of the perceiver’s cones; and the circumstances are to be specified in terms of exact lighting conditions, distance, intervening medium, angle, surround, and so on. The type of perceiver, P, and type of circumstance, C, should be as specific, but no more specific, than is required for it to be the case that exactly the same kinds of things would be disposed to look exactly the same way in colour to any perceiver of type P in any circumstance of type C. On this relativized notion of colour, it could happen that the physical property that is red for Ps in C is some other colour entirely for a different kind of perceiver, P∗, in C (or for Ps in a different kind of observation circumstance, C∗). There is no contradiction in saying, for example, that physical property φ is red for Ps in C and (say) blue for P∗s in C (or for Ps in C∗). Completely relativized colours are akin to such highly relative properties as being too heavy to lift, being too high to jump over, being too small to fit in, being a walking path, and the like.38 Just as something may be too heavy to lift for P in C but not too heavy to lift for P in C∗, so something may be red for P in C but fail to be red for P in C∗. Our ordinary talk of the colours of things should be given a semantic treatment analogous to the treatment of our ordinary talk of things being too heavy to lift, too high to jump over, etc.39
36 Much has been written about the relativity of sensory qualities in the literature on dispositionalism. Bennett’s (1968) gustatory example is much discussed. He claims that phenylthiourea (now known as thiocarbamide, PTC) tastes bitter to a minority of the human population, but is tasteless to the majority. If this is right, it seems that we could perhaps reverse the majority and minority by selective breeding. It thus seems natural to say that phenylthiourea is bitter for the minority of the population and tasteless for the majority. While Jackson has himself appealed to this example to justify relativization (Jackson and Pargetter 1987), he has recently remarked that it may be that phenylthiourea produces a chemical in the mouths of the minority and that it is this chemical that they taste, not phenylthiourea (Jackson 1998). One might respond that, be that as it may, it is still the case that were it phenylthiourea itself that tastes bitter to a minority and is tasteless to a majority, it would seem right to say simply that it is bitter for some, tasteless for others.
37 Cf. Campbell (1982, 1993), Jackson and Pargetter (1987).
38 Cf. Campbell (1993, p. 251).
39 For further defence of the relativization proposal, see McLaughlin (2003).
The claim that the only colours that exist are completely relativized colours (hereafter, simply ‘relativized colours’) is, of course, compatible with the occurrence of colour hallucination and colour illusion. Someone suffering from delirium tremens might visually hallucinate a pink elephant in the middle of the room. The person’s visual experience of pink is not the manifestation of any object’s disposition to look pink (even to that person in the circumstances in question), so there doesn’t exist anything that looks pink to the person. Similarly, since it involves having an after-image, the Bidwell’s ghost phenomenon (Bidwell 1901) would count as a localized visual hallucination of colour. While it appears as if there is an instance of a certain colour on the spinning disk, it is not the case that the spinning disk looks the colour in question to the observer. The situation is, rather, like one in which one has a red after-image while staring at a white wall after having viewed something that has a saturated green colour.40 In such a situation, despite the fact that there appears to be an instance of colour (a red patch) in the vicinity of the wall, it is not the case that there exists anything in the vicinity of the wall that looks red to one. One’s colour experience is not the manifestation of any object’s disposition to look red (even to one in the circumstances in question). Such a colour experience is hallucinatory. Colour illusion is also possible. One undergoes a colour illusion when something looks some colour to one that it isn’t. From the fact that something looks red to P in C, it does not follow that it is red for P in C. For, if our analysis is correct, a further condition must be met for it to be red for P in C. The further condition is that some property that disposes it to look red to P in C be shared by everything (i.e. every nomologically possible thing) that is so disposed.
The problem of common ground and the problem of multiple grounds
The so-called problem of metamers is much discussed in the literature on colour-physicalism. When different physical properties dispose their bearers to look the same colour to a type of perceiver, P, in a type of circumstance, C, the properties are metameric in C for P. (Metameric matching is in respect of hue, saturation, and lightness.) On the evidence, no matter how specific the type of human perceiver, P, and type of circumstances, C, if one physical property can dispose its bearers to look some colour to P in C, more than one physical property can. The term ‘the problem of metamers’ suggests that the existence of metamers is itself a problem for colour-physicalism. It isn’t. The existence of metamers doesn’t entail that no physical properties play the relativized colour-roles. For to play the role of being (say) red for P in C, more is required than being a property that disposes its bearers to look red to P in C. It must also be the case that the property in question is common to everything so disposed. Thus, from the fact that there is no unique physical property that disposes its bearers to look red to P in C, it doesn’t follow that no physical property plays the role of being red for P in C.
40 A colour physicalist must, of course, deny that when one is having a red after-image, it is thereby the case that there exists something such that one is seeing it and it is red, for there will be no (relevant) physical thing that is red. Having after-images must be treated as hallucinatory experiences (or illusory experiences); otherwise physicalism is false. The physicalist acknowledges that there really are states of having red after-images, but denies that such states involve the existence of something that is red.
The real problem here for colour-physicalism
is that there may be no basis for the disposition to look red to P in C that is common to everything (i.e. everything nomologically possible) that is so disposed. This is the common ground (basis) problem, the second basic problem for colour-physicalism. Of course, just as there may be no common basis for the disposition to look red to P in C, there may also be more than one. If there is more than one, then it’s not the case that there is some property that is the property that disposes its bearers to look red to P in C and is common to all things so disposed. This is the multiple grounds problem, the third basic problem for colour-physicalism. As concerns the multiple grounds problem, if there is more than one common basis for the disposition to look red to P in C, but the bases participate in relations of determinable to determinate, then we can happily revise our functional analysis to say that redness for P in C is that most determinate property that disposes its bearers to look red to P in C and is common to everything so disposed. What, however, if the multiple common bases do not all participate in relations of determinable to determinate? Then, it seems the thing to say is that our relativized colour concepts suffer from indeterminacy of reference. Because of vagueness, such indeterminacy is, I believe, to be expected. But referential indeterminacy should no more lead us to irrealism about colour than it should lead us to irrealism about whether there are bald people,41 or the like, which is to say that it should not lead us in that direction at all. 
While I lack the space here for a proper discussion of semantic indeterminacy, suffice it to note that there are logico-semantic techniques that have the result that, despite semantic indeterminacy, sentences such as ‘Something is red if and only if it has the property that plays the redness-role’ and ‘The term “redness” refers to redness, if redness exists’, turn out to be determinately true.42 Moreover, these techniques accommodate the possibility that the following sentences may be determinately true, despite referential indeterminacy: ‘There is a physical property that is redness’ and ‘Physical property φ is redness’. Rather than pursuing this semantic issue here, however, I will instead focus on the problem of common ground. I believe that it is the most serious problem facing colour-physicalism. If anything is disposed to look red to P in C, there will, of course, be many heterogeneous chemical properties that dispose their bearers to look red to P in C; many heterogeneous chemical properties will be metameric with respect to a given perceiver P in given circumstance C.43 Indeed many heterogeneous chemical properties will be isomeric for P, that is, whatever way in colour the one disposes its bearers to look to (a completely specific kind of perceiver) P in a (completely specific) observational circumstance C, the other will dispose its bearers to look to that P in that C.44
41 It is indeterminate which people ‘bald people’ refers to, since, as concerns some individuals, it is indeterminate whether they count as bald.
42 See the supervaluational treatment of referential indeterminacy in McGee and McLaughlin (2000).
43 See Nassau (1980, 1983).
44 The reason is that many heterogeneous chemical properties can endow a surface with exactly the same spectral reflectance distribution, and certain sorts of surfaces with the same spectral reflectance distribution are such that whatever way the one sort of surface looks in colour to a P in a C, the other will look in colour to that P in that C (see the discussion below).
While the disjunction of all such chemical properties (if there are disjunctive properties) will be common to everything so disposed,
the heterogeneous disjunctive property will not dispose anything to look red to P in C.45 Thus, vision science looks to more abstract properties such as light dispositions. A disposition to reflect or emit light predominately of a certain wavelength will be a basis for the disposition to look a certain colour to P, pretty much only in aperture settings in which contextual elements are eliminated, and so only in an extremely narrow range of specific circumstances. Moreover, metamerism is rampant as concerns such light dispositions in such settings. As Hardin (1988, 1990) notes, ‘if an observer’s unique yellow were at 575 nm on the spectrum [in such a setting], an appropriate mixture of spectral 550 nm and 650 nm light would match it exactly’. Indeed, there will be indefinitely many heterogeneous light dispositions of the sort in question that will dispose something to look unique yellow (yellow that is not at all greenish or reddish) to P in such a circumstance. Reflectance is measured by the ratio of reflected light to incident light. The spectral reflectance distribution of a surface is measured by the ratio of reflected light to incident light at each point on the surface throughout the visual spectrum. Now certain kinds of surfaces are such that when two of them have exactly the same spectral reflectance distribution, they will be isomeric. Metamerism, however, is rampant. Two such surfaces can have different spectral reflectances and yet look the same in colour to a certain kind of perceiver in a certain kind of circumstance. Moreover, it is not, in general, true that if two surfaces have exactly the same spectral reflectance distribution, then they are isomeric. 
Fluorescent and phosphorescent surfaces emit light as well as reflect it.46 A non-fluorescent surface can have the same spectral reflectance as a fluorescent surface, yet the surfaces not look the same in colour to P in C, since the fluorescent surface emits light in addition to what it reflects, so that its reflectance has a limited effect on its colour appearance; and similarly for a phosphorescent surface. 45 The idea of identifying colours with disjunctions of physical properties was suggested by Smart (1961, 1963); see also Jackson and Pargetter (1987), Jackson (1996), Cohen (2000), and Ross (2001). Such theorists can claim that while it is not always the case that the disjunction of the bases of a disposition is itself a basis of the disposition, this is sometimes the case. Thus, it might be claimed that while disjunctions of ‘heterogeneous bases’ for a disposition are not themselves bases for the disposition, disjunctions of ‘homogeneous bases’ are. Here they might appeal to the ‘naturalness’ of the disjunction. If this idea can be defended, then there may be a common disjunctive base for the disposition to look red to P in C; whether there is will depend on whether the kinds of things that have the disposition have bases that are ‘homogeneous’, so that the disjunction of them is appropriately natural. This seems the way for so-called ‘disjunctive physicalists’ to go. If it works, all well and good from my point of view, I’d like to see colour-physicalism vindicated since, by my lights, colour realism can be vindicated only if colour-physicalism can. (There is nothing else for colours to be than physical properties.) But I’m sceptical of this disjunctive strategy. First, as I’ve already mentioned, I find the notion of disjunctive properties problematic, where ‘property’ is used in a non-pleonastic sense (see Armstrong 1979). 
Secondly, I find no clear cases of something’s having a disposition, D, in virtue of having either A or B, that aren’t simply cases in which, strictly speaking, something has D either in virtue of having A or in virtue of having B. 46 Luminescence is the disposition to emit light when exposed to light. While phosphorescence is a ‘slow’ luminescence (light emission starts relatively slowly and subsides slowly, so that a phosphorescent surface can glow for a while in the dark), the luminescence of a fluorescent surface is immediate. Whiteners remove the yellowish look from old white shirts by adding fluorescence in the short wavelength range (Nassau 1983, p.18). Thus the surfaces of white shirts treated with whiteners are fluorescent; so are the surfaces of red marks made by highlighting pens. The surfaces of some emeralds, for example, are phosphorescent (Jakab 2001, cites the example of emeralds).
While they can reflect light, translucent volumes, such as a volume of red wine and red sunsets, have spectral transmission distributions.47 If two translucent volumes share exactly the same spectral transmission distribution, then they will look exactly the same in colour to P in C. However, once again, metamerism is rampant. For a given perceiver, P, and circumstance, C, there may well be many distinct spectral transmission distributions that are metameric for P in C. Neither spectral reflectance distributions nor spectral transmission distributions play even the relativized colour-roles. The problem of common ground is formidable indeed.
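The metameric match that Hardin describes can be illustrated with a toy calculation. The sketch below is in Python, and everything specific in it is an assumption made for illustration: the table of relative cone sensitivities is invented rather than measured, and the names `cone_response` and `matching_mixture` are hypothetical. It further assumes that S-cone sensitivity is negligible at the three wavelengths, so that matching the L and M responses suffices for a match.

```python
# Hypothetical relative cone sensitivities (L, M, S) at three wavelengths in nm.
# These values are illustrative assumptions, not measured data.
CONE_SENSITIVITY = {
    550: (0.93, 0.98, 0.00),
    575: (0.97, 0.72, 0.00),
    650: (0.20, 0.03, 0.00),
}


def cone_response(light):
    """Return (L, M, S) responses to a light given as {wavelength_nm: power}."""
    totals = [0.0, 0.0, 0.0]
    for wavelength, power in light.items():
        for i, sensitivity in enumerate(CONE_SENSITIVITY[wavelength]):
            totals[i] += power * sensitivity
    return tuple(totals)


def matching_mixture(target_nm, first_nm, second_nm):
    """Solve for powers (a, b) of two lights whose combined L and M
    responses equal those of unit-power light at target_nm.

    Uses Cramer's rule on the 2x2 system; S is ignored because it is
    assumed to be zero at all three wavelengths.
    """
    l1, m1, _ = CONE_SENSITIVITY[first_nm]
    l2, m2, _ = CONE_SENSITIVITY[second_nm]
    lt, mt, _ = CONE_SENSITIVITY[target_nm]
    det = l1 * m2 - l2 * m1
    a = (lt * m2 - l2 * mt) / det
    b = (l1 * mt - lt * m1) / det
    return a, b


# Mixture of 550 nm and 650 nm light that matches 575 nm light.
power_550, power_650 = matching_mixture(575, 550, 650)
single = cone_response({575: 1.0})
mixture = cone_response({550: power_550, 650: power_650})
```

On these assumed sensitivities the solved powers are both positive, so the mixture is a physically realizable light. It prompts the same cone responses as the 575 nm light alone, although the two stimuli differ physically; that is all a metameric match requires.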
A promising strategy for solving the problem of common ground
In this final section, I want to sketch what is, I believe, a promising strategy for handling the problem of common ground in defence of colour-physicalism. To begin, note that we’re primarily concerned here with whether our colours are real, that is, whether there exist colours for humans. We’re primarily concerned with the prospects of what David Hilbert (1987) calls ‘anthropocentric realism for colours’. Normal humans have three kinds of cones: L (long wavelength) cones, M (medium wavelength) cones, and S (short wavelength) cones. It thus seems promising to look for light dispositions that somehow involve three bandwidths.48 The question is how to find the relevant ones. A promising strategy is to appeal to results from opponent-processing theory,49 the leading neuro-computational theory of colour vision, to try to locate the colours among light dispositions.50 Opponent-processing theory postulates computational mechanisms in the visual system that take as input the output of our three types of cones.51 According to the theory, there are pairs of opponent information channels, where the activity in one channel inhibits activity in an opponent channel. The pairs of channels are the RED and GREEN opponent channels, and the BLUE and YELLOW opponent channels. Subtleties aside, the RED and GREEN channels involve a differencing of the outputs of the L and M cones; and the BLUE and YELLOW channels involve a summing of the outputs of the L and M cones and a differencing of that sum and the output of the S cones.
47 It might be thought that the redness of a sunset is an illusion. It is often said that the blueness of the sky is. I myself see no reason to say that. We can see translucent volumes. A volume of the atmosphere can have a property that disposes it to look blue to us. The scattering of short light waves in the volume of atmosphere causes the blue appearance. The blue appearance of the eyes is likewise due to the scattering of short light waves. Of course, it is enormously vague where the volume of atmosphere that we see begins and ends. But it is also vague where the volume of blue eye that we see begins and ends; indeed it is vague where anything we see begins and ends. I see no reason to count either the blueness of the sky or the redness of a sunset as an illusion.
48 Appeals to this sort of idea can be found in P. M. Churchland (1985, 1986), P. S. Churchland (1986), Hilbert (1987) (where the idea is appealed to in order to try to solve the problem of unity), Gibbard (1996), Lewis (1997), Jackson (1998).
49 See Jameson and Hurvich (1955, 1956), Hurvich and Jameson (1955, 1956, 1957), and Hurvich (1981).
50 This strategy is suggested by Byrne and Hilbert (1997, p. 265 and p. 282, n. 9). Tye (2000) explores their suggestion in some detail (see footnote 54).
51 It should be acknowledged that it is not known how opponent processing is implemented in the nervous system.
Any spectral stimulus that affects the cones in such a way as to activate the RED channel, for instance, will
inhibit its opponent channel, the GREEN channel; and likewise for the BLUE and YELLOW opponent channels. According to opponent-processing theory, that is why nothing looks reddish-green or bluish-yellow.52 Here is the basic strategy for appealing to opponent-processing theory to locate colours among light dispositions: as concerns any colour, C, look for a light disposition that, when activated, would affect the opponent-processing system in a manner that will produce a visual experience of C. The strategy thus involves appealing to opponent-processing theory to try to find ‘structure in the light’ (to use a Gibsonian phrase) that is supplied to the eyes. The idea is that colours are dispositions to supply light with the relevant structure. According to opponent-processing theory, when the RED opponent channel is activated, the GREEN channel inhibited, and the YELLOW and BLUE channels are in equilibrium, the subject will have a visual experience of unique red. In rough outline, then, the idea is that unique red is the disposition to supply light to the eyes that prompts cone responses that would trigger the activation of the RED channel and inhibition of the GREEN channel, while leaving the YELLOW and BLUE channels in equilibrium. We locate unique red among light dispositions by determining what light structure plays the role in question. Unique red is the property of being disposed to supply light with that structure. Byrne and Hilbert (1997, p. 265) claim that, in the case of surfaces, such dispositions will be types of spectral reflectances, and restrict the strategy to surface colours. Following their suggestion, Tye (2000) develops this idea in some detail, speaking of surface colours as such triples of integrated reflectances. Fluorescent and phosphorescent surfaces, we will recall, emit light; and translucent volumes transmit light. 
The general strategy can, however, be extended to them, even though their reflectance properties have only a limited effect on their colour appearance. One can treat emitted or transmitted light as if it were part of the reflectance of a surface or transmittance of a volume in order to obtain cone responses: one can multiply the sum of the transmittance and emission factors with the cone response function.53 It is inaccurate, however, to talk of reflectances as what are integrated. In the case of fluorescent and phosphorescent surfaces, what is integrated for cone responses is the spectral power of the light entering the eye, weighted by the spectral cone sensitivity functions. The spectral power of the light entering the eye is affected by the spectral power of the light striking the surface, the surface’s reflectance function, and the emission resulting from any fluorescent or phosphorescent properties; and comparably for translucent volumes. The point nevertheless remains that we can try to locate colours among light dispositions by appealing to results from opponent-processing theory. While fluorescent surfaces or non-fluorescent transparent volumes, phosphorescent surfaces, and non-fluorescent, non-phosphorescent surfaces all interact with light in different ways, they can all be disposed to interact with light in a way that prompts cone responses that trigger opponent-processing mechanisms, the operation of which results in, for instance, a visual experience of unique red; and likewise for other colour experiences.
52 What about Crane’s and Piantanida’s experiment discussed earlier? As Hardin (1988) notes: ‘Crane and Piantanida interpret [their] findings to mean that the opponent channels, which are the normal conduits of color information, do not extend all the way up the visual processing chain, and that the opponency may be superseded by the filling-in mechanism which is known to lie within the brain itself’. In any case, their results are, as I noted, highly controversial.
53 I owe thanks here to Rolf Kuehni.
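The path from light to opponent signals described above can be sketched in a few lines of Python. This is a deliberately linear toy model under stated assumptions, not an implementation of opponent-processing theory: the three wavelength bands, the sensitivity weights, and the function names `light_entering_eye`, `cone_responses`, and `opponent_signals` are all hypothetical, and the non-linear computations that a realistic model would need are left out.

```python
# Hypothetical cone sensitivities over three coarse wavelength bands
# (short, middle, long). Illustrative assumptions, not measured data.
SENSITIVITIES = {
    "L": [0.05, 0.90, 0.30],
    "M": [0.10, 0.95, 0.05],
    "S": [0.90, 0.02, 0.00],
}


def light_entering_eye(illuminant, reflectance, emission):
    """Spectral power per band: reflected light plus any emitted light."""
    return [e * r + m for e, r, m in zip(illuminant, reflectance, emission)]


def cone_responses(spectral_power, sensitivities=SENSITIVITIES):
    """Spectral power weighted by each cone's sensitivity, summed over
    bands (a discrete stand-in for the integral)."""
    return {
        cone: sum(p * s for p, s in zip(spectral_power, weights))
        for cone, weights in sensitivities.items()
    }


def opponent_signals(cones):
    """Linear opponent signals: L - M, and (L + M) - S."""
    red_green = cones["L"] - cones["M"]
    yellow_blue = (cones["L"] + cones["M"]) - cones["S"]
    return red_green, yellow_blue


illuminant = [1.0, 1.0, 1.0]
reflectance = [0.2, 0.3, 0.8]

# Same reflectance, but the second surface also emits long-wavelength light.
plain = cone_responses(light_entering_eye(illuminant, reflectance, [0.0, 0.0, 0.0]))
fluorescent = cone_responses(light_entering_eye(illuminant, reflectance, [0.0, 0.0, 0.5]))

plain_signals = opponent_signals(plain)
fluorescent_signals = opponent_signals(fluorescent)
```

The two surfaces share a reflectance yet yield different cone responses and opponent signals, which is why what is integrated is the light entering the eye rather than reflectance as such. On this sketch, a candidate for unique red would be a stimulus for which the first signal is positive and the second is at zero.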
Recall the problem of standard variation. Because of the effects of lighting, surround, and the like, there is, of course, no single colour that any light disposition of a surface or volume will dispose its bearers to look, even to a completely specific kind of human visual perceiver, in every observational circumstance that counts as normal by actual everyday standards. Moreover, there is no single colour that any light disposition will dispose its bearers to look, even in a completely specific kind of observational circumstance, to every human visual perceiver that counts as normal by actual, everyday standards; for it’s extraordinarily unlikely that any two normal human perceivers will have exactly the same cone sensitivities, for instance; and cone sensitivities will vary for an individual over time. Indeed, it is unlikely that any two normal perceivers have precisely the same opponent-processing system. We’ve already seen, however, that the problem of standard variation can be handled by radical relativization. We’ve eliminated these concerns by relativizing colours to completely specific types of human perceivers and types of observational circumstances.54 While many questions remain, it is, I believe, fair to say that given the promise of opponent-processing theory, the strategy discussed above may well yield a solution to the problem of common ground for (completely) relativized colours. Thus, while the issue has by no means been settled, there is, I believe, reason for optimism as concerns realism about relativized colours for humans.
54 The only actual attempt to employ this strategy to locate colours among light dispositions is Tye’s attempt (Tye 2000). Following Byrne and Hilbert’s (1997) brief suggestive remarks, Tye attempts to employ the strategy for red, green, yellow, blue, black, and white, and for the four broad ranges of binary hues: orange, purple, yellowish-green, and bluish-green. As Jakab (2001) empirically demonstrates, however, Tye’s proposals fail in their predictions about how coloured plastics and white ceramic tiles would look in colour. As I’ll use the linguistic context ‘– / . . .’, what occupies the dash position will be a description of a coloured item and what occupies the dots position will be the name of the colour that Tye’s proposals predict the item’s colour would be. Jakab’s results, then, are these: red Lego block/orange, red plastic boat/orange, green Lego block/yellow, green plastic boat/yellow, yellow Lego block/orange, yellow plastic boat/orange, blue Lego block/yellow, blue plastic boat/yellow, white ceramic tile/orange. Thus, Tye’s proposals fail in their predictions. Now Tye says that his proposals stand or fall with opponent-processing theory. However, opponent-processing theory doesn’t fall with his proposals. The main reason is that, as he himself notes, he frames his proposals by reference to the very oversimplified model of opponent processing described in Hardin (1988), a model that does not specify the various non-linear computations performed by the opponent-processing system or take into account cone sensitivities (which are themselves non-linear). [Hardin himself (1988, p. 35) lists the ways in which his model is oversimplified.] It should be noted, however, that proposals could be formulated that take into account the factors in question. Whose cone sensitivities should be taken into account? On our relativized proposal, the cone sensitivities will be determined by the choice of a specific type of perceiver, P.
Acknowledgements
This chapter is an abridged and revised version of several sections of McLaughlin (2003). Drafts of this paper were presented at the University of Saint Andrews in spring 1996, to the ZiF group in Bielefeld, Germany in June of 1997, at a symposium on colour at the Southern Society for Philosophy and Psychology in spring 1998, at the World Congress of Philosophy in Boston in the summer of 1998, at the University of Sheffield in spring 2000, at the University of Washington, Saint Louis in fall 2000, to the Psychology Department at the University of British Columbia in spring 2001, and in the Cognitive Science Distinguished
Speakers Series at the University of Carleton in spring 2002. I wish to thank both the ZiF Institute and the audiences on these occasions. Much of the material here was presented in my seminar on colour in spring 1997. I wish to thank the students in that seminar, especially Jonathan Cohen, Troy Cross, and Adam Wager. Finally, I wish to thank Margaret Atherton, Jonathan Cohen, C. L. Hardin, John Hawthorne, Frank Jackson, (my post-doctoral student) Zoltan Jakab, Rolf Kuehni, Cynthia Macdonald, Mohan Matthen, and Jonathan Schaffer for extremely helpful comments on earlier drafts.
References
Alexander, S. (1920). Space, time, and deity, Vol. 2. Macmillan, London.
Armstrong, D. M. (1968). A materialist theory of mind. Routledge and Kegan Paul, London.
Armstrong, D. M. (1979). A theory of universals, Vol. 2. Cambridge University Press, Cambridge.
Baldwin, T. (1992). The projective theory of sensory content. In The contents of experience, (ed. T. Crane), pp. 177–195. Cambridge University Press, Cambridge.
Bennett, J. (1968). Substance, reality, and primary qualities. In Locke and Berkeley: A collection of critical essays, (ed. C. B. Martin and D. M. Armstrong). Anchor Books, New York.
Bidwell, S. (1901). On negative after-images and their relation to certain other visual phenomena. Proceedings of the Royal Society of London B 68, 262–269.
Billock, V. A., Gleason, J., and Tsou, B. H. (2001). Perception of forbidden colors in retinally stabilized equiluminant images. Journal of the Optical Society of America 1, 2398–2403.
Boghossian, P. A. and Velleman, J. D. (1989). Colour as a secondary quality. Mind 98, 81–103.
Boghossian, P. A. and Velleman, J. D. (1991). Physicalist theories of color. Philosophical Review 7, 67–106.
Bowmaker, J. (1977). The visual pigments, oil droplets, and spectral sensitivity of the pigeon. Vision Research 17, 1129–1138.
Broad, C. D. (1925). The mind and its place in nature. Routledge and Kegan Paul, London.
Byrne, A. and Hilbert, D. R. (1997). Colors and reflectances. In Readings on color, Vol. 1: The philosophy of color, (ed. A. Byrne and D. R. Hilbert), pp. 263–288. MIT Press, Cambridge, MA.
Campbell, K. (1969). Colours. In Contemporary philosophy in Australia, (ed. R. Brown and C. D. Rollins). Allen and Unwin, London.
Campbell, K. (1982). The implications of Land’s theory of colour vision. In Logic, methodology, and philosophy of science, (ed. L. J. Cohen). North Holland, Amsterdam.
Campbell, K. (1993). David Armstrong and realism about colour. In Ontology, causality and mind: Essays in honour of D. M. Armstrong, (ed. J. Bacon, K. Campbell, and L. Reinhardt). Cambridge University Press, Cambridge.
Chisholm, R. (1957). Perceiving: A philosophical study. Cornell University Press, Ithaca.
Churchland, P. M. (1985). Reduction, qualia, and the direct introspection of brain states. Journal of Philosophy 82, 8–28.
Churchland, P. M. (1986). Some reductive strategies in cognitive neurobiology. Mind 95, 279–309.
Churchland, P. S. (1986). Neurophilosophy. MIT Press, Cambridge, MA.
Clark, A. (1993). Sensory qualities. Clarendon Press, Oxford.
Cohen, J. (2000). Color properties and color perception: A functionalist account. Doctoral Dissertation, Rutgers University.
Cohen, J. (2001). Subjectivism, physicalism, or none of the above? Comments on Ross’s ‘The location problem for color subjectivism’. Consciousness and Cognition 10, 94–104.
Cornman, J. (1974). Can Eddington’s ‘two’ tables be identical? Australasian Journal of Philosophy 52, 22–38.
Cornman, J. (1975). Perception, common sense and science. Yale University Press, New Haven.
Cosmides, L. and Tooby, J. (1995). Preface to S. Baron-Cohen, Mindblindness. MIT Press, Cambridge, MA.
Crane, H. and Piantanida, T. P. (1983). On seeing reddish green and yellowish blue. Science 221, 1078–1080.
Descartes, R. (1954). Philosophical writings (transl. by G. E. M. Anscombe and P. Geach). Nelson, London.
Fuld, K., Werner, J. S., and Wooten, B. R. (1983). The possible elemental nature of brown. Vision Research 23, 631–637.
Gibbard, A. (1996). Visual properties of human interest only. In Philosophical issues, (ed. E. Villanueva). Ridgeview, Atascadero, CA.
Hardin, C. L. (1988). Color for philosophers: Unweaving the rainbow. Hackett, Indianapolis/Cambridge.
Hardin, C. L. (1990). Perception and physical theory. In Mind and cognition: A reader, (ed. W. C. Lycan). Blackwell, Oxford.
Hering, E. (1878/1964). Outlines of a theory of the light sense, (transl. L. M. Hurvich and D. Jameson). Harvard University Press, Cambridge.
Hilbert, D. R. (1987). Color and color perception: A study in anthropocentric realism. Center for the Study of Language and Information, Stanford University.
Hurvich, L. M. (1981). Color vision. Sinauer Associates, Sunderland, MA.
Hurvich, L. M. and Jameson, D. (1955). Some quantitative aspects of an opponent-colors theory: II. Brightness, saturation and hue in normal and dichromatic vision. Journal of the Optical Society of America 45, 602–612.
Hurvich, L. M. and Jameson, D. (1956). Some quantitative aspects of an opponent-colors theory: IV. A psychological color specification system. Journal of the Optical Society of America 46, 416–421.
Hurvich, L. M. and Jameson, D. (1957). An opponent-process theory of color vision. Psychological Review 64, 384–408.
Jackson, F. (1977). Perception. Cambridge University Press, Cambridge.
Jackson, F. (1982). Epiphenomenal qualia. Philosophical Quarterly 32, 127–136.
Jackson, F. (1996). The primary quality view of color. Philosophical Perspectives 10, 199–219.
Jackson, F. (1998). From metaphysics to ethics: A defense of conceptual analysis. Oxford University Press, Oxford.
Jackson, F. and Pargetter, R. (1987). An objectivist guide to subjectivism about color. Revue Internationale de Philosophie 160, 129–141.
Jakab, Z. (2001). Color experience: Empirical evidence against representational externalism. Doctoral Dissertation, Carleton University.
Jameson, D. and Hurvich, L. M. (1955). Some quantitative aspects of an opponent-colors theory: I. Chromatic responses and spectral saturation. Journal of the Optical Society of America 45, 546–552.
Jameson, D. and Hurvich, L. M. (1956). Some quantitative aspects of an opponent-colors theory: III. Changes in brightness, saturation and hue with chromatic adaptation. Journal of the Optical Society of America 46, 405–415.
Johnston, M. (1992). How to speak of colors. Philosophical Studies 68, 221–263.
Kripke, S. (1980). Naming and necessity. Harvard University Press, Cambridge, MA.
Kuehni, R. (1997). Color. J. Wiley and Sons, New York.
Lewis, D. (1997). Naming the colours. Australasian Journal of Philosophy 75, 325–342.
Locke, J. (1690/1975). An essay concerning human understanding, (ed. P. H. Nidditch). Oxford University Press, Oxford.
Mackie, J. L. (1976). Problems from Locke. Clarendon Press, Oxford.
Magee, B. and Milligan, M. (1995). On blindness. Oxford University Press, Oxford.
Maund, B. (1995). Colours: Their nature and representation. Cambridge University Press, Cambridge, UK.
McDowell, J. (1985). Values and secondary qualities. In Morality and objectivity, (ed. T. Honderich). Routledge and Kegan Paul, London.
McGee, V. and McLaughlin, B. P. (2000). Lessons of the many. Philosophical Topics 28, 129–151.
McGinn, C. (1983). The subjective view. Oxford University Press, Oxford.
McLaughlin, B. P. (1991). The rise and fall of British emergentism. In Emergence or reduction: Essays on the prospects of nonreductive physicalism, (ed. A. Beckermann, H. Flohr, and J. Kim). de Gruyter, Berlin.
McLaughlin, B. P. (2000). Colors and color spaces. Proceedings of the Twentieth World Congress of Philosophy, Vol. 5 (ed. R. Cobb-Stevens). Philosophy Documentation Centre.
McLaughlin, B. P. (2003). Color, consciousness, and color consciousness. In New essays on consciousness, (ed. Q. Smith), pp. 97–154. Oxford University Press, Oxford.
Mulligan, K. (1991). Colours, corners and complexity. In Existence and explanation, (ed. W. Spohn), pp. 45–49. Kluwer, Dordrecht.
Nassau, K. (1980). The causes of color. Scientific American 242, 124–154.
Nassau, K. (1983). The Physics and Chemistry of Color. Wiley, New York.
Peacocke, C. (1984). Color concepts and color experience. Synthese 58, 365–382.
Quinn, P. C., Rosano, J. L., and Wooten, B. R. (1988). Evidence that brown is not an elementary color. Perception and Psychophysics 43, 156–164.
Raffman, D. (1995). On the persistence of phenomenology. In Conscious experience, (ed. T. Metzinger), pp. 293–308. Imprint Academic, Schöningh.
Ross, P. (2001). The location problem for color subjectivism. Consciousness and Cognition 10, 42–58.
Russell, B. (1912). The problems of philosophy. Oxford University Press, London.
Russell, B. (1927). The analysis of matter. Kegan Paul, London.
Schlick, M. (1979). Form and content: an introduction to philosophical thinking. In Moritz Schlick: Philosophical papers, Vol. II: 1925–1936, (ed. H. L. Mulder and B. van de Velde-Schlick). D. Reidel, Dordrecht.
Shoemaker, S. (1998). Causal and metaphysical necessity. Pacific Philosophical Quarterly 79, 59–77.
Sternheim, C. S. and Boynton, R. M. (1966). Uniqueness of perceived hues investigated with a continuous judgmental technique. Journal of Experimental Psychology 72, 770–776.
Stewart, D. (1822). The works of Thomas Reid, Vol. I. N. Bangs and T. Mason, for the Methodist Episcopal Church, New York.
Smart, J. J. C. (1961). Colours. Philosophy 36, 128–142.
Smart, J. J. C. (1963). Philosophy and scientific realism. Routledge and Kegan Paul, London.
Strawson, G. (1989). ‘Red’ and red. Synthese 78, 198–232.
Stroud, B. (2000). The quest for reality: Subjectivism and the metaphysics of colour. Oxford University Press, New York.
Thompson, E. (1995). Colour vision: A study in cognitive science and the philosophy of perception. Routledge, London.
Tye, M. (2000). Consciousness, color, and content. MIT/Bradford, Cambridge, MA.
Wiggins, D. (1987). A sensible subjectivism. In Needs, values, truth, (ed. D. Wiggins). Blackwell, Oxford.
Wittgenstein, L. (1921/1961). Tractatus Logico-Philosophicus, (transl. D. F. Pears and B. McGuinness). Routledge and Kegan Paul, London.
commentary: the place of colour in nature
Commentaries on McLaughlin
Asking about the nature of colour
Margaret Atherton
There is no doubt that colour is a peculiarly difficult phenomenon to get a handle on. Colour appears to be, on the one hand, the simplest and most straightforward kind of thing in the world, for, after all, we see colours every day, but, on the other, it has proved extremely difficult to give a unified account of colour. Colours present themselves as properties of the objects that surround us and yet also, because they seem to depend in all sorts of ways on the nature of the perceiver, have been identified as ‘secondary qualities’, not resembling any quality of the external object. Common sense tells us a great many obvious things about colour, while any attempt to develop a theoretical account of colour tears common sense to shreds, revealing it as a mass of inconsistencies. Faced with this tangle, the question with which McLaughlin begins his paper is heroic. ‘What is the nature of colour?’, he asks. McLaughlin takes his readers through a very careful and judiciously arranged tour of current concerns shared by philosophers working on colour. He develops a physicalist account of colour that very usefully displays just how far such an account can go. Accepting McLaughlin’s approach, however, requires, I think, buying into a set of intuitions, intuitions that would seem very obvious were they not contradicted by those who, for example, accept a position McLaughlin dismisses, Revelation. McLaughlin requires his readers to accept some such views as these. Colours are properties of objects. They are those properties that dispose visual perceivers to see colours. What we all think, that is, is that we see the apple as red because the apple is red. But what we all think is correct only if it can be demonstrated empirically by ‘vision scientists’ that there is such a property.
If there is no such property that is had by everything that disposes visual perceivers to see red, then there is no red in the world. However reasonable these intuitions are, the supporters of Revelation are in the grip of a different set. They hold, to put it in a very old-fashioned way, that colours are the proper objects of sight. You can’t see without seeing something, and colours are what we see when we see. Empirical research can tell us many interesting facts about colours, but the one thing it can’t tell us is that colours don’t exist so long as we are seeing them. The sorts of arguments McLaughlin discusses can be seen as a matter of a clash of intuitions. Those in the grip of the intuitions that lead to Revelation are not going to be impressed with the way McLaughlin deploys his intuitions in order to distinguish between red and what it is like to see red, because, for them, this is precisely the point at issue. Nor are they likely to find acceptable McLaughlin’s claim that colour concepts are recognitional concepts that shed no light on the nature of colours. For the proponents of Revelation, this is just to contradict their basic insight. A proponent of Revelation might even be willing to agree that the story of Mary is not an argument against physicalism, in the sense that no new object-independent entities are uncovered when Mary leaves the black and white room and sees colours. Yet such a person still might want to maintain that, in seeing colours, Mary is employing a privileged way of knowing colours, a way that reveals at the very least what is most important about colours. Who has expert knowledge about colours, they might ask, the physicist, the ‘vision scientist’ or, as Locke would have it, ‘the Painter or Dyer’ who has clearly the ideas of colours, but who has never ‘enquired into their causes’ (Locke 1690/1975, 2.8.3)?
We are here witnessing a faceoff between two sets of intuitions, each of which can lay claim to some portion of common sense. It is tempting at this point to cry, but why can’t we have it all? Why can’t I hold that red is that phenomenal character I see when I look at the apple and that the redness of the apple is why I see it as red? Well, in the olden days, that is, in the very olden days, it was possible to hold a view like this. Colour, according to Aristotle and those who followed him down the centuries, was deemed to be a
real thing, a characteristically structured piece of matter, whose characteristic structure or form was conveyed, as a form without matter, through the medium of air or water, to the body of the perceiver, where that self-same form structured a visual colour experience. According to this account, the nature of the experience of seeing red can be explained entirely in terms of the nature of red as it exists in the world, because the cause and its effect share the same nature. The red you see is the red in the world. What could be simpler and nicer? Such an account of colour is no longer considered acceptable, however, and the reasons for its downfall can be traced to Descartes. Descartes thought this Aristotelian-derived account fell afoul of two kinds of facts. The first is that there is no way to explain how the form that makes red a real quality out there in the world transfers itself to the body of the perceiver. Therefore, Descartes urged that we rid ourselves of these forms or ‘intentional species flitting through the air’, as he caricatured the position, and instead spell out the interaction between world and perceiver in terms of a mechanical physics. Secondly, Descartes thought we have not completed the explanatory job when we follow the Aristotelian practice and explain simply how something in the world changes the perceiver’s body because, as Descartes put it, it is ‘the mind that sees’. Visual experiences of red, although dependent upon changes in the perceiver’s body, are themselves states of the perceiver’s mind, and are to be explained in terms of the nature of mind. According to Descartes’ account, what makes a visual experience red, unlike the Aristotelian-derived account, is not the same as what makes the world red. Indeed, by Descartes’ way of looking at things, the physical cause and the psychological effect do not even resemble one another. The red you see is not the same as, and is not even like, the red in the world. 
Descartes has replaced a simple and straightforward account of colour, albeit one in which entirely corporeal processes are informed by intentional entities, with a much more complicated account, introducing the notion of psychophysical connections. Descartes, indeed, may be said to have invented psychophysics. But of course a troubling feature of this new psychophysics is that where before there was a single nature of colour, now there are two, a physical nature and a psychological nature. The anguished cry goes up, but what is colour, really? Descartes himself is of very little help on this issue. While there have been a number of attempts to locate an official position on colour within Descartes’ theory, it has proved very difficult to keep an official position in line with Descartes’ texts, because he has a tendency to say things that would put him in agreement with almost any contemporary position. It is tempting to assume that this is because the issues and the questions that interested Descartes cut across the questions that contemporary philosophers, such as McLaughlin, are raising. I don’t have the space to explore this in any depth, but I do want to point out one interesting discrepancy between Descartes’ approach and that of contemporary philosophers. McLaughlin points out in a footnote that the kind of speculation that forms the basis for Frank Jackson’s Mary example antedates Jackson and that a version can be found in some remarks by C. D. Broad about a mathematical archangel. In fact, the use of angels for this kind of example antedates Broad, and a version can be found in Descartes. (I assume that contemporary philosophers find thinking about scientists locked up from birth more down to earth than speculating about angels.) 
Descartes wrote to Regius: ‘For if an angel were in a human body, he would not have sensations as we do, but would simply perceive the motions [in the body] which are caused by external objects, and in this way would differ from a real man’ (Descartes, 1991, 206). Such an angel, being fully possessed of a mind, could, like Mary in the black and white room, presumably know the physics of colour, up to and including the last stage in the body before colour is sensed, but would have no sensation of colour. Now, in Jackson’s example, the way for Mary to get a sensation of colour is to change her physical surroundings. Under the terms of the problem, she enjoys no psychological change at all. But the only way for Descartes’ angel to get a sensation of colour is to have some such event befall him as happened to the angel played by Bruno Ganz in Wim Wenders’ Wings of desire. It is not the physical world that changes in the film (Berlin remains Berlin), but in order to see its
colours, the angel has to get a body. Descartes’ thought experiment (as illustrated by Wim Wenders) is not the same as Jackson’s, because Descartes is not asking about the nature of colour but about the psychology of colour perception. He wants to know what kind of cognitive and sensitive capacities are required in order to see colour, and his answer is that a particular kind of mind is required, an embodied mind. Descartes’ puzzling ontology of colour, in this instance, is developed in the service of a particular enterprise, the advancement of a psychology of colour perception. So we are left with a dilemma: In order to explain the psychology of colour perception, we are going to have to make use of psychophysical laws, laws which leave us suspended when we try to answer the question, what is the nature of colour.
References
Descartes, R. (1991). The philosophical writings of Descartes, Vol. III (transl. J. Cottingham, R. Stoothoff, D. Murdoch, and A. Kenny). Cambridge University Press, Cambridge.
Locke, J. (1690/1975). An essay concerning human understanding, (ed. P. H. Nidditch). Oxford University Press, Oxford.
Commentaries on McLaughlin
Who dictates what is real?
Paul Whittle
I summarize McLaughlin’s arguments under five headings:
1. He is aiming at a univocal definition of what colour (for example, redness) is.
2. Redness is a physical property of the object which causes it to look red (or any other colour). McLaughlin says ‘. . . not the disposition to look red . . . but rather a property that endows something with the disposition’. The layman comments on some philosophical writing at his peril. I paraphrase in this summary.
3. This property must be partly defined with respect to characteristics of the human eye (therefore, an ‘anthropocentric realism’).
4. ‘Internal relations’ between colours (such as ‘purple is reddish-blue’), he holds to be properties of ‘looks’, not of the putative physical property. This finesses the main philosophical objection to regarding colours as physical properties: that no ‘even remotely plausible candidates’ satisfy such internal relations.
5. In order to accommodate the notorious variability of how things look, he ‘radically relativizes’ colour to perceivers and situations along the lines of properties like ‘too heavy to lift’.
I want primarily to comment on his basic aim, to provide a univocal definition of what colour is. Colour words (red, blue, pale . . .) find their primary uses in certain kinds of interaction (I shall call them ‘colour practices’) with objects and light. These practices involve a visible quality (‘colour’) of lighting, of the surface of objects or of translucent media. Colour practices intersect with the practices we call perception, recognition, description, judgement, feeling, design, manufacture, etc. All these practices have their histories, both personal and social. They depend on skills, some of which seem readily available to all human beings, some only to specific cultural groups, some to specific individuals. Here, in these practices, is where McLaughlin’s ‘internal relations’ belong. In the common statements of them, they are overgeneralized. They vary with practices and contexts. Even statements that seem innocuous, such as ‘orange is between red and yellow’, were completely unavailable to those who first attempted to order colours (see Gage 1993, for mind-boggling examples). If they are now intrinsic to colour, it is to a concept of colour that has been slowly constructed by science, art, and technology, that is, by practices. Some more examples. When I had a broken car window replaced in Italy, I was offered ‘bronzino’, ‘verdino’, or ‘bianchino’. They had no problem with transparent white. Another: if you habitually, day in, day out, mix different greens from various proportions of the same blue and yellow pigments, you will after a while come to ‘see at a glance’ what proportions are needed for a given green. Yet if you take a subject without this experience and sit them in front of an apparatus that mixes lights, you can ask them to set a ‘unique green that has no trace of either blue or yellow’, and they find the instruction intelligible. 
The pigment-mixer would not, but he could set green and yellow in balance (an ‘equilibrium hue’ in the literature, less contentious than ‘unique hue’, but set to about the same physical colour). The apparatus can’t be mixing blue and yellow lights, because that would not make green, but that is irrelevant. These practices affect how things look, but if you want to find out about the ‘internal relations’ of colour, the first ‘place in nature’ to search for explanations is in the practices, which are more open to study than private ‘looks’. And to try to anchor ‘looks’ in brain cells seems to me a non-starter because it bypasses the crucial intermediary of practice (which includes culture, and history, and speech practices—Wittgensteinian grammar).
The practices involve both self and world, and accordingly colour words can be used on both sides of the appearance/reality (or self/world or subjective/objective) distinctions. When we want to pin them down on one side or the other we use qualifiers like ‘it really is’ or ‘it looks like’. These linguistic practices work well. We rarely have difficulty with colour words. So my first puzzle is, what is the point of McLaughlin’s attempt to pin them down on the physical side, particularly since, like most people, he allows some aspects (the ‘internal relations’) to depend on the psychological side? I discuss this more in my comment on Hatfield’s chapter (Chapter 6). Here I want to comment particularly on the presumption of philosophers and scientists to tell ‘ordinary folks’ what colour ‘really is’. I should prefer to say that we ordinary folks know what colour is, although we are, of course, ignorant of some of its causes. I will try to press the point home by telling my own autumn leaf story (cf. McLaughlin’s note 33). There are smoke bushes in the woods near my house which in the autumn look red and green all over. I was, of course, surprised to see this, and examined them to see if they were made up of small red and green patches. Sometimes they were (they are very various; and some are green and purple). But if I chose my bush and my day, I found some for which each leaf looked uniform, and all about the same as each other. They are not a muddy colour, but almost luminous. From a distance they mostly look bright red, and I presume that is what a colorimeter would register. But close up they are also green, as though haunted by their previous hue, and as you go on looking, the green sometimes dominates and you forget the red. Yet the usual impression is of co-presence, not alternation. Others have agreed that ‘red and green all over’ is a good description. 
I wouldn’t expect everyone to agree, and in particular I wouldn’t expect most of my colour scientist colleagues to give more than a grudging acquiescence (‘well, if you like that sort of description, I suppose . . .’). And this, which might seem just a slightly melancholy joke, goes to the heart of what I am trying to say. I think that what has happened to me, as I have lived in this landscape, is that I have come to see differently (I would say ‘better’). There’s nothing extraordinary about this; it is commonplace in art school. But it highlights something crucial about colour science and the associated philosophical discussions. Because the colour scientist in me mutters ‘Well, of course they can’t really be red and green all over, it is just that they look that way sometimes, or you choose to poetically describe them that way . . .’ But then the walker in the woods replies ‘By what right does colour science dictate to me what is real and what is not?’ I am seeing, at last, the variety of the wonderful world. I will not have my notions of what is real defined by colorimeters. They do not have the authority to do this, because they are, by design, context-blind and unambiguous. Nor will I consign my experiences to the metaphysically dubious limbos of appearances, qualia, percepts, or other mental events. None of these concepts will hold water, let alone my life. Nor say that I am using only a fanciful form of words (though certainly my use of words has become freer). I simply want to say that this is how the world is for me, and that this is primary. The rest, colorimeters, colour science, concepts of mind, etc., are secondary. I am not being solipsistic. It was important for me to find friends who agreed with my description, and here I am sharing it with you, as painters and poets have done for millennia, ‘opening our eyes’, as we say. I am being an old-fashioned free-thinker. 
I am refusing to let the established church of science tell me what is so and what is not. My point is a very simple one. That we use science to explore and reveal the world, but we also allow it to decree what is real, and so to replace the world with a different one in ways which are more subtle than material transformations. It is the theme of Husserl’s anguished last book, written as Europe was collapsing around him, and asking how the world could have come to this so soon after the Renaissance dream of a universal natural philosophy (Husserl 1954/1970).
References
Gage, J. (1993). Colour and culture. Thames and Hudson, London.
Husserl, E. (1954/1970). The crisis of European sciences and transcendental phenomenology, (transl. D. Carr). Northwestern University Press, Evanston, IL.
AUTHOR INDEX
Abramov, I. 87, 116, 125 Adelson, E. H. 315, 322, 324, 417, 418, 440 Agostini, T. 441 Albrecht, D. G. 75, 89, 96, 186 Albright, T. D. 363, 365 Alexander, S. 485 Alhazen 187 Allan, L. G. 83 Allen, G. 194, 394 Allman, J. 181 Alpern, M. 121 Andersen, R. A. 363, 364, 365 Andres, J. 206, 232, 233, 422, 423 Anstis, S. M. 365 Apostol, T. M. 286 Arend, L. E. 177, 97, 123, 128, 309, 314, 316, 332, 348, 351, 416, 440, 441 Aristotle 187, 480, 503 Armstrong, D. M. 197, 479, 480, 496 Arnauld, A. 401, 406 Asch, S. E. 329 Atherton, M. 304 Atick, J. J. 85, 86, 97, 103, 113, 161 Aubert, H. 392 Bacon, R. 203 Badcock, D. R. 365 Baillargeon, R. 409 Baker, C. L. 362 Baldwin, T. 481 Barlow, H. B. viii, 69, 77, 84–6, 97, 98, 113, 162, 168, 171, 189, 440 Bauer, B. 148 Bäuml, K. H. 309, 314, 316, 333 Baylor, D. A. 168 Beck, J. 320, 392 Beckers, G. 362 Bell, A. J. 169 Bennett, J. 480, 493 Benzschawel, T. 77 Bergström, S. S. 440 Bezold, W. von 122 Bialek, W. 162
Bidwell, S. 363, 494 Billmeyer, F. W. 397 Billock, V. A. 490 Blake, A. 345 Blake, R. 74 Blakemore, C. 73, 74, 75 Blaker, N. 123 Block, N. 302 Bloj, M. G. 283, 322, 340, 418 Bloomfield, S. 371 Bocksch, H. 412, 422 Boghossian, P. A. 190, 192, 481, 487 Bonnardel, V. 275, 282, 288, 292, 338 Bossomaier, T. R. J. 173 Bouma, P. J. 43, 44 Boyle, R. 187, 191, 279, 295, 362 Boynton, R. M. 58, 61, 71, 78, 87, 116, 159, 167,178,195, 214, 217, 251, 267, 339, 384, 392, 396, 489 Braddick, O. 83, 84 Bradley, A. 74, 77, 82 Brainard, D. H. 71, 97, 131, 222, 227, 243, 254, 257, 281, 282, 307–24, 329–33, 336, 339, 348–51, 359, 360, 412 Brakel, J. van 395 Brandom, R. 203 Breneman, E. J. 359 Brenner, E. 355 Brewster, D. 363 Brill, M. H. 122, 205, 208, 209, 211, 286, 339 Brindley, G. 62 Broad, C. D. 476, 485, 504 Brocklebank, R. W. 131 Brookes, A. 117 Brooks, M. J. 293 Brown, J. L. 415 Brown, R. O. 97, 102, 158, 159, 174, 175, 177, 205, 211, 223, 247, 253, 259, 260, 262, 263, 266, 272, 274, 275, 276, 280, 321, 329, 335, 359, 447 Bruno, N. 140, 447 Brunswik, E. 414, 438, 441 Buchsbaum, G. 158, 161, 208, 211, 224, 257, 286, 292, 310–4, 319–21 339, 384, 447
Buckley, D. 418 Bühler, K. 381, 391, 414, 415, 419, 422 Bülthoff, H. 345 Burnham, R. W. 309, 314, 359, 392 Burton, G. J. 100, 103 Burzlaff, W. 440 Byrne, A. 296, 454, 497, 498, 499 Campbell, F. W. 74 Campbell, J. 188 Campbell, K. 481, 487, 493 Campenhausen, C. V. von 219 Carney, T. 372 Carroll, L. 469, 470 Cassirer, E. 398 Casson, R. W. 395 Cataliotti, J. 444 Cavanagh, P. 82, 414, 363, 365, 371 Challands, P. D. C. 72, 253, 255, 265 Chaparro, A. 73, 178 Chevreul, M.E. 116 Chichilnisky, E. J. 71, 73, 121, 123–6, 129, 222, 223, 316 Chisholm, R. 482, 483 Chittka, L. 242, 287, 288–90, 296 Chomsky, N. 401, 409 Chubb, C. 128 Churchland, P. M. 497 Churchland, P. S. 497 Cicerone, C. M. 73, 273, 361, 368, 370, 371, 372 Clark, A. 459 Cohen, J. 1, 18, 58, 95, 257, 258, 262, 263, 265, 289, 291, 294, 312, 363, 479, 496 Cohen, M. F. 279, 293, 294 Cohn-Vossen, S. 63 Cole, G. R. 87 Collins, M. 362 Coren, S. 329, 336 Cornelissen, F. W. 355 Cornman, J. 485 Cornsweet, T. N. 256, 275, 441, 446 Cortese, J. M. 364, 365 Cosmides, L. 481 Cottaris, N. P. 88 Cowan, W. B. 223 Craik, K. J. 177 Crane, H. 490, 498 Crick, F. H. C. 268 Croner, L. J. 166,180 Cudworth, R. 401, 404, 406 Cunningham, D. W. 373 Cynader, M. S. 75
D’Zmura, M. 88, 95, 98, 145–51, 153, 154 Da Vinci, L. 253, 268 Dana, K. J. 295 Dannemiller, J. L. 94, 95, 275, 218, 243, 291 Das, S. R. 290 Davidson, D. 203 Dawson, B. M. 89 Delacroix, E. 251 Delahunt, P. B. 314, 316, 324, 331, 360 Delbrück, M. 258 DeMarco, P. 309 Democritus v, 481 Derefeldt, G. 129, 130 Derrington, A. M. 61, 78, 80, 84, 85, 87, 88, 125, 145, 150, 157, 160, 161, 171, 174, 365 Descartes, R. v, 187, 191, 198, 406, 438, 478, 481, 486, 504, 505 Desimone, R. 362 DeValois, K. K. 88, 157, 171, 174 DeValois, R. L. 87–9, 145, 147, 157, 171, 174, 264, 265 DeYoe, E. A. 363 Ditchburn, R. W. 177 Dixon, E. R. 290 Dobkins, K. R. 365 Dodwell, P. C. 83 Dong, D. W. 103 Donner, K. 168 Doorn, A. J. van 57–9, 61, 62, 227, 386, 413, 430 Dosher, B. A. 407, 408 Dow, B. M. 362 Dretske, F. 304 Drew, M. S. 339 Drew, M. S. 310, 313, 322, 339 Dufort, P. A. 362 D’Zmura, M. 225, 227, 244, 287, 310, 313, 314, 321, 322, 339, 345, 346, 353, 403, 423 Eco, U. 249 Eisner, A. 159 Ekroll, V. 331, 332, 413, 423 Elsner, A. 82 Emmerson, P. G. 253 Endler, J. A. 262 Enroth-Cugell, C. 72 Eskew, R. T. J. 88 Evans, R. M. vi, 115, 120, 121, 128, 129, 131, 135, 251, 280, 332, 349, 383, 390–4, 397 Fairchild, M. D. 73, 257, 316, 318 Faul, F. 111, 397, 414, 423
Favreau, O. E. 82, 365 Fechner, G. T. 135, 153, 179, 180, 363, 439 Ffytche, D. H. 362 Fidopiastis, C. 368, 369, 370 Field, D. J. 103, 171 Finger, S. 249 Finlayson, G. D. 310, 313 Fiorentini, A. 89 Flanagan, P. 82, 88 Fletcher, H. 149 Fodor, J. A. 304 Földiák, P. 84, 113 Foley, J. M. 77 Forsius, A. S. 130 Forsyth, D. A. 59, 206, 227, 228, 313 Foster, D. H. 94, 132, 140, 218, 222, 243 Franklin, B. 485 Freeman, W. T. 227, 310, 313, 323, 324 Friele, L. F. C. 126 Frisby, J. P. 68 Fuchs, W. 414, 419, 419 Fukurotani, K. 158, 161 Fuld, K. 489 Funt, B. V. 310, 313, 322, 339
Gage, J. 64, 202, 506 Galileo v, 187, 202, 481, 486 Gallistel, C. R. 401 Ganz, B. 504 Gegenfurtner, K. R. vii, 87, 88, 147,174,177 Geisler, W. S. 322, 355 Gelade, G. 147 Gelb, A. 316, 332, 381, 393, 397, 398, 412, 413, 417, 418, 419, 421, 443, 444, 449, 469, 471 Georgeson, M. A. 75, 89, 372 Gerbino, W. 132 Gershon, R. 321 Gibbard, A. 497 Gibson, J. J. 73, 249, 300, 364, 371, 406, 443, 448, 449, 472 Gilchrist, A. L. 112, 131, 132, 135, 140, 205, 224, 253, 267, 280, 315, 320, 321, 322, 359, 440–8, 453–5, 458, 470–3 Gilinsky, A. S. 74 Girgus, J. S. 336 Girgus, J. S. 329 Giulianini, F 88 Goethe, J. W. von 1, 3, 23, 27–9, 31, 34, 53, 57, 65, 66, 116 Goldsmith, T. H. 194, 403 Goldstein, R. 416, 441
Golz, J. 59, 100, 206, 233, 242–4, 276, 330, 331, 396, 417, 421 Goodale, M. 403 Goodman, N. 459 Gordon, D. A. 363 Gordon, J. 87, 116, 125 Gottschalk, A. 158, 161, 291, 384 Gouras, P. 115 Graham, C. H. 415 Graham, N. 74, 84 Granit, R. 383, 419 Grassmann, H. 1, 9, 14, 23, 26, 43 Green, D. M. 322 Greenlee, M. W. 74, 76, 97 Gregory, R. L. 178, 363, 371 Grossberg, S. 373 Guth, S. L. 78, 88 Hahn, L. W. 355 Halevy, D. 88 Hamilton, D. B. 89, 186 Hamilton, W. J. 253 Hardin, C. L. 188–92, 197, 202, 296, 301, 302, 481, 483, 487, 489, 490, 496, 498, 499 Harman, G. 188 Harris, J. P. 371 Harrower, M. R. 419 Hassenstein, B. 145, 401 Hateren, J. H. van 288, 338 Hatfield, G. 188, 189, 192–4, 198, 201, 203, 300, 301, 507 Hayek, F. 430 Hayhoe, M. M. 73 He, S. 126 Heard, P. 178 Heider, F. 418 Heitger, F. 77, 97 Helmholtz, H. von v, 1, 26, 52, 116, 144, 187, 208, 249, 264, 265, 268, 335, 336, 389, 393, 395, 406, 414, 415, 422, 438, 445, 471 Helson, H. 121, 124, 223, 253, 257, 279, 280, 309, 316, 336, 359, 442, 447, 463 Henderson, S. T. 260, 261, 288 Hering, E. v, 1, 22, 62, 116, 121, 130, 144, 187, 249–52, 390, 391, 393, 395, 413, 415, 418, 419, 438, 489 Heyer, D. 247 Heywood, C. A. 362, 403 Hilbert, D. 63 Hilbert, D. R. 129, 188–91, 197, 296, 454, 497–9 Hochberg, J. 251, 252 Hochegger, R. 394 Hoffman, D. D. 272, 273, 368, 372, 378, 407, 419, 434
Hömberg, V. 363 Homer 394 Hood, D. C. 72 Horn, B. K. P. 293, 440 Hornbostel, E. M. 407 Hubel, D. H. 89, 90, 113, 178, 264, 265, 362 Humanski, R. 77, 97 Humphrey, G. K. 83 Hunt, R. W. G. 309, 391 Hurlbert, A. C. 207, 208, 224, 244, 281, 282, 287, 313, 338, 353 Hurvich, L. M. 72, 143, 144, 145, 151, 266, 336, 446, 497 Husserl, E. 507 Ibn al-Haytham 187 Ingle, D. 403 Ingling, C. R. 88, 146 Irtel, H. 140 Iverson, G. 244, 287, 310, 313, 339 Ives, H. E. 121, 122, 257, 281, 337, 349, 396 Jaaskelainen, T. 312 Jackendoff, R. 385 Jackson, F. 190, 197, 475, 482, 486, 493, 496, 497, 504 Jacobs, G. H. 194, 301 Jacobsen, A. 224, 253 Jaensch, E. R. 397 Jakab, Z. 302, 496, 499 Jakobsen, A. 321 Jakobsson, T. 135 James, W. 407 Jameson, D. 72, 143, 145, 151, 266, 446, 497 Jeffers, V. B. 309 Jenness, J. W. 232, 314, 321, 355, 417, 422 Jepson, A. D. 321 Jin, E. W. 316 Johnston, M. 188, 197, 477, 480, 487 Johnston, S. F. 63 Jones, L. A. 393 Judd, D. B. 22, 95, 124, 131, 259, 260–63, 279–81, 290, 312, 336, 384, 390, 392, 396, 397 Julesz, B. viii Kaiser, P. K. 116, 195, 251, 339 Kambe, N. 71, 78, 87, 167 Kaplan, E. 181 Kappauf, W. E. 58 Kardos, L. 135, 381, 398, 414, 419, 447
Katz, D. 135, 316, 332, 372, 391–3, 397, 412, 414, 416–8, 438–40, 447 Kaufman, L. 336 Keller, H. 484 Kelley, K. L. 289, 350 Kellman, P. J. 364, 367, 369, 370, 373 Kepler, J. 438 Kersten, D. 322, 407, 418, 431 Kingdom, F. A. A. 88, 441 Kinney, J. A. S. 265 Kiper, D. C. 88, 147 Kirschmann, A. 133, 264, 266 Klinker, G. J. 295 Knill, D. 322, 418 Knoblauch, K. 88, 149, 150, 151, 153 Koenderink, J. J. 57, 58, 59, 61, 62, 77, 97, 228, 273, 386, 413, 430 Koffka, K. 316, 381, 386, 396, 397, 413, 414, 419, 440, 471 Köhler, W. 74, 440 Kohonen, T. 169 König, A. 58 Kozaki, A. 320 Kraft, J. M. 318, 319, 320, 322, 323, 329, 331, 333, 360 Krantz, D. H. 393 Krauskopf, J. 61, 77, 78, 87–9, 143–5, 174, 177 Krauss, S. 419 Kries, J. von 71, 118, 121, 131, 144, 208, 390 Krinov, E. L. 218, 258–62, 288, 289–92 Kripke, S. 479, 485 Kroh, O. 413 Krüger, H. 419 Kuehni, R. G. 302, 481, 491, 498 Kuriki, I. 316, 321, 348 Lambert, J. 36, 52 Land, E. H. 121, 131, 177, 208, 210, 228, 256, 257, 267, 283, 291, 311, 321, 415, 447 Landauer, A. A. 416 Landy, M. S. 296, 336, 339, 341, 342, 344, 346, 351, 356 Lange-Malecki, B. 309, 316 Larimer, J. 118, 160, 161, 172 Larson, G. W. 294, 295, 349, 350 Latchford, G. 267 Laughlin, S. B. 155, 156, 165, 169, 185 Lawson, R. B. 371 Le Grand, Y. 78, 87, 120–2, 126, 131, 167 Lee, B. B. 162, 180, 181 Lee, D. N. 371 Lee, H.-C. 294, 295, 310, 313, 322, 339, 345, 346, 353 Lee, J. 87, 90
Legge, G. E. 77 Lennie, P. 61, 79, 84, 87–9, 95, 113, 114, 143, 146, 147, 159, 257, 310, 313, 316, 318, 322, 339, 345, 346, 353 Leonova, A. 174, 177, 178 Leopold, D. A. 409 Levine, M. W. 166, 180 Lewis, D. 483, 484, 497 Li, A. 88 Li, X. 446, 447 Lindberg, D. 438 Lindsey, D. T. 365 Livingstone, M. S. 89, 90, 178, 180, 362 Locke, J. v, 187, 191, 480, 503 Logothetis, N. K. 90, 409 Logvinenko, A. S. 320, 333, 445 Longère, P. 319, 329 Lorenz, K. 401 Lucassen, M. P. 133, 309, 314 Lueck, C. J. 362 Lumsden, C. J. 362 Luntz, E. 372 Luther, R. 58, 61, 159 Lutze, M. 302 Lythgoe, J. N. 193, 242, 291 Maattanen, L. M. 77, 97 MacAdam, D. L. 291 Mach, E. 268, 438 Mackie, J. L. 480, 481 MacLeod, D. I. A. 59, 60, 61, 72, 73, 78, 89, 95, 97, 100, 102, 126, 143, 159, 165, 168, 175, 185, 186, 210, 214, 217, 218, 223, 233, 242–4, 253, 264, 269, 276, 280, 321, 330, 355, 359, 396, 417, 421, 473 MacLeod, R. B. 396 Maffei, L. 75 Magee, B. 484 Mahon, L. E. 126 Malkoc, G. 83, 89 Maloney, L. T. 94, 188, 189, 205–8, 211, 242, 244, 257, 260, 262, 274–6, 280–92, 300–4, 310–3, 316, 320–4, 329, 330, 335–55, 359, 360, 421 Mardia, K. V. 275, 289 Marimont, D. 291 Marler, P. 401 Marr, D. 170, 205, 207, 300, 307, 404, 405, 421, 440 Martin, M. F. 392, 416 Martinez-Uriegas, E. 88, 146 Marty, A. 394 Matthen, M. 193 Maund, B. 481, 487 Maunsell, J. H. R. 90, 363
Mausfeld, R. 57, 113, 121, 128, 135, 190, 206, 232, 247, 249, 264, 269, 316, 320, 333, 336, 339, 386, 398, 401, 406, 407, 413–6, 422, 423, 430–35 Maxwell, J. C. v, 1, 13, 31, 58, 268 Maxwell-Stuart, P. G. 394 McCann, J. J. 121, 208, 210, 228, 256, 283, 309, 311, 321, 359, 447 McCollough, C. 82 McDougall, W. 121 McDowell, J. 480 McFarland, W. N. 193, 194 McGee, V. 495 McGinn, C. 480 McLaughlin, B. P. 303, 304, 476, 479, 485, 493, 495, 499, 503–7 McMahon, M. J. 159 Meinong, A. 487 Menshikova, G. 320 Merigan, W. H. 90 Merleau-Ponty, M. 135, 203 Metelli, F. 132 Metzger, W. 418, 424 Meyer, H. 264, 265 Michels, W. C. 309, 316, 317, 359 Michotte, A. 399, 401, 408 Milligan, M. 484 Mingolla, E. 373 Miyahara, E. 103, 174, 370, 371 Mollon, J. D. viii, 71–91, 95–7, 100, 101, 103, 112–5, 120, 121, 125, 134, 194, 258 Mondrian, P. 311 Monge, G. 115 Moon, P. 212 Moorhead, I. R. 100, 103 Morgan, M. J. 159, 177 Motter, B. C. 362 Moxley, J. P. 78 Mullen, K. T. 87, 88, 265 Müller, E. A. 397 Müller, G. E. 174, 391 Mulligan, K. 487 Munsell, A. 36, 40, 60, 441, 442, 456, 457, 458, 459, 462, 463, 470 Munz, F. W. 193, 194 Myers, C. S. 115
Nadler, S. M. 401 Nagle, M. G. 100, 259 Nagy, A. L. 87 Nakayama, K. 371, 407
Nascimento, S. M. 94, 132, 140, 218, 222, 243, 275 Nassau, K. 274, 280, 294, 296, 345, 495, 496 Nayar, S. K. 295 Neitz, J. 302 Neitz, M. 302 Nerger, J. L. 123, 370 Neumeyer, C. 281 Newsome, W. T. 362, 363 Newton, I. v, 1, 3, 5, 22, 24, 25, 27–33, 53, 57, 66, 144, 187, 249, 268 Niederée, R. 121, 247, 316, 332, 333, 390, 415, 422 Nijhawan, R. 407 Noguchi, K. 320 Ohzawa, I. 75, 96 Olds, E. S. 149 Olshausen, B. A. 186 Olson, C. X. 267 Oren, M. 295 Osorio, D. 100, 173, 259 Ostwald, W. 1, 3, 17, 26–36, 38, 41–6, 51–4, 57, 60, 65, 66 Ovenston, C. A. 122, 134 Oyama, T. 320, 419 Palmer, S. E. 390 Paradiso, M. A. 75 Pare, E. B. 362, 363 Pargetter, R. 190, 197, 486, 493, 496 Parkkinen, J. P. S. 289, 312, 338 Párraga, C. A. 100, 103 Partridge, J. C. 193 Passmore, J. A. 401 Peacocke, C. 480 Pelli, D. 143 Pessoa, L. 441, 445 Piantanida, T. P. 118, 490, 498 Plato 434 Poggio, T. 385 Poirson, A. B. 88, 125 Pokorny, J. 88, 95, 309 Polyak, S. L. 194, 259 Prevost, B. 363 Prophet, W. 373 Pugh, E. N., Jr 71, 144, 177 Quinn, P. C. 489 Radner, M. 73 Raffman, D. 482
Ramachandran, V. S. 371 Redies, C. 371 Reeves, A. 309, 314, 316 Regan, B. 87 Reichardt, W. 401 Reid, T. 478 Reniff, L. 73 Riddoch, G. 362 Rieke, F. 162 Ritter, H. J. 169 Rivers, W. H. R. 115, 116, 121, 394 Roberts, F. S. 153 Rock, I. 441 Rodger, R. S. 416 Rodriguez, T. 361 Rollet, A. 121 Romero, J. 290, 312, 338 Rosano, J. L. 489 Ross, H. E. 77, 97, 253 Ross, P. 496 Ross, W. 441, 445 Rowe, Ch. 394 Rubin, E. 419 Ruderman, D. L. 100, 103, 158, 159, 163, 168, 172–5, 177, 179, 216–8, 258 Runge, P. O. 36 Russell, B. 381, 476 Rutherford, M. D. 320, 331
Sacks, O. 362 Sällström, P. 211, 281, 286, 337 Saltzman, M. 397 Salzman, C. D. 362 Sanchez, R. R. 87 Sankeralli, M. J. 87, 88 Sastri, V. D. P. 290 Sato, T. 365 Saul, A. B. 75 Savage, C. 171 Schein, S. J. 362 Schiller, P. H. 90, 378 Schirillo, J. 417, 422, 447 Schlick, M. 487 Schnapf, J. L. 73 Schöne, W. 391 Schopenhauer, A. 57 Schrater, P. 431 Schrödinger, E. v, 1, 9, 26, 36, 65, 268 Schwartz, B. J. 407 Schwartz, R. 443, 453, 459, 464, 466, 468–73 Sclar, G. 75, 96
Seiple, W. 73 Sejnowski, T. J. 169 Sellars, W. 203 Semmelroth, C. C. 122 Shackleton, T. M. 372 Shadlen, M. N. 372 Shafer, S. A. 295, 345, 346 Shakespeare, R. 294, 295, 349, 350 Shapiro, A. 77, 85, 88 Shapley, R. M. 72, 90, 181 Sharpe, T. vii Shepard, R. N. 189, 219, 262, 264, 279, 280, 286, 295, 300 Shevell, S. K. 71, 72, 87, 123, 124, 232, 314, 316, 321, 344, 355, 359, 417, 422, 447 Shinomori, K. 124 Shipley, T. F. 364, 367, 369, 370, 371, 373 Shoemaker, S. 484 Siegel, R. M. 363 Siegel, S. 83 Simoncelli, E. P. 186 Singer, B. 314, 321 Smart, J. J. C. 197, 479, 496 Smirnakis, S. M. 75 Smith, V. C. 88, 309 Socrates 434 Speigle, J. M. 315, 316, 320, 331 Spencer, D. E. 121, 122, 131, 212 Spencer, H. 406 Sperling, G. 407 Stappers, P. J. 364 Sternheim, C. S. 489 Stevens, K. A. 117 Stevens, S. S. 153 Stewart, D. 478 Stiles, W. S. 65, 70, 125, 126, 133, 144, 202, 216, 291, 292, 303, 309, 314, 345, 350, 359, 390, 391, 396, 455, 457 Stockman, A. 70, 87, 159, 216 Strang, G. 286, 287 Strawson, G. 477 Stromeyer, C. F. 74, 82, 83, 87, 89 Stroud, B. 295, 481 Stumpf, C. 390, 391, 392 Stumpf, P. 363, 365 Suppes, P. 393 Sutton, P. 73 Swets, J. A. 322 Switkes, E. 76, 77 Takasaki, H. 122 Teller, D. 365
Tenenbaum, J. 440 Thompson, E. 188, 195, 296, 301, 390, 487 Thompson, P. 267 Thomson, M. 303 Thornton, J. E. 177 Thouless, R. H. 439 Tinbergen, N. 401 Todorovic, D. 135, 363, 417 Tolhurst, D. J. 103, 166 Tominaga, S. 295, 313, 322 Tooby, J. 481 Tootell, R. B. 265 Treismann, A. M. 143, 147 Troland, L. T. 393 Troost, J. M. 314 Trussel, H. J. 310, 313 Tschermak, A. 121, 122 Turhan, M. 408 Twer, T. von der 60, 61, 89, 155, 165, 168, 185, 186, 218, 264 Tye, M. 304, 497, 498, 499 Uchikawa, K. 316, 321, 348, 351 Uexküll, J. von 258, 401 Ungerleider, L. G. 404 Valberg, A. 309, 316 van Essen, D. C. 363 Velleman, J. D. 190, 192, 481, 487 Verrey, L. 362 Vingrys, A. J. 126 Vollmer, G. 258 Vos, J. J. 144 Vrhel, M. J. 289, 290, 310, 313, 338 Wachtler, T. 181 Wade, N. J. 296 Wallace, J. R. 279, 294 Wallach, H. 74, 256, 363–5, 416–8, 445, 447, 448 Walls, G. L. 131, 194, 415, 422 Walraven, J. 71, 72, 94, 121–3, 265, 309, 314, 316, 422 Walraven, P. L. 144 Wandell, B. A. 71, 73, 88, 97, 121–9, 133, 188, 189, 205, 206, 222, 223, 243, 253, 254, 257, 275, 285, 291, 295, 302, 304, 309, 310, 311, 313, 314, 316, 318, 321, 322, 338, 339 Wang, S. 409 Ware, C. 223 Wasserman, R. 362 Weber, E. H. 439
Webster, M. A. 67, 71–91, 95–8, 100, 101, 103, 104, 111–4, 120, 121, 128, 134, 139, 175, 249, 257, 258, 396, 421 Webster, W. R. 267 Weert, C. M. de 314 Wehner, R. 401 Wei, J. 72, 359 Weisskopf, V. F. 294 Weisstein, N. 406, 419 Wenderoth, P. 73, 75 Wenders, W. 504 Werner, A. 281 Werner, J. S. 124, 316, 489 Wesner, M. F. 314, 422 Westland, S. 303 Whittle, P. 71, 72, 115, 117, 122, 123, 128, 129, 131, 134, 135, 139, 140, 141, 206, 223, 249, 253, 255, 265, 332, 333, 453, 462 Wierzbicka, A. 420 Wiesel, T. N. 113 Wiggins, D. 480 Wilson, M. D. 401 Wilson, M. H. 77, 97, 131 Wittgenstein, L. 203, 331, 470, 487 Wohlgemuth, A. 74 Wolff, W. 419
Wollschläger, D. 273, 361 Wong, E. 406, 419 Woodworth, R. S. 130 Wooten, B. R. 489 Worthey, J. A. 208, 209 Wuerger, S. 120 Wundt, W. M. 406 Wyszecki, G. 65, 71, 125, 126, 133, 202, 216, 309, 345, 350, 390, 391, 396, 455, 457
Yang, J. N. 242, 244, 257, 282, 283, 286, 316, 320–2, 330, 344, 346, 348–55, 359, 360 Yarbus, A. L. 265 Yilmaz, H. 286 Yolton, J. W. 401 Yoshioka, T. 362 Young, N. 285 Young, T. 144, 268
Zaidi, Q. 77, 85, 88, 89 Zeki, S. 90, 140, 249, 267, 362, 363 Zöller, W. 424 Zrenner, E. 88, 115 Zucker, S. 181
SUBJECT INDEX
achromatic system 140 adaptation 99, 116, 140, 144, 175, 396 chromatic 70, 392, 396, 397 contrast 68, 77, 92 light 68, 92 receptor 124 second-site 71, 128 von Kries 70, 118, 133, 257, 267, 396 additive colour mixture 12, 46, 49 affordances 406 afterimage 419, 494 albedo 34 ambiguity of colour 135 Ames room 408 analogue representation 171 anchoring at the maximum 140 anchoring problem 133, 227 anchoring theory 446 aperture colours 7, 53, 251, 416 aperture problem 363 articulation 440 asymmetric colour matching 133, 314, 316, 332, 393 subjective difficulties in 332, 412, 413 atmospheric conditions 253 attributes of colours 195, 200, 388, 420, 430 basic 387 brightness 11, 87, 117 colour content 9, 32, 36 darkness 22, 53, 390 hue 253, 390, 391 saturation 41, 44 average luminance rule 447
Babinet’s principle 28 background-independent constancy 446 band pass colours 30 band stop colours 31 Bayesian decision theory, ideal observer for colour constancy 227, 323 behaviorism 187 Benham’s disc 378
Benham’s top 363 bi-directional reflectance density functions 293, 432 spectrogeometrically separable 294 Bidwell’s disc 378 Bidwell’s ghost 363, 494 binocular disparity 117 biofunctional conception of colour 194, 197 biological function of colour vision 192 black threshold 124 Brunswik’s ratio 439
categorizing objects by colour 194, 195 cerebral achromatopsia 362 cerebral akinetopsia 362 chromatic habituation 144, 145 chromatic system 140 chromatic variance 224, 227, 228, 232, 261, 275, 423 chromatophenes 362 coding efficiency 97 Cohen’s matrix R 16, 17, 58 cold-warm dimension 14, 53, 413, 420 colorimetric equality 40 colorimetric equation 6, 8, 13 colorimetric thinking 392, 397 errors of the application of 383 colorimetric tradition 4, 413 colorimetry 1–10, 12, 20, 22, 30, 41, 54, 195, 394, 416 colour, as a psychobiological attribute 188, 195–8 colour, as perceiver-dependent property 197 colour, as relational attribute 195, 196, 199 colour appearance 4, 7, 150, 248, 303, 304, 308, 316, 322, 390, 411, 412 ambiguity of colour 135 brightish white 391 brightness 390, 417, 440 brown 7, 384 chroma 391 cold-warm 14, 53, 413, 420 continuous transitions 416 indeterminacy 408 local colour quale 416 luminous grey 391, 417
colour appearance (cont.) material colours 415, 432 mental attitude 416 metallic appearance 415, 418 object colours 34, 35, 140, 384, 397, 412 reddish-green 487, 490, 507 seeing two colours at the same location 414 surface colours 388, 397, 400, 412 transparent white 506 veiling 391 whiteness 417, 422 colour appearances under chromatic illumination 412 colour atlases 41, 46 colour bodies 52 colour codes, second-order statistics 224, 227, 228, 232, 261, 275, 423 colour constancy 94, 116, 140, 189, 207–45, 248–68, 255, 256, 268, 272, 301, 308–23, 329, 336–55, 359, 387, 396, 398, 418, 424 ambiguity of the mean chromaticity 225, 228 chaotic world 208 computational theories 310, 316, 320–2 constancy index 317, 322, 323 grey world assumption 133, 224, 225, 227, 232, 248, 257, 260, 261, 272 linear models 189, 211, 222, 243, 244, 257, 268, 274, 280–295, 310–14, 337 probabilistic 225 Retinex model 121, 210, 225, 256 three-band world 209 colour cube 36, 46 colour discrimination 156, 162, 172, 174, 179 colour experience 188, 190, 191, 192, 196, 199, 304 colour from motion 419 colour induction 264, 267 colour irrealism 491 colour objectivism 301–3 colour order systems colour pyramids 36 colour solids 1, 36, 37, 38, 39, 40, 41, 46, 47, 52 colour trees 36, 52 Munsell scale 46, 47 Munsell system 41, 317, 389, 442 Munsell tree 40 Ostwald’s atlas 41, 42 colour physicalism 398, 487, 503 colour space 1, 9, 12, 14, 16, 17, 23, 31, 53, 118, 389 achromatic locus 23, 316, 319 achromatic point 23, 29, 32, 33 achromatic white 172 affine invariant arc length 46 affinely equivalent 14
anisotropy of 178 basis 19 black 11, 16, 25, 32, 34, 36, 41, 53, 54 black point 39, 45, 49 black space 16, 17, 20 boundary colour hues 50 boundary colours 2, 31, 32, 37, 49, 50, 53 canonical bases 18, 21, 38, 39 cardinal axes 87 characteristic colours 44, 45, 51, 53 chromaticity diagram 12, 23, 31, 32 CIE-colour space 9, 58, 393 colour circle 1, 3, 29 colour cone 10, 22 dimensionality of colour codes 391 double cone 41, 44 full colours 27, 30, 32, 38, 49, 50, 54 fundamental space 16, 17, 20, 53 grey axis 37, 48 loci of constant hue 22 plane of purples 11, 12, 15 primaries 13, 14, 16, 17, 22, 54 RGB-cube 47, 48, 49, 50, 51, 52 semichromes 3, 27, 29, 30, 31, 32, 33, 38, 41, 42, 44, 49 spectral cone 10, 11, 20, 21, 44, 45, 49 spectral locus 12, 15, 20, 29 white beam 35, 40, 47 white point 39, 40, 44, 45, 49 colour spreading 363, 369, 370 colour terms, cultural development 393, 395 colour vision as a biological capacity 195 colour vision of bees 403 coloured shadows 116, 135, 395, 422 colour-matching functions 13, 14, 21, 36 colour-matching matrices 13, 14, 15, 16, 17, 20, 22 Commission Internationale de l’Éclairage (CIE) 1, 9, 22, 58 common-sense taxonomies of colour 382 comparative perspective on colour 194 complementary wavelength 23, 26 computational theory 310, 316, 318, 321, 322, 373, 405, 438, 440 cone action spectra 8, 16 cone contrast 118, 175 cone contrast rule 118 conjoint representations 407, 411, 416 constancy 301 colour 94, 116, 140, 207–45, 248–68, 255, 256, 268, 301, 308–23, 336–55, 387, 396, 398, 418, 424 contrast 97 failure of 447
lightness 94, 248, 253, 268, 454 two types of 131, 255 contrast colours, dimensionality of 128 contrast gain 128 contrast sensitivity 94, 103 contrast coding 140 contrast colour 249, 251 enhancement 141 local 177 Michelson 122 role in colour vision 249 simultaneous 116, 422, 492 Weber 122 contrast-response function of opponent signals 156 correlation between cone excitations 158 cortex cerebral 362 dorsal and ventral stream 404 sensory 113 striate 84 visual 145, 146, 256 crispening effect 122, 141, 462 Crule algorithm 228 cue perturbation methods 346 cue promotion 342 cues, depth 117 cues, sensory 405 decomposition model 440 decontextualized colour patches 392, 416 decorrelation model 86 decrements 129 delta-contrasts 121 detection, colour detection mechanisms 145, 147, 148 dichoptic display 372 dichromat 193 diffuse-specular superposition 295 discounting the background 123 disposition analysis of colour 190–2, 197, 479 dominant wavelength 23, 27 dual code hypothesis 339, 430 dual function of sensory codes 401, 402 dynamic colour spreading 361, 366, 367, 370, 372, 373, 378 dynamic reweighting 342, 355 ecological optics 22 ecology of colour signals 258
edge classification 437 empiricist accounts of perception 404, 406 empiricist approaches to the mind 388 environment Flat World 283, 295 idealizations of the visual 207, 212, 282 Mondrian World 311, 312, 314, 320, 323 Shape World 293, 295 environmental regularities 206, 221, 222, 281, 282 epistemically private, colours as 199 equivalent background hypothesis 359 equivalent homogeneous context 140 error look error 457, 471 notion of perceptual 386 reflectance error 456 ethological approach to colour 381, 401, 407 ethology 386, 409 evolution of trichromacy 194, 219 feature detectors 113 figure-ground segmentation 148, 406, 419 fluorence 129, 390 free parameter, colour as a 388, 402, 411 function of colour vision 193, 301 functional analysis of colour 188, 193, 195, 482
gamut compression 216, 220, 223, 240 gamut expansion 223 ganzfelds 265 Gaussian World idealization 212, 219, 220, 237, 242 Gelb effect 443 genericity 379 gestalt theory 438 Grassmann’s laws 9, 13, 14, 389 Grey World assumption 133, 224, 225, 227, 232, 248, 257, 260, 261, 272
haploscopically superimposed displays 121, 333 Helson-Judd effect 124, 223 Hering’s stainshadow demonstration 418 heterochromatic lightness judgements 473 heterochromatic photometry 9 highest luminance rule 447 histogram equalization 155
ideal observer 227, 292, 323 illuminant 34, 35, 36, 44, 112, 189, 252, 300, 437, 439
illuminant cues 335, 336 illuminant cues with missing parameters 344 illuminant estimation hypothesis 337 illuminant change of 35 equivalent 243 estimating the 224 feasible set 228 standard 35, 41 illumination invariance 140 illumination perception 395 illumination ambient 410, 417, 421 average daylight 23 chromatic 422 natural 308 illumination-dependent mapping 207, 210, 214, 221 illusion of objectivity 57 illusory contour 364 independent components analysis (ICA) 169 internal relations between colours 506 internal semantics of the perceptual system 404, 406 internalism 304, 386, 401, 409, 434 intrinsic colour 35, 279, 281, 285–7, 294, 300–2 intrinsic image model 440 inverse optics 440 inverted spectral cone 39, 44, 45, 49 inverted spectrum 28, 29, 31, 37, 40 isoluminance 363, 365, 371, 373
Kantenfarben (boundary colours) 31 Kirschmann’s law 265, 266, 422
L vs M signals 78 Lambertian surface 34 Land’s two-colour projections 116, 415 lateral geniculate nucleus 145, 265 legitimate and illegitimate errors 443 light beam 4, 6, 8, 9, 14, 16, 34, 35, 36 ∞tant 9, 10 achromatic beam 22, 23, 26, 27, 35 achromatic component 26 black beam 16, 17 complementary beams 27, 40, 47 fundamental beam 16, 17 homogeneous lights 3, 24, 29, 31, 33, 37, 53 imaginary beams 7, 9 incoherent superposition 6 indistinguishable beams 6
monochromatic beam 5, 10, 11, 13, 15, 24, 26, 29, 31, 53 negative light 9 null beam 6 real beam 7, 9, 10 space of beams 9, 16, 17, 18, 20, 46, 53 lightness 280, 391, 439, 454 linear code 179 logarithmic compression 179 loss functions 291 low-pass hypothesis 292 luminance 9, 79, 265 scientific status of 9 match-prediction linking hypothesis 314, 331–3 maximum a posteriori estimate 225 maximum local mass estimator 227 McCollough effect 82, 267 measurement device conception of perception 135, 189, 190, 206, 301, 383, 396, 398, 424, 431, 465 mental lexicon 409 metameric beams 17, 26 metameric black component 16 metameric black space 16, 17 metameric matches 389 metameric suite 17 metamers 8, 16, 17 problem of 494 metaphysics of relational and dispositional properties 196 mind-independent property, colour as 191, 199, 300 modes of appearance 7, 388, 392, 415 aperture colours 7, 53, 251, 416 film colours 7 motion apparent 372, 373 biological 378 perception of 363 multiple simultaneous layers of representations 407 multiple-channel models 83 multistability, perceptual 407, 409 Munker-White phenomenon 419 Munsell chips 289, 350, 420, 459 Munsell system 41, 317, 389, 442 Munsell viewing condition 456, 459 myth of the given 471 natural scenes 100, 216 neon-colour spreading 419 Newtonian spectrum 25, 27, 28, 29, 31, 53 Newton’s experimentum crucis 28
Newton’s spectrum 24, 27 noise-masking 149 axial noise 149 sectored noise 149 non-linear encoding in colour coding 156, 180 non-matchability under chromatic illuminations 413 non-reducible primitives of the perceptual system 403 normalization 122, 208, 214, 222, 238, 421 von Kries type model of 141 normalization-compatibility, deviations from 214, 219, 222, 238, 240 normalization-compatible mapping 210, 214, 217 object recognition 308 objectivism 189, 192, 197, 475 objectivity, notion of 197 observer’s share 3, 54 opponent channels 21, 22, 264 opponent colours 124, 155 optimal code 163 optimal colours 31, 37, 41, 43, 53 ordinary language talk about colour 197 Ostwald system 41 Ostwald’s principle of internal symmetry 43, 46, 54 Ostwald’s Semichromes 29 parameter setting, perception as 401 pathways magnocellular 90, 180, 181, 185, 362, 365 parvocellular 84, 90, 181 perceptual vs. sensory system 387, 405, 409, 420 phenomenal characters of colour experiences 188, 196, 198, 478 phenomenal reality, concept of 399 phenomenological observations, status of 385 philosophical theories of colour biofunctional conception of colour 194, 197 colour as a subject- and species-relative property 192 colour as a mind-independent property 191, 199, 300 colour as a psychobiological attribute 188, 195–8 colour as relational attribute 195, 196, 199 colour irrealism 491 colour objectivism 301–3 colour as perceiver-dependent property 197 colour physicalism 398, 487, 503 colours as epistemically private 199 disposition analysis of colour 190–2, 197, 479 functional analysis of colour 188, 193, 195, 482
internalism 304, 386, 401, 409, 434 metaphysics of relational and dispositional properties 196 objectivism 189, 192, 197, 475 projectivism 481 relational functionalist view of colour 188, 192, 197, 301 representational externalist theories of colour experience 304 revelation 304, 477, 503 solipsism 460 subjectivism 188, 190, 191, 196, 197, 434 photometer metaphor 135, 445 physicalist reductionism 187 physicalistic trap 57 pi mechanism 70, 112 pleistochrome 161, 163, 172 Poisson process 166, 180 polarity-specific mechanisms 89 post-receptoral colour vision 87 post-receptoral contrast signals 268 primary qualities 191, 202 problem of common/multiple grounds 491 problem of standard variation 491 projectivism 481 proliferation of degrees of freedom 213 proximal mode 416 proximal semantics 400 Ragona Scina experiment 415 rectifying neurons 170 reduction screen 439 reference to the environment, notion of 399, 432 related colours 118 relational functionalist view of colour 188, 192, 197, 301 representational content 192, 304 representational externalist theories of colour experience 304 representational primitives 113, 384 retinal colour codes 421, 423 retinally stabilized stimuli 265 Retinex model 121, 210, 225, 256 revelation 304, 477, 503 rigid and non-rigid descriptions 485 robust realism 470 S vs LM signals 78 scene statistics 224, 227, 228, 232, 423 scene-averaged chromaticity 224 secondary qualities 191, 202, 304
segmentation 322 sensor quantum catches 302 sensory vs. perceptual system 387, 405, 409, 420 sensory-motor interface 409 Seurat stimuli 232 shadows 293, 417 shift resistance, chromatic 214, 220, 222, 240 sign stimuli 405 solipsism 460 spatial low-pass filter 140 spectral reflectance 112, 189, 192, 196, 197, 211–44, 250, 279–304, 308–12, 336–55, 385, 398, 400, 437, 454, 498 basis functions 211, 212 specular highlight cue 340 specular highlights 254, 293, 322 specularity 293 surface 344 split-range code 157 stabilized image 118 standard observer, notion of 492 standard observers 191, 482 standard viewing condition 191, 459 subject- and species-relative property, colour as 192 subjective boundary 365, 370, 371, 373, 374 subjective colour experience 87 subjective surfaces 371 subjectivism 188, 190, 191, 196, 197, 434 supersaturated colours 44 supplementary colours 41 surface colours, colorimetric description 35, 40 surface reflectance black chip 34
spectral 112, 189, 192, 196, 197, 211–44, 250, 279–304, 308–12, 336–55, 385, 398, 400, 437, 454, 498 white chip 34, 36 surface representation 402 surfaces under chromatic illumination 411 threshold-versus-intensity (t.v.i.) curves 68 tissue contrast 264, 395, 415, 421 transparency 372, 374, 395, 407, 410, 414, 423 triggering 401, 431 unique hues 88, 232, 488, 506 vagueness 407, 408, 416 veridicality 249, 386, 387, 437, 438, 470, 473 viewing conditions 455 viewing geometry 253, 285 virtual colours 10, 22 visual channels 74 visual consciousness 475 visual search 98, 147 Vollfarben (semichromes) 27, 30 von Kries-type normalization 70, 118, 133, 257, 267, 396 wavelength-dependent behaviour 403 Weber’s law 68, 122, 167, 179 When the lighting gets red, the reds get lighter 214, 219, 222, 238
Plate 1 The spectrum and the colour circle. The spectrum is an open, linear segment. Notice how the colours merge into black at each side. The spectrum is not complete as the purples are missing. The colour circle is complete by construction. It is a closed, continuous (thus periodic) arrangement. All hues are as colourful and bright as the printing permits. The relation between spectrum and colour circle is a major topic of this chapter. (See Fig. 1.1.)
Plate 2 Some views of the spectral cone in the CIE basis. Notice the plane of purples. (See Fig. 1.3.)
Plate 3 The chromaticity diagram. The cone of colours is projected from the origin on the ‘chromaticity plane’, c. This figure illustrates the CIE basis and choice of chromaticity plane. The red curve is the locus of the monochromatic beams of unit radiant power; it generates the spectral cone. (See Fig. 1.4.)
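The projection described in this caption can be sketched numerically. The following is a minimal illustration, assuming the usual CIE choice of chromaticity plane X + Y + Z = 1; the function name `chromaticity` and the sample tristimulus values are hypothetical and are not taken from the plate.

```python
def chromaticity(X, Y, Z):
    """Project a tristimulus vector from the origin onto the plane X + Y + Z = 1.

    Returns the chromaticity coordinates (x, y); z = 1 - x - y is redundant.
    """
    total = X + Y + Z
    if total == 0:
        # The null beam lies at the origin and has no chromaticity.
        raise ValueError("the null beam has no chromaticity")
    return X / total, Y / total


# Hypothetical tristimulus values; scaling the beam's radiant power by ten
# moves the colour along a ray through the origin, so the projected
# chromaticity does not change.
x1, y1 = chromaticity(0.2, 0.3, 0.5)
x2, y2 = chromaticity(2.0, 3.0, 5.0)
```

Because every point on a ray through the origin maps to the same point of the plane, the diagram discards radiant power and keeps only the direction of the colour vector.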
Plate 4 The CIE chromaticity diagram with several remarkable objects: the purple line with the spectral limit points, the warm–cold division on the spectral locus. The achromatic point defines the complementarities of the spectral limits and the plane that divides all chromaticities into warm and cold. All these objects depend for their existence on the introduction of an achromatic beam. (See Fig. 1.13.)
[Plate 5 panel title: Spectra of various slit width]
Plate 5 The Newtonian spectrum at various slitwidths. (See Fig. 1.14.)
[Plate 6 figure labels: CIE D65 daylight; green leaf (#1: leskenlehti); dominant wavelength 560 nm, colorimetric purity 60%, luminance 464 cd/m2; wavelength axes 400–700 nm]
Plate 6 Example of ‘Helmholtz coordinates’. The remitted spectrum of a green leaf illuminated with average daylight in the CIE chromaticity diagram. We can obtain the colour as a mixture of the achromatic beam with a monochromatic beam (indicated). The proportions are easily obtained from the chromaticity diagram. (See Fig. 1.15.)
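How the proportions can be read off the chromaticity diagram may be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the plate: the function name `helmholtz_purity` is hypothetical, the D65 and 560 nm chromaticities are approximate CIE 1931 values supplied here as assumptions, and the sample point is constructed for the example rather than measured from the leaf.

```python
import math


def helmholtz_purity(sample, white, locus):
    """Return (excitation purity, colorimetric purity) for (x, y) chromaticities.

    `locus` is the point where the ray from `white` through `sample`
    meets the spectral locus (the dominant wavelength).
    """
    # Excitation purity: fractional distance from the achromatic point
    # towards the spectral locus.
    p_e = math.dist(white, sample) / math.dist(white, locus)
    # Colorimetric purity: fraction of the mixture's luminance contributed
    # by the monochromatic beam.
    p_c = p_e * locus[1] / sample[1]
    return p_e, p_c


WHITE_D65 = (0.3127, 0.3290)  # assumed approximate chromaticity of CIE D65
LOCUS_560 = (0.3731, 0.6245)  # assumed approximate locus point at 560 nm

# Hypothetical sample halfway between the achromatic point and the locus.
sample = tuple((w + d) / 2 for w, d in zip(WHITE_D65, LOCUS_560))
p_e, p_c = helmholtz_purity(sample, WHITE_D65, LOCUS_560)
```

The two purities differ because distances in the chromaticity diagram weight each beam by the sum of its tristimulus values, whereas colorimetric purity weights it by luminance.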
[Panel titles: Complementary spectra (Plate 7); Inverted spectra of various slit width (Plate 8)]
Plate 7 Newton’s spectrum and the inverted spectrum at some reasonable slitwidth. Subtractive combination yields black (top), additive combination white (or rather achromatic, bottom). (See Fig. 1.16.)
Plate 8 The inverted spectrum at various slit widths. (See Fig. 1.17.)
[Plate 9 spectrum labels: low pass optimal colours 485 nm and 560 nm; band pass optimal colours 450 nm to 564 nm and 485 nm to 611 nm; high pass optimal colours 490 nm and 560 nm; band stop optimal colours 450 nm to 564 nm and 485 nm to 611 nm]
[Plate 10 panel labels: Short wavelength boundary colours; Long wavelength boundary colours]
Plate 9 Some representative full colours: spectra (left) and chips (squares on the far right). (See Fig. 1.19.)
Plate 10 The boundary colours in the CIE chromaticity diagram and impressions of sequence of hues of the short wavelength and long wavelength boundary colours. (See Fig. 1.20.)
[Plate 11 spectrum labels: band pass optimal colours 570 nm to 575 nm, 560 nm to 580 nm, 555 nm to 590 nm, 535 nm to 620 nm, 485 nm to 700 nm, 470 nm to 700 nm, 440 nm to 700 nm and 400 nm to 700 nm]
[Plate 12 panel title: View of the white apex]
Plate 11 Spectra and samples (chips) of a ‘yellow’ paint. The difference is the width of the spectrum remitted by the paints (they are all optimal colours). When this range is very narrow, the paint appears dark brown. When it is very large (the whole visual region), the paint looks white. The ‘best yellow paint’ remits all wavelengths above 490 nm. Notice that this paint is a long wavelength boundary colour. (See Fig. 1.22.)
Plate 12 The colour solid in the canonical (SVD) basis. Here is a view of the white pole. (See Fig. 1.24.)
[Plate 13 panel title: Projection on U–V plane]
Plate 13 The colour solid in the canonical (SVD) basis. Here is a view in the direction of the third dimension. (See Fig. 1.25.)
Panel title (Plate 14): Projection on U–W plane.
Plate 14 The colour solid in the canonical (SVD) basis. Here is a view in the direction of the second dimension. (See Fig. 1.26.)
Panel title (Plate 15): Projection on V–W plane.
Plate 15 The colour solid in the canonical (SVD) basis. Here is a view in the direction of the first dimension. (See Fig. 1.27.)
Plate 16 (a) Principle of an Ostwald page. The page consists of partitive ternary mixtures of white, black and an optimal colour. (b) The same Ostwald page as in (a), but with the white, black and colour sectors mixed (one may think of a set of Maxwell tops being spun). (See Fig. 1.28.)
Panel labels (Plate 17): 24 hues circle, with hue names Laubgrün, Eisblau, Ublau, Kress and Veil and the intermediates Seegrün-Laubgrün, Seegrün-Eisblau, Gelb-Laubgrün, Ublau-Eisblau, Gelb-Kress, Ublau-Veil, Rot-Kress and Rot-Veil; Ostwald page #12; colour chip 12 {50, 25, 25}, parameterized by hue, full colour, white and black content.
Plate 17 The basic structure of the Ostwald atlas: colour circle (mensurated set of Vollfarben), single page, single chip. (See Fig. 1.29.)
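A chip in the atlas is specified by a hue number and three contents (full colour, white, black). The following is a minimal sketch of the corresponding partitive mixture, under several assumptions that are not stated in the caption: the three contents are percentages that sum to 100, colours are represented as linear RGB triples with white at (1, 1, 1) and black at (0, 0, 0), and a label such as {50, 25, 25} is read in the order full colour, white, black. The function name `ostwald_chip` and the example full colour are hypothetical.

```python
# Illustrative sketch of a partitive (Maxwell-top style) ternary mixture.
# Assumptions: linear RGB coordinates, white = (1, 1, 1), black = (0, 0, 0),
# and contents given as percentages that sum to 100.

WHITE = (1.0, 1.0, 1.0)
BLACK = (0.0, 0.0, 0.0)


def ostwald_chip(full_colour, colour_pct, white_pct, black_pct):
    """Return the weighted average of full colour, white and black."""
    parts = (colour_pct, white_pct, black_pct)
    if any(p < 0 for p in parts) or abs(sum(parts) - 100.0) > 1e-9:
        raise ValueError("contents must be non-negative and sum to 100")
    return tuple(
        (colour_pct * f + white_pct * w + black_pct * k) / 100.0
        for f, w, k in zip(full_colour, WHITE, BLACK)
    )


# Hypothetical full colour (pure red in RGB), mixed as {50, 25, 25}.
chip = ostwald_chip((1.0, 0.0, 0.0), 50, 25, 25)
```

Because the mixture is a weighted average, setting the colour content to 100 returns the full colour itself, and setting the white content to 100 returns white.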
Plate 18 Ostwald’s principle of internal symmetry in action. Here we have mensurated a circle and deformed it into an ellipse. The ellipse is automatically mensurated because Ostwald’s principle is affinely invariant. (See Fig. 1.30.)
Plate 19 How the spectral cone and its complementary image (inverted spectral cone at the white point) define a double conical volume in C. (See Fig. 1.31.)
Plate 20 Various views of the intersection of the spectral cone at the black point and the inverted spectral cone at the white point. The sharp edge (equator) is the locus of characteristic colours. Notice that the overall shape is strongly determined by the (flat) purple sector and its inverted copy. (See Fig. 1.32.)
Plate 21 A generic view of the RGB-cube. (See Fig. 1.35.)
Plate 22 A view at the white pole of the RGB-cube. (See Fig. 1.36.)
Vertex labels (Plate 23): W, C, M, G, B, Y, R, K.
Plate 23 Tricolour spot diagrams for subtractive (top left) and additive (top right) colour mixture. At the bottom, the Hasse diagram of spectral dominance, on the left with the RGB contributions explicitly drawn, at the right with the hues indicated. (See Fig. 1.37.)
Plate 24 Two views of the locus of full colours on the RGB-cube. (See Fig. 1.38.)
Labels (Plate 25): W, Y, C, R, B, K, with hue sequences marked along the loci.
Plate 25 The loci of boundary colours on the RGB-cube with sequences of boundary colour hues. (See Fig. 1.39.)
Plate 26 The RGB chromaticity diagram. The boundary colour loci are plotted in the RGB colour triangle and in the complementary (inverted) triangle. (See Fig. 1.40.)
Plate 27 The RGB colour triangle. At each chromaticity we have plotted the brightest RGB colour. (See Fig. 1.41.)
Panel labels (Plate 28): Illuminant 1, Illuminant 2, Surface 1, Surface 2; horizontal axis of each spectral plot: Wavelength (nm).
Plate 28 Renderings of two surfaces under two illuminants. The top row shows the same surface rendered under two different illuminants. Each rendering was obtained using an illuminant spectral power distribution and surface reflectance function to compute the spectrum of the colour signal. From this the Smith–Pokorny estimates (Smith and Pokorny 1975; DeMarco et al. 1992) of the L, M and S cone spectral sensitivities were used to obtain the quantal absorption rates of each cone class in response to the colour signal. These, in turn, were used, together with typical red, green, and blue phosphor emission spectra and monitor gamma curves, to compute RGB coordinates for the rendering. The RGB coordinates were chosen using standard methods (e.g. Brainard 1995) so that the light they cause to be emitted from the monitor has the same effect on the cones as the colour signal being rendered. The RGB coordinates were used to produce the figure by methods outside of the authors’ control. The spectral plots show the surface reflectance functions and illuminant spectral power distributions used for this example. (See Fig. 10.2.)
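The rendering chain described in this caption (colour signal, cone absorptions, phosphor weights, gamma correction) can be illustrated with a toy sketch. Everything numeric here is an assumption made for illustration: the five-sample wavelength grid, the spectra, the cone sensitivities, the phosphor emission spectra and the power-law gamma of 2.2 are made-up placeholders, not the Smith–Pokorny estimates or measured monitor data, and the helper names are hypothetical. The sketch is not the authors' procedure; it only shows how RGB values can be chosen so that the monitor light matches the cone responses to the colour signal.

```python
# Toy sketch of the rendering chain in the caption. Every number below is a
# made-up placeholder on a 5-sample wavelength grid: these are NOT the
# Smith-Pokorny fundamentals, nor measured phosphor spectra or gamma curves.

ILLUMINANT = [0.8, 0.9, 1.0, 1.0, 0.9]   # spectral power distribution
REFLECTANCE = [0.2, 0.3, 0.5, 0.6, 0.4]  # surface reflectance function
CONES = [                                # rows: L, M, S sensitivities
    [0.0, 0.1, 0.6, 1.0, 0.4],
    [0.0, 0.3, 1.0, 0.6, 0.1],
    [1.0, 0.6, 0.1, 0.0, 0.0],
]
PHOSPHORS = [                            # rows: R, G, B emission spectra
    [0.0, 0.0, 0.1, 0.5, 1.0],
    [0.0, 0.3, 1.0, 0.3, 0.0],
    [1.0, 0.5, 0.1, 0.0, 0.0],
]
GAMMA = 2.2                              # assumed power-law monitor response


def cone_responses(spectrum):
    """Discrete stand-in for integrating a spectrum against each cone class."""
    return [sum(s * x for s, x in zip(cone, spectrum)) for cone in CONES]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 linear system m x = b by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("phosphor-to-cone matrix is singular")
    solution = []
    for j in range(3):
        mj = [[b[i] if c == j else m[i][c] for c in range(3)] for i in range(3)]
        solution.append(det3(mj) / d)
    return solution


def render_rgb(illuminant, reflectance):
    """Return (linear phosphor weights, gamma-corrected device values)."""
    signal = [e * s for e, s in zip(illuminant, reflectance)]  # colour signal
    target = cone_responses(signal)
    per_phosphor = [cone_responses(p) for p in PHOSPHORS]
    # matrix[k][j]: response of cone class k to phosphor j at full drive.
    matrix = [[per_phosphor[j][k] for j in range(3)] for k in range(3)]
    linear = solve3(matrix, target)
    clipped = [min(1.0, max(0.0, v)) for v in linear]  # crude gamut handling
    device = [v ** (1.0 / GAMMA) for v in clipped]
    return linear, device


linear_rgb, device_rgb = render_rgb(ILLUMINANT, REFLECTANCE)
```

With real data the placeholders would be replaced by measured spectra on a finer wavelength grid, and weights falling outside the range 0 to 1 would need more careful handling than the simple clipping used here.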
Plate 29 Pictures of the experimental chamber when the spectral average has been equated. This plate shows pictures of the experimental chamber used by Kraft and Brainard (1999). Across the two images, both the illuminant and the surfaces in the scene have been changed. The two changes have a reciprocal effect, so that the spatial average of the L, M, and S cone quantal absorption rates is the same in both images. The images shown are rendered versions of hyperspectral images taken of the stimuli. The hyperspectral imaging system (Longère and Brainard 2001) provided 31 narrow-band (approximately 10 nm bandwidth at 10 nm spacing between 400 and 700 nm) images of the scene. The hyperspectral images were also used to determine the spatial average of the cone quantal absorption rates. (Adapted from Figure 1 of Kraft and Brainard 1999.) (See Fig. 10.6.)