A Special Issue of Visual Cognition

Visual space perception and action

Guest Editors:
Jochen Müsseler, A. H. C. van der Heijden, and Dirk Kerzel
HOVE AND NEW YORK
Published in 2004 by Psychology Press Ltd
27 Church Road, Hove, East Sussex BN3 2FA
www.psypress.co.uk

Simultaneously published in the USA and Canada by Taylor & Francis Inc
29 West 35th Street, New York, NY 10001

Psychology Press is part of the Taylor & Francis Group

© 2004 by Psychology Press Ltd

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

This edition published in the Taylor & Francis e-Library, 2005.

ISBN 0-203-50001-6 (Master e-book)
ISBN 0-203-59558-0 (Adobe eReader format)
ISBN 1-84169-966-7 (print edition)
ISSN 1350-6285 (print edition)

Cover design by Joyce Chester
Contents*

Visual space perception and action: Introductory remarks
Jochen Müsseler, A. H. C. van der Heijden, and Dirk Kerzel 129

Position of code and code for position: From isomorphism to a sensorimotor account of space perception
Peter Wolff 137

Multisensory self-motion encoding in parietal cortex
Frank Bremmer, Anja Schlack, Werner Graf, and Jean-René Duhamel 162

Localization of targets across saccades: Role of landmark objects
Heiner Deubel 174

Transsaccadic integration of bystander locations
Filip Germeys, Peter de Graef, Sven Panis, Caroline van Eccelpoel, and Karl Verfaillie 205

Two spatial maps for perceived visual space: Evidence from relative mislocalizations
Jochen Müsseler and A. H. C. van der Heijden 237

Curved movement paths and the Hering illusion: Positions or directions?
Jeroen B. J. Smeets and Eli Brenner 257

Compensation of neural delays in visual-motor behaviour: No evidence for shorter afferent delays for visual motion
Romi Nijhawan, Katsumi Watanabe, Beena Khurana, and Shinsuke Shimojo 277

Perceived localizations and eye movements with action-generated and computer-generated vanishing points of moving stimuli
Sonja Stork and Jochen Müsseler 301

The role of action plans and other cognitive factors in motion extrapolation: A modelling study
Wolfram Erlhagen and Dirk Jancke 318

Anticipating action in complex scenes
Ian M. Thornton and Amy E. Hayes 344
* This book is also a special issue of the journal Visual Cognition and forms Issues 2 and 3 of Volume 11 (2004). The page numbers used here are taken from the journal and so begin on p. 129.
Reaching beyond spatial perception: Effects of intended future actions on visually guided prehension
Scott H. Johnson-Frey, Michael E. McCarty, and Rachel Keen 374

Action influences spatial perception: Neuropsychological evidence
Glyn W. Humphreys, M. Jane Riddoch, Sara Forti, and Katie Ackroyd 403

Subject index 431
VISUAL COGNITION, 2004, 11 (2/3), 129–136
Visual space perception and action: Introductory remarks

Jochen Müsseler
Max Planck Institute for Human Cognitive and Brain Sciences, Munich, Germany

A. H. C. van der Heijden
Leiden University, The Netherlands

Dirk Kerzel
Giessen University, Germany

Vision evolved from the vital necessity to act in a dynamic environment. On this view it is clear that perceptual processes and action planning are much more interlocked than is evident at first sight. This is especially so in visual space perception: Actions are performed in space and are guided and controlled by objects at spatial positions. Here we briefly introduce the three research camps dealing with the relationship between space perception and action: the ecological camp, the two-visual-systems camp, and the constructivist camp. We show that these camps emphasize and open different theoretical and empirical perspectives, but that they can be seen to complement each other. We end with an overview of the papers in this special issue.
Vision is not an end in itself. The visual system, just as all other "perceptual systems have evolved in all species of animals solely as a means of guiding and controlling action" (Allport, 1987, p. 395). Given this point of view it is clear that empirical and theoretical work concerned with actions that humans and animals have in common—with moving and jumping, with grasping, picking and
Please address correspondence to: Jochen Müsseler, Max-Planck-Institut für Kognitions- und Neurowissenschaften, Amalienstr. 33, D-80799 München, Germany. Email:
[email protected] © 2004 Psychology Press Ltd http://www.tandf.co.uk/journals/pp/13506285.html DOI:10.1080/13506280344000455
catching, with approaching and avoiding—has to care about perception, about action, and about their interaction. And it is clear that in such empirical and theoretical work the elaboration of the perception of space and of position is of vital importance: All such actions are performed in space and are guided and controlled by objects at positions.

CURRENT APPROACHES TO VISUAL SPACE PERCEPTION AND ACTION

Nowadays, the vital importance of space perception in the relationship between perception and action is recognized and discussed in three largely independent research camps: (1) the ecological camp, (2) the two-visual-systems camp, and (3) the constructivist camp. The theoretical points of view emphasized in these camps are not mutually exclusive. In fact, by emphasizing (1) light, (2) brain, and (3) behaviour, they neatly complement each other (for a more elaborate integrating account, see Norman, 2002).

The ecological camp emphasizes light. According to the Gibsonian ecological approach (e.g., Gibson, 1979; cf. also Reed, 1996), perception and action are linked by affordances, that is, by the action possibilities that the world offers and that are specified in the structure of the light surrounding the perceiver. Affordances are aspects of—or possibilities in—the environment with reference to the animal's body and its action capabilities, such as the "climbability" of a rock face or the "sittability" of a stump. Ecologists stress the active perceiver exploring his or her environment. Because the vital structure of the light is not simply given but has to be extracted, eye, head, and body movements are seen as parts of the perceptual process. Perception thus means perceiving events, which change over time and space through body and object movements. Space perception comprises the many surfaces that make up the environment. The perceptual performance of an observer consists of directly picking up the (invariant) information inherent in the structured light coming from this environment. Gibson refrained from referring to the processes underlying perception; this camp is therefore basically silent about the brain. The only concession was the resonance principle, according to which the perceptual system resonates with, or is attuned to, the invariant structures. Consequently, ecologists analyse the dynamics of perception and action (e.g., Thelen & Smith, 1994) and reject reductionist experimental paradigms. Experiments with stimulus presentations of a few milliseconds and simple keypresses as a behavioural measure are simply seen as inadequate for the analysis of the perception-action interplay. Of course, this was and remains a matter of dispute, particularly with the representatives of the constructivist account (see below; for discussions see, e.g., Gordon, 1997, chap. 7; Nakayama, 1994).

The two-visual-systems camp accepts these views on light but emphasizes brain. Its representatives provide neuropsychological, neuroanatomical, and behavioural evidence for two channels in the visual system, one channel for perception/
cognition, the so-called vision-for-perception pathway, and one channel for action, the so-called vision-for-action pathway. In modern experimental psychology, evidence for two visual systems can be found in studies from the 1960s and 1970s (e.g., Bridgeman, Hendry, & Stark, 1975; Fehrer & Raab, 1962). Somewhat later, Ungerleider and Mishkin (1982) provided evidence for a ventral pathway leading from the occipital cortex to the inferior temporal lobe, assumed to deal with object identification, and a dorsal pathway leading from the occipital cortex to the posterior parietal cortex, assumed to deal with object location. Later still, Goodale and Milner (1992; see also Milner & Goodale, 1995) attributed to the dorsal stream the function of visual control and guidance of motor behaviour. In the recent past, the two-visual-systems account has inspired numerous studies.

While ecologists stress the unity of perception and action, the representatives of the two-visual-systems camp emphasize diversity and dissociation. Given the assumed modularity of spatial information processing in both functional and structural terms, their studies aimed at, and found evidence for, dissociations between perception and action in behavioural studies as well (e.g., Aglioti, DeSouza, & Goodale, 1995; Haffenden & Goodale, 1998, 2000; Milner & Dyde, 2003). This research strongly suggests that "what one sees" is basically different from "what one needs to act".

The constructivist camp accepts these views on light and brain but emphasizes the importance of the interaction between perception/cognition and action. Constructivists are not convinced that perception and action are largely separate and unconnected cognitive domains; they emphasize that, despite the rare examples of diversity and dissociation, the normal case is unity and association. In their view, while the anatomical and functional architecture and the transformational computations involved might be highly complex, there can be no doubt that the system uses spatially coordinated maps for perception and action.

Besides the role of perception for action, this camp emphasizes the importance of action for perception. There is indeed increasing evidence that the functional unity of perception and action works not only from perception to action but from action to perception as well (e.g., Hommel, Müsseler, Aschersleben, & Prinz, 2001; Müsseler & Wühr, 2002). As animals act in their environment, perceptual information cannot be interpreted unambiguously without reference to action-related information. This, in turn, requires that these two kinds of information interact with each other. With regard to visual space perception, the interesting question is: What is the influence of action on the experienced visual space with objects at positions?

In the constructivist camp the central question is how the visual system figures out what environmental situation gives rise to the optical image registered on the retina (cf. Rock, 1983). Different kinds of cues, and especially cues resulting from actions in the world, assist in this construction process. By analysing the motion parallax phenomenon, von Helmholtz (1866) could already
claim that one's own movements deliver important depth cues. In contrast to the ecologists, who focus almost exclusively on the external environment as specified in light, constructivists also take internal cognitive mechanisms into account. In this sense they are representatives of the information processing account. In contrast to the representatives of the two-visual-systems account, who might also align themselves with the information processing account, the constructivists are more interested in the aspects of unity and association between perception and action.

THE CONTRIBUTIONS TO THE SPECIAL ISSUE

This special issue on Visual Space Perception and Action brings together 12 contributions written by various experts in this field, ranging from experimental psychologists and neurophysiologists to computational modellers and philosophers. Each contribution introduces new concepts and ideas that explain how visual space is established and represented.

The first two papers present theoretical discussions about how position and information about one's actions may be represented in the brain. Wolff contrasts two hypotheses about codes for the position of objects. It is often assumed that the retinotopic organization of cortical and subcortical structures codes the position of objects in space. Wolff, however, shows that this assumption cannot be maintained in the face of a constantly moving observer. Instead, spatial coding is learned from sensorimotor contingencies. That is, the relation between observer motion and its sensory consequences may establish a code for position.

Whereas Wolff is concerned with an abstract theoretical framework of spatial coding, Bremmer, Schlack, Graf, and Duhamel elaborate on how self-motion may be represented in the parietal cortex. One important visual cue to self-motion is optic flow. Forward motion produces an expansion of image elements, and backward motion a contraction. The precise direction of motion may be derived from a singularity, the focus of expansion. Bremmer et al. show that neurons in the ventral intraparietal area (VIP) of the macaque cortex are sensitive to variations of the focus of expansion and may therefore code the direction of egomotion. These neurons respond not only to visually simulated self-motion but also to real physical displacement. This shows that VIP may be a multimodal area in which visual and vestibular as well as auditory and somatosensory information is integrated.
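The geometry of the focus of expansion is simple enough to sketch in a few lines of code. Under pure observer translation, every flow vector lies on the line through its image point and the focus, so the focus can be recovered from a sampled flow field by linear least squares. The sketch below is purely illustrative (it assumes NumPy; the function name, parameters, and synthetic data are invented here) and is not taken from Bremmer et al.'s paper.

```python
import numpy as np

def focus_of_expansion(points, flows):
    """Least-squares estimate of the focus of expansion (FOE).

    For pure observer translation, each flow vector is radial with
    respect to the FOE, so the 2-D cross product of the flow and the
    vector from the FOE to the sample point is zero.  Stacking that
    constraint over all samples gives a linear system in the FOE.
    """
    A = np.stack([flows[:, 1], -flows[:, 0]], axis=1)           # rows (vy, -vx)
    b = flows[:, 1] * points[:, 0] - flows[:, 0] * points[:, 1]
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe

# Synthetic expanding flow field, as produced by forward self-motion:
rng = np.random.default_rng(0)
pts = rng.uniform(-10.0, 10.0, size=(200, 2))
true_foe = np.array([2.0, -1.0])
flow = 0.05 * (pts - true_foe)        # expansion away from the singularity
print(focus_of_expansion(pts, flow))  # recovers approximately [ 2. -1.]
```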
The next two contributions deal with information processing around the time of a saccade. Both contributions focus on the detection of position changes that are presented while observers move their eyes from a source object to a target object. In both studies, additional elements are displayed. Deubel asks how the detection of target displacements during the saccade is affected by surrounding objects, referred to as landmarks. He reports a bias to localize the target toward the irrelevant landmarks when these landmarks are close to the target and horizontally aligned with it. Germeys, de Graef, Panis, van Eccelpoel, and Verfaillie pursue the opposite goal of clarifying how memory of surrounding objects, referred to as bystanders, is organized. They find that bystander location is better remembered if the complete scene is presented during recall, suggesting that the location of a single bystander is encoded with respect to the remaining bystanders. As a challenge for current theorizing, Germeys et al. show that the saccade source may be more important for the encoding of bystander location than the saccade target. Taken together, the two studies converge on the general conclusion that transsaccadic memory relies strongly on relational information.

The following two papers contribute to an ongoing discussion about whether the perception of space differs from the representation of space used for motor action. Müsseler and van der Heijden examined the hypothesis that two sources may be used to calculate position: a sensory map and a motor map. The sensory map provides vision, while the motor map contains the information for saccadic eye movements. The model predicts that errors in relative location judgements will be observed when the motor map has to provide the information for the visual judgements. The authors provide evidence for this model by showing that the perceived position of differently sized targets follows the same pattern as saccadic eye movements to these targets: Eye movements to a small target undershoot less than eye movements to a spatially extended target. A similar trend is found when observers make perceptual judgements that require relative position judgements: The centre of a small object appears further from the fixation point than the centre of a spatially extended object. Thus, this paper shows association, not dissociation, between perception and action.
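The logic of this prediction can be stated as a toy formalization (an editorial sketch with invented parameters, not the authors' model): If a perceptual judgement has to read position out of the motor map, it should inherit the size-dependent undershoot of saccades.

```python
def saccade_amplitude(eccentricity, target_size,
                      base_undershoot=0.10, size_gain=0.01):
    """Motor-map readout: saccades undershoot the target, and in this
    sketch the undershoot grows linearly with target extent (the 10%
    base value echoes undershoots reported in the literature; the
    size gain is an arbitrary illustrative number)."""
    return eccentricity * (1.0 - (base_undershoot + size_gain * target_size))

def judged_eccentricity(eccentricity, target_size):
    """Prediction for a relative location judgement that must rely on
    the motor map: the judged position follows the saccade amplitude."""
    return saccade_amplitude(eccentricity, target_size)

# A small target's centre should appear further from fixation than the
# centre of a spatially extended target at the same eccentricity:
print(judged_eccentricity(10.0, 0.5))  # small target    -> 8.95 deg
print(judged_eccentricity(10.0, 4.0))  # extended target -> 8.60 deg
```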
A similar conclusion is reached in the contribution by Smeets and Brenner. The authors investigate why observers who are asked to connect two points with a straight line fail to do so and instead draw a curved line between the two positions. This may be because of a spatial distortion, or because the direction of the motion is initially wrong and requires continuous adjustment. Perceived straightness was manipulated using the Hering illusion, in which a straight line appears curved because a pattern of radiating lines is superimposed. Observers had to judge the straightness of a dot moving across the Hering illusion, as well as draw a straight line across the illusion. Smeets and Brenner found that the curvature that the background induced in the hand's movement path was correlated with the curvature that the background induced in a moving dot's path. Thus, again, perception and action are more associated than dissociated.

The next block of four papers is concerned with the perceived position of moving objects. In earlier work, Nijhawan proposed that processing latencies are compensated by motion extrapolation in the visual system, such that a flashed stationary object is seen to lag a moving object. A conflicting explanation for the flash-lag effect is that latencies of moving objects are reduced. Both accounts would explain why responses to moving objects are typically accurate. However, Nijhawan, Watanabe, Khurana, and Shimojo show that reaction times to moving stimuli are not reduced compared to stationary objects, and temporal order judgements do not indicate that moving objects are perceived earlier than stationary ones. This contradicts the claim of latency reduction with moving stimuli and favours motion extrapolation.

Stork and Müsseler as well as Thornton and Hayes investigate factors that affect localization of the endpoint of a motion trajectory. Stork and Müsseler show that the endpoint of a moving stimulus is mislocalized in the direction of motion when it is pursued with the eyes. In contrast, judgements of the final target position are accurate when eye fixation on a stationary object is maintained. When observers had control over the target's vanishing point, because target disappearance was coupled to a keypress, both eye movements and position judgements beyond the vanishing point were reduced, suggesting that intentions affect eye movements and position judgements in a similar manner.

Effects of eye movements and further cognitive factors on endpoint localization were modelled by Erlhagen and Jancke. Their model consists of interacting excitatory and inhibitory cell populations. The intrinsic network dynamics explain mislocalization of the final position of a moving target by assuming that the population response to the moving stimulus continues to travel in the direction of stimulus motion even after stimulus offset. In the absence of stimulus input, however, the dynamic extrapolation of trajectory information decays. The strength of extrapolation depends on thresholds for recurrent interactions: The lower the threshold, the further the forward shift of the final target position. It is assumed that cognitive factors and eye movements may adjust these thresholds.
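The core mechanism can be illustrated with a one-dimensional dynamic neural field. The following is a loose sketch of the idea, not Erlhagen and Jancke's implementation: It assumes NumPy, all parameters are invented, and the interaction kernel is skewed in the motion direction so that the population activity keeps drifting forward briefly after the stimulus vanishes; raising the recurrent-interaction threshold weakens this extrapolation.

```python
import numpy as np

x = np.linspace(-30.0, 30.0, 241)            # field positions (deg)
dx = x[1] - x[0]
offsets = (np.arange(241) - 120) * dx
# Excitation skewed 1 deg in the motion direction, plus global inhibition:
kernel = 2.0 * np.exp(-0.5 * ((offsets - 1.0) / 2.0) ** 2) - 0.5

def peak_after_offset(theta, speed=20.0, t_off=0.8, t_end=1.0, dt=0.005):
    """Euler-integrate the field and return the peak position at t_end,
    0.2 s after the moving input was switched off at t_off."""
    u = np.full_like(x, -1.0)                # field potential at rest
    for step in range(int(t_end / dt)):
        t = step * dt
        pos = -15.0 + speed * t              # stimulus path until offset
        stim = 4.0 * np.exp(-0.5 * (x - pos) ** 2) if t < t_off else 0.0
        rate = 1.0 / (1.0 + np.exp(-8.0 * (u - theta)))   # thresholded output
        recur = dx * np.convolve(rate, kernel, mode="same")
        u += (dt / 0.05) * (-u - 1.0 + stim + recur)      # field dynamics
    return x[np.argmax(u)]

vanishing_point = -15.0 + 20.0 * 0.8         # where the stimulus disappeared
for theta in (0.0, 3.0):                     # low vs. high recurrence threshold
    shift = peak_after_offset(theta) - vanishing_point
    print(theta, round(shift, 2))            # in this toy setting the forward
                                             # shift should shrink as theta rises
```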
Thornton and Hayes extend previous work on the mislocalization of the final target position to complex scenes. The authors used movies simulating self-motion through an artificial landscape, or movies of realistic scenes, such as passengers boarding a train. They found that the position in the final image of these movies was shifted in the direction of the self-motion. This effect was independent of the nature of the probe stimulus: Regardless of whether observers compared the final image of the movie to a static or a dynamic test image, a forward shift was observed. This paper broadens our understanding of memory for dynamic events by showing that mislocalization of the final target position is not confined to the highly impoverished stimuli used in previous studies (typically disks or rectangles), but may occur in more realistic scenarios as well.

In the final section of this issue, two studies with higher demands on motor control discuss how intended or executed manual actions influence spatial perception. Johnson-Frey, McCarty, and Keen asked their participants to grasp an object. Consistent with Fitts' law, they found that movement time increased with decreasing object size. Additionally, movement times were shorter when observers intended to transport the object to a new location than when they only had to lift it. This effect was independent of the difficulty of the task following the initial grasping movement. These results indicate that both the immediate and the future goal of a movement determine movement speed.

Finally, Humphreys, Riddoch, Forti, and Ackroyd review recent literature on two interesting symptoms: neglect and Balint's syndrome. They show that tool use may alleviate neglect of the contralesional side because of visual and visuomotor cueing. If patients explore space with a tool, objects close to the tool may be detected even if they fall within the neglected area of space. Neglect is also reduced if two objects, one falling into the neglected part of space, are placed relative to each other such that they are correctly positioned for action; for instance, a hammer may be placed above the nail. Similarly, correct action relations may allow for improved binding of object properties in a patient with Balint's syndrome.

Overall, the 12 papers included in this special issue present a number of exciting findings and raise a number of interesting questions for future research. In addition, the papers make clear that space perception and action is a central component of human perception and performance.

REFERENCES

Aglioti, S., DeSouza, J. F. X., & Goodale, M. A. (1995). Size-contrast illusions deceive the eye but not the hand. Current Biology, 5(6), 679–685.
Allport, D. A. (1987). Selection for action: Some behavioral and neurophysiological considerations of attention and action. In H. Heuer & A. F. Sanders (Eds.), Perspectives on perception and action (pp. 395–419). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Bridgeman, B., Hendry, D., & Stark, L. (1975). Failure to detect displacement of the visual world during saccadic eye movements. Vision Research, 15(6), 719–722.
Fehrer, E., & Raab, D. (1962). Reaction time to stimuli masked by metacontrast. Journal of Experimental Psychology, 63, 143–147.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15(1), 20–25.
Gordon, I. E. (1997). Theories of visual perception. New York: John Wiley & Sons.
Haffenden, A. M., & Goodale, M. A. (1998). The effect of pictorial illusion on prehension and perception. Journal of Cognitive Neuroscience, 10(1), 122–136.
Haffenden, A. M., & Goodale, M. A. (2000). Independent effects of pictorial displays on perception and action. Vision Research, 40(10–12), 1597–1607.
Hommel, B., Müsseler, J., Aschersleben, G., & Prinz, W. (2001). The Theory of Event Coding (TEC): A framework for perception and action planning. Behavioral and Brain Sciences, 24(5), 869–937.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford, UK: Oxford University Press.
Milner, D., & Dyde, R. (2003). Why do some perceptual illusions affect visually guided action, when others don't? Trends in Cognitive Sciences, 7(1), 10–11.
Müsseler, J., & Wühr, P. (2002). Response-evoked interference in visual encoding. In W. Prinz & B. Hommel (Eds.), Attention and performance XIX: Common
mechanisms in perception and action (pp. 520–537). Oxford, UK: Oxford University Press.
Nakayama, K. (1994). James J. Gibson: An appreciation. Psychological Review, 101(2), 329–335.
Norman, J. (2002). Two visual systems and two theories of perception: An attempt to reconcile the constructivist and ecological approaches. Behavioral and Brain Sciences, 25, 73–144.
Reed, E. S. (1996). Encountering the world: Toward an ecological psychology. New York: Oxford University Press.
Rock, I. (1983). The logic of perception. Cambridge, MA: MIT Press.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
von Helmholtz, H. (1866). Handbuch der physiologischen Optik [Handbook of physiological optics]. Hamburg, Germany: Voss.
VISUAL COGNITION, 2004, 11 (2/3), 137–160
Position of code and code for position: From isomorphism to a sensorimotor account of space perception

Peter Wolff
Department of Psychology, University of Osnabrück, Germany
The paper starts with a discussion of the assumption that positions in the outer world are coded by the anatomical locations of firing neurons within retinotopic maps ("coding position by position"). This "code position theory" explains space perception by some kind of structural isomorphism, since it implies that perceptual space is based on a spatial structure within the brain. The axiom of structural isomorphism is rejected. Subsequently, a sensorimotor account of space perception is outlined, according to which the spatial structure of the outer world is coded by the temporal structure of cortical processing. The basis is that action changes the perceiver's relationship to the outer world and, therefore, changes the representation of the outer world coded by the sensory responses of the brain. According to this view, the code for position is not a spatial but a temporal structure resulting from action ("coding position by action"). The sensorimotor account offers a possible solution to the binding problem. The paper ends with some remarks on the possible origin and function of retinotopic representations.

"CODE POSITION" THEORY

One of the many discoveries of neurophysiology during the last two decades was that the visual pathway is retinotopically organized: The retinal image corresponds to the spatial pattern of activated neurons within retinotopic maps in the superior colliculus, the lateral geniculate, and areas V1–V4, respectively (e.g., Zeki, 1993). Although the retinotopic organization gets more and more lost beyond the striate cortex, it nevertheless characterizes a considerable part of the
visual path: The retinotopic organization preserves, wholly or partially, the spatial structure of the retinal image within subcortical and cortical maps.

The discovery of retinotopic organization seems to corroborate the widespread assumption that the spatial structure of the retinal image is the basis for the spatial structure of the perceptual world—in other words, that retinotopic space is the basis for perceptual space. This idea is at least as old as the problem of the inverted retinal image: Why do we perceive the world upright and correctly left-right oriented although its retinal projection is a mirror image? This question is generated by the assumption that retinal space is the basis for perceptual space, and that the vertical and horizontal coordinates of the perceptual and retinotopic spaces correspond with each other. Retinal or retinotopic space as a basis for perceptual space means that the environmental positions of objects are coded by the retinotopic positions of firing cells. The proponents of this "code position" theory conceive the strategy of "coding position by position" as the obvious function of the retinotopic organization.

"CODING POSITION BY POSITION"

Theoretical problems

At first glance, "coding position by position" seems to be a simple and economical principle, according to which space perception is conceived less as a matter of complicated processing and more as a matter of direct anatomical wiring between the retina and the retinotopic maps. However, the code position theory is in fact neither a simple nor an economical account, nor does it explain space perception. Rather, it introduces many severe problems. The present line of argument focuses on saccadic eye movements, which are considered an example of intended action.
Please address correspondence to: Peter Wolff, Department of Psychology (FB 8), University of Osnabrück, Knollstr. 15, D-49069 Osnabrück, Germany. Email:
[email protected] I would like to thank Jochen Müsseler and two anonymous reviewers for helpful suggestions on a previous version of the paper and Ulrich Ansorge, Hans Colonius, Elena Carbone, Manfred Heumann, and Ingrid Scharlau for their comments on a related talk given at the University of Bielefeld. Special thanks for additional help, also, to Werner Klotz and Christine Klaß. © 2004 Psychology Press Ltd http://www.tandf.co.uk/journals/pp/13506285.html DOI: 10.1080/13506280344000383
One problem is introduced by the third dimension, since depth cannot be coded by retinotopic position. One might argue, however, that retinal disparity, the basis for perceptual depth, is likewise derived from retinal space (i.e., from retinal space differences between the two eyes). But even this argument shows that "coding by position" cannot be conceived as a matter of direct anatomical wiring: It needs additional processing. Retinotopic position could at most only contribute to perceptual position.

This conclusion does not only follow from the issue of perceptual depth, but holds in general. The reason is that the optical system of the eye is not able to produce an analogue retinal image with an accurate scale. The retinal image is distorted and blurred, for example, by spherical and chromatic aberration of the crystalline lens (e.g., Charman, 1991, Fig. 1). It is, therefore, not topographically but at best topologically in line with the environment (Van der Heijden, Müsseler, & Bridgeman, 1999a). Such an imprecise reproduction of a spatial structure could not be of much use in coding environmental space. Thus, even for the monocular perception of a two-dimensional frontoparallel plane, "coding position by position" needs additional processing for correction. While this problem results from the optical properties of the eye, further complications result from the neural magnification of the foveal area in the geniculate and cortical retinotopic maps (Figure 1). Compensation by still more additional mechanisms seems to be necessary.

It can be concluded from the preceding considerations that the postulated strategy of "coding by position" is not as simple and economical as it seemed at first glance. It is not simple, since the retinotopic space cannot directly code the environmental space; it is not economical, since it depends on a lot of additional mechanisms for correction. As long as the anatomy of the correction procedures is not clarified, the code position theory does not really explain space perception but evades the problem by leaving it to the correction mechanisms.

Empirical problems

The clearest prediction that follows from the code position theory is that stability and instability of retinotopic positions should correspond to stability and instability of perceptual positions, respectively. However, this prediction is not confirmed.

1. Visual stability during saccades. We see a stable world during saccades although the retinotopic position of objects shifts with every saccade. Thus, shifts of retinotopic positions do not correspond to shifts of perceptual positions.

2. Perceived motion of an afterimage. An afterimage moves across perceptual space as a result of saccades although its position is retinotopically fixed. Thus, constant retinotopic positions do not correspond to constant perceptual positions.

3. Perceived motion of a visually tracked target. A moving target that is tracked by pursuit eye movements is perceived as moving on the resting
Figure 1. Illustration of the retinal image (b) and the retinotopic representation (c) of a two-dimensional stimulus pattern (a) with the fixation on the right of the stimulus pattern.
background, although retinotopically the reverse is true: The target rests and the background moves. Thus, retinotopic rest and motion do not correspond to perceptual rest and motion, respectively.

One might argue that the reported findings merely concern the egocentric positions of objects (i.e., objects' positions relative to the observer) and that they do not refer to the allocentric positions of objects (i.e., objects' positions relative
to one another; cf. Pick & Lockman, 1981). According to this argument, it is not the anatomical positions of the firing cells that contribute to the perceptual positions but, rather, the relative positions of the firing cells. In other words, "coding position by position" actually means "coding layout by layout". It is true that a rigid shift of the whole pattern of firing cells across the retinotopic space does not change the retinotopic layout (Smeets & Brenner, 1994). However, the saccadic reafference is by no means a simple rigid shift of the whole retinal image, but rather a nonlinear modification of the retinotopic layout. Because of the spherical aberration of the cornea and the crystalline lens, each saccade deforms the spatial structure of the retinal image in a complicated way. Consequently, each saccade produces complex transformations of the whole retinal configuration. The important point is that these transformations do not correspond to perceptual transformations at all, since the perceptual layout remains stable during eye movements. Thus, deformations of the retinotopic layout do not correspond to deformations of the perceptual layout.

It can be concluded from the preceding considerations that stability and instability of the retinal space do not produce corresponding stability and instability of the perceptual space. Although the reported empirical findings contradict the code position theory, they are traditionally not treated as its refutation. Rather, they are introduced as classical problems of space perception, which are discussed within the framework of the code position theory. The traditional solutions to the classical problems suggest special mechanisms based on extraretinal signals, which serve to neutralize the reafferent effects of active eye movements (e.g., Honda, 1991; Matin, 1976, 1982; Shebilske, 1976, 1977; Sperry, 1950; Steinbach, 1987; Von Holst & Mittelstaedt, 1950/1980). This presents a serious problem, since the reafferent change produced by an intended eye movement is, in fact, the intended result of that movement. It is the reason why the movement has been executed (Wolff, 1984).

"CODING POSITION BY POSITION" AND INTENDED ACTION

The reported findings clearly show that "coding position by position" is not suited to an actively moving visual system. According to the code position theory, reafferent retinotopic changes should produce the same perceptual changes as exafferent changes—a prediction which, as reported already by von Helmholtz (1866), actually holds for unintended passive eye movements (see also Brindley & Merton, 1960; Skavenski, Haddad, & Steinman, 1972, p. 290). However, it does not hold for active, that is, intended saccadic eye movements.

That does not mean, however, that the code position theory applies any better to resting-eye conditions. On the contrary, when an observer fixates a resting object while the background moves, induced motion (Bridgeman, Kirch,
& Sperling, 1981; Duncker, 1929) is observed. The fixated, resting object is perceived to move in the direction opposite to the background, although neither the retinal image of the object nor the eye moves. As clearly demonstrated by Bridgeman (1986a, 1986b), induced motion depends on the efferences that are necessary to maintain the intended fixation and keep the eye from being moved by the background through the optokinetic reflex (cf. Leibowitz, Post, & Sheehy, 1986). Thus, it is the intended behaviour of the eye that presents a problem for the code position theory—irrespective of whether the intended behaviour is an eye movement or steady fixation. We may conclude that the strategy of "coding position by position" is not suited to a visual system that is capable of intentionally controlling the behaviour of the eye. The reason is that intentional control needs information on how to control the sensory states by action. The retinotopic position does not offer this information, since it is unrelated to action (see below).

"CODING POSITION BY ACTION"

Theoretical considerations

"Coding position by position" is a strategy that requires a precise image of the environmental space but is disrupted by actions at the same time. Thus, it would fit best a passive monitoring system whose optical projection capacities are excellent and which never moves. For such a system, however, a perceptual world would be useless, since a perceptual world is needed only for intended movements (i.e., for action planning). Additionally, the equipment of the visual system does not meet the requirements of "coding position by position". Neither the optical equipment of the eye nor the neuroanatomical structure of the retina (with a receptor system analysing more than 90% of the retinal image only in terms of global features) seems designed to deliver a precise, analogue image of the environmental space. At the same time, the visual system has excellent movement capacities and optimal kinetic equipment (i.e., bearing, suspension, torque, etc.). The problems with the code position theory (see above) arise because the supposed strategy of "coding position by position" does not fit the actual equipment of the visual system. This equipment suggests a strategy that, in contrast to "coding position by position", does not depend on retinotopic space but instead on active movements. Obviously, the visual system is designed for a strategy of "coding position by action".

The so-called "classical problems of space perception" (see above) are introduced by the fact that the proximal metric of retinotopic maps does not refer to intended action. This means that the retinotopic positions do not inform about how to use the positions for action planning. The problem is fundamental
and cannot be solved by compensating for the saccadic reafference or by optimizing the optical conditions of the eye.

If perception is for action (Allport, 1987; Goodale & Humphrey, 2001; Neumann, 1987, 1990; Van der Heijden, 1995) and if, consequently, the perceptual world is for action planning, the code for the positions of objects should represent:

1. Those actions that are made possible by these positions (i.e., those actions that these positions afford, in the sense of Gibson, 1979; cf. Bridgeman, van der Heijden, & Velichkovsky, 1994).

2. The consequences that will result from these positions when actions are executed.

Since coding by retinotopic position can in no way fulfil these functions, retinotopic space can in no way contribute to perceptual space. It is not easy to imagine that an organism that needs information for intended actions could have evolved with a perceptual system that is not tuned to these needs of action control.

Empirical data

While "coding position by position" cannot be verified empirically, "coding position by action" can be based on findings according to which the perceived position of an object is more closely related to the saccadic eye movement system needed to fixate that object than to the retinal position of the corresponding retinal image. Recently, Van der Heijden et al. (1999a) reported that the eccentricity of briefly presented targets is underestimated by about 10%, an amount that equals the magnitude of undershoots of saccadic eye movements. Erlhagen and Jancke (2004 this issue) argue that predictive eye orientation leads to perceived mislocalizations of future positions of a moving target. Thus, the perceived position seems to reflect features of the eye movements (see also Müsseler & Van der Heijden, 2004 this issue; Müsseler, Van der Heijden, Mahmud, Deubel, & Ertsey, 1999; Stork & Müsseler, 2004 this issue; Van der Heijden, Van der Geest, de Leeuw, Krikke, & Müsseler, 1999b).

Corresponding data have been reported concerning the perceived figural layout. For example, geometrical optical illusions are reduced when the figures are inspected for some time (overview in Coren & Girgus, 1978). The decrement of the illusion depends on saccadic exploration, since no decrement is observed with steady fixation of the same duration (e.g., Coren & Hoenig, 1972; Day, 1962; Festinger, White, & Allyn, 1968) or when saccades are prevented by other methods (e.g., Burnham, 1968; Lewis, 1908). These findings suggest that saccadic eye movements contribute to the perceptual space, a conclusion further corroborated by results of experiments on perceptual learning. If a prismatic contact lens is attached to the eye, the initially perceived distortion of the environment is reduced as the display is explored by saccadic eye movements with the head in a fixed position (Festinger, Burnham,
Ono, & Bamber, 1967; Slotnik, 1969; Taylor, 1962/1975). Removing the prism lens produces an aftereffect. The adaptation occurs even if the explored stimulus is a curved line that initially appears straight because of the prismatic modification (Slotnik, 1969). Thus, adaptation does not merely result from the "Gibson effect", according to which a curved line is perceived as less curved after exploration by the naked eye (Coren & Festinger, 1967; Gibson, 1933). Adaptation to the prismatic modification of a contact lens is a genuine perceptual effect produced by saccades. With prism goggles and the head fixed, saccadic exploration produces either no adaptation (Cohen, 1965, cited by Welch, 1978) or merely the Gibson effect (Festinger et al., 1967). Prism lenses and prism goggles differ only with respect to their influence during eye movements: While the prism lens modifies the reafference of the saccadic eye movement, the prism goggles do not. The reason is that the contact lens moves with the eye (Howard & Templeton, 1966; Taylor, 1962/1975; Wolff, 1987).

PLAUSIBILITY OF THE CODE POSITION THEORY

In spite of its severe shortcomings, the code position theory is widely accepted. Mostly, "coding position by position" is even implicitly introduced as an empirical fact that needs no further justification or explicit mention. The reason is the extraordinary plausibility of the theory, which might result from two different points that depend on each other.

Point 1 refers to the relationship between the environment and the retinotopic structure. Since the retinal image is a projection of the environment, the retinotopic space depends on the environmental space. This means that the retinotopic structure is causally connected with, and topologically similar to, the environmental structure. Therefore, the retinal space can be conceived as a representation of the environmental space.

Point 2 refers to the relationship between the perceptual world and its corresponding neural base. The code position theory embodies the principle of structural isomorphism, which is less sophisticated than the principle of functional isomorphism proposed by Köhler (1947). While, according to Köhler, the "experienced order in space is always structurally identical with a functional order in the distribution of underlying brain processes" (p. 61), the code position theory implies that the physiological basis of the perceptual space is itself a spatial structure. This psychophysical principle of structural isomorphism is widely accepted, although mostly it is not explicated but merely implied.

Consider, for example, the current discussion on "feature binding". The question why different features of an object are perceived as attributes of one and the same object is often justified by the fact that the processing of the features is distributed across the brain. Gray (1999, p. 36), for example, writes:

Given that the activity evoked by the features comprising an object is distributed, some mechanism is necessary to identify the members of a
representation as belonging together and to distinguish them from other representations that may be present at the same time.

Accordingly, the binding postulate is based on the assumption that, without additional binding, the features of an object cannot be perceptually integrated, just because they are processed at different anatomical positions within the brain. As if, according to the principle of isomorphism, without the binding procedure the distribution of the processing of features across different positions in the brain would produce a distribution of the perceptual features across different positions in the perceptual world ("externalization of segregation"; Taraborelli, 2002). My present intention is not to declare "binding" a pseudoproblem, but rather to demonstrate that the idea of isomorphism is currently still alive (cf. Scheerer, 1994). Though, if "binding" were justified by nothing other than the distributed processing across the brain, it would be a pseudoproblem indeed. Fortunately, there are additional reasons to postulate a binding mechanism (Roskies, 1999).

Anyhow, the code position theory seems to be anchored on two sides (Figure 2). On the one side, the retinotopic space is conceived to be a representation of the environmental space; on the other side, retinotopic space is conceived to be the isomorphic neural base of the perceptual space. Accordingly, the retinotopic space is conceived as a natural mediator between environmental and perceptual space. For this reason, the code position theory seems so self-evident. However, I will argue that retinotopic space functions neither as a representation of environmental space nor as a neural base of perceptual space.

THE REPRESENTATION PROBLEM

With regard to the relation between environmental and retinotopic space, causal connection and topological similarity are discernible merely from the outsider viewpoint of an external observer. For example, the neurophysiologist has separate access to both the environmental stimulus and, in principle, to the retinotopic spatial structures within the perceiver's brain. Therefore the retinotopic space can be conceived as a representation of the environmental space, but only from the outsider view.
Figure 2. Retinotopic mediation: Retinotopic space as representation of the environmental space and as isomorphic physiological basis of the perceptual space.
From the insider view of the visual system, however, the conditions are totally different, since the visual system is a closed system that is totally blind to what is outside its sensory universe. Therefore, from this point of view, retinotopic space cannot represent environmental space, and the position of the retinotopic code cannot code environmental position. Accordingly, from the insider view of the visual system, retinotopic space does not represent anything. It is without any meaning (Van der Heijden et al., 1999a), though not without any function (Wolff, 1999a; see below).

THE ISOMORPHISM PROBLEM

While nobody presumes to claim that the perception of nonspatial features like colour, hue, etc. correlates with an isomorphic neural base, the idea of isomorphism seems reasonable for space perception. The reason is that both the perceptual structure and the neurophysiological structure can be described along the same spatial dimension (Prinz, 1985). The same holds for temporal extension: Both the perceptual world and its neural base can be described within the same spatiotemporal frame of reference—a fact that involves the danger of confusing the representings with the representeds (Bridgeman et al., 1994; Dennett & Kinsbourne, 1992). Such confusion cannot happen in the case of nonspatial and nontemporal features, like colour, hue, etc., but it does happen with the code position theory of space perception. To consider the position of the code as the code of position means to confuse the coder with the coded or, semiotically speaking, the sign vehicle that carries meaning with the designatum to which the meaning of the sign refers (cf. Millikan, 1993; Taraborelli, 2002).

If one conceives the brain as a tool and perception as its performance, it becomes quite clear that features of the brain, i.e., the retinotopic structure of firing cells, must not be confused with features of the perceptual content, i.e., the perceptual positions. As a feature of the brain, the retinotopic structure can become a subject of perception, but only from the outsider view of the external observer. From the insider view of the visual system, the retinotopic map does not exist as a space.

TEMPORAL STRUCTURE VERSUS SPATIAL STRUCTURE

With natural signs, the structure of the sign vehicle might be similar to the structure of the designatum (e.g., footprints in the snow representing feet). But similarity is not required in general, and it does not hold for the brain and its
performance. According to Creutzfeldt (1979, pp. 217–218), "there is no unified representation of the world in any single cortical area" and "no state of brain activity can be defined as consciousness". Nowhere in the brain has a space-constant topographic map been found that might be correlated with the stable perceptual world. On the contrary, a characteristic of the central nervous system is that its activity is in constant change. There is no stable state, but rather ongoing variation. The same holds for the retinotopic maps. The reason is that the alert perceiver is moving all the time. She/he produces more than 150,000 saccadic eye movements every day (Robinson, 1981). Each action changes the relationship of the perceiver to the outer world and, consequently, changes the sensory responses of the brain to the outer world. ("Sensory" means that the responses are determined exclusively by afferent stimulation, irrespective of whether they are located at a peripheral or a cortical level; cf. Müsseler, 1999.)

Together with the former conclusion that the visual system seems to be designed for "coding position by action", the present consideration leads to the insight that the information about the environmental space is carried by how the sensory responses of the brain change with intended movements, that is, by the reafferent change that can be produced by action. Accordingly, "one may… define consciousness as an ongoing process rather than a state" (Creutzfeldt, 1979, p. 218). What follows is the exact opposite of what the code position theory states: The reafferent change is not something that has to be compensated for, as the code position theory assumes, but, on the contrary, something that has to be produced in order to establish the perceptual space (MacKay, 1973, 1978, 1984; O'Regan & Noë, 2001; Wolff, 1984).

Within the code position theory, the reafferent sensory change is described as a spatial variation, i.e., a shift and/or deformation of the retinotopic structure. However, the reafferent change can just as well be described as a temporal variation, i.e., as a pattern of temporal activity of the responding neurons (Wolff, 1999b). Both descriptions are equally correct and both are available from the outsider view. From the insider view, however, the reafferent change is not available as a spatial variation in principle, since the retinotopic space does not exist from this point of view (see above). Consequently, the only information that is accessible from the insider view is what the temporal activity patterns of the responding cells transmit. Thus, we may conclude that information about the environmental space is carried by the temporal activity patterns of the cells, i.e., the temporal patterns that coincide with intended movements.

The idea that the perceptual space is based exclusively on temporal information fits with Jerison's theory, according to which consciousness has evolved from mechanisms of orientation in nocturnal animals. These mechanisms served to integrate olfactory and auditory stimulation, i.e., basically temporal information, to construct an internal representation (Jerison, 1973, 1994).
It is easy to see that the way in which the cells' responses change with exploring movements depends on the explored environmental structure. The temporal pattern of the sensory responses is causally connected with the spatial structure of the environment and can, therefore, deliver information about the environmental space. But since this causal connection is discernible merely from the outsider view, we have a representation problem again: How can the visual system extract information about the environmental space from the temporal activity patterns although it cannot refer to the outer world and cannot transcend its sensory universe?

The problem can be easily dissolved. True, the visual system—as a closed system—cannot refer to the outer world, but it need not do so; it need not even know that an outer world exists. The reason is that information on how the temporal activity patterns coincide with movements is quite enough to inform about:

1. What actions can be done, and

2. What effects will result from actions.

This information is available from the insider view and is exactly what the intending system needs. I would like to show now that the perceptual space is based on exactly that information. The availability of this information is the function the perceptual world is serving. What we conceive to be a representation of the outer world is, in fact, a presentation of how the brain's sensory responses to the world can be changed by intended movements (O'Regan & Noë, 2001; Wolff, 1985, 1986, 1987, 1999b). Therefore, in order to find the physiological basis of the perceptual world, we do not have to look for a spatial structure. Rather, we have to describe the temporal activity patterns as a function of intended movements.

A SENSORIMOTOR ACCOUNT OF SPACE PERCEPTION

Activity of a single cell during exploration

To illustrate this point, I use saccadic eye movements as an example of intended actions. However, the account should be valid for any intended actions in general. I refer to an ingenious experiment published as early as 1978 by Creutzfeldt and Nothdurft, who developed a method to investigate transfer properties of neurons in the cat, using pictures of complex visual stimuli. The picture was moved over the receptive field of a neuron along vertical and horizontal lines so that the neuron systematically scanned the whole picture. Each discharge of the cell was recorded continuously. Each time a discharge occurred, a spot was produced at the corresponding position of the stimulus detail that, at that moment, was moved across the receptive field. So the activity of the neuron during scanning was presented in a two-dimensional dot display in scale with the original picture.
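The logic of the procedure is easy to reproduce in simulation. The sketch below is an illustration invented for this text, not Creutzfeldt and Nothdurft's method in detail (it assumes NumPy, and the image, receptive-field profile, and threshold are arbitrary stand-ins): An image is scanned across a single model OFF-centre cell, and every position at which the cell's response exceeds a discharge threshold is marked; the accumulated marks form the transfer pattern in scale with the picture.

```python
import numpy as np

rng = np.random.default_rng(1)
image = rng.random((64, 64))          # arbitrary stand-in for the bullfinch photo

# Difference-of-Gaussians weights for an OFF-centre receptive field:
# broad excitatory surround minus narrow centre, so dark centres excite.
r = np.arange(-4, 5)
yy, xx = np.meshgrid(r, r, indexing="ij")
d2 = xx ** 2 + yy ** 2
rf = (np.exp(-d2 / (2 * 3.0 ** 2)) / (2 * np.pi * 3.0 ** 2)
      - np.exp(-d2 / (2 * 1.0 ** 2)) / (2 * np.pi * 1.0 ** 2))

transfer = np.zeros_like(image)
padded = np.pad(image, 4, mode="edge")
for i in range(image.shape[0]):       # scan the picture line by line...
    for j in range(image.shape[1]):   # ...so the cell sees every detail
        response = np.sum(rf * padded[i:i + 9, j:j + 9])
        if response > 0.002:          # arbitrary discharge threshold
            transfer[i, j] = 1.0      # dot at the scanned picture position
# `transfer` now shows where in the picture this one cell discharges;
# the same pattern results whether the picture moves or the eye scans.
```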
Among other stimulus patterns, Creutzfeldt and Nothdurft (1978) used the bullfinch photo shown in Figures 1 and 2. The corresponding transfer pattern, recorded from a geniculate OFF-centre cell, is shown in Figure 3. The transfer pattern shows the spatial distribution of the cell's discharges across the stimulus plane. (For the following considerations, the type of the cell from which the activity is recorded is of no relevance. The transfer patterns that Creutzfeldt and Nothdurft, 1978, recorded from cortical cells look very similar, but were not so well suited for illustration.)

Figure 3. Stimulus pattern and corresponding transfer pattern (right), recorded from a geniculate OFF-centre cell by Creutzfeldt and Nothdurft (1978). (Reprinted by permission of Dr. C. Nothdurft and Springer-Verlag, Heidelberg.)

The significance of the recordings by Creutzfeldt and Nothdurft (1978) becomes obvious if one realizes that, under ecological conditions, the stimulus is resting and the eye moving. In such ecological standard conditions, the same transfer pattern will result irrespective of the specific movement path, provided that the exploration is exhaustive. Thus, in order to understand the fundamental functional role played by the transfer functions of responding cells, one has to consider the transfer pattern of Figure 3 under the assumption that the stimulus was not moved in front of the resting eye but that, instead, the eye actively explored the resting stimulus by intended saccades.

Now the transfer pattern acquires a totally new meaning, because the plane of the transfer pattern acquires another metric, namely, the metric of the intended saccades. The transfer pattern now presents the information that the activity of one single cell provides during exhaustive exploration, namely, the layout of the cell's responses across the two-dimensional continuum of saccadic eye movement. The layout presents all possible coincidences of intended movements and activity patterns. Although the coincidences are temporal events, their entirety has to be represented by a spatial array of paths, an array that describes how the cell's activity can be manipulated by action. The position of each momentary cell discharge is defined by how it can be changed by intended saccades. Since the metric of the continuum is action related, the spatial structure does not suffer from optical distortion. It is important to realize that the spatial structure depends solely on the movements of the eye relative to the environment, that is, on the ecologically relevant factors, and not at all on the aberrations of the eye's optical system.

Activity of the entire map during exploration

It is clear that the transfer pattern is independent of the retinotopic position of the responding cell. During exhaustive exploration, exactly the same transfer pattern will result for every other cell of the retinotopic map, provided that each cell has a receptive field of the same size (which we shall assume, for the sake of simplicity). Figure 4 illustrates this point for two cells at two different retinotopic positions.
150 WOLFF
Figure 3. Stimulus pattern and corresponding transfer pattern (right), recorded from a geniculate OFF-centre cell by Creutzfeldt and Nothdurft (1978). Reprinted by permission of Dr. C.Nothdurft and Springer-Verlag, Heidelberg).
simplicity). Figure 4 illustrates this point for two cells at two different retinotopic positions. Accordingly, during exhaustive exploration, the spatial structure is overspecified by the activity of the entire retinotopic map. At each point in time, however, each cell specifies only one single position of the spatial structure, respectively. In particular, different cells specify different positions of one and the same spatial structure, at the same time. The reason is that each cell scans a different area of the visual field. What follows is that all positions of the structure are specified simultaneously by the activity of the entire retinotopic map. In other words, any time, the whole structure is completely available and defines the perceptual space. Each perceptual position is specified by the activity of one cell of the retinotopic map. Figure 5 illustrates that point for some perceptual positions. The perceptual positions do not depend on the positions of the firing cells, but on how the cell activities can be changed by movement. Which perceptual positions are specified by which cells is changed by every movement. The information that becomes available successively through the temporal activity pattern of a single cell during exhaustive exploration is now available simultaneously through the activity of the entire retinotopic map. All current cell responses are connected to each other according to the context of movement, which has been learned by exploration. This way, they are located and this is the way position is coded by action, respectively. The temporal patterns of all cells’ responses provide the structure of a space whose dimensions are defined by the degrees of freedom of the intended saccades. According to the terminology of Husserl (cf. Scheerer, 1985), the intended movements are the “space-giving” constituents, and the temporal cell responses that register the sensory changes are the “space-filling” constituents.
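To make the construction concrete, the following minimal sketch (my illustration, not part of the original paper; the toy image, the thresholded OFF-centre unit, and all parameter values are assumptions) rebuilds a transfer pattern in the spirit of Creutzfeldt and Nothdurft (1978): a single unit is swept across an image along scan lines, and a dot is deposited at the stimulus position for every discharge.

```python
import numpy as np

def off_centre_response(image, cx, cy, r_centre=1, r_surround=3):
    """Toy OFF-centre unit: surround mean minus centre mean at (cy, cx)."""
    h, w = image.shape
    ys, xs = np.ogrid[:h, :w]
    d2 = (ys - cy) ** 2 + (xs - cx) ** 2
    centre = image[d2 <= r_centre ** 2].mean()
    surround = image[(d2 > r_centre ** 2) & (d2 <= r_surround ** 2)].mean()
    return max(surround - centre, 0.0)  # fires when the centre is darker

def transfer_pattern(image, threshold=0.1):
    """Scan the whole image with one unit; deposit a dot wherever it fires.

    Equivalently (the point of the argument): the image is moved across a
    fixed receptive field, and each discharge is plotted at the position of
    the stimulus detail currently over the field.
    """
    h, w = image.shape
    pattern = np.zeros_like(image)
    for cy in range(3, h - 3):          # horizontal scan lines
        for cx in range(3, w - 3):
            if off_centre_response(image, cx, cy) > threshold:
                pattern[cy, cx] = 1.0   # dot in scale with the original picture
    return pattern

# Toy stimulus: a dark blob on a light background stands in for the bullfinch.
img = np.ones((40, 40))
img[15:25, 10:30] = 0.2
tp = transfer_pattern(img)
print(f"{int(tp.sum())} dots; the transfer pattern outlines the dark region")
```

Because each dot is plotted at the position of the stimulus detail rather than at the unit’s anatomical position, running the same loop for a unit wired to any other retinotopic location yields the same pattern, which is precisely the invariance that Figure 4 illustrates.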
Figure 4. Transfer patterns from cells of two different retinotopic positions.
Although all cell activities are temporal events, they carry information about spatial structure. That spatial structure can be conceived of as the array of all coincidences of intended movements and cell activities, that is, as the invariance that determines which temporal activity patterns of all cells coincide with which saccade. The whole spatial structure describes how the entirety of the momentary cell activities can be changed by intended movements. This is what the perceptual world is made for: It offers possible movements by presenting their sensory effects. In this way, the perceptual positions represent (1) the actions that these positions afford and (2) the consequences that will result from these positions when actions are executed (see above). For this reason, the space is precisely the space needed for intended actions, that is, for saccades. Each position within the perceptual space is a response that can be produced by each cell. And the spatial structure (i.e., the distribution of all possible responses) offers the movements that are needed to produce it. So, the perceptual world offers both the effects that can be intended and the movements that are needed to realize those intentions. Therefore, it is exactly tuned to an organism that is able to intend actions.

Figure 5. Retinotopic and perceptual space: All perceptual positions are specified by the temporal activity of the entire retinotopic map at the same time. The figure illustrates some perceptual positions (circles), which are specified by the activity of some cells (squares), at a given moment. If the eye moves, these perceptual positions will be specified by the activity of other cells. The reason is that the perceptual positions do not depend on the positions of the firing cells, but on how the cell activities can be changed by movement. For the sake of simplicity, the illustration of the perceptual space does not take into account differences in spatial resolution between retinal locations.
REPRESENTATION VERSUS PRESENTATION

According to the present account, the perceptual world is an objective description of all possible sensory changes that can be produced by intended movements. It is a description of how the sensory states can be manipulated by saccades. The description is available from the insider view and does not require any knowledge about the external world. In this respect, the perceptual world does not represent the external world but, rather, presents the sensory consequences of possible actions. In spite of this, the perceptual topology agrees with the environmental topology: If a variation resulting from a movement within a stable structure is decomposed into a movement and a stable structure, both the movements and the structures will topologically agree. Of course, the perceptual space depends on the environmental space, and the presented sensory consequences of possible actions are continually validated by ongoing actions. If the sensory consequences of afforded actions turn out to be wrong, perceptual learning will restore the agreement according to the outlined sensorimotor mechanism (Wolff, 1985, 1987, 1999b).

PERCEPTUAL SPACE, ACTIVITY PATTERN, AND RETINOTOPIC SPACE

It is important to realize that the perceptual space is not a construction on the basis of the temporal sensory structure. Rather, moving within the stable perceptual world along a specific path, on the one hand, and producing specific temporal activity patterns of neurons by a saccade, on the other hand, are one and the same reality described from different cognitive positions. The former cognitive position is the insider view; the latter is the outsider view. A distance in perceptual space is a possible coincidence of intended movement and activity patterns. A current coincidence is a movement across the space, regardless of which neuron fires with which activity pattern. The reason is that the spatial structure describes which activity patterns will accomplish which movements. It does not tell, however, which cells respond with which activity patterns. Therefore, the retinotopic space is of no relevance for coding position. Even if the spatial arrangement of the cells within the retinotopic maps were mixed up, so that the retinotopic ordering (and, of course, the topological similarity to the environmental space) were lost, space perception would still work. The only necessary condition is that the connections between the analysing cells and the areas of the visual field that the cells scan by way of their receptive fields remain constant. If the connections are changed by an optical modification (e.g., a prismatic contact lens), new exploration is needed.
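The scrambling argument can be restated as a toy model (again my sketch, with invented names and a one-dimensional world, not an implementation from the paper): position is read out by matching each cell’s current response against its learned movement-contingent profile, so permuting the anatomical order of the cells leaves the readout untouched, whereas a readout that equates position with map index does not survive the permutation.

```python
import numpy as np

rng = np.random.default_rng(7)
N_CELLS, N_EYE = 12, 8
world = rng.random(N_CELLS + N_EYE)       # 1-D stand-in for the environment

class Cell:
    def __init__(self, field_pos):
        self.field_pos = field_pos        # fixed wiring: which field area it scans
        # exploration: the cell's activity for every intended saccade,
        # i.e., its movement-contingent profile (its "transfer pattern")
        self.profile = np.array([world[e + field_pos] for e in range(N_EYE)])

    def respond(self, eye):
        return world[eye + self.field_pos]

retinotopic_map = [Cell(i) for i in range(N_CELLS)]

def code_by_position(cells, eye):
    """'Coding position by position': position := index of the most active cell."""
    return int(np.argmax([c.respond(eye) for c in cells]))

def code_by_action(cells, eye):
    """Sensorimotor readout: match each cell's current response against its own
    profile and take a majority vote on the implied eye position; the cells'
    order within the map is never consulted."""
    votes = [int(np.argmin(np.abs(c.profile - c.respond(eye)))) for c in cells]
    return max(set(votes), key=votes.count)

eye = 5
scrambled = list(rng.permutation(retinotopic_map))   # mix the anatomical order

print(code_by_action(retinotopic_map, eye) == code_by_action(scrambled, eye))    # True
print(code_by_position(retinotopic_map, eye), code_by_position(scrambled, eye))  # differ in general
```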
FEATURE BINDING

According to the present sensorimotor account, the current response of a firing cell codes position by how the response can be changed by movements, that is, by its activity pattern produced by action. Consequently, distributed processing across the brain does not pose a binding problem, since the locations of cells within the brain are not involved in coding position. Furthermore, the activity pattern is independent of the feature to which the cell is sensitive. The reason is that different features of one and the same object (e.g., colour and form) are distributed across the explored environmental space in exactly the same way, since they occupy exactly the same environmental area. Consequently, the temporal activity patterns of the (colour- and form-analysing) cells will be correlated during exploration. Such correlation might be related to the synchronization of cell activities that is coupled with “binding” (see, among others, von der Malsburg, 1981/1994, 1999). Some authors propose an optimal frequency on the order of 30–70 Hz (e.g., Singer & Gray, 1995). Perhaps the tremor of the eyes, which is on the order of this rhythm, is not merely a random artifact of the oculomotor system, but serves the function of updating and maintaining the perceptual space. Engbert and Kliegl (2003) and Hafed and Clark (2002) have recently shown that microsaccades are at least not random movements, as often assumed, but are correlated with covert attention. Tremor and microsaccades, however, could only verify the spatial relations between nearest positions. But such verification would spread out across the whole space. According to the present account, binding is not a problem, but results quite naturally from the sensorimotor basis of perception. The term “binding” refers to one aspect of structuring the perceptual world. Another aspect is segregation. Both aspects are two sides of one coin, namely, the structuring by intended movements.

MULTIPLE SPACES

For the sake of simplicity, we have illustrated the sensorimotor principle using the example of eye movements and have considered only what the two-dimensional saccades contribute to the perceptual space, namely, a two-dimensional perceptual plane. Such an oculomotor space established by saccades can be used to selectively elicit and control actions. It allows “selection-for-action” (Allport, 1987; Neumann, 1990) and offers position information that can be used by visual selective attention in order to “structure information in time” (Van der Heijden, 1992). However, our prolific perceptual world is not reduced to that oculomotor space. Rather, all other movement systems that can be controlled voluntarily (i.e., convergence movements, head movements, locomotion, manual manipulation, etc.) might contribute to the perceptual world as well. For example, binocular depth perception is assumed to be constituted by how the responses of
the disparity-tuned cells (Barlow, Blakemore, & Pettigrew, 1967; Hubel & Wiesel, 1970) can be changed by convergence movements. Accordingly, each movement system is assumed to be “space giving”, that is, to constitute its specific space. Evidence for the existence of multiple spaces comes from recent neuropsychological studies indicating that space is not coded in a unitary way and that the differentiation between specific spaces follows action (Berti, Smania, & Allport, 2001; Humphreys & Heinke, 1998; Humphreys & Riddoch, 2001; Humphreys, Riddoch, Forti, & Ackroyd, 2004 this issue; see also Bremmer, Schlack, Duhamel, Graf, & Fink, 2001, and Matelli & Luppino, 2001, for corresponding conclusions from neurophysiological investigations with animals). The integration of the different movement-specific spaces results from the coordination between the respective movement systems. The compensatory eye movements that accompany head movements in a coordinated way are an example. In agreement with the present sensorimotor account, each movement system creates its own perceptual properties. The perceptual properties depend on how the movement system changes the activities of cells responding to the outer world, i.e., the sensory representation of the environment in the brain. In any case, it is the temporal structure of the brain’s activity resulting from the particular action that counts.

ORIGIN AND FUNCTION OF RETINOTOPIC POSITION

If retinotopic position does not code position, why is it preserved in multiple maps? Several aspects might be of interest in this regard.

Organotopics

First of all, retinotopics is a special variant of organotopics, a principle according to which the sensory path from the periphery to the central system preserves the spatial relations between the receptors in the periphery. This principle holds for the auditory and somatosensory systems as well. While somatotopics could, in principle, code position by position, the cochleotopics of frequency-specific maps quite obviously does not code position, but frequency. The relevant point is that retinotopics is a special case of a general principle that need not be related to coding position.

Ontogenetic constraints

At the beginning of ontogenetic development, the connections between the retina and the higher levels are more or less disordered. As Menzel and Roth (1996, p. 259) report, the retinotopic ordering is formed in a self-organizing way by the interaction between the synaptic ends of the incoming fibres (from the retina and the thalamus, respectively) and the postsynaptic cells. The global ordering emerges according to the principle that simultaneously and similarly activated synapses reinforce each other, while synapses that are not simultaneously and similarly activated inhibit each other. In this way, the global ordering emerges through the elimination of disordered connections, without any specific visual experience.
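The principle Menzel and Roth describe can be caricatured with a Kohonen-style self-organizing sketch (my illustration; they do not present this model, and all names and parameters are invented): spontaneous retinal waves make neighbouring receptors fire together, and competitive Hebbian updates refine an initially disordered wiring into a globally ordered map without any structured visual input.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16                                   # retinal positions and map units
W = rng.random((N, N))                   # initially disordered retina->map wiring

def wave(centre, width=2.0):
    """Spontaneous retinal wave: neighbouring receptors fire together.
    No structured visual experience is involved."""
    x = np.arange(N)
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

for _ in range(8000):
    pre = wave(rng.uniform(0, N - 1))
    # competition: the map unit whose synapses best match the input wins
    winner = int(np.argmin(((W - pre) ** 2).sum(axis=1)))
    # cooperation/Hebb: the winner and its map neighbours strengthen the
    # co-active synapses, while mismatched synapses are effectively eliminated
    hood = np.exp(-0.5 * ((np.arange(N) - winner) / 2.0) ** 2)
    W += 0.05 * hood[:, None] * (pre[None, :] - W)

prefs = W.argmax(axis=1)                 # preferred retinal position per map unit
print(prefs)  # typically monotone (or reversed): global order has emerged
```

The point of the toy run is only that correlated spontaneous activity plus reinforce-and-eliminate dynamics suffices for global order; no claim is made about the biological details.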
Possible function of retinotopics: Feature detection and motor control

Even if the retinotopic position is “meaningless” in the sense that it does not code position or represent anything, it is in no way functionless (cf. Wolff, 1999a). Retinotopics facilitates the interaction and communication of processes initiated by sensory neighbours, especially the formation of modular units. The analysis of spatially extended features (e.g., lines, edges, etc.) by feature detectors, which requires the interaction of adjacent units, would presumably be impossible without retinotopic organization.

Retinotopics might, furthermore, play an important part in motor control, especially in the control of eye movements. Since the retinal position of a target is perfectly correlated with the direction and amplitude of the corresponding saccade, retinotopic position is an excellent candidate for guiding and controlling the execution of a given saccade once a perceptual object has been selected as the target in the course of action planning. Consistent with this, Duhamel, Colby, and Goldberg (1992) found that the retinal receptive fields of neurons in the parietal cortex of monkeys shift to the target’s position well before the onset of the saccade. On this basis, one might speculate about a dissociation between eye movement control and conscious perception, and likewise about a dissociation between the control of pointing actions and conscious perception (as shown, e.g., by Bridgeman, Kirch, & Sperling, 1981; Bridgeman, Lewis, Heit, & Nagle, 1979; see also Bridgeman, 1999): While motor control might depend on the retinotopic position of firing cells, conscious perception might be based on their temporal activity patterns. Another question is, however, how both aspects might be coordinated in the cognitive control of eye movements.

CONCLUSION

The aim of this paper was to critically discuss the code position theory, according to which the retinotopic position of firing cells is the basis of perceptual position. This strategy of “coding position by position” turned out not to be suited to the visual system, which is rather designed for a strategy of “coding position by action”. The main problems of the code position theory arise because the theory is based on an inadequate stimulus description (i.e., the spatial structure of the firing cells within the retinotopic map). According to the present sensorimotor account, however, the sensory basis of the perceptual space is the temporal structure of the cell responses, not a spatial structure. With the example of two-dimensional saccadic eye movements, it was possible to demonstrate that the sensory temporal structure produced by saccades can actually constitute a perceptual space.
REFERENCES

Allport, A. (1987). Selection for action: Some behavioral and neurophysiological considerations of attention and action. In H.Heuer & A.F.Sanders (Eds.), Perspectives on perception and action (pp. 395–419). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Barlow, H.B., Blakemore, C., & Pettigrew, J.D. (1967). The neural mechanism of binocular depth discrimination. Journal of Physiology, 193, 327–342.
Berti, A., Smania, N., & Allport, A. (2001). Coding of far and near space in neglect patients. Neuroimage, 14, S98–S102.
Bremmer, F., Schlack, A., Duhamel, J.-R., Graf, W., & Fink, G.R. (2001). Space coding in primate posterior parietal cortex. Neuroimage, 14, S46–S51.
Bridgeman, B. (1986a). Multiple sources of outflow in processing spatial information. Acta Psychologica, 86, 35–48.
Bridgeman, B. (1986b). Relations between the physiology of attention and the physiology of consciousness. Psychological Research, 48, 259–266.
Bridgeman, B. (1999). Separate representations of visual space for perception and visually guided behavior. In G.Aschersleben, T.Bachmann, & J.Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 3–13). Amsterdam: Elsevier.
Bridgeman, B., Kirch, M., & Sperling, A. (1981). Segregation of cognitive and motor aspects of visual functioning using induced motion. Perception and Psychophysics, 29, 336–342.
Bridgeman, B., Lewis, S., Heit, G., & Nagle, M. (1979). Relation between cognitive and motor-oriented systems of visual position perception. Journal of Experimental Psychology: Human Perception and Performance, 5, 692–700.
Bridgeman, B., van der Heijden, A.H.C., & Velichkovsky, B.M. (1994). A theory of visual stability across saccadic eye movements. Behavioral and Brain Sciences, 17, 247–292.
Brindley, G.S., & Merton, P.A. (1960). The absence of position sense in the human eye. Journal of Physiology, 153, 127–130.
Burnham, C.A. (1968). Decrement of the Mueller-Lyer illusion with saccadic and tracking eye movements. Perception and Psychophysics, 3, 424–426.
Charman, W.N. (1991). Optics of the human eye. In W.N.Charman (Ed.), Vision and visual dysfunction: Vol. 1. Visual optics and instrumentation (pp. 1–26). Basingstoke, UK: Macmillan Press.
Coren, S., & Festinger, L. (1967). An alternative view of the “Gibson normalization effect”. Perception and Psychophysics, 2, 621–626.
Coren, S., & Girgus, J.S. (1978). Seeing is deceiving: The psychology of visual illusions. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Coren, S., & Hoenig, P. (1972). Eye movements and decrement in the Oppel-Kundt illusion. Perception and Psychophysics, 12, 224–225.
Creutzfeldt, O. (1979). Neurophysiological mechanisms and consciousness. In Brain and Mind [Ciba Foundation Symposium 69 (New Series)] (pp. 217–233). Amsterdam: Excerpta Medica.
Creutzfeldt, O., & Nothdurft, H.C. (1978). Representation of complex visual stimuli in the brain. Naturwissenschaften, 65, 307–318.
Day, R.H. (1962). The effects of repeated trials and prolonged fixation on error in the Mueller-Lyer figure. Psychological Monographs, 76(14).
Dennett, D.C., & Kinsbourne, M. (1992). Time and the observer: The where and when of consciousness in the brain. Behavioral and Brain Sciences, 15(2), 183–201.
Duhamel, J.-R., Colby, C.L., & Goldberg, M.E. (1992). The updating of the representation of visual space in parietal cortex by intended eye movements. Science, 255, 90–92.
Duncker, K. (1929). Über induzierte Bewegung (ein Beitrag zur Theorie optisch wahrgenommener Bewegung) [On induced motion (a contribution to the theory of optically perceived motion)]. Psychologische Forschung, 12, 180–259.
Engbert, R., & Kliegl, R. (2003). Microsaccades uncover the orientation of covert attention. Vision Research, 43, 1035–1045.
Erlhagen, W., & Jancke, D. (2004). The role of action plans and other cognitive factors in motion extrapolation: A modelling study. Visual Cognition, 11(2/3), 315–340.
Festinger, L., Burnham, C.A., Ono, H., & Bamber, D. (1967). Efference and the conscious experience of perception. Journal of Experimental Psychology Monograph, 74(4).
Festinger, L., White, C.W., & Allyn, M.R. (1968). Eye movements and decrement in the Mueller-Lyer illusion. Perception and Psychophysics, 3, 376–382.
Gibson, J.J. (1933). Adaptation, aftereffect, and contrast in the perception of curved lines. Journal of Experimental Psychology, 16, 1–31.
Gibson, J.J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Goodale, M.A., & Humphrey, G.K. (2001). Separate visual systems for action and perception. In E.B.Goldstein (Ed.), Blackwell handbook of perception (pp. 311–343). Oxford, UK: Blackwell.
Gray, C.M. (1999). The temporal correlation hypothesis of visual feature integration: Still alive and well. Neuron, 24, 31–47.
Hafed, Z.M., & Clark, J.J. (2002). Microsaccades as an overt measure of covert attention shifts. Vision Research, 42, 2533–2545.
Honda, H. (1991). The time courses of visual mislocalization and of extraretinal eye position signals at the time of vertical saccades. Vision Research, 31, 1915–1921.
Howard, I.P., & Templeton, W.B. (1966). Human spatial orientation. London: Wiley.
Hubel, D.H., & Wiesel, T.N. (1970). Cells sensitive to binocular depth in area 18 of the macaque monkey cortex. Nature, 225, 41–42.
Humphreys, G.W., & Heinke, D. (1998). Spatial representation and selection in the brain: Neuropsychological and computational constraints. Visual Cognition, 5, 9–41.
Humphreys, G.W., & Riddoch, M.J. (2001). The neuropsychology of visual object and space perception. In E.B.Goldstein (Ed.), Blackwell handbook of perception (pp. 204–236). Oxford, UK: Blackwell.
Humphreys, G.W., Riddoch, M.J., Forti, S., & Ackroyd, K. (2004). Action influences spatial perception: Neuropsychological evidence. Visual Cognition, 11(2/3), 401–427.
Jerison, H.J. (1973). Evolution of the brain and intelligence. New York: Academic Press.
Jerison, H.J. (1994). Evolutionäres Denken über Gehirn und Bewußtsein [Evolutionary thinking on brain and consciousness]. In V.Braitenberg & I.Hosp (Eds.), Evolution: Entwicklung und Organisation in der Natur (pp. 120–138). Reinbek bei Hamburg, Germany: Rowohlt.
Köhler, W. (1947). Gestalt psychology. New York: Liveright.
Leibowitz, H.W., Post, R.B., & Sheehy, J.B. (1986). Efference, perceived movement, and illusory displacement. Acta Psychologica, 63, 23–34.
Lewis, E.O. (1908). The effect of practice on the perception of the Mueller-Lyer illusion. British Journal of Psychology, 2, 294–306.
MacKay, D.M. (1973). Visual stability and voluntary eye movement. In R.Jung (Ed.), Handbook of sensory physiology (Vol. 7/3, pp. 307–331). Berlin, Germany: Springer.
MacKay, D.M. (1978). The dynamics of perception. In P.A.Buser & A.Rougel-Buser (Eds.), Cerebral correlates of conscious experience (pp. 53–68). Amsterdam: Elsevier.
MacKay, D.M. (1984). Evaluation: The missing link between cognition and action. In W.Prinz & A.F.Sanders (Eds.), Cognition and motor processes (pp. 175–184). Berlin, Germany: Springer.
Matelli, M., & Luppino, G. (2001). Parietofrontal circuits for action and space perception in the macaque monkey. Neuroimage, 14, 27–32.
Matin, L. (1976). Saccades and extraretinal signal for visual direction. In R.A.Monty & J.W.Senders (Eds.), Eye movements and psychological processes (pp. 205–219). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Matin, L. (1982). Visual localization and eye movements. In A.H.Wertheim, W.A.Wagenaar, & H.W.Leibowitz (Eds.), Tutorials on motion perception (pp. 101–156). New York: Plenum Press.
Menzel, R., & Roth, G. (1996). Verhaltensbiologische und neuronale Grundlagen des Lernens und des Gedächtnisses [Behavioural-biological and neural bases of learning and memory]. In G.Roth & W.Prinz (Eds.), Kopf-Arbeit (pp. 239–277). Heidelberg, Germany: Spektrum-Verlag.
Millikan, R.G. (1993). Content and vehicle. In N.Eilan, R.McCarthy, & B.Brewer (Eds.), Spatial representation (pp. 256–268). Oxford, UK: Blackwell.
Müsseler, J. (1999). How independent from action control is perception? An event-coding account for more equally-ranked crosstalks. In G.Aschersleben, T.Bachmann, & J.Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 121–147). Amsterdam: Elsevier.
Müsseler, J., & van der Heijden, A.H.C. (2004). Two spatial maps for perceived visual space: Evidence from relative mislocalizations. Visual Cognition, 11(2/3), 235–254.
Müsseler, J., van der Heijden, A.H.C., Mahmud, S.D., Deubel, H., & Ertsey, S. (1999). Relative mislocalisations of briefly presented stimuli in the retinal periphery. Perception and Psychophysics, 61, 1646–1661.
Neumann, O. (1987). Beyond capacity: A functional view of attention. In H.Heuer & A.F.Sanders (Eds.), Perspectives on perception and action (pp. 361–394). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Neumann, O. (1990). Visual attention and action. In O.Neumann & W.Prinz (Eds.), Relationships between perception and action (pp. 227–267). Berlin, Germany: Springer.
O’Regan, K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24(5), 883–917.
Pick, H.L., Jr., & Lockman, J.J. (1981). From frames of reference to spatial representation. In L.S.Liben, A.H.Patterson, & N.Newcombe (Eds.), Spatial representation and behaviour across the life span (pp. 39–61). New York: Academic Press.
Prinz, W. (1985). Ideomotorik und Isomorphie [Ideomotor processes and isomorphism]. In O.Neumann (Ed.), Perspektiven der Kognitionspsychologie (pp. 39–62). Berlin, Germany: Springer.
Robinson, D.A. (1981). Control of eye movements. In J.R.Pappenheimer (Ed.), Handbook of physiology: Section I. The nervous system, 2 (pp. 1275–1320). Bethesda, MD: American Physiological Society.
Roskies, A.L. (1999). The binding problem. Neuron, 24, 7–9.
Scheerer, E. (1985, February). The constitution of space perception: A phenomenological perspective. Paper presented at the Sensorimotor Interactions in Space Perception and Action symposium, Bielefeld, Germany.
Scheerer, E. (1994). Psychoneural isomorphism: Historical background and current relevance. Philosophical Psychology, 7, 183–210.
Shebilske, W.L. (1976). Extraretinal information in corrective saccades and inflow vs. outflow theories of visual direction constancy. Vision Research, 16, 621–628.
Shebilske, W.L. (1977). Visuomotor coordination in visual direction and position constancies. In W.Epstein (Ed.), Stability and constancy in visual perception: Mechanisms and processes (pp. 23–69). New York: Wiley.
Singer, W., & Gray, C.M. (1995). Visual feature integration and the temporal correlation hypothesis. Annual Review of Neuroscience, 18, 555–586.
Skavenski, A.A., Haddad, G., & Steinman, R.M. (1972). The extraretinal signal for the visual perception of direction. Perception and Psychophysics, 11, 287–290.
Slotnik, R.S. (1969). Adaptation to curvature distortion. Journal of Experimental Psychology, 81, 441–448.
Smeets, J.B.J., & Brenner, E. (1994). Stability relative to what? Behavioral and Brain Sciences, 17, 277–278.
Sperry, R.W. (1950). Neural basis of the spontaneous optokinetic response produced by visual inversion. Journal of Comparative and Physiological Psychology, 43, 482–489.
Steinbach, M.J. (1987). Proprioceptive knowledge of eye position. Vision Research, 27, 1737–1744.
Stork, S., & Müsseler, J. (2004). Perceived localizations and eye movements with action-generated and computer-generated vanishing points of moving stimuli. Visual Cognition, 11(2/3), 299–314.
Taraborelli, D. (2002). Feature binding and object perception: Does object awareness require feature conjunction? Retrieved May 10, 2002, from http://jeannicod.ccsd.cnrs.fr/documents/disk0/00/00/03/1l/index_fr.html
Taylor, J.G. (1975). The behavioural basis of perception. Westport, CT: Greenwood Press. (Original work published 1962)
Van der Heijden, A.H.C. (1992). Selective attention in vision. London: Routledge.
Van der Heijden, A.H.C. (1995). Modularity and action. Visual Cognition, 2, 269–302.
Van der Heijden, A.H.C., Müsseler, J., & Bridgeman, B. (1999a). On the perception of position. In G.Aschersleben, T.Bachmann, & J.Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 19–37). Amsterdam: Elsevier.
Van der Heijden, A.H.C., Van der Geest, J.N., de Leeuw, F., Krikke, K., & Müsseler, J. (1999b). Sources of position perception error for small isolated targets. Psychological Research, 62, 20–35.
Von der Malsburg, C. (1994). The correlation theory of brain function. In E.Domany, J.L.van Hemmen, & K.Schulten (Eds.), Models of neural networks: II. Temporal aspects of coding and information processing in biological systems (pp. 95–119). Berlin, Germany: Springer. (Original work published 1981)
Von der Malsburg, C. (1999). The what and why of binding: The modeler’s perspective. Neuron, 24, 95–104.
Von Helmholtz, H. (1866). Handbuch der Physiologischen Optik [Handbook of physiological optics]. Leipzig, Germany: Voss.
Von Holst, E., & Mittelstaedt, H. (1980). The reafference principle (interaction between the central nervous system and the periphery). In C.R.Gallistel (Ed.), The organization of action: A new synthesis (pp. 176–209). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. (Original work published 1950, in German)
Welch, R.B. (1978). Perceptual modification: Adapting to altered sensory environments. New York: Academic Press.
Wolff, P. (1984). Saccadic eye movements and visual stability: Preliminary considerations towards a cognitive approach. In W.Prinz & A.F.Sanders (Eds.), Cognition and motor processes (pp. 121–137). Berlin, Germany: Springer.
Wolff, P. (1985). Wahrnehmungslernen durch Blickbewegungen [Learning to perceive through eye movements]. In O.Neumann (Ed.), Perspektiven der Kognitionspsychologie (pp. 63–111). Berlin, Germany: Springer.
Wolff, P. (1986). Saccadic exploration and perceptual motor learning. Acta Psychologica, 63, 263–280.
Wolff, P. (1987). Perceptual learning by saccades. In H.Heuer & A.F.Sanders (Eds.), Perspectives on perception and action (pp. 249–271). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Wolff, P. (1999a). Function and processing of “meaningless” and “meaningful” position: Commentary on Van der Heijden et al. In G.Aschersleben, T.Bachmann, & J.Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 39–42). Amsterdam: Elsevier.
Wolff, P. (1999b). Space perception and intended action. In G.Aschersleben, T.Bachmann, & J.Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 43–63). Amsterdam: Elsevier.
Zeki, S.A. (1993). A vision of the brain. Oxford, UK: Blackwell.
VISUAL COGNITION, 2004, 11 (2/3), 161–172
Multisensory self-motion encoding in parietal cortex

Frank Bremmer
Neurophysik, Philipps-Universität Marburg, Marburg, Germany

Anja Schlack
Vision Center Laboratory, The Salk Institute, La Jolla, USA

Werner Graf
LPPA, CNRS—Collège de France, Paris, France

Jean-René Duhamel
Institute of Scientific Cognition, CNRS, Bron, France

Navigation through the environment requires the brain to process a number of incoming sensory signals, such as visual optic flow on the retina and motion information originating from the vestibular organs. In addition, tactile as well as auditory signals can help to disambiguate the continuous stream of incoming information and to determine the signals resulting from one’s own motion. In this review I will focus on the cortical processing of motion information in one subregion of the posterior parietal cortex, the ventral intraparietal area (VIP). I will review (1) electrophysiological data from single-cell recordings in the awake macaque showing how self-motion signals across different sensory modalities are represented within this area, and (2) data from fMRI recordings in normal human subjects providing evidence for the existence of a functional equivalent of macaque area VIP in the human cortex.
Please address correspondence to: Frank Bremmer, AG Neurophysik, Philipps-Universität Marburg, D-35032 Marburg, Germany. Email: [email protected]
This work was supported by the HCM program of the European Union (CHRXCT 930267), the Human Frontier Science Program (RG 71/96B), and the DFG (SFB 509/B7).
© 2004 Psychology Press Ltd
http://www.tandf.co.uk/journals/pp/13506285.html
DOI: 10.1080/13506280344000275
MOTION-SENSITIVE AREAS IN THE MACAQUE VISUAL CORTICAL SYSTEM

Self-motion through the environment generates a variety of sensory input signals. In the macaque, more than half of the cortical tissue is dedicated to the processing of visual signals. This indicates the importance of the incoming visual information for the processing of self-motion information and implies its dominance over the other sensory signals. In the parietal cortex the different sensory signals converge. In recent experiments we were able to show that individual cells within a functional subdivision of the posterior parietal cortex (PPC), the ventral intraparietal area (VIP), process signals originating from different sensory modalities. By summarizing these studies, we will describe visual motion processing in area VIP and, thereafter, show how self-motion information originating from other sensory modalities (vestibular, tactile, and auditory) is processed in this area. First, however, we will briefly illustrate how the preceding stages process the relevant visual motion signals.
VISUAL MOTION PROCESSING IN THE M-PATHWAY

In primates, visual information processing is segregated into parallel channels already within the retina, and the preprocessed signals are transmitted via the thalamus towards area V1 of the visual cortex. Signals related to the motion of a stimulus are predominantly processed and forwarded within the fast “M-pathway”. Information is sent directly from area V1, or via a further processing stage (area V2), to the middle temporal area (MT). Area MT (or V5) is located in the posterior bank of the superior temporal sulcus (STS). It is retinotopically organized, i.e., neighbouring cells within area MT represent neighbouring parts of the visual field (Shipp & Zeki, 1989; Ungerleider & Desimone, 1986a, 1986b). Many cells in area MT are tuned to the direction and speed of a moving visual stimulus (Albright, 1984; Mikami, Newsome, & Wurtz, 1986a, 1986b). Furthermore, a considerable proportion of cells in area MT increase their discharge in relation to smooth eye movements (for a review see, e.g., Ilg, 1997). The visual field representation in area MT is mostly contralateral. Although the visual receptive fields of MT cells are larger than those in striate cortex, they are still small compared to the large-field motion across the whole visual field that typically occurs during self-motion. Area MT can thus only be considered a relay station for the visual motion processing necessary for the encoding of self-motion.

Two major output structures of area MT are the medial superior temporal area (MST) in the anterior bank of the STS and the ventral intraparietal area (VIP) in the depth of the intraparietal sulcus (IPS). It has been known for many years now that MST neurons respond selectively to optic flow stimuli mimicking self-motion in
3-D space (Duffy & Wurtz, 1991a, 1991b, 1995; Graziano, Andersen, & Snowden, 1994; Lappe, Bremmer, Pekel, Thiele, & Hoffmann, 1996; Saito, Yukie, Tanaka, Hikosaka, Fukada, & Iwai, 1986; Tanaka, Hikosaka, Saito, Yukie, Fukada, & Iwai, 1986). Over the years, these and other studies have established the view that area MST is involved in heading perception. Further evidence for this functional role comes from studies comparing the responses of single MST neurons to real and to visually simulated self-motion (Bremmer, Kubischik, Pekel, Lappe, & Hoffmann, 1999; Duffy, 1998; Froehler & Duffy, 2002). In these studies, vestibular responses were observed during linear movement in light (i.e., combined visual and vestibular stimulation) as well as in darkness (i.e., pure vestibular stimulation). Usually, vestibular responses were smaller than visual responses. Only a weak correlation, if any, was found between preferred visual and vestibular directions. As an example, a cell preferring visually simulated forward motion might prefer pure vestibular stimulation directed backwards or in any other direction. The expected response scheme of identical preferred directions in the visual and vestibular domain, i.e., a synergistic signal convergence, was observed only for a small proportion of cells.

HEADING ENCODING IN AREA VIP

As mentioned above, area MST is not the only major output structure of area MT. Based on anatomical data (Maunsell & Van Essen, 1983; Ungerleider & Desimone, 1986a), the ventral intraparietal area (VIP) was originally defined as the MT projection zone in the intraparietal sulcus (IPS). Studies of the functional properties of VIP cells showed sensitivity to the direction and speed of moving visual stimuli (Colby, Duhamel, & Goldberg, 1993; Duhamel, Colby, & Goldberg, 1991). Follow-up studies suggested an involvement of area VIP in the processing of self-motion information (Bremmer, Duhamel, Ben Hamed, & Graf, 1995, 1997; Schaafsma & Duysens, 1996; Schaafsma, Duysens, & Gielen, 1997). In the experiments reviewed here, we went one step further and tested neurons in area VIP for their capability to encode the direction of self-motion (Bremmer, Duhamel, Ben Hamed, & Graf, 2002a). In our studies, we presented optic flow stimuli simulating straight-ahead (expansion) or backward (contraction) motion, i.e., with the singularity of the optic flow (SOF) at the screen centre. During the experiment, the head-fixed animal faced a translucent screen subtending the central 80° by 70° of the visual field. Computer-generated visual stimuli as well as a fixation target were back-projected by a liquid crystal display system. During visual stimulation, the monkey had to keep its eyes within the tolerance window at the straight-ahead position ([x, y] = [0°, 0°]) for 4500 ms in order to receive a liquid reward. Visual stimuli were random dot patterns consisting of 240 dots, each individual dot 0.5° in size. Expansion and contraction stimuli were presented interleaved in pseudorandomized order.
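For readers who want to picture the stimuli, here is a hedged sketch of how such radial flow fields can be generated (the dot number and screen size are taken from the text; the dot speed, frame loop, and recycling rules are assumptions of this sketch, not reported parameters):

```python
import numpy as np

rng = np.random.default_rng(42)
N_DOTS = 240                      # number of dots, as stated in the text
SCREEN = (80.0, 70.0)             # screen extent in degrees (width, height)

# Dots in screen coordinates; the singularity of the flow (SOF) is the origin
dots = rng.uniform(-0.5, 0.5, (N_DOTS, 2)) * SCREEN

def flow_step(dots, speed=0.02, mode="expansion"):
    """One frame of radial flow: every dot moves along the ray through the SOF.
    The recycling rules keep dot density roughly constant; they are choices of
    this sketch, not parameters reported in the original experiments."""
    sign = 1.0 if mode == "expansion" else -1.0
    dots = dots * (1.0 + sign * speed)            # radial expansion/contraction
    half = np.array(SCREEN) / 2.0
    if mode == "expansion":                       # respawn dots leaving the screen
        off = np.any(np.abs(dots) > half, axis=1)
        dots[off] = rng.uniform(-0.1, 0.1, (off.sum(), 2)) * SCREEN
    else:                                         # respawn dots collapsing on the SOF
        near = np.hypot(dots[:, 0], dots[:, 1]) < 0.5
        dots[near] = rng.uniform(-0.5, 0.5, (near.sum(), 2)) * SCREEN
    return dots

for _ in range(100):                              # simulated forward self-motion
    dots = flow_step(dots, mode="expansion")
print(dots.shape)                                 # (240, 2): dot field after 100 frames
```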
Figure 1. Optic flow responses in area VIP. The two histograms show the responses of a single VIP neuron to an expansion stimulus (left) and a contraction stimulus (right). Raster displays (spike trains) indicate the response on a single-trial basis. The tick marks in the spike trains indicate stimulus onset (1st tick mark), motion onset (2nd tick mark), motion offset (3rd tick mark), and stimulus offset (4th tick mark). This cell responded to the expansion stimulus but was inhibited by the contraction stimulus.
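As an aside on how displays like Figure 1 are assembled (a generic sketch with invented event times and firing rates, not the study’s analysis code): spike times from repeated presentations are aligned to stimulus onset, stacked as a raster, and pooled into a peristimulus time histogram.

```python
import numpy as np

rng = np.random.default_rng(3)
BIN_MS = 50                    # histogram bin width (my choice)

def fake_trial():
    """Spike times (ms) for one trial: low baseline before an assumed motion
    onset at 500 ms, then a phasic burst decaying to a tonic level, mimicking
    the response profile described in the text."""
    t = np.arange(0, 4500)
    rate = np.where(t < 500, 5.0, 40.0 * np.exp(-(t - 500) / 400.0) + 15.0)  # Hz
    return t[rng.random(t.size) < rate / 1000.0]   # Bernoulli spike per ms

trials = [fake_trial() for _ in range(20)]         # raster: one row per trial

# Peristimulus time histogram: pool spikes across trials, convert to spikes/s
edges = np.arange(0, 4500 + BIN_MS, BIN_MS)
counts, _ = np.histogram(np.concatenate(trials), bins=edges)
psth = counts / (len(trials) * BIN_MS / 1000.0)
print(f"peak rate ~ {psth.max():.0f} spikes/s, in bin {int(psth.argmax())}")
```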
About two thirds of the neurons in area VIP responded selectively to optic flow stimuli simulating forward or backward motion. The activity often comprised a strong phasic response to the onset of the simulated movement, which then decreased to a weaker tonic discharge level. One such example is shown in Figure 1. The panel on the left shows the responses of a cell to simulated forward motion (expansion), while the right panel shows the cell’s response to simulated backward motion (contraction). The cell revealed a clear preference for forward motion. At the population level, the majority of cells (72%) preferred expansion over contraction stimuli. In addition, the average response of the population of neurons to an expansion stimulus was significantly stronger than the response to a contraction stimulus (Wilcoxon Signed Rank Test, p