THE PSYCHOLOGY OF LEARNING AND MOTIVATION Advances in Research and Theory
VOLUME 33
THE PSYCHOLOGY OF LEARNING AND MOTIVATION
Advances in Research and Theory

EDITED BY DOUGLAS L. MEDIN
DEPARTMENT OF PSYCHOLOGY
NORTHWESTERN UNIVERSITY, EVANSTON, ILLINOIS

Volume 33

ACADEMIC PRESS
San Diego New York Boston London Sydney Tokyo Toronto
This book is printed on acid-free paper.

Copyright © 1995 by ACADEMIC PRESS, INC.
All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Academic Press, Inc.
A Division of Harcourt Brace & Company
525 B Street, Suite 1900, San Diego, California 92101-4495

United Kingdom Edition published by Academic Press Limited
24-28 Oval Road, London NW1 7DX

International Standard Serial Number: 0079-7421
International Standard Book Number: 0-12-543333-6

PRINTED IN THE UNITED STATES OF AMERICA
95 96 97 98 99  9 8 7 6 5 4 3 2 1
CONTENTS

Contributors ............................................................. ix

LANDMARK-BASED SPATIAL MEMORY IN THE PIGEON
Ken Cheng
   I. Wayfinding as Servomechanisms ..................................... 2
  II. The Vector Sum Model .............................................. 6
 III. Are Vectors or Components of Vectors Averaged? .................... 8
  IV. Of Honeybees and Pigeons .......................................... 13
   V. Averaging Independent Sources of Information ...................... 16
      References ......................................................... 20

THE ACQUISITION AND STRUCTURE OF EMOTIONAL RESPONSE CATEGORIES
Paula M. Niedenthal and Jamin B. Halberstadt
   I. Introduction ....................................................... 23
  II. Underlying Emotion Theory ......................................... 27
 III. Emotion and Memory ................................................ 28
  IV. Acquisition of Emotional Response Categories ...................... 37
   V. Influence of Emotional Response Categories in Perception .......... 44
  VI. Representing Emotional Response Categories: Beyond Semantic
      Networks ........................................................... 55
 VII. Conclusion ......................................................... 57
      References ......................................................... 57

EARLY SYMBOL UNDERSTANDING AND USE
Judy S. DeLoache
   I. Introduction ....................................................... 65
  II. The Scale Model Task .............................................. 70
 III. A Model of Symbol Understanding and Use ........................... 77
  IV. Conclusion ......................................................... 109
      References ......................................................... 112

MECHANISMS OF TRANSITION: LEARNING WITH A HELPING HAND
Susan Goldin-Meadow and Martha Wagner Alibali
   I. Gesture-Speech Mismatch as an Index of Transition ................. 117
  II. The Sources of Gesture-Speech Mismatch ............................ 130
 III. The Role of Gesture-Speech Mismatch in the Learning Process ....... 143
  IV. The Representation of Information in Gesture and Speech ........... 149
      References ......................................................... 155

THE UNIVERSAL WORD IDENTIFICATION REFLEX
Charles A. Perfetti and Sulan Zhang
   I. The Centrality of Phonology in Alphabetic Systems ................. 162
  II. A Universal Phonological Principle ................................ 163
 III. Nonalphabetic Writing Systems ..................................... 164
  IV. Chinese Reading: No Semantics without Phonology? .................. 168
   V. Phonology in Chinese Word Identification and Lexical Access ....... 170
  VI. The Time Course of Phonological and Semantic Activation ........... 174
 VII. Conclusion: Why Phonology May Be Privileged over Meaning .......... 184
      References ......................................................... 187

PROSPECTIVE MEMORY: PROGRESS AND PROCESSES
Mark A. McDaniel
   I. Introduction ....................................................... 191
  II. Conceptually Driven or Data-Driven Processing? .................... 194
 III. Recognition and Recall Processes in Prospective Memory ............ 202
  IV. Components of Recognition: Possible Mechanisms in Prospective
      Remembering ........................................................ 206
   V. Automatic Unconscious Activation of Target-Action Associations .... 211
  VI. Conclusions ........................................................ 217
      References ......................................................... 219

LOOKING FOR TRANSFER AND INTERFERENCE
Nancy Pennington and Bob Rehder
   I. Introduction ....................................................... 223
  II. A Component View of Transfer ...................................... 231
 III. A Knowledge Representation View of Transfer ....................... 252
  IV. A Competence View of Transfer ..................................... 267
   V. Summary ............................................................ 280
      References ......................................................... 282

Index ..................................................................... 291
Contents of Recent Volumes ................................................ 293
CONTRIBUTORS

Numbers in parentheses indicate the pages on which the authors' contributions begin.

Martha Wagner Alibali (117), Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
Ken Cheng (1), School of Behavioural Science, Macquarie University, Sydney, New South Wales 2109, Australia
Judy S. DeLoache (65), Department of Psychology, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801
Susan Goldin-Meadow (117), Department of Psychology, The University of Chicago, Chicago, Illinois 60637
Jamin B. Halberstadt (23), Program in Cognitive Science, Indiana University, Bloomington, Indiana 47405
Mark A. McDaniel (191), Department of Psychology, University of New Mexico, Albuquerque, New Mexico 87131
Paula M. Niedenthal (23), Department of Psychology, Indiana University, Bloomington, Indiana 47405
Nancy Pennington (223), Department of Psychology, University of Colorado, Boulder, Colorado 80309
Charles A. Perfetti (161), Department of Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
Bob Rehder (223), Department of Psychology, University of Colorado, Boulder, Colorado 80309
Sulan Zhang (161), Department of Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
LANDMARK-BASED SPATIAL MEMORY IN THE PIGEON Ken Cheng
Finding the way back to a desired place is an oft-encountered problem in animal life. For many animals, survival and reproduction depend on it. The Clark's nutcracker, for example, is a bird that stores thousands of caches of pine seeds. To survive a winter, it must retrieve an estimated 2200-3000 such caches (Vander Wall & Balda, 1981). For another example, the digger wasp is so called because the females dig nests in the ground in which they lay eggs. The wasp is then faced with the problem of finding the way back to her nests to provision the future larvae with a source of food. In general, many creatures need to get back to locations of food and home. It is little wonder, then, that a number of place-finding mechanisms are found across the animal kingdom.

In this chapter, I will describe wayfinding mechanisms, and in particular the use of landmark-based spatial memory, as servomechanisms. I will then describe one model for landmark-based spatial memory in pigeons, the vector sum model, and revise it on the basis of further data. The resulting servomechanism (1) uses the metric properties of space (that is, distances and directions) in guiding search, and (2) computes directions and distances separately and independently. This system is compared to the honeybee's landmark-based spatial memory, which shares these two properties. These systems break the problem down into separate modules and average the dictates of different sources of information. Comments on this form of modular neurocognitive architecture will complete the chapter.
I. Wayfinding as Servomechanisms

A servomechanism, which Gallistel (1980) called a basic unit of action, attempts to maintain a particular level on one or more variables, behaviorally or physiologically (or both). I call the level to be maintained a standard. The standard can be of an internal physiological variable, such as blood pH level or temperature, or some aspect of the world, such as keeping a moving object on the fovea. The latter is typically maintained by behavior. Figure 1 illustrates the servomechanism. At its core, a comparator compares a reading of the environment, called a record, with the standard. The environment must be broadly defined to include the internal milieu of the body as well. The discrepancy, or error, drives the mechanism to initiate action to reduce the error, thus completing a negative feedback loop.

Servomechanisms of orientation are generally called taxes (plural of taxis) in the ethological literature (Fraenkel & Gunn, 1940). Most taxes move the animal not to any particular place, but to a place with more "desired" characteristics. But in two cases explained below, the standard is a particular place, and the mechanism compares a specification of the current place with the standard and generates action to reduce this discrepancy.

Tropotaxis (Fig. 2) serves as an example of taxic orientation (Gallistel, 1980). It moves the animal, for example, a moth, toward a source of light.
Fig. 1. The servomechanism. In a servomechanism, a comparator compares a perceptual signal (R, record) to a standard (S). The difference (error) drives behavior to reduce the error.
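The comparator loop of Fig. 1 can be sketched in a few lines of code (a toy illustration only; the function names, the gain parameter, and the temperature example are my own, not the chapter's):

```python
# A minimal sketch of the servomechanism of Fig. 1: a comparator measures the
# error between the record (current reading) and the standard, and action is
# generated to reduce that error (negative feedback).

def servo_step(standard, record, gain=0.5):
    """One pass through the negative feedback loop."""
    error = standard - record      # comparator: discrepancy between S and R
    action = gain * error          # action sized to reduce the error
    return record + action         # the record moves toward the standard

# Example: an internal variable (say, body temperature) regulated toward 37.0.
reading = 30.0
for _ in range(20):
    reading = servo_step(37.0, reading)
print(round(reading, 3))  # has converged to the standard, 37.0
```

Each pass shrinks the error, so the reading settles at the standard, which is exactly the behavior of the negative feedback loop in the figure.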
Fig. 2. The tropotaxis, an example of a servomechanism where the comparator compares light intensity levels on the two eyes. The mechanism turns the organism toward the side receiving more light.
The comparator compares the light intensities on the two eyes of the moth, which are placed toward the sides of the head. If the source of light is to the right of the moth, the right eye receives more light than the left. The action generated turns the moth to the right, thus reducing the discrepancy between the direction of the light and the direction of the moth's flight. If the left eye receives more light, the moth turns to the left, which also veers the moth toward the light.

A. PATH INTEGRATION
Path integration, also known as inertial navigation or dead reckoning (Gallistel, 1990, ch. 4), is a wayfinding servomechanism that returns the animal to the starting place of the journey, typically its home (Fig. 3). The creature does so by keeping track of the vector it has covered since the beginning of the journey, that is, the straight line distance and direction from the starting point. This vector then allows it to deduce the approximate distance and direction home. Somehow then, it must be continuously adding, in the fashion of vector addition, the path it covers as it meanders, thus giving rise to the term path integration, literally the summing of small steps along the journey. In this servomechanism, the standard is the 0 vector, and the record is the vector deduced en route. To put the mechanism
[Figure 3 diagram: a comparator receives directional input from the sun, polarized light, movement, visual flow, etc., and outputs an error signal; the resulting action is to move in direction −V for distance |V|.]
Fig. 3. Path integration as an example of a servomechanism. The system keeps track of the vector (straight line distance and direction) from the starting place to where it ends up (V). The specification of the starting position (S) is the 0 vector. Comparison of S with the current position (R) gives the vector V. The organism thus must move according to the vector - V to reach home (starting position).
into operation, the animal reverses the deduced vector to derive the direction and distance home. Path integration is found in insects (Müller & Wehner, 1988; Wehner & Srinivasan, 1981), birds (von St. Paul, 1982), and rodents (Etienne, 1987; Mittelstaedt & Mittelstaedt, 1980; Potegal, 1982; Séguinot, Maurer, & Etienne, 1993).
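The running vector addition behind path integration can be sketched as follows (an illustrative toy; the function names and the sample journey are made up, not taken from the chapter):

```python
# Path integration: the record R is the running sum of the displacement
# vectors covered since leaving home; the standard S is the 0 vector, and the
# homing vector is -R.

def path_integrate(steps):
    """Sum the (dx, dy) displacements of the journey so far."""
    x = y = 0.0
    for dx, dy in steps:
        x += dx
        y += dy
    return (x, y)          # record R: straight-line vector from the start

def homing_vector(steps):
    """Reverse the deduced vector to point back to the start."""
    x, y = path_integrate(steps)
    return (-x, -y)

# A meandering outward journey of four legs...
journey = [(1.0, 0.0), (0.5, 2.0), (-0.5, 1.0), (2.0, -0.5)]
print(homing_vector(journey))  # (-3.0, -2.5): straight back home
```

However winding the outward path, only its vector sum needs to be carried, which is what makes the mechanism cheap enough for insects.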
B. LANDMARK-BASED SPATIAL MEMORY

Another common wayfinding servomechanism, and the main topic of this chapter, is landmark-based spatial memory (Gallistel, 1990, ch. 5). The ethological literature calls this mnemotaxis (mnemo for memory), and modern practice also labels it piloting (Fig. 4). Some aspects of the spatial relationships between the goal and its surrounding landmarks are used to guide the way back. The standard here is some specification of the goal in terms of its relationship to landmarks, and the record is some specification of the current place. The mechanism works to reduce this discrepancy; I spell out more specific details below.

Landmarks at different scales are undoubtedly used in zeroing in on a goal. The broad features of the terrain guide the subject to the region of the goal, and then nearer landmarks often must be used to pinpoint the
Fig. 4. Landmark-based spatial memory as an example of a servomechanism. The organism compares a specification of its current location (R), in terms of its relationship with landmarks, to a standard specification (S) representing the location of the goal with respect to landmarks. It reduces the discrepancy between the two (D(R,S)) in finding its way to the goal.
goal itself, which is often hidden and has no obvious characteristics marking it out. The food caches of Clark's nutcrackers and the nests of digger wasps are such beaconless examples. I will restrict discussion to the final stage of using landmarks to pinpoint a goal.

One experimental strategy, the most convincing to me, for demonstrating that a set of landmarks is used in pinpointing a location is to displace the landmarks systematically and observe that the subject systematically follows the displacement. Tinbergen (1972) provided one early classic example of this strategy. He placed pine cones around the nests of digger wasps, and while the wasps were away foraging, he displaced the cones. The wasps' searches were displaced just as the cones were, indicating that the insects were relying on the provided landmarks in their search for their nests. Since then, such a strategy has been used to show that landmarks are used by diverse creatures, including rodents (Cheng, 1986; Collett, Cartwright, & Smith, 1986; Etienne, Teroni, Hurni, & Portenier, 1990; Suzuki, Augerinos, & Black, 1980), birds (Cheng, 1988, 1989, 1990; Cheng & Sherry, 1992; Spetch & Edwards, 1988; Vander Wall, 1982), cephalopods (Mather, 1991), and insects (Cartwright & Collett, 1982, 1983; Dyer & Gould, 1983; von Frisch, 1953; Wehner & Räber, 1979).
II. The Vector Sum Model
The first model I developed to account for landmark-based spatial memory in pigeons was inspired by the work on gerbils by Collett et al. (1986). They used cylindrical tubes as landmarks and suggested that the animal treats each tube as a landmark element, recording the distance and compass direction from the landmark to the goal. I call these landmark-goal vectors (Fig. 5), and they provide the basis for pointing the subject toward the goal. How the natural world, or even the laboratory world, is divided into landmark elements is unclear, but the model can nevertheless be tested. Artificial objects such as cylinders intuitively form landmark units, but intuitions provide little guidance for dividing up surfaces like walls, large objects like trees or tables, or clumps of objects like bushes. No theory takes the place of intuitions either. But the vector sum model makes a testable prediction no matter how landmarks are divided into elements, a prediction that was its glory and its downfall.

The creature derives from every landmark-goal vector a vector directing itself to the goal. It does this by a bit of vector addition. For each landmark-goal vector, it adds the vector from itself to the landmark in
Fig. 5. An illustration of the vector sum model. (A) The pigeon is trained to find hidden food at the goal, which stands in a constant location with respect to the landmark (LM) and the arena. Occasionally, it is tested with the goal and food absent. (B) When the landmark is shifted (to the right in this case) on a test, the navigation (self-goal) vectors associated with the shifted (s) landmark point to a location shifted in the direction and to the extent of the landmark shift. The rest of the navigation vectors, associated with unshifted (u) landmarks, point to the original goal location. The vector sum model predicts that the pigeon will search somewhere on the straight line connecting these two theoretical points (the dotted line).
question. Self-to-landmark vector plus landmark-to-goal vector yields the self-to-goal vector. The bird then averages the dictates of all the self-to-goal vectors. This averaging process gives rise to an interesting prediction.

A. LANDMARK-BASED SEARCH TASK

To explain the prediction, I must first describe the task, using the example shown in Fig. 5. The pigeon searches for food hidden under wood chips in some search space. The space might be an arena, as in Fig. 5, or a tray, or the entire floor of a lab room. Conditions in the search space remain constant from trial to trial, and some landmarks near the target usually help the search. In the example, a block of wood is to the right of the goal. The goal, a hidden food cup or hole in the ground, stands in a constant spatial relationship to the landmarks from trial to trial. After the bird is trained, it is occasionally tested with the food and goal absent. On a test, landmarks are sometimes manipulated. In Fig. 5B, for example, the landmark has been shifted to the right. Search behavior on tests makes up the data. Most often, a single-peaked search distribution is found, and a place of peak searching is formally calculated from the distribution (e.g., Cheng, 1989, 1994).

The vector sum model makes an interesting prediction on tests in which a landmark has been translated from its usual location. When a landmark is moved, the landmark-goal vectors associated with it point to a goal that is moved in the same direction by the same extent (open dot in Fig. 5B). The rest of the unmanipulated landmarks of course point to the original goal location (filled dot in Fig. 5B). When the dictates of these vectors are averaged, the averaged position must lie on the line connecting the two theoretical goal locations (the dotted line in Fig. 5B). This much follows from Euclidean vector geometry.
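The averaging step, and the prediction it generates, can be made concrete with a toy computation (the coordinates and function names are hypothetical, not data from the experiments):

```python
# Vector sum model, sketched: each landmark stores a landmark-to-goal vector;
# the self-to-goal estimate is the average over landmarks of
# (self-to-landmark vector) + (landmark-to-goal vector).

def predicted_search(self_pos, landmarks):
    """landmarks: list of (landmark_position, landmark_to_goal_vector)."""
    n = len(landmarks)
    sx = sy = 0.0
    for (lx, ly), (gx, gy) in landmarks:
        sx += (lx - self_pos[0]) + gx   # self-to-landmark + landmark-to-goal
        sy += (ly - self_pos[1]) + gy
    return (self_pos[0] + sx / n, self_pos[1] + sy / n)

# Training arrangement: goal at (0, 0), two landmarks pointing at it.
lm_a = ((4.0, 0.0), (-4.0, 0.0))
lm_b = ((0.0, 3.0), (0.0, -3.0))
print(predicted_search((1.0, 1.0), [lm_a, lm_b]))          # (0.0, 0.0)

# Shift landmark A right by 2; its vector now dictates a goal at (2, 0), so
# the average lands halfway along the line joining (0, 0) and (2, 0).
lm_a_shifted = ((6.0, 0.0), (-4.0, 0.0))
print(predicted_search((1.0, 1.0), [lm_a_shifted, lm_b]))  # (1.0, 0.0)
```

Whatever the weights on the individual vectors, the averaged point stays on the line joining the two dictated goal locations, which is the model's testable prediction.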
What this means is that when a landmark has been shifted on a test, the place of peak searching might shift in the direction of the landmark shift, but not in the orthogonal direction. This prediction was first tested in rectangular arenas in which the goal was near one of the edges (Cheng, 1988, 1989). Shifts of landmarks parallel or perpendicular to the edge produced results supporting the prediction (Fig. 6A,B). When the landmark shifted parallel to the edge, the birds shifted their search parallel but not perpendicular to the edge. When the landmark shifted perpendicular to the edge, the birds shifted their search perpendicular but not parallel to the edge. But shifts of landmarks in a diagonal direction produced results contradicting the prediction (Cheng, 1990; Cheng & Sherry, 1992; Spetch, Cheng, & Mondloch, 1992). According to the model, the birds should shift in the diagonal direction, or equal amounts parallel and perpendicular to the edge. The birds, however, shifted
Fig. 6. Summary of results found with landmark shift experiments. The goal is near an edge, with a landmark (LM) near it. When the LM is shifted parallel to the edge (A), the birds shifted their searching parallel to the edge. When the LM is shifted perpendicular to the edge (B), the birds shifted their searching perpendicular to the edge. Both these patterns of results support the vector sum model. But when the LM is shifted diagonal to the edge (C), the birds shifted their searching more in the parallel direction than in the perpendicular direction, violating the predictions of the vector sum model.
far more in the parallel direction than in the perpendicular direction (Fig. 6C). We (Cheng & Sherry, 1992) suggested that the perpendicular distance of the goal from the edge also enters into the computation. A perpendicular distance, like a vector, has a distance and a direction, but it does not have a defined starting point; anywhere from the edge might be the starting point. The vectors we speak of have a defined starting point. This perpendicular distance enters into the averaging process and acts to hold the creature near the edge in the perpendicular dimension. Hence, searching shifts in the parallel direction but not much in the perpendicular direction. What is averaged with the perpendicular distance? It might be the landmark-goal vector in its entirety, or it might be only the perpendicular component of the landmark-goal vector. In the former case, the spirit of the vector sum model remains: vectors are averaged, but an additional component is thrown in. In the latter case, vector averaging does not take place; only elements of vectors are averaged. The spirit of the model is changed.
III. Are Vectors or Components of Vectors Averaged?

I found no way to differentiate between the two cases with variations of the paradigm, and hence used a new method to test whether entire vectors are averaged or components of vectors. The new method relied on
directional conflict, and its experimental logic is illustrated in Fig. 7. On crucial tests, two sources of information pointed in different directions from a point that can be considered the center of the search space. In all the experiments, the two landmark-goal vectors had the same distance but pointed 90° apart. If conditions are set up right, the pigeon will average the dictates of the two sources and search somewhere in between the dictates of the two vectors. It has two ways of averaging: entire vectors might be averaged, which I have called the vector-averaging model, or distances and directions, the components of vectors, might be separately averaged, a scheme I call the direction-averaging model.

The two models make different predictions about the radial distance of search, that is, the distance from the center of the search space to the place of peak searching. Suppose that the two entire vectors are averaged. That means that the resulting average must lie on the straight line connecting the endpoints of the vectors. The radial distance of search thus should be shorter than the radial distance of search on tests where no directional conflict is presented. The amount of shortening depends on the angle between the two vectors and the direction of search. It is indicated by the formula given in Fig. 7. On the other hand, suppose that distances and directions are averaged separately. In that case, the averaged radial distance should not differ from the radial distance of search on tests without direc-
|x| = |u| cos(θ/2) / cos(δ − θ/2)

(θ is the angle between the two vectors u and s; δ is the direction of search measured from u.)
Fig. 7. Predictions of the vector-averaging and direction-averaging models when two conflicting directions of search are present. The vector-averaging model predicts peak searching somewhere on the line segment connecting the endpoints of u and s, whereas the direction-averaging model predicts peak searching somewhere on the arc connecting the endpoints of u and s. (Reprinted from Cheng (1994) with permission of the Psychonomic Society Publications.)
tional conflict, because the radial distance of the two landmark-goal vectors is the same. Directions can be averaged, and the resulting place of peak searching should lie somewhere on an arc connecting the endpoints of the landmark-goal vectors. In the intermediate range of directions, then, the two models make detectably different predictions.

As an example of this paradigm (Cheng, 1994, Experiment 1), pigeons were trained to find hidden food at a constant distance and direction from one cylindrical bottle. On the bottle, facing the goal, was taped a cardboard strip. On crucial tests, the bottle was rotated by 90°, the strip along with it. The strip thus pointed in a direction 90° apart from the rest of the landmarks.

RESULTS FAVOR THE DIRECTION-AVERAGING MODEL

The data from a series of experiments using the directional conflict paradigm are plotted against the predictions of the two models in Fig. 8. The dependent measure is the radial distance of search on experimental tests, tests in which a directional conflict was presented. The predicted measure for the direction-averaging model is the radial distance on the corresponding control test. For the vector-averaging model, the predicted measure is
Fig. 8. Observed radial distance of the place of peak searching on experimental tests plotted against the predicted radial distance for the vector-averaging model (A) and the direction-averaging model (B). The dotted lines represent perfect fit between data and model. The solid lines represent the best-fitting linear function through the origin by the least squares criterion. (Reprinted from Cheng (1994) with permission of the Psychonomic Society Publications.)
calculated according to the formula in Fig. 7. Each point represents one bird in one experiment. The dotted line represents the exact predictions of each of the models, neither of which has any free parameters. The solid lines represent the straight line through (0, 0) that best fits each set of data points. The data points are scattered unsystematically about the dotted line for the direction-averaging model, whereas they lie systematically above the dotted line for the vector-averaging model. This means that the data support the direction-averaging model but contradict the vector-averaging model, which systematically underpredicted the data. A number of statistical tests confirmed this impression (Cheng, 1994).

We thus arrive at a different model that still retains the flavor of the vector sum model (Fig. 9). Whereas the vector sum model averages vectors in their entirety, the new model separately averages the distance and direction components of vectors. Different subsystems compute distances and directions, and their outputs are combined to determine where to search. In Fig. 9, I have listed some of the factors that affect the direction of search, although doubtless other factors are also used. Experimental
[Figure 9 diagram: directional determination (fed by cues such as landmark orientation and local and global geometry) and distance determination form separate computational systems; a summator combines their outputs to drive action.]
Fig. 9. Distances and directions are separately and independently calculated in the landmark-based spatial memory of the pigeon. The summator averages the dictates of various cues.
evidence has been found for each of these factors. The experiment on landmark orientation, with the bottle with a strip on it, has already been described. In the unpublished experiment on the direction of light, done in collaboration with Sylvain Fiset, two lights were placed high on the walls, 90° apart. A circular search tray served as the search space. During training, only one of the lights was lit. On crucial tests, when the training light was turned off and the alternate light turned on, each of three birds shifted its location of peak searching. Thus the direction from which light emanates in part determines the direction of search.

In the unpublished experiment on local geometry, the circular search space was placed asymmetrically in a larger square tray. The orientation of the larger tray thus gave a directional cue, a local geometric frame about the search tray. When the square tray was rotated about the center of the search space by 90°, some birds followed the local geometry and rotated their place of peak searching.

And finally, based on results with rats (Cheng, 1986; Gallistel, 1990, ch. 6; Margules & Gallistel, 1988), I assume that the global geometry defined by the shape of the room is also used in determining the direction of search. In those experiments, directionally disoriented rats searched for food in an enclosed rectangular arena. On the walls of the arena were found many distinguishing features, including black and white panels on different walls, and different smells, textures, and visual patterns in the corners. The rats in this task often confused the correct location with a geometrically equivalent location at a 180° rotation through the center from the correct location. In some conditions, they chose the geometric equivalent as often as the correct location. The two locations are indistinguishable based on the geometry of the shape of the enclosure alone.
That is, without noting and remembering the features that stand on the shape carved out by the walls, the two locations cannot in principle be distinguished. That the rats make this geometric error suggests a module of the spatial representation system that encodes only the broad shape of the environment, a geometric module. Recent experiments with human infants have found the same pattern of errors (Hermer & Spelke, 1994).

As for the distance component, manipulations of landmark size have no systematic effects on the distance of search (Cheng, 1988). The birds apparently do not rely much on the projected retinal size of landmarks (although gerbils do in part; see Goodale, Ellard, & Booth, 1990). Sources of information that deliver the three-dimensional distance from goal to landmarks are used, presumably a number of different sources.
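The contrast between the two averaging schemes of Section III can also be checked numerically (a sketch under my own toy parameters, not the author's code; the 20-unit distance is arbitrary):

```python
# Two landmark-goal vectors of equal length r, pointing 90 degrees apart.
# Vector averaging puts the search point on the chord between their endpoints;
# direction averaging keeps it on the arc, at the full radial distance r.

import math

r = 20.0                              # common radial distance (arbitrary units)
angles = [0.0, math.pi / 2]           # the two dictated directions, 90 deg apart

# Vector-averaging model: average the vectors themselves.
vx = sum(r * math.cos(a) for a in angles) / len(angles)
vy = sum(r * math.sin(a) for a in angles) / len(angles)
chord_dist = math.hypot(vx, vy)       # r * cos(45 deg), shorter than r

# Direction-averaging model: average directions and distances separately.
mean_direction = sum(angles) / len(angles)   # midway between the two dictates
arc_dist = r                                 # both distances were r, so their mean is r

print(round(chord_dist, 1), round(arc_dist, 1))  # 14.1 20.0
```

The observed radial distances in Fig. 8 tracked the arc prediction rather than the shorter chord prediction, which is the sense in which the data favor the direction-averaging model.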
IV. Of Honeybees and Pigeons

The piloting servomechanism of the pigeon shares many features with that of the honeybee. I present the honeybee's system here together with the comparison. The experiments on honeybees ran in the same spirit as those on pigeons (Cartwright & Collett, 1982, 1983; Cheng, Collett, & Wehner, 1986; Cheng, Collett, Pickhard, & Wehner, 1987; Collett & Baron, 1994). Free-flying bees searched for a dish of sugar water in a test room. An array of landmarks marked the location of the goal, which stood in a constant spatial relationship with respect to the array. Occasionally, the bees were tested with the sugar water absent. Landmarks might be manipulated on these tests. In the model developed from the results, the honeybees, like the pigeons, divided the array into separate landmark elements, and made computations based on distances and directions separately and independently.

A. ELEMENTALISTIC SYSTEMS

Cartwright and Collett (1982, 1983) proposed that the honeybee encodes a panoramic retinal template of the way the surrounding landmarks look from the viewpoint of the goal. The bee compares the current retinal panorama to the template and moves so as to reduce the discrepancy, in the classic manner of a servomechanism. The pigeons too compare current perceptual input with an encoded record and act so as to reduce the discrepancy between the two. The term template, however, is misleading because it implies an entire, holistic picture or record to which comparisons are made. But the piloting servomechanisms in both honeybees and pigeons are elementalistic rather than holistic. The landmark array is analyzed into separate elements, and comparisons and computations are made on each element independently. We ought to think of the servomechanism as having multiple comparators, or else a comparator with multiple independent standards to which comparisons are made.
Whether we speak of multiple comparators or multiple standards within a single comparator is purely a terminological difference at the moment, as I do not know how to distinguish the two cases empirically. An elementalistic system raises the problem of how an array is divided into elements. The problem has two facets: (1) how the array is parsed into separate elements, and (2) how each element in the percept is matched to its corresponding partner in the encoded representation, also called the matching problem. An artificial landmark array often presents us with an intuitive way of dividing it into elements. For example, in the work on honeybees, where colored cylindrical tubes served as landmarks, each tube served
as an element, as did each of the spaces between the tubes. But in a natural setting, intuition is far from clear.

B. THE MATCHING PROBLEM

The matching problem is the graver of the two. My guess is that the systems would work with a large variety of divisions into elements. Getting the systems to work depends far more crucially on matching elements correctly, or at least matching enough elements correctly. A mismatched element systematically misdirects the honeybee or pigeon, and enough mismatches will make the system dysfunctional. For the honeybee, Cartwright and Collett (1982, 1983) proposed that an element on the template is matched to the spatially nearest element of the same type on the percept. Spatial nearness is defined in terms of compass direction, and the type of element refers to whether the element represents a tube or the space between tubes. Mismatches are produced with this scheme, but not enough of them are generated to sabotage the system. Of course, other characteristics, such as color (Cheng et al., 1986), can help solve the matching problem. For pigeons, I suggested (Cheng, 1992) another spatial scheme for identifying landmarks. Landmarks are identified by their particular (approximate) location in global space rather than by any particular set of characteristics or features. In this scheme, the pigeons must have a higher-level representation of the overall layout of space for locating particular landmarks to be used for piloting. One piece of evidence from unpublished data suggests that a landmark moved a substantial distance from its usual place is not identified as the same landmark. In this experiment, the birds’ task was to go to a food cup at the middle of one wall of a square arena in which food was hidden. Similar food cups were found at the other three walls. By the target food cup was a sizable block. When this block was moved to another wall on an unrewarded test, no pigeon followed it in its search.
They all went to search at the usual place in the room where food was always found, even though the block was the only thing within the arena that distinguished the four walls. Apparently, the birds had used their inertial sense and landmarks outside the arena to locate at least the approximate place of reward. Another way of putting this is that a landmark moved too far from its usual place is no longer the same landmark. Identification depends on spatial localization. Similar nonidentification of displaced landmarks or target objects has been noted in rodents. Mittelstaedt and Mittelstaedt (1980), studying homing by path integration in gerbils, displaced the animals’ home at the edge of the arena by some 20 cm and found that the returning gerbils went to the place
where they thought the home should be, and not to the nearby place from which the sounds and smells of their young were emanating. Devenport and Devenport (1994) trained chipmunks and ground squirrels to come to a distinctive feeder for food. When they displaced the feeder to a place within easy sight of the original location, all the animals but one went first to the original location where the feeder had been and looked for it there. (The exception did not go to the new feeder either, but to another marked location serving as a control location.) They did this despite having the beacon within easy sight. A displaced beacon is apparently not the same beacon anymore, and the animals needed to learn again to find food at the new location.

C. COMPARISONS OF DISTANCES AND DIRECTIONS DRIVE THE SYSTEM SEPARATELY

For honeybees as for pigeons, comparisons of distances and directions are done separately. The piloting system is modular not only in breaking down the landmark array into elements, but also in breaking each element into a distance and a directional component. For the pigeon, these components make up parts of the landmark-to-goal vectors that help specify the exact location to search. For the honeybee, these components independently direct the bee in the direction it should fly to get to the target. No exact specification of the target location is derived. Each element in the template is compared to its corresponding element on the percept for projected (retinal) size and compass direction. Mismatches in size move the bee closer to or farther from the element in question: The bee moves toward (in a centrifugal direction) elements that look too small and away from (in a centripetal direction) elements that look too big. It turns left (tangent to the left) when the matching element on the percept is to the left, and right when the matching element is to the right.
All these different directional vectors are averaged in a weighted fashion to determine the direction of flight for the next step. The model bee takes a quantum step and then does the comparisons and computations over again. The real bee presumably does this continuously. At no point is the entire template or representation as a whole invoked in guiding movement. An interesting problem arises in the honeybee system: the directions specified in the template must be compass directions, or have an earth-based reference. As the percept is specified in retinal coordinates, it would seem that the two coordinate systems must somehow be matched. In other words, the template must be rotated to be in the correct alignment with respect to the world. It turns out that this problem never arises, because the honeybees always face the same direction in flight when searching for
a landmark-based goal (Collett & Baron, 1994). They first identify the nearby landmark (only one was used in the experiments in question) and fly toward it. When they are near the landmark, they then turn to fly in a particular direction and conduct the landmark-based search. The direction seems to be specified in geomagnetic coordinates.
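The comparison-and-averaging loop just described can be sketched in code. This is an illustrative rendering, not the Cartwright and Collett model itself: the element representation, matching rule, gains, and correction vectors below are simplified stand-ins of my own devising. Each template element is matched to the nearest percept element of the same type, each mismatch in size or bearing contributes a correction vector, and the contributions are averaged into one step direction.

```python
import math

def step_direction(template, percept, size_gain=1.0, turn_gain=1.0):
    """One quantum step of a snapshot-style comparator (illustrative).

    Each element is a dict: {"type": "tube" or "gap",
                             "bearing": compass bearing in radians,
                             "size": projected (retinal) size}.
    Every template element is matched to the nearest percept element
    of the same type (by bearing); each mismatch contributes a small
    correction vector, and the normalized sum is the step direction.
    """
    dx = dy = 0.0
    for t in template:
        candidates = [p for p in percept if p["type"] == t["type"]]
        if not candidates:
            continue
        p = min(candidates, key=lambda c: abs(c["bearing"] - t["bearing"]))
        # Radial correction: an element that looks too small pulls the
        # bee toward it; one that looks too big pushes it away.
        radial = size_gain * (t["size"] - p["size"])
        dx += radial * math.cos(p["bearing"])
        dy += radial * math.sin(p["bearing"])
        # Tangential correction: turn toward the side on which the
        # matched percept element lies relative to the template.
        tangent = turn_gain * (p["bearing"] - t["bearing"])
        dx += -tangent * math.sin(p["bearing"])
        dy += tangent * math.cos(p["bearing"])
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm) if norm else (0.0, 0.0)

# A tube due east looks smaller than remembered, so the step heads east.
template = [{"type": "tube", "bearing": 0.0, "size": 2.0}]
percept  = [{"type": "tube", "bearing": 0.0, "size": 1.0}]
print(step_direction(template, percept))  # (1.0, 0.0)
```

Note that, as in the text, no target location is ever computed: the loop only outputs a direction for the next step, and iterating it carries the model bee toward the goal.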
V. Averaging Independent Sources of Information
These piloting servomechanisms illustrate a neurocognitive architecture of dividing and averaging. They illustrate modular systems in a number of senses. They suggest certain neural and cognitive control mechanisms. They seem also to require some neurophysiological mechanism for storing the values of dimensions of information that enter into averaging. I conclude this chapter with some thoughts along these lines.

A. MODULARITY

Landmark-based spatial memories in the pigeon and the honeybee work in a modular fashion in three senses. (1) The systems as a whole are modular. (2) They modularize the information input in breaking down a whole landmark array into elements. (3) They modularize computation over types of information in working separately and independently on the metric properties of distances and directions. These piloting systems take in and compare a particular, restricted set of information and generate behavior on that basis. Other kinds of information do not play a role in the systems. They are specialized systems or special learning devices designed particularly to perform one kind of task. They are paradigm cases of what Chomsky (1980) has called mental organs and of what Fodor (1983) terms vertical modularity. Each is a candidate for a special memory system (Sherry & Schacter, 1987). As such, they throw doubt on the empiricist view that the organism is simply one general learning device. The systems work with the metric properties of space: distances and directions. These properties, highest in the hierarchy of geometric properties (Cheng & Gallistel, 1984; Gallistel, 1990, ch. 6), are intuitively the most spatial to us. This further suggests specialized systems particularly geared to computing spatial information.
Attempts to incorporate spatial knowledge into an associationist framework have posited the use of “lower” geometric properties, such as graph-theoretic properties (e.g., Lieblich & Arbib, 1982). Such systems cannot account for the use of metric information, which they have no way of capturing.
The landmark-based spatial memories of honeybees and pigeons also modularize within the system. As I mentioned before, the landmark array is broken down into elements for analysis and computation. Each element is separately and independently analyzed, and the outputs are later integrated by way of the averaging process. Likewise, the two types of metric information, distances and directions, are treated in modular fashion. We can compare this to a perceptual system, in which incoming sensory information is separated and sent to different channels, to be reconstructed later into percepts. The different “channels” in the piloting servomechanisms are integrated not into percepts but into behavior that brings the animal nearer to its goal. We can call such systems internally modular.

B. CONTROL BY WEIGHTING

Internally modular systems provide many handles for control. Each of the boxes in Fig. 9, for example, can be emphasized or dampened. Factors external to the system can control which components play a larger role in the averaging process that guides behavior, giving flexibility to the system. Control is exerted by adjusting the weighting parameter of each of the boxes. How this might be done neurophysiologically is easy to imagine. Presumably, each box gives a neural output carried by synaptic connections to other parts of the system, and the strength of that output can be modulated by other inputs that potentiate or depotentiate the connection (Gallistel, 1980). Potentiation and depotentiation are typical methods of hierarchical control of action. The flexibility of the control system means that we might find variation in behavior across organisms and situations. In the case of pigeons, this is reflected in the data: often when compromises are to be made, different animals show different weighting patterns. I should emphasize that this does not represent noise or imperfections in the data.
Individual variation here shows where free parameters are found in the system. The data show constancy in other respects. For example, the data for particular individuals usually show astounding order (see, for example, Cheng, 1988). Or, when a landmark was moved parallel to the edge near the goal, pigeons did not shift their peak location of search in the perpendicular direction. The system works well enough with a large range of weighting schemes: all the animals usually find the food. It is a functional desideratum for a system to work with a range of parameter values, relieving the animal of the task of learning exact parameter values. Nevertheless, aside from idiosyncratic parametric variation, some general principles in weighting can be found. Both honeybees (Cheng et al., 1987)
and pigeons (Cheng, 1989) put more weight on nearer landmarks, with the honeybees in this case gauging distance by motion parallax (Lehrer, Srinivasan, Zhang, & Horridge, 1988; Srinivasan, Lehrer, Zhang, & Horridge, 1989). Honeybees also put more weight on landmarks that project a larger retinal size. Functionally speaking, the two species presumably do this because nearer and larger landmarks guide more accurately. Control by modulating the weights in averaging also takes place across two wayfinding servomechanisms, namely piloting and path integration. Etienne et al. (1990) put piloting (landmark) cues and path integration cues in conflict for hamsters by displacing a salient landmark. The hamsters wandered from their home at the edge of a circular arena to the middle of the arena to retrieve some food for hoarding. A salient landmark, a light usually positioned over their home, was displaced by 90° on one series of tests. Many subjects followed the landmark given this conflict, but some struck out homeward in a compromise direction between the direction of the landmark and the inertial direction home. The outputs of the path integration and piloting systems were averaged in these cases. I suspect that the weighted averaging of independent sources of information is a common feature of neurocognitive architecture. The modularity of information processing makes any problem more tractable. Weighted averaging allows ready control over the system and is readily realizable in the brain.

C. NEUROPHYSIOLOGICAL INSTANTIATION

The entire scheme of taking a weighted average of the values of different submodules of information suggests further points concerning the neurophysiological basis of memory and learning, although matters here become far more speculative. The averaging of values suggests that the brain must somehow store the values of dimensions stably. I conclude with some reflections on this point.
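To make concrete what such a mechanism must accomplish, the cue-conflict compromise just described can be rendered as a weighted circular average of two heading estimates. The function and weights below are illustrative assumptions of mine, not a fitted model from Etienne et al. (1990); the point is only that stored values (headings) and adjustable weights must both be available to the averaging process.

```python
import math

def combined_heading(pilot_deg, path_deg, w_pilot=0.5, w_path=0.5):
    """Weighted circular average of a piloting (landmark) heading and a
    path-integration heading, both in compass degrees (illustrative)."""
    x = (w_pilot * math.cos(math.radians(pilot_deg)) +
         w_path * math.cos(math.radians(path_deg)))
    y = (w_pilot * math.sin(math.radians(pilot_deg)) +
         w_path * math.sin(math.radians(path_deg)))
    return math.degrees(math.atan2(y, x)) % 360

# Landmark shifted by 90 degrees relative to the inertial home direction:
# equal weights yield a compromise heading halfway in between.
print(combined_heading(90.0, 0.0))            # 45.0
# A heavier piloting weight yields a heading that follows the landmark.
print(combined_heading(90.0, 0.0, 0.9, 0.1))
```

On this rendering, the individual differences among hamsters correspond simply to different weight settings, which is how the chapter proposes control is exerted.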
The most straightforward suggestion is that to average values, one must have stored the values in the brain and then retrieve them for averaging. Values must be stable to be useful: values that decay or change unpredictably over time without further informational input will mislead the animal. To me, values are unlikely to be represented by firing rates. Firing rates can fluctuate depending on the state of the brain, such as the availability of neurotransmitters. Functionally, firing rates are also expensive as a storage device. The brain is energetically expensive compared to the body, which makes it especially inefficient to store values, used only on occasion, with a process that constantly consumes energy. It is like using lit light bulbs rather than, say, pieces of stone to represent a number, the former being far more expensive energetically.
The alternative to firing rates is some kind of stable structural representation. This might be some “memory” substance, the amount of which represents the value on a dimension; Gallistel (1990, ch. 15) provides more insight into this issue. Or it might be some structural change in the neuron that makes it fire at different rates when called upon. By “called upon,” I mean when it comes time to retrieve the value. The retrieval process might well be dynamic, represented by the firing of neurons. One scheme I can imagine is to have a retrieval neuron fire at a constant rate in accessing a value. The structure of a value neuron, which functions to store the value, then determines the rate at which it fires in response to the retrieval neuron, and the output of the value neuron can represent the value. Representing the output of values by firing rates has the advantage that it readily allows modulatory control by potentiators and depotentiators: other processes can weight the value by affecting its connection to other units. It is not clear how such modulation could operate directly on structurally stored values. To me, it seems unlikely that values to be averaged can be stored as associative connection strengths, including the modern version of these in the form of weights in a connectionist network. How do values of distances and directions from landmarks get stored in any such network? How do they get modulated in weighted averaging? I can only await concrete proposals in this regard, but I remain doubtful. Finally, in speculating on neurophysiology here, I mean far less to propose particular neurophysiological mechanisms of memory than to illustrate the point that to come up with the neurophysiological underpinnings, we must strive for a deep and theoretical understanding of the behavior that the physiology is supposed to underpin.
We can identify many kinds of neurophysiological events in the brain, though there might be others that we have not yet identified. The kind of neurophysiological mechanism that we seek must depend heavily on the models of behavior we devise. Perhaps the most central physiological function of the brain is to store and use information about the world gathered by the senses. Neuroscience’s grand failure is not having a definitive account of how the brain does this job. To overcome this central shortcoming, it would be best to pay far more attention to behavior.

ACKNOWLEDGMENTS

The author’s research reported in this chapter was supported by a research grant from the Natural Sciences and Engineering Research Council of Canada. The chapter was written while the author was at the Department of Psychology, University of Toronto. Correspondence
about this chapter should be sent to Ken Cheng, School of Behavioural Sciences, Macquarie University, Sydney NSW 2109, Australia (electronic mail: KCHENG@bunyip.bhs.mq.edu.au).
REFERENCES

Cartwright, B. A., & Collett, T. S. (1982). How honeybees use landmarks to guide their return to a food source. Nature, 295, 560-564.
Cartwright, B. A., & Collett, T. S. (1983). Landmark learning in bees. Journal of Comparative Physiology A, 151, 521-543.
Cheng, K. (1986). A purely geometric module in the rat’s spatial representation. Cognition, 23, 149-178.
Cheng, K. (1988). Some psychophysics of the pigeon’s use of landmarks. Journal of Comparative Physiology A, 162, 815-826.
Cheng, K. (1989). The vector sum model of pigeon landmark use. Journal of Experimental Psychology: Animal Behavior Processes, 15, 366-375.
Cheng, K. (1990). More psychophysics of the pigeon’s use of landmarks. Journal of Comparative Physiology A, 166, 857-863.
Cheng, K. (1992). Three psychophysical principles in the processing of spatial and temporal information. In W. K. Honig & J. G. Fetterman (Eds.), Cognitive aspects of stimulus control (pp. 69-88). Hillsdale, NJ: Erlbaum.
Cheng, K. (1994). The determination of direction in landmark-based spatial search in pigeons: A further test of the vector sum model. Animal Learning & Behavior, 22, 291-301.
Cheng, K., Collett, T. S., Pickhard, A., & Wehner, R. (1987). The use of visual landmarks by honeybees: Bees weight landmarks according to their distance from the goal. Journal of Comparative Physiology A, 161, 469-475.
Cheng, K., Collett, T. S., & Wehner, R. (1986). Honeybees learn the colours of landmarks. Journal of Comparative Physiology A, 159, 69-73.
Cheng, K., & Gallistel, C. R. (1984). Testing the geometric power of an animal’s spatial representation. In H. L. Roitblat, T. G. Bever, & H. S. Terrace (Eds.), Animal cognition (pp. 409-423). Hillsdale, NJ: Erlbaum.
Cheng, K., & Sherry, D. (1992). Landmark-based spatial memory in birds (Parus atricapillus and Columba livia): The use of edges and distances to represent spatial positions. Journal of Comparative Psychology, 106, 331-341.
Chomsky, N. (1980). Rules and representations.
New York, NY: Columbia University Press.
Collett, T. S., & Baron, J. (1994). Biological compasses and the coordinate frame of landmark memories in honeybees. Nature, 368, 137-140.
Collett, T. S., Cartwright, B. A., & Smith, B. A. (1986). Landmark learning and visuo-spatial memories in gerbils. Journal of Comparative Physiology A, 158, 835-851.
Devenport, J. A., & Devenport, L. D. (1994). Spatial navigation in natural habitat by ground-dwelling sciurids. Animal Behaviour, 47, 727-729.
Dyer, F. C., & Gould, J. L. (1983). Honeybee navigation. American Scientist, 71, 587-597.
Etienne, A. (1987). The control of short-distance homing in the golden hamster. In P. Ellen & C. Thinus-Blanc (Eds.), Cognitive processes and spatial orientation in animal and man: Volume I. Experimental animal psychology and ethology (pp. 233-251). Dordrecht, Netherlands: Martinus Nijhoff.
Etienne, A., Teroni, E., Hurni, C., & Portenier, V. (1990). The effect of a single light cue on homing behaviour of the golden hamster. Animal Behaviour, 39, 17-41.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fraenkel, G. S., & Gunn, D. L. (1940). The orientation of animals. Oxford: Oxford University Press.
Gallistel, C. R. (1980). The organization of action. Hillsdale, NJ: Erlbaum.
Gallistel, C. R. (1990). The organization of learning. Cambridge, MA: MIT Press.
Goodale, M. A., Ellard, C. G., & Booth, L. (1990). The role of image size and retinal motion in the computation of absolute distance by the Mongolian gerbil (Meriones unguiculatus). Vision Research, 30, 399-413.
Hermer, L., & Spelke, E. S. (1994). A geometric process for spatial reorientation in young children. Nature, 370, 57-59.
Lehrer, M., Srinivasan, M. V., Zhang, S. W., & Horridge, G. A. (1988). Motion cues provide the bee’s visual world with a third dimension. Nature, 332, 356-357.
Lieblich, I., & Arbib, M. A. (1982). Multiple representations of space underlying behavior. The Behavioral and Brain Sciences, 5, 627-659.
Margules, J., & Gallistel, C. R. (1988). Heading in the rat: Determination by environmental shape. Animal Learning & Behavior, 16, 404-410.
Mather, J. A. (1991). Navigation by spatial memory and use of visual landmarks in octopuses. Journal of Comparative Physiology A, 168, 491-497.
Mittelstaedt, M. L., & Mittelstaedt, H. (1980). Homing by path integration in a mammal. Naturwissenschaften, 67, 566-567.
Mueller, M., & Wehner, R. (1988). Path integration in desert ants, Cataglyphis fortis. Proceedings of the National Academy of Sciences, 85, 5287-5290.
Potegal, M. (1982). Vestibular and neostriatal contributions to spatial orientation. In M. Potegal (Ed.), Spatial abilities: Development and physiological foundations (pp. 361-387). New York: Academic Press.
Séguinot, V., Maurer, R., & Etienne, A. S. (1993). Dead reckoning in a small mammal: The evaluation of distance. Journal of Comparative Physiology A, 173, 103-113.
Sherry, D. F., & Schacter, D. L. (1987). The evolution of multiple memory systems. Psychological Review, 94, 439-454.
Spetch, M. L., Cheng, K., & Mondloch, M. V.
(1992). Landmark use by pigeons in a touchscreen spatial search task. Animal Learning & Behavior, 20, 281-292.
Spetch, M. L., & Edwards, C. A. (1988). Pigeons’, Columba livia, use of global and local cues for spatial memory. Animal Behaviour, 36, 293-296.
Srinivasan, M. V., Lehrer, M., Zhang, S. W., & Horridge, G. A. (1989). How honeybees measure their distance from objects of unknown size. Journal of Comparative Physiology A, 165, 605-613.
Suzuki, S., Augerinos, G., & Black, A. H. (1980). Stimulus control of spatial behavior on the eight-arm maze in rats. Learning and Motivation, 11, 1-18.
Tinbergen, N. (1972). The animal in its world. Cambridge, MA: Harvard University Press.
Vander Wall, S. B. (1982). An experimental analysis of cache recovery in Clark’s nutcracker. Animal Behaviour, 30, 84-94.
Vander Wall, S. B., & Balda, R. P. (1981). Ecology and evolution of food-storage behavior in conifer-seed-caching corvids. Zeitschrift für Tierpsychologie, 56, 217-242.
von Frisch, K. (1953). The dancing bees (D. Ilse, Trans.). San Diego: Harcourt, Brace, Jovanovich. (Original work published 1953)
von St. Paul, U. (1982). Do geese use path integration for walking home? In F. Papi & H. G. Wallraff (Eds.), Avian navigation (pp. 297-307). Berlin: Springer-Verlag.
Wehner, R., & Räber, F. (1979). Visual spatial memories in desert ants, Cataglyphis bicolor (Hymenoptera: Formicidae). Experientia, 35, 1569-1571.
Wehner, R., & Srinivasan, M. V. (1981). Searching behaviour of desert ants, genus Cataglyphis. Journal of Comparative Physiology A, 142, 315-338.
THE ACQUISITION AND STRUCTURE OF EMOTIONAL RESPONSE CATEGORIES

Paula M. Niedenthal and Jamin B. Halberstadt
I. Introduction

Cognitive scientists study categories because the manner in which people perceive equivalences and make distinctions among objects reveals much about the mind’s structure and function. One of the early influential statements on category structure and function was Bruner, Goodnow, and Austin’s (1956) A Study of Thinking. Although our understanding of categories and their use has advanced enormously in the four decades since the publication of that book (e.g., Barsalou, 1991, 1993; Cantor & Mischel, 1979; Goldstone, 1994a, 1994b; Lakoff, 1987; Markman, 1989; Medin & Schaffer, 1978; Murphy & Medin, 1985; Rosch, 1973, 1975; E. E. Smith & Medin, 1981; E. R. Smith & Zarate, 1992), this advance has primarily concerned what Bruner et al. called formal and functional categories. Such categories are now likely to be variously called taxonomic, natural kind, artifactual, trait, social role, and, in the case of some functional categories, ad hoc, or goal-derived categories. That is, cognitive scientists have focused largely on groups of animate and inanimate, natural and artificial things that are grouped together because of their perceptual, relational, or functional similarity, because they are related through a common theory of cause and effect, or because they facilitate a common goal. That list seems long enough; one might ask whether any other type of category has been left out.
In fact, although some have forgotten this point, Bruner et al. proposed that “one may distinguish three broad classes of equivalence categories, each distinguished by the kind of defining response involved. They may be called affective, functional, and formal categories.”¹ Affective categories are the anomaly here with respect to the corpus of research and theory on categorization in cognitive science. The idea is that affective reactions provide conceptual coherence. But Bruner et al. did little more than offer examples of affective categories: A group of people, books, weather of a certain kind, and certain states of mind are all grouped together as “alike,” the “same kind of thing”. . . . What holds them together and what leads one to say that some new experience “reminds one of such and such weather, people, and states” is the evocation of a defining affective response. (p. 4)
And we know of no subsequent theoretical or empirical attempts to follow up on these provocative ideas. On the one hand, it is unsurprising that cognitive scientists have chosen to ignore the possible role of affective response in conceptual coherence. The solution to the empirical problem of what should be taken as evidence that an individual possesses or uses an affective category is not at first obvious, in part because affective response categories may be implicit. Evidence of an individual’s possession of a taxonomic category is often inferred from a nominative response: a child calls two different furry animals that bark dog, and it is assumed that he or she understands the concept, dog. And the content of taxonomic categories can be assessed by asking subjects to perform feature-listing tasks. But as Bruner and colleagues acknowledged, affective response categories are often not amenable to description in terms of the features of the objects and events comprising them. Moreover, somewhat misleading category labels may be attached to affective categories in verbal discourse. For example, Bruner et al. suggested that “an air-raid siren, a dislodged piton while climbing, and a severe dressing-down by a superior may all produce a common autonomic response in a man and by this fact we infer that they are all grouped as ‘danger situations’” (p. 2). In inferring the category danger situations, Bruner et al. avoid their own proposition that affective reactions provide conceptual coherence, and in effect generate an example of conceptual coherence that fits the concepts-as-theories framework offered by Murphy and Medin (1985). The air-raid siren, dislodged piton, and dressing-down might be more simply called things that evoke fear, or perhaps more accurately, things that are responded to with the set of physiological and expressive behaviors, as well as overt actions, that we call fear. Although there is probably a relationship between linguistic structure and the physiological basis of emotional reactions (e.g., see Lakoff’s, 1987, case study of anger), a point to which we return at the end of the chapter, we distinguish between categories of emotion knowledge, or what people know about the emotions (e.g., that they are fearful at horror movies), and affective response categories, or the sets of things that elicit similar emotional reactions (such as the items in the fear category described by Bruner et al., 1956). The former type of category is a domain of semantic knowledge that appears to behave the way other semantic categories behave (e.g., Fehr & Russell, 1984; Shaver, Schwartz, Kirson, & O’Connor, 1987). Affective response categories are classes of things that are treated as similar because they elicit similar affective responses; people may not have explicit knowledge about the members of these categories. Neuroscientists have recently made a similar distinction between memory for facts about emotion and memory for emotionally evocative experiences. For example, LeDoux (1994) noted that emotional learning that comes about through fear conditioning is not declarative learning. Rather it is mediated by a different system, which in all likelihood operates independently of our conscious awareness. Emotional information may be stored within declarative memory, but it is kept there as a cold declarative fact. (p. 57)

¹ Although the present authors might be inclined to contend that Bruner et al.’s ordering of the three types of equivalence categories reflects the relative importance of those categories in human behavior, it appears that the ordering was intended to reflect a proposed developmental trend in the use of different types of categories, with affective equivalences and distinctions being primary.
On the other hand, the empirical study of affective response categories is not intractable. Categories are studied with methods that do not involve the production of verbal labels or feature lists (e.g., Medin & Schaffer, 1978; Nosofsky, 1986; Reed, 1972). And emotion researchers have developed numerous methods of inducing emotional states in the laboratory (e.g., Isen & Gorgoglione, 1983), so emotions themselves are amenable to experimental study. Furthermore, we know that emotions provide powerful specific feedback about stimulus events in the organism’s environment (Buck, 1985; Hinde, 1974), and we know that brain mechanisms involved in the processing of emotion have extensive interconnections with cortical areas that subserve higher level cognitive processes (LeDoux, 1993, 1994). So it makes both theoretical and biological sense to talk about emotions as a kind of equivalence response, that is, as a response that can render external events similar. The present chapter is about affective response categories and what the study of such categories reveals about the mind. One way to investigate
categories is to generate the proposed equivalence response in the subject and then examine how the response mediates category learning or the retrieval of already-learned information. So, to study affective categories, one might induce an affective response in the subject and then examine the influence of this state on learning and memory. In the first section of this chapter, we review the extant literature on the relationship between emotional state and long-term memory. Laird (1989) commented, “For memory theories, the mood and memory research has meant little change beyond the addition of some interesting variables” (p. 33). This may be so for theories of memory, but we contend that the work does something more for theories of categorization. The research indicates that although some categories are grounded in perceptual, functional, or relational similarity, other categories have their ground in a common affective response. This notion is captured in associative network theories of emotion that locate emotions as organizing nodes in memory (e.g., Bower, 1981, 1987; M. S. Clark & Isen, 1982). We contend further that the study of emotion lends insight into the probable structure of emotion concepts and serves to clarify the failure to find support for some hypotheses derived from the network model. In fact, for reasons that will become apparent, we henceforth call what Bruner et al. (1956) referred to as affective response categories emotional response categories. Another way to study categories is to construct artificial categories and teach subjects to associate distinct equivalence responses with members of the different categories. In the second section of this chapter, we review studies in which we used this logic to provide evidence for some of the assumptions inherent in models of emotional response categorization. The findings suggest that subjects can acquire emotional response categories in the laboratory.
In addition, we review some research that suggests that there are important constraints on emotional response categories that make them more similar to natural kind categories than other types of categories. Finally, in the last section of the chapter, we report a series of experiments that suggest that the influence of emotion categories begins at perception. Some recent evidence (e.g., Goldstone, 1994b; Norman, Brooks, Coblentz, & Babcook, 1992; von Hippel, Hawkins, & Narayan, 1994) supports the provocative and longstanding idea that people’s concepts mediate their perception of external events (e.g., Whorf, 1941). Our specific interest in the influences of emotion in perception has its intellectual basis in the New Look research of the 1940s and 1950s. Researchers who represented that movement (e.g., Bruner & Postman, 1947) first proposed that emotion and motivation mediate perception, or as Dixon (1981) put it, that “the emotional connotations of an external stimulus may be reacted to before
Emotional Response Categories
or without achieving conscious representation, and this emotional ‘classification’ then determines subsequent [cognitive-perceptual] events” (p. 121). The overarching theme of the chapter, then, is that emotional reactions can make things “hang together” for individuals, and that within existing categories as diverse as women, solutions to mathematical problems, music, and cartoon stimuli, individuals group things together on the basis of their common emotional reactions to them.
II. Underlying Emotion Theory

The present work is guided by an approach to the study of emotion that informs our specific concern with the processes that subserve the physiological and subjective components of emotion, that is, the feelings that we are suggesting can provide a form of conceptual coherence. The general approach is exemplified by the perceptual-motor theory of Leventhal (1984; but see also Buck’s, 1985, prime theory), which is itself based in theories that assert that emotions are central nervous system processes (Cannon, 1929), and also theories that hold that peripheral feedback can influence subjective feeling (Izard, 1977; Tomkins, 1962). According to Leventhal, the feeling of an emotion is created by the motor activity that results from the perception of an exciting stimulus. He outlines a hierarchical system of processes involved in the elicitation of an emotion. The first level, the expressive-motor level, is composed of neuromotor templates that produce expressive facial and bodily programs of behavior. The latter connections are presumably innate. The second level is the schematic level of processing, at which information related to subjective state, motor routines, and their eliciting events is stored as schematic memory. Lastly, the conceptual level is composed of learned abstract rules for emotion episodes. According to Leventhal (1984), conceptual knowledge can sometimes be utilized to regulate the former two levels of processing by controlling attention and voluntary action. Hierarchical, or component process, theories of emotion such as Leventhal’s (1984) and Buck’s (1985) explicitly recognize that emotions can be elicited by the mere perception of evocative stimuli. Hierarchical theories also provide the means of conceptualizing emotional responses, and the development of schematic memory for conditioned emotional responses, as processes that are potentially distinct from higher order appraisal processes.
Although we concur that cognitive appraisal mediates emotions under some conditions (e.g., see Frijda, Kuipers, & ter Schure, 1989; Lazarus, 1966; Ortony, Clore, & Collins, 1988), in this work we are most concerned with the way in which feelings themselves (expressive-motor processes) and
their associations with triggering events serve to provide order to the individual’s perceptual world. With respect to Leventhal’s hierarchy, we suspect that it is the conceptual level at which cross-cultural differences in emotion processes are most likely to be observed. Indeed, although many differences in what Mesquita and Frijda (1992) have called antecedent events and event coding have been observed across cultures, there are far fewer differences in the physiological and expressive components of emotion (Ekman, 1984; see also Shweder, 1991). This suggests that the structure of emotional response categories should be quite universal, but their frequency of instantiation and content may vary widely.

III. Emotion and Memory
Consider the following scenario: Stephanie, held captive in a long supermarket line, returns to her car just in time to see the tow truck, with her car in tow, recede into the distance. She is furious. She couldn’t have been more than five minutes late on the meter, she fumes to herself. And besides, why doesn’t the market open more checkout lines if they’re going to be so strict in their parking enforcement? She calls her husband for a ride, and while she waits with her slowly spoiling dairy products, her anger over the incident persists, even intensifies. It starts to cast a shadow over new interactions. She scowls at an approaching girl scout hawking her cookies: why are those girls always pestering you for money? She kicks aside a stray puppy that has wandered up, seeking attention: why don’t people keep control of their animals? Stephanie is reminded of other incidents that have made her angry: Claude never returned the CDs she lent him over a year ago. Melanie circulated those pictures of her from last New Year’s Eve. The bitch. Her anger also affects her predictions of the future. There’s no way she’s going to that office party this weekend; Tim will just make those maddening jokes about her small feet. He thinks he’s such a comedian! Come to think of it, she might as well cancel that vacation for next month; she and her husband will inevitably fight over how to spend money on it. They can’t afford it, really, and there’s no sense spending what little money they have on a week of angry nights.
In general, Stephanie’s interpretation of current and future events, as well as her memory for past events, has a common theme: they are all associated with feelings of anger. This phenomenon is interesting not only because it suggests that emotions may be important memory-retrieval cues, but also because it illustrates the idea that during emotional states people group together events that apparently have nothing in common other than their relationship to the emotion. In this situation, in Stephanie’s mind the towed car is associated with Claude and the missing CDs, Melanie and the circulated pictures, jokes about her feet, fights with her husband, and the idea of canceling her
vacation plans. If the towing incident had not made her angry, Stephanie might have constructed or accessed the category of “things to do when your car is towed” and her thoughts would have followed a different chain of associations; for example, she might have thought about where her car was likely to have been towed, how much it would cost to retrieve it, and whether or not the car would be damaged. More formally, this example illustrates the emotion-congruence hypothesis, which states generally that the activation of a particular emotional state primes emotion-related material and biases perceptual and conceptual processes. Next, we describe a family of models from which more specific emotion-congruence predictions are derived. We also review relevant empirical literature. From this work we draw some conclusions about the structure of emotional response categories.

A. NETWORK THEORIES OF EMOTION
Scientists who study categories usually do not ask the question, Are categories represented cognitively? For example, the fact that people can recognize dogs and deal with them not as novel stimulus events but as things that have been encountered before seems to be good enough evidence. The more interesting problem, then, is how to model the cognitive representation of categories and category use (e.g., Kruschke, 1992; Medin, Wattenmaker, & Michalski, 1987; Nosofsky, Palmeri, & McKinley, 1994). The associative network model is a possible solution. In general, in this type of model, memory is conceptualized as a network of informational units in which events, facts, and beliefs are represented as collections of propositions and the relationships among them (Anderson & Bower, 1973; Collins & Loftus, 1975; Shiffrin & Schneider, 1977). Intersections in the network represent concepts, and these are linked to other representations as a function of the strength of the semantic relations between them. For example, the concept dog consists of the information linked to the representation for dog, with the graded category membership accounted for by the varying strength of the connections between the members attached to the concept. (Previously the mental representations of concepts have been called nodes. We favor the terms information or informational units because there is no evidence to support the existence of specific mental entities that correspond to pieces of information.) When information is activated by an internal or external experience relevant to the concept it represents, activation spreads along the pathways intersecting the informational unit in a decreasing gradient. Associated information in the network receives activation as a function of the proximity and strength of its links to the activated unit, and the degree to which that unit was initially activated. Information that is activated above some threshold reaches consciousness.
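The spreading-activation mechanism just described can be illustrated with a small computational sketch. The network contents, link strengths, decay rate, and consciousness threshold below are invented for illustration; they are not parameters proposed by any of the models cited.

```python
def spread_activation(links, source, initial=1.0, decay=0.5, threshold=0.3):
    """Propagate activation outward from `source` through a network of
    weighted associative links (unit -> {neighbor: link strength}).
    Activation falls off in a decreasing gradient with network distance.
    """
    activation = {source: initial}
    frontier = [source]
    while frontier:
        unit = frontier.pop()
        for neighbor, strength in links.get(unit, {}).items():
            received = activation[unit] * strength * decay
            if received > activation.get(neighbor, 0.0):
                activation[neighbor] = received
                frontier.append(neighbor)
    # Only units activated above the threshold "reach consciousness."
    return {unit: a for unit, a in activation.items() if a >= threshold}

# A toy dog network: direct associates receive strong activation, while
# "cat," two links away, receives too little to cross the threshold.
network = {"dog": {"barks": 0.9, "pet": 0.8}, "pet": {"cat": 0.7}}
conscious = spread_activation(network, "dog")  # dog, barks, and pet only
```

The sketch captures only the two properties the text emphasizes: activation decreases with distance along weighted links, and a fixed threshold determines which associated information becomes available.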
In contrast, that people represent emotional response categories at all is not immediately obvious. Some compelling evidence is that people recognize their emotional states, probably even before they have words with which to label the states. It may be hard to describe sadness or anger, but we know them when we feel them; we have felt them before. The phenomenal experience of emotion must be represented (cf. Rimé, Philippot, & Cisamolo, 1990). The Stephanie example suggests further that representations of emotional states are associated with conceptual material in memory such that they organize emotion-related material. A number of researchers (e.g., Bower, 1981, 1987, 1991; M. S. Clark & Isen, 1982; Ingram, 1984; Niedenthal, Setterlund, & Jones, 1994; Teasdale, 1983) have heuristically modeled the representation of emotion within associative network models of memory. Emotions are represented as units (which stand for the phenomenal states) and the information connected to them. The associated information might be past events that have evoked the emotion, verbal labels for the emotion, statements and descriptions about the emotion, or the behaviors, expressive activity, and other physiological events that comprise the emotion. Some of these links are assumed to be innate (e.g., those corresponding to appropriate facial expressions of emotion); other links are assumed to be created through learning at the individual or cultural level (e.g., the association of the concept of a funeral with the emotion of sadness). From this approach, the experience of an emotion such as happiness involves the activation of the appropriate emotion unit in memory. Activation then spreads from the information representing the phenomenal experience of happiness to other conceptual units that represent events that were related to happiness in the past, as well as to other related material.
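On this heuristic model, an experienced emotion boosts the accessibility of material in its response category. The following is a minimal sketch of that prediction; the category contents, link strengths, additive priming rule, and threshold are all hypothetical illustrations, not an implementation of any cited theory.

```python
# Hypothetical emotional response categories: an emotion unit linked to
# associated material, with illustrative (not empirical) link strengths.
EMOTION_LINKS = {
    "happiness": {"smiling": 0.9, "party": 0.7, "funeral": 0.0},
    "sadness": {"funeral": 0.9, "rain": 0.6, "party": 0.1},
}

def reaches_consciousness(item, baseline, current_emotion, threshold=0.5):
    """An item is retrieved when its baseline activation plus the
    activation spread from the currently active emotion unit crosses
    a fixed threshold."""
    primed = EMOTION_LINKS.get(current_emotion, {}).get(item, 0.0)
    return baseline + 0.5 * primed >= threshold

# The same weakly encoded memory is retrieved in a congruent emotional
# state but not in an incongruent one (emotion-congruent retrieval):
reaches_consciousness("funeral", 0.2, "sadness")    # True
reaches_consciousness("funeral", 0.2, "happiness")  # False
```

The point of the sketch is simply that material below the retrieval threshold on its own can cross it when the matching emotion unit is active, which is the mechanism behind the congruence predictions developed next.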
A spreading-activation model of emotional memory yields several predictions about the influences of emotional state on information processing.

B. EMOTION-CONGRUENT RETRIEVAL

One such prediction is that a person who is experiencing a particular emotional state will be more likely to retrieve material contained in the corresponding emotional response category, that is, material with similar emotional meaning and causally related to the emotion in question. In numerous studies, activation of a category has been found to facilitate retrieval of information associated with that category (e.g., Rosch, 1975). Similarly, when an emotion concept is activated (e.g., when an individual enters an emotional state), activation is assumed to spread to items connected to that information through their association with the appropriate emotional response. The additional activation that the items receive from
the emotion unit makes them more likely to reach consciousness. Thus, the activation of emotional response categories offers an account of emotion-congruent retrieval, or facilitated recall for material related to a person’s current emotional state. In some cases, as in the related phenomenon of emotional-state-dependent retrieval, the relationship between the emotion and some information may not have existed prior to the experiment. Regardless of when the encoding takes place, however, information that is encoded as an instance of an emotional response category is assumed to receive activation from a primed emotion concept.* There are numerous empirical demonstrations that emotions do indeed guide, in an emotion-congruent manner, retrieval from long-term memory. Many of these findings have been reviewed at length elsewhere (Blaney, 1986; Ellis & Ashbrook, 1989; Singer & Salovey, 1988), and a few examples here will suffice. Some of the studies have examined the retrieval of autobiographical memories by subjects in induced emotional states (e.g., Madigan & Bollenbach, 1982; Natale & Hantas, 1982; Salovey & Singer, 1988, Exp. 2; Snyder & White, 1982; Teasdale & Fogarty, 1979), and naturally occurring states (e.g., D. M. Clark & Teasdale, 1982; Lloyd & Lishman, 1975). Generally, these studies find that happy and nondepressed subjects are more likely than sad or depressed subjects to recall happy personal experiences. The latter subjects retrieve memories with more depressive or negative content, and show increased latency to retrieval of positive memories. For example, Snyder and White (1982) used the Velten (1968) procedure, in which subjects read increasingly emotional self-referential statements, to induce happy and sad feelings in their subjects. Afterwards, subjects recalled personal experiences from the previous week.
Happy subjects tended to recall more “pleasant, happy, or positive” events, whereas sad subjects tended to recall more “unpleasant, sad, or negative” events. Frequently in these studies (as in the one just described), judgments of the pleasantness of the recalled memories have been made by trained judges who are blind to the emotional state of the subject. This is likely to be a conservative (at best) or a misleading (at worst) method of assessing emotion congruence, because it is not clear what the subjects were feeling at the time of the recalled experiences. In some cases, therefore, judgments of the emotional nature of the memories have been made by the subjects themselves, at a time when they are no longer experiencing the emotion induced prior to recall. Emotion-congruent retrieval has also been observed in these cases (e.g., Laird, Cuniff, Sheehan, Shulman, & Strumn, 1989).

* Bower (1987) noted that the state-dependent retrieval effect has been relatively small and unreliable, and speculated that a causal attribution of the emotional response to an event may be necessary for a strong association to be formed. Although this proposal itself is consistent with research on category formation, an alternative explanation may be simply that newly formed associations in the lab are weaker than those formed from repeated, consistent experience with emotion-related words and concepts outside of the lab.

Presumably in order to gain greater experimental control, some experimenters have examined recall of lists of words with positive and negative meanings (e.g., D. M. Clark & Teasdale, 1982, Exp. 1; Mathews & Bradley, 1983; Teasdale & Russell, 1983) or of stories with normatively evaluated positive and negative content (e.g., Bower, Gilligan, & Monteiro, 1981, Exp. 2 & 4). Importantly, this approach assumes that in daily experience, words with particular emotional meanings covary with the experience of specific emotional states in predictable ways: Other people label our emotional states, we describe our emotional states to others, and so forth (cf. Strauman, 1990). For example, Teasdale and Russell’s (1983) subjects learned a list of positive, negative, and neutral trait terms while in a neutral emotional state (i.e., without a mood induction). They then recalled the words after completing a modified version of Velten’s (1968) happy or sad emotion induction. After a break, the entire procedure was repeated, using a second word list and the alternative emotion induction. Emotional state interacted with word type, such that subjects recalled more sad words when sad than when happy, and more happy words when happy than when sad.

C. EMOTION-CONGRUENT LEARNING

In addition to retrieval effects, emotion network models also predict that emotion-congruent material should be better learned than emotion-incongruent material. As in the case of nonemotional categories, activating an appropriate category or schema facilitates encoding and mediates attentional focus (e.g., Gilligan & Bower, 1984).
Specifically, learning material that matches the learner’s state in emotional tone is more likely to be attended to, and can be more easily encoded as an instance of that emotional response category, making available as retrieval cues the wealth of information associated with that category.*

* Bower (1981) has offered an explanation of emotion-congruent learning related, but not identical, to the current category-based account. He suggested, essentially, that the activation of emotion-congruent material increases the elaboration of the to-be-learned material when this material is itself emotion-congruent. Bower also proposed that emotion-congruent retrieval could influence the intensity of the emotional experience, and in turn develop stronger associations to learning material. We prefer the more parsimonious account based on emotion categories: When a category is activated, category members have an encoding advantage.

Many researchers (e.g., MacLeod, Mathews, & Tata, 1986; Mathews & MacLeod, 1986) have reported biases in attention to emotion-congruent stimuli due to emotional states such as anxiety. Mathews and MacLeod (1986), for example, found that anxious subjects were distracted by threat-related words presented in an unattended auditory channel, although they were unaware that the words had been presented. The Stroop Color and Word Test has also been employed to demonstrate the attention-drawing power of stimuli related to a subject’s anxiety or phobia (e.g., Watts, McKenna, Sharrock, & Trezise, 1986). Studies have also found attentional and encoding biases due to lab-induced, specific emotions. For example, Forgas and Bower’s (1987) subjects, after receiving either a happy or sad emotion manipulation, spent more time reading statements congruent with their mood, in the context of an impression-formation task. Importantly, the differential reading time translated into an emotion-congruent bias in recall and judgment about the targets in the task. Baron (1987) also found judgment and recall biases for emotion-congruent information gleaned in a bogus job interview. In an explicit demonstration of emotion-congruent learning, Bower et al. (1981) used posthypnotic suggestion to induce happy or sad feelings in hypnotic-virtuoso subjects. During the emotional state, subjects read a story that contained a happy and a sad character, and that described incidents that occurred to the characters that justified their happy and sad feelings, respectively. Later, while in a neutral state, subjects recalled more about the character whose feelings and experiences were congruent with their emotional state during learning (see also Gilligan & Bower, 1984). In another study, Nasby and Yando (1982, Exp. 1) induced happy or sad emotional states in their 10-year-old subjects by instructing them to generate vivid imagery of times when they had felt those emotions. Subjects then learned a list of 24 adjectives, some positive and some negative in valence. Compared to subjects who learned the word lists in a neutral mood, subjects who learned the list while in a happy state later recalled more positive adjectives and fewer negative adjectives, and the reverse was generally true for sad subjects.
Emotion-congruence effects at encoding have also been observed using subjects in naturally occurring emotional states (Hasher, Rose, Zacks, Sanft, & Doren, 1985).

D. EMOTION-CONGRUENT PERCEPTION

One of the most interesting implications of our conceptualization of emotions as categories is its predictions regarding the influence of emotion on very early stages of processing. For example, emotional states are predicted to increase the ease and efficiency of perception of emotion-congruent stimuli. If emotional information spreads activation to codes for the perceptual features of words (consistent with top-down processing in early word-recognition models such as Morton’s, 1969, and more recent interactive activation models such as McClelland & Rumelhart’s, 1981), then less sensory evidence from emotion-congruent (compared to emotion-incongruent)
words should be required for the threshold for perceptual response to be reached (e.g., Gerrig & Bower, 1982). Emotion-congruent perception is particularly important when one considers that perception provides the input for higher level cognitive processes. We review evidence that both questions and supports this prediction in a later section.

E. THE STRUCTURE OF EMOTION CATEGORIES
A feature that distinguishes various network models of emotional memory is the proposed number and nature of the emotion categories assumed to be represented. Specifically, the level of abstraction at which emotion-related information is organized, and in turn the labels assigned to emotion categories, differ among the models. In Bower’s (1981) model, the biologically basic emotions are thought to be represented as categories in memory. Although there is some disagreement among emotion theorists regarding which emotions are basic (and indeed there are objections to the very idea of basic emotion; see Ortony & Turner, 1990), many theories of emotion consistently name at least the five emotions of happiness, sadness, anger, disgust, and fear as good candidates (e.g., Ekman, 1984; Izard, 1977; Johnson-Laird & Oatley, 1992; Tomkins, 1962, 1963; surprise, contempt, desire, and interest appear in some lists). These five emotions can be elicited by very simple visual, auditory, or tactile stimulation (or their withdrawal), and appear very early in development. Moreover, the emotions appear to be physiologically differentiated (Levenson, 1992; Levenson, Ekman, & Friesen, 1990), and are associated with universally recognized facial expressions (Ekman & Friesen, 1971; see Ekman, 1994, and Izard, 1994, for reviews; but see also Russell, 1994). Finally, recent research has demonstrated categorical perception of facial expressions of happiness, sadness, anger, disgust, and fear (Etcoff & Magee, 1992). According to Bower’s “categorical emotions” model, an anger unit, for example, should be activated exclusively by an angry emotional state, and not by sadness, fear, and so forth. Each central emotion unit should spread activation to information specifically related to the appropriate emotional state (see Niedenthal et al., 1994, for discussion). In contrast, M. S. Clark and Isen (1982) have been interested in positive and negative feeling states.
They proposed a network model of memory in which there are only two emotional response categories: one corresponding to positive feelings and one corresponding to negative feelings. This “valence” model of emotional experience and memory is based on factor-analytic studies of self-reports of emotional experience and of the apprehension of stimulus meaning (e.g., Osgood & Suci, 1955). Those studies reveal that the greatest variance in both subjective emotional experience and
semantic meaning is explained by the valence dimension (i.e., positivity). According to the Clark and Isen model, all positive emotions should activate the positive emotion concept, and all negative emotions should activate the negative emotion concept. These in turn should activate all related positive and negative ideas in memory, respectively. Still other researchers have been interested in emotional traits, such as depression, mania, and anxiety (e.g., Johnson & Magaro, 1987). Network models proposed in that literature argue for the existence of concepts for depression, anxiety, and elation (e.g., Ingram, 1984; Teasdale, 1983). For example, Ingram (1984) proposed that a specific “depression” concept organizes the representations of depressive behaviors and cognitions. Many of the same emotion-congruence predictions are made by these frameworks, and there is substantial evidence in their support (Gotlib & McCann, 1984). A more careful look at the pattern of results reported in the emotion and memory literature provides some reason to pursue the categorical emotions view of emotional memory. The fact is that emotion-congruent encoding and retrieval effects are robust laboratory phenomena only when positive emotions are considered. The effects are weaker and more inconsistent for the negative emotions. Very often an asymmetric influence of positive and negative emotions has been reported such that positive emotions enhance learning and retrieval of positive material, but negative emotions do not exert detectable effects on the processing of negative material (e.g., Clark & Waddell, 1983; Mischel, Ebbesen, & Zeiss, 1976; Nasby & Yando, 1982, Exp. 1). A possible explanation for this asymmetry is that positive and negative emotions are accompanied by different types of motivations (or “meta-moods”; Mayer & Gaschke, 1988) that differentially interfere with the proposed priming effects of emotion. 
In this view, people in happy moods are motivated to maintain their happy emotional state and so do nothing in particular to prevent the retrieval of congruent (happy) memories. However, people in sad moods are motivated to reduce their aversive emotional states, and so may engage controlled “mood repair” processes that counteract the priming of negative thoughts (M. S. Clark & Isen, 1982; Isen, 1984, 1987). We favor a different account of the asymmetry. Such an asymmetry might obtain if researchers assume a valence model of emotional memory when in fact emotional memory is organized according to the basic emotions. In most studies, the positive emotion that is induced is happiness; indeed, most theories of emotion do not include other positive basic emotions. For example, contrived experiences of success, emotional films, and emotional music seem to induce happiness; certainly manipulation checks indicate that this is the case. Similarly, the “congruent” emotional information used as the dependent measure is typically associated with happiness. Thus,
regardless of any underlying assumption about the structure of emotional response categories, in the case of positive emotions, the procedures in an experiment may actually provide, de facto, a categorical match between the subject’s emotional state and the target material. However, there are more negative emotions, and many of the most popular negative emotion-induction procedures, such as emotional movies and contrived failure experiences, probably induce more than one negative emotion. For instance, some subjects may respond to failure by feeling sad and others by feeling angry. Similarly, the “negative” stimulus information used in many studies is in fact related to several different emotional states. Thus, when researchers implicitly or explicitly work from a valence model of emotional response categories, it is likely that they induce several specific negative emotions, such as sadness and anger, and use as dependent measures “congruent” information that is related to a different basic negative emotion, or to several of them (see Niedenthal, Setterlund, & Jones, 1994). If emotion categories are organized according to the basic emotions, feelings of sadness, for example, will not necessarily facilitate the processing of information related to other negative emotions. Therefore, a failure to detect emotion-congruent encoding and retrieval for negative feeling states may be due to a mistaken assumption about the structure of emotional categories. Recent experiments on emotion and retrieval from long-term memory by Laird and colleagues provide support for the notion that basic emotions serve as the primary emotional response categories (Laird, Wagener, Halal, & Szegda, 1982; Laird et al., 1989). In one experiment, subjects heard spoken sentences that, in content and vocal affect, reflected each of three negative emotions (anger, sadness, and fear; Laird et al., 1982, Study 2).
The sentences were divided into three sets that contained equal numbers of sentences of each type. Before they listened to a given set of sentences, subjects were instructed to contract certain muscles so that they adopted the facial expression of one of the three negative emotions (there was no mention made of the name of the facial expression). This technique has been shown to produce the corresponding feeling in the expressor (e.g., Duclos et al., 1989; Laird, 1984; McCaul, Holmes, & Solomon, 1982). Immediately after hearing each set of sentences, subjects recalled the sentences while maintaining the expression adopted during learning. Although it is not clear whether the effect was due to selective encoding or retrieval, a categorical emotion-congruence effect was observed: subjects showed better recall for the type of emotional sentence that matched their emotional state during learning and recall. To summarize, the experimental studies of emotion and memory, and the models generated to account for such effects, suggest the following
tentative framework for conceptualizing emotional response categories: Some emotions, defined in terms of their physiological and expressive response properties, are represented as innate concepts in memory. During development, representations of events, beliefs, and verbal descriptions that are related to the emotional responses become associated with these concepts, forming emotional response categories. The experience of a particular emotion primes conceptual material related to that emotion. This process mediates the retrieval of old information and the perception and learning of new information. Although we have discussed the processing of emotional response categories in terms of semantic network or spreading-activation models of memory, other process models account equally well for the experimental findings of emotion-congruent retrieval, learning, and perception.
IV. Acquisition of Emotional Response Categories

The emotion and memory research just reviewed suggests that events that share no obvious basis of similarity other than the emotional reaction that they have elicited are associated in memory. The literature provides an account of the influences of emotion on memory; from it we have tried to draw some inferences about the organization of cognitive material during emotional states and about the possible structure of emotional response categories. If the contention that emotional responses provide conceptual coherence is correct, then we also need to demonstrate that things that elicit similar emotional responses are treated as equivalent, or as “the same type of thing.” That is, we need evidence for emotional response categorization. In conceptualizing this problem, we have assumed that within existing categories, people possess subordinate concepts that group together members of these categories according to emotional response equivalence (Niedenthal, 1992). Bruner et al. (1956) illustrated this idea nicely with the following example:

It is interesting that the gifted mathematicians often speak of certain formal categories in terms that are also affective in nature. G. H. Hardy in his delightful “apology” (1940) speaks of the class of things known as “elegant solutions” and while these may have formal properties they are also marked by the fact that they evoke a common affective response. (p. 6)
Similarly, people may possess (probably highly idiosyncratic) categories such as “people who make me feel sad,” “foods that disgust me,” and
38
Paula M. Niedenthal and Jamin B. Halberstadt
“landscapes that inspire.” (Although we label these categories, we do not mean to suggest that individuals always possess verbal labels for them.) In these examples, the instances of the response categories share another category membership (people, food, and landscapes) but are grouped together within those categories on the basis of their tendency to evoke a common emotional response. We looked for evidence that emotional responses serve as a basis for categorization in a series of two category learning experiments (Niedenthal, 1990, 1992). In both studies, subjects first engaged in an acquisition or training stage, during which an emotional response was essentially conditioned on the perception of members of a target category. Type of conditioned emotional response varied between subjects. In a subsequent test stage subjects saw examples of the target category and also new stimuli (distractors) and had to classify the old and new stimuli as such. During testing, targets and distractors sometimes evoked the emotional response associated with targets during learning, and sometimes evoked a different emotional response. We tested the hypothesis that during testing subjects would classify target stimuli more quickly when the targets evoked the emotional response on which they were originally conditioned. It was also possible that subjects would classify distractors more slowly on trials on which their emotional response to the distractor was the same as their conditioned emotional response to the targets. We attempted to condition emotional responses during acquisition by preceding presentations of target stimuli with pictures of faces that expressed a particular emotion (either a happy expression or a disgusted expression). The expressive faces were exposed for a very brief duration (5 ms) and the target stimuli served to mask them such that the faces were not consciously perceptible (detection data were also collected; see Niedenthal, 1990, 1992, for details).
During testing we could create emotion-congruent and emotion-incongruent trials by pairing targets and distractors with faces that expressed the same type of emotion that had been paired with targets during acquisition and faces that expressed a different emotion, respectively. We expected the masked slides of emotional faces to elicit responses that would be associated with the stimuli because the time parameters did not allow for separate processing of the two stimuli. There is also a set of research findings that, taken together, suggest that the subliminal presentation of emotional faces actually elicits emotional responses per se (Niedenthal & Showers, 1991): First, past work has demonstrated that people are extremely efficient at processing human faces (Homa, Haver, & Schwartz, 1976; van Santen & Jonides, 1978) and are particularly skilled at processing expressions of emotion (Hansen & Hansen, 1988, 1994). Moreover, people
Emotional Response Categories
39
respond to a facial expression of emotion by experiencing a similar emotion (Dimberg, 1982). This may be an innate reaction: facial expressions of emotion may be signal stimuli. Second, recent evidence suggests that the subcortical circuits that process emotion can respond very quickly, and respond prior to high-level cortical processes. The former mechanisms appear to modulate the latter processes through multiple ascending projections (LeDoux, 1987; Saper, 1987). Thus, in real time, the emotional processing of a stimulus can precede and in fact influence more complex cognitive responses (see Derryberry & Tucker, 1994, and Steinmetz, 1994, for reviews). Based on these prior research findings (e.g., Zajonc, 1980), we expected masked slides of expressive human faces to elicit distinctive affective reactions in our subjects (and the second experiment provides some converging evidence for this idea). In the first experiment the targets and distractors were cartoon stimuli. Examples of the targets appear in Fig. 1A and examples of the distractors appear in Fig. 1B.
Fig. 1. Target (A) and distractor (B) stimuli from Niedenthal (1990).
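The acquisition and test design just described can be sketched as a trial-construction routine. This is an illustrative reconstruction, not the authors' actual software; the stimulus identifiers are hypothetical placeholders, and only the masked face prime and the congruent/incongruent pairing logic come from the text.

```python
# Sketch of the test-phase trial structure: each target or distractor
# is preceded by a masked emotional face prime. A trial counts as
# emotion-congruent when the prime expresses the same emotion that was
# paired with targets during acquisition. Stimulus names are
# hypothetical placeholders.
import itertools

def build_test_trials(stimuli, conditioned_emotion,
                      prime_emotions=("happy", "disgust")):
    trials = []
    for (kind, stim), prime in itertools.product(stimuli, prime_emotions):
        trials.append({
            "stimulus": stim,
            "type": kind,                     # "target" or "distractor"
            "prime": prime,                   # briefly exposed, masked face
            "congruent": prime == conditioned_emotion,
        })
    return trials

stimuli = [("target", "cartoon_T1"), ("distractor", "cartoon_D1")]
trials = build_test_trials(stimuli, conditioned_emotion="happy")

# Each stimulus appears once with a congruent and once with an
# incongruent prime.
print(sum(t["congruent"] for t in trials))
```

Classification latencies on congruent versus incongruent trials can then be compared within each stimulus type, which is the contrast reported in Fig. 2.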
The response latency data for the experiment are illustrated in Fig. 2. The predicted effect of emotional response was observed for the accurate classification of targets: Subjects in both happy and disgust conditions classified targets more quickly when face primes evoked an emotional response similar to the response to targets during acquisition than when face primes evoked an incongruent emotional response. For accurate distractor responses, the pattern of results was somewhat more complicated. Subjects in the disgust condition classified distractors faster on incongruent trials (i.e., those on which the response not previously associated with the target was evoked) compared to congruent trials. This pattern is consistent with our predictions. However, subjects in the happy condition, who initially associated the targets with the response evoked by smiling faces, tended to categorize distractors faster on congruent, compared to incongruent, trials. Thus, people in the happy condition classified both targets and
Fig. 2. Response latency (in ms) to classification of targets (A) and distractors (B) on affect-congruent (black bar) and affect-incongruent (white bar) trials. (From Niedenthal, 1990, Study 1.)
distractors more quickly when these stimuli were preceded by congruent face primes. These results provided some evidence for the notion that people associate their emotional response with the perceptual features of stimuli and use the response as a basis for grouping stimuli together. However, a possible objection to this interpretation is that aspects of the subliminal faces other than subjects’ emotional responses could have caused the observed effects. For example, the structural properties of one type of facial expression (e.g., the shape of a smile) might have been encoded in subjects’ representation of the targets during training. It might then have been the congruence of mouth shape rather than affect elicited that produced facilitation on congruent compared to incongruent trials during testing. Put differently, it may have been features of the faces rather than the emotional responses they elicited that provided a cue to category membership. It might therefore be erroneous to conclude from the first experiment that emotional responses serve to provide a basis for categorization. With this in mind, we designed a second experiment in which the structural features of the emotion-eliciting stimuli varied substantially across acquisition and testing phases. The target stimuli used in this experiment were female faces that expressed neutral emotion. There were four different acquisition conditions. For some subjects the targets were paired with face primes that expressed happiness or disgust during training. These were the same conditions of category acquisition as those that comprised the first experiment. However, for other subjects targets were preceded by pictures of scenes that pretesting had shown to elicit happiness or disgust reactions in subjects (with supraliminal exposures). During testing, stimuli were always paired with faces that evoked a congruent or an incongruent emotional response. 
Thus, there were two scene-face conditions (one disgust, one happy) in which there was little or no physical similarity between the critical slides used in training and testing. The two face-face conditions (one disgust, one happy) served as a replication of the first experiment. A further advantage of this experiment over Experiment 1, beyond providing a more careful test of the role of emotional response in categorization per se, is that the members of the categories acquired during training shared no obvious set of features that distinguished them from the distractors. The sets of targets and distractors were randomly constructed from a large set of female faces of the same race and approximately the same age. For both disgust conditions (scene-face and face-face), as predicted, target responses were faster on emotion-congruent trials. Distractor responses were also slower on emotion-congruent trials. This result provides some evidence that emotional response rather than structural features of the
faces accounted for the pattern of results observed in the first experiment. Results of the happy conditions did not provide such clear support. Subjects in those conditions did not make target responses faster on emotion-congruent trials. However, subjects in both happy conditions did make distractor responses more slowly on emotion-congruent compared to emotion-incongruent trials. It is not obvious why subjects who had associated happy responses with targets failed to categorize targets faster on emotion-congruent, compared to emotion-incongruent, trials. Recall that subjects in the happy response condition of the first experiment classified distractors faster on emotion-congruent trials and that this was also an unexpected finding. One possibility is that happiness simply does not serve as an emotional response category. However, as already discussed, there are numerous studies that demonstrate that happiness serves as a retrieval cue to happy memories. An alternative
Fig. 3. Response latency (in ms) to classification of targets (A) and distractors (B) on affect-congruent (black bar) and affect-incongruent (white bar) trials. Scene versus face refers to the type of affect conditioning subjects received. (From Niedenthal, 1992.)
explanation for the weakness of the evidence in favor of a happy emotional response category in the studies just described is that the stimuli used in both studies (cartoons and female faces) were positive stimuli even without the subliminal face primes. Thus, the emotional response elicited by the subliminal faces might not have provided much in the way of diagnostic evidence of category membership. Although the results of the two experiments just presented do not conform perfectly to our predictions, they are highly suggestive, and they fit with a number of other findings to form a compelling body of evidence that people can learn to equate groups of stimuli on the basis of emotional response. Kostandov and his colleagues (e.g., Kostandov, 1985; Kostandov & Arzumanov, 1977), for example, found evidence of emotional response categorization in event-related brain potentials (ERPs). In one study (Kostandov, 1985), subjects were exposed to pictures of arrows drawn in two possible spatial orientations; one was drawn at 30° and the other at 35° from a vertical origin. During an acquisition phase, arrows of one orientation were paired with subliminal presentations of neutral words, and the other type of arrow was paired with emotional words that were related to the subjects’ previously committed crimes of jealousy. (These studies were conducted in the former Soviet Union.) A backward-conditioning procedure was employed; that is, the subliminal words succeeded presentations of the arrows. Although subjects could not consciously distinguish the two kinds of arrows, P300 responses showed that subjects came to reliably discriminate the two types of stimuli, presumably on the basis of the acquired emotional meanings. And the effect persisted even when the subliminal words were no longer paired with the arrows.
Kostandov’s work is conceptually similar to prior studies that reported the acquisition of emotional response categories through the conditioning of emotional responses on previously neutral and otherwise unrelated sets of stimuli (e.g., Corteen & Wood, 1972; Forster & Govier, 1978). ERPs have also been shown to discriminate emotionally meaningful information in recent research by Shevrin and his colleagues (Shevrin et al., 1992).

A. CONSTRAINTS ON EMOTIONAL RESPONSE CATEGORIES
We have been arguing that emotional responses provide one basis of conceptual coherence, and that the set of emotional response categories corresponds to the basic, or biologically prewired, emotions. If we are right, then there should be constraints on the possible “members” of emotion categories. For example, although associations between emotional state and novel (previously emotionally neutral) stimuli can be learned, as the research just reviewed seems to suggest, a stimulus event for which individuals are biologically prepared to have a particular emotional response should not, or not easily, be associated with a different emotional response category. Some evidence in support of such constraints is provided by research by Öhman and Dimberg (1978; Dimberg & Öhman, 1983). In one study (Öhman & Dimberg, 1978, Exp. 2), subjects’ electrodermal responses were conditioned to pictures of faces that expressed anger, happiness, or neutral emotion. Electric shock was the unconditioned stimulus. Electrodermal responses to the conditioned angry faces showed resistance to extinction, whereas responses to the happy and neutral faces extinguished very rapidly. This result and the results of similar studies by other researchers (e.g., Lanzetta & Orr, 1980; Orr & Lanzetta, 1980) have been interpreted as meaning that individuals are “prepared” to learn certain types of associations (Seligman, 1970). According to the preparedness view, stimulus events differ in the emotional reactions with which they can be associated, and the prepared connections are those that have adaptive value. A learned association between a happy face and the aversive experience of shock has no advantage; in fact such an association has a clear disadvantage. Many well-known animal studies suggest the same conclusion (Cook & Mineka, 1989, 1990; see Mineka, 1992, for a review). In sum, it appears that although associations between emotional response and novel stimuli can be learned, there are important, adaptive limits to this learning. The “members” of emotional response categories probably reflect covariations among naturally occurring events and cannot be completely idiosyncratic.
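The differential extinction result can be pictured with a standard Rescorla-Wagner learning rule in which the "prepared" association is simply given a smaller extinction rate. This is a toy illustration, not a model fitted to the Öhman and Dimberg data; all rates and trial counts are assumed values.

```python
# Illustrative Rescorla-Wagner simulation of the preparedness finding:
# both an angry and a happy face are conditioned with shock, then
# extinguished. Giving the "prepared" angry-face association a smaller
# extinction rate reproduces its resistance to extinction. All
# parameter values are assumptions chosen for illustration.

def condition_then_extinguish(acq_rate, ext_rate, acq_trials=10, ext_trials=10):
    v = 0.0  # associative strength (expected shock, 0..1)
    for _ in range(acq_trials):
        v += acq_rate * (1.0 - v)   # shock present: asymptote = 1
    for _ in range(ext_trials):
        v += ext_rate * (0.0 - v)   # shock absent: asymptote = 0
    return v

angry = condition_then_extinguish(acq_rate=0.3, ext_rate=0.05)  # prepared
happy = condition_then_extinguish(acq_rate=0.3, ext_rate=0.30)  # unprepared

# The prepared association retains most of its strength after
# extinction; the unprepared one has largely decayed.
print(angry > happy)
```

The point of the sketch is only that a single preparedness parameter (here, the extinction rate) suffices to produce resistant versus rapidly extinguishing responses from the same learning rule.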
V. Influence of Emotional Response Categories in Perception

Within the study of categories and concepts, the provocative idea that category learning influences perceptual sensitivity and perceptual discrimination remains controversial. Still, the hypothesis has received some sound empirical support (e.g., Goldstone, 1994b; von Hippel et al., 1994). To us it seems less provocative to propose that the emotional response categories we have been describing mediate perception. This is because emotional categorization may take place at the subcortical level (LeDoux, 1993), and the brain substrates involved in emotion processing can directly influence those involved in perceptual processing (Derryberry & Tucker, 1991, 1994). Our daily language also reflects the belief that emotions mediate perception. We say that people who are happy “see the world through rose-colored glasses.” When people are sad they “see only the dark side” of things, whereas when people are angry they “see red.”
Until recently, this interesting prediction, which derives from some interpretations of associative network models, had received little empirical support. For example, Gerrig and Bower’s (1982) highly hypnotizable subjects were induced with hypnotic suggestion to feel happiness and anger. During each feeling state, subjects were presented with briefly exposed target words that were positive, negative, and neutral in meaning. On each trial subjects had to recognize the target word from a pair of words. Contrary to predictions, subjects were not more accurate at identifying words that were congruent with their current emotional state. In a second experiment, Gerrig and Bower measured happy and angry subjects’ recognition thresholds for happy, angry, and neutral words. Again, contrary to the prediction, happy and angry feelings were not associated with lower recognition thresholds for emotion-congruent words relative to emotion-incongruent or neutral terms. D. M. Clark, Teasdale, Broadbent, and Martin (1983) conducted a test of emotion-congruent perception using a lexical decision task. Subjects in their study listened to happy or sad emotionally evocative music. They then performed the lexical decision task in which the critical words were positive, negative, and neutral personality attributes. Happy subjects did not make lexical decisions about positive words faster than sad subjects, and sad subjects did not respond to negative words faster than happy subjects, as was expected. Sad subjects merely made slower lexical decisions about all types of words, compared to happy subjects. Similar null or weak findings were observed in studies by MacLeod, Tata, and Mathews (1987) and Powell and Hemsley (1984), who used depressed and nondepressed subjects; Bower (1987) reported some unpublished failures to observe emotion congruence in perception.
In a review of these null findings, Niedenthal, Setterlund, and Jones (1994) concluded that most of the previous experimental tests of the influences of emotion in perception were guided by a valence model of emotional response categories. For the most part, experimenters induced categorical emotions of happiness, sadness, or anger, but then exposed subjects to stimuli that were related to those emotions only by valence. The possibility that emotions influence perceptual processes thus remained open, but it required a test based on a categorical-emotions approach.

A. LEXICAL DECISION EXPERIMENTS

Our interpretation of the previous failed attempts to observe emotion congruence in perception inspired us to conduct two experiments (a main study and an exact replication) to compare directly the emotion-congruence predictions of a valence model and a categorical-emotions model (Niedenthal & Setterlund, 1994). In the studies, emotional states of happiness and
sadness were induced in subjects by means of emotional classical music that had been demonstrated to evoke specific feelings (and not diffuse moods) in student subjects (see Niedenthal & Setterlund, 1994, for empirical support for this claim). After an initial induction period during which they heard 10-12 min of the appropriate music delivered over stereo headphones, subjects filled out a manipulation check (a mood scale), and then performed the lexical decision task. The emotional music continued throughout the experiment in order to facilitate the maintenance of the emotional states. The words that appeared in the lexical decision task had happy (e.g., delight, joy), sad (e.g., despair, regret), positive (but happiness-unrelated; e.g., wisdom, grace), negative (but sadness-unrelated; e.g., blame, decay), and neutral (e.g., habit, cluster) meanings. There were six words of each type. A single letter in each of 48 neutral English words was changed to create pronounceable nonwords. Given these stimuli, the valence model predicts that people experiencing happiness will make lexical decisions about happy and positive words more quickly than sad subjects, and that sad subjects will make lexical decisions about sad and negative words more quickly than happy subjects. The categorical model predicts a similar effect, but only for happy and sad words; happy and sad subjects would not be expected to differ detectably in the speed of their lexical decisions about positive and negative words. The results of both experiments supported the categorical model: happy subjects made lexical decisions about happy words faster than did sad subjects, and sad subjects made lexical decisions about sad words faster than did happy subjects. However, emotional state did not mediate lexical decisions about positive and negative words. Happy subjects made lexical decisions about both positive and negative items faster than sad subjects did.
Mean response latencies for the two studies taken separately are graphically illustrated in Fig. 4. We also combined the data from the two experiments and performed regression analyses that collapsed over emotion-induction condition. In the analyses, subjects’ self-report ratings of their feelings of happiness and sadness were used as predictors of lexical decision speed for the different types of words. The regression analyses revealed that subjects’ happiness was negatively correlated with the speed of their lexical decisions about happy words; that is, happiness facilitated lexical decisions about happy words for all subjects. In addition, subjects’ sadness was negatively correlated with the speed of their lexical decisions about sad words. Self-reported happiness and sadness did not reliably, or even suggestively, mediate responses to positive and negative words.

Fig. 4. Lexical decision response latency (in ms) by word category for happy (black bar) and sad (white bar) subjects in two studies. (From Niedenthal & Setterlund, 1994.)

Musical selections used to induce happy feelings included allegros from Eine Kleine Nachtmusik, Divertimento #136, and Ein Musikalischer Spass, all by Mozart, and from the Concerto for Harpsichord and Strings in C Major by Vivaldi. Musical selections used to induce sad feelings included Adagio for Strings by Barber, the Adagietto by Mahler, and the adagio from Piano Concerto No. 2 in C Minor by Rachmaninov. None of the selections was repeated during the experimental session. No mention was made of subjects trying to attain a particular emotional state or arousal level.

We followed up these effects with two more lexical decision experiments in which the word stimuli came from four coherent emotion categories: happiness, love, sadness, and anger (Niedenthal, Halberstadt, & Setterlund, 1994, Exp. 1). We also altered the design of these experiments somewhat in order to increase sensitivity to changes in speed of lexical decision as a function of emotion congruence, and to learn more about the perceptual processing of emotional words. Specifically, an affectively neutral control word was found to match every emotion word from all four categories on frequency, length, and first letter. Emotion and control words were also matched on concreteness. Example stimulus words and their matched controls appear in Table I. With the inclusion of control words in this way we
TABLE I
EMOTION WORDS AND MATCHED CONTROLS

Happy words: cheer [codes], pleasure [platform], delight [depends], joy [sum]
Love words: affection [afterward], caring [census], desire [detail], passion [plastic]
Sad words: defeat [device], despair [degrees], sorrow [sector], gloom [globe]
Anger words: anger [agent], rage [rent], dislike [diagram], fury [folk]
could test the null hypothesis that happy and sad subjects make lexical decisions with equal speed about words from four emotion categories compared to matched control items. The third experiment was very similar to the first two reported above: Subjects heard emotional classical music for a 10-12-min induction period, completed a mood scale, and then performed the lexical decision task. Results of the third experiment are illustrated in Fig. 5. Average differences in lexical decision to emotion terms and the matched controls are
""
40
I
-1 0 -20
-"
-
I
-50 I
happy
sad love word category
anger
1
Fig. 5. Mean facilitation for lexical decision latencies by word category for happy (black bar) and sad (white bar) subjects. (From Niedenthal, Halberstadt, & Setterlund, 1994, Study 1.)
presented separately for the four word categories for happy and sad subjects (positive values indicate facilitation). As can be seen, all subjects made lexical decisions about both happy and love words more quickly than about matched control words, and emotional state did not mediate these relationships. Thus, in the case of happy words we did not replicate the findings of the first two studies. However, the situation with negative words conformed more closely to the predictions. Sadness facilitated lexical decision about sad words, whereas happiness slowed lexical decision about sad words, compared to matched controls. In addition, all subjects made lexical decisions about anger words more slowly than about matched control words, and this effect was not mediated by emotional state. Consistent with the findings of the first two lexical decision experiments, in absolute decision time, happy subjects made lexical decisions about happy words more quickly than sad subjects (Ms = 537 and 551 ms, respectively), and sad subjects made lexical decisions about sad words more quickly than happy subjects (Ms = 569 and 583 ms, respectively), though the interaction was statistically marginal. In the fourth experiment, we added a neutral emotion control condition, in which subjects heard no music at all (Niedenthal et al., 1994, Exp. 2). In addition, we moved the manipulation check to the end of the experimental session so that subjects would not be exposed to emotional words (i.e., items on the mood scale) that could differentially affect their perception of the words in the lexical decision task. Finally, we increased the number of words in each category to eight, and also increased the number of subjects run in the emotion conditions, in order to enhance the power of the experiment. The primary results of the fourth experiment are reported in Fig. 6. As can be seen, greater support for the emotion-congruence prediction was obtained in this experiment.
Happy subjects made lexical decisions about happy words more quickly than about matched controls, and this facilitation was significantly greater than in the neutral mood and sad conditions. Sad subjects responded to sad words more quickly than to matched control words, whereas neutral mood and happy subjects, if anything, made lexical decisions about sad words more slowly than about matched control words. In addition, all subjects responded to love words faster than to matched controls, and all subjects responded to anger words marginally slower than to matched controls. Absolute response latencies to the different categories of words were also analyzed, and again happy subjects responded to happy words more quickly than sad subjects (Ms = 545 and 567 ms, respectively), and sad subjects made lexical decisions about sad words more quickly than the happy subjects (Ms = 614 and 637 ms, respectively); this interaction was statistically reliable.
Fig. 6. Mean facilitation for lexical decision latencies by word category for happy (black bar) and sad (white bar) subjects and controls (hatched bar). (From Niedenthal, Halberstadt, & Setterlund, 1994, Study 2.)
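The facilitation scores plotted in Figs. 5 and 6 are differences between the mean latency to matched control words and the mean latency to emotion words, so positive values indicate faster responses to the emotion words. A minimal sketch of that computation, using fabricated latencies for illustration:

```python
# Facilitation score: mean RT to matched control words minus mean RT
# to emotion words; positive values mean the emotion words were
# responded to faster. The latencies below are fabricated for
# illustration only.
from statistics import mean

def facilitation(emotion_rts, control_rts):
    """Return mean control RT minus mean emotion-word RT (ms)."""
    return mean(control_rts) - mean(emotion_rts)

# Hypothetical happy-subject latencies (ms) for happy words and their
# frequency/length/first-letter-matched controls.
happy_word_rts = [540, 535, 550, 545]
control_rts = [575, 580, 570, 565]

print(facilitation(happy_word_rts, control_rts))  # positive => facilitation
```

Because each emotion word has its own matched control, the score removes word-frequency and word-length differences from the congruence comparison.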
As a package, the four lexical decision experiments provide quite compelling support for the hypothesis that emotion mediates perception. Furthermore, it appears that this mediation conforms to a categorical-emotions model of emotion categories rather than a valence model.
B. WORD PRONUNCIATION EXPERIMENT
It is not necessarily the case that the lexical decision task is a good measure of word encoding. The selection of that task was primarily motivated by the fact that many of the failures to detect emotion-congruent perception employed lexical decision methodology, and we wanted to respond directly to those studies. Many researchers agree, however, that in addition to encoding processes, lexical decision is sensitive to postaccess processes (e.g., Lorch, Balota, & Stamm, 1986; West & Stanovich, 1982). In the canonical lexical decision experiment, on a given trial a target word is preceded by a priming word. If the prime and target are semantically related, lexical decision about the target is facilitated compared to trials on which there is no semantic relationship between the prime and the target. By some accounts, such facilitation is explained by a postaccess process in which subjects check for a semantic association between the target and the prime (e.g., de Groot, 1983). Detection of a semantic relationship could facilitate a lexical decision because in order for a semantic relationship to exist, both items must be meaningful words; a semantic relationship thereby provides information about whether a target is a word or not. In the context of the present paradigm, one could argue that subjects checked for the presence
of a relationship between the meaning of the target word and the nature of their emotional state to inform the lexical decision. That is, given our methodology, there is also the possibility that the lexical decision task measured encoding plus some postaccess decision process. Sensitivity to this type of postaccess strategy can be more or less eliminated by the use of a word pronunciation task in which each item requires a novel response (pronunciation of that word). The word pronunciation task is held to be a more sensitive measure of stimulus encoding (e.g., Seidenberg, Waters, Sanders, & Langer, 1984). Because we ultimately wanted to ask whether emotions influence how emotional people actually see their world, we conducted an experiment in which we employed a word pronunciation methodology. In the experiment, subjects who were feeling happy, sad, or neutral emotions were exposed to the same words (emotional and matched controls) as those used in the fourth lexical decision study, and also to 25 filler words. Under speed-accuracy instructions, subjects said each word aloud as it was presented (once, and in a different random order for each subject) on the computer screen. Unlike the lexical decision experiments, the emotional music used to induce emotional states did not continue as background during the word pronunciation task because pilot testing indicated that the music interfered with word naming. This is important because analysis of the manipulation check (which was administered at the end of the experiment) showed that by the end of the experiment subjects in the various conditions were not experiencing very different emotional states (as they were in the previous lexical decision experiments). It appears that the manipulation of emotion wore off during the experiment. Nevertheless, the results of the experiment, illustrated in Fig. 7, are very suggestive.
First, happy subjects showed significantly greater facilitation in naming happy words than neutral mood and sad subjects. Sad subjects also showed somewhat greater facilitation in naming sad words than happy or control subjects, but this effect did not reach significance. All subjects named words related to love and anger more quickly than matched control words, and emotional state did not mediate the effect. (Results remained the same when the data from those subjects whose emotional states were unaffected by the musical inductions of happiness and sadness were removed from the data set.) The analysis of the absolute word-naming speed revealed that happy subjects named happy words faster than did sad subjects (Ms = 447 and 457 ms, respectively), and sad subjects named sad words faster than did happy subjects (Ms = 456 and 462 ms, respectively). The results of the word pronunciation study thus provide some suggestive evidence that emotional states mediate perceptual encoding.
Fig. 7. Mean facilitation for word-naming latencies by word category for happy (black bar) and sad (white bar) subjects and controls (hatched bar). (From Niedenthal, Halberstadt, & Setterlund, 1994, Study 3.)
We have not yet discussed the observed word connotation main effects that are quite obvious in the figures. Regardless of their current emotional state, perceivers tended to make lexical decisions about positive words faster than about neutral controls, and to make lexical decisions about negative words slower than about neutral controls. In the word pronunciation experiment, subjects named all affective words faster than they named neutral control words; this tendency was not significant for sad words, however. It is unclear why the word connotation effects occurred and why they differed as a function of the experimental task. One admittedly speculative explanation is that lexical decision responses are more complex than naming responses, and reflect the operation of both encoding and postaccess processes. Lexical decision requires both stimulus encoding and a decision having to do with successful or failed lexical access. The response time may therefore be sensitive to several processes that have been suggested to be engaged by affective information. The first is automatic processing of affectivity, which has been shown to facilitate processes operating on the stimuli because they potentially elicit arousal (Kitayama, 1990). The second is the allocation of attention to negative information, which has been observed in several tasks, including the Stroop Color and Word Test (e.g., Hansen & Hansen, 1994; Pratto & John, 1991). In lexical decisions, responses to positive words may be particularly fast because the affective nature of the lexical item facilitates its processing, and positive information does not demand additional attention. Any tendency for lexical decisions to be facilitated by the affectivity of negative information may be eliminated, and in
fact reversed, by the fact that negative information is important enough to demand sustained attention, and to interfere with other attended tasks, such as lexical decision, that do not require a judgment about stimulus valence. Naming responses are less sensitive to postaccess decision processes, and most researchers contend that such responses are somewhat purer indicators of stimulus encoding. The lower level naming response may therefore reflect only the first component of affective processing, that is, the amplification by automatic detection of affectivity or stimulus significance. If it is not sensitive to the later process by which negative information captures attention, the naming task may not reveal differences in response time due to the valence of the stimulus. Clearly, this interpretation is tentative, and further work using these tasks is required to test the present contentions. However, it does not seem too far-fetched to argue that the tasks measure fewer (pronunciation) or more (lexical decision) processes that operate on affective information.

C. RESOLUTION OF LEXICAL AMBIGUITY

A final experiment in this line of word-processing research examined somewhat higher level processes and was conducted using auditorily presented homophones (Halberstadt, Niedenthal, & Kushner, in press). The purpose of the experiment was to explore the possibility that emotional state influences the resolution of lexical ambiguity, which is a special case of word recognition. If emotions prime specific word meanings in memory, they should serve to resolve lexical ambiguity such that the meaning of a homophone that is congruent with the emotional state of the individual is more likely than its other meanings to come to mind. So, for instance, sad subjects should be more likely to think of the sad rather than the neutral meanings of the words down or blue. Recent studies using clinically anxious subjects have found results that support this prediction.
Eysenck, Mogg, May, Richards, and Mathews (1991, Study 1) asked clinically anxious, recovered-anxious, and nonanxious subjects to interpret ambiguous sentences such as, “The doctor examined little Emma’s growth.” Anxious subjects were more likely than the other subjects to interpret the sentences in a threatening way, that is, by interpreting the word growth to mean a tumor rather than change in height. In a similar study, relative to nonanxious subjects, anxious subjects generated threatening spellings for auditorily presented homophones that had threatening and nonthreatening meanings (e.g., die and dye, slay and sleigh; Mathews, Richards, & Eysenck, 1989). Richards, Reynolds, and French (1993) exposed high and low trait-anxious subjects to positive or negative mood inductions by showing them pleasant or unpleasant pictures. Subjects then heard and
wrote down a list of words, some of which were homophones with threatening and nonthreatening meanings. Results showed that both trait anxiety and manipulated mood influenced subjects' resolution of lexical ambiguity: anxious subjects, and those in negative moods, were more likely to spell the threatening meanings of the homophones. We conducted an experiment that employed a methodology similar to that used by Mathews et al. (1989), but one that corresponded more closely to our interest in emotion categorization than in persisting traits such as anxiety (Halberstadt et al., in press). In the study, happy and sad subjects, whose emotional states were induced with classical music, listened to a list of words and spelled the items in sequence. Some of the words were homophones that had both emotion-related (happy or sad) and neutral meanings. The words were presented every 3 s, which constrained subjects' ability to choose among multiple meanings. Subjects heard 126 words in all, 19 of which were the critical homophones. After performing the spelling task, subjects were instructed to go back and write down definitions next to the critical homophones so that we could be sure that the spellings they generated referred to the intended word meanings. We predicted that happy subjects would spell out more happy meanings of happy homophones (e.g., presents vs. presence) than sad subjects, and that sad subjects would spell out more sad meanings of sad homophones (e.g., mourning vs. morning) than happy subjects. Results revealed that, compared to happy subjects, sad subjects wrote down significantly more affective meanings of sad homophones (Ms = .494 for sad and .403 for happy subjects; the scores reflect proportions of affective meanings generated). The two groups did not differ in the proportion of happy meanings written for happy homophones (Ms = .546 for sad and .547 for happy subjects).
We are not sure why the influence of sad and happy emotions appeared to be asymmetrical in this experiment. One possibility is that happiness and sadness have other effects on language processing in addition to the proposed influences on lexical access; specifically, emotional states may also influence concept meanings themselves. In a study by Isen, Niedenthal, and Cantor (1992), happy subjects rated rather poor examples of positive social categories as significantly better instances of those categories than did control subjects. For instance, compared to control subjects, happy subjects rated bartender as a somewhat better example of the category of nurturant people. This effect (greater category inclusiveness) did not extend to poor examples of negative social categories. Thus happiness seemed to be associated with a broadening of (positively valenced) category boundaries, which might have been due to a heightened focus on the positively evaluated features of the concepts being compared.
In light of this past work, it is possible that happy subjects in the homophone experiment tended to see the neutral meanings of the happy homophones as more positive than usual. For example, happy subjects might have accessed dear and deer equally often, rather than preferentially accessing dear, because both concepts (temporarily) had positive meanings. If this were the case, then happy subjects would not show a greater tendency than sad subjects to access "happy" meanings of homophones, because for them both meanings would have been relatively happy.
VI. Representing Emotional Response Categories: Beyond Semantic Networks

The semantic network models of emotional memory discussed here have guided much of the work presented in this chapter. The models are largely extensions of the semantic network models used to represent conceptual knowledge. An objection to this framework for representing emotion categories is that it fails to account for important distinctions among different kinds of emotion representations. Specifically, representing the phenomenal quality of emotions themselves with the same representational code (i.e., the proposition) as the conceptual knowledge describing the emotions, as well as the memories of episodes related to the emotions, implies that all of the representations operate according to the same processing principles. Clearly, this simple representational structure is inadequate to account, for example, for the observation that one can discuss being in an emotional state without actually returning to that state. In order for a model to capture the interactions between (probably implicit) emotional response categories and conceptual knowledge about the emotions, therefore, it will have to allow for various types of representational code, and to specify how the different representations interact. Niedenthal, Setterlund, and Jones (1994) proposed a model that represents lexical, conceptual, and somatic information at different processing levels and is largely consistent with the models of Leventhal (1984) and Buck (1985) discussed earlier. The levels cannot be entirely autonomous. For example, somatic priming of the conceptual level of emotion knowledge may be required to account for some types of emotion-congruent recall, whereas priming of the somatic level by activation of the conceptual level is probably necessary to account for the success of the Velten mood manipulation (Velten, 1968).
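The spreading-activation assumption behind these network models (an activated emotion node passes a decaying fraction of its activation along weighted links to associated nodes, raising their accessibility) can be sketched as a toy computation. The node names, link weights, and decay constant below are illustrative assumptions, not parameters of any of the models cited.

```python
# Toy spreading-activation network: activating an emotion node raises the
# activation of associated lexical/conceptual nodes, so mood-congruent
# word meanings become momentarily more accessible than uncongruent ones.
links = {                      # weighted associative links (all illustrative)
    "sadness": {"mourning": 0.8, "blue": 0.6, "funeral": 0.7},
    "happiness": {"presents": 0.8, "joy": 0.9},
}

def spread(source, amount=1.0, decay=0.5):
    """Propagate activation one step from `source` along weighted links;
    each neighbor receives amount * link weight * decay."""
    activation = {source: amount}
    for node, weight in links.get(source, {}).items():
        activation[node] = amount * weight * decay
    return activation

act = spread("sadness")
# The mood-congruent homophone meaning ("mourning") is now more active than
# a meaning with no link from the emotion node ("morning").
print(act["mourning"] > act.get("morning", 0.0))
```

A multilevel model of the kind Niedenthal, Setterlund, and Jones propose would add further node types (somatic, conceptual) and cross-level links of the same form, so that activating one level partially primes the others.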
Moreover, the structure of emotion language corresponds well to the proposed structure of emotional experience itself (e.g., Fehr & Russell, 1984; Shaver, Schwartz, Kirson, & O’Connor, 1987). For example, in the Shaver et al. (1987) study of emotion terms, results of
hierarchical cluster analysis revealed that people think about emotions and organize the emotion lexicon in terms of the basic categories of joy, love, anger, sadness, fear, and possibly surprise. The authors concluded that "It seems possible . . . that all of the terms in the emotion lexicon - at least the hundred or so that are most prototypical of the category emotion - refer in one way or other to a mere handful of basic-level emotions" (p. 1072). Finally, Niedenthal and Cantor (1986) and Erber (1991) have demonstrated that emotional responses to people and emotional states, respectively, can influence the ways in which those people are categorized into social role and personality trait categories. So, clearly, there is an important relationship between the emotional response categories, conceptual knowledge about the emotions, and ways of describing emotion (Lakoff, 1987). However, as just mentioned, people do not always experience emotions when discussing emotions or even when discussing emotional episodes. And, at the other extreme, some people can experience strong emotions and fail to find words to describe them at all (ten Houten, Hoppe, Bogen, & Walter, 1986). In an interesting pair of studies, Strauman and Higgins (1987) showed that words with positive meanings can prime negative emotional reactions in subjects. Specifically, Strauman and Higgins recruited some subjects who indicated in pretesting that they possessed discrepancies between who they believed they actually were (the actual self) and who they ideally wanted to be (the ideal self), and other subjects who had discrepancies between the actual self and the person they believed they ought to be (the ought self). Actual-ideal self discrepancies have been shown to be associated with a proneness to depression and dejection-related affect; actual-ought discrepancies are associated with a proneness to anxiety and agitation (see Higgins, 1989, for a review).
In the main studies, subjects were asked to complete sentence stems such as, “An honest person . . .” For each subject, some adjectives in the sentence stems, all of which were positively valenced, referred to that subject’s own self discrepancies. For example, if a subject’s actual self contained the trait disorganized and his or her ideal self contained the word organized, then organized might have appeared in one of the sentence stems for that subject. When subjects responded to the stems that contained adjectives that represented their self discrepancies, they demonstrated behavioral and physiological evidence of the predicted types of negative emotion (i.e., evidence of either depressive or agitated emotion). This was true despite the fact that the words were positive in meaning and were ostensibly referring to a “typical person,” and not the self. Similar findings have been obtained in work on the priming of positive and negative autobiographical memories (Strauman, 1990). This type of result is powerful evidence of the distinction between normative
word valence and emotional responses to information by individuals. We would classify some of the positive words used by Strauman as members of negative emotional response categories of his subjects.
VII. Conclusion

In conclusion, we have suggested that emotional responses provide a type of conceptual coherence. Specific emotional reactions to events may render those events equivalent for the perceiver. The emotional response categories are probably implicit, in that the individual cannot necessarily articulate the rule of conceptual coherence. We have also argued that the set of emotional response categories corresponds to a set of basic emotions, probably including happiness, sadness, fear, anger, and disgust. Some "members" of the categories are learned through experience, but there are probably innate constraints on the connections between emotional responses and naturally occurring stimuli. The relationship between emotional response categories and language is clearly important, but it is highly complex and will require much future research to generate quantitative models of emotional response categories and their connections to people's explicit knowledge of emotions and emotional language.

ACKNOWLEDGMENTS

Preparation of this chapter and some of the authors' research reported herein were facilitated by Grant MH4481 from the National Institute of Mental Health and Grants BNS-8919755 and DBS-921019 from the National Science Foundation. The authors thank Lawrence Barsalou, Jerome Bruner, Robert Goldstone, Cindy Hall, and Douglas Medin for their helpful comments on earlier versions of the manuscript.
REFERENCES

Anderson, J. R., & Bower, G. H. (1973). Human associative memory. Washington, DC: Winston.
Baron, R. A. (1987). Interviewer's mood and reaction to job applicants: The influence of affective states on applied social judgments. Journal of Applied Social Psychology, 17, 911-926.
Barsalou, L. W. (1991). Deriving categories to achieve goals. In G. H. Bower (Ed.), The psychology of learning and motivation (pp. 1-64). San Diego: Academic Press.
Barsalou, L. W. (1993). Flexibility, structure, and linguistic vagary in concepts: Manifestations of a compositional system of perceptual symbols. In A. F. Collins, S. E. Gathercole, M. A. Conway, & P. E. Morris (Eds.), Theories of memory (pp. 29-101). Hillsdale, NJ: Erlbaum.
Blaney, P. H. (1986). Affect and memory: A review. Psychological Bulletin, 99, 229-246.
Bower, G. H. (1981). Mood and memory. American Psychologist, 36, 129-148.
Bower, G. H. (1987). Commentary on mood and memory. Behaviour Research and Therapy, 25, 443-455.
Bower, G. H. (1991). Mood congruity of social judgments. In J. P. Forgas (Ed.), Emotion and social judgments (pp. 31-54). Oxford: Pergamon Press.
Bower, G. H., Gilligan, S. G., & Monteiro, K. P. (1981). Selectivity of learning caused by affective states. Journal of Experimental Psychology: General, 110, 451-473.
Bruner, J. S., Goodnow, J. J., & Austin, G. A. (1956). A study of thinking. New York: Wiley.
Bruner, J. S., & Postman, L. (1947). Emotional selectivity in perception and reaction. Journal of Personality, 16, 69-77.
Buck, R. (1985). Prime theory: An integrated view of motivation and emotion. Psychological Review, 92, 389-413.
Cannon, W. B. (1929). Bodily changes in pain, hunger, fear, and rage. New York: Appleton-Century.
Cantor, N., & Mischel, W. (1979). Prototypes in person perception. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 12, pp. 3-52). New York: Academic Press.
Clark, D. M., & Teasdale, J. D. (1982). Diurnal variation in clinical depression and accessibility of positive and negative experiences. Journal of Abnormal Psychology, 91, 87-95.
Clark, D. M., Teasdale, J. D., Broadbent, D. E., & Martin, M. (1983). Effect of mood on lexical decisions. Bulletin of the Psychonomic Society, 21, 175-178.
Clark, M. S., & Isen, A. M. (1982). Towards understanding the relationship between feeling states and social behavior. In A. H. Hastorf & A. M. Isen (Eds.), Cognitive social psychology (pp. 73-108). New York: Elsevier-North Holland.
Clark, M. S., & Waddell, B. A. (1983). Effects of moods on thoughts about helping, attention, and information acquisition. Social Psychology Quarterly, 46, 31-35.
Collins, A. M., & Loftus, E. F. (1975). A spreading-activation theory of semantic processing. Psychological Review, 82, 407-428.
Cook, M., & Mineka, S. (1989). Observational conditioning of fear to fear-relevant versus fear-irrelevant stimuli in rhesus monkeys. Journal of Abnormal Psychology, 98, 448-459.
Cook, M., & Mineka, S. (1990). Selective associations in the observational conditioning of fear in monkeys. Journal of Experimental Psychology: Animal Behavior Processes, 16, 372-389.
Corteen, R. S., & Wood, B. (1972). Autonomic responses to shock-associated words. Journal of Experimental Psychology, 94, 308-313.
de Groot, A. M. B. (1983). The range of automatic spreading activation in word priming. Journal of Verbal Learning and Verbal Behavior, 22, 417-436.
Derryberry, D., & Tucker, D. M. (1991). The adaptive base of the neural hierarchy: Elementary motivational controls on network function. In R. Dienstbier (Ed.), Nebraska Symposium on Motivation: Vol. 38. Perspectives on motivation (pp. 289-342). Lincoln, NE: University of Nebraska Press.
Derryberry, D., & Tucker, D. M. (1994). Motivating the focus of attention. In P. M. Niedenthal & S. Kitayama (Eds.), The heart's eye: Emotional influences in perception and attention (pp. 167-196). San Diego: Academic Press.
Dimberg, U. (1982). Facial reactions to facial expressions. Psychophysiology, 19, 643-647.
Dimberg, U., & Öhman, A. (1983). The effects of directional facial cues on electrodermal conditioning to facial stimuli. Psychophysiology, 20, 160-167.
Dixon, N. (1981). Preconscious processing. New York: Wiley.
Duclos, S. E., Laird, J. D., Schneider, E., Sexter, M., Stern, L., & Van Lighten, O. (1989). Emotion-specific effects of facial expressions and postures on emotional experience. Journal of Personality and Social Psychology, 57, 100-108.
Ekman, P. (1984). Expression and the nature of emotion. In K. Scherer & P. Ekman (Eds.), Approaches to emotion (pp. 319-343). Hillsdale, NJ: Erlbaum.
Ekman, P. (1994). Strong evidence for universals in facial expressions: A reply to Russell's mistaken critique. Psychological Bulletin, 115, 268-287.
Ekman, P., & Friesen, W. V. (1971). Constants across cultures in the face and emotion. Journal of Personality and Social Psychology, 17, 124-129.
Ellis, H. C., & Ashbrook, P. W. (1989). The "state" of mood and memory research: A selective review. Journal of Social Behavior and Personality, 4, 1-21.
Erber, R. (1991). Affective and semantic priming: Effects of mood on category accessibility and inference. Journal of Experimental Social Psychology, 27, 480-498.
Etcoff, N. L., & Magee, J. J. (1992). Categorical perception of facial expressions. Cognition, 44, 221-240.
Eysenck, M. W., Mogg, K., May, J., Richards, A., & Mathews, A. M. (1991). Bias in interpretation of ambiguous sentences related to threat in anxiety. Journal of Abnormal Psychology, 100, 144-150.
Fehr, B., & Russell, J. A. (1984). The concept of emotion viewed from a prototype perspective. Journal of Experimental Psychology: General, 113, 464-486.
Forgas, J. P., & Bower, G. H. (1987). Mood effects on person perception judgments. Journal of Personality and Social Psychology, 53, 53-60.
Forster, P. M., & Govier, E. (1978). Discrimination without awareness? Quarterly Journal of Experimental Psychology, 30, 289-295.
Frijda, N. H., Kuipers, P., & ter Schure, E. (1989). Relations among emotion, appraisal, and emotional action readiness. Journal of Personality and Social Psychology, 57, 212-228.
Gerrig, R. J., & Bower, G. H. (1982). Emotional influences on word recognition. Bulletin of the Psychonomic Society, 19, 197-200.
Gilligan, S. G., & Bower, G. H. (1984). Cognitive consequences of emotional arousal. In C. E. Izard, J. Kagan, & R. B. Zajonc (Eds.), Emotions, cognition, and behavior (pp. 547-588). Cambridge: Cambridge University Press.
Goldstone, R. L. (1994a). The role of similarity in categorization: Providing a groundwork. Cognition, 52, 125-157.
Goldstone, R. L. (1994b). Influences of categorization on perceptual discrimination. Journal of Experimental Psychology: General, 123, 178-200.
Gotlib, I. H., & McCann, C. D. (1984). Construct accessibility and depression: An examination of cognitive and affective factors. Journal of Personality and Social Psychology, 47, 427-439.
Halberstadt, J. B., Niedenthal, P. M., & Kushner, J. (in press). Resolution of lexical ambiguity by emotional state. Psychological Science.
Hansen, C. H., & Hansen, R. D. (1988). Finding the face in the crowd: An anger superiority effect. Journal of Personality and Social Psychology, 54, 917-924.
Hansen, C. H., & Hansen, R. D. (1994). Automatic emotion: Attention and facial efference. In P. M. Niedenthal & S. Kitayama (Eds.), The heart's eye: Emotional influences in perception and attention (pp. 217-243). San Diego: Academic Press.
Hardy, G. H. (1940). A mathematician's apology. Cambridge: Cambridge University Press.
Hasher, L., Rose, K. C., Zacks, R. T., Sanft, H., & Doren, B. (1985). Mood, recall, and selectivity in normal college students. Journal of Experimental Psychology: General, 114, 106-120.
Higgins, E. T. (1989). Self-discrepancy theory: What patterns of self-beliefs cause people to suffer? In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 22, pp. 96-136). San Diego: Academic Press.
Hinde, R. A. (1974). Biological bases of human social behavior. New York: McGraw-Hill.
Homa, D., Haver, B., & Schwartz, T. (1976). Perceptibility of schematic face stimuli: Evidence for a perceptual Gestalt. Memory & Cognition, 4, 176-185.
Ingram, R. E. (1984). Toward an information-processing analysis of depression. Cognitive Therapy and Research, 8, 443-478.
Isen, A. M. (1984). Toward understanding the role of affect in cognition. In R. S. Wyer & T. K. Srull (Eds.), Handbook of social cognition (Vol. 3, pp. 179-236). Hillsdale, NJ: Erlbaum.
Isen, A. M. (1987). Positive affect, cognitive processes, and social behavior. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 20, pp. 203-253). San Diego: Academic Press.
Isen, A. M., & Gorgoglione, J. M. (1983). Some specific effects of four affect-induction procedures. Personality and Social Psychology Bulletin, 9, 136-143.
Isen, A. M., Niedenthal, P. M., & Cantor, N. (1992). An influence of positive affect on social categorization. Motivation and Emotion, 16, 65-78.
Izard, C. E. (1977). Human emotions. New York: Plenum Press.
Izard, C. E. (1994). Innate and universal facial expressions: Evidence from developmental and cross-cultural research. Psychological Bulletin, 115, 288-299.
Johnson, M. H., & Magaro, P. A. (1987). Effects of mood and severity on memory processes in depression and mania. Psychological Bulletin, 101, 28-40.
Johnson-Laird, P. N., & Oatley, K. (1992). Basic emotions, rationality, and folk theory. Cognition and Emotion, 6, 201-223.
Kitayama, S. (1990). Interaction between affect and cognition in word perception. Journal of Personality and Social Psychology, 58, 209-217.
Kostandov, E. A. (1985). Neurophysiological mechanisms of "unaccountable" emotions. In J. T. Spence & C. E. Izard (Eds.), Motivation, emotion, and personality (pp. 175-193). Amsterdam: Elsevier/North-Holland.
Kostandov, E. A., & Arzumanov, Y. L. (1977). Averaged cortical evoked potentials to recognized and non-recognized verbal stimuli. Acta Neurobiologiae Experimentalis, 37, 311-324.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22-44.
Laird, J. D. (1989). Mood affects memory because feelings are cognitions. Journal of Social Behavior and Personality, 4, 33-38.
Laird, J. D., Cuniff, M., Sheehan, K., Shulman, D., & Strum, G. (1989). Emotion specific effects of facial expression on memory for life events. Journal of Social Behavior and Personality, 4, 87-98.
Laird, J. D., Wagener, J. J., Halal, M., & Szegda, M. (1982). Remembering what you feel: Effects of emotion on memory. Journal of Personality and Social Psychology, 42, 646-657.
Lakoff, G. (1987). Women, fire, and dangerous things. Chicago: University of Chicago Press.
Lanzetta, J. T., & Orr, S. P. (1980). Influence of facial expressions on the classical conditioning of fear. Journal of Personality and Social Psychology, 39, 1081-1087.
Lazarus, R. S. (1966). Psychological stress and the coping process. New York: McGraw-Hill.
LeDoux, J. E. (1987). Emotion. In F. Plum (Ed.), Handbook of physiology. Section I: The nervous system. Volume V: Higher functions of the brain, Part I (pp. 419-459). Bethesda, MD: American Physiological Society.
LeDoux, J. E. (1993). Emotional memory systems in the brain. Behavioural Brain Research, 58, 69-79.
LeDoux, J. E. (1994, June). Emotion, memory and the brain. Scientific American, 50-57.
Levenson, R. W. (1992). Autonomic nervous system differences among emotions. Psychological Science, 3, 23-27.
Levenson, R. W., Ekman, P., & Friesen, W. V. (1990). Voluntary facial action generates emotion-specific autonomic nervous system activity. Psychophysiology, 27, 363-384.
Leventhal, H. (1984). A perceptual-motor theory of emotion. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 17). San Diego, CA: Academic Press.
Lloyd, G. G., & Lishman, W. A. (1975). Effect of depression on the speed of recall of pleasant and unpleasant experiences. Psychological Medicine, 5, 173-180.
Lorch, R. F., Jr., Balota, D. A., & Stamm, E. G. (1986). Locus of inhibition effects in the priming of lexical decisions: Pre- or postlexical? Memory & Cognition, 14, 95-103.
MacLeod, C., Mathews, A., & Tata, P. (1986). Attentional bias in emotional disorders. Journal of Abnormal Psychology, 95, 15-20.
MacLeod, C., Tata, P., & Mathews, A. (1987). Perception of emotionally valenced information in depression. British Journal of Clinical Psychology, 26, 67-68.
Madigan, R. J., & Bollenbach, A. K. (1982). Effects of induced mood on retrieval of personal episodes and semantic memories. Psychological Reports, 50, 147-157.
Markman, E. (1989). Categorization and naming in children. Cambridge, MA: MIT Press.
Mathews, A. M., & Bradley, B. (1983). Mood and the self-reference bias in recall. Behaviour Research and Therapy, 21, 233-239.
Mathews, A., & MacLeod, C. (1986). Discrimination of threat cues without awareness in anxiety states. Journal of Abnormal Psychology, 95, 131-138.
Mathews, A. M., Richards, A., & Eysenck, M. (1989). Interpretation of homophones related to threat in anxiety states. Journal of Abnormal Psychology, 98, 31-34.
Mayer, J. D., & Gaschke, Y. N. (1988). The experience and meta-experience of mood. Journal of Personality and Social Psychology, 55, 102-111.
McCaul, K. D., Holmes, D. S., & Solomon, S. (1982). Voluntary expressive changes and emotion. Journal of Personality and Social Psychology, 42, 145-152.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88, 375-407.
Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85, 207-238.
Medin, D. L., Wattenmaker, W. D., & Michalski, R. S. (1987). Constraints and preferences in inductive learning: An experimental study of human and machine performance. Cognitive Science, 11, 299-339.
Mesquita, B., & Frijda, N. H. (1992). Cultural variation in emotion: A review. Psychological Bulletin, 112, 179-204.
Mineka, S. (1992). Evolutionary memories, emotional processing, and the emotional disorders. In D. L. Medin (Ed.), The psychology of learning and motivation (Vol. 28, pp. 161-206). San Diego: Academic Press.
Mischel, W., Ebbesen, E. E., & Zeiss, A. (1976). Determinants of selective memory about the self. Journal of Consulting and Clinical Psychology, 44, 92-103.
Morton, J. (1969). The interaction of information in word recognition. Psychological Review, 76, 165-178.
Murphy, G. L., & Medin, D. L. (1985). The role of theories in conceptual coherence. Psychological Review, 92, 289-316.
Nasby, W., & Yando, R. (1982). Selective encoding and retrieval of affectively valent information: Two cognitive consequences of mood. Journal of Personality and Social Psychology, 43, 1244-1253.
Natale, M., & Hantas, M. (1982). Effect of temporary mood states on selective memory about the self. Journal of Personality and Social Psychology, 42, 927-934.
Niedenthal, P. M. (1990). Implicit perception of affective information. Journal of Experimental Social Psychology, 26, 505-527.
Niedenthal, P. M. (1992). Affect and social perception: On the psychological validity of rose-colored glasses. In R. Bornstein & T. Pittman (Eds.), Perception without awareness (pp. 211-235). New York: Guilford Press.
Niedenthal, P. M., & Cantor, N. (1986). Affective responses as guides to category-based inferences. Motivation and Emotion, 10, 217-232.
Niedenthal, P. M., Halberstadt, J. B., & Setterlund, M. B. (1994). Emotional state and emotional connotation in word perception (Research Report 119). Indiana University Cognitive Science Program.
Niedenthal, P. M., & Setterlund, M. B. (1994). Emotion congruence in perception. Personality and Social Psychology Bulletin, 20, 401-410.
Niedenthal, P. M., Setterlund, M. B., & Jones, D. E. (1994). Emotional organization of perceptual memory. In P. M. Niedenthal & S. Kitayama (Eds.), The heart's eye: Emotional influences in perception and attention (pp. 87-113). San Diego: Academic Press.
Niedenthal, P. M., & Showers, C. J. (1991). The perception and processing of affective information and its influence on social judgment. In J. Forgas (Ed.), Affect and social judgment (pp. 125-143). Oxford: Pergamon Press.
Norman, G. R., Brooks, L. R., Coblentz, C. L., & Babcook, C. J. (1992). The correlation of feature identification and category judgments in diagnostic radiology. Memory & Cognition, 20, 344-355.
Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39-57.
Nosofsky, R. M., Palmeri, T. J., & McKinley, S. C. (1994). Rule-plus-exception model of classification learning. Psychological Review, 101, 53-79.
Öhman, A., & Dimberg, U. (1978). Facial expressions as conditioned stimuli for electrodermal responses: A case of "preparedness"? Journal of Personality and Social Psychology, 36, 1251-1258.
Orr, S. P., & Lanzetta, J. T. (1980). Facial expressions of emotion as conditioned stimuli for human autonomic conditioning. Journal of Personality and Social Psychology, 38, 278-282.
Ortony, A., Clore, G., & Collins, A. (1988). The cognitive structure of emotions. Cambridge, England: Cambridge University Press.
Ortony, A., & Turner, T. J. (1990). What's basic about basic emotions? Psychological Review, 97, 315-331.
Osgood, C. E., & Suci, G. J. (1955). Factor analysis of meaning. Journal of Experimental Psychology, 50, 325-338.
Powell, M., & Hemsley, D. R. (1984). Depression: A breakdown of perceptual defense? British Journal of Psychiatry, 145, 358-362.
Pratto, F., & John, O. P. (1991). Automatic vigilance: The attention-grabbing power of negative information. Journal of Personality and Social Psychology, 61, 380-391.
Reed, S. K. (1972). Pattern recognition and categorization. Cognitive Psychology, 3, 382-407.
Richards, A., Reynolds, A., & French, C. (1993). Anxiety and the spelling and use in sentences of threat/neutral homophones. Current Psychology: Research and Reviews, 12, 18-25.
Rimé, B., Philippot, P., & Cisamolo, D. (1990). Social schemata of peripheral changes in emotion. Journal of Personality and Social Psychology, 59, 38-49.
Rosch, E. (1973). On the internal structure of perceptual and semantic categories. In T. E. Moore (Ed.), Cognitive development and the acquisition of language (pp. 111-144). New York: Academic Press.
Emotional Response Categories
63
Rosch, E. (1975). Cognitive representations of semantic categories. Journal of Experimental Psychology: Human Perception and Performance, 1, 303-322. Russell, J. A. (1994). Is there universal recognition of emotion from facial expression? A review of the cross-cultural studies. Psychological Review, 115, 102-141. Salovey, P., & Singer, J. A. (1988). Mood congruency effects in recall of childhood versus recent memories. Journal of Social Behavior and Personality, 89, 99-120. Saper, C. B. (1987). Diffuse cortical projection systems: Anatomical organization and role in cortical function. In F. Plum (Ed.), Handbook of physiology. Section I: The neruous system. Volume V: Higher functions of the brain, Part I. (pp. 169-210). Bethesda, MD: American Physiological Society. Shweder, R. A. (1991). Thinking through cultures. Cambridge, MA: Harvard University Press. Seidenberg, M. S., Waters, G . S., Sanders, M., & Langer, P. (1984). Pre- and postlexical loci of contextual effects on word recognition. Memory & Cognition, 12, 315-328. Seligman, M. E. P. (1970). On the generality of the laws of learning. Psychological Review, 77, 406-418. Shaver, P., Schwartz, J., Kirson, D., & O’Connor, G. (1987). Emotion knowledge: Further exploration of a prototype approach. Journal of Personality and Social Psychology, 52, 1061-1086. Shevrin, H., Williams, W. J., Marshall, R. E., Hertel, R. K., Bond, J. A., & Brakel, L. A. (1992). Event-related potential indicators of the dynamic unconscious. Consciousness and Cognition, I , 340-366. Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: 11. Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127-190. Singer, J. A., & Salovey, P. (1988). Mood and memory: Evaluating the network theory of affect. Clinical Psychology Review, 8, 21 1-251. Smith, E. E., & Medin, D. L. (1981). Categories and concepts. Cambridge, MA: Harvard University Press. Smith, E. R., & Zarate, M. A. 
(1992). Exemplar-based model of social judgment. PR, 99, 3-21. Snyder, M., & White, P. (1982). Moods and memories: Elation, depression, and the remembering of events in one’s life. Journal of Personality, 50, 149-167. Steinmetz, J. E. (1994). Brain substrates of emotion and temperament. In J. E. Bates & T. D. Wachs (Eds.). Temperament: Individual differences at the interface of biology and behavior. Washington, D C APA Press. Strauman, T. J. (1990). Self-guides and emotionally significant childhood memories: A study of retrieval efficiency and incidental negative emotional content. Journal of Personality and Social Psychology, 59, 868-880. Strauman, T. J., & Higgins, E. T. (1987). Automatic activation of self-discrepancies and emotional syndromes: When cognitive structures influence affect. Journal of Personality and Social Psychology, 53, 1004-1014. Teasdale, J. D. (1983). Negative thinking in depression: Cause, effect or reciprocal thinking? Advances in Behavioral Research and Therapy, 5, 3-25. Teasdale, J. D., & Fogarty, S. J. (1979). Differential effects of induced mood on retrieval of pleasant and unpleasant events from episodic memory. Journal of Abnormal Psychology, 88, 248-257. Teasdale, J. D., & Russell, M. L. (1983). Differential effects of induced mood on the recall of positive, negative and neutral words. British Journalof Clinical Psychology, 22,163-171. ten Houten, W. D., Hoppe, K. D., Bogen, J. E. & Walter, D. 0. (1986). Alexithymia: An experimental study of cerebral commissurotomy patients and normal control subjects. American Journal of Psychiatry, 143, 312-316.
64
Paula M. Niedenthal and Jamin B. Halberstadt
Tomkins, S. S. (1962). Affect, imagery, consciousness (Vol. 1). The positive affects. New York: Springer Verlag. Tomkins, S. S. (1963). Affect, imagery, consciousness (Vol. 2). The negative affecrs. New York: Springer Verlag. van Santen, J. P. H., & Jonides, J. (1978). A replication of the face-superiority effect. Bulletin of the Psychonomic Society, 12, 378-380. Velten, E. A. (1968). A laboratory task for induction of mood states. Behavior Research and Therapy, 6, 473-482. von Hipple, W., Hawkins. C., & Narayan, S. (1994). Personality and perceptual expertise: Individual differences in perceptual identification. Psychological Science, 6, 401-407. Watts, F. N., McKenna, F. P., Sharrock, R.. & Trezise, L. (1986). Colour naming of phobiarelated words. British Journal of Psychology, 77, 97-108. West, R. F., & Stanovich, K. E. (1982). Source of inhibition in experiments on the effect of sentence context on word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 395-399. Whorf, B. (1941). Language and logic. In J. B. Carroll (Ed.), Language, thought, and reality: Selected papers of Benjamin Lee Whorf (pp. 233-245). Cambridge: MIT Press. Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American Psychologist, 35, 151-175.
EARLY SYMBOL UNDERSTANDING AND USE

Judy S. DeLoache
I. Introduction

Many psychotherapists, as well as various mystics and gurus, exhort their clients or followers to "live in the present moment," to focus on current feelings and experiences, suppressing thoughts about past experiences and emotions and about future plans and concerns. Although therapeutic benefits may derive from the ability to "be here now," that is precisely what our sophisticated cognitive powers enable us not to do. It is the ability to distance ourselves from present experience (to remember the past, plan for the future, and reflect upon the present) that gives human cognition its extraordinary power. We reminisce, reconsider, and repent; we make predictions, test hypotheses, draw inferences, and induce general principles; we daydream, plan, and formulate strategies. All these and other cognitive activities extricate us from the present moment.

The early cognitive development of children is to a large extent the development of the ability to mentally transcend time and space, a feat made possible by the development of symbolic capacities. Mastering the particular symbols of one's culture is thus a central developmental task. All children begin learning language and gestures in the first year of life. Western children soon start to learn informally about other kinds of symbolic media, such as pictures and television, followed by formal instruction in the alphabet and reading and in numbers and arithmetic. Many young children also learn to read music, identify computer icons, interpret simple
maps and diagrams, and so forth. Given the centrality of symbol use in human cognition and communication, it is important to learn more about its origin and early development.

The goal of this chapter is to present and discuss a model of early symbol understanding and use, as well as to review some of the research on which the model is based. Before doing so, however, it will be useful to consider in some detail what is meant by the term symbol and to delineate how that term is used in the research to be reviewed.

A. DEFINITIONS OF SYMBOL
One problem with defining what one means by the word symbol is that the term has been used by many different people in many different fields to mean many different things. Even if we limit our consideration to psychologists' use of the term, there is still a certain amount of variability:

1. "A symbol brings to mind something other than itself" (Huttenlocher & Higgins, 1978, p. 109).
2. "Symbolization is the representing of an object or event by something other than itself" (Potter, 1979, p. 41).
3. Symbols include "words, artifacts, or other symbolic products that people use to represent (to stand for, refer to) some aspect of the world or . . . their knowledge of the world" (Mandler, 1983, p. 420).
Consistent with these definitions, psychologists commonly use the term symbol to refer to both internal and external representations (mental as well as nonmental) and to both linguistic and nonlinguistic symbols. However, there is reason to suspect that different processes are involved in the encoding of experience in the mind, on the one hand, and the creation and exploitation of the great variety of symbols external to the mind, on the other. External symbols such as maps and numbers serve as vehicles for thought; they give rise to internal representations, but that may be the primary link between the two kinds of representation. I believe that lumping together these extremely different kinds of representations has probably obscured our understanding of both. In this chapter, I focus exclusively on nonmental, external symbols.

A similar conflation occurs with respect to linguistic and nonlinguistic symbols, and it is again possible that failing to distinguish between language and very different kinds of symbols, such as pictures and orthography, may obscure real, important differences in the processes involved in their use, and hence impede understanding of either. The focus of this chapter is on nonlinguistic symbols. Thus, I concentrate on symbols that persist over time, that are not, like language and gestures, evanescent. These are artifacts that have been created or drafted to serve a representational function.
My working definition is as follows: A symbol is some entity that someone intends to stand for something other than itself. Note that this definition is agnostic regarding the nature of a symbol: Virtually anything can be a symbol, so long as some person intends for it to be interpreted in terms of something other than itself. Similarly, anything (person, concept, object, event) can be a referent, so long as there is a stipulated link between it and an entity that stands for it. Intentionality is a defining feature, both in the philosopher's sense of being about something and in the everyday sense of being purposeful.

Even with this major delimiting of the term symbol, there is still enormous variability in its application. To illustrate this point, Table I lists a number of different external or artifactual symbols. The reader can easily generate numerous additional entries for the table. Indeed, by my definition, the list is essentially infinite; a person can turn anything into a symbol. Gombrich (1969) pointed out that a couch displayed in the window of a furniture store is not just another pretty couch. (However, we should remember that, as Freud cautioned, sometimes a cigar is just a cigar.) The items in the table differ on many dimensions, including characteristics of symbols themselves and of symbol-referent relations, functions of symbols, and their place in larger systems.

1. Characteristics of Symbols
TABLE I
EXAMPLES OF DIFFERENT TYPES OF SYMBOLS

Calendar                       Blueprint
Santa Claus                    Christian cross
Photograph                     Jewish star
Scale model                    Numbers
Candles on a birthday cake     Alphabet letters
Swastika                       Words
Semaphore flags                Poem, story, novel
Braille                        Cartoon
Morse code                     Play
American flag                  Movie
Trophy                         Money
Painting                       Statue
Musical score                  Replica toys
Map                            Graph

One important source of variability in symbols is what, if any, status they have as something other than a representation. At one extreme are symbols such as statues, scale models, and Gombrich's couch. These are objects that serve a clear representational function, but they are meaningful in their own right as well and can possess many interesting and salient properties. At the other extreme are purely abstract and formal symbols, such as letters and numbers, which have meaning only as representations and possess relatively few physical properties of any interest (other than to a typographer or calligrapher).

2. Nature of Symbol-Referent Relation
There are many dimensions on which symbol and referent can be related. One is iconicity: A symbol may bear a close physical resemblance to its referent (e.g., realistic statues or paintings, photographs, models). Alternatively, there may be no similarity between symbol and referent at all (e.g., numerals bear no resemblance to the quantities they represent).

A second dimension is the nature of the mapping between symbol and referent. There can be a quite specific, one-to-one mapping, with a single symbol representing a single, unique referent (e.g., portraits, maps, numbers). One-to-one mappings are not necessarily simple; they can include multiple elements within both symbol and referent and hence involve structural matches between the internal relations of the two entities. The mapping may also be generic, one-to-many, with a single symbol standing for any number of referents (e.g., blueprints for tract houses, assembly diagrams). Generic mappings can also involve a symbol for a whole category of referents (e.g., children's replica toys). Finally, the mapping may be one-to-none; that is, a meaningful symbol can stand for something that has never existed and may never actually exist. A good example is the word unicorn and the famous medieval tapestries depicting that imaginary animal. Another is a large proportion of the illustrations in books for young children, which include dogs driving sports cars, pigs as detectives, mother bunnies reading bedtime stories, fantastic spaceships, and so on.

Symbols also differ in terms of the functions they fulfill. For most symbols, a primary purpose is communication. The symbol provides a means for one person to have some effect on another person or persons. Some symbols are very specifically designed and used as a source of information about something else (e.g., a blueprint, musical score, map, or scientific formula).
The symbol provides the medium for the contents of one person's cognition (his or her ideas, knowledge, designs) to be available to someone else. Symbols also serve as tools for problem solving (e.g., numbers, diagrams, maps). Many symbols have as their primary function the commemoration or recording of some person or event (e.g., trophies, historical statues and markers, photographs). The main function of some symbols is much less specific and primarily evocative, whether aesthetically, intellectually, or emotionally (e.g., religious icons, patriotic symbols such as flags). Finally, self-expression can be the main function of a symbol, particularly to its creator.

Symbols also differ in the extent to which they are part of a tightly organized system. In symbol systems such as numbers, musical notation, and print, the individual elements have meaning only with respect to one another. The relations among elements are critical.

This discussion of different types and functions of symbols is based on the belief that not enough attention has been paid to these distinctions. It is possible that various psychological processes involving the understanding and use of symbols vary, slightly or significantly, as a function of the specific nature of the symbol, its relation to its referent, how it is being used, and so forth. It seems quite likely that these differences would have substantial impact on the early symbolic development of young children. Therefore, in discussing symbol understanding and use, it is important to specify the type of symbol and the context of its use.

The particular focus of my research concerns how young children understand and use symbols as information about reality, that is, as a source of information about something other than the symbol itself. It is easy to generate examples of symbols providing critical information. Road maps provide the traveler with information about the most efficient route to take, and navigational charts help sailors avoid dangerous rocks. An architect's blueprints show his or her clients what they will be paying for, and they tell the contractor what to build. An identikit picture of an alleged perpetrator steers police and erstwhile crime stoppers toward appropriate suspects.
The first televised images of the feet of the lunar module resting on the surface of the moon gave scientists definitive evidence that some theories of the lunar surface were wrong. The diagram for the purportedly "easy-to-assemble" bicycle may enable parents to complete the task before the Christmas dawn.

The basic research strategy I have adopted is to examine symbol understanding and use in a problem-solving context in which information from a symbol provides the solution for a desired outcome. Success or failure in solving the problem reveals the extent to which children can appropriately interpret and use the symbol. In some of the research in my lab, we employ a symbol that is novel to young children in order to examine how they figure out an unfamiliar type of symbol-referent relation. Scale models and maps, in which the symbol stands for some particular entity, are unfamiliar to very young children. In other research, we study very young children's ability to use a familiar type of stimulus (pictures) in a novel, problem-solving context.
In the basic task used for this research, the information provided by the symbol concerns the location of a hidden toy. The child's job is to use the symbolically communicated information to retrieve the toy. Object retrieval is an ideal dependent measure to use with very young children. First, it is relatively nonverbal, both in terms of what the experimenter must communicate to the child and in terms of the child's response. Because the children in our studies display their understanding of a symbol-referent relation by using it to find a hidden toy, their restricted verbal skills are not problematic. Second, toddlers are notoriously uncooperative subjects (and offspring), but they generally enjoy and are highly motivated by games that involve finding hidden objects.

II. The Scale Model Task
In the majority of the research that my colleagues and I have done, a small-scale model serves as a symbol for a larger space. The data from those studies will be the primary focus of this chapter. Generally, in this task, the child must use his or her knowledge of where an object is hidden in a scale model of a room to infer where to find a larger toy in the room itself. The toys are always highly similar except for size, and the same is true of the two spaces.

We have used three different models and rooms. Two of the rooms are real, large rooms, and the other is a tentlike portable room constructed of white fabric walls supported by plastic pipe. Figure 1 shows the layout of one of the real rooms and its model. Each of the rooms is furnished with standard items of living-room furniture (couch, chairs, tables, baskets, pillows, etc.), and its corresponding model contains miniature versions of those same objects. In our standard model task, the objects in the two spaces are highly similar in surface appearance (i.e., the two couches are covered with the same upholstery fabric, the tables are of similarly colored wood, the pillows are of the same fabric, etc.). The corresponding objects are in the same spatial arrangement in the two spaces, and the model and room are in the same spatial orientation. The child can never see both spaces at the same time.

Fig. 1. Layout of a room and corresponding scale model used in some of our research. The room measures 6.5 x 5.5 x 2.6 m, and the model is 84 x 74 x 33 cm (a size ratio of 58 to 1). The model is always in the same spatial orientation as the room.

The standard task always starts with an extensive orientation in which the similarity and correspondence between the dolls and between the spaces are described and demonstrated to the child. The experimenter explains that the two dolls (e.g., Little Snoopy and Big Snoopy) "like to do the same things in their rooms. Wherever Little Snoopy hides in his little room, Big Snoopy likes to hide in the same place in his big room."

Each trial begins with a hiding event: The child watches as the experimenter hides the miniature toy somewhere in the model (e.g., under the chair, behind the couch, under a floor pillow). She tells the child that she is going to hide the other toy in the same place in the room. After doing so, she asks the child to find that toy, reminding the child that the larger toy is in the same place in the room as where the child saw the smaller toy being hidden in the model. ("Can you find Big Snoopy? Remember, he's hiding in the same place in his big room as Little Snoopy's hiding in his little room.") The child then attempts to find the toy hidden in the room (retrieval 1).

To succeed, the child must use his or her memory representation of the hiding event observed in the model to draw an inference about the unseen
hiding event in the room. We thus take retrieval 1 performance as a measure of the child's understanding of the model-room relation; understanding that relation provides a firm basis for knowing where to search on retrieval 1, whereas failure to realize that the model and room are related leaves the child with no basis for knowing where to search.

In all of our earlier studies, we counterbalanced space (whether the hiding event occurred in the model or the room). Thus, half the children observed the miniature toy being hidden in the model, and their retrieval 1 search was in the room; the other half watched the larger toy being hidden in the room and searched in the model. Results were always the same, regardless of where the hiding event occurred, so we eventually discontinued the counterbalancing. Throughout this chapter, I will, for convenience, describe only the case in which the hiding event occurred in the model, but many of the studies involved both.

Only errorless retrievals are scored as correct; that is, the child must find the toy on the first search. If the child's first search is incorrect (or the child fails to search at all), increasingly explicit prompts are given, ending with the child actually retrieving the toy on each trial. We have followed this practice on the theory that it may help children to remain motivated and continue to enjoy their participation in the task. (Performance does not differ, however, if some or all of the standard prompts are omitted or if the child is not prompted to retrieve the toy after an initial failure.)

Following retrieval 1 in the room, the child is taken back to the model and asked to retrieve the miniature toy that he or she had observed being hidden. This retrieval 2 serves as a memory and motivation check.
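Incidentally, the "size ratio of 58 to 1" given in the Figure 1 caption can be recovered from the stated dimensions if it is read as a ratio of floor areas; that reading is my own inference (each linear dimension of the model is scaled down by a factor of only about 7 to 8). A minimal check:

```python
# Sanity check (my own, not from the chapter) of the "size ratio of 58 to 1"
# reported in the Fig. 1 caption. The assumption that the ratio refers to
# floor area, rather than to a linear or volume ratio, is mine.

room_cm = (650, 550, 260)   # room: 6.5 x 5.5 x 2.6 m, converted to cm
model_cm = (84, 74, 33)     # model: 84 x 74 x 33 cm

floor_area_ratio = (room_cm[0] * room_cm[1]) / (model_cm[0] * model_cm[1])
linear_ratios = tuple(round(r / m, 1) for r, m in zip(room_cm, model_cm))

print(round(floor_area_ratio))  # 58 -- matches the caption
print(linear_ratios)            # (7.7, 7.4, 7.9) -- per-dimension scale factors
```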
It is essential to the logic of our task that the children have a memory representation of the original hiding event; if children failed to find the larger toy in the room on retrieval 1 because they failed to remember where the miniature toy had been hidden in the model, we could draw no valid inferences about their understanding (or lack of understanding) of the model-room relation. Good retrieval 2 performance indicates that the children do remember the original hiding event. Furthermore, it shows that they are involved and motivated to participate in the task, a necessity with very young and sometimes recalcitrant subjects.

A. ORIGINAL STUDY
In the original study using the scale model task (DeLoache, 1987), two age groups of very young children participated: The younger group consisted of 2-1/2-year-olds (30-31 months, M = 31 months), and the older group consisted of 3-year-olds (36-39 months, M = 38 months). Each child received four trials. (The small number of trials is necessitated by the difficulty of keeping such young children fully cooperative and highly motivated for more than the 20 to 30 min needed for the task.)

Figure 2 shows the performance of the children on retrievals 1 and 2. As is clear from the figure, both age groups performed at the same high level on retrieval 2. Thus, both the older and younger children remembered where they had observed the toy being hidden in the model. There was, however, a large difference in their performance on retrieval 1. The older children were very successful at using their memory for where the toy had been hidden in the model to figure out where to search in the room (75% errorless retrievals). They were so successful, in fact, that there was no difference between their direct, memory-based retrieval 2 performance and their inference-based retrieval 1 performance. In contrast, the younger children performed very poorly on retrieval 1 (15%). (Although it is difficult to specify precisely what the chance level is in these studies, the 2-1/2-year-olds' performance clearly did not exceed chance.) These children did not use their memory of the location of the miniature toy in the model to infer where to find the larger toy in the room. There was no evidence in their retrieval 1 performance or in other aspects of their behavior that they had any awareness of the model-room relation.

This basic result (high-level success by 3-year-old children and extremely poor performance by children only a few months younger) is highly reliable and has been replicated many times, both in our lab and in research by other investigators (e.g., Dow & Pick, 1992).

Fig. 2. Percentage of errorless retrievals on retrieval 1 (analogous location) and retrieval 2 (original location) by two age groups (3- and 2-1/2-year-old children) in the original model study (DeLoache, 1987; reprinted with permission of the AAAS).

Naturally, given such high and low mean levels of performance, the data for the individual subjects mirror the group performance. Twelve of the 3-year-olds, but none of the 2-1/2-year-olds, succeeded on three or four of their four trials; conversely, 14 of the younger children, but only one of the older children, were correct no more than once.

B. REQUISITES FOR SUCCESSFUL PERFORMANCE

What must the child understand and represent in order to succeed in this task? First, there are various pragmatic aspects: The child must understand that there is a toy hidden in the room and that his or her job is to find it. Both age groups are clear on this point: Even the younger, markedly unsuccessful children search for the larger toy, often enthusiastically and with great delight when they find it (thanks to the experimenter's prompts). The younger children thus seem to understand everything about the task except the crucial thing, namely, that what they know about the model tells them something about the room. They apparently interpret retrieval 1 as a guessing game, and they are happy to play it.

A second factor essential for success is mapping between the model and room: The child must, at a minimum, detect and represent the relations between the corresponding objects in the two spaces. Doing so is supported in our standard task by both perceptual and conceptual similarity. The corresponding objects are of the same type (e.g., a miniature and a full-size couch), and they also look alike, except for size. The older, successful children can clearly perform the requisite mapping. What about the younger, unsuccessful subjects? Could their failure in this task stem from a simple inability to match the corresponding objects in the two spaces rather than from anything to do with symbol understanding and use? To test this possibility, a group of 2-1/2-year-olds was asked simply to indicate which object in the room corresponded to each of the objects in the model. ("See this.
Can you find something like this in the room?") Their high level of success at choosing matching objects (79% correct) indicates that another explanation is needed for the poor performance of this age group in the model task.

A third necessity for success is memory for the original hiding event. The child must have an accessible memory representation of the relation between the miniature toy and its hiding place on each trial. The fact that retrieval 2 performance is always high (75-95%) indicates that deficient memory is not responsible for the younger children's failure to use the model as a source of information about the room.

Thus, there is evidence that 3-year-old children, who are generally very successful in the model task, and 2-1/2-year-old children, who are generally very unsuccessful, do not differ with respect to three of the requisites for successful performance. Successful and unsuccessful children do differ on the fourth: mapping the relation between the toy and its hiding place in the model onto the larger toy and location in the room. What accounts for whether or not children map the memory representation that we know they have of the location of the miniature toy in the model? Is there an additional requisite for successful performance? My colleagues and I believe that there is. Specifically, the claim is that success in the model task involves representation of the higher order relation between the model and room. Although children could, in principle, succeed just on the basis of lower level object-hiding place mappings, we have argued that such an explanation is inadequate to account for the pattern of performance across a variety of studies. Instead, it is necessary to invoke the detection and representation of the general model-room relation.

Figure 3 depicts two extreme versions of the representational framework that could, in principle, support successful performance in the model task. According to Figure 3A, successful retrieval 1 performance is based on independent mappings of the toy-hiding place relation on each trial. Thus, on trial 1, the child might map the Little Snoopy-miniature couch relation in the model onto the Big Snoopy-full-sized couch relation in the room. On trial 2, the child would map Little Snoopy-tiny pillow onto Big Snoopy-real pillow, and so forth. By this view, each successful trial involves the mapping of a single relation in one space onto another single relation in the second space. There is no need to represent any of the relations among objects within each space, let alone the similarity of the relational structure of the two spaces. Furthermore, there is no necessity to detect and represent the higher order, general relation between model and room.

Figure 3B depicts the opposite extreme. Here, the child has a representation of the general model-room relation and, in addition, the spatial relations among all four of the hiding locations within each space.
Mapping the toy-hiding location from the model onto the corresponding objects in the room would thus be supported by substantial relational structure, including the higher level model-room relation. I will claim that successful performance reflects a representation that is closer to what is shown in Figure 3B than 3A. Specifically, although Figure 3B very probably overestimates the nature of the representational state of successful children, Figure 3A definitely underestimates it. The precise nature of how the model-room relation is represented is not clear. It might be as a rule that stipulates, “If t is hidden in x in the model, then T will be in X in the room.” Such a rule provides a clear basis for drawing the necessary inference. The crucial relation might also be represented as an analogy: “The miniature toy is to hiding place x in the model as the larger toy is to hiding place X in the room.” Regardless of the format of the representation of the model-room relation, that represen-
Judy S. DeLoache
16
a
Q
Fig. 3. Two representational frameworks that could, in principle, underlie successful performance in the model task: (A) On each trial, the relevant toy-hiding place relation in the model is independently represented and mapped onto the corresponding relation in the room; (B) Representation of the higher order model-room relation supports representing and mapping the relational structure of the model, including the toy-hiding place location on any given trial, onto the corresponding relational structure of the room.
Early Symbol Understanding and Use
77
tation specifies that the model is relevant to the room, and it provides a basis for drawing inferences from the first to the second. 111. A Model of Symbol Understanding and Use
Figure 4 presents a Model (referred to as Model to avoid confusion with the scale model) of children’s understanding and use of symbols as a source of information. The Model includes processes or constructs hypothesized to intervene between certain variables that we have investigated and the behavior we have observed as a result of those manipulations. It is not possible to take a linear path through the Model because, as will become clear, the contribution of any given factor or intervening variable depends on the presence and level of others. This Model was developed through our research with scale models, pictures, and maps, but it is intended to apply more broadly than to just those symbols. (This Model is a revision and extension of one presented in DeLoache, 1990.)
Fig. 4. A Model of symbol understanding and use. The five rounded rectangles on the left represent factors demonstrated or hypothesized to affect the behavior of interest (symbol use), represented by the rectangle on the far right. The ellipses represent intervening variables assumed to mediate between the manipulated factors and children’s behavior.
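The caption's boxes-and-arrows structure (manipulated factors feeding intervening variables, which in turn support the behavior of interest) can be summarized as a small directed graph. This sketch is an illustration added here, not part of the chapter: the node names paraphrase the text, and the particular arrows are my assumption about Figure 4, not a reproduction of it.

```python
# Illustrative sketch of the Model's dependency structure as a directed
# graph. Node names paraphrase the chapter; the exact arrows are an
# assumption based on the text, not a reproduction of Figure 4.
MODEL_GRAPH = {
    # manipulated factors -> intervening variables
    "instruction": ["representational_insight"],
    "similarity": ["perception_of_similarity"],
    "salience": ["dual_representation"],
    "domain_knowledge": ["representational_insight"],
    "symbolization_experience": ["representational_insight"],
    # intervening variables
    "perception_of_similarity": ["representational_insight", "mapping"],
    "dual_representation": ["representational_insight"],
    "representational_insight": ["mapping"],
    # mapping supports the observed behavior (retrieval 1)
    "mapping": ["symbol_use"],
    "symbol_use": [],
}

def reaches(graph, start, goal):
    """Depth-first check that `start` can influence `goal`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return False

# Every factor ultimately bears on the behavior of interest.
factors = ["instruction", "similarity", "salience",
           "domain_knowledge", "symbolization_experience"]
print(all(reaches(MODEL_GRAPH, f, "symbol_use") for f in factors))  # True
```

Because no factor has a direct arrow to the behavior, the sketch also captures the chapter's point that there is no linear path through the Model: every factor acts only through intervening variables.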
A. BEHAVIOR

We start at the endpoint of the Model. The behavior that is of interest is the appropriate use of a symbol. It could be reading a word, counting a set of objects, figuring out what merit badges a Boy Scout has earned from the patches on his uniform, getting the point of a New Yorker cartoon, and so forth. In the case of our scale model task, it is the retrieval of the hidden object (retrieval 1).
B. MAPPING

Mapping the elements of the symbol onto the corresponding elements of its referent is a necessary component of using a symbol as a source of information. To use a street map to navigate through an unfamiliar city, one must match the circles, lines, arrows, and so forth on the map with the represented features of the world. Mapping can be bidirectional: Although we normally think of going from symbol to referent, elements of a familiar referent can be mapped onto an unfamiliar representation. In the street-map example, some knowledge or familiarity with the city in question could help the erstwhile map reader interpret the elements on the map.

Mapping can be a serious challenge in symbol use. Indeed, mapping alphabet letters onto sounds and then mapping configurations of letters onto words is a major hurdle in learning to read. With a street map, some people (myself included) experience grave difficulty using a map when it is rotated with respect to the represented space. In the scale model task, mapping involves matching the individual elements (items of furniture) in the model with their counterparts in the room, as well as carrying the toy-hiding place relation over from model to room. In our standard task, mapping per se is not very difficult, primarily because of the high level of physical similarity of the two spaces. The main challenge lies in other aspects of the task.

C. REPRESENTATIONAL INSIGHT

The pivotal component of the Model is Representational Insight. A vital element in symbol use is the mental representation of the relation between a symbol and its referent. This includes both the fact that there is a relation between them and some knowledge of how they are related. The nature or level of that representation can vary greatly. It could be conscious, explicit knowledge that the symbol stands for its referent, such as an adult might have.
A touchstone example of explicit insight is Helen Keller's dramatic and life-transforming realization of the point of her teacher's finger movements. Representational insight could also be some inexpressible, implicit sense of relatedness, such as a young child might have, something more akin to the insight of Köhler's apes.

One can encounter and even learn something about a symbol and not realize that it stands for something other than itself. Preschool children happily memorize the alphabet song and number names with no awareness of what they stand for. Many American families have a story about the Pledge of Allegiance, which children often learn to recite before they are capable of understanding what it means. In my family, it was my younger brother's question, "What are witches' stands?"

Adults can similarly fail to recognize the symbolic import of a novel entity. A fascinating example concerns the small clay shapes (spheres, pyramids, disks, etc.) that were regularly turned up for centuries by archaeologists working in the Middle East. These objects were generally assumed to be of no particular interest until Schmandt-Besserat (1992) had the insight that the insignificant-appearing items actually constituted one of the earliest symbol systems. The clay shapes were tokens used for keeping track of trading exchanges. There was a one-to-one correspondence between token and referent, with different kinds of tokens used for each kind of goods traded; for example, a sphere stood for a measure of grain, two cylinders represented two animals. The appropriate number of tokens of different types were sealed up together, thus constituting a permanent record of the transaction. According to Schmandt-Besserat (1992), this token system gradually evolved into a sophisticated number system and constituted the foundation for the invention of writing.

On a less grand scale, representational insight in our model task refers to the child's realization, at some level, that the model and room are related.
As mentioned before, a full account of our data requires positing that successful performance depends upon the detection and representation of the higher order relation between the two spaces. According to the Model, Representational Insight supports mapping between symbol and referent. The child who appreciates the overall model-room relation has a basis for mapping the relation between the miniature toy and its hiding place in the model onto the larger toy and location in the room.

What determines the achievement of representational insight with respect to any given symbol-referent relation? The Model indicates that there are multiple factors that interact to facilitate or impede this awareness. The Model shows five factors, four of which we have extensively examined in the context of scale models, pictures, and maps. The five factors include aspects of the social context involved in the task, the nature of the symbol-referent relation, characteristics of the symbol itself, the child's general knowledge, and his or her prior experience with symbols. In the following sections, the evidence for the role of each of the components of the Model will be summarized, and the interactions among them will be discussed.

D. INSTRUCTION

The social context of symbol use is very important. We expect direct tuition by other people to be necessary for children to master many symbols. Reading is an obvious example. Although there are fascinating examples of children who seem to have learned to read on their own, the majority of children require direct, extensive instruction to learn to recite the alphabet, recognize letters, and read words.

The importance of communication from others about the nature and appropriate interpretation of symbols is not restricted to abstract, purely conventional symbols like letters and numbers. Even highly iconic symbols like realistic pictures are not necessarily interpreted correctly without social interaction and communication regarding their nature and use. Recent research in my laboratory (DeLoache, Pierroutsakos, Uttal, & Rosengren, 1995; Pierroutsakos, 1994) has established that 9-month-old infants are confused about the nature of two-dimensional stimuli. Although even younger infants can recognize pictured objects and can discriminate between objects and pictures (e.g., DeLoache, Strauss, & Maynard, 1979; Dirks & Gibson, 1977), our subjects treat depicted objects as if they share some properties of real objects. Specifically, 9-month-old infants manually explore and even attempt to grasp and pick up pictured objects. By 20 months, such manual exploration is rare. Presumably as a function of experience with pictures, these children have learned that pictures do not share tangibility with their referents. For middle-class American children, the majority of early picture experience comes in the context of interaction, specifically, in parent-infant picture-book reading interactions.

In the model task, instruction has been shown to play an important role.
Our standard task involves an extensive orientation in which children are told everything about the task, including a description of the two toys "liking to do the same things," a description of their rooms as being alike, and a demonstration and explanation of the correspondence between all the relevant objects in the two spaces (i.e., all the items of furniture to be used as hiding places for the toys). For this demonstration, the experimenter takes the miniature objects from the model into the room, holds each one up against its larger counterpart, and points out the correspondence between the two. Furthermore, the experimenter explains that she always hides the two toys in the "same places" in their rooms.

Experiments in which we have manipulated the instruction given the children about the model-room relation show a significant effect on the likelihood that the relation between the two spaces will be noticed. The extent and nature of instruction that is needed depends on the age of the child. With 2-1/2-year-olds, even our standard extensive and explicit instructions are not enough to lead them to apply their knowledge of the model to the room. Three-year-olds are very successful in the standard task with the explicit instructions.

Part of the standard instructions was omitted in a study by Marzolf, DeLoache, and Kolstad (1995). Instead of demonstrating all the individual correspondences between the items in the two spaces, the experimenter simply mentioned that Little and Big Snoopy had the same things: "Little Snoopy has a basket, a table, and a dresser; Big Snoopy also has a basket, a table, and a dresser." This was repeated for the rest of the objects. The performance of the 3-year-olds was good (71-81%). However, if a more minimal version of the standard instructions is provided, 3-year-olds' performance is dramatically poorer. In DeLoache (1989), none of the object correspondences were pointed out to the children, and their success rate was only 25%. Even 3-1/2-year-old children fail to appreciate the model-room relation with these minimal instructions (41%), but 4-year-olds are successful (75%) (DeMendoza & DeLoache, 1995).

The necessity for relatively explicit communication about the model-room relation shows that even highly iconic symbols like our scale models are not transparent to inexperienced symbol users. We assume that our instructions contribute to the achievement of representational insight in various ways. For one thing, they direct the children's attention to the respects in which the two spaces are similar; for example, they emphasize the individual object correspondences, which children must detect to succeed, and de-emphasize the size difference between the spaces.
Furthermore, telling children that the spaces are alike may lead them to actively compare the model and room and hence to discover additional similarities that the experimenter does not mention (e.g., the spatial relations among the objects, or features of the spaces such as windows, bookshelves, pictures on the walls, doors, etc.). In addition, instructions convey the general structure of the task, including the crucial parallel between the hiding events in the two spaces. In Gentner’s terms (Gentner, 1989; Gentner & Ratterman, 1991), instructions should foster structural alignment. They encourage children to compare their mental representations of the room and model, leading them to discover the common relational structure of the two spaces and the events that transpire within them. From this comparison, a representation of the higher order model-room relation emerges; in other words, representational insight is achieved.
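The machinery discussed here (object correspondences, the hiding-place inference rule, and Gentner-style structural alignment) can be caricatured in a few lines of code. This sketch is added for illustration and is not from the chapter; the object names and relation tuples are invented, and it is a toy rendering, not an implementation of structure-mapping theory.

```python
# Illustrative sketch only: object names and relation tuples are invented.
# A Figure 3B-style representation pairs each element of the model with
# its counterpart in the room (the correspondences the experimenter
# demonstrates during the orientation).
MODEL_TO_ROOM = {
    "little_snoopy": "big_snoopy",
    "miniature_chair": "full_sized_chair",
    "miniature_basket": "full_sized_basket",
    "miniature_dresser": "full_sized_dresser",
}

def infer_search_location(hiding_place_in_model):
    """The chapter's rule: if t is hidden in x in the model,
    then T will be in X in the room."""
    return MODEL_TO_ROOM[hiding_place_in_model]

# Structural alignment in Gentner's sense: translating the model's
# relational structure through the correspondence should reproduce
# the room's relational structure.
model_relations = {
    ("left_of", "miniature_basket", "miniature_chair"),
    ("hidden_in", "little_snoopy", "miniature_chair"),
}
room_relations = {
    ("left_of", "full_sized_basket", "full_sized_chair"),
    ("hidden_in", "big_snoopy", "full_sized_chair"),
}

def translate(relations, mapping):
    return {tuple(mapping.get(term, term) for term in rel)
            for rel in relations}

aligned = translate(model_relations, MODEL_TO_ROOM) == room_relations
print(infer_search_location("miniature_chair"))  # full_sized_chair
print(aligned)  # True
```

On this rendering, the instructions supply the entries of the correspondence table, and alignment succeeds when the translated relational structures coincide; a child who lacks the table (or the insight that it applies) cannot run the inference at all.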
E. SIMILARITY

An important component of the Model concerns the symbol-referent relation, specifically, the extent of physical similarity between symbol and referent. There is great variability in how closely symbols resemble their referents. Color photographs, realistic scale models, and representational statues have a high level of similarity or iconicity, whereas there is no physical resemblance whatever between letters and the sounds they represent, numerals and the quantities they stand for, or musical notes and the tones the musician should produce.

The proverbial man-on-the-street would assume that similarity should make symbol learning easier. The commonsense assumption would be that the more a symbol resembles its referent, the easier it would be to detect the relation between them and to map from one to the other. Our research basically agrees with this natural assumption, but with two caveats. The first has already been mentioned (and will be reiterated later): Iconicity, no matter how extensive, does not guarantee transparency to a novice symbol user. The second caveat is that, as will be described at length later, high iconicity can involve features that actually make a symbol more, rather than less, difficult for young children to interpret appropriately.

Given these caveats, we and the proverbial person-on-the-pavement generally agree that higher levels of similarity facilitate the acquisition and use of symbols. Part of what makes highly abstract symbol systems (such as letters, numbers, and musical notation) difficult to learn is their lack of iconicity. The absence of natural or prior links to their referents necessitates memorizing the set of symbols and learning by rote their relations to their referents and one another. Maps are partially iconic, in that they typically preserve one form of iconicity (the spatial relations among elements) but few others. There are, however, notable exceptions: The original London subway map made cartographic history by subordinating spatial accuracy to route information.
Many first-time visitors to London are unaware of this fact and hence find themselves hopelessly confused when they emerge from a tube station and their spatial position is not what they expected from their interpretation of the map. Realistic pictures, which have extremely high iconicity, can be "read" even by pictorially naive individuals, because most of the information about the referent is available in the picture.

As Fig. 4 shows, iconicity or physical similarity facilitates the perception of similarity between symbol and referent. Perception of similarity facilitates the achievement of both representational insight and mapping. In the model task, similarity has been shown to be very important, most notably with respect to the overall size of the two spaces, the surface appearance of the objects within the spaces, the appearance of the background, and the congruity of the spatial relations among the objects. As was also true for instruction, the effects of similarity differ as a function of the age of the children tested.
1. Scale
Diminishing the degree to which the model and room differ in overall size makes the task easier. DeLoache, Kolstad, and Anderson (1991) gave a group of 2-1/2-year-olds the standard model task (i.e., full instructions, highly similar objects), except that the "room" was only twice as large as the model. Thus, both were small-scale, surveyable spaces (the entire space could be seen without moving). A second group of children of the same age received the standard task in which the room was a surrounding space approximately 16 times larger than the model. In both conditions, the children could see only one space at a time. Performance was, as expected, poor (41%) in the standard, different-scale, condition, but the children in the similar-scale condition were significantly more successful (75%). Thus, increasing the overall physical similarity between the two spaces in terms of size facilitated the achievement of representational insight by children who would not otherwise have detected the model-room relation.

It is not clear exactly how the scale manipulation had its effect. One possibility is sheer perceptual similarity: the objects in the two spaces may have simply looked more alike. Another is that the relational similarity of the two spaces may have been more obvious, because all the objects within each space could be seen from a single vantage point. Furthermore, it could be that conceptual similarity also played a role; the two small-scale spaces may have been interpreted by our 2-1/2-year-old subjects as play or "pretend" rooms (i.e., as the same kind of spaces). These are not mutually exclusive possibilities, and all may have contributed to the improved performance observed in the similar-scale condition.

2. Object
An obvious source of similarity between the model and room is the physical appearance of the corresponding objects in the two spaces. Object similarity has been found to have a large impact on children's performance in the model task, but this time the effect appears for 3-year-olds. In the standard task, the corresponding objects within the two spaces were specifically constructed to be very highly similar to one another. For example, the same upholstery fabric covers the miniature and full-sized chairs, and the same fabrics are used for miniature and large tablecloths, couch pillows, and floor pillows. The model coffee table and end table are made of light wood, just like their larger counterparts.

Several studies have been performed using a model task with low object similarity (DeLoache et al., 1991; Marzolf & DeLoache, 1994). In this case, the miniature objects are of the same general category and shape as the larger ones in the room, but their surface appearance is quite different; for example, different fabrics covering corresponding objects, painted or contact-paper-covered surfaces instead of bare wood, and so forth. Three-year-old children, who are highly successful in the standard task with high object similarity, perform poorly with low object similarity (25%). Apparently, similarity serves as scaffolding for achieving representational insight. Three-year-olds are capable of representing the abstract relation between the model and room, but they need concrete similarity to do so.

It is worth noting that object similarity does not have to be complete to support 3-year-olds' performance. In the earlier model studies, only part of the furniture was highly similar: Of the four items of furniture used as hiding places, two were as similar looking as we could make them, and two, by design, were dissimilar. In more recent studies (with a different lab and model or with the portable room), all the corresponding objects were highly similar. Overall performance has been the same in the total and partial object similarity situations. Furthermore, no differences in the latter ever appeared for individual hiding places; children are equally successful with the similar and dissimilar hiding places. It appears that so long as there are some objects in the two spaces that look very much alike, 3-year-olds are able to detect the overall model-room relation and match the corresponding objects in the two spaces.

3. Background
DeLoache et al. (1991) also manipulated the similarity of the background in the model task (i.e., the walls of the model were either the same white fabric as in the portable room or they were made of cardboard covered with white contact paper). For both age groups, performance was slightly but not significantly higher when the backgrounds matched. This result provides further, although weak, evidence of the effect of physical similarity: The background does not seem to be very important, but it is not completely irrelevant either. Presumably, similarity of the background simply adds to the overall perceptual similarity of the two spaces.

4. Physical Similarity: Summary
Figure 5 shows the interaction between age (2-1/2- and 3-year-olds) and three sources of similarity (scale, objects, and background). At the two extremes, the two age groups do not differ. With very low similarity, both groups fail to relate the model to the room; and with very high similarity, both groups are successful. At the intermediate levels, however, the older children succeed with lower levels of similarity than do the younger children.

Fig. 5. Performance in the model task increases as a function of three sources of perceptual similarity. (Data from DeLoache, Kolstad, & Anderson, 1991.)

5. Relational Similarity

Another potential source of similarity is the relations among the objects within the two spaces. In the standard model task, all the objects in the two spaces are in the same spatial arrangement. For example, the miniature and full-sized chairs are always in the back right corners of their respective spaces. The other objects also maintain the same relative positions, so the relations among the objects are always the same. Thus, the chair in both spaces is always to the right of the basket and to the left of the dresser. The high level of relational similarity in the standard task means that the relations among the objects in one space correspond to and can be mapped onto the relations in the second space.

Are young children sensitive to relational similarity in this task? To address this question, Marzolf, DeLoache, and Kolstad (1995) disrupted relational similarity by rearranging the items of furniture in the model. Thus, the miniature chair was in the back right corner of the model, next to a table, but the large chair was in the front left corner of the room, next to a plant. The results of three studies assessing the effect of relational similarity on performance were interesting and complex.

In the initial study in the series reported by Marzolf et al. (1995), the degraded relational similarity had no effect on 3-year-olds' performance. In this study, we employed our standard model task with high object similarity and the full instructions that explicitly describe and demonstrate the object correspondences. Performance with the disrupted spatial arrangements (and hence low relational similarity) was 75%, a figure that is clearly the same as that typically observed for 3-year-olds in the standard model task with high relational similarity. This result suggested that 3-year-old children do not notice and represent relations among the objects in the model task. However, two more studies
on the role of relational similarity paint a different picture. In one, we tested a group of 3-year-olds in exactly the same situation as in the previous study with one exception: We gave less explicit instructions about the object correspondences; that is, we gave the modified instructions described earlier, in which, instead of pointing out the relation between each individual object from the model and its counterpart in the room, the experimenter commented on the correspondence of three items at a time. A control group received the same modified instructions, but the objects in the two spaces were in the same spatial arrangement. In this study, a significant effect of relational similarity appeared: Children in the Different Arrangement condition performed significantly less well (33%) than those in the Same Arrangement control condition (81%). Thus, when object correspondences were not emphasized and relational similarity was disrupted, 3-year-old children failed to detect the model-room relation. These results show that 3-year-old children are affected by relational similarity in the scale model task, and their performance is not based simply on detecting object correspondences.

A third study in this series illuminated further the role of relational similarity in the model task. Here we asked whether prior success in the standard task would enable children to cope with noncongruent spatial relations. Accordingly, a group of 3-year-old children was first given the standard model task, but with the modified instructions (i.e., with the individual object correspondences not highlighted in the orientation). As in the study discussed above, the children performed successfully (71%). The same children then came back a second day and were tested in the Different Arrangement task. In contrast to the 3-year-olds' poor performance in this task in the preceding study, these children were quite successful (73%).
Prior experience with a model task enabled them to map from model to room, even though the spatial relations among the objects within the two spaces had been disrupted. These three studies establish that 3-year-old children are capable of detecting and representing relations, and that relational similarity supports their performance in the standard model task. The results thus disagree with the suggestion made by Blades and Spencer (in press) that successful performance in the model task could be accounted for if children "simply noted the hiding place in the model (i.e., the name of an item of furniture) and then went into the room to look for that item of furniture" (p. 31).

The results fit Gentner's (1989) framework for analogical reasoning in two ways. First, similarity of relational structure helps children align the corresponding elements in the two spaces and to reason from one to the other. Second, consistent with the relational shift hypothesis, object correspondences have much more impact on performance than relational similarity does for the very young children in this research.

The general story regarding similarity is that, as stated in the beginning of this section, more is better: Increasing physical and relational similarity typically facilitates the detection of symbol-referent relations. However, there are, as mentioned earlier, two caveats to this general principle. The first, which is a major theme of this chapter, is that iconicity does not guarantee transparency of a symbol-referent relation. As our research with pictures and scale models makes clear, inexperienced symbol users can be remarkably impervious to symbol-referent relations that others think are obvious. Parents accompanying their young children to our lab are frequently astonished when their children fail the model task. Generally, adults should not assume children understand a given symbol or symbolic medium in the same way they do, from the "beauty pageant" for alphabet letters on Sesame Street to the vacation snapshots in the family photo album.

The second caveat is that similarity can actually interfere with symbol understanding and use in some ways. One example comes from children learning to read. Bialystok (1991, 1992) has shown that beginning readers often assume that there should be some iconicity between printed words and the concepts or objects they represent. They expect a relation between the size of a word on a page (i.e., the number of letters in it) and the size of the object the word represents. For example, children who could not yet read the words Banana and Car were asked to place cards containing those words with pictures of their referents. Many children placed the banana card with the picture of a car, because banana is a long word and cars are large objects, and they placed the smaller word car with the smaller object.
Errors of this sort indicate that young children confuse the properties of a symbol and its referent when they are first learning to use a symbol system, and that they attempt to use iconicity as a strategy for relating a symbol to its referent. In many cases this strategy would be appropriate; in the case of totally arbitrary symbols such as printed words, it is counterproductive.

Another example of iconicity interfering with children's symbol use comes from research on early map reading. Liben and Downs (1989, 1992) asked preschool and young elementary school children to identify and explain some of the symbols on common maps. Most children correctly interpreted blue areas on the maps as bodies of water. However, many interpretive errors revealed a misplaced reliance on iconicity. When told that a red line on the map was a road, one child objected, arguing that roads are not red. Another child said the red line could not be a road because it was too narrow for a car. When shown an aerial photograph of
a city, one child correctly identified a river, but then labeled a nearby area "cheese," apparently because it looked like a large (a very large) hunk of Swiss cheese. Thus, in interpreting novel symbols, young children use iconicity as a sometimes inappropriate and misleading fallback strategy. Just as young children's best guess is that a big object will be represented by a big word, they also think map symbols should share the physical properties of the entities they represent.

An additional caveat to the generally positive effect of similarity stems from the characteristics of highly iconic symbols themselves. This point is the focus of the next section of this chapter.
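Bialystok's word-size errors amount to a systematic misapplication of iconicity, which might be caricatured as follows. This sketch is added for illustration only: the pairing procedure and example words are mine, not the method of the studies cited.

```python
# Caricature of the beginning reader's erroneous iconicity heuristic:
# assume that longer printed words name bigger things.
# Example words and sizes are invented for illustration.
words = ["banana", "car"]                       # printed word cards
objects = {"car": "large", "banana": "small"}   # pictured referents

def iconicity_guess(word_cards, object_sizes):
    """Pair the longest word with the biggest object: the error
    pattern Bialystok reports, not a correct reading strategy."""
    by_length = sorted(word_cards, key=len, reverse=True)
    by_size = sorted(object_sizes, key=lambda o: object_sizes[o] != "large")
    return dict(zip(by_length, by_size))

print(iconicity_guess(words, objects))  # {'banana': 'car', 'car': 'banana'}
```

The heuristic reproduces the reported error: the long word *banana* is paired with the large pictured car, and the short word *car* with the small pictured banana, because for arbitrary symbols like printed words physical resemblance carries no information about the referent.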
F. DUAL REPRESENTATION

Symbols of the sort being considered in this chapter have a "dual reality" (Gibson, 1979; Gregory, 1970; Potter, 1979; Sigel, 1978) in that they have both a concrete and an abstract nature.

    The information displayed is dual. The picture is both a scene and a surface, and the scene is paradoxically behind the surface. The duality of the information is the reason the observer is never quite sure how to answer the question, 'What do you see?' (Gibson, 1979, p. 281)

    Drawings, paintings, and photographs are objects in their own right - patterns on a flat sheet - and at the same time entirely different objects to the eye. . . . Pictures are unique among objects; for they are seen both as themselves and as some other thing, entirely different from the paper or canvas of the picture. Pictures are paradoxes. (Gregory, 1970, p. 32)
Early Symbol Understanding and Use

A dimension on which symbols vary is the salience of their physical manifestation. Although some symbols exist as representations and nothing else, others are salient as objects as well as symbols. Indeed, many are designed to attract our attention to both their concrete and their abstract qualities. We admire the exquisite marble and the carving in Michelangelo's David, as well as the evocative portrait of an apprehensive young man; and we are amused and intrigued by the puffy fabric constituting an Oldenburg drum set. Indeed, it is possible to be so interested in the concrete features of such works that one fails to appreciate the more abstract concepts that the artist is attempting to convey. To understand and use a symbol, one must achieve dual representation; that is, one must represent both facets of its dual reality, both its concrete characteristics and its abstract function. Early accounts of the failure of "primitive" people to perceive pictures often stemmed from the initial failure of these pictorially inexperienced individuals to achieve dual representation. Sometimes they treated the unfamiliar item as a novel object, examining its properties by feeling the paper or turning the picture around to look at its back, ignoring its content. Other times they responded to a picture's content as though it were real, ignoring its two-dimensionality. The best example of this is the famous story of natives being frightened by a projected slide of an elephant. Much was made of these people's initial focus on only one facet of pictures, but this restricted focus was short-lived: Fascination with the paper itself gave way rapidly to attention to the content of the picture, and the direct response to the depicted object turned into interest in the novelty of its two-dimensional nature (see Deregowski, 1989). Pictorially experienced people of any age are aware of responding to pictures only in terms of their content. We "see through them" to their referents. However, we also represent their dual nature, as evidenced by the fact that we never confuse pictures with reality, regardless of how evocative they are.

A scale model, and especially a highly realistic one such as we use in our model task, is a symbol with a particularly high degree of salience as an object. The model is three-dimensional, complex, interesting, and attractive in its own right (especially to very young children). However, the very features that make the model interesting and appealing to young children make it more difficult for them to achieve dual representation: to represent the model both as an object (or set of objects) and at the same time as a symbol for something other than itself. To succeed, the child must form a meaningful representation of the model as a miniature room in which toys can be hidden and found, and the child must interact with it in this sense. In addition, the child must represent the model as a term in an abstract, "stands for" relation and must use that relation as a basis for reasoning and drawing inferences.
The need for dual representation is thus a hurdle in the achievement of representational insight. According to the dual representation hypothesis, the more salient a symbol is as a concrete object, the more difficult it is to recognize and represent its abstract nature. This should apply to all symbols and to symbol users of any age, but it should be particularly problematic for individuals with relatively little symbolic experience. The dual representation hypothesis has been tested in several studies, and a number of predictions generated by it have been confirmed. The support for the hypothesis is particularly strong, because all these predictions were highly counterintuitive on any grounds other than the concept of dual representation. One series of studies testing this hypothesis followed the logic that changing the salience of a scale model should in turn change the difficulty of using the model as a symbol. Accordingly, in one study, we attempted to
increase the salience of the model as an object, expecting this to make it more difficult to achieve dual representation and hence predicting a decrement in children's performance. Three-year-old children (the age group that typically succeeds in the standard model task) were given 5 to 10 min of experience simply playing with the model before participating in the standard model task. We reasoned that interacting with the model in a nonsymbolic context would make it more difficult to appreciate its symbolic role in the model task. The prediction was supported: The children given the extra experience with the model performed more poorly (44%) in the model task than 3-year-olds normally do. In the second experiment in this series, we did the opposite: We tried to decrease the salience of the model as an object, a manipulation designed to make it easier to achieve dual representation. We predicted improved performance by 2-1/2-year-olds, the age group that typically fails the standard model task. To decrease its salience, we placed the model behind a window in a puppet theater. The child never touched the model or its contents, nor did the experimenter. To indicate the relevant location, she simply pointed to the appropriate place in the model. As predicted, the performance of the 2-1/2-year-olds in the window condition was significantly better (54%) than that of a comparison group in the standard model task. As these two studies show, increased physical access to the model makes children less able to appreciate its relation to the room, whereas distancing children from the model makes them more able to use it as a source of information about something else. Physical separation from the model apparently helps them achieve psychological distance from it (Sigel, 1970).

1. Pictures
It follows from the dual representation hypothesis and the above two studies that further diminishing the salience of the symbol by which information is communicated should improve young children’s performance even more. Pictures are relatively nonsalient and uninteresting as objects, so they should be less challenging in terms of dual representation. There is nothing about a picture to distract children from its primary, representational role. We thus conducted a series of studies testing the prediction that pictures would be easier than models, or, more specifically, that 2-1/2-year-old children should be better able to find a toy hidden in a room if its location were communicated via a picture than by a scale model. This prediction is counterintuitive in that pictures are generally known to be less effective than three-dimensional objects at supporting learning, memory, categorization, and other cognitive activities (e.g., Daehler, Lonardo, & Bukatko, 1979; DeLoache, 1986; Hartley, 1976; Sigel, Anderson, & Shapiro, 1966; Steinberg, 1974).
Several studies have supported the picture-superiority prediction of the dual representation hypothesis (DeLoache, 1987, 1991; Marzolf & DeLoache, 1994). In all of them, the experimenter pointed to the relevant picture, saying to the child, "This is where Snoopy is hiding in the room; he's hiding back [under] here." The nature of the pictures used has varied: a wide-angle color photograph of the room showing all the relevant hiding places, a lightly colored line drawing of the same scene, and a set of color photographs of individual items of furniture or hiding places. The results, which are shown in Fig. 6, are highly consistent. In every study, 2-1/2-year-olds who were informed via pictures about the location of a hidden toy performed very well (70 to 85%), significantly better than the same children in the standard model task (15 to 25%). Thus, the series of picture studies supports the claim that 2-1/2-year-olds have difficulty with the model task because it is hard for them to represent the model in two different ways at the same time. Another study in this series (DeLoache, 1991) established that the picture-model difference in performance was a function of the different media and not some other factor, such as whether there was a hiding event. Figure 7 illustrates the four conditions in this study, the first two of which were replications: (1) Hide-Model: the experimenter hid a miniature toy
Fig. 6. Picture-superiority effect: Comparison of 2-1/2-year-old children's performance in the standard model task (hatched bar) versus various picture tasks (black bar). Very young children more successfully use pictures as an information source than they do a model. *Between-subjects design; all others were within-subjects. (Data from DeLoache, 1991.)
Fig. 7. Four conditions resulting from the combination of two methods of denoting the relevant hiding place in the room (hiding a miniature toy vs. pointing) with two types of media (scale models vs. pictures). (DeLoache, 1991; reprinted with permission of the Society for Research in Child Development.)
in the model; (2) Point-Picture: the experimenter pointed to the relevant one of four photographs; (3) Point-Model: the experimenter simply pointed to the correct place in the model; and (4) Hide-Picture: the experimenter hid the miniature toy behind one of the four pictures. Each of the 2-1/2-year-old subjects participated in one condition. The retrieval 1 results are shown in Fig. 8. Performance in the two replication conditions (Hide-Model and Point-Picture) closely replicated previous results. Performance in the crucial condition, Point-Model, was exactly the same as in the standard model task (Hide-Model), the result that was expected based on the dual representation hypothesis. If children fail to detect the model-room relation because of the salience of the model, then it should not matter whether information is communicated to them by a hiding event or a simple point; in either case, they have no basis for knowing where to find the toy in the room. Additional, particularly interesting support for the dual representation hypothesis came from the Hide-Picture condition, in which the child
Fig. 8. Performance as a function of medium (scale model vs. pictures) and method (hiding vs. pointing). (DeLoache, 1991; reprinted with permission of the Society for Research in Child Development.)
watched as the experimenter hid the miniature dog behind one of the four pictures. Performance in this condition was zero; not a single child ever made a successful retrieval. We think that this condition was so devastating for two reasons. First, dual representation was involved because we were asking the children to treat something (a picture) both as a symbol (a picture of something) and as an object (a hiding place). The children in this condition apparently succeeded in responding to the pictures as objects, in that they had a high retrieval 2 score (74%). However, it appears that treating the pictures as objects blocked interpreting them as symbols. After watching the miniature toy being hidden behind the picture of the couch, the children knew where to find that toy, but they had no idea that the picture told them where to find the larger toy in the room. The second problematic feature of this task was that it violated the normal function of pictures. Pictures are not normally treated as objects, something that even 20-month-olds know (Pierroutsakos, 1994). Thus, the Hide-Picture condition constituted an anomalous use of symbols; having learned to respond to pictures only in terms of their representational content, our subjects were stymied by the need to treat them as both symbol and object.
2. The Incredible Shrinking Troll

We recently conducted an even more stringent test of the dual representation concept by attempting to eliminate the need for it (DeLoache, Miller, Rosengren, & Bryant, 1993). In this study, we endeavored to convince a group of 2-1/2-year-old children that a "shrinking machine" could shrink
a room. Our reasoning was that if children believe the machine has shrunk the room (into the scale model of that room), then there is no representational relation between model and room. Instead, there is an identity relation: the model is the room. Hence, the task of retrieving the hidden toy is simply a memory problem. We thus predicted that 2-1/2-year-olds, who typically fail the standard model task, would succeed in the nonsymbolic shrinking room task. In the orientation, the child was introduced to "Terry the Troll" (a troll doll with wild fuchsia hair) and was shown "Terry's room" (the tentlike portable room). Then the shrinking machine (an oscilloscope with flashing green lights) was introduced, and its remarkable powers were demonstrated. The troll was placed in front of the machine, it was "turned on," and the child and experimenter waited in the adjoining area, listening to computer-generated "sounds the shrinking machine makes while it's working." The child then returned to discover a miniature troll in place of the original one. Figure 9 shows the troll before and after the "shrinking event." The experimenter then demonstrated that the machine could also make the
Fig. 9. The incredible shrinking troll: (A) shows the troll positioned in front of the shrinking machine; (B) shows the troll after the shrinking event.
troll “get big again.” A similar demonstration was conducted of the machine’s power to shrink and enlarge Terry’s room. The sight of the model in the middle of the area previously occupied by the portable room was even more dramatic than that of the miniature troll in place of the larger one. The child then watched as the experimenter hid the larger doll somewhere in the portable room. After waiting while the machine shrank the room, the child was asked to find the hidden toy. The miniature troll was, of course, hidden in the model in the place corresponding to where the child had seen the larger troll being hidden in the room. Thus, just as in the model task, the child had to use his or her knowledge of where the toy was hidden in one space to know where to search in the other. The first question was whether the children believed the scenario presented to them. In an effort to assess the children’s belief, independent of their retrieval performance, the two experimenters and the accompanying parent rated the level of the child’s belief that the model actually shrank the troll and the room. On a scale ranging from 1 (“firmly believes”) to 5 (“does not believe”), the average ratings were 1.1 by the experimenters and 1.5 by the parents. (The reader should keep in mind that most of these children also believe in the tooth fairy and other violations of natural law.) We always debriefed the children fully at the end of the session, showing them the two dolls and the room and model together and explaining that we were “just pretending.” Most children appeared uninterested in our explanation, and several, after listening to it, responded, “Can we shrink him again?” Figure 10A shows that, as predicted, performance was significantly better in this nonsymbolic task (76%) than in a control task (19%) involving the usual symbolic relation between the model and room. 
This superior performance occurred even though the shrinking room scenario was more complicated and the delay between the hiding event and the child's retrieval was much longer than in the standard model task. This study thus provides very strong support for the dual representation hypothesis. The success of the shrinking room procedure and the logic of dual representation suggest that similar improvement in performance should occur in any shrinking room version of a model task, as long as the resulting memory task is within the competence of the age group tested. Accordingly, we devised a shrinking room task appropriate for 3-year-olds. It was based on the task used by Marzolf et al. (1995, Experiment 2), in which the objects in the two spaces were of high similarity but were in different spatial arrangements and the instructions omitted explicit description of the object correspondences. In the previous study with this procedure, performance was only 33%.
Fig. 10. Performance in two shrinking room studies: (A) 2-1/2-year-olds in the original study (DeLoache, Miller, Rosengren, & Bryant, 1993); (B) 3-year-olds in the shrink and scramble study.
In the shrinking room version of this task, the experimenter's description of the machine's ability to shrink the room briefly included the statement that the machine also "can change things around." The furniture in the two spaces was always arranged differently. We know that this age group (and even younger children) can succeed in a comparable memory task; that is, if children see an object hidden in one of a set of distinctive containers, and the positions of the containers are then scrambled, they can still retrieve the hidden object (e.g., DeLoache, 1986; Horn & Myers, 1978; Perlmutter et al., 1981). Fig. 10B shows the children's performance in the shrink and scramble study, in comparison to the performance of the same age group in the relevant model task (Marzolf et al., 1995). As predicted, the level of errorless retrievals was significantly higher (72%) in the "shrink and scramble" condition than in the comparable model task. These results constitute a strong replication of the original shrinking room study in that the hypothesis was supported for two different age groups in two different tasks. We have very recently used the procedure successfully with 3-year-olds in yet another task. The importance of the shrinking room studies is that they make clear that the nature of the child's representation of the model-room relation is crucial. No matter how implausible it is to an adult, the shrinking machine scenario apparently makes sense to young children. It provides a plausible mechanism for a causal relation between the two spaces. Thus, in the
shrinking room condition, the children represent the relation between the room and model as identity. Because the model is the room, whatever the child knows about one applies to the other. I should point out that our young subjects' performance in the shrinking room studies provides evidence of a very early form of conservation or identity preservation in the face of a substantial perceptual transformation. Piagetian conservation involves the belief in identity (of substance, quantity, number, etc.) in spite of a perceptual transformation. Thus, conservers understand that certain transformations (e.g., pouring liquid from a tall, thin glass into a short, wide one) alter the appearance but not the amount of some entity. In a related vein, Keil (1989) argued that even young children base their judgments about identity and change on their more general knowledge, their theory of domains such as biological kinds. Children from kindergarten through the fourth grade were shown pictures of animals that were then described to undergo some fairly dramatic transformations. For example, a picture of a raccoon was presented and labeled. Then a new picture was shown, and the depicted animal was said to have been painted black and white and to have undergone an operation that inserted "super smelly yucky stuff" under its tail. The children were asked whether the animal was a raccoon or a skunk. All age groups discounted the altered perceptual appearance of the creature, maintaining that it was still a raccoon, even though it now looked like a skunk. In Keil's research, as well as in classic conservation tasks, children's reasoning is based upon the same biological and physical principles that adults use in reasoning. In the case of the shrinking machine, children's reasoning is based on the incorrect information we supply.
In all these cases, however, it is children’s “understanding” of the nature of the causal mechanism that changes the appearance of the display that supports their belief in identity. The main point here is that any explanation of children’s performance in the shrinking room or the standard model studies must invoke the nature of the children’s mental representation of the model-room relation. The shrinking machine scenario leads children to represent a causal relation between the two spaces, and their representation of this relation supports reasoning from one to the other. Young children apparently find it much easier to represent and reason based on a causal than a “stands for” relation. It is worth noting that the negative effects of rich, real objects on young children’s reasoning are not limited to the symbolic domain or to our tasks or subjects. I will give two rather disparate examples of similar phenomena. Ratterman, Gentner, and DeLoache (1990) asked preschool children to use a relative size rule to reason from one set of three objects to another.
The child was shown that a sticker was under one of the objects in the experimenter's set, for example, the largest of the three. To find the sticker in his or her set, the child had to realize that it could be found under the object of the same relative size (e.g., the largest one). Sometimes, the largest object in the child's set was identical to the middle-sized object in the experimenter's set; thus, to find the sticker, the child had to disregard the object matches and choose on the basis of the relational (same relative size) matches. For some children, the objects in each set were all of the same type and were relatively plain and uninteresting (e.g., flower pots). For other subjects, the three objects were all of different types, and each was relatively rich and interesting (e.g., flower pot with flowers, toy house, decorated cup). The children were more successful at learning and applying the general rule to the sparse, relatively uninteresting objects. In particular, when object matches had to be ignored in favor of relational matches, they had more difficulty doing so in the rich objects condition. Thus, detecting and using an abstract relation was hindered by the presence of interesting (distracting) objects. A similar negative effect of real objects was found in a very different situation with very different subjects. Boysen (personal communication, November 12, 1994) was working with chimpanzees who had already been taught numbers; they knew that the numeral "2" signified two objects, "3" signified three objects, and so on. She had attempted to teach these chimpanzees an abstract rule: the subject would receive whichever of two displays he or she did not choose. In spite of extensive training with displays of real, desirable objects (such as candies), the chimps could not inhibit responding to the display that contained more objects.
Thus, when given displays of two versus five candies, the chimps pointed to the five-item set even after repeatedly seeing the larger number go to a different animal or be dumped in a bucket, while the subject received the smaller set. Then Boysen shifted from the real-object displays to numerical displays. The chimps now succeeded in applying the "choose the one you don't want" rule. In other words, when given a choice of "2" versus "5," they reliably pointed to the numeral "2," thus ensuring that they would receive the larger quantity for themselves. Again, we have an example of abstract, relational reasoning being hindered by real, attractive objects, but facilitated by symbols with no interest value of their own.

G. KNOWLEDGE
Two aspects of the Model concern the symbol user him- or herself. One is domain knowledge, either of the symbol (or symbol system) or of the
referent. There is a wealth of evidence for the contribution of domain knowledge to analogical reasoning (e.g., Goswami, 1992), concept formation and use (Carey, 1985; Gelman & Markman, 1987; Keil, 1989), scientific reasoning (e.g., Chi, Feltovich, & Glaser, 1981), and other domains. The Model indicates that domain knowledge should facilitate performance by making it more likely that the similarity between symbol and referent is noticed and by making it easier to map from one to the other. With respect to map use, for example, it should be easier to match the features of a map to landmarks in a city with which one is already somewhat familiar. Similarly, it should be easier to apply a well-studied map to unfamiliar terrain. First learning the distinctive perceptual features of alphabet letters may make it easier for children to learn specific letter-sound correspondences when reading instruction begins. Prior familiarity with the keyboard might help the beginning pianist map the middle F# on the score onto the middle F# on the piano. Although we think it likely that in general domain knowledge should affect symbol understanding and use, we have no positive evidence to date from our research. Indeed, we have evidence that with respect to the standard model task, familiarity with neither the symbol nor the referent improves performance. One relevant study was described earlier, in which 3-year-old children were exposed to the model for 5 to 10 min before participating in the standard model task. This experience, as predicted on the basis of dual representation, made the children less successful at using the model as a symbol than they would otherwise have been. In another study, we gave a group of 2-1/2-year-old children substantial exposure to the room itself to see if familiarity with the referent would make it easier to detect the symbol-referent relation. These children came to the lab in small groups nine times over three weeks.
During each visit, they participated in a variety of activities in the room. Each time, the experimenter at least once casually called their attention to each of the four items of furniture that would serve as hiding places in the subsequent model task. There was no discernible effect of familiarization; these children had 15% errorless retrievals in the standard model task, the typical level of performance for children of their age. Thus, unlike the other components of the Model, domain knowledge does not have direct empirical support from scale model studies. However, it remains in the Model on the assumption that it has to be important in some symbol-referent relations and that it might apply to scale models in situations we have not yet investigated. For example, referent familiarity might be useful when mapping is itself more challenging than is the case for most of our studies to date.
H. SYMBOLIC EXPERIENCE

The final component of the Model that we have investigated extensively concerns the child him- or herself, specifically, the amount of experience that the child has had with various symbols and symbolic media. Experience is crucial in that it constitutes the primary developmental component of the Model. As Fig. 4 shows, the Model stipulates a cumulative effect of experience with symbols. Such experience leads to the development of Symbolic Sensitivity, a general readiness or proclivity to interpret a novel entity primarily in terms of something other than itself. The more experience a child has had taking a representational stance to a given entity, the more likely he or she is to recognize when such a stance is appropriate with a different entity. Thus, achieving representational insight with one symbol increases the likelihood that the child will achieve representational insight with new symbols. With age, children are naturally exposed to an increasing number and variety of symbolic representations, and the number of occasions on which they achieve representational insight increases. As a consequence, their general sensitivity to the potential existence of novel symbols increases with age. Thus, older children should have a lower threshold for interpreting a new entity as a representation than younger children do. Adults are so experienced at adopting a representational stance that we do so automatically and unconsciously. A variety of cues signal that some entity is probably a symbol, and we immediately think of it in that way. For example, marks on a surface that appear to have been intentionally made are probably always evaluated as potential symbols. If the marks form a recognizable pattern, either a picture or a familiar symbol, the hypothesis is confirmed. Even if the marks are not recognizable, the adult may be fairly sure that they are symbols.
The finders and students of the Rosetta Stone probably never seriously considered the possibility that the marks on its surface were not symbols; the sole issue was how to decode those symbolic markings. Similarly, an adult who sees an aboriginal spear with surface markings would at least consider the hypothesis that the markings were meaningful symbols, although he or she would be unlikely to correctly interpret them as maps. Recognizing the symbolic import of various entities can be quite challenging to young children, as has been well documented with respect to numbers and letters. A substantial body of research focuses on "print awareness": young children's dawning realization that the marks on the pages of books carry meaning (Teale & Sulzby, 1986). However, the challenge lessens with age and experience; an older child who encounters a novel calculus symbol
or a representation of the DNA double helix would immediately assume they were symbols for something. The Model thus specifies that age-related increases in performance on symbolic tasks like our model task are in large part attributable to the amount of experience a child has had with other symbol-referent relations. By this account, one would also expect individual differences in performance as a function of symbolic experience. I should emphasize here that the claim is not that symbolic experience is the sole factor responsible for improved performance with age; there are, of course, a variety of cognitive and other changes that should also contribute to this development, including advances in language, memory, self-control, and others. Consistent with the concept of symbolic sensitivity, age (and presumably amount of symbolic experience) interacts with factors known to be related to representational insight. As has been described before, older children require less explicit instruction than do younger children, and older children can detect a model-room relation with a lower level of physical similarity between the two spaces. There are, of course, many developmental changes that probably contribute to these age differences; although we think that children’s experience with symbols is likely to play an important role, experimental evidence for the role of previous experience is needed to support this component of the Model. The primary experimental support for the effect of experience and the role of symbolic sensitivity in the Model comes from a series of transfer studies. These studies have followed the logic that experience with a relatively easy symbol-referent relation that they understand should make it more likely that children would subsequently detect a more difficult symbol-referent relation (one that they would otherwise not interpret correctly). 
Table II summarizes some of the transfer studies we have carried out, several with 2-1/2-year-olds and one with 3-year-olds. All of these studies involved two days of testing. On the first day, a group of children performed successfully in a scale model or picture task known to be well within the competence of their age group. On the second day, they were given a second, more difficult task, one that their age group typically fails. The question was whether success in the first task would increase the likelihood of success in the second. These studies thus tested the hypothesis that achieving representational insight in one symbolic task should increase symbolic sensitivity, hence facilitating the achievement of representational insight in a different task. Two types of control procedures were used in these studies. In some, the order of the two tasks was counterbalanced; hence, performance in the difficult task on day 2 (following the easy task) should be better than
Judy S. DeLoache
TABLE II
EXAMPLES OF SUCCESSFUL TRANSFER STUDIES INVOLVING TRANSFER FROM EASY (TRAINING) TASK ON DAY 1 TO DIFFICULT (TRANSFER) TASK ON DAY 2

Age               Training task                   Transfer task
2-1/2-year-olds   Pictures                        Standard model^a
                  Pictures (Room 1)               Standard model (Room 2)^a
                  Pictures                        Low object-similarity model^b
                  Similar-scale model             Different-scale model^c
                  Similar-scale model             Map^c
                  Similar-scale model             Different-scale model, different context^b
3-year-olds       High object-similarity model    Low object-similarity model^c

^a DeLoache (1991). ^b Marzolf and DeLoache (1995). ^c Marzolf and DeLoache (1994).
performance in the difficult task on day 1 (before any relevant successful experience). In other studies, a separate control group received the difficult task on both days, thus providing a baseline against which to evaluate any increase from day 1 to day 2 by the transfer group. Figure 11 shows a summary of the general pattern of results we have found across the several transfer studies using the latter control group.
Fig. 11. A summary of performance in several studies of transfer in the model task. The transfer groups were given an easy task on day 1 and a difficult task on day 2; the control groups received the same difficult task on both days. (Based on data reported in Marzolf & DeLoache, 1994.)
Early Symbol Understanding and Use
Specifically, the transfer group performs well, as expected, on day 1. As predicted by the Model, their performance in the difficult task on day 2 is significantly better than that of the control group on either day 1 or day 2. In some transfer studies with significant mean differences, nearly all the children in the transfer group are successful, whereas in other studies, only some of the children master the transfer task. In either case, individual performance is highly consistent with the hypothesized effect of prior successful experience. As Table III shows, consistent with the hypothesis, children who succeed in the easy day 1 task are highly likely to succeed on day 2, whereas children who perform poorly on day 1 usually perform poorly on day 2 as well. The crucial cell, the one that could disconfirm the hypothesis, is the number of children who fail on day 1 but succeed on day 2; there are extremely few cases that violate the logic of the transfer hypothesis. To examine the generality of transfer, we have varied the degree of similarity between training and transfer tasks. In some studies, both tasks used the same space. For example, in one study 2-1/2-year-olds first experienced a picture task in which the pictures depicted the items of furniture in a standard room (DeLoache, 1991, Experiment 2). Their transfer task involved the same room and a scale model of it. In a different study, 3-year-olds were given the high object-similarity task using the portable room and tested for transfer to the low object-similarity task, which involved different objects but the same space (Marzolf & DeLoache, 1994, Experiment 2). In another study with 2-1/2-year-olds, the picture task was given with respect to one room, and the scale model task took place in a different room down the hall (DeLoache, 1991, Experiment 3). In all cases, significant transfer occurred.
TABLE III
RELATION BETWEEN PERFORMANCE IN TRAINING AND TRANSFER TASKS ACROSS SEVERAL STUDIES: NUMBER OF SUBJECTS WHO WERE SUCCESSFUL AND UNSUCCESSFUL IN THE TWO TASKS^a

                                  Day 2 Transfer (hard) task
Day 1 Training (easy) task        Successful      Unsuccessful
  Successful                      31              8
  Unsuccessful                    3               20

^a Success = three or four errorless retrievals in four trials.
The most impressive level of generalizability of transfer we have obtained to date comes from a recent study in which 2-1/2-year-old children were first given the similar scale model task (high object similarity and a small difference in overall size between the scale model and the larger space) (Marzolf & DeLoache, 1995). On the first day, they were tested by one experimenter in a room in a building on one side of campus. The hidden toys were the Big and Little Snoopy dogs. For the transfer test on day 2, the children were brought to a different building on another part of the campus where they met a different pair of experimenters. They were given the standard model task with the trolls as hidden objects. Thus, these children were asked to transfer based purely on the similarity of the general structure of the two tasks. The instructions, the general nature of the hiding events, and the child's role were essentially the same in both tasks, but there was no lower level support for detecting a relation between the two tasks. Figure 12 depicts the results of this study. Clearly, the 2-1/2-year-old children's experience in the easy task helped them to figure out the difficult task. The basis for transfer in this study cannot be object matches, because there were none, nor could it involve familiarity of setting or people. Instead, transfer had to be based on something more abstract, such as the fact that in both cases a symbol-a scale model-served as a source of information about another space. The achievement of representational insight in the first task appears to have made the children more sensitive to the possibility of another symbol-referent relation, one that they would not have detected without that prior experience.
Fig. 12. Transfer by 2-1/2-year-old children with very little contextual support. The children received an easy task on day 1 and a very difficult task on day 2.
What mechanisms might underlie the transfer we have found? One possibility is a process of re-representation (Gentner, 1989; Gentner & Rattermann, 1991; Karmiloff-Smith, 1992). In their initial experience, children begin with representations of the model and of the room. The situation induces them to attempt to align their mismatching representations of the two spaces, resulting in a representation of the structure common to the two entities (leaving behind the mismatching elements). The resulting abstract representation of the particular model-room relation then facilitates detection of a structurally similar relation between a new symbol and referent. Although we have examined symbolic experience within our object-retrieval paradigm, the concept of symbolic sensitivity is presumably quite general; that is, we would expect it to increase as a function of experience with widely disparate symbols. Thus, the increased performance that we observe in our picture and model studies during the third year of life might result in part from the children’s experience with picturebooks, television, pretend play, and other forms of symbolic experience. One would thus expect that children with less experience with common early symbols like picturebooks might achieve success in our standard scale model task somewhat later than children with more of such experience. We have data consistent with this expectation from a comparison of the performance of a group of children who attended a local day care center serving primarily lower socioeconomic status (SES) families with the performance of children attending a predominantly middle-class center. The children in the lower-SES group were less successful (39%) in our high object-similarity model task than the middle-SES children (78%). By 3-1/2 years, the lower-SES sample was successful (81%) in the task.
In another comparative study, a group of mostly middle-class Argentinean children performed less well in this task than our typical sample of middle-class Illinois youngsters. According to the teachers in the center they attended, these children received relatively little exposure to picturebooks and representational toys. There are, of course, myriad differences between the various groups in these studies, in addition to symbolic experience, so these results must be interpreted with great caution. Nevertheless, they are consistent with the hypothesis that greater experience with symbols in general should facilitate performance in the scale model task.

III. FURTHER RESEARCH ON THE MODEL

We have recently conducted several studies examining components of the Model briefly described earlier in this chapter, namely the nature and strength of representational insight and the process of mapping from symbol
to referent. These studies reveal further limitations on young children’s symbol understanding and use.

1. Representational Insight
The claim that children who succeed in the model task must be representing the higher order relation between the spaces should not be taken to imply that they have either a complete or a fully articulated representation of that relation. In a recent study, 3-year-old children showed no evidence of explicit knowledge of the nature of the model-room relation. The children first had experience in our standard task with the troll dolls. They returned for a second day and were told that “Little Terry wants to have a room just like Big Terry’s room.” The children were then asked to choose which of two models was “better,” “more like Big Terry’s room.” There were four models, resulting from the combination of high or low object similarity with congruent or discrepant spatial relations among the objects. For example, in the highest similarity model, all the objects in the model were highly similar in physical appearance to their counterparts in the room, and they were arranged in the same way in the two spaces. In the lowest similarity model, all the miniature objects were of low similarity, and they were arranged differently from those in the room. On each of six trials, the children were asked to choose between a different pair of models. The children chose randomly on every dimension we examined, selecting the highest similarity model only 54% of the time. We conclude that these children had a functional, but not an explicit, accessible representation of the model-room relation. Another series of studies investigated the robustness of children’s initial insight into the model-room relation. Uttal, Schreiber, and DeLoache (in press) examined the effect of delays between the hiding event and retrieval 1 on the performance of 3-year-olds in the standard model task (including the standard extensive orientation). After seeing the miniature toy being hidden in the model, the children waited 15 s, 2 min, or 5 min before searching for the larger toy in the room.
Each child received two trials with each of the three delay intervals, and there were three different orders of trials. Retrieval 2 performance was high (82%), indicating that the children’s memory representation of the hiding event persisted over all delays. The retrieval 1 results were interesting but complicated, with the primary result being an effect of delay order. The basic finding was that those 3-year-olds who had to endure a long delay on their first trial were quite unsuccessful on the rest of their trials, even those with the shorter delays (19% overall). In contrast, children who experienced the shortest delay (essentially no delay) on their first trial were
relatively successful throughout (69% overall). Follow-up studies indicated that two factors were responsible for the poor performance of the long-delay-first group. First, these children apparently forgot about the model or its relevance to the task during the initial long delay. A reminder of the model, a brief look at it before retrieval 1, improved performance markedly. Also, prior experience with the standard model task with no delays on one day inoculated children against the effects of a long initial delay on a second day. Thus, when they had a firm representation of the model-room relation, they were able to cope with the long delay between the hiding event and their first opportunity to search. Second, having forgotten about the model on their first retrieval attempt in the room, the children construed the task as a guessing game, one in which they had no basis for knowing where to search. In other words, the children represented the model and room as independent entities. Therefore, on subsequent trials, they continued to search in the room independently of what they had observed in the model, even when there was essentially no delay between the hiding event and retrieval 1. These studies testify to the fragility of 3-year-old children’s success in our model task. They are capable of achieving representational insight given adequate support (i.e., from high similarity, explicit instructions, and/or prior experience), but it is difficult for them to maintain that insight. Presumably, a group of older children would be less susceptible to the effects of delay than the 3-year-olds in the Uttal et al. (in press) research.

2. Mapping
In the standard scale model task, with high object and relational similarity, mapping is not a serious challenge. However, Marzolf (1994) recently modified the standard task to make the mapping component more difficult. In his modified task, the toys are always hidden in one of a set of four identical hiding places. Thus, children must encode the initial hiding place relationally in order to remember it, and they must then carry over that additional relation to the room to perform retrieval 1. In Marzolf’s task, there are four identical miniature white boxes in the model and four larger boxes in the room. The hiding event takes place in one of the small boxes, so the child must encode which box contains the hidden toy. The only way to identify the relevant box is to encode its relation to available landmarks in the model. For example, the child might encode something like “the box on the floor in front of the cabinet” or “the box sitting on the chair.” Thus, relational encoding is required to find the toy the child observed being hidden, that is, to carry out retrieval 2. To succeed on retrieval 1, the child must map the embedded relation from
the model to the room: for example, “the toy in the box on the chair” in the model to “the large toy in the large box on the large chair” in the room. A series of studies has examined 3-year-old children’s ability to map such embedded relations. In all of them, the children were successful on retrieval 2 (average of 81% errorless retrievals). Thus, they clearly encoded the requisite relational information from the hiding event. The main question was whether they could successfully map this additional relation onto the second space. The results are clear: Even though 3-year-old children encoded the relation between the target box and available landmark(s), they often failed to map that relation to the room. Retrieval 1 performance has averaged 41%. This poor performance occurred even if the children had already represented the model-room relation. In one study, the subjects were first given three trials of the standard task. Although the boxes were present, the toy was always hidden in the standard way with an item of furniture. Their retrieval 1 performance was just as expected for 3-year-olds (83%). However, when the experimenter then switched to hiding the toy in one of the four boxes, performance plummeted (34%). In all the studies in this series, all of the children’s retrieval 1 searches were directed to one of the boxes. Thus, they knew the toy was in a box; they just didn’t know which one. The results of these studies thus reveal a limit on the number of relations that 3-year-old children can map from model to room. In our standard task, they map the target toy-hiding place relation in one space onto the target toy-hiding place relation in the other space. In the task with boxes as the set of hiding places, they similarly map a single target toy-hiding place relation, leaving behind the hiding place-landmark relation(s).
This restriction on the number of mappable relations is true only in the symbolic domain, that is, only when the child must also represent the higher order model-room relation. Marzolf and Pascha (1995) demonstrated this by devising a shrinking-room version of the box task. As predicted from what we know about young children’s memory-encoding capabilities, the 3-year-old children who were given this version were successful at retrieving the toy. After observing the larger toy being hidden in one of the large boxes in the room, they successfully retrieved the miniature toy from the “shrunken room.” Thus, they were capable of mapping a two-step relation when that was the only relation involved. We see that even when there is a high level of supportive iconicity between symbol and referent, mapping can be a stumbling block when the relational structure that must be mapped is complex.
IV. Conclusion

Several conclusions can be drawn from the research described in this chapter.

A. NONTRANSPARENCY OF SYMBOL-REFERENT RELATIONS
The first is that symbols are not transparent. As my research makes clear, a given symbol-referent relation can be quite opaque to someone encountering the symbol for the first time. Children are relative novices with respect to symbols; one can never assume that any symbol, no matter how highly iconic or how transparent it is to an adult or older children, will be obvious to young children. Our research has amply documented the difficulty that 2-1/2- and even 3-year-olds can have appreciating the relation between a room and a highly similar scale model. I have found the same pattern for pictures with even younger children (DeLoache & Burns, 1994). Two-year-olds who see a color photograph depicting a toy hidden in a room do not know where to find the toy. They fail to use the picture as a source of information. In contrast, 2-1/2-year-olds (as discussed in this chapter) readily interpret pictures as informative representations. These results with pictures, a very familiar medium even for young children, lead us to suspect that young children may have a similarly shaky understanding of other familiar media. We are currently investigating what very young children do and do not understand about the relation of television and videos to reality (Troseth & DeLoache, 1995).

B. CRUCIAL ROLE OF INSIGHT

A second conclusion supported by our research is that insight into symbol-referent relations is a necessary part of symbol understanding and use. Although the representation of that relation does not have to be explicit or verbalizable, there must be a mental representation of the relation for the symbol to be interpreted in terms of something other than itself. A satisfactory account of the pattern of results summarized here requires positing mental representation of the higher order relation between symbol and referent, not just a set of independent representations of the links between elements of the two entities.
Experience adopting a representational stance to various symbols makes children increasingly sensitive to the symbolic import of novel entities.

C. COMPLEXITY OF EARLY SYMBOLIC FUNCTIONING
As the Model presented in Fig. 4 illustrates, the process of understanding and using a symbol as a source of information about reality is complex.
There are multiple factors that interact to affect whether young children detect a symbol-referent relation and map from one to the other. A high value on one of the relevant factors can compensate for a low value on one or more others. To emphasize this point, Table IV summarizes the performance outcomes for two age groups as a function of the combination of five of the variables we have investigated in the scale model task: object similarity, relational similarity, scale similarity, instruction, and experience. Thus, we see that 3-year-olds generally require fewer sources of support to succeed than do 2-1/2-year-olds and that, to some degree, the sources of support are interchangeable. A further complexity in early symbolic functioning is that there is no single pathway through the Model; one can enter the system at various points. One bottom-up route would be for the child first to notice some aspects of physical similarity, such as the fact that the miniature objects in the model look like their larger counterparts. The child might then actively look for and discover further similar features, as well as different ones, and through the process of structural alignment (Gentner, 1989; Gentner & Rattermann, 1991) represent the higher order relation between model and room. A top-down route could involve a child with past experience in a similar task who begins with the assumption that the model and room may be related and actively looks for the object and relational matches to support this initial assumption. There are many roads to representation.

TABLE IV
PERFORMANCE OF TWO AGE GROUPS AS A FUNCTION OF DIFFERENT COMBINATIONS OF FIVE FACTORS

[Table body not reproduced: each row pairs a combination of high (+) or low (-) levels of object similarity, relational similarity, scale similarity, instruction, and experience with the resulting successful (S) or unsuccessful (U) performance of 3-year-olds and 2-1/2-year-olds.]

Note: High level indicated by +; low level (or absence) indicated by -. Successful performance, S; unsuccessful performance, U.
D. CHARACTERISTICS OF SYMBOLS

The most dramatic, and certainly the most surprising, results discussed in this chapter had to do with the effect of characteristics of a symbol itself on young children’s ability to appreciate its representational status. The salience of the physical characteristics of a symbol affects how easy it is for young children to interpret it as a representation of something other than itself. The more interesting something’s concrete properties are, the more difficult it is to represent it as a term in an abstract relation. The power of the dual representation hypothesis derives in part from its violation of commonsense assumptions and its counterintuitive predictions.
E. IMPLICATIONS

There are some important implications of the research described in this chapter. As stated before, one should never simply assume that young children will appreciate the symbolic character of any given entity or be able to figure out how it relates to what it represents. The more objectlike the representation, the more of a potential problem there is. A relevant example comes from the various sets of blocks and rods that elementary school teachers commonly use to represent number and to help children learn arithmetic operations. Such “manipulatives” may or may not facilitate acquisition of the target concepts, depending on whether and how well the children understand the block-number relation to begin with. There is evidence that children sometimes experience precisely the sort of problems the dual representation hypothesis predicts (see DeLoache, Uttal, & Pierroutsakos, in press). In a very different domain, anatomically correct dolls are commonly used to interview young children when sexual abuse is suspected. The assumption is that the presence of a doll will be particularly helpful for very young children, because of their restricted vocabularies and general tendency to provide meager verbal reports. For a doll to be useful, however, children have to be able to make a link between themselves and the doll. The dual representation concept suggests that appreciating a self-doll relation will be problematic. Recent research in my lab indicates that young children do indeed have difficulty using a doll as a self-representation (DeLoache, 1995; DeLoache & Marzolf, in press; Smith, 1995). In conclusion, the universality and centrality of symbolization in human cognition should not lead us to assume that the acquisition of symbols and symbol systems is an easy task. As the research described here shows, young children experience many challenges in identifying entities whose
main function is representational and in figuring out how those entities are related to what they stand for.

ACKNOWLEDGMENTS

A grant from the National Institute of Child Health and Human Development (HD25271) supported preparation of this chapter and the conduct of the research described. Some of the research was also supported by a training grant from NICHD (HD-07205).
REFERENCES

Bialystok, E. (1991). Letters, sounds, and symbols: Changes in children’s understanding of written language. Applied Psycholinguistics, 12, 75-89.
Bialystok, E. (1992). Symbolic representation of letters and numbers. Cognitive Development, 7, 301-316.
Blades, M., & Spencer, C. (in press). The development of children’s ability to use spatial representations. In H. W. Reese (Ed.), Advances in child development and behavior. San Diego: Academic Press.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: MIT Press.
Chi, M. T. H., Feltovich, P., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121-152.
Daehler, M. W., Lonardo, R., & Bukatko, D. (1979). Matching and equivalence judgments in very young children. Child Development, 50, 170-179.
DeLoache, J. S. (1986). Memory in very young children: Exploitation of cues to the location of a hidden object. Cognitive Development, 1, 123-137.
DeLoache, J. S. (1987). Rapid change in the symbolic functioning of very young children. Science, 238, 1556-1557.
DeLoache, J. S. (1989). Young children’s understanding of the correspondence between a scale model and a larger space. Cognitive Development, 4, 121-139.
DeLoache, J. S. (1990). Young children’s understanding of models. In R. Fivush & J. Hudson (Eds.), Knowing and remembering in young children (pp. 94-126). New York: Cambridge University Press.
DeLoache, J. S. (1991). Symbolic functioning in very young children: Understanding of pictures and models. Child Development, 62, 736-752.
DeLoache, J. S. (1995). The use of dolls in interviewing young children. In M. S. Zaragoza, J. R. Graham, G. C. N. Hall, & Y. S. Ben-Porath (Eds.), Memory and testimony in the child witness (pp. 160-178). Thousand Oaks, CA: Sage.
DeLoache, J. S., & Burns, N. M. (1994). Early understanding of the representational function of pictures. Cognition, 52, 83-110.
DeLoache, J. S., Kolstad, D. V., & Anderson, K. N. (1991). Physical similarity and young children’s understanding of scale models. Child Development, 62, 111-126.
DeLoache, J. S., & Marzolf, D. P. (in press). The use of dolls to interview young children. Journal of Experimental Child Psychology.
DeLoache, J. S., Miller, K. F., Rosengren, K., & Bryant, N. (1993, November). Symbolic development in young children: Honey, I shrunk the troll. Paper presented at the meeting of the Psychonomic Society, Washington, DC.
DeLoache, J. S., Pierroutsakos, S. L., Uttal, D. H., & Rosengren, K. (1995). Do infants grasp the meaning of pictures? Manuscript in preparation.
DeLoache, J. S., Strauss, M., & Maynard, J. (1979). Picture perception in infancy. Infant Behavior and Development, 2, 77-89.
DeLoache, J. S., Uttal, D. H., & Pierroutsakos, S. L. (in press). The development of early symbolization: Educational implications. Learning and Instruction: The Journal of the European Association for Research on Learning and Instruction.
DeMendoza, O. A. P., & DeLoache, J. S. (1995). The effects of instruction on young children’s understanding of a symbol-referent relation. Manuscript in preparation.
Deregowski, J. B. (1989). Real space and represented space: Cross-cultural perspectives. Behavioral and Brain Sciences, 12, 51-119.
Dirks, J. R., & Gibson, E. (1977). Infants’ perception of similarity between live people and their photos. Child Development, 48, 124-130.
Dow, G. A., & Pick, H. L. (1992). Young children’s use of models and photographs as spatial representations. Cognitive Development, 7, 351-363.
Gelman, S. A., & Markman, E. M. (1987). Young children’s inductions from natural kinds: The role of categories and appearances. Child Development, 58, 1532-1541.
Gentner, D. (1989). The mechanisms of analogical learning. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 199-241). Cambridge, England: Cambridge University Press.
Gentner, D., & Rattermann, M. J. (1991). The career of similarity. In S. A. Gelman & J. P. Byrnes (Eds.), Perspectives on thought and language: Interrelations in development (pp. 225-277). Cambridge, England: Cambridge University Press.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Gombrich, E. H. (1969). Art and illusion: A study in the psychology of pictorial representation. Princeton, NJ: Princeton University Press.
Goswami, U. (1992). Analogical reasoning in children. East Sussex, England: Lawrence Erlbaum.
Gregory, R. L. (1970). The intelligent eye. New York: McGraw-Hill.
Hartley, D. G. (1976). The effects of perceptual salience on reflective-impulsive performance differences. Developmental Psychology, 12, 218-225.
Horn, H. A., & Myers, N. A. (1978). Memory for location and picture cues at ages two and three. Child Development, 49, 845-856.
Huttenlocher, J., & Higgins, E. T. (1978). Issues in the study of symbolic development. In W. A. Collins (Ed.), Minnesota Symposia on Child Psychology (Vol. 11, pp. 98-140). Hillsdale, NJ: Erlbaum.
Karmiloff-Smith, A. (1992). Beyond modularity. Cambridge, MA: Bradford.
Keil, F. C. (1989). Concepts, kinds, and development. Cambridge, MA: Bradford Books/MIT Press.
Liben, L. S., & Downs, R. M. (1989). Understanding maps as symbols: The development of map concepts in children. In H. W. Reese (Ed.), Advances in child development and behavior (Vol. 22, pp. 146-202). San Diego: Academic Press.
Liben, L. S., & Downs, R. M. (1992). Developing an understanding of graphic representations in children and adults: The case of GEO-graphics. Cognitive Development, 7, 331-349.
Mandler, J. (1983). Representation. In J. H. Flavell & E. M. Markman (Eds.), Handbook of child psychology: Vol. 3. Cognitive development (pp. 420-494). New York: Wiley.
Marzolf, D. P. (1994, June). Remembering and mapping relations in a symbolic task. Poster presented at the International Conference for Infant Studies, Paris.
Marzolf, D. P., & Pascha, P. T. (1995, March). Young children’s relational mapping in symbolic and non-symbolic tasks. Poster presented at the meeting of the Society for Research in Child Development, Indianapolis.
Marzolf, D. P., & DeLoache, J. S. (1994). Transfer in young children’s understanding of spatial representations. Child Development, 64, 1-15.
Marzolf, D. P., DeLoache, J. S., & Kolstad, D. V. (1995). The role of relational information in young children’s understanding of scale models. Manuscript submitted for publication.
Perlmutter, M., Hazen, N., Mitchell, D. B., Grady, J. G., Cavanaugh, J. C., & Floor, J. P. (1981). Developmental Psychology, 17, 104-110.
Pierroutsakos, S. L. (1994, June). The bigger picture: Infants’ differential manual exploration of depicted objects. Paper presented at the meeting of the International Conference on Infant Studies, Paris.
Potter, M. C. (1979). Mundane symbolism: The relations among objects, names, and ideas. In N. R. Smith & M. B. Franklin (Eds.), Symbolic functioning in childhood (pp. 41-65). Hillsdale, NJ: Erlbaum.
Ratterman, M. J., Gentner, D., & DeLoache, J. S. (1990). Effects of relational and object similarity on children’s performance in a mapping task. Unpublished manuscript.
Schmandt-Besserat, D. (1992). Before writing: From counting to cuneiform. Austin, TX: University of Texas Press.
Sigel, I. E. (1970). The distancing hypothesis: A causal hypothesis for the acquisition of representational thought. In M. R. Jones (Ed.), Miami Symposium on the Prediction of Behavior, 1968: Effect of early experiences (pp. 99-118). Coral Gables, FL: University of Miami Press.
Sigel, I. E. (1978). The development of pictorial comprehension. In B. S. Randhawa & W. E. Coffman (Eds.), Visual learning, thinking, and communication (pp. 93-111). New York: Academic Press.
Sigel, I. E., Anderson, L. M., & Shapiro, H. (1966). Categorization behavior of lower- and middle-class Negro preschool children: Differences in dealing with representation of familiar objects. Journal of Negro Education, 35, 218-229.
Smith, C. M. (1995, March). Young children’s use of dolls as self symbols. Paper presented at the meeting of the Society for Research in Child Development, Indianapolis.
Steinberg, B. M. (1974). Information processing in the third year: Coding, memory, transfer. Child Development, 45, 503-507.
Teale, W. H., & Sulzby, E. (1986). Emergent literacy: A perspective for examining how young children become writers and readers. In W. H. Teale & E. Sulzby (Eds.), Emergent literacy: Writing and reading. Norwood, NJ: Ablex Publishing Corporation.
Troseth, G. L., & DeLoache, J. S. (1995). Very young children’s understanding of video information. Manuscript in preparation.
Uttal, D. H., Schreiber, J. C., & DeLoache, J. S. (in press). Waiting to use a symbol: The effects of delay on children’s use of models. Child Development.
MECHANISMS OF TRANSITION: Learning with a Helping Hand

Susan Goldin-Meadow and Martha Wagner Alibali
Learning involves change from a less adequate to a more adequate understanding of a task. Although it is important to describe the learner’s state before and after the task has been mastered, characterizing the transitional period that bridges these two states is, in a sense, the key to understanding learning. Unfortunately, the transitional period has proved to be difficult to study for several reasons. First, the transitional period is likely to be fleeting or, at the least, more ephemeral than the beginning and end states that anchor it. Second, the transitional period is difficult to identify before the fact. It is easy to identify a learner as having been in transition after a task has been mastered-the learner who made progress and succeeded on the task was “in transition” with respect to the task, whereas the learner who failed to make progress was not. However, such a post hoc measure is of limited usefulness, both for experimenters interested in exploring the transitional period and for teachers interested in identifying learners who might be in a transitional period and therefore particularly receptive to instruction.

The purpose of this chapter is to present a measure of the transitional period that is not post hoc, and to describe the findings on transitions in learning that this measure has allowed us to discover. We begin with the phenomenon upon which the measure is based-the mismatch between gesture and speech. Consider a child who is participating in a Piagetian liquid conservation task and is asked to explain why he thinks the quantities in the two containers are different in amount. The child says “they’re different because this one is tall and this one is short,” while indicating with his hands the height of each container (he holds a flat, horizontal palm at the brim of the glass and then moves his palm to indicate the shorter height of the dish). Such a child has justified his belief that the amounts are different by focusing on the heights of the containers, and has done so in both speech and gesture. Now consider a second child, also asked to explain why she thinks the quantities in the two containers are different in amount. This child sounds just like the first child, offering in speech the same justification based on the height of the two containers, “they’re different because this one is tall and this one is short.” However, in gesture, the child conveys different information-she indicates with her hands the width of each container (she holds two flat, vertical palms at the sides of the glass and then moves her palms to indicate the larger width of the dish). This child has justified her belief that the amounts are different by focusing on the heights of the containers in speech, but on the widths of the containers in gesture.

The first point to notice is that one can, on a single task, say one thing and gesture another-a phenomenon we have labeled “gesture-speech mismatch” (Church & Goldin-Meadow, 1986). Such mismatches are not limited to a certain age and a certain task. Gesture-speech mismatches have been observed in preschoolers learning to count (Graham, 1994), elementary school children reasoning about mathematics problems (Alibali & Goldin-Meadow, 1993; Perry, Church, & Goldin-Meadow, 1988), middle-schoolers reasoning about seasonal change (Crowder & Newman, 1993), children and adults reasoning about moral dilemmas (Goodman, Church, & Schonert, 1991), adolescents reasoning about Piagetian bending rods tasks (Stone, Webb, & Mahootian, 1992), and adults reasoning about gears (Perry & Elder, 1994).
The second point to note is that the two children appear to be equally knowledgeable (or unknowledgeable) about conservation of liquid quantity when we listen to their speech. However, as we will show in a later section, if the two children are given instruction in conservation, the child who produces gesture-speech mismatches will be more likely to benefit from instruction than the child who produces gesture-speech matches (Church & Goldin-Meadow, 1986). Thus, gesture-speech mismatch appears to identify children who are ready to profit from instruction in conservation; that is, it identifies children who are “in transition” with respect to this concept.

This article will be divided into four parts. In the first, we provide evidence that gesture-speech mismatch is, in fact, a reliable index of the transitional period. In this first section, we argue that mismatch is not just an epiphenomenon associated with the transitional period but rather reflects a fundamental characteristic of the transitional state. In the second section, we explore
the sources of gesture-speech mismatch, arguing that if mismatch is central to the transitional period, the factors that create mismatch might also be the factors that render a learner transitional. In the third section, we explore the role that gesture-speech mismatch plays in the learning process. We argue that gesture-speech mismatch, occurring as it does in naturalistic contexts of communication, may provide a signal to those who interact with the learner, announcing to those who can interpret the signal that the learner is in transition and thus is ready to learn. Finally, we close with a discussion of how information is represented in gesture and speech.
I. Gesture-Speech Mismatch as an Index of Transition

Our claim that gesture-speech mismatch is an index of transition is not meant to apply exclusively to learning during childhood. However, all of our empirical work has thus far been done with children. The studies described in this article therefore focus on children, and leave as an open (and testable) question whether gesture-speech mismatch identifies learners in transition throughout the life span. We begin with a brief review of the central findings in developmental psychology that serve as a basis for current theories of how change occurs.
A. CONSISTENCY VS. VARIABILITY WITHIN THE LEARNER

1. The Child as Consistent Thinker
One of Piaget’s major contributions to developmental psychology was the observation that children’s knowledge is systematic and rule governed, even when that knowledge is incorrect or incomplete. Thus, the period prior to mastery of a concept can be described and related in a systematic way to the period following mastery. Piaget described the developmental paths that children take when acquiring a concept, and he argued that all children tend to follow a common series of steps in acquiring a particular concept-in other words, there is consistency across children. Moreover, according to Piaget, each step within this common developmental path can be characterized by a single strategy that children typically use when solving problems instantiating the concept. For example, in his description of conservation, Piaget portrayed children as first focusing on a single dimension in the conservation task, then focusing alternately on two dimensions, and finally focusing on the transformation (Piaget, 1964/1967). Thus, in the Piagetian view, there is not only consistency across children but consistency within the child as well.
Following Piaget, developmental studies of a large number of tasks have attempted to document the sequence of states through which children’s thinking progresses over developmental time, and to identify the strategy that characterizes each state in the series. For example, studies of children’s acquisition of simple addition have depicted children as first relying on counting from one, then relying on counting from the larger addend, and finally relying on retrieval (Ashcraft, 1987). Studies utilizing information integration methodology have claimed that children progress from additive to multiplicative strategies in solving balance scale problems (Wilkening & Anderson, 1982) and in making perceptual judgments of area, volume, and velocity (Anderson & Cuneo, 1978; Wilkening, 1980, 1981). An alternate account of the development of area judgment-but one that still argues for commonality across children and a single strategy at each developmental step-claims that children initially center on one side of a rectangle, later are influenced by the covariation between area and shape, and finally reach perceptual constancy (Gigerenzer & Richter, 1990). Other studies have been specifically designed to identify the rules children use at various points in a developmental progression. Siegler (1976) developed his rule assessment methodology with this goal in mind. For each type of problem that he investigated, Siegler identified a set of possible rules that children could use to solve that type of problem, and devised a set of test problems that would yield a different pattern of correct answers and errors for each possible rule. In this way, the specific rule that a child used in solving the problems could be unambiguously determined. 
In a research program that investigated a variety of Piagetian tasks, including balance scale problems, projection of shadows problems, and probability problems, Siegler (1976, 1981) showed that, at all points in the developmental progression characterizing each task, most children could be classified as using a single “rule.” Thus, historically, developmental research has emphasized consistency across children and, equally if not more important, consistency within each child. Note that this characterization of the child implies that the process of developmental change will be abrupt and characterized by a total transformation in the child’s thinking. The child entertains a single strategy for some period of time, and then (often with no apparent change in environmental circumstances) adopts a completely distinct strategy. As Siegler (1994a) pointed out, portraying children’s thinking and knowledge as monolithic for a period of time creates a need to explain the wide gulfs between the strategy used at one point in development and the strategy adopted at a later point.
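Siegler’s rule-assessment logic can be sketched in code: each candidate rule predicts a distinct pattern of answers across a diagnostic problem set, so a child’s observed pattern of correct answers and errors identifies the rule. The sketch below is our illustration, not Siegler’s actual materials; the problem set is invented, and his Rules I and IV are simplified here to “weight only” and “torque.”

```python
# Balance-scale problems as (left_weight, left_distance, right_weight, right_distance).
# These items are illustrative stand-ins chosen so the two rules predict
# different answer patterns, making the rules distinguishable.
PROBLEMS = [(3, 2, 3, 2), (3, 3, 2, 3), (2, 2, 2, 4), (3, 2, 2, 3), (2, 3, 3, 2)]

def rule_weight_only(lw, ld, rw, rd):
    # Simplified Rule I: judge by weight alone, ignoring distance.
    return "left" if lw > rw else "right" if rw > lw else "balance"

def rule_torque(lw, ld, rw, rd):
    # Simplified Rule IV: compare torques (weight x distance) on each side.
    lt, rt = lw * ld, rw * rd
    return "left" if lt > rt else "right" if rt > lt else "balance"

RULES = {"Rule I": rule_weight_only, "Rule IV": rule_torque}

def classify(answers):
    """Return the candidate rules whose predicted answers match the child's."""
    return [name for name, rule in RULES.items()
            if all(rule(*prob) == ans for prob, ans in zip(PROBLEMS, answers))]
```

A child who answers "balance", "left", "balance", "left", "right" matches only the weight-only rule, because the third and fourth items are diagnostic: weight and torque disagree there.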
2. Theoretical Accounts That Posit Variability within the Child

In fact, it is somewhat odd that the emphasis in developmental studies has been on consistency within the child because many of the major theoretical accounts of developmental change assume variability within the child as one of the conditions for change. Even Piaget, a pioneer of the view of the child as consistent thinker, proposed a mechanism for change that relies on variability within the child. For example, Piaget’s (1979/1985) theory of equilibration posits that the impetus for transition comes from disequilibrium, which is the result of “internal conflict” between competing approaches to a problem. In the Piagetian view, variable approaches to a problem become integrated into a more advanced approach via the process of equilibration. Within the Piagetian tradition, Langer (1969), Snyder and Feldman (1977), and Strauss (1972; Strauss & Rimalt, 1974) have argued that during transitional periods, children display at least two functional structures with respect to a concept. The children’s appreciation of discrepancy between those functional structures leads to disequilibrium, which then serves as the impetus for change. Turiel (1969, 1974) has advanced similar arguments regarding changes in moral reasoning. Other, non-Piagetian theorists have also argued that variability may motivate or characterize transition. Among information-processing theorists, Klahr (1984; Klahr & Wallace, 1976) has posited conflict-resolution rules as a mechanism of change in self-modifying production systems. These rules are set into action when two productions are eligible to be activated on a single problem. Thus, in Klahr’s model, variability serves as a trigger that sets special “change” rules into action.
The rules act to strengthen weights that apply to “good” (adaptive or effective) productions and to weaken weights that apply to ineffective, less adaptive productions, so that useful strategies are maintained in the repertoire, and poor ones fade away.

Dynamical systems theories of development have also suggested that variability may characterize transition points. Specifically, one empirical prediction derived from dynamical systems theory is that, when a system goes from one stable (attractor) state to another, the transition should be characterized by greater fluctuation or variability in the behavioral measure, compared to a baseline (Thelen, 1989). For example, during the change from consistent reliance on a “compare length” strategy for solving number conservation problems to consistent reliance on a “count items” strategy, a child’s performance should become more variable. It is not clear whether, within the dynamical systems framework, variability is causally related to change or is merely epiphenomenal to change. However, as in the Piagetian and information-processing accounts, within the dynamical systems framework, when change occurs, variability should be observed.
3. Empirical Evidence Indicating Variability within the Child
Recent studies have begun to focus explicitly on documenting and quantifying variability within a child (e.g., Acredolo & O’Connor, 1991; Acredolo, O’Connor, & Horobin, 1989; Crowley & Siegler, 1993; Siegler, 1984; Siegler & Jenkins, 1989; Siegler & McGilly, 1989). What was heretofore considered “noise” in the data, when viewed from a perspective that focused on consistency within the child, is now recognized as data of particular interest from a perspective that focuses on variability within the child. Indeed, variability has been found within an individual child when the child is asked to solve a set of related problems. For example, despite the claims that 5-year-olds think of number conservation solely in terms of the lengths of the rows, trial-by-trial assessments indicate that most 5-year-olds rely sometimes on the lengths of rows, sometimes on the type of transformation, and sometimes on other strategies such as counting or pairing (Siegler, 1995). Evidence of variability within an individual child is not limited to Piagetian tasks but can be found across a wide variety of domains, ranging from language and reading to motor development (see Siegler, 1994b, Table 1). Variability has even been found when a child is asked to solve the same problem twice. For example, a preschooler presented with the identical addition problem on two successive days quite often will use different strategies on the two days (Siegler & Shrager, 1984). Variability on the same problem has also been observed within an individual child in time-telling tasks (Siegler & McGilly, 1989) and in block-building tasks (Wilkinson, 1982). Although it is interesting that variability can be found within a child over trials, it is not variability alone that provides the impetus for change in the theoretical accounts described above.
According to the Piagetian and information-processing accounts described above, more than one strategy must be activated on a single problem in order for change to be likely. Under this view, what propels a child forward is not the mere availability of more than one strategy in the child’s repertoire, but rather the concurrent activation and evaluation of those strategies. However, a child who uses one strategy the first time she solves an addition problem and a second strategy the next time she solves the same problem might not ever consider those two strategies concurrently. Thus, the real question is whether we find variability within a child even on a single problem. Evidence of variability on a single problem has been particularly difficult to obtain in part because the “forced choice” techniques frequently used in developmental research encourage children to choose or report a single solution for each problem. In an attempt to circumvent this methodologic deficiency, Acredolo et al. (1989) offered children a variety of answers to
a single conservation problem and asked the children to evaluate each answer (some of which were wrong, i.e., nonconserving) using a probability scale. On a number conservation problem, one-third of a group of kindergarten through sixth-grade children accepted (i.e., assigned nonzero probabilities to) more than one of three possible solutions. On an area conservation problem, close to three-quarters of the same group of children accepted more than one solution. These results demonstrate that, when pressed, many children are willing to consider more than one solution to a single problem. However, the data still do not allow us to determine whether individual children spontaneously consider several solutions within the same problem. One technique that can yield evidence about within-problem variability in children’s reasoning involves examining both what a child says and what he or she gestures when explaining a problem. Indeed, when we examine the gestures that children produce along with their spoken explanations of a problem, we find that they often produce gestures that convey information that is different from the information conveyed in their speech (Church & Goldin-Meadow, 1986; Perry et al., 1988). The example with which we began this article illustrates this point: When asked to explain her belief that the liquid quantities in the two containers were different, the second child we considered focused on the heights of the containers in her speech but on the widths of the containers in her gestures. Thus, the gesture-speech mismatches that children produce when asked to explain their solutions to a problem provide clear evidence that children can, and do, entertain a variety of strategies or solutions on the same problem. In the next section, we examine whether the within-problem variability evident in gesture-speech mismatch is associated with periods of transition, as we might expect if it functions as an impetus for change.
B. GESTURE-SPEECH MISMATCH AND TRANSITION

1. Gesture-Speech Mismatch Predicts Openness to Instruction
a. Gesture-Speech Mismatch in Conservation

In our previous work, we have observed and explored gesture-speech mismatch in two concepts, mastered at two different ages. The first is an understanding of conservation, as measured by a series of Piagetian conservation tasks (two number tasks, two liquid quantity tasks, and two length tasks). The child is initially shown two objects that are identical in number, liquid quantity, or length and is asked to verify the equivalence of the two objects. The experimenter then transforms one of the two objects (e.g., spreading the row of checkers out, pouring the water into a differently shaped container, reorienting the stick so it is no longer aligned with the other object) and asks the child whether
the transformed object has the same or different amount as the untransformed object. Children younger than 7 years of age typically respond that the transformed object has a “different” amount. The child is then asked to justify his or her judgment (see Church & Goldin-Meadow, 1986, for additional procedural details). Although many investigators have analyzed the spoken justifications children give for their conservation judgments, few have analyzed-or even explicitly noticed-the gestures that children produce along with their speech. In fact, it turns out that children gesture a great deal when asked to justify their conservation judgments (indeed, even congenitally blind children, who gesture infrequently on other tasks, produce gestures on the conservation task, Iverson & Goldin-Meadow, 1995). The gestures children produce on the conservation task often convey the same information as is conveyed in the speech they accompany. However, as described above, gesture can also be used to convey different information from speech. For example, a child in one study said, “they’re different because you spread them out,” while moving his index finger between the two rows of checkers, pairing each checker in one row with a checker in the second row. This child focused on the experimenter’s actions on the objects in speech but focused on the one-to-one correspondence between the checkers in gesture.
b. Gesture-Speech Mismatch in Mathematical Equivalence

The second concept we have explored is an understanding of mathematical equivalence (the notion that one side of an equation represents the same quantity as the other side of the equation), as measured by children’s responses to addition problems of the following type: 3 + 4 + 6 = __ + 6. The child is first asked to solve the problem (i.e., put a number in the blank) and is then asked to explain how he or she arrived at the solution (see Perry et al., 1988, for procedural details). American children typically master this task by the age of 12. A frequent error that children make when solving these problems incorrectly is to add up all of the numbers in the problem (i.e., the child puts 19 in the blank in the above example). When asked to explain this solution, the child might say, “I added the 3, the 4, the first 6, and the second 6, and got 19,” while pointing at each of the four numbers. Such a child is conveying the same procedure (the Add-All procedure) in both gesture and speech. However, children can, and frequently do, convey a different procedure in gesture than they convey in speech-that is, they produce gesture-speech mismatches. For example, in explaining the same problem, a child might say, “I added the 3, the 4, the first 6, and the second 6, and got 19,” but indicate only the 3 and the 4 with a two-finger V-shaped point. Such a child is conveying the Add-All procedure in speech, but in gesture is giving special status to the two numbers which, if grouped and added, result in a correct answer to the problem.
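As a concrete illustration (ours, not the authors’ code), the two procedures a child might bring to a problem of the form a + b + c = __ + c can be written out side by side:

```python
# Two strategies for problems of the form a + b + c = __ + c.
# `left` holds the addends on the left side; `right_known` is the number
# already printed on the right side of the equal sign.

def add_all(left, right_known):
    # Incorrect "Add-All" procedure: sum every number in the problem.
    return sum(left) + right_known

def equivalence(left, right_known):
    # Correct procedure: make both sides equal, so blank = sum(left) - right_known.
    return sum(left) - right_known

# The example from the text: 3 + 4 + 6 = __ + 6
left, right_known = [3, 4, 6], 6
print(add_all(left, right_known))      # 19, the frequent error described above
print(equivalence(left, right_known))  # 7, i.e., 3 + 4, the grouped numbers
```

Because the 6 appears on both sides, the correct blank is simply 3 + 4; this is why gesturally grouping the 3 and the 4 signals a more advanced idea than the spoken Add-All explanation.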
c. Training Studies to Determine Whether Gesture-Speech Mismatch Predicts Readiness to Learn

We have hypothesized that gesture-speech mismatch is a reliable index of the transitional period. To test this hypothesis, we conducted training studies in each of the two concepts, predicting that the children who produced a large number of gesture-speech mismatches would be most likely to benefit from the instruction we provided. Church and Goldin-Meadow (1986) gave children ages 5 to 8 a pretest in conservation, and Perry et al. (1988) gave children ages 9 to 10 a pretest in mathematical equivalence. In each pretest, the child was asked to solve six problems and to explain his or her solutions to each problem. Only children who failed the pretest, that is, who did not solve the six conservation or addition problems correctly, were included in the study. Children were then provided with instruction in the task and, after the training, were given a posttest designed to assess how much their understanding of the task had improved. On the basis of the explanations they produced on the pretest, children were divided into two groups: (1) those who produced gesture-speech mismatches in three or more of their six explanations, labeled discordant, and (2) those who produced mismatches in fewer than three of their explanations, labeled concordant. The question then is how many children in each of these two groups were successful on the posttest. Posttest performance was evaluated relative to each child’s pretest; as a result, posttest success reflects improvement in a child’s understanding of the task. The results are presented in Figure 1, which displays the proportion of concordant and discordant children who were successful on the posttest in the conservation task (A) and in the mathematical equivalence task (B).
In both tasks, the discordant children were significantly more likely to succeed on the posttest than were the concordant children (χ² = 5.36, df = 1, p ≤ .02 for conservation; χ² = 4.47, df = 1, p < .025 for mathematical equivalence). Thus, the children who often exhibited variability within a single response on a task, that is, who often produced gesture-speech mismatches, were more likely to profit from instruction on that task than were the children who exhibited little or no within-response variability. It is important to note that the children in these studies were indistinguishable when viewed from the perspective of speech; all of the children in both studies produced incorrect spoken justifications. Thus, identifying children who were ready to profit from instruction was possible only if the children’s gestures, as well as their words, were taken into account (but see Graham & Perry, 1993). In addition, we stress that gesture-speech mismatch should not be taken as a general characteristic of the learner independent of the task. Rather, a learner will produce mismatches on a particular task only when he or she is in a transitional period with respect
a" 5
Discordant 0.8
3 0.6 3
0 C
2 2 .-
0.4
S
0 + 0
'0 e
0.2
8
a P 0.0 A. Conservation Task
+Q
;
a"O
"
B. Mathematical Equivalence Task Fig. 1. Success after instruction in concordant versus discordant children. The figure presents the proportion of concordant (hatched bar) versus discordant (solid bar) children who improved their performance after instruction and were successful on a posttest on the conservation task (A) or the mathematical equivalence task (B). In both tasks, the discordant children were significantly more Likely to improve their performance and succeed on the posttest than were the concordant children. (A is adapted from Church & Goldin-Meadow, 1986, and B is adapted from Perry, Church, & Goldin-Meadow. 1988.)
to that task. Thus, the same child might well be expected to be concordant on one task and discordant on another, as we have found is often the case (Perry et al., 1988).

2. Gesture-Speech Mismatch Is Preceded and Followed by a Stable State
We have shown that the discordant state in which children produce a large number of gesture-speech mismatches is transitional in the sense that it is a state in which change is likely. There is, however, another sense in which the discordant state might be expected to be transitional. If the discordant state indexes the transitional period, it ought to be both preceded and followed by a more stable, concordant state. Thus, children might be expected to begin learning about a task in a state in which they produce gesture-speech matches containing incorrect explanations, that is, a concordant incorrect state. They should then progress to a state in which they produce gesture-speech mismatches (which may themselves contain either correct or incorrect explanations), that is, a discordant state. Finally, they should return to a state in which they again produce gesture-speech matches, but matches that contain correct explanations, that is, a concordant correct state. To test this prediction, Alibali and Goldin-Meadow (1993) conducted a microgenetic study of children’s acquisition of mathematical equivalence. They gave 9- and 10-year-old children instruction in mathematical equivalence and observed the children’s explanations of the problems they solved over the course of the pretest and training period. The relationship between gesture and speech in each explanation was monitored over the series of problems for children who gestured during the study. Of the 35 children who improved their understanding of mathematical equivalence over the course of the study, 29 (83%) followed the predicted path (p < .001, Binomial Test)-11 progressed from a concordant incorrect state to a discordant state, 15 progressed from a discordant state to a concordant correct state, and 3 traversed the entire path, moving from a concordant incorrect state through a discordant state and ending in a concordant correct state. 
Moreover, the few children who violated the path, moving directly from a concordant incorrect state to a concordant correct state without passing through a discordant state, appeared to have made only superficial progress on the task: They performed significantly less well on a posttest than the children who progressed through a discordant state. Thus, the discordant state appears to be transitional in that it both predicts openness to instruction and is sandwiched between two relatively stable states. These findings suggest that the within-problem variability evident in gesture-speech mismatch is indeed associated with periods of transition.
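The predicted trajectory can be made explicit with a small sketch (our illustration, not the authors’ scoring procedure): a child’s sequence of observed states should move forward along the concordant-incorrect, discordant, concordant-correct path, without backtracking and without jumping straight from concordant incorrect to concordant correct.

```python
# Predicted path of states over the microgenetic study
# (after Alibali & Goldin-Meadow, 1993).
PATH = ["concordant incorrect", "discordant", "concordant correct"]

def follows_predicted_path(states):
    """True if the observed state sequence moves forward along PATH
    without backtracking and without skipping the discordant state
    when both concordant states are visited."""
    ranks = [PATH.index(s) for s in states]
    ordered = all(a <= b for a, b in zip(ranks, ranks[1:]))
    skipped_discordant = 0 in ranks and 2 in ranks and 1 not in ranks
    return ordered and not skipped_discordant
```

Under this criterion, the 11 children who moved from concordant incorrect to discordant, the 15 who moved from discordant to concordant correct, and the 3 who traversed the whole path all count as path-followers, while a child who moved directly from concordant incorrect to concordant correct does not.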
C. GESTURE-SPEECH MISMATCH REFLECTS THE ACTIVATION OF TWO IDEAS IN SOLVING, AS WELL AS EXPLAINING, A PROBLEM
We have established that gesture-speech mismatch is associated with transitional periods. But does mismatch tell us anything about what makes a state transitional? Gesture-speech mismatch, by definition, reflects two different pieces of information within a single response, one conveyed in speech and another conveyed in gesture. As mentioned above, what we take to be significant about mismatch is that it provides evidence, not just that the speaker has two ideas available within his or her repertoire, but that the speaker has invoked both of those ideas on a single problem. The concurrent activation of two ideas on a single problem is what we suggest may be a defining feature of the transitional period. Note, however, that gesture-speech mismatch is found in explanations, and explanations are produced after a problem is solved and may have little to do with how the problem actually was solved (see Ericsson & Simon, 1980, and Nisbett & Wilson, 1977, for discussions of this issue). Thus, the fact that children exhibit two ideas when explaining how they solved a problem does not necessarily mean that the children consider both ideas when actually solving the problem. Discordance could reflect post hoc reasoning processes rather than on-line problem solving. To explore this possibility, Goldin-Meadow, Nusbaum, Garber, and Church (1993) conducted a study to determine whether discordant children activate more than one idea, not only when they explain their solutions to a problem, but also when they solve the problem itself. The approach underlying the study assumes that activating more than one idea when solving a problem takes more cognitive effort than activating a single idea. Thus, a child who activates more than one idea on one task should have less capacity left over to simultaneously perform a second task than a child who activates only a single idea.
The concordant and discordant children in our mathematical equivalence studies all solved the addition problems incorrectly. If explanations are an accurate reflection of the processes that take place in problem solving, we would expect the discordant children, who tend to produce two ideas on each explanation (one in speech and a second in gesture), to also activate two ideas when solving each addition problem. In contrast, concordant children, who tend to produce a single idea per explanation, would be expected to activate only one idea when solving each addition problem. If this is the case, the discordant children are, in a sense, working harder to arrive at their incorrect solutions to the addition problems than are the concordant children, and should have less cognitive effort left over to tackle another task.
Learning with a Helping Hand
127
Goldin-Meadow, Nusbaum, Garber, and Church (1993) tested these predictions in a two-part study. In the first part, children were given six addition problems and asked to solve and then explain the solution to each problem. These explanations were later used to divide children into discordant and concordant groups, as described above. The second part of the study contained a primary math task and a secondary word recall task. On each of 24 trials, children were first given a list of words that they were told they would be asked to remember. They were then given a math problem that they were asked to solve but not explain. Finally, the children were asked to recall the word list. It is important to note that the children were not asked for explanations at any time during this second part of the study, and that the primary and secondary tasks were conducted concurrently, thus presumably both drawing upon the limited pool of cognitive effort a child has available. The math task contained two types of problems: Hard math problems that were identical to those used in our previous studies, except that all four numbers in the problem were different (e.g., 3 + 6 + 7 = _ + 8), and Easy math problems that also contained four distinct numbers, but all four were on the left side of the equal sign (e.g., 4 + 7 + 3 + 5 = _). The Easy math problems were included as a control because children of this age typically solve problems of this type correctly and produce gesture-speech matches (i.e., single-idea explanations) to explain these correct solutions. The word recall task contained two types of lists: one-word lists that were not expected to tax the children's cognitive capacities, and three-word lists that might be expected to strain the children's capacities, particularly if those capacities were already taxed by activating two ideas on a single problem.
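The crossed structure of the second part of the study (two list lengths by two problem types) can be sketched in code. This is a hypothetical illustration of the design, not the authors' materials; the condition labels and the use of random shuffling for presentation order are our assumptions.

```python
# Hypothetical sketch of the dual-task trial structure: 2 list lengths
# crossed with 2 problem types, 6 trials per cell, 24 trials in all.
# Labels and randomization are illustrative assumptions, not taken from
# the original study materials.
import itertools
import random

LIST_LENGTHS = ["one-word", "three-word"]
MATH_TYPES = ["Easy", "Hard"]  # e.g., 4 + 7 + 3 + 5 = _  vs.  3 + 6 + 7 = _ + 8

# Build the 24 trials: every (list length, problem type) pairing, 6 times each.
trials = [(words, math)
          for words, math in itertools.product(LIST_LENGTHS, MATH_TYPES)
          for _ in range(6)]
random.shuffle(trials)  # presentation order assumed randomized

assert len(trials) == 24
```

Each trial would then present the word list, the math problem, and finally the recall prompt, so that the two tasks draw concurrently on the child's limited pool of cognitive effort.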
Thus, each child received six problems of each of four types: a one-word list with an Easy math problem, a three-word list with an Easy math problem, a one-word list with a Hard math problem, and a three-word list with a Hard math problem. Figure 2A presents the proportion of math problems the children solved correctly on the primary math task. Not surprisingly, none of the children solved the Hard math problems successfully, and almost all of the children solved the Easy problems successfully. Performance on the math task was not affected by the number of words the children had to recall, nor was it affected by whether the child was concordant or discordant in his or her pretest explanations. Thus, the concordant and discordant children performed the same on the math task in terms of number of problems solved correctly. However, we suggest that this identical performance is deceptive and conceals differences in how the two groups of children actually solved the math problems. On the basis of the explanations they produced on the pretest, the discordant children might be expected to activate two ideas
Susan Goldin-Meadow and Martha Wagner Alibali
128
when solving each Hard math problem and thus to be working harder to achieve their incorrect solutions on these problems than the concordant children, who, on the basis of their pretest explanations, are expected to activate only one idea per problem. We turn to the secondary word recall task to provide a gauge of how much effort the children in the two groups expended on the primary math task. Figure 2B presents the proportion of word lists the children recalled correctly on the secondary task. Consider first performance on the word lists given along with Easy math problems. Both concordant and discordant children were expected to activate a single idea per problem when solving the Easy math problems. Thus, the children should expend the same amount of effort on these math problems and should not differ in the amount of effort left over for word recall. The two groups therefore should recall the same proportion of word lists after solving the Easy math problems, as, in fact, they did. Not surprisingly, both concordant and discordant children recalled a higher proportion of one-word lists than three-word lists after solving the Easy math problems. We turn next to the word lists recalled after solving the Hard problems, focusing first on the concordant children. On the basis of their pretest explanations, the concordant children were expected to activate a single idea when solving a Hard math problem. Thus, their performance on the word recall task should not differ for lists recalled after solving the Hard problems and for lists recalled after solving the Easy problems. The results presented in Figure 2B confirm this prediction. In contrast, the discordant children, on the basis of their pretest explanations, were expected to activate two ideas when solving a Hard math problem and to expend more effort on this task than the concordant children. They should consequently have less effort left over to expend on recalling the words after solving a Hard math problem and should perform poorly on this task, particularly when their capacities are taxed (i.e., when they are asked to recall the three-word lists). The data displayed in Figure 2B confirm that the discordant children recalled significantly fewer three-word lists after solving the Hard math problems than did the concordant children (F(1, 15) = 16.477, p < .001).

Fig. 2. Performance on the mathematical equivalence addition problems (Easy: 4 + 7 + 3 + 5 = _; Hard: 3 + 6 + 7 = _ + 8) and a concurrently administered word recall task. (A) presents the proportion of Easy and Hard math problems solved correctly by concordant (Conc) and discordant (Disc) children under conditions of low (one-word list) and high (three-word list) cognitive load. The children did not differ in their performance on the math problems, either as a function of concordant versus discordant status or as a function of low versus high cognitive load. (B) presents the proportion of word lists given with Easy and Hard math problems that were recalled correctly by concordant and discordant children under conditions of low (one-word list) and high (three-word list) cognitive load. After solving the Hard math problems, the discordant children had significantly less capacity available to recall the word lists under conditions of high cognitive load (i.e., on three-word lists) than the concordant children. These data suggest that the discordant children were working harder to arrive at their incorrect solutions to the Hard math problems than were the concordant children. This greater effort is hypothesized to be an outgrowth of the fact that the discordant children activated two ideas when attempting to solve each Hard math problem. The bars reflect standard errors. (Reprinted from Goldin-Meadow, Nusbaum, Garber, & Church, 1993. Copyright 1993 by the American Psychological Association. Reprinted by permission.)
The data in Figure 2B suggest that the discordant children were working harder than the concordant children to solve the Hard math problems. We suggest, on the basis of their pretest explanations, that the discordant children were working harder on problems of this type because, on each problem, they activated two ideas when attempting to solve the problem. Intuitively, a child in transition ought to be more advanced and should perform better than the child who has not yet entered the transitional period. However, the data in Figure 2B suggest otherwise: the discordant children, who were shown to be particularly ready to learn in previous studies, performed less well on the word recall task than the concordant children. When solving the Hard math problems, the discordant children carry the extra burden of too many unintegrated ideas, a burden that appears to take cognitive effort, leaving less effort available for other tasks. These findings suggest that there may be a cost to being in transition, a cost that may explain why the transitional state is an unstable state. The extra burden carried by the child in transition may make it difficult to remain in a transitional state for long periods. A child in transition who receives good instructional input is likely to progress forward (Alibali & Goldin-Meadow, 1993; Church & Goldin-Meadow, 1986; Perry et al., 1988), but one who does not receive input may find it difficult or costly to maintain transitional status and may regress (Church, 1990). The fact that being in a transitional state demands additional psychological effort may explain why regression to a prior state is commonly observed in learning.
II. The Sources of Gesture-Speech Mismatch
We have suggested that gesture-speech mismatch provides a good index of transition because it reflects a fundamental property of the transitional state. Mismatch reflects the fact that the speaker is activating two ideas on a single problem, a characteristic we take to be central to being in transition. If gesture-speech mismatch is an important characteristic of the transitional state, the factors that create mismatch may be the same factors that render a learner transitional. Thus, our next step is to examine how gesture-speech mismatch comes about.

A. INFERRING THE LEARNER'S KNOWLEDGE BASE FROM GESTURE AND SPEECH

We begin by using the explanations a child produces over a set of problems as a basis for inferring what the child knows about the problem. In our studies, a child is asked to solve a problem and then explain that solution.
The child describes a procedure for arriving at a solution and from this procedure we make inferences about the child's understanding of the problem. For example, we assume that children who say they solved the math problem 3 + 6 + 7 = _ + 7 by "adding the 3, the 6, the 7, and the 7" have a representation of the problem that includes all four numbers, with no meaningful subgroupings within the numbers (an Add-All procedure). In contrast, children who say they solved the same problem "by adding the 3, the 6, and the 7" are assumed to have a representation of the problem that includes only those numbers on the left side of the equal sign (an Add-to-Equal-Sign procedure). Thus, the procedures the child describes provide insight into the way in which that child represents the problem. Note that by observing both gesture and speech, we have two different access routes to the child's representation: one through the procedures that the child articulates in speech and a second through the procedures that the child describes in gesture. In concordant children, the two access routes provide evidence for the same representation because, by definition, concordant children tend to produce in gesture the same procedures that they produce in speech. Thus, rarely does a concordant child produce a procedure in speech that the same child does not also produce in gesture. In other words, the repertoire of procedures that the concordant child has tends to be produced in both gesture and speech.¹ Table I presents an example of the six explanations produced by a child who would be considered concordant in our math studies. The child produced five explanations in which the procedure expressed in speech was the same as the procedure expressed in gesture (that is, five matching explanations) and one explanation in which the procedures expressed in the two modalities were different (that is, one mismatching explanation).
¹ Our data show that if a concordant child produces gestures on the mathematical equivalence task, the child is likely to produce all of the procedures he or she has available in both gesture and speech. As described in the text, this does not mean that the child produces a gestural equivalent for a spoken procedure on every explanation, or even that the child produces a gestural response on every explanation, only that the child produces a gestural equivalent for each spoken procedure on at least one explanation. It is important to point out, however, that some children (albeit a relatively small proportion) do not gesture at all when explaining their solutions to the mathematical equivalence task. These children obviously must produce all of their procedures in speech and none in gesture. Data from nongesturers have not been included in the repertoire analyses presented here, but see Alibali and Goldin-Meadow (1993) for further discussion and description of children who do not gesture on the math task.

TABLE I
EXPLANATIONS AND REPERTOIRES OF A CONCORDANT CHILD ON SIX MATH PROBLEMS

Explanations
  Problem number   Relationship between speech and gesture   Procedure in speech   Procedure in gesture
  1                Match                                     AA(a)                 AA
  2                Match                                     AA                    AA
  3                Match                                     AA                    AA
  4                Match                                     AE(b)                 AE
  5                Match                                     AE                    AE
  6                Mismatch                                  AA                    AE

The repertoires inferred from this set of six problems
  Type of repertoire     Procedures expressed in the repertoire
  Gesture and speech     AA, AE
  Speech only            None
  Gesture only           None

(a) AA, the Add-All procedure. (b) AE, the Add-to-Equal-Sign procedure.

In three of the matching explanations, Add-All (AA) was the procedure expressed in both modalities; in the remaining two, Add-to-Equal-Sign (AE) was the procedure expressed in both modalities. In the single mismatching explanation, AA was expressed in speech but AE was expressed in gesture. The bottom of the table presents the repertoires of procedures that we would infer from this set of responses. The child expressed two procedures in both gesture and speech (AA and AE), and no procedures in speech alone or gesture alone. Note that the child's repertoires are based on whether a procedure ever appears in a given modality over the set of six responses. Thus, a procedure did not have to be expressed in gesture and speech on the same problem to be considered part of the Gesture and Speech repertoire; it had only to appear sometime in gesture and sometime in speech. Conversely, a procedure that is expressed in gesture alone on a single problem (e.g., the AE procedure in problem 6 in Table I) is not necessarily part of the child's Gesture Only repertoire; it would qualify as a part of this repertoire only if it were never produced in speech on any of the six explanations. In general, concordant children, by definition, tend to have repertoires in which all of the procedures they produce are part of a Gesture and Speech repertoire, with very few, if any, procedures in either the Gesture Only or the Speech Only repertoires. In contrast to concordant children, in discordant children, the two access routes provided by speech and gesture offer evidence for two different
representations: one expressed in speech and a second expressed in gesture. Thus, discordant children appear to be working with two different representations of the same problem. What types of repertoires does the discordant child then have? In fact, there are two different types of repertoires that a child could have and still be discordant. Table II presents two possible sets of explanations that, in principle, a child could produce and be considered discordant. Note that the two sets result in different repertoires.

TABLE II
POSSIBLE EXPLANATIONS AND REPERTOIRES FOR A DISCORDANT CHILD

Possible sets of explanations
                                              Response pattern #1       Response pattern #2
  Problem   Relationship between              Speech     Gesture        Speech     Gesture
  number    speech and gesture
  1         Match                             AA(a)      AA             AA         AA
  2         Match                             AA         AA             AA         AA
  3         Mismatch                          AA         AE(b)          AA         AE
  4         Mismatch                          AA         AE             AA         AE
  5         Mismatch                          AE         AA             AA         AE
  6         Mismatch                          AE         AA             AA         AE

The repertoires inferred from the two possible sets of explanations
  Type of repertoire     Response pattern #1     Response pattern #2
  Gesture and speech     AA, AE                  AA
  Speech only            None                    None
  Gesture only           None                    AE

Note: Response pattern #1 results in an overall repertoire comparable to the concordant child's overall repertoire, whereas response pattern #2 results in an overall repertoire that differs from the concordant child's overall repertoire. (a) AA = the Add-All procedure. (b) AE = the Add-to-Equal-Sign procedure.

In response pattern #1, the child produces two matching explanations and four mismatching explanations. AA is expressed in both speech and gesture in the matching explanations (problems 1 and 2). AA is also expressed in speech in two of the mismatching explanations (problems 3 and 4), and is expressed in gesture in the two other mismatching explanations (problems 5 and 6). AE is expressed in speech in two of the mismatching explanations (problems 5 and 6), and is expressed in gesture in the two other mismatching explanations (problems 3 and 4). Thus, AA belongs to
the Gesture and Speech repertoire, as does AE; there are no procedures in the Speech Only or the Gesture Only repertoires. Note that a child who produces response pattern #1, although discordant, has precisely the same overall set of repertoires as the concordant child in Table I. We now turn to response pattern #2 in Table II. Here the child again produces two matching and four mismatching explanations. AA is again expressed in speech and gesture in the matching explanations (problems 1 and 2). In addition, AA is expressed in speech in all four mismatching explanations (problems 3-6). Note, however, that AE is not expressed in speech anywhere in this set of six responses; it appears in gesture in all four mismatching explanations (problems 3-6). Thus, AA belongs to the Gesture and Speech repertoire but AE belongs to the Gesture Only repertoire; again there are no procedures in the Speech Only repertoire. The hypothetical response patterns displayed in Table II make it clear that there are two ways in which a child can be discordant. One way is associated with a set of repertoires that is identical to the set a concordant child has, but the second way is associated with a set of repertoires that is decidedly different from the concordant child's set. If concordant and discordant children differ only in the response patterns they exhibit and not in their repertoires, then this suggests that the knowledge the two types of children have about the task is identical; the children differ only in how they activate that knowledge on a given problem. Concordant children tend to activate one procedure on each problem (expressed in both speech and gesture), whereas discordant children tend to activate two procedures (one in speech and another in gesture). But overall the children have the same number of procedures in their sets of repertoires, and all of those procedures are expressed in both modalities and, at some time, appear in speech and in gesture.
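The repertoire coding described above amounts to simple set operations over the six (speech, gesture) procedure pairs. The sketch below is our own illustration of that logic, not the authors' coding procedure; the procedure labels follow Tables I and II.

```python
# Illustrative sketch (not the authors' actual coding scripts) of how
# Gesture and Speech, Speech Only, and Gesture Only repertoires are
# inferred from a set of explanations, each coded as a
# (procedure-in-speech, procedure-in-gesture) pair.

def infer_repertoires(explanations):
    """A procedure joins a repertoire according to whether it EVER
    appears in each modality across the whole set of explanations."""
    in_speech = {s for s, g in explanations}
    in_gesture = {g for s, g in explanations}
    return {
        "gesture_and_speech": in_speech & in_gesture,
        "speech_only": in_speech - in_gesture,
        "gesture_only": in_gesture - in_speech,
    }

# Table I (concordant child): five matches, one mismatch on problem 6.
concordant = [("AA", "AA"), ("AA", "AA"), ("AA", "AA"),
              ("AE", "AE"), ("AE", "AE"), ("AA", "AE")]

# Table II, response pattern #2 (discordant child): AE occurs only in gesture.
pattern_2 = [("AA", "AA"), ("AA", "AA"), ("AA", "AE"),
             ("AA", "AE"), ("AA", "AE"), ("AA", "AE")]

print(infer_repertoires(concordant))
# AA and AE both fall in the Gesture and Speech repertoire; nothing is
# unique to either modality.
print(infer_repertoires(pattern_2))
# AA falls in the Gesture and Speech repertoire; AE is Gesture Only.
```

Run on response pattern #1 of Table II, the same function returns the same repertoires as for the concordant child, which is exactly the ambiguity at issue here.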
In contrast, if concordant and discordant children differ not only in their response patterns but also in their repertoires, the knowledge the two types of children have about the task may not be identical. Differences in a child's knowledge base may ultimately be responsible for creating a discordant response pattern and may, in the end, be what makes a child transitional. We address this issue by examining the types of repertoires that concordant and discordant children actually produce.

B. THE GESTURE AND SPEECH REPERTOIRES PRODUCED BY CONCORDANT AND DISCORDANT CHILDREN

We examined the repertoires of responses produced before instruction by each of the 58 children who gestured on the pretest problems in Alibali and Goldin-Meadow's (1993) study. There were 35 discordant children
and 23 concordant children in this group (see Goldin-Meadow, Alibali, & Church, 1993, for the details of this analysis). Figure 3 presents the mean number of different procedures that the concordant and discordant children produced in their Gesture and Speech repertoires (A), their Speech Only repertoires (B), and their Gesture Only repertoires (C). As is apparent in the figure, the concordant and discordant children did not have the same sets of repertoires. The discordant children produced significantly more procedures in their Gesture Only repertoires than did the concordant children (F(1, 56) = 36.34, p < .001). Interestingly, the two groups of children did not differ in the number of procedures they produced in their Gesture and Speech repertoires (F(1, 56) = 1.802, p > .10), nor in the number of procedures they produced in their Speech Only repertoires (F(1, 56) = 2.28, p > .10). Note that both groups of children produced very few procedures in speech without, at some time, also producing those procedures in gesture. If the number of procedures found in the Gesture and Speech and in the Speech Only repertoires is the same for the discordant and concordant children, and if the discordant children have more procedures in their Gesture Only repertoires than the concordant children, then the discordant children must have more procedures in their repertoires overall than do the concordant children. That is, the discordant children have more procedures potentially available to them than the concordant children; they have larger knowledge bases. However, virtually all of the additional procedures that the discordant children have in their repertoires are procedures that the children express in gesture but do not (and perhaps cannot) express in speech. If a large number of procedures in the Gesture Only repertoires is essential to the discordant state, then we ought to see changes in this repertoire as children move into and out of this state.
Alibali and Goldin-Meadow (1993) examined the repertoires of the children who, after receiving instruction, progressed from a concordant incorrect state to a discordant state, and the repertoires of the children who, after instruction, progressed from a discordant state to a concordant correct state. The data are presented in Figure 4; note that this analysis is within-child (i.e., the same child in a concordant state at one time vs. a discordant state at another time) rather than across-child as seen in Figure 3 (i.e., concordant children vs. discordant children). As predicted, the children produced significantly more procedures in their Gesture Only repertoires when they were in a discordant state versus a concordant incorrect state (t = 3.1, df = 11, p < .01; left graph) and in a discordant state versus a concordant correct state (t = 3.0, df = 16, p ≤ .008; right graph). Thus, the children increased the number of procedures in their Gesture Only repertoires when they moved into a discordant state and decreased the number when they moved out of a discordant state. In other words, the children increased the number of procedures they expressed uniquely in gesture when they entered a transitional state and decreased the number when they exited the state. These findings suggest that having a large number of procedures unique to gesture may be an important aspect of being in transition, and that the "experimentation" with new procedures that may occur during the transitional period is likely to be expressed in gesture rather than in speech.

Fig. 3. The repertoires of procedures produced by concordant (hatched bar) versus discordant (solid bar) children in their explanations of mathematical equivalence problems. The figure presents the mean number of different procedures the children in each group demonstrated in their Gesture and Speech repertoires (A), in their Speech Only repertoires (B), and in their Gesture Only repertoires (C). The discordant children had significantly more procedures in their Gesture Only repertoires than did the concordant children, but did not differ from the concordant children in the number of procedures they had in their Gesture and Speech and Speech Only repertoires. The concordant and discordant children thus differed only in the number of different procedures they expressed uniquely in gesture. The bars reflect standard errors. (Adapted from Goldin-Meadow, Alibali, & Church, 1993.)

Fig. 4. The Gesture Only repertoires children produce as they progress from one state to another in the acquisition of mathematical equivalence. The figure presents the mean number of different procedures children demonstrated in their Gesture Only repertoires as they progressed from the concordant (hatched bar) incorrect state to the discordant (solid bar) state (A) and as they progressed from the discordant state to the concordant correct state (B). The children significantly increased the number of procedures they expressed uniquely in gesture when they entered a discordant state, and significantly decreased the number when they exited the state. The bars reflect standard errors. (Adapted from Alibali & Goldin-Meadow, 1993.)
C. THE PROCESSES BY WHICH REPERTOIRES CHANGE
We have found that the size of a child’s repertoire of procedures, that is, the number of different procedures the child has available to deal with a task, changes as the child moves in and out of transition with respect to that task. However, it is worth noting that, at all points in the acquisition process, children tend to have at their disposal more than one procedure (cf. Fig. 3). Even the concordant children in the Alibali and Goldin-Meadow (1993) study had more than one procedure in their repertoires overall (Mean = 2.09 procedures, SD = 0.85, summing across all three repertoires, compared to 3.37, SD = 1.24, for the discordant children). Thus, children exhibit variability throughout the acquisition process, more variability than one might expect if learning were to involve moving from one consistent state to another. We turn now to the question of how a child’s repertoire might change. A model in which a learner moves from one consistent state to another would predict that the learner abandons the procedure he or she has at time 1, replacing it with a completely different procedure at time 2. Under this view, very little of the learner’s repertoire would be maintained over time, and change would be accomplished primarily by abandoning old procedures and generating new ones. Alibali (1994) explored this issue in a training study in mathematical equivalence. Alibali’s goal in this study was to provide children with minimal instruction, designed to encourage children to change their understanding of the task but not necessarily to master the task. And, indeed, as intended, the children in the study made small amounts of progress, some progressing not at all, thus allowing Alibali to examine the initial steps children take as they begin to tackle mathematical equivalence. The study involved four groups of children, three who received some type of instruction in mathematical equivalence and one control group. 
Of the children who received instruction, 35 began the study in a concordant incorrect state and stayed there, 14 began in a concordant incorrect state and progressed to a discordant state, 17 began in a discordant state and regressed to a concordant incorrect state, and 14 began in a discordant state and stayed there. Our first step was to confirm in this new data set that children produce more different types of procedures in their repertoires overall when in a discordant state than when in a concordant state. Figure 5 presents the data. Children who remained in the state in which they began the study did not significantly increase or decrease the number of procedures they had in their overall repertoires (t = .927, df = 34, n.s., for the children who remained in a concordant incorrect state; t = 0, df = 13, n.s., for the children who remained in a discordant state). The children who remained in a concordant state throughout the study had approximately two procedures in their overall repertoires both before and after training, and the children who remained in a discordant state throughout the study had close to four procedures in their overall repertoires both before and after training. In contrast, and as predicted, children who progressed from a concordant incorrect state to a discordant state significantly increased the number of procedures in their overall repertoires as they entered the discordant state (t = 2.74, df = 13, p < .02), and children who regressed from a discordant state to a concordant incorrect state significantly decreased the number of procedures in their overall repertoires as they left the discordant state (t = 4.15, df = 16, p ≤ .001).

Fig. 5. The total repertoires children produce before and after minimal instruction in mathematical equivalence. The figure presents the mean number of different procedures children produced in all three repertoires (i.e., in their Gesture and Speech, Speech Only, and Gesture Only repertoires combined) when they remained in the concordant (hatched bar) incorrect state (A), when they progressed from the concordant incorrect state to the discordant (solid bar) state (B), when they regressed from the discordant state to the concordant incorrect state (C), and when they remained in the discordant state (D). The children produced larger repertoires (i.e., more different procedures were available to them) when in a discordant state than when in a concordant state. The bars reflect standard errors.

Alibali (1994) then compared the particular procedures that comprised each child's overall repertoire before and after instruction (i.e., on the pretest and posttest) and determined whether each procedure was present in the pretest repertoire and maintained to the posttest repertoire, present in the pretest repertoire and abandoned in the posttest repertoire, or absent in the pretest repertoire and generated in the posttest repertoire.
Figure 6 presents the proportion of children who maintained at least one procedure
Fig. 6. The proportion of children who maintained (solid bar), abandoned (shaded bar), and generated (white bar) procedures after minimal instruction in mathematical equivalence. The figure presents the proportion of children who maintained at least one procedure from the pretest to the posttest, the proportion who abandoned at least one procedure, and the proportion who generated at least one procedure. The children are categorized according to the states they were in before and after instruction: Children who remained in the concordant incorrect state (A), children who progressed from the concordant incorrect state to the discordant state (B), children who regressed from the discordant state to the concordant incorrect state (C), and children who remained in the discordant state (D). Virtually all of the children maintained at least one procedure from the pretest to the posttest regardless of the type of transition they made. In addition, whereas the concordant incorrect children remained in that state by keeping their repertoires intact (maintaining procedures rather than abandoning or generating them), the discordant children maintained their status by revamping their repertoires (maintaining, abandoning, and generating procedures).
Learning with a Helping Hand
Susan Goldin-Meadow and Martha Wagner Alibali
over the course of the study, the proportion who abandoned at least one procedure, and the proportion who generated at least one procedure. Children could, and frequently did, exhibit all three processes. The data in Fig. 6 are categorized according to the type of transition the child made during the study (remained concordant incorrect, progressed from concordant incorrect to discordant, regressed from discordant to concordant incorrect, and remained discordant). Note first that, independent of the type of transition the child made, the children were very likely to maintain at least one procedure from the pretest to the posttest; across all four types of transitions, 88% of the 80 children maintained at least part of their repertoires. Thus, children were not replacing the repertoires they had at time 1 with a completely new repertoire at time 2. Perhaps not surprisingly, children who remained concordant incorrect throughout the study tended to keep their repertoires intact-almost all of the children maintained at least one procedure (but not more than two procedures, cf. Fig. 5), and relatively few abandoned old procedures or generated new ones. In contrast, the children who began the study concordant incorrect and progressed to a discordant state exhibited all three processes-they maintained some of their old procedures, abandoned other old procedures, and generated new procedures, as one might expect given that they enlarged their repertoires over the course of the study. The children who began the study in a discordant state and regressed to a concordant incorrect state maintained some old procedures and abandoned others, but very few generated new procedures-again as one might expect because these children shrank their repertoires over the course of the study. The surprising data came from the children who remained discordant throughout the study. 
Unlike the children who remained concordant incorrect and maintained their status over the course of the study by keeping their repertoires intact, the discordant children maintained their status by totally revamping their repertoires - they maintained some old procedures, abandoned others, and generated new procedures, keeping the total number of procedures the same (and high, see Fig. 5) but changing the particular procedures. Despite the fact that these children did not make overt progress during the study, they obviously had been working on the task and might well have progressed to a concordant correct state if they had been given adequate instruction. In sum, children appear to maintain a great deal of continuity when learning a new task, retaining some old procedures for dealing with the task even as they acquire new ones. Continuity of this sort suggests that the transitions children make as they learn a task are gradual rather than abrupt. Children do not appear to change abruptly from a state in which
they entertain a single approach to a problem to a state in which they entertain a completely new approach to that problem. Rather, they maintain a variety of approaches to the problem throughout acquisition, gradually increasing that variety as they enter a transitional state and decreasing it as they emerge from the transitional state. Importantly, most of the increased variety that a child exhibits when in the transitional state can be found in the procedures the child expresses uniquely in gesture (i.e., the procedures the child produces in gesture and not in speech), highlighting once again that the transitional state may be difficult to detect if all one does is listen.
III. The Role of Gesture-Speech Mismatch in the Learning Process
We have shown that gesture-speech mismatch in the explanations a child produces when explaining a task is a reliable indicator that the child is in transition with respect to that task. Thus, the mismatch between gesture and speech can serve as an index that experimenters may use to identify and characterize children in transition. In this section, we explore whether gesture-speech mismatch has significance, not only for the experimenter, but also for the learner. In other words, what role (if any) does gesture-speech mismatch play in the mechanism of cognitive change?

A. THE EFFECTS OF GESTURE-SPEECH MISMATCH ON THE LEARNER
Our findings suggest that children in transition with respect to a task will tend to produce gesture-speech mismatches when they explain that task. We have further suggested that mismatch comes about because children in transition possess information that they can express in gesture but do not express in speech. If this information is to be conveyed, it must inevitably be produced in a mismatch because there is no match for this particular gesture in the child’s speech repertoire. Mismatch thus provides an excellent vehicle for identifying children who have more knowledge at their disposal than is evident in their speech. The nature of this knowledge base may, in fact, be what makes a child transitional. However, it is also possible that the production of mismatches may be important to transition. Earlier we suggested that expressing two ideas on the same problem might itself provide the impetus for change. The question is whether mismatch merely serves as an excellent device for detecting transition, or whether it plays a role in causing transition as well-that is, is gesture-speech mismatch a marker or a mechanism?
Although we do not yet have an answer to this question, the findings we have presented thus far suggest a way in which the question might be approached. We have shown that discordant children have procedures that they express in gesture but not speech (i.e., they have a Gesture Only repertoire). In addition, discordant children, by definition, produce a large number of mismatches. They thus satisfy both of the features associated with the transitional state. In order to determine whether it is the underlying knowledge base or the production of mismatches itself that is essential to transition, these two features must be pulled apart. To do this, we need a group of children who have a Gesture Only repertoire but produce few gesture-speech mismatches. Fortuitously, concordant children fall naturally into two groups: those who produce some procedures in gesture but not in speech (i.e., who have a Gesture Only repertoire) and those who do not. Note, however, that none of the concordant children, by definition, produces a large number of gesture-speech mismatches. The result is three groups of children whose progress after instruction can be compared: (1) discordant children who have a Gesture Only repertoire and produce a large number of mismatches, (2) concordant children who have a Gesture Only repertoire and produce few mismatches, and (3) concordant children who do not have a Gesture Only repertoire and produce few mismatches.* If having a Gesture Only repertoire is all that is essential to the transitional state, then groups 1 and 2 should be more likely to profit from instruction (i.e., do better on the posttest after training) than group 3. Such a finding would suggest that the nature of the knowledge base - in particular, having information in gesture that is not expressed in speech - is what is essential to the transitional state. In contrast, if the number of mismatches is the key to transition, then group 1 should do better on the posttest than groups 2 and 3.
This finding would suggest that it is the activation of two ideas on the same problem, rather than having a larger repertoire of ideas uniquely encoded in gesture, that is essential to transition. Finally, there is the possibility of an interaction - having a Gesture Only repertoire may be necessary for a child to be in a transitional state, but the production of mismatches may further enhance the child's ability to profit from instruction. If the interaction hypothesis is correct, group 1 should perform better on the posttest than group 2 which, in turn, should perform better than group 3. This outcome would suggest that having a repertoire of ideas uniquely encoded in gesture determines whether a child is in a transitional state, but the activation of those two ideas on the same

* Although it is logically possible to have the fourth group - discordant children who do not have a Gesture Only repertoire and produce a large number of mismatches (cf. response pattern #1 in Table II) - we have found no children who meet this description in any of our studies thus far.
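The three-group comparison can be summarized as a small classification rule. This is a sketch only: the group numbers follow the text, but the predicate names are ours:

```python
def transition_group(has_gesture_only, many_mismatches):
    """Classify a child into the comparison groups described in the text.
    The fourth logical combination (many mismatches without a Gesture Only
    repertoire) has never been observed, so it is flagged rather than numbered."""
    if many_mismatches and has_gesture_only:
        return 1  # discordant: Gesture Only repertoire + many mismatches
    if has_gesture_only:
        return 2  # concordant with a Gesture Only repertoire, few mismatches
    if not many_mismatches:
        return 3  # concordant without a Gesture Only repertoire
    return None   # logically possible, but not found in any study so far

print(transition_group(True, True))    # 1
print(transition_group(True, False))   # 2
print(transition_group(False, False))  # 3
```

Comparing posttest performance across groups 1-3 is what lets the knowledge-base and mismatch-production hypotheses be pulled apart.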
problem - one in speech and the other in gesture - measurably increases the child's readiness to learn. Our future work will be designed so that these analyses can be performed, thus allowing us to determine whether the act of producing gesture-speech mismatches itself facilitates transition. Even if it turns out that the production of gesture-speech mismatches has little role to play in facilitating cognitive change by affecting the learner directly, it is still possible that mismatch can play an indirect role in cognitive change by exerting an influence on the learning environment. In the next section, we explore the conditions that would have to be met in order for this possibility to be realized.
B. THE EFFECTS OF GESTURE-SPEECH MISMATCH ON THE LEARNING ENVIRONMENT

We have shown that gesture-speech mismatch identifies children in a transitional state, that is, children who are ready to learn a particular task. Of course, whether children actually learn the task depends on many factors, not the least of which is the input they receive from the environment (see, for example, Perry, Church, & Goldin-Meadow, 1992). The mismatch between a child's gestures and speech can, in principle, alert a communication partner to the fact that the child is ready to learn and can indicate the areas in which the child is most ready to make progress. Equipped with such information, a communication partner may be able to provide input that is tailored to the child's current needs. Children would then, in a sense, be shaping their own environments, making it more likely that they would receive the type of input they need to progress. The question is, are they?

The first step in exploring whether mismatch plays this type of indirect role in cognitive change is to show that gesture is interpretable, even to those who have not been trained to observe and code it. We investigated whether adults, not trained in coding gesture, used the information conveyed in a child's gestures to assess that child's knowledge of conservation (Goldin-Meadow, Wein, & Chang, 1992) or mathematical equivalence (Alibali, Flevares, & Goldin-Meadow, 1994). The design of the studies was straightforward. We selected from videotapes collected in our previous studies examples of 12 children, all of whom gave incorrect explanations in their speech. The examples were chosen so that 6 of the children produced gestures that matched their spoken incorrect explanations, and 6 produced gestures that did not match their spoken incorrect explanations.
Adults were asked to view the videotape containing these 12 examples and, after seeing each child, to assess the child’s understanding of conservation or mathematical equivalence. The adults’ spoken and gestured assessments of each child were evaluated in terms of how faithful they were
to the information conveyed in the child’s speech. Explanations that the adult attributed to the child but which the child had not expressed in speech were called Additions. Figure 7 presents the proportion of responses containing Additions that the adults produced when they assessed concordant children versus discordant children in the conservation task and the mathematical equivalence task. In both tasks, the adults produced significantly more Additions when assessing discordant children than when assessing concordant children ( t = 4.25, df = 19, p < .001 for conservation; t = 3.15, df = 19, p < .005 for mathematical equivalence). Moreover, a large proportion of the Additions produced in each task could be traced to the gestures the child produced. More remarkably, although the children had produced the additional information in gesture, the adults often “translated” this information into words and expressed it in their own speech. For example, a child on the videotape
Fig. 7. Adults' spontaneous assessments of concordant (hatched bar) and discordant (solid bar) children's understanding of conservation and mathematical equivalence. The figure presents the proportion of the adults' responses containing Additions (i.e., responses in which the adult added to the information a child conveyed in speech) that were produced when the adults assessed children's understanding of conservation or mathematical equivalence. The children were categorized according to whether their gestures matched (concordant children) or mismatched (discordant children) their speech. The adults were significantly more likely to add to the information conveyed in the children's speech when the children were discordant than when they were concordant, suggesting that the adults attended to the children's gestures and considered those gestures in relation to the children's speech. The bars reflect standard errors.
in the conservation study explained his belief that the numbers in the two rows are different by saying “they’re different because you spread them out” but gesturing that each of the checkers in one row could be paired with a checker in the other row (one-to-one correspondence). In assessing this child’s understanding of the checker problem, one of the adults said, “he thinks they’re different because you spread them out, but he sees that the rows are paired up,” thereby picking up on the reasoning the child expressed not only in speech but also in gesture. Thus, the adults were able to interpret the information the child conveyed uniquely in gesture and incorporate it into their own spoken assessments of the child. These findings suggest that adults can interpret the gestures a child produces, at least when they are presented on videotape. In an attempt to determine whether adults can interpret children’s gestures when they are produced in a naturalistic situation, Momeni (1994) altered the experimental paradigm. Rather than give the adult the open-ended task of assessing each child’s understanding of conservation, she gave the adult a list of possible conservation explanations that the child could produce and asked the adult to check off all of the explanations that the child actually did produce. This technique allowed the adult to assess the child’s understanding of the task as the task was being administered, a procedure that could be used in a naturalistic context. In addition, giving the adult the same set of explanations for each child allows us to determine how often an explanation will be selected when the child expresses it in gesture only, and compare it to how often the same explanation will be selected when the child does not express it at all. In this way, this technique provides us with a baseline for determining how likely a particular explanation is to be attributed to a child, even when it is not expressed by the child. 
Momeni (1994) first tried the checklist technique with a videotaped stimulus; she asked 16 adults to view a videotape comparable to the one used in the Goldin-Meadow et al. (1992) study, and to check off the explanations that could be attributed to each child on the tape. She then used the same technique with seven adults asked to observe a series of children as they actually participated in the conservation task. Each adult watched six children, each of whom responded to six conservation tasks. The adult was given a separate checklist for each conservation problem. Figure 8 presents the data for both the videotaped condition (A) and the naturalistic condition (B). The figure displays the proportion of times the adults attributed an explanation to a child when that explanation was produced by the child in gesture only versus not produced by the child at all. In both conditions, the adults were significantly more likely to attribute an explanation to a child when the child expressed it in gesture only than when that same
explanation was not expressed at all (t = 4.08, df = 16, p < .01, for the videotaped condition; t = 5.17, df = 6, p < .01, for the naturalistic condition). Thus, adults are capable of gleaning meaning from a child's gestures even when those gestures are seen once, for a fleeting moment, in a naturalistic context. These findings suggest that adults can detect and interpret gesture even in a relatively naturalistic situation. Of course, whether adults notice and interpret a child's gestures when they themselves are interacting with a child, and whether adults alter the way in which they interact with a child on the basis of the information they glean from the child's gestures, remain open questions that must be pursued before we can be certain that gesture plays a role in shaping the child's learning environment.

Fig. 8. Adults' responses on a checklist assessing children's understanding of conservation. The figure presents the proportion of times the adults attributed an explanation to a child, that is, the proportion of times the adults responded "yes" to an explanation on a checklist, when that explanation was produced by the child in gesture only (i.e., explanation in Child's Gestures) versus not produced by the child at all (explanation Not in Stimulus). In the Videotaped Condition (A), a videotape of a series of children explaining their conservation judgments was presented to the adults; in the Naturalistic Condition (B), a series of children were observed "live" as they participated in the conservation tasks. Under both conditions, the adults were significantly more likely to attribute an explanation to a child when the child expressed it in gesture only than when the child did not express it at all, suggesting that the adults were able to glean accurate information from the children's gestures. The bars reflect standard errors.

IV. The Representation of Information in Gesture and Speech

A. VERIFYING THE ASSUMPTION THAT GESTURE REFLECTS KNOWLEDGE

We have predicated all of the studies discussed thus far on the assumption that gesture is a vehicle through which children can express their knowledge. In this section, we describe empirical evidence that we have collected in support of this assumption. We ask whether the gestures children produce when explaining a problem convey information that can be accessed and recognized in another context. If gesture is a vehicle through which children express their knowledge, then it should be possible to access the knowledge that the children express in gesture via other means. Garber, Alibali, and Goldin-Meadow (1994) gave 9- and 10-year-old children the standard mathematical equivalence pretest, and used those pretests to determine which procedures the children expressed in their Gesture and Speech repertoires, their Speech Only repertoires, and their Gesture Only repertoires. They then gave each child a judgment test containing 36 math problems. On each problem, the child was presented with a solution generated by one of the six most common procedures children use to solve math problems of this type, and was asked to judge how acceptable that solution is for this problem. For example, for the problem
3 + 6 + 7 = __ + 7, the child was shown the number 23, a solution arrived at using the Add-All procedure (i.e., adding all four numbers in the problem), and was asked to judge how acceptable 23 is as a solution to this problem. The children were not asked for explanations at any point during the judgment task. Garber et al. (1994) first examined the children's pretests and identified 20 children whose procedures were all in a Gesture and Speech repertoire (i.e., any procedure the child produced in one modality was also produced at some time in the other modality), and 16 children whose procedures were divided between a Gesture and Speech repertoire and a Gesture Only repertoire (i.e., some procedures were produced in both modalities, whereas others were produced in gesture but not in speech). Not surprisingly given the data in Figure 3, Garber et al. found very few children who had procedures in a Speech Only repertoire, that is, very few children who produced a procedure in speech without also at some point producing it in gesture; the judgment data for these few children will not be discussed here (but see Garber et al., 1994). Garber et al. then examined the judgment data for the 20 children whose procedures were all in a Gesture and Speech repertoire. They compared the mean acceptability rating for solutions generated by procedures that the children produced in both gesture and speech on the pretest to the mean acceptability rating for solutions generated by procedures that the children had not produced in either modality on the pretest. They found that the children gave significantly higher ratings to procedures that they produced in both gesture and speech than to procedures that they did not produce at all (t = 4.17, df = 19, p < .0001). These results are not surprising, but they do confirm that the acceptability rating has some validity as a measure of a child's knowledge.
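To make the judgment-task materials concrete, here is a minimal sketch of how a solution such as 23 arises for 3 + 6 + 7 = __ + 7. Only the Add-All procedure is named in the text; the equation-balancing procedure is our addition, included purely for contrast:

```python
def add_all(addends, repeated):
    """Add-All procedure: sum every number that appears in the problem."""
    return sum(addends) + repeated

def balance(addends, repeated):
    """Equation balancing (our illustrative contrast, not a procedure named
    in the text): the blank equals the left-hand sum minus the repeated addend."""
    return sum(addends) - repeated

# For 3 + 6 + 7 = __ + 7 (addends 3, 6, 7 on the left; 7 repeated on the right):
print(add_all([3, 6, 7], 7))  # 23, the Add-All solution shown to the children
print(balance([3, 6, 7], 7))  # 9, the value that makes both sides equal
```

Each of the six procedures used in the study would generate a candidate solution in this way, and the child rated the solution, not the procedure.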
We turn now to the data of interest - that is, to the judgment data for the 16 children who had procedures in both a Gesture and Speech repertoire and a Gesture Only repertoire. Figure 9 presents the mean acceptability rating for solutions generated by procedures that these children produced in both gesture and speech on the pretest, the rating for solutions generated by procedures that the children produced in gesture but not speech on the pretest, and the rating for solutions generated by procedures that the children did not produce in either modality on the pretest. The three ratings differed significantly from one another (F(2, 15) = 23.07, p < .0001). In particular, the rating for procedures produced in both gesture and speech was significantly higher than the rating for procedures produced in gesture only (F(1, 15) = 13.83, p < .01) which, in turn, was significantly higher than the rating for procedures produced in neither modality (F(1, 15) = 12.08, p < .01). The latter comparison is of particular interest to us because it
Fig. 9. Ratings of solutions generated by procedures as a function of the modality in which those procedures were produced on a mathematical equivalence pretest. The children rated solutions generated by procedures that they themselves produced, or failed to produce, on the pretest. The procedures are categorized according to the modality in which they appeared on the pretest: procedures produced in both gesture (G) and speech (S), procedures produced in G but not S, and procedures produced in neither G nor S. The children gave significantly higher ratings to solutions generated by procedures that they produced in gesture only than they gave to solutions generated by procedures that they produced in neither modality, suggesting that they had access to, and could make use of, the information conveyed in their gestures. The bars reflect standard errors.
suggests that the information a child conveys uniquely in gesture, although never spoken, can nevertheless be accessed and judged. Thus, the gestures that a child produces do appear to reflect knowledge that the child possesses. Even if that knowledge is never articulated in speech, it can be recognized. Note that, in the judgment task used by Garber et al., children were not actually asked to rate the procedure but rather were asked to rate the solution generated by that procedure. It is unclear whether, if given the procedure outright, children who produced the procedure in gesture but not in speech would judge that procedure acceptable. In other words, if asked directly whether a given procedure is acceptable, children might be unable to accept the procedure despite the fact that they can produce it in gesture and can accept a solution generated by it. Such a finding would suggest that the knowledge children have that they express uniquely in gesture is knowledge that is still relatively embedded in the task; at this point in development, the knowledge can be isolated and used in only a limited range of other contexts (cf. Karmiloff-Smith, 1992; see also Goldin-Meadow & Alibali, 1994).
B. GESTURE AND SPEECH CAPTURE DIFFERENT ASPECTS OF MEANING
We have shown that, at certain times in the acquisition of a task, children possess knowledge about the task that they can express in gesture but do not, and perhaps cannot, express in speech. The obvious question is, why? Huttenlocher (1973, 1976) argued that it is not useful, and sometimes not even possible, to represent all aspects of human knowledge in spoken language. For example, a map of the East Coast of the United States is far more effective at capturing and conveying the contour of the coastline than words, even a large number of them, could ever be. Thus, Huttenlocher argued that there must be representational systems that do not involve words that humans use to encode information. Similarly, Anderson (1983) and Johnson-Laird (1983) each proposed a variety of representational systems, including systems that involve imagery, as options for encoding knowledge. McNeill (1992) has suggested that gesture can serve as a vehicle for one of these options. Because of its mimetic properties, gesture is particularly adept at capturing global images. Indeed, one gesture tracing the outline of the East Coast of the United States is likely to be a more informative representation of the contour of the coastline than any set of sentences could be. Unlike speech, which reflects the linear-segmented and hierarchical linguistic structure dictated by the grammar that underlies it, gesture is idiosyncratic and is constructed at the moment of speaking-it is consequently free to reflect the global-synthetic image for which it is a natural representation (see Goldin-Meadow, McNeill, & Singleton, 1995, for further discussion of this point). We suggest that, at certain moments in the acquisition of a task, gesture may be better able than speech to capture the knowledge that a child has about the task. This may be particularly true if the child’s knowledge is in the form of an image that cannot easily be broken down into the segmented components that speech requires. 
For example, children who use their pointing fingers to pair each checker in one row of a conservation task with a checker in the second row but do not express this one-to-one correspondence in their speech may have an image in which the two rows are aligned checker-by-checker. This image is easily captured in gesture but, unless the children have an understanding of the components that comprise the image, the image is not going to be easy to express in speech. Simon (1992; see also Tabachneck, 1992) provided an example in adults which is reminiscent of the state we describe in children. The adults are able to grasp an image and read certain types of knowledge off of it, but are not able to understand fully the relationships displayed in the image.
The adults in Simon's study were provided with a graphic display of supply and demand curves (curves showing the quantities of a commodity that would be supplied, or demanded, at various prices) and asked what the equilibrium price and quantity would be. Although often able to answer the question about equilibrium price and quantity correctly, the adults were not able to give cogent reasons for their correct responses. Simon argued that the adults were able to respond to the perceptual cues in the image presented in the graph but did not have a semantic interpretation of the meaning of these cues. We suggest that it is exactly at this point in their understanding of the problem that the adults may be able to express cogent reasons in gesture, despite their failure to do so in speech.

We end with a caveat. Although gesture may be better able than speech to capture the knowledge a child has about certain tasks, it may be less well suited for other tasks. The tasks we have explored in our studies - conservation and mathematical equivalence - are spatial in nature, leaving open the possibility that our results are specific to tasks of this sort and that speech (rather than gesture) will have privileged access to aspects of a child's knowledge in other domains. For example, because moral reasoning is more culturally and socially bound than mathematical reasoning, talk may be essential to making progress in the task. In this case, we might not expect gesture to have privileged access to insights in this domain, and information might well be expressed uniquely in speech rather than gesture at moments of transition in the acquisition of this concept. In fact, Goodman et al. (1991) reported that gesture and speech do not always match in children's and adults' responses to Kohlberg's moral reasoning tasks; however, it remains an open question as to whether insights into the task are first expressed in gesture or in speech within this domain.
C. GESTURE AS A WINDOW INTO THE MIND OF THE LEARNER

McNeill (1992) argued that gesture, like speech, can serve as a channel of observation into mental processes and representations, that is, as a window into the mind. Because spontaneous gesture is less codified than speech and is dictated by different constraints, it tends to reflect different aspects of a speaker's knowledge than does speech. We agree with McNeill and suggest that gesture can be a particularly revealing window in children (or learners of any age) who are in transition. We suggest that the relationship between gesture and speech not only serves as an index of the transitional state, but also provides insight into the internal processes that characterize the mind of a child in transition. By observing the gestures that children produce along with speech, we have been led to a view of the transitional state as one in which more than one
viewpoint is considered on a single problem. The concurrent activation of two views is seen directly in the gesture-speech mismatches children produce when explaining their solutions to a problem, one view expressed in speech and a second view expressed within the same response in gesture. It is also seen indirectly in the fact that activation of two views on a single problem takes cognitive effort and diminishes performance on a secondary task when children are actually solving the problems. Gesture-speech mismatch itself appears to be an outgrowth of the fact that, at a certain point in the learning process, children have a set of ideas about a problem that they express in gesture but not in speech. The set of ideas that children express in gesture at this point in development is both different from, and larger than, the set of ideas the children express in speech. Note that our characterization of the transitional state puts constraints on the types of mechanisms that can be proposed to account for learning. Any mechanism of change purported to account for this type of transition must involve two different processes. One process serves to introduce new ideas into the learner’s repertoire. In the tasks we have studied, new ideas are introduced into the learner’s gestural repertoire and therefore enter in a form that is readily expressed in gesture but not in speech. This first process thus results in an overall increase in the number of ideas the learner has available. A second process serves to sort out the multiple ideas in the learner’s repertoire, abandoning some ideas and recoding others (perhaps by recoding the images reflected in gesture into the linear and segmented code that characterizes speech). This process thus results in a decrease in the number of ideas the learner has available, all of which are now accessible to both gesture and speech. 
It is important to stress, however, that throughout the acquisition process, learners appear to have a variety of approaches to a problem (as opposed to a single, consistent approach) at their disposal. What we see during transitional periods is a marked increase in this variability, an increase that, at a minimum, serves as an index of the transitional state, and that may even be central to the learning process itself.

ACKNOWLEDGMENTS

This research was supported by Grant R01 HD18617 from the National Institute of Child Health and Human Development and by a grant from the Spencer Foundation to Goldin-Meadow, and by a National Science Foundation graduate fellowship, a William Rainey Harper Dissertation Fellowship, and a P.E.O. Scholar Award to Alibali. We thank Michelle Perry for her helpful comments on the manuscript.
Learning with a Helping Hand
REFERENCES

Acredolo, C., & O’Connor, J. (1991). On the difficulty of detecting cognitive uncertainty. Human Development, 34, 204-223.
Acredolo, C., O’Connor, J., & Horobin, K. (1989, April). Children’s understanding of conservation: From possibility to probability to necessity. Paper presented at the biennial meeting of the Society for Research in Child Development, Kansas City, MO.
Alibali, M. W. (1994). Processes of cognitive change revealed in gesture and speech. Unpublished doctoral dissertation, University of Chicago.
Alibali, M. W., Flevares, L., & Goldin-Meadow, S. (1994). Going beyond what children say to assess their knowledge. Unpublished manuscript.
Alibali, M. W., & Goldin-Meadow, S. (1993). Gesture-speech mismatch and mechanisms of learning: What the hands reveal about a child’s state of mind. Cognitive Psychology, 25, 468-523.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, N. H., & Cuneo, D. O. (1978). The height + width rule in children’s judgments of quantity. Journal of Experimental Psychology: General, 107, 335-378.
Ashcraft, M. H. (1987). Children’s knowledge of simple arithmetic: A developmental model and simulation. In J. Bisanz, C. J. Brainerd, & R. Kail (Eds.), Formal methods in developmental psychology (pp. 302-338). New York: Springer-Verlag.
Church, R. B. (1990, May). Equilibration: Using gesture and speech to monitor cognitive change. Paper presented at the Twentieth Anniversary Symposium of the Jean Piaget Society, Philadelphia, PA.
Church, R. B., & Goldin-Meadow, S. (1986). The mismatch between gesture and speech as an index of transitional knowledge. Cognition, 23, 43-71.
Crowder, E. M., & Newman, D. (1993). Telling what they know: The role of gesture and language in children’s science explanations. Pragmatics and Cognition, 1, 341-376.
Crowley, K., & Siegler, R. S. (1993). Flexible strategy use in young children’s tic-tac-toe. Cognitive Science, 17, 531-561.
Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87(3), 215-251.
Garber, P., Alibali, M. W., & Goldin-Meadow, S. (1994). Knowledge conveyed in gesture is not tied to the hands. Manuscript submitted for publication.
Gigerenzer, G., & Richter, H. R. (1990). Context effects and their interaction with development: Area judgments. Cognitive Development, 5, 235-264.
Goldin-Meadow, S., & Alibali, M. W. (1994). Do you have to be right to redescribe? Behavioral and Brain Sciences, 17, 718-719.
Goldin-Meadow, S., Alibali, M. W., & Church, R. B. (1993). Transitions in concept acquisition: Using the hand to read the mind. Psychological Review, 100(2), 279-297.
Goldin-Meadow, S., McNeill, D., & Singleton, J. (1995). Silence is liberating: Removing the handcuffs on grammatical expression in the manual modality. Psychological Review, in press.
Goldin-Meadow, S., Nusbaum, H., Garber, P., & Church, R. B. (1993). Transitions in learning: Evidence for simultaneously activated hypotheses. Journal of Experimental Psychology: Human Perception and Performance, 19, 92-107.
Goldin-Meadow, S., Wein, D., & Chang, C. (1992). Assessing knowledge through gesture: Using children’s hands to read their minds. Cognition and Instruction, 9, 201-219.
Goodman, N., Church, R. B., & Schonert, K. (1991, May). Moral development and gesture: What can the hands reveal about moral reasoning? Paper presented at the annual meeting of the Jean Piaget Society, Philadelphia, PA.
Graham, T. (1994, June). The role of gesture in learning to count. Paper presented at the annual meeting of the Jean Piaget Society, Chicago, IL.
Graham, T., & Perry, M. (1993). Indexing transitional knowledge. Developmental Psychology, 29, 779-788.
Huttenlocher, J. (1973). Language and thought. In G. A. Miller (Ed.), Communication, language, and meaning: Psychological perspectives (pp. 172-184). New York: Basic Books.
Huttenlocher, J. (1976). Language and intelligence. In L. B. Resnick (Ed.), The nature of intelligence (pp. 261-281). Hillsdale, NJ: Erlbaum.
Iverson, J., & Goldin-Meadow, S. (1995). What’s communication got to do with it? Gesture in blind children. Manuscript submitted for publication.
Johnson-Laird, P. N. (1983). Mental models. Cambridge, MA: Harvard University Press.
Karmiloff-Smith, A. (1992). Beyond modularity: A developmental perspective on cognitive science. Cambridge, MA: MIT Press.
Klahr, D. (1984). Transition processes in quantitative development. In R. J. Sternberg (Ed.), Mechanisms of cognitive development (pp. 101-140). New York: W. H. Freeman & Co.
Klahr, D., & Wallace, J. G. (1976). Cognitive development: An information-processing view. Hillsdale, NJ: Erlbaum.
Langer, J. (1969). Disequilibrium as a source of development. In P. Mussen, J. Langer, & M. Covington (Eds.), Trends and issues in developmental psychology (pp. 22-37). New York: Holt, Rinehart & Winston.
McNeill, D. (1992). Hand and mind. Chicago: University of Chicago Press.
Momeni, C. (1994). Detection of gesture in experimental and natural conditions. Unpublished master’s thesis, University of Chicago, Chicago, IL.
Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84(3), 231-259.
Perry, M., Church, R. B., & Goldin-Meadow, S. (1988). Transitional knowledge in the acquisition of concepts. Cognitive Development, 3, 359-400.
Perry, M., Church, R. B., & Goldin-Meadow, S. (1992). Is gesture-speech mismatch a general index of transitional knowledge? Cognitive Development, 7, 109-122.
Perry, M., & Elder, A. D. (1994). Knowledge in transition: Adults’ developing understanding of a principle of physical causality. Manuscript submitted for publication.
Piaget, J. (1964/1967). Six psychological studies. New York: Random House.
Piaget, J. (1975/1985). The equilibration of cognitive structures. Chicago: The University of Chicago Press.
Siegler, R. S. (1976). Three aspects of cognitive development. Cognitive Psychology, 8, 481-520.
Siegler, R. S. (1981). Developmental sequences within and between concepts. Monographs of the Society for Research in Child Development, 46 (Whole No. 189).
Siegler, R. S. (1984). Mechanisms of cognitive growth: Variation and selection. In R. J. Sternberg (Ed.), Mechanisms of cognitive development (pp. 141-162). New York: W. H. Freeman & Co.
Siegler, R. S. (1994a). Cognitive variability: A key to understanding cognitive development. Current Directions in Psychological Science, 3, 1-5.
Siegler, R. S. (1994b). Children’s thinking: How does change occur? In W. Schneider & F. Weinert (Eds.), Memory performance and competencies: Issues in growth and development. Hillsdale, NJ: Erlbaum, in press.
Siegler, R. S. (1995). How does change occur: A microgenetic study of number conservation. Cognitive Psychology, 28.
Siegler, R. S., & Jenkins, E. (1989). How children discover new strategies. Hillsdale, NJ: Erlbaum.
Siegler, R. S., & McGilly, K. (1989). Strategy choices in children’s time-telling. In I. Levin & D. Zakay (Eds.), Time and human cognition: A life span perspective (pp. 185-218). Amsterdam: Elsevier Science.
Siegler, R. S., & Shrager, J. (1984). Strategy choices in addition and subtraction: How do children know what to do? In C. Sophian (Ed.), The origins of cognitive skills (pp. 229-293). Hillsdale, NJ: Erlbaum.
Simon, H. A. (1992, August). Why the mind needs an eye: The uses of mental imagery. Distinguished Centennial Address presented at the 100th Annual Convention of the American Psychological Association, Washington, DC. Also in Complex Information Working Papers (No. 499). Pittsburgh, PA: Carnegie Mellon University.
Snyder, S. S., & Feldman, D. H. (1977). Internal and external influences on cognitive developmental change. Child Development, 48, 937-943.
Stone, A., Webb, R., & Mahootian, S. (1992). The generality of gesture-speech mismatch as an index of transitional knowledge: Evidence from a control-of-variables task. Cognitive Development, 6, 301-313.
Strauss, S. (1972). Inducing cognitive development and learning: A review of short-term training experiments. I. The organismic developmental approach. Cognition, 1(4), 329-357.
Strauss, S., & Rimalt, I. (1974). Effects of organizational disequilibrium training on structural elaboration. Developmental Psychology, 10, 526-533.
Tabachneck, H. J. M. (1992). Computational differences in mental representations: Effects of mode of data presentation on reasoning and understanding. Unpublished doctoral dissertation, Carnegie Mellon University, Pittsburgh, PA.
Thelen, E. (1989). Self-organization in developmental processes: Can systems approaches work? In M. Gunnar & E. Thelen (Eds.), Systems and development: The Minnesota Symposium on Child Psychology (pp. 77-117). Hillsdale, NJ: Erlbaum.
Turiel, E. (1969). Developmental processes in the child’s moral thinking. In P. Mussen, J. Langer, & M. Covington (Eds.), Trends and issues in developmental psychology (pp. 92-133). New York: Holt, Rinehart & Winston.
Turiel, E. (1974). Conflict and transition in adolescent moral development. Child Development, 45, 14-29.
Wilkening, F. (1980). Development of dimensional integration in children’s perceptual judgment: Experiments with area, volume, and velocity. In F. Wilkening, J. Becker, & T. Trabasso (Eds.), Information integration by children (pp. 47-69). Hillsdale, NJ: Erlbaum.
Wilkening, F. (1981). Integrating velocity, time, and distance information: A developmental study. Cognitive Psychology, 13, 231-247.
Wilkening, F., & Anderson, N. H. (1982). Comparison of two rule-assessment methodologies for studying cognitive development and knowledge structure. Psychological Bulletin, 92, 215-237.
Wilkinson, A. C. (1982). Partial knowledge and self-correction: Developmental studies of a quantitative concept. Developmental Psychology, 18, 876-893.
THE UNIVERSAL WORD IDENTIFICATION REFLEX

Charles A. Perfetti and Sulan Zhang
Reading appears to be an effortless, fluid process, yielding meanings without detectable cognitive sweat. This appearance of a facile, meaning-dominated activity arises from largely invisible intermediate nonsemantic processes, both phonological and syntactic. These nonsemantic processes include, we suggest, a set of “reflexes” that operate more or less blindly on print, just as phonological and syntactic processes operate automatically in language in general. By referring to reflexes, we do not intend to evoke all the properties of sensorimotor reflexes; rather, we want to suggest specifically the obligatory nature of the processes we discuss.¹ Our focus in this chapter is on just one of these reflexes, the phonological.² The central claim is that word identification consists in making available a familiar phonological form from a graphic input. If this claim seems tepid, compare it with the more standard view of lexical “access,” which usually refers to access of meanings as the defining event of word recognition. To

¹ An alternative is to characterize some reading processes as showing an “acquired modularity” (Perfetti, 1989). The advantage of the “reflex” metaphor is that it makes no obvious claim on the architecture of a system that produces processing characteristic of modules. Instead it picks out the key features: involuntary and rapid.

² The idea of syntactic reflexes rests on the assumption that parsing processes are dominated by (autonomous) syntactic information arising primarily from local configurational and lexical syntactic information, as opposed to information from context. This assumption remains strongly contested, but even interactive models include a rich source of syntactic information (MacDonald, Pearlmutter, & Seidenberg, 1994). See Mitchell (1994) for a review of parsing and Perfetti (1989) and Perfetti and Britt (1995) for a model that tries to accommodate both reflex ideas and context.
suggest that it is a phonological output that is the defining event is to place meaning access in a secondary role instead of the primary role, reversing the usual priority. In the lexical access approach to reading, meaning is primary and the processing question is whether or under what conditions nonsemantic information might be involved. Thus we have the classic question of whether lexical access is mediated by phonology or whether phonological information is activated “post lexically” or not at all.

Thinking of word recognition not as lexical access but as word identification leads to a different perspective, as shown in Fig. 1. Whereas lexical access asks what happens in order to gain access to word meanings, word identification asks what happens in order to identify a word. What is different are the presupposition and the entailments of posing one problem rather than the other. The lexical access problem entails an account of processes that produce access to a word (a word’s “location” in some accounts) in a lexicon; it presupposes a concept of access. The word identification problem entails an account of processes that produce a phonological representation of a printed word; it presupposes a concept of identification. (When lexical access models take a phonological output as the problem, they also imply a model of word identification. Models of naming (e.g., Besner & Smith, 1992) can be construed as models of identification that require explicit identification outputs.)

We take the central word-reading event to be not access but the identification of a word, the recovery of a specific phonological object having certain linguistic and meaning properties. To identify a word such as safe is to be able to name it. Actual naming is not synonymous with identification, but the potential to do so, if required, is the essential heart of identification. Identifying a word gives it referential and linguistic possibilities.
The processes of word identification, however, are not presupposed but are empirical questions, just as they are for lexical access.³

These preliminaries lead to a summary of the basic argument that follows: Word identification, the recovery of a phonological object and its associated nonphonological components, is the fundamental obligatory process of reading. It is the central universal reading reflex. The remainder of this chapter expands on this basic claim, examining recent research on English and, especially, Chinese word reading.

To make the case that there is a universal phonological reflex in reading, research on Chinese reading takes on importance. Chinese, as commonly described, is a writing system that allows, or even requires, a script-to-
³ Of course, identification is not decoding; the letter string broie can be decoded, but no identification has taken place. Word identification is about words, and about identifying them, being able to say what word is being read.
A
Orthographic Input
↓
word meaning

"Safe"
↓
meaning of safe (and maybe pronunciation /seyf/)

B
Orthographic Input
↓