VOLUME FIFTY-THREE
THE PSYCHOLOGY OF LEARNING AND MOTIVATION Advances in Research and Theory
Series Editor Brian H. Ross Beckman Institute and Department of Psychology University of Illinois at Urbana-Champaign Urbana, Illinois
EDITED BY
BRIAN H. ROSS
Beckman Institute and Department of Psychology
University of Illinois at Urbana-Champaign
Urbana, Illinois
AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Academic Press is an imprint of Elsevier
525 B Street, Suite 1900, San Diego, CA 92101-4495, USA
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
32 Jamestown Road, London, NW1 7BY, UK
Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands
Copyright © 2010, Elsevier Inc. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher.
Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; email: [email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://elsevier.com/locate/permissions, and selecting “Obtaining permission to use Elsevier material”.
Notice
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.
ISBN: 978-0-12-380906-3
ISSN: 0079-7421
For information on all Academic Press publications visit our website at elsevierdirect.com
Printed and bound in USA
CONTENTS
Contributors
1. Adaptive Memory: Evolutionary Constraints on Remembering
James S. Nairne
    1. Introduction: Nature’s Criterion
    2. The Mnemonic Value of Fitness-Relevant Processing
    3. Memory Theory and Nature’s Criterion
    4. Remembering with a Stone-Age Brain
    5. Conclusions
    Acknowledgments
    References
2. Digging into Déjà Vu: Recent Research on Possible Mechanisms
Alan S. Brown and Elizabeth J. Marsh
    1. Introduction
    2. Perceptual Explanation
    3. Implicit Memory Explanation
    4. Physiological Explanation
    5. Reports in Anomalous Individuals
    6. Continuing Issues
    7. Concluding Remarks
    References
3. Spacing and Testing Effects: A Deeply Critical, Lengthy, and At Times Discursive Review of the Literature
Peter F. Delaney, Peter P. J. L. Verkoeijen, and Arie Spirgel
    1. Introduction
    2. A Field Guide to the Spacing Literature: Spotting Impostors
    3. The Failure of Existing Spacing Theories
    4. Extending a Context Plus Study-Phase Retrieval Account of Spacing Effects
    5. The Testing Effect
    6. Spacing and Testing in Educational Contexts
    7. Conclusions
    References
4. How One’s Hook Is Baited Matters for Catching an Analogy
Jeffrey Loewenstein
    1. Introduction
    2. Key Roles for Retrieving Analogies
    3. Underlying Structure and Retrieving Analogies
    4. Facilitating the Retrieval of Analogies at Retrieval Time
    5. Implications
    6. Conclusion
    References
5. Generating Inductive Inferences: Premise Relations and Property Effects
John D. Coley and Nadya Y. Vasilyeva
    1. Introduction
    2. Effects of Premise Relations on Inference Generation
    3. Effects of Property on Inference Generation
    4. Inference Generation: Conclusions and Implications
    Acknowledgments
    References
6. From Uncertainly Exact to Certainly Vague: Epistemic Uncertainty and Approximation in Science and Engineering Problem Solving
Christian D. Schunn
    1. Introduction
    2. Linguistic Pragmatics of Uncertainty and Approximation
    3. Coding Approximation and Uncertainty from Speech
    4. Coding Uncertainty from Gestures
    5. Uncertainty, Approximation, and Expertise
    6. From Uncertainty to Approximation via Spatial Reasoning
    7. Summary and Discussion
    8. Future Directions
    Acknowledgments
    References
7. Event Perception: A Theory and Its Application to Clinical Neuroscience
Jeffrey M. Zacks and Jesse Q. Sargent
    1. Introduction
    2. Event Segmentation Theory
    3. Schizophrenia
    4. Obsessive-Compulsive Disorder
    5. Parkinson’s Disease
    6. Lesions of the Prefrontal Cortex
    7. Aging
    8. Alzheimer’s Disease
    9. Conclusions
    Acknowledgments
    References
8. Two Minds, One Dialog: Coordinating Speaking and Understanding
Susan E. Brennan, Alexia Galati, and Anna K. Kuhlen
    1. Introduction: The Joint Nature of Language Processing
    2. Dialog: Beyond Transcripts
    3. Process Models of Dialog
    4. The Role of Cues in Grounding
    5. Partner-Specific Processing
    6. Neural Bases of Partner-Adapted Processing
    7. Conclusions
    Acknowledgments
    References
9. Retrieving Personal Names, Referring Expressions, and Terms of Address
Zenzi M. Griffin
    1. Introduction
    2. Psychological Research on Personal Name Production
    3. Personal Names and Reference Across Cultures
    4. Direct Address in Spoken Language
    5. Conclusion
    Acknowledgments
    References
Subject Index
Contents of Recent Volumes
CONTRIBUTORS
Susan E. Brennan: Department of Psychology, Stony Brook University, Stony Brook, NY, USA
Alan S. Brown: Department of Psychology, Southern Methodist University, Dallas, TX, USA
John D. Coley: Department of Psychology, Northeastern University, Boston, MA, USA
Peter F. Delaney: Department of Psychology, The University of North Carolina at Greensboro, Greensboro, NC, USA
Alexia Galati: Department of Psychology, Stony Brook University, Stony Brook, NY, USA
Zenzi M. Griffin: Department of Psychology, University of Texas at Austin, Austin, TX, USA
Anna K. Kuhlen: Department of Psychology, Stony Brook University, Stony Brook, NY, USA
Jeffrey Loewenstein: McCombs School of Business, University of Texas at Austin, TX, USA
Elizabeth J. Marsh: Department of Psychology and Neuroscience, Duke University, Durham, NC, USA
James S. Nairne: Department of Psychological Sciences, West Lafayette, IN, USA
Jesse Q. Sargent: Department of Psychology, Washington University, St. Louis, MO, USA
Christian D. Schunn: LRDC, University of Pittsburgh, Pittsburgh, PA, USA
Arie Spirgel: Department of Psychology, The University of North Carolina at Greensboro, Greensboro, NC, USA
Nadya Y. Vasilyeva: Department of Psychology, Northeastern University, Boston, MA, USA
Peter P. J. L. Verkoeijen: Department of Psychology, Erasmus University Rotterdam, Rotterdam, The Netherlands
Jeffrey M. Zacks: Department of Psychology, Washington University, St. Louis, MO, USA
CHAPTER ONE
Adaptive Memory: Evolutionary Constraints on Remembering
James S. Nairne
Contents
1. Introduction: Nature’s Criterion
2. The Mnemonic Value of Fitness-Relevant Processing
    2.1. The Survival Processing Paradigm
    2.2. Explaining the Survival Processing Advantage
3. Memory Theory and Nature’s Criterion
    3.1. The Encoding–Retrieval Match
    3.2. Levels of Processing
    3.3. Episodic Future Thought
    3.4. Rational Analysis of Memory
4. Remembering with a Stone-Age Brain
    4.1. Building the Case for Cognitive Adaptations
    4.2. Ancestral Priorities in Survival Processing
    4.3. What Is the Adaptation?
5. Conclusions
Acknowledgments
References
Abstract
Human memory evolved subject to the constraints of nature’s criterion—differential survival and reproduction. Consequently, our capacity to remember and forget is likely tuned to solving fitness-based problems, particularly those prominent in the ancestral environments in which memory evolved. Do the operating characteristics of memory continue to bear the footprint of nature’s criterion? This is ultimately an empirical question, and I review evidence consistent with this claim. In addition, I briefly consider several explanatory assumptions of modern memory theory from the perspective of nature’s criterion. How well-equipped is the toolkit of modern memory theory to deal with a cognitive system shaped by nature’s criterion? Finally, I discuss the inherent difficulties that surround evolutionary accounts of cognition. Given there are no fossilized memory traces, and only incomplete knowledge about ancestral environments, is it possible to develop an adequate evolutionary account of remembering?
Psychology of Learning and Motivation, Volume 53. ISSN 0079-7421, DOI: 10.1016/S0079-7421(10)53001-9. © 2010 Elsevier Inc. All rights reserved.
1. Introduction: Nature’s Criterion
Imagine you were given the task of designing a human memory system from scratch. What features would you include and why? As memory’s architect, you would need a criterion, a metric against which you could judge the acceptability of design features. Modern memory theorists use a task-based criterion: People are asked to remember information for a test, such as recall or recognition, and proffered features must help predict or explain performance on the test. Although rarely justified, the choice of criterial task is obviously important. It constrains theory development and colors the theory’s final form. For example, theories of free recall lean heavily on a construct called “temporal context” because free recall requires people to remember information in the absence of explicit cues (e.g., Howard & Kahana, 2002; Raaijmakers & Shiffrin, 1981).
Yet, the capacity to remember and forget did not emerge from the mind of a memory theorist—it evolved through a tinkering process called natural selection (Darwin, 1859; Jacob, 1977). Design through natural selection has its own stringent criterion: Structural features, once they arise, are maintained if they enhance fitness—that is, survival en route to differential reproduction. If the capacity to remember failed to confer a fitness advantage, modern brains would likely lack a tendency to reference the past. Memory systems need to be adaptive or, at least, they needed to have been adaptive at some point in our evolutionary past (e.g., Symons, 1992). In building a memory system from scratch, then, the lesson of evolutionary biology is clear—pay heed to nature’s criterion.
In this chapter, I consider how nature’s criterion potentially shaped the human capacity to remember and forget. In Section 2, I review recent empirical evidence indicating that our memory systems may be specially tuned to remember information that is processed for fitness. Memory evolved because it helped us survive and reproduce and, not surprisingly, it shows sensitivity to fitness-relevant processing in these domains. In Section 3, I scan the landscape of modern memory theory through the lens of nature’s criterion. How well-equipped are the postulates, principles, and perspectives of modern memory theory to deal with a cognitive system shaped by nature’s criterion? Finally, I discuss strategies for developing an evolutionary account of remembering. There are no fossilized memory traces, our knowledge about the heritability of ancestral memory “traits” is limited, nor can we pinpoint the exact environments in which selection took place. Can we ever hope to develop a truly evolutionary account of remembering?
2. The Mnemonic Value of Fitness-Relevant Processing
If memory evolved, crafted by the forces of natural selection, then its operating characteristics likely bear some imprint of ancestral selection pressures (Klein, Cosmides, Tooby, & Chance, 2002; Nairne & Pandeirada, 2008a). This is true throughout the physical body, where the footprints of nature’s criterion are easily observed. Each of the body’s major organs plays a crucial role in helping us survive and reproduce and, typically, the fit between form and function is tight. Remnants of the original adaptive problem, or selection pressure, are readily gleaned from the organ’s architecture. The function of the heart is to pump blood and its physical structure reflects that end; the eye’s function is to transduce electromagnetic energy, and retinal cells are uniquely tuned to this task. The body also divides its labor into component parts, each designed to accomplish a particular goal (pumping and filtering blood, collecting and processing oxygen, and so on). Only adaptive problems can engage the design tools of natural selection—problems directly applicable to survival or reproductive fitness—so specificity abounds.
The central thesis of evolutionary psychology is that the architecture of the mind—our cognitive processes—shows similar specificity (Tooby & Cosmides, 1992). Particular selection pressures, or adaptive problems, fueled the development of human memory systems; consequently, the proximate mechanisms that enable us to remember and forget are likely tuned to solving such problems, particularly those prominent in the ancestral environments in which memory evolved. Although we can never fully know ancestral environments, it is reasonable to suppose that our ancestors faced recurrent adaptive problems, ones that remained relatively constant across situations. Table 1 lists some potential candidates that apply specifically to remembering (from Nairne & Pandeirada, 2008a). Note that each entry potentially relates to fitness, either through affecting the likelihood of survival, protecting kin, or increasing the chances of successful reproduction.
Human memory researchers rarely investigate the specific functional problems listed in Table 1, but relevant data do exist. For example, we have long known that fitness-relevant events can produce salient long-term retention. One compelling example is flashbulb memories, which track the retention of significant life events (Brown & Kulik, 1977; for a recent review, see Luminet & Curci, 2009). Both children and adults report strong and vivid memories for highly emotional events, such as situations in which their lives were in danger, although such memories tend to be reconstructive (e.g., Buss, 2005; Winograd & Neisser, 1992). Additional evidence comes from the study of cultural transmission: What kinds of information are most likely transferred from person to person and across generations?
Table 1 Potential Candidates for Domain-Specific Mnemonic Processes.
Type of fitness-relevant selection pressure | Examples of potential mnemonic targets relevant to each type of selection pressure
Survival-related events | Food (edible vs. inedible), water, shelter, medicinal plants, predators, prey
Navigation | Landmarks, constellations, weather patterns
Reproduction | Physical and/or social characteristics of potential mating partners and/or rivals
Social exchange | Altruistic acts, reciprocation, violation of social contracts, social status or hierarchy
Kin | Physical features and social actions of kin versus nonkin
Note: For each category, our memory systems might be tuned to remember the examples on the right; for example, remembering the locations of edible food, medicinal plants, the meaning of weather patterns, family members, and altruistic acts.
Not surprisingly, fitness-relevant information, such as information about social interactions or heroic exploits, tends to transmit easily and effectively (Mesoudi & Whiten, 2008; Rubin, 1995).
In the laboratory, studies have consistently found that people can easily associate fitness-relevant stimuli, such as snakes and spiders, with aversive events (Öhman & Mineka, 2001). So-called “taboo” words, which are often sexual in nature, are also remembered particularly well and may induce prioritized “binding processes” between items and their context (Guillet & Arndt, 2009; Schmidt & Saari, 2007). In the word recognition literature, people are faster at recognizing words that rate highly along a “usefulness to survival” dimension relative to matched controls (e.g., Wurm, 2007). There is also evidence for a kin-related bias in autobiographical memory: Unpleasant events resulting from social interactions with kin are remembered as having occurred farther back in time than similar interactions with nonkin (Lu & Chang, 2009). People are also particularly good at attributing statements about the violation of social contracts to faces in a source attribution paradigm (Buchner, Bell, Mehl, & Musch, 2009). Not surprisingly, people also tend to remember attractive faces better than average-looking faces, although the effect is larger for female than male faces (see Kenrick, Delton, Robertson, Becker, & Neuberg, 2007).
2.1. The Survival Processing Paradigm
As the studies just described illustrate, one can attempt to identify fitness-relevant events or situations and assess their mnemonic power. Collectively, the evidence suggests that our memory systems effectively retain information pertinent to situations such as those listed in Table 1. However, studies of this sort suffer from an inherent methodological problem because comparisons are typically made across different items—for example, taboo words and nontaboo words. One can attempt to equate the stimuli on other relevant dimensions, but item-selection effects are always a lingering concern. One can never be completely certain that the stimuli differ only along the particular dimension of interest.
Our laboratory has taken a different approach. Rather than comparing retention across item-type (fitness-relevant or not), participants in our experiments are asked to remember the same information (usually unrelated words). What differs across conditions is how those items are processed prior to a subsequent memory test—that is, either in terms of fitness-relevance or not. Table 2 lists the typical survival processing scenario we have used along with two relevant control conditions (Nairne, Thompson, & Pandeirada, 2007). Words are presented individually and people respond by producing a rating—for example, would this item be relevant if stranded in the grasslands of a foreign land without any survival materials? Surprise recall or recognition performance for the survival rating condition is then compared to performance in the “control” conditions, which also require meaningful, or “deep,” processing (Craik & Tulving, 1975).
Table 2 Scenarios Used in Nairne et al. (2007).
Survival | In this task we would like you to imagine that you are stranded in the grasslands of a foreign land, without any basic survival materials. Over the next few months, you will need to find steady supplies of food and water and protect yourself from predators. We are going to show you a list of words, and we would like you to rate how relevant each of these words would be for you in this survival situation. Some of the words may be relevant and others may not—it is up to you to decide.
Moving | In this task we would like you to imagine that you are planning to move to a new home in a foreign land. Over the next few months, you will need to locate and purchase a new home and transport your belongings. We are going to show you a list of words, and we would like you to rate how relevant each of these words would be for you in accomplishing this task. Some of the words may be relevant and others may not—it is up to you to decide.
Pleasantness | In this task, we are going to show you a list of words, and we would like you to rate the pleasantness of each word. Some of the words may be pleasant and others may not—it is up to you to decide.
[Figure 1: bar chart. Vertical axis: proportion correct recall. Conditions: Survival, Moving, Pleasantness.]
Figure 1 Proportion correct recall for words rated for their relevance to a survival scenario, a scenario involving moving, or for pleasantness (data adapted from Nairne et al., 2007).
Figure 1 shows the standard finding: Survival processing enhances retention relative to other forms of meaningful processing. In this particular example, survival processing produces better retention than processing for pleasantness, a condition known to be a highly effective form of deep processing (Packman & Battig, 1978). The “moving” condition is included as a schematic or thematic control. One might argue that survival processing is effective simply because it forces people to encode information into a rich and coherent schema, one that is particularly salient and accessible at retrieval. In fact, both survival processing and the moving control do tend to produce more nonlist intrusions in recall compared to pleasantness processing, suggesting that some kind of schematic processing may be involved, but survival processing still produces the best retention (Nairne et al., 2007).
The survival processing effect has been replicated a number of times in our laboratory and in other laboratories as well (Kang, McDermott, & Cohen, 2008; Weinstein, Bugg, & Roediger, 2008). The effect occurs in both within- and between-subject designs, when either recall or recognition is used as the retention measure, and when pictures instead of words are used as the to-be-remembered stimuli (Otgaar, Smeets, & Van Bergen, 2010). Perhaps most impressively, a few seconds of survival processing produces better long-term recall than a veritable “who’s who” of classic encoding manipulations. Nairne, Pandeirada, and Thompson (2008) used a between-group design to compare the effects of survival processing against forming visual images, self-reference (relating the item to a personal experience), generating an item from an anagram, and intentional learning. Each of these comparison conditions is widely recognized to enhance retention—in fact, these are the encoding manipulations typically championed in human memory textbooks—yet survival processing produced the best retention. The relevant data are shown in Figure 2.
[Figure 2: bar chart. Vertical axis: proportion correct recall. Conditions: Survival, Pleasantness, Imagery, Self-reference, Generation, Intentional.]
Figure 2 Proportion correct recall for words rated for their relevance to a survival scenario along with recall proportions for a host of other recognized encoding techniques (data adapted from Nairne et al., 2008).
Once again, everyone in these experiments is asked to remember exactly the same stimuli, so survival advantages cannot be attributed to the inherent qualities of the to-be-remembered items. Rather, it is the nature of the processing that produces the enhancement. Inducing participants to process information in a survival “mode” leads to effective long-term retention, regardless of whether information is rated as relevant to survival or not (see Nairne et al., 2007).
This last result may seem surprising—one might have expected that only survival-relevant stimuli would be remembered well. In fact, participants usually are more likely to remember items given a high survival relevance rating (see Butler, Kang, & Roediger, 2009; Nairne et al., 2007), but such comparisons suffer from the item-selection concerns noted earlier. In addition, the fit, or congruence, between to-be-remembered material and the encoding context is an important determinant of retention as well (Craik & Tulving, 1975; Schulman, 1974). Items deemed highly relevant to survival could be remembered better simply because they are more congruent with the survival-based encoding scenario. This makes comparisons between survival relevant and irrelevant stimuli difficult in the survival processing paradigm.
The fact that survival processing enhances retention even for items that are seemingly unrelated to fitness provides another indication of its mnemonic power. Any stimulus bathed in the spotlight of survival processing seems to receive some kind of mnemonic boost. Of course, in natural settings it will be the fitness-relevant stimuli that typically receive the spotlight of processing attention—irrelevant events, unlike in the laboratory, will either be ignored or processed with less vigor. At the same time, importantly, fitness-relevance is not an inherent property of most stimuli; instead, fitness-relevance is context-dependent. As Nairne and Pandeirada (2008a) put it: “food is survival relevant, but more so at the beginning of a meal than at its completion; a fur coat has high s-value at the North Pole, but low at the Equator” (p. 240). Even mundane stimuli, such as a pencil, can become quite fitness-relevant under the right circumstances (e.g., a pencil can be used as a weapon in an attack). For this reason we have suggested that survival processing may be the key to long-term enhancement, although stimuli that are naturally fitness-relevant (at least most of the time) might show better retention as well. As noted earlier, words rated as useful to survival are recognized faster and more accurately in a lexical decision task than are matched control words (e.g., Wurm, 2007).
2.2. Explaining the Survival Processing Advantage
Still, the fact that survival processing yields particularly good retention does not tell us much about the proximate mechanisms that produce the advantage. The survival advantage is an a priori prediction of an evolutionary analysis, but standard memory principles might explain it. For example, survival processing could simply lead to greater emotional arousal than control conditions, boosting later recall of information encoded in such a context (see Nairne et al., 2007; Weinstein et al., 2008). Such an account would be consistent with an evolutionary locus—that is, nature solved the adaptive problem of remembering fitness-relevant information indirectly by linking memory to emotional arousal (e.g., McGaugh, 2003, 2006).
2.2.1. Emotional Processing
However, there does not appear to be any simple link between memory and emotional processing; the relevant literature is filled with complex and conflicting findings. Increased arousal does not always lead to enhanced retention and may, in fact, reduce retention in some circumstances (Kensinger, Garoff-Eaton, & Schacter, 2007; LaBar & Cabeza, 2006). In addition, if emotional arousal mediates the survival advantage, the size of the effect should depend on the emotional rating or valence of the processed stimuli. Otgaar et al. (2010) obtained separate measures of arousal and valence for pictures and assessed recall after people rated the pictures for survival relevance, moving to a foreign land, or pleasantness. Both arousal and valence affected recall performance overall, but failed to interact with the size of the survival recall advantage. Nairne et al. (2007) performed a similar analysis with word stimuli and also failed to find any relationship between emotionality rating and the size of the survival processing advantage.
Research on memory for emotional words often shows design effects as well—that is, retention advantages for emotional words are confined to mixed designs in which both emotional and neutral words are contained in the same list (e.g., Schmidt & Saari, 2007). Such a pattern suggests that emotional words tend to be remembered well only when they “stand out” or are distinctive relative to neutral words presented in the same context. As noted earlier, the survival processing effect remains highly robust in both within- and between-subject designs. In fact, we have directly compared survival processing in within- and between-subject designs—for example, survival and pleasantness processing occurred either randomly intermixed in the same list or in different lists—and the size of the survival advantage in recall remains essentially the same in both designs. This last finding—that survival processing advantages do not show design effects—also helps distinguish the effect of survival processing from many other standard findings in the memory literature.
For example, the generation effect (generated items are remembered better than read items), the effect of bizarre imagery (forming a bizarre image of an item produces better memory than a common image), the enactment effect (subject-performed actions are remembered better than experimenter-performed actions), and the perceptual interference effect (perceptually masked words are remembered better than unmasked words) all show strong design effects, at least when free recall is used as the retention measure. Each effect is typically stronger in a within-subject design and may even fail to materialize in a between-subject design (for other examples, see McDaniel & Bugg, 2008). Again, survival processing shows no such sensitivity.
2.2.2. Thematic Processing
One could also argue that survival processing is effective simply because the rated information is processed in a rich thematic context. Thematic processing affords a number of mnemonic benefits, including enhanced relational processing, that are absent or minimized in item-based processing tasks of the type compared in Figure 2. In our original work we attempted to counter this interpretation by comparing survival to another thematic scenario—moving to a foreign land. Although we matched the moving and survival scenarios as closely as possible, one could still argue that thinking about survival is inherently more arousing, interesting, or novel than moving.
Since our original report (Nairne et al., 2007), we have replicated the survival benefit using a number of alternative thematic scenarios. For example, we have compared survival to scenarios in which people are asked to imagine themselves (a) vacationing at a fancy resort with all of their needs taken care of, (b) eating dinner at a restaurant, and (c) planning a charity event with animals at the local zoo (Nairne & Pandeirada, 2007; Nairne et al., 2007, 2008)—in each case, a survival processing advantage was found. Our survival scenario also produces better memory than one involving the planning and execution of a bank heist (Kang et al., 2008). In this case, the bank heist scenario was chosen because Kang et al. felt our original moving condition was somewhat mundane, lacking the novelty and excitement of the survival scenario.
Nairne and Pandeirada (2008b) also found robust survival processing advantages when people rated words in categorized lists. We reasoned that survival processing might induce people to encode unrelated words into an “ad hoc” category representing “things that occur in a survival situation.” Once primed by the rating task, the ad hoc category could then provide an efficient retrieval structure relative to item-based tasks such as pleasantness processing. However, such an account predicts that if the to-be-rated words are inherently related (i.e., the list is categorized), then any relational processing induced by the survival rating task should be less beneficial to retention.
Many studies have shown that relational processing of items in a related list, such as sorting items from an obviously categorized list into categories, yields few mnemonic advantages compared to identical processing of words in an unrelated list (e.g., Hunt & Einstein, 1981). In fact, encoding procedures that focus on the item itself, such as rating the item for pleasantness, produce the best recall when a list is categorized. Nairne and Pandeirada (2008b) found that survival processing continued to produce better recall than pleasantness processing, even when the lists were categorized and the items were drawn from survival-relevant categories. Perhaps the most convincing evidence against a thematic or relational processing account, though, comes from a recent study using more focused survival scenarios (Nairne, Pandeirada, Gregory, & Van Arsdall, 2009). Evolutionary psychologists often argue that extant cognitive processes evolved primarily during the Pleistocene when our species survived largely as foragers or hunter-gatherers. We continue to house a ‘‘stone-age mind,’’ one filled with adaptations uniquely designed to handle problems relevant to early hunter-gatherer environments (e.g., Tooby & Cosmides, 2005). With this in mind, we developed scenarios to tap prototypical ‘‘hunting’’ and ‘‘gathering’’ activities. In the hunter scenario, people were asked to imagine themselves living in the grasslands as part of a small group; their task was to contribute needed meat to the tribe by hunting big game, trapping small
animals, or fishing in a nearby lake. In the gathering condition, the task was to gather food for the tribe by scavenging for edible fruits, nuts, or vegetables. In line with our earlier work, participants were asked to rate the relevance of random words to these activities prior to a surprise memory test. Of main interest are the two control conditions. In the gathering control condition, participants were asked to rate the relevance of words to a task involving searching for and locating food items, but under the guise of a nonfitness-based scavenger hunt. The hunter scenario was compared to a matched control in which participants rated the relevance of words to participating in a hunting contest. Importantly, both control scenarios required people to imagine tracking and hunting for food—the same activities required in the survival scenarios—but only in the survival versions were the activities fitness-relevant (necessary for continued survival). As shown in Figure 3, significantly better recall performance was found when the scenarios induced people to process information in a survival mode (Nairne et al., 2009).

[Figure 3 plot: proportion correct recall (y-axis) for the Gatherer, Scavenger, Hunter, and Hunting contest scenarios (x-axis).]
Figure 3 The left-hand side shows proportion correct recall for words rated with respect to a ‘‘gathering’’ scenario, which was fitness-relevant, and a matched ‘‘scavenger hunt’’ scenario, which was not. The right-hand side shows data for the ‘‘hunting’’ and matched ‘‘hunting contest’’ scenario (data adapted from Nairne et al., 2009).

2.2.3. Special Adaptation?
Does processing information in a survival ‘‘mode’’ engage special mnemonic machinery—perhaps some kind of targeted adaptation uniquely sculpted by the processes of natural selection? I address this possibility in more detail later in the chapter, but some clear conclusions are possible at this point. First, at a
purely empirical level, survival processing produces excellent retention— better, in fact, than virtually all known encoding techniques. For example, as just noted, survival processing produces better retention than pleasantness processing in a categorized list—the latter is generally considered to be the ‘‘gold standard’’ against which effective encoding techniques are compared (see Hunt & McDaniel, 1993). From the perspective of nature’s criterion, this is the anticipated result—memory needs to be adaptive, particularly with respect to the maintenance and use of information related to fitness. Second, as the experiments discussed in this section illustrate, it is unlikely that domain-general factors, such as interest, novelty, emotional or thematic processing, will easily account for the retention advantages found after survival processing. ‘‘Standard’’ memory processes may yet explain the advantage, but the proximate mechanisms involved remain unknown. In the next section, I consider some of the standard explanatory mechanisms used by memory theorists in more detail, but viewed from the unique perspective of nature’s criterion. To preview, most memory theorists rely on general purpose processes to explain retention, ones that fail to consider either nature’s criterion or any specific purposeful end. It is widely accepted that our sensory systems evolved to solve a set of highly specified problems—for example, detecting edges, extracting wavelength information, maintaining shape constancy—but little is known about the comparable problems that drive our capacity to remember. Instead, researchers focus on explaining retention performance in a few well-specified tasks, such as free recall or recognition, rather than isolating the adaptive problems that memory presumably evolved to solve. Regardless of the proximate mechanisms that actually underlie the advantage, however, survival processing remains an extremely effective encoding technique. 
To maximize retention in both normal and impaired populations, it is critical to develop encoding techniques that are congruent with the natural design of memory systems. Semantic-based processing and self-referential processing have been used for years in clinical settings to improve retention (e.g., Bird, 2001; De Vreese, Neri, Fioravanti, Belloi, & Zanetti, 2001; Mimura et al., 2005), yet a few seconds of survival-based processing produces better free recall than either of these encoding tasks. Thus, understanding the functional problems that drive remembering, and the particular role that fitness-relevant processing contributes to long-term retention, should help to improve retention in a variety of populations and retrieval settings.
3. Memory Theory and Nature’s Criterion
As noted earlier, very few of the topics listed in Table 1 have received much attention in the human memory literature. This is partly due to the emphasis that cognitive psychologists place on understanding tasks, but
there is the propensity to rely on domain-general learning and memory processes as well. Most memory theorists accept that memory evolved, but fail to factor nature’s criterion into their task analyses. The possibility that there are a host of domain-specific memory processes, each uniquely crafted to solve particular fitness-relevant problems, is either ignored or rejected by the community of modern memory researchers (for some exceptions, see Klein, Cosmides, et al., 2002; Paivio, 2007; Sherry & Schacter, 1987). Instead, theorists appeal to a few general constructs or principles to explain how retention varies across situations. I discuss two of the most popular constructs below—encoding specificity and levels of processing—and then consider some recent functionally themed approaches that fit more snugly with nature’s criterion.
3.1. The Encoding–Retrieval Match
One of the most widely used theoretical constructs in memory theory is encoding specificity or, more generally, the principle of the encoding–retrieval match (see Tulving, 1983; Tulving & Thomson, 1973). This principle can be summarized as follows: Conditions present at encoding establish memory records that, in turn, are differentially accessible depending on the retrieval environment. What ultimately determines retention, at least with respect to a particular target event, is the relative match between the encoded record and the retrieval cue(s) in effect. The better the match, or more precisely the extent to which the retrieval cue matches the target better than other possible retrieval candidates, the more likely the target memory will be retrieved (see Jacoby & Craik, 1979; Nairne, 2002). Things remembered best are those with memory records that match or resemble the cues likely to be present in the testing or retrieval environment. The key element in this principle is equipotentiality: Neither events, processes, nor retrieval cues are assumed to have any special mnemonic properties (see Surprenant & Neath, 2009; Tulving, 1983). What matters is simply the functional match between the encoding and retrieval environments. Consider the picture-superiority effect: One generally finds that pictures are easier to remember than words, but the advantage is dependent on the nature of the retrieval environment (usually recall or recognition). Retrieval environments can be arranged in which words are remembered better than pictures—one merely needs to employ retrieval cues at test that are more diagnostic of previously encoded words than they are of pictures (e.g., using word fragments as cues at retrieval; see Weldon & Roediger, 1987). It is the relationship between the encoding and retrieval conditions that reigns supreme, not the content of information or the manner in which it is processed.
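The point that retrieval depends on how well a cue matches the target relative to other candidates, rather than on the absolute match, can be illustrated with a small numerical sketch. The ratio rule used here is one simple way to formalize relative match and is an assumption of this illustration; the function name, the record labels, and all match values are hypothetical and are not taken from the studies cited above.

```python
def retrieval_probability(cue_match, target):
    """Ratio-rule sketch: the target's share of the total cue-to-record match.

    cue_match maps each stored record to a hypothetical match value between
    the retrieval cue and that record (larger means a better match).
    """
    total = sum(cue_match.values())
    if total <= 0:
        raise ValueError("the cue matches no stored record")
    return cue_match[target] / total


# Hypothetical cue that matches the target moderately and rivals weakly.
distinctive_cue = {"target": 0.6, "rival_a": 0.1, "rival_b": 0.1}

# Hypothetical cue that matches the target strongly, but rivals equally well.
shared_cue = {"target": 0.9, "rival_a": 0.9, "rival_b": 0.9}

p_distinctive = retrieval_probability(distinctive_cue, "target")
p_shared = retrieval_probability(shared_cue, "target")

print(f"distinctive cue: {p_distinctive:.2f}")
print(f"shared cue:      {p_shared:.2f}")
```

Under these assumed values the second cue has the higher absolute match to the target yet yields the lower retrieval probability, because it is not diagnostic: it matches the competing records just as well.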
As a result, survival processing must be beneficial because it produces diagnostic memory records—that is, those that are likely to be matched in
later retrieval environments. By itself, of course, this reasoning is circular; additional assumptions are needed to explain why one type of encoding produces more ‘‘matchable’’ retrieval records than another. Historically, memory researchers have appealed to ‘‘elaboration’’ or ‘‘spread of encoding’’ to help solve this problem (e.g., Craik & Tulving, 1975). Effective encoding procedures are those that promote the generation of multiple retrieval cues through the linking of the target item to other information in memory. As the number of linkages—or ‘‘spread’’ of the encoding—increases, the chances that an effective retrieval cue will be encountered later increase as well. But the process itself is domain-general. Retention is controlled by the presence of a diagnostic retrieval cue; environmental factors, rather than information content alone, determine when (or if) an effective retrieval cue will be present. There are no inherent memory ‘‘tunings,’’ only taxonomies relating encoding and retrieval contexts.

Viewed through the lens of nature’s criterion, of course, equipotentiality seems unworkable. How could such a system evolve—that is, one that does not discriminate among the adaptive consequences of the processed event? There are simply too many critical problems for the developing human to solve—avoiding predators, locating nourishment, selecting an appropriate mate—to rely on such a general, content-free principle. Again, the engine that drives natural selection and structural change is fitness enhancement. Any evolved system, as a result, likely guarantees that fitness-relevant events receive some processing priority, at least relative to events that are largely fitness-irrelevant. Nature builds physical structures that solve specific problems—livers, hearts, visual systems—not general systems that remain insensitive to content. Moreover, a system that relies merely on the match between encoded records and retrieval environments remembers continuously.
On-line experiences always yield cues that will match some elements of previous experience, so restrictions are essential for the memory system to function. Decisions need to be made about how to restrict the retrieval cues that are processed, as well as the range of allowable memory records that can be matched. Selection advantages will accrue to memory systems that remember appropriately—that is, to systems that remember information pertinent to improving survival and reproduction. The match between encoding and retrieval environments may be important, perhaps even critical to successful retention, but its role in remembering must ultimately be understood in terms of some larger functional agenda.
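The ‘‘spread of encoding’’ argument in this section can be made concrete with a short worked calculation. This is only a sketch under simplifying assumptions that are not part of the original account: each of the n linkages formed at encoding is assumed to yield an effective retrieval cue at test independently, with the same probability p. The symbols p and n are introduced here purely for illustration.

```latex
\[
P(\text{at least one effective cue}) = 1 - (1 - p)^{n}
\]
% Illustrative values with p = 0.1:
%   n = 1  gives 1 - 0.9      = 0.10
%   n = 5  gives 1 - 0.9^{5}  \approx 0.41
%   n = 10 gives 1 - 0.9^{10} \approx 0.65
```

Under these assumptions the chance of encountering an effective cue rises with the number of linkages regardless of what the encoded item is about, which is the sense in which the elaboration account is domain-general.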
3.2. Levels of Processing
A similar argument applies to the popular encoding theory known as the levels of processing framework (Craik & Lockhart, 1972; Craik & Tulving, 1975). According to this view, successful retention depends on the depth of
processing that an item receives, in which ‘‘depth’’ is defined as the extent of meaningful or conceptual processing. Empirically, it is well established that thinking about the meaning of an item produces excellent long-term retention compared to more superficial forms of processing, such as attending to the shape or sound of a verbal item (e.g., Hyde & Jenkins, 1973). However, as with the picture-superiority effect, the advantage of meaningful processing depends on the characteristics of the retrieval environment. One can arrange retrieval environments in which nonmeaningful (shallow) forms of processing lead to comparatively better retention (Stein, 1978; also see Roediger, Gallo, & Geraci, 2002). But for traditional retrieval environments (e.g., free recall or recognition) processing for meaning remains an excellent vehicle for long-term retention.

At first glance, the levels of processing framework seems like a domain-specific theory—our memory systems are ‘‘tuned’’ to the processing of meaning. Yet, the theory ultimately subscribes to a kind of equipotentiality as well—meaningful processing ‘‘works’’ only because it promotes the encoding of information into highly organized and differentiated retrieval structures. Craik (2007) has used the analogy of a library:

If a new acquisition is ‘encoded deeply’ it will be shelved precisely in terms of its topic, author, date, etc., and the structure of the library catalog will later enable precise location of the book. If the new book was simply categorized in terms of its surface features (‘blue cover, 8" × 10", weighs about a pound’) it would be stored with many similar items and be difficult or impossible to retrieve later. The ability to process deeply is thus a function of a person’s expertise in some domain—it could be mathematics, French poetry, rock music, wine tasting, tennis, or a multitude of other types of knowledge. (p. 131)

From the standpoint of theory, then, memory is conceptualized in a completely domain-general way. Successful retention depends on fitting to-be-remembered material into rich, established knowledge structures that are easy to access when retrieval is needed. Moreover, it is experience, or expertise, that is the ultimate arbiter of effectiveness. Although recurrent aspects of the environment may lead to common knowledge structures across people, an individual’s unique interests and life experiences build those domains of expertise that afford the best opportunity for excellent long-term retention. As Craik (2007) notes, the levels of processing framework ‘‘postulates no special ‘store’ or ‘faculty’ of memory—or even special memory processes’’ (p. 132). Again, is it reasonable to assume that such a domain-general process is well suited for solving the wide range of mnemonic problems that humans faced throughout their evolutionary history, everything from remembering food locations, predator routes, potential mate choices, cheaters on social contracts, and so on? One might argue that fitness-relevant knowledge
structures, those germane to survival and reproduction, are simply better described than nonfitness-relevant events—that is, more organized and differentiated. However, it is unlikely that these characteristics, if present, developed with experience or expertise. Most people have limited experience with survival situations, particularly those involving predators in the grasslands of a foreign land. More importantly, remembering fitness-relevant information is too important to rely on the whims of environments that may or may not deliver the experiences necessary to build appropriate retrieval structures. Empirically, as reviewed earlier, a few seconds of survival processing leads to enhanced retention relative to traditional ‘‘deep’’ processing tasks, including ones that should activate highly organized and differentiated retrieval structures. For example, Nairne et al. (2008) compared survival processing to a self-reference task. People were asked to make survival relevance ratings about words or to rate the ease with which the word brought an important personal experience to mind. So-called ‘‘self schemas’’ are highly organized and differentiated, and well practiced, yet survival processing produced better retention. Moving and spending time at a restaurant are also well practiced compared to surviving in the grasslands, and should activate highly organized knowledge structures, yet it is survival processing that produces the better memory. Finally, as Nairne et al. (2009) have shown, one can use rating scenarios that trigger exactly the same activities (e.g., hunting or searching for food) but retention depends importantly on whether the activities are deemed fitness-relevant or not. There is little question that depth of processing and the encoding–retrieval match are important to retention; it would be folly to suggest otherwise.
Decades of research have established that retention is retrieval-cue dependent and improved by encoding techniques that maximize the chances that effective cues will be present when needed (see Tulving & Craik, 2000). However, to suggest that these two principles are sufficient to capture the essential properties of memory’s evolved architecture is nonsense. The idea that our memory systems are insensitive to content—that neither events, processes, nor retrieval cues are ‘‘special’’—ignores the specificity and defining characteristics of nature’s criterion.
3.3. Episodic Future Thought
Although the majority of memory researchers remain focused on understanding specific retrieval environments, invoking general constructs such as encoding specificity or levels of processing to explain retention, more functionally oriented perspectives do exist. One relatively recent idea is that our memory systems are fundamentally prospective—that is, oriented toward the future rather than the past (Schacter & Addis, 2007; Szpunar & McDermott, 2008). Of course, from the perspective of nature’s criterion
such a conclusion must be true. It is the ability to use the past, in combination with the present, that produces adaptive behavior (Nairne & Pandeirada, 2008a; Suddendorf & Corballis, 2007). The core idea behind the emerging concept of episodic future thought is adaptive simulation (Atance & O’Neill, 2001). People possess the unique ability to imagine, or pre-experience, events that may happen in the future, thereby enabling them to cope more effectively with future events. To consider an obvious case, the nervous teenager anticipating his first date actively envisions scenarios, imagining what his partner might do or say so that he can practice witty retorts. One can also re-create scenarios from the personal past—for example, a botched job interview—and cast alternative versions of events in the hope of performing more effectively in the future. The adaptive value of mental simulation is widely exploited across domains. Golfers, for example, often mentally picture the trajectory of a shot before addressing the ball. It has been suggested that one of the primary functions of episodic memory, one reason why the system might have evolved, is to provide the key elements or building blocks from which future thoughts can be constructed (Schacter & Addis, 2007). Indeed, evidence from a variety of sources—neuropsychological, neuroimaging, and behavioral data—indicates a close relationship between episodic retrieval and future thought simulation. For example, individuals who have lost the capacity to remember personal episodes from the past have trouble imagining personal events in the future (Klein, Loftus, & Kihlstrom, 2002; Tulving, 2002). Neuroimaging studies suggest that a common core brain network may be engaged during both episodic remembering and episodic future thought (Buckner & Carroll, 2007; Szpunar, Watson, & McDermott, 2007).
Behaviorally, the ability of normal people to imagine vivid and detailed future scenarios depends on the availability of relevant past episodes (Szpunar & McDermott, 2008). If our memory systems truly evolved to anticipate and plan for the future, then processing information in a future-oriented ‘‘planning mode’’ might produce particularly good retention. Evidence consistent with this idea has been reported recently by Klein, Robertson, and Delton (2010). People were asked to make ratings about objects in the context of a camping trip in the woods. In one condition, focused on the past, people were asked to rate the likelihood that particular objects had been taken on a past camping trip; in a second atemporal condition, people were asked simply to imagine a camping site and to rate the chances that objects were contained in the image; in the future-oriented planning condition, people were asked to rate the likelihood that they would plan to take a particular object with them on a future trip. A surprise recall test revealed that future-oriented planning produced the best retention—even better, in fact, than yet another condition in which people were asked to rate the survival value of each of the objects.
Again, to satisfy nature’s criterion, cognitive systems must be adaptive— that is, they need to produce behavior that directly or indirectly increases survivability and reproduction. A system designed merely to remember the past could not have easily evolved. The past can never occur again, at least in exactly the same form, so memory systems gain their adaptive edge by improving future responding (Suddendorf & Corballis, 1997). At the same time, a memory system that is designed simply to simulate the future falls short of nature’s criterion as well—the concept is too general. If the ability to construct future scenarios is an evolved characteristic, it arose because it ultimately enhanced fitness. Consequently, as with survival processing, we might expect the imprint of nature’s criterion to be observable in the operating characteristics of episodic future thought. For example, we might anticipate that people will simulate future events more effectively when those events are relevant to fitness than when they are not. At this point, though, the mark of nature’s criterion on episodic future thought remains to be investigated. Interestingly, the survival processing paradigm can be conceived as one that induces episodic future thought. People are asked to imagine a grasslands scenario and then to rate the relevance of events to surviving in such a context. It is easy to imagine that people in these experiments are actively simulating the scenario and anticipating how the presented events apply (or not). Memory is enhanced relative to conditions in which events receive only item-based processing, such as rating for pleasantness or forming a visual image—but also to simulated scenarios that involve activities that are not fitness-relevant (such as moving to a foreign land or participating in a hunting contest). 
However, one difference between a simulated survival scenario and ‘‘planning’’ a specific future event, such as a camping trip, is that the survival scenario is more likely to rely on generic knowledge than on personally relevant episodes. Few, if any, college-age participants have a background in grasslands-based survival situations, so it is unlikely that a survival simulation is constructed from personally relevant episodes (Klein, Loftus, et al., 2002; Szpunar, 2010). Unraveling the connections between the building blocks of episodic future thought and their evolutionary roots should prove to be a productive avenue for future research.
3.4. Rational Analysis of Memory
Another functional perspective on retention proposes that our memory systems evolved, in part, to reflect the statistical regularities of events in the environment. These ‘‘rational’’ models of memory adopt a Bayesian framework, assuming that one important function of memory is to calculate the conditional probabilities associated with event occurrence. There is presumably a cost to remembering, so it is adaptive to consider the probabilities that
particular memories will be relevant, and therefore needed, in a particular environment (Anderson & Milson, 1989; Shiffrin & Steyvers, 1997). In fact, our retention functions do seem to track the way events actually occur and recur in the environment. Forgetting functions are negatively accelerated, meaning that most of the retention loss occurs early in the function and slows thereafter. It turns out that the statistical properties of event occurrence follow essentially the same form. For example, Anderson and Schooler (1991) assessed the probability that a particular word would appear in the headlines of the New York Times as a function of the number of days that passed from an initial occurrence. So, if the phrase ‘‘Cap and Trade’’ appears in the headlines today, there is a relatively good chance that the same phrase will appear tomorrow. But the odds fall off with each successive day in a form that mimics the classic forgetting function. Anderson and Schooler’s (1991) results suggest that ‘‘forgetting’’ is simply an optimal reflection of the way events actually occur and recur in the environment. We are less likely to remember a specific occurrence with time, but that is because the event is less likely to occur again and be needed (for other supporting applications, see Anderson & Schooler, 2000). The rational approach successfully captures the idea that our cognitive systems are inherently constrained by nature—we think and remember in particular ways in order to optimize expected utilities (gains vs. costs) in a given situation. Unlike most approaches to human memory, the rational viewpoint is also functional; it assumes that memory systems are purposeful and crafted to solve specific problems in the environment. However, from the perspective of nature’s criterion, two caveats deserve mention. 
First, evolutionary psychologists generally believe that our brains developed to solve adaptive problems prevalent during the so-called environment of evolutionary adaptedness (e.g., Symons, 1992; Tooby & Cosmides, 1992). Thus, although our cognitive systems may be optimally designed, they evolved to solve problems in ancestral environments, particularly those associated with foraging lifestyles. This means that our memory systems may not be optimal in modern environments, and it may be a mistake to assume that they are designed merely to detect statistical regularities in such environments (for a discussion of optimality modeling and evolution, see Gangestad & Simpson, 2007). The second caveat emerges from a recurrent theme of this chapter—the engine that drives structural change through natural selection is the enhancement of fitness. Consequently, it is unlikely that our memory systems evolved simply to reflect the statistical properties of events—in either modern or ancestral environments; instead, the content (or, more specifically, the fitness-relevance) of the information needs to be taken into account. Our memory systems should be optimally designed to reflect the occurrence and recurrence of fitness-relevant information, rather than information in general. In fact, Anderson and Schooler (2000) suggest that it
might be less costly to process or retrieve certain kinds of memories, based on the content or ‘‘importance’’ of the events involved, although they have not pursued the issue empirically.
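The ‘‘negatively accelerated’’ shape described in this section, for forgetting and for the recurrence of events alike, can be sketched numerically. The example below assumes a power function, which is one commonly used negatively accelerated form; the function name and the parameter values a and b are arbitrary illustrative choices, not estimates from any of the studies cited above.

```python
def retention(t_days, a=0.9, b=0.5):
    """Power-function sketch: proportion retained after t_days (t_days >= 1).

    a is the assumed retention after one day and b the assumed decay rate.
    """
    return a * t_days ** (-b)


# Retention on each of the first seven days.
daily = [retention(t) for t in range(1, 8)]

# Loss from one day to the next; negative acceleration means these shrink.
losses = [daily[i] - daily[i + 1] for i in range(len(daily) - 1)]

for day, value in enumerate(daily, start=1):
    print(f"day {day}: {value:.3f}")
```

With these assumed parameters most of the loss falls between the first and second day, and each later day removes less than the one before. The same function could equally be read as the assumed probability that an event recurs t days after its last occurrence, which is the parallel the rational analysis draws.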
4. Remembering with a Stone-Age Brain
Throughout the chapter, I have developed logical arguments for specially tuned memory systems, those sculpted by the processes of natural selection. For example, memory’s tunings are unlikely to have emerged entirely from experientially based learning mechanisms—on-line experiences often do not deliver the information necessary to respond appropriately. In addition, memory systems could not have evolved to record and remember everything—problems of combinatorial explosion arise quickly so selectivity in storage is required (see Ermer, Cosmides, & Tooby, 2007). Instead, given the severity of nature’s criterion, cognitive systems likely come equipped with ‘‘crib sheets’’ or built-in biases about how to respond rapidly and efficiently to fitness-relevant input (Tooby & Cosmides, 1992). At the same time, there is a difference between recognizing that our memory systems are functionally designed—that is, ‘‘tuned’’ to solve particular kinds of problems—and discovering the ultimate origins of those tunings. Identifying an adaptation, especially a cognitive one, is notoriously difficult. There are no ‘‘fossilized’’ memory traces, and we have only limited knowledge about the ancestral environments in which our memory systems actually evolved (Buller, 2005). Adaptive solutions to recurrent problems can arise indirectly, by piggybacking on adaptations that evolved for different reasons (exaptations), or as a result of natural constraints in the environment (e.g., the physical laws of nature or genetic constraints). The proximate mechanisms that enable us to read and write, for example, could not have evolved directly for those ends even though reading and writing are very adaptive abilities.
To establish that a given cognitive mechanism, such as a mnemonic tuning, reflects an adaptation—that is, a mechanism arising directly as a consequence of evolution through natural selection—requires satisfying multiple criteria (e.g., Brandon, 1990; Williams, 1966). In principle, one would need to establish that the trait can be inherited, or passed along across generations through differential reproduction. One would also want to show that at some point in our ancestral past there were individual differences among people along the trait dimension, and that certain forms (such as a special memory tuning for fitness-relevant information) were selected because they promoted differential survival and reproduction relative to other forms. Obtaining this kind of evidence is difficult, if not functionally
impossible, for most of the cognitive adaptations of interest to evolutionary psychologists (e.g., see Richardson, 2007). It is also important to recognize that our cognitive systems were not built from scratch—natural selection ‘‘tinkers,’’ which means that changes emerge from preexisting structures. The design of these structures, in turn, introduces constraints that color how the adaptive problems that drive evolution are ultimately solved. Thus, even if we could correctly identify the ancestral selection pressures that drove the development of our memory systems, it would still be difficult to predict how nature solved the relevant adaptive problems. As noted above, the task becomes even more difficult with the recognition that adaptations can be co-opted to solve problems that are ostensibly unrelated to their functional design (Gould & Vrba, 1982). Even worse, some mnemonic phenomena may even be artifacts—so-called ‘‘spandrels’’ or incidental byproducts of other design features. For example, sensory persistence (e.g., iconic memory) may occur simply as a byproduct of the fact that neural responses are extended in time (Haber, 1983; Loftus & Irwin, 1998).
4.1. Building the Case for Cognitive Adaptations
Despite these inherent problems, one can still build compelling arguments in favor of evolutionary loci (see Andrews, Gangestad, & Matthews, 2002). Evolutionary biologists, ethologists, and comparative psychologists have been proposing adaptationist hypotheses for generations, without satisfying the stringent criteria mentioned above (see Bolles & Beecher, 1988; Shettleworth, 1998). Most scholars agree that cognitive adaptations exist in humans—for example, sensory and perceptual systems—although the evolutionary lineage is unavailable or difficult to track even in the most obvious cases. Moreover, the absence of relevant evidence is not sufficient to falsify adaptationist arguments, nor does it mean that nonadaptationist hypotheses are correct (e.g., all memory ‘‘tunings’’ emerge from experience). Both adaptationist and nonadaptationist hypotheses need to be constructed (and judged) on an empirical base. One can infer the existence of an adaptation, generate and test empirical predictions, and systematically rule out alternative explanations (Williams, 1966). As the research reviewed in this chapter illustrates, it is possible to adopt a functional/evolutionary perspective and generate empirically-testable hypotheses. One common complaint against evolutionary psychology is the proliferation of ‘‘just-so’’ stories—that is, post-hoc explanations for phenomena that seem apt from an evolutionary perspective but lack relevant empirical grounding (e.g., giraffes have long necks because they could reach more easily for food; Gould & Lewontin, 1979). By themselves, these kinds of accounts have explanatory value, but mainly to the extent that they can be used to generate empirically-testable predictions. Recognizing that
James S. Nairne
our memory systems evolved, and were subject to the constraints of nature’s criterion, led to the prediction that processing information for fitness would lead to especially good retention. Because of how labor was divided during early environments of adaptation, Silverman and Eals (1992) generated the clear prediction that women may be better equipped than men to remember the locations of objects set in fixed locales (see also New, Krasnow, Truxaw, & Gaulin, 2007). Similarly, recognizing that males and females differ in their relative amounts of parental investment generates predictions about sex-based mating strategies and parental behavior that, in turn, can be confirmed or disconfirmed empirically (Buss, 2006). It is also possible to attempt comparative analyses, across cultures and species, either to establish the universality of the trait or to demonstrate that it occurs only in environments affording the relevant selection pressures. Given an evolutionary locus, one would presumably expect to find fitness-based retention advantages across species and peoples. This may seem like a trivial prediction but, in fact, an early criticism of our work was that survival processing advantages might have arisen from exposure to culture-specific media, such as the television program Survivor; they do not (see Nairne et al., 2007; Weinstein et al., 2008). At the same time, comparative analyses, by themselves, do not provide unequivocal support for the presence of an adaptation. Universality can arise for many reasons (for example, common experiences or natural constraints across environments), and the presence of a trait across species does not necessarily mean that common adaptations are involved. Comparative analyses can be effective in helping to eliminate alternative hypotheses and may serve as one piece in a larger argument in favor of an adaptationist account. One can also look for tunings or specificity in development.
For example, many scholars believe that language learning in children is biologically prepared: the capacity for language develops easily and reliably and follows rules that cannot be readily gleaned from everyday experiences (e.g., Pinker, 1994). Moreover, the human ear and vocal tract seem perfectly tailored to meet the needs of speech, and there are specific regions in the brain that control the production and comprehension of spoken language. Many developmental psychologists argue as well that babies are born knowing all kinds of things about the world, everything from an intuitive sense of motion and the physical world to differences between animate and inanimate objects (Bloom, 2005; Gelman, 2003). Babies may also be born with a bias to recognize and remember faces and gender-specific voices, and to fear natural predators such as snakes (DeLoache & LoBue, 2009). Again, none of these data, by themselves, can decisively confirm an adaptationist locus (for example, these abilities could be exaptations, co-opted from other adaptations), but they help to bolster an adaptationist case. It would be interesting to know, for example, whether fitness-based retention effects arise easily in
Adaptive Memory: Evolutionary Constraints on Remembering
children or fitness-relevant processing activates regions in the brain that overlap (or not) with other forms of mnemonic processing. Another criterion that is sometimes used to defend an evolutionary locus is optimality. Proximate mechanisms resulting from evolved adaptations should show a special ability to maximize adaptive behavior. So, one can investigate the operating parameters of remembering and forgetting and establish (usually through some form of quantitative model) that the system rationally maximizes benefits and minimizes costs (e.g., Anderson & Milson, 1989). However, as noted earlier, adaptations, by definition, are rooted in the past; consequently, we should not necessarily expect to detect optimal behavior from an evolved system operating in a modern environment. Instead, at least in principle, we should expect to find ancestral priorities; that is, we should find that the system is tuned to operate most effectively in past environments, particularly environments associated with our foraging past. Evidence of this sort is particularly compelling for adaptationist accounts because it is difficult to see how general learning mechanisms could possibly account for an ancestral priority. In fact, evidence consistent with ancestral priorities exists in several cognitive domains. For example, New, Cosmides, and Tooby (2007) found that people are faster and more accurate at detecting animals, both human and nonhuman, than inanimate objects using the change-detection paradigm, a procedure in which people are asked to detect changes in rapidly alternating images. People were slower at detecting changes in familiar vehicles across images than they were at detecting changes in rarely experienced animal species. In the learning domain, some studies have found that ancestrally relevant fear stimuli, such as snakes and spiders, are easier to associate with aversive stimuli than modern fear-relevant stimuli such as guns and electrical outlets (see Öhman & Mineka, 2001). In addition, specific phobias are more apt to develop to ancestral stimuli (e.g., spiders) than to aversive stimuli experienced exclusively in modern environments (e.g., weapons; De Silva, Rachman, & Seligman, 1977). Although not definitive, these data are consistent with the notion that some aspects of cognitive processing may be better tuned to ancestral than to modern priorities.
4.2. Ancestral Priorities in Survival Processing
There is evidence indicating that ancestral priorities may help drive retention performance in the survival processing paradigm as well (Nairne & Pandeirada, 2010; Weinstein et al., 2008). Weinstein et al. asked people to process the relevance of words to a survival situation, but varied whether the scenario described an ancestral or a modern setting. In one condition, using the typical survival scenario (see Table 2), people were asked to imagine themselves stranded in the grasslands of a foreign land without basic survival
materials. Over the next few months, they would need to find steady supplies of food and water and protect themselves from predators. In a second condition, exactly the same scenario was used but two critical words were changed: city was substituted for grasslands and predators was replaced by attackers. Escaping from predators in the grasslands, the authors reasoned, is a closer fit to the problems faced in the environment of evolutionary adaptation; as a result, it should produce better memory than processing in a modern context, even though the latter is arguably more familiar and likely to lead to greater amounts of elaboration. Consistent with their hypothesis, better retention for the rated words was found for the group processing the ancestral scenario. Our laboratory has recently replicated this work and extended it to two new domains—attempting to cure an infection and finding necessary nourishment. In the first case, the survival scenario was once again set either in the grasslands or in a city, and participants were asked to imagine they had been hurt and a dangerous infection might be developing. Participants were instructed to rate the relevance of words to the task of finding ‘‘relevant medicinal plants’’ to cure the infection (ancestral) or finding ‘‘relevant antibiotics’’ (modern). In a second experiment, again employing either a grasslands or a city scenario, people were asked to imagine they had not eaten for several days and needed to ‘‘search for and gather edible plants’’ (ancestral) or ‘‘search for and buy food’’ (modern). In all other respects the scenarios were matched exactly. The rating task was followed by a surprise recall test for the rated words. The main results of interest are shown in Figure 4. In both experiments, people who imagined themselves in an ancestral context remembered more of the rated words than those who imagined themselves in a city. 
Importantly, both of the scenarios depicted survival situations and the adaptive problems involved (curing an infection and finding nourishment) were essentially the same. Moreover, typical for the survival processing paradigm, everyone in both experiments was asked to remember exactly the same stimuli. Despite the fact that the scenarios were very closely matched (differing in only a few words), processing an item in an ancestral survival context led to better retention than processing the same item in a modern survival context. It is tempting to conclude from these data that the ancestral scenarios induced a unique form of survival processing, one congruent with the selection pressures that originally fed the processes of natural selection.
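The dependent measure in these experiments, proportion correct recall within each scenario condition, can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions: the function names, the condition labels, and the toy word lists below are hypothetical, and the numbers are made up for the example rather than taken from Nairne and Pandeirada (2010) or any other study.

```python
from collections import defaultdict


def proportion_recalled(rated_words, recalled_words):
    """Fraction of the rated words that appear in one participant's free recall.

    Matching is case-insensitive. Recalled words that were never rated
    (intrusions) are ignored rather than penalized; that scoring rule is
    an assumption of this sketch.
    """
    rated = {w.lower() for w in rated_words}
    recalled = {w.lower() for w in recalled_words}
    return len(rated & recalled) / len(rated)


def mean_recall_by_condition(participants, rated_words):
    """Average the per-participant proportions within each scenario condition.

    `participants` is a list of (condition_label, recalled_words) pairs.
    """
    scores = defaultdict(list)
    for condition, recalled in participants:
        scores[condition].append(proportion_recalled(rated_words, recalled))
    return {cond: sum(vals) / len(vals) for cond, vals in scores.items()}


# Hypothetical toy input: made-up words and made-up recall protocols.
RATED = ["stone", "chair", "apple", "rope"]
PARTICIPANTS = [
    ("ancestral", ["stone", "apple", "rope"]),
    ("ancestral", ["apple", "rope"]),
    ("modern", ["chair", "apple"]),
    ("modern", ["rope"]),
]

MEANS = mean_recall_by_condition(PARTICIPANTS, RATED)
```

Because every participant rates the same words, the only thing that varies across conditions in such a computation is the scenario under which the words were processed, which is the logic of the matched ancestral and modern comparisons described above.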
4.3. What Is the Adaptation?
Assuming that mnemonic adaptations exist, and account partly for the fitness-based “tunings” seen in the survival processing paradigm, what form would these adaptations be likely to take? Do we have minds filled
Figure 4. Proportion correct recall for the “ancestral” conditions (searching for medicinal or edible plants) and the matched “modern” conditions (searching for antibiotics or shopping for food). Data are from Nairne and Pandeirada (2010). [Bar chart; vertical axis: proportion correct recall, 0.40 to 0.60; bars: medicinal plants, antibiotics, ancestral food, modern food.]
with highly specialized memory adaptations, each crafted to solve a particular kind of memory problem (e.g., remembering faces, edible plants, or predator types)? Or, did we evolve a few general systems defined more by flexibility than by domain-specificity? Memory researchers sometimes propose multiple memory systems (e.g., Schacter & Tulving, 1994), but those systems are typically defined by the source of information rather than by its content (see Tooby & Cosmides, 2005). For instance, we may have evolved systems for dealing with personal autobiographical events, general knowledge, or perceptual representations, but not for specific situations related to fitness (e.g., predators, food sources, or potential mates). Some neuroscientists have argued for domain-specific knowledge systems in the brain (e.g., Caramazza & Shelton, 1998), but such proposals are rarely considered by mainstream cognitive psychologists. As argued throughout, adaptations develop to solve adaptive problems, those defined by nature’s criterion. Evolutionary psychologists tend to reject content-free architectures because it is difficult to see how such structures could evolve. Structural features evolve because they enhance fitness—so, in the case of memory, our capacity to remember and forget likely developed because our memory systems helped us solve fitness problems of the sort listed in Table 1. Adaptive problems can be solved by general systems, but general systems are rarely engineered by natural selection. For example, consider a retention system based merely on meaning—information that is
processed for meaning is remembered better than information processed along more ‘‘shallow’’ perceptual dimensions (Craik & Lockhart, 1972). One could argue that processing in a survival ‘‘mode’’ induces meaningful processing, and concomitant ‘‘elaborations,’’ and therefore fitness-relevant information would typically be remembered well. However, as noted earlier, failing to differentiate between important and unimportant material (i.e., the assumption of equipotentiality) leads to a host of potential problems (e.g., combinatorial explosion of information). It is more likely that a system evolved to detect and remember fitness-relevant information, a system that could then be co-opted to remember generally. At the same time, we probably did not evolve any simple kind of ‘‘survival module.’’ The concept of survival is too general as well. As Nairne et al. (2007) argued, the retention advantages that accrue from survival processing could easily result from ‘‘multiple modules working in concert—each activated to one degree or another by the survival processing task’’ (p. 270). From an evolutionary perspective, specific processing systems may have developed for dealing with particular foods, predators, potential mating partners, and the like (e.g., see Barrett, 2005). It is probably necessary to differentiate among retention environments as well. For example, most of the work conducted to date on survival processing has used free recall as the retention measure. Free recall requires a search engine, or retrieval process, that accesses stored information using a criterion of recent occurrence. It is an episodic task, one that requires people to recall information that occurred at a specific time, in a specific location, as defined by the experiment. For some kinds of fitness-relevant problems—perhaps remembering the location of a predator or a food source—enhanced episodic retrieval might be especially beneficial. 
However, for other fitness-relevant problems, such as remembering whether someone is a cheater or a potential mate, remembering temporal and spatial information may be less useful. At this point, it is not possible to characterize mnemonic adaptations in any satisfactory fashion. We can use the lessons of evolutionary biology to speculate (for example, adaptations tend to be domain-specific and functionally designed), but logic alone is not a substitute for building a strong empirical case. Again, as the data reviewed in this chapter clearly show, it is possible to generate a priori empirical predictions about the possible functions and evolutionary roots of our memory systems. Future research will need to compare and contrast alternative accounts and “visions” of memory’s evolved architecture. However, regardless of the proximate mechanisms that are ultimately uncovered, it will be important to recognize initially that cognitive systems are functionally designed. Our memory systems are purposeful (they evolved to solve adaptive problems), and memory’s architecture is likely to reflect those functional ends.
5. Conclusions
Theories naturally evolve, based on the criterion of successfully predicting and describing performance on a criterial task. In the case of memory theory, psychologists have relied on an ever-expanding toolkit of memory measures (for example, recall, recognition, fMRI scans) but rarely explain or justify why one task should be preferred over another. Adopting such a “structuralist” mindset means, of course, that our theories tend to be task-based and rarely connected to actual problems (see Nairne, 2005). Which is likely to provide the clearest window into what it means to remember: free recall, recognition, or some other task? Most notably absent from current memory debates, however, is the recognition that nature designed our memory systems with her own criterial task: reproductive fitness. For a memory system to evolve, it must satisfy the constraints of nature’s criterion; it must easily solve the kinds of adaptive problems that engineer change through natural selection (e.g., situations of the type listed in Table 1). Accordingly, one might hypothesize, the imprints (or footprints) of those criterial problems should remain visible in the operating characteristics of memory systems. This is ultimately an empirical question, but recent research suggests that our memory systems may indeed be “tuned” to remember information and events that are relevant to fitness. In fact, as discussed earlier, a few seconds of survival processing produces better free recall performance than a veritable “who’s who” of established memory encoding techniques (Nairne et al., 2008). Recognizing a role for nature’s criterion in the design and function of memory systems has implications for our theoretical conceptions of memory as well. The crux of the functionalist agenda is the recognition that memory is functionally designed (Klein, Cosmides, et al., 2002; Nairne, 2005; Sherry & Schacter, 1987).
Our memory systems are not engineered to remember everything; decisions need to be made about storage and retrieval, and content matters. It is much more important to have memory systems that track the locations of predators and food, or the statements of potential mating partners, than other random events in the environment, regardless of the ultimate origins of those biases or tunings. Yet, many modern memory theorists continue to champion equipotentiality, expressed in the form of domain-general constructs such as encoding specificity or levels of processing. Again, the ultimate arbiter of whether our memory systems are indeed domain-specific, and whether it is appropriate to propose multiple highly specialized memory systems, is empirical. Adopting a truly functional perspective, recognizing that our memory systems are designed to solve adaptive problems, should help to establish productive empirical pathways in the future.
Finally, despite the compelling logic of an evolutionary perspective, it is important to acknowledge the difficulties that surround the search for adaptations, cognitive or otherwise. As noted, there are no fossilized memory records, the heritability of cognitive processes remains largely unknown, and we can only speculate about the selection pressures that operated in ancestral environments. There is also the troubling temptation to concoct adaptationist accounts based on plausibility rather than empirical fact (i.e., ‘‘just-so’’ stories; Gould & Lewontin, 1979). At the same time, relevant evidence can be collected about our foraging past (Tooby & Cosmides, 2005); and, as illustrated throughout, it is certainly possible to generate empirically-testable predictions about how recurrent adaptive problems impact modern memory functioning. Few scholars question the assertion that cognitive adaptations must exist, but to build a convincing empirical case for their existence requires much more. Recent research on the evolutionary determinants of memory is seeking to provide an empirical foundation on which just such a case can be made.
ACKNOWLEDGMENTS
Special thanks are due to Josefa Pandeirada for many helpful comments on the manuscript. This research was supported, in part, by a grant from the National Science Foundation (BCS-0843165).
REFERENCES
Anderson, J. R., & Milson, R. (1989). Human memory: An adaptive perspective. Psychological Review, 96, 703–719.
Anderson, J. R., & Schooler, L. J. (1991). Reflections of the environment in memory. Psychological Science, 2, 396–408.
Anderson, J. R., & Schooler, L. J. (2000). The adaptive nature of memory. In E. Tulving & F. I. M. Craik (Eds.), The Oxford handbook of memory (pp. 557–570). New York: Oxford University Press.
Andrews, P. W., Gangestad, S. W., & Matthews, D. (2002). Adaptationism: How to carry out an exaptationist program. Behavioral and Brain Sciences, 25, 489–553.
Atance, C. M., & O’Neill, D. K. (2001). Episodic future thinking. Trends in Cognitive Sciences, 5, 533–539.
Barrett, H. C. (2005). Adaptations to predators and prey. In D. Buss (Ed.), The handbook of evolutionary psychology (pp. 200–223). Hoboken, NJ: Wiley.
Bird, M. (2001). Behavioural difficulties and cued recall of adaptive behaviour in dementia: Experimental and clinical evidence. Cognitive Rehabilitation in Dementia, 3, 357–375.
Bloom, P. (2005). Descartes’ baby. New York: Basic Books.
Bolles, R. C., & Beecher, M. D. (Eds.). (1988). Evolution and learning. Hillsdale, NJ: Lawrence Erlbaum Associates.
Brandon, R. (1990). Adaptation and environment. Princeton: Princeton University Press.
Brown, R., & Kulik, J. (1977). Flashbulb memories. Cognition, 5, 73–99.
Buchner, A., Bell, R., Mehl, B., & Musch, J. (2009). No enhanced recognition memory, but better source memory for faces of cheaters. Evolution and Human Behavior, 30, 212–224.
Buckner, R. L., & Carroll, D. C. (2007). Self-projection and the brain. Trends in Cognitive Sciences, 11, 49–57.
Buller, D. J. (2005). Adapting minds: Evolutionary psychology and the persistent quest for human nature. Cambridge, MA: The MIT Press.
Buss, D. M. (2005). The murderer next door: Why the mind is designed to kill. New York: The Penguin Press.
Buss, D. M. (2006). Strategies in human mating. Psychological Topics, 2, 239–260.
Butler, A. C., Kang, S. H. K., & Roediger, H. L., III. (2009). Congruity effects between materials and processing tasks in the survival processing paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35, 1477–1486.
Caramazza, A., & Shelton, J. R. (1998). Domain-specific knowledge systems in the brain: The animate–inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.
Craik, F. I. M. (2007). Encoding: A cognitive perspective. In H. L. Roediger III, Y. Dudai, & S. M. Fitzpatrick (Eds.), Science of memory: Concepts (pp. 129–135). New York: Oxford University Press.
Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671–684.
Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104, 268–294.
Darwin, C. (1859). On the origin of species. London: John Murray.
De Silva, P., Rachman, S., & Seligman, M. (1977). Prepared phobias and obsessions: Therapeutic outcome. Behavioral Research and Therapy, 15, 65–77.
De Vreese, L. P., Neri, M., Fioravanti, M., Belloi, L., & Zanetti, O. (2001). Memory rehabilitation in Alzheimer’s disease: A review of progress. International Journal of Geriatric Psychiatry, 16, 794–809.
DeLoache, J. S., & LoBue, V. (2009). The narrow fellow in the grass: Human infants associate snakes and fear. Developmental Science, 12, 201–207.
Ermer, E., Cosmides, L., & Tooby, J. (2007). Functional specialization and the adaptationist program. In S. W. Gangestad & J. A. Simpson (Eds.), The evolution of mind: Fundamental questions and controversies (pp. 86–94). New York: Guilford Press.
Gangestad, S. W., & Simpson, J. A. (2007). Whither science of the evolution of mind. In S. W. Gangestad & J. A. Simpson (Eds.), The evolution of mind: Fundamental questions and controversies (pp. 397–437). New York: Guilford Press.
Gelman, R. (2003). The essential child: Origins of essentialism in everyday thought. Oxford: Oxford University Press.
Gould, S. J., & Lewontin, R. C. (1979). The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proceedings of the Royal Society B: Biological Sciences, 205, 581–598.
Gould, S. J., & Vrba, E. S. (1982). Exaptation: A missing term in the science of form. Paleobiology, 8, 4–15.
Guillet, R., & Arndt, J. (2009). Taboo words: The effect of emotion on memory for peripheral information. Memory & Cognition, 37, 866–879.
Haber, R. N. (1983). The impending demise of the icon: A critique of the concept of iconic storage in visual information processing. Behavioral and Brain Sciences, 6, 1–10.
Howard, M. W., & Kahana, M. J. (2002). A distributed representation of temporal context. Journal of Mathematical Psychology, 46, 269–299.
Hunt, R. R., & Einstein, G. O. (1981). Relational and item-specific information in memory. Journal of Verbal Learning and Verbal Behavior, 20, 497–514.
Hunt, R. R., & McDaniel, M. A. (1993). The enigma of organization and distinctiveness. Journal of Memory and Language, 32, 421–445.
Hyde, T. S., & Jenkins, J. J. (1973). Recall for words as a function of semantic, graphic, and syntactic orienting tasks. Journal of Verbal Learning and Verbal Behavior, 12, 471–480.
Jacob, F. (1977). Evolution and tinkering. Science, 196, 1161–1166.
Jacoby, L. L., & Craik, F. I. M. (1979). Effects of elaboration of processing at encoding and retrieval: Trace distinctiveness and recovery of initial context. In L. S. Cermak & F. I. M. Craik (Eds.), Levels of processing in human memory (pp. 1–21). Hillsdale, NJ: Erlbaum.
Kang, S., McDermott, K. B., & Cohen, S. (2008). The mnemonic advantage of processing fitness-relevant information. Memory & Cognition, 36, 1151–1156.
Kenrick, D. T., Delton, A. W., Robertson, T., Becker, D. V., & Neuberg, S. L. (2007). How the mind warps: A social evolutionary perspective on cognitive processing disjunctions. In J. P. Forgas, M. G. Haselton, & W. von Hippel (Eds.), Evolution and the social mind: Evolutionary psychology and the social mind. New York: Psychology Press.
Kensinger, E. A., Garoff-Eaton, R. J., & Schacter, D. L. (2007). Effects of emotion on memory specificity: Memory trade-offs elicited by negative visually arousing stimuli. Journal of Memory and Language, 56, 575–591.
Klein, S. B., Cosmides, L., Tooby, J., & Chance, S. (2002). Decisions and the evolution of memory: Multiple systems, multiple functions. Psychological Review, 109, 306–329.
Klein, S. B., Loftus, J., & Kihlstrom, J. F. (2002). Memory and temporal experience: The effects of episodic memory loss on an amnesic patient’s ability to remember the past and imagine the future. Social Cognition, 20, 353–379.
Klein, S. B., Robertson, T. E., & Delton, A. W. (2010). Facing the future: Memory as an evolved system for planning future acts. Memory & Cognition, 38, 13–22.
LaBar, K. S., & Cabeza, R. (2006). Cognitive neuroscience of emotional memory. Nature Reviews Neuroscience, 7, 54–64.
Loftus, G. R., & Irwin, D. E. (1998). On the relations among different measures of visible and informational persistence. Cognitive Psychology, 35, 135–199.
Lu, H. J., & Chang, L. (2009). Kinship effect on subjective temporal distance of autobiographical memory. Personality and Individual Differences, 47, 595–598.
Luminet, O., & Curci, A. (Eds.). (2009). Flashbulb memories: New issues and new perspectives. New York: Psychology Press.
McDaniel, M. A., & Bugg, J. M. (2008). Instability in memory phenomena: A common puzzle and a unifying explanation. Psychonomic Bulletin & Review, 15, 237–255.
McGaugh, J. L. (2003). Memory and emotion: The making of lasting memories. New York: Columbia University Press.
McGaugh, J. L. (2006). Make mild moments memorable: Add a little arousal. Trends in Cognitive Sciences, 10, 345–347.
Mesoudi, A., & Whiten, A. (2008). The multiple roles of cultural transmission experiments in understanding human cultural evolution. Philosophical Transactions of the Royal Society B: Biological Sciences, 363, 3489–3501.
Mimura, M., Komatsu, S. I., Kato, M., Yoshimasu, H., Moriyama, Y., & Kashima, H. (2005). Further evidence for a comparable memory advantage of self-performed tasks in Korsakoff’s syndrome and nonamnesic control subjects. Journal of the International Neuropsychological Society, 11, 545–553.
Nairne, J. S. (2002). The myth of the encoding–retrieval match. Memory, 10, 389–395.
Nairne, J. S. (2005). The functionalist agenda in memory research. In A. F. Healy (Ed.), Experimental psychology and its applications (pp. 115–126). Washington, DC: American Psychological Association.
Nairne, J. S., & Pandeirada, J. N. S. (2007). Adaptive memory: Is survival processing special? Paper presented at the 48th Annual Meeting of the Psychonomic Society.
Nairne, J. S., & Pandeirada, J. N. S. (2008a). Adaptive memory: Remembering with a stone-age brain. Current Directions in Psychological Science, 17, 239–243.
Nairne, J. S., & Pandeirada, J. N. S. (2008b). Adaptive memory: Is survival processing special? Journal of Memory and Language, 59, 377–385.
Nairne, J. S., & Pandeirada, J. N. S. (2010). Adaptive memory: Ancestral priorities and the mnemonic value of survival processing. Cognitive Psychology (2010), doi:10.1016/j.cogpsych.2010.01.005.
Nairne, J. S., Pandeirada, J. N. S., Gregory, K. J., & Van Arsdall, J. E. (2009). Adaptive memory: Fitness-relevance and the hunter-gatherer mind. Psychological Science, 20, 740–746.
Nairne, J. S., Pandeirada, J. N. S., & Thompson, S. R. (2008). Adaptive memory: The comparative value of survival processing. Psychological Science, 19, 176–180.
Nairne, J. S., Thompson, S. R., & Pandeirada, J. N. S. (2007). Adaptive memory: Survival processing enhances retention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 263–273.
New, J., Cosmides, L., & Tooby, J. (2007). Category-specific attention for animals reflects ancestral priorities, not expertise. Proceedings of the National Academy of Sciences of the United States of America, 104, 16598–16603.
New, J., Krasnow, M. M., Truxaw, D., & Gaulin, S. J. C. (2007). Spatial adaptations for plant foraging: Women excel and calories count. Proceedings of the Royal Society B: Biological Sciences, 274, 2679–2684.
Öhman, A., & Mineka, S. (2001). Fears, phobias, and preparedness: Toward an evolved module of fear and fear learning. Psychological Review, 108, 483–522.
Otgaar, H., Smeets, T., & van Bergen, S. (2010). Picturing survival memories: Enhanced memory after fitness-relevant processing occurs for verbal and visual stimuli. Memory & Cognition, 38, 23–28.
Packman, J. L., & Battig, W. F. (1978). Effects of different kinds of semantic processing on memory for words. Memory & Cognition, 6, 502–508.
Paivio, A. (2007). Mind and its evolution: A dual coding theoretical approach. Mahwah, NJ: Erlbaum.
Pinker, S. (1994). The language instinct. New York: HarperCollins.
Raaijmakers, J. G. W., & Shiffrin, R. M. (1981). Search of associative memory. Psychological Review, 88, 93–134.
Richardson, R. C. (2007). Evolutionary psychology as maladapted psychology. Cambridge, MA: The MIT Press.
Roediger, H. L., III, Gallo, D. A., & Geraci, L. (2002). Processing approaches to cognition: The impetus from the levels-of-processing framework. Memory, 10, 319–332.
Rubin, D. C. (1995). Memory in oral traditions: The cognitive psychology of epic, ballads, and counting-out rhymes. New York: Oxford University Press.
Schacter, D. L., & Addis, D. R. (2007). The cognitive neuroscience of constructive memory: Remembering the past and imagining the future. Philosophical Transactions of the Royal Society B: Biological Sciences, 362, 773–786.
Schacter, D. L., & Tulving, E. (1994). What are the memory systems of 1994? In D. L. Schacter & E. Tulving (Eds.), Memory systems (pp. 1–38). Cambridge, MA: The MIT Press.
Schmidt, S. R., & Saari, B. (2007). The emotional memory effect: Differential processing or item distinctiveness? Memory & Cognition, 35, 1905–1916.
Schulman, A. I. (1974). Memory for words recently classified. Memory & Cognition, 2, 47–52.
Sherry, D. F., & Schacter, D. L. (1987). The evolution of multiple memory systems. Psychological Review, 94, 439–454.
Shettleworth, S. J. (1998). Cognition, evolution, and behavior. New York: Oxford University Press.
Shiffrin, R. M., & Steyvers, M. (1997). A model for recognition memory: REM—Retrieving effectively from memory. Psychonomic Bulletin & Review, 4, 145–166.
Silverman, I., & Eals, M. (1992). Sex differences in spatial abilities: Evolutionary theory and data. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 531–549). New York: Oxford University Press.
Stein, B. S. (1978). Depth of processing reexamined: The effects of precision of encoding and test appropriateness. Journal of Verbal Learning and Verbal Behavior, 17, 165–174.
Suddendorf, T., & Corballis, M. C. (1997). Mental time travel and the evolution of the human mind. Genetic, Social, and General Psychology Monographs, 123, 133–167.
Suddendorf, T., & Corballis, M. C. (2007). The evolution of foresight: What is mental time travel, and is it unique to humans? Behavioral and Brain Sciences, 30, 299–313.
Surprenant, A. M., & Neath, I. (2009). Principles of memory. New York: Psychology Press.
Symons, D. (1992). On the use and misuse of Darwinism in the study of human behavior. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 137–159). New York: Oxford University Press.
Szpunar, K. K. (2010). Episodic future thought: An emerging concept. Perspectives on Psychological Science, 5, 142–162.
Szpunar, K. K., & McDermott, K. B. (2008). Episodic memory: An evolving concept. In D. Sweat, R. Menzel, H. Eichenbaum, & H. L. Roediger III (Eds.), Learning and memory: A comprehensive reference (pp. 491–510). Oxford: Elsevier.
Szpunar, K. K., Watson, J. M., & McDermott, K. B. (2007). Neural substrates of envisioning the future. Proceedings of the National Academy of Sciences of the United States of America, 104, 642–647.
Tooby, J., & Cosmides, L. (1992). The psychological foundations of culture. In J. H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 19–136). New York: Oxford University Press.
Tooby, J., & Cosmides, L. (2005). Conceptual foundations of evolutionary psychology. In D. Buss (Ed.), The handbook of evolutionary psychology (pp. 5–67). Hoboken, NJ: Wiley.
Tulving, E. (1983). Elements of episodic memory. New York: Oxford University Press.
Tulving, E. (2002). Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1–25.
Tulving, E., & Craik, F. I. M. (Eds.). (2000). The Oxford handbook of memory. Oxford: Oxford University Press.
Tulving, E., & Thomson, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80, 352–373.
Weinstein, Y., Bugg, J. M., & Roediger, H. L. (2008). Can the survival recall advantage be explained by basic memory processes? Memory & Cognition, 36, 913–919.
Weldon, M. S., & Roediger, H. L., III. (1987). Altering retrieval demands reverses the picture superiority effect. Memory & Cognition, 15, 269–280.
Williams, G. C. (1966). Adaptation and natural selection. Princeton: Princeton University Press.
Winograd, E., & Neisser, U. (1992). Affect and accuracy in recall: Studies of “flashbulb memories.” New York: Cambridge University Press.
Wurm, L. H. (2007). Danger and usefulness: An alternative framework for understanding rapid evaluation effects in perception? Psychonomic Bulletin & Review, 14, 1218–1225.
CHAPTER TWO

Digging into Déjà Vu: Recent Research on Possible Mechanisms

Alan S. Brown and Elizabeth J. Marsh

Contents
1. Introduction
2. Perceptual Explanation
  2.1. Jacoby and Whitehouse (1989)
  2.2. Split Perception: Study 1
  2.3. Split Perception: Study 2
  2.4. Split Perception: Study 3
  2.5. Superficial Glance = Shallow Processing?
3. Implicit Memory Explanation
  3.1. Episodic Experience
  3.2. Single-Element Familiarity Explanation
  3.3. Gestalt Familiarity Explanation
  3.4. Hypnosis
4. Physiological Explanation
  4.1. Neural Transmission Asynchrony
  4.2. Surgical Elimination of Déjà Vu
  4.3. Surgical Elicitation of Déjà Vu
5. Reports in Anomalous Individuals
  5.1. Blindness
  5.2. Chronic Déjà Vu
6. Continuing Issues
  6.1. Aging
  6.2. Dreams
  6.3. Single versus Multiple Causes
  6.4. Jamais Vu
7. Concluding Remarks
References
Abstract

The déjà vu experience has piqued the interest of philosophers and physicians for over 150 years, and has recently begun to connect to research on fundamental cognitive mechanisms. Following a brief description of the nature of this recognition anomaly, this chapter summarizes findings from several laboratories that are related to this memory phenomenon. In our labs, we have found support for three possible mechanisms that could trigger déjà vu. The first is split perception, which posits that a déjà vu is caused by a brief glance at an object or scene just prior to a fully aware look. Thus, the perception is split into two parts and appears to be eerily duplicated. A second mechanism is implicit memory, whereby a prior setting actually has been experienced before by the person but stored in such an indistinct manner that only the sense of familiarity is resurrected. Another example of an implicit memory effect involves a single part of a larger scene that is familiar but not identified as such, with the result that the strong sense of familiarity associated with this portion inappropriately bleeds over onto the entire scene. Others have found support for gestalt familiarity, that the framework of the present setting closely resembles something experienced before in outline but not in specifics. We also present physiological evidence from brain and cognitive dysfunctions that relate to our understanding of déjà vu. Finally, some important but unresolved issues in déjà vu research are noted, ones that should guide future research on the topic.

Psychology of Learning and Motivation, Volume 53. ISSN 0079-7421, DOI: 10.1016/S0079-7421(10)53002-0. © 2010 Elsevier Inc. All rights reserved.
1. Introduction

We have all some experience of a feeling that comes over us occasionally of what we are saying or doing having been done in a remote time—of our having been surrounded dim ages ago by the same faces, objects, and circumstances—of our knowing perfectly well what will be said next, as if we suddenly remembered it.
David Copperfield, Charles Dickens (1849, p. 630)
Perhaps the most exciting insights into the nature of cognitive function happen when normal processes break down. Roediger (1996) notes that the field of perceptual psychology embraced, early on, the study of illusions as a conduit to better understand normal perceptual processes. Yet memory researchers have not been as enthusiastic about such an approach, perhaps because memory dysfunction (compared to perceptual dysfunction) is more closely associated with global mental and physical pathology (cf. Brown, 2004). While a few memory illusions have been extensively investigated, such as false recall (Roediger & McDermott, 1995) and conjunction errors (Jones & Atchley, 2006), déjà vu is perhaps the most interesting and dramatic of memory illusions because it involves a clash of two rational and routine cognitive evaluations: familiarity versus unfamiliarity. During déjà vu, one feels that a setting or event is strongly familiar, yet rationally “knows” that it is not.
Stepping back into the realm of perceptual psychology, there are two different classes of illusions: those that are not attention grabbing (Müller-Lyer) and those that are (wagon wheel). With the Müller-Lyer illusion, one simply perceives the line capped with arrowheads to be shorter than the one capped with arrow tails, without surprise or awareness of one’s error. In contrast, in the wagon wheel illusion, the spokes of the stagecoach wheels appear to be turning backwards, as in old cowboy movies, jolting our awareness. We know that the wheels are not really turning in the reverse direction, and that the movie frames are simply out of sync with the wheel spokes.

Turning back to the realm of memory, there are also two categories of illusions: those that we are aware of, and those that we are not. When we fail to recognize an old friend in a crowd as they walk past us, we are unaware of it and it does not capture our attention. On the other hand, when we fly to Key West for the first time and our rented vacation condo feels strikingly familiar, we experience an uncomfortable mental incongruity that grabs hold of us and elicits a déjà vu.

The literature on the déjà vu experience is extensive, going back 150 years (cf. Brown, 2004). Most early reports involve personal reflections in the form of literary descriptions and personal anecdotes. A few attempted to document a connection between déjà vu and various medical (epilepsy) and psychological (schizophrenia) dysfunctions, but the application of scientific scrutiny to déjà vu has been slow to evolve. This slow uptake of systematic empirical investigation is perhaps a result of déjà vu’s unfortunate association with things mysterious and unempirical, such as reincarnation and extrasensory perception (cf. Funkhouser, 1983). Another factor impeding research progress may be the rarity of the experience, which typically occurs only once or twice a year even among those most prone to it (young adults) (Brown, 2003).
But perhaps the most important hindrance to research on déjà vu is the lack of a clear eliciting stimulus. In culling through personal descriptions, it is nearly impossible to find a clear or consistent trigger for déjà vu. Nearly all published descriptions focus on the nature of the cognitive disruption, one’s personal reaction, or what one feels during the experience. The quote by Dickens at the start of this chapter is typical of published descriptions. Thus, it is a serious challenge to identify stimuli that could reliably elicit a déjà vu in the lab. Later in this chapter, we will describe ways in which current research has attempted to scientifically evaluate this phenomenon.

Rather than attempting to recreate a full-blown déjà vu experience, most research approaches this topic indirectly: how can we increase the probability of a false positive familiarity illusion? Simply put, déjà vu is a recognition failure, an involuntary false alarm. Under normal circumstances, we experience familiarity for objects and situations that we have encountered before, and unfamiliarity for those that we have not. With déjà vu, we have a sense of
strong positive familiarity for items that we know to be novel: “any subjectively inappropriate impression of familiarity of a present experience with an undefined past” (Neppe, 1983, p. 3).

Given the rarity of déjà vu, most information has been gathered retrospectively through surveys. Such data reveal that déjà vu is experienced by two-thirds (67%) of respondents, with the incidence highest among those in their late teens and 20s, and dropping off steadily with increasing age (Brown, 2003, 2004). Among experients (those who report ever having the experience), it is reported much less frequently as one ages. The experience happens more often among more educated, more liberal (politically/religiously), and more traveled individuals, and is unrelated to gender or race. Déjà vu is typically associated with an entire setting, rather than with specifiable elements (objects, people, or sounds). It also accompanies the preseizure aura in a small percentage of temporal lobe epileptics. Apart from specific temporal lobe pathology (seizure; tumor), déjà vu has not been clearly connected with any physical or psychological pathology.

The vague nature of the experience provides a fertile ground for theoretical speculation, with few clear constraints. Over 50 explanations have been proposed, the most viable of which are subsumed under three different categories: perceptual, memory, and physiological (cf. Brown, 2004). All can connect to theories and findings that have emerged in research on cognition and neuroscience. In fact, we are at a propitious point in the evolution of our research designs/tools, where we can begin to conduct more precise tests of such theoretical speculation. This chapter is intended primarily to summarize research findings on déjà vu published since previous summaries (Brown, 2003, 2004) and to give a sense of where the field is heading.
2. Perceptual Explanation

Usually referred to as perceptual gap or split perception, this explanation holds that a déjà vu may occur when a person processes the present sensory input twice, in rapid succession. The first input experience is brief, degraded, occluded, and/or occurs while the person is distracted. The second perception, immediately following, then seems strangely familiar because it connects to the immediately prior input (unbeknownst to us). As with each category of explanation, many variations exist that can be traced back over a century (Angell, 1908). This particular explanation is exceptional because it received formal attention from a pioneer of modern cognitive science:
. . . you are about to cross a crowded street, and you take a hasty glance in both directions to make sure of a safe passage. Now your eye is caught, for a moment, by the contents of a shop window; and you pause, though only for a moment, to survey the window before you actually cross the street . . . the preliminary glance up and down, that ordinarily connects with the crossing in a single attentive experience, is disjointed from the crossing; the look at the window, casual as it was, has been able to disrupt the associative tendencies. As you cross, then, you think “Why, I crossed this street just now”; your nervous system has severed two phases of a single experience, both of which are familiar, and the latter of which appears accordingly as a repetition of the earlier. (Titchener, 1928, pp. 187–188)
2.1. Jacoby and Whitehouse (1989)

Titchener’s quote was a focal point for the first scientifically rigorous test of a possible mechanism underlying déjà vu. Jacoby and Whitehouse (1989) modeled Titchener’s “hasty glance” through a brief visual exposure in a controlled laboratory setting. If this explanation is true, then a subthreshold glance at a word should create a heightened sense of familiarity for it when it is viewed in full, moments later. Jacoby and Whitehouse’s design involved two stages: first, an input list of words; second, an old/new recognition test. The recognition test was presented one word at a time. Each test word was preceded by a briefly flashed stimulus consisting of (a) the word itself (identical), (b) a word different from the test word (different), or (c) no word (none). The key finding was that when the prior glance involved the test word itself (identical), a new word was more likely to be misidentified as having occurred on the prior list, relative to new words in the different or none prime conditions. This finding was replicated both within the Jacoby and Whitehouse article, and in subsequent research (Bernstein & Welch, 1991; Gellatly, Banton, & Woods, 1995; Joordens & Merikle, 1992; Klinger, 2001).

This demonstration of a false positive familiarity illusion captured the imagination of many, as reflected in the large number of subsequent articles (over 200) that have cited the Jacoby and Whitehouse study. It captured our attention as well. Rather than forcing subjects to stay mentally within the confines of a laboratory in making familiarity assessments, we wanted to know whether a false positive familiarity illusion could be pushed much further back into one’s personal past, prior to the lab (Brown & Marsh, 2009). If so, this could move a step closer to modeling actual déjà vu experiences.
Thus, our goal was to capture some sense of the amorphous temporal quality that typifies déjà vu: “this experience has happened sometime before in my life, but I don’t know exactly when.” Brown, Porter, and Nix (1994) confirmed that subjects have
difficulty identifying just when the prior experience supposedly happened: survey respondents were evenly distributed on whether the illusory prior encounter happened days, weeks, months, or years ago.
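The three prime conditions of the Jacoby and Whitehouse (1989) recognition test can be made concrete with a small sketch. The Python below only illustrates the trial structure described in this section and is not the original procedure: the function name `build_test_trials`, the example words, and the simple rotation through conditions are assumptions introduced here, and details such as flash duration and counterbalancing are left out.

```python
import random

PRIME_CONDITIONS = ("identical", "different", "none")


def build_test_trials(test_words, studied_words, prime_pool, seed=0):
    """Pair each recognition-test word with a briefly flashed prime.

    Hypothetical helper: conditions are rotated in a fixed order,
    which is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    trials = []
    for index, word in enumerate(test_words):
        condition = PRIME_CONDITIONS[index % len(PRIME_CONDITIONS)]
        if condition == "identical":
            prime = word  # the test word itself is flashed first
        elif condition == "different":
            # any available word other than the test word
            prime = rng.choice([w for w in prime_pool if w != word])
        else:
            prime = None  # no word precedes the test word
        trials.append({
            "target": word,
            "prime": prime,
            "condition": condition,
            "is_old": word in studied_words,  # correct old/new answer
        })
    return trials


# Made-up items, used only to show the shape of the output.
trials = build_test_trials(
    test_words=["lamp", "river", "stone"],
    studied_words={"lamp"},
    prime_pool=["cloud", "fence"],
)
for trial in trials:
    print(trial)
```

In this representation, the false familiarity effect corresponds to more “old” responses on trials where `is_old` is false and `condition` is `identical` than on new-word trials in the other two conditions.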
2.2. Split Perception: Study 1

We attempted to increase the verisimilitude of Jacoby and Whitehouse’s design via two experimental design changes (Brown & Marsh, 2009). First, we eliminated the input list and used only a test list. This alteration would, we hoped, force our subjects to attribute any sense of enhanced familiarity to experiences prior to the current lab session: “have you had a pre-experimental encounter with this symbol?” The second change involved stimulus materials. Asking about a pre-experimental encounter rules out the use of words, because practically all words have been seen prior to the experiment. Instead, we gathered a collection of relatively unfamiliar line drawings, and had a pilot group of subjects rate the familiarity of these 300 black-and-white figures. Based upon these ratings, we sorted symbols into three sets: novel, low familiarity, and high familiarity. A sample of each type is shown in Figure 1.

Figure 1. Novel, low-familiarity, and high-familiarity symbols from Brown and Marsh (2009). [Panels: novel symbols; low-familiarity symbols; high-familiarity symbols.]
To recap, we followed the Jacoby and Whitehouse (1989) procedure of preceding each figure with a brief flash of (a) the same stimulus (identical), (b) a different stimulus (different), or (c) nothing (none). As is obvious from the examples in Figure 1, we expected that high-familiarity stimuli (e.g., a heart) would have been seen prior to the study, but included them so that all subjects could respond “yes” on some trials. However, we were uninterested in analyzing these high-familiarity stimuli because their ratings should be at ceiling, limiting the possibility of increasing judged familiarity.

Our primary finding replicated Jacoby and Whitehouse (1989): a brief glance at a figure just before judging its familiarity significantly increased the sense that it had been seen before. For novel figures, subjects were five times more likely to claim a pre-experimental encounter in the identical prime condition (15% rated as seen before) than in either the different (3%) or none (3%) prime conditions. The same significant effect also occurred with low-familiarity stimuli: an identical prime roughly doubled the probability of claiming a prestudy encounter (28%), compared to the different (16%) or none (13%) prime conditions.

Thus, we successfully created an illusion of a previous experience, by simply flashing the stimulus briefly just before its full presentation. This again confirmed Jacoby and Whitehouse’s finding that a “new” stimulus word (or symbol) can be misattributed as having been seen before. However, we showed that this effect can be induced for stimuli that the subject probably has never seen before (novel symbols), and demonstrated that this misattribution can extend to a time frame and place outside the laboratory.

Our intent was to test the split perception theory of déjà vu by pushing familiarity around, and we did not anticipate that our manipulation would be powerful enough to produce a full-blown déjà vu experience. Checking on this item by item would have been ill-advised from several perspectives. Not only would it have considerably slowed the procedure, but we were also concerned that it would create an expectation bias. But just to check on the possibility, we asked subjects after the procedure was over whether they had experienced a déjà vu at some point during the study. Surprisingly, 50% said that they had. There was no way to confirm that these experiences happened on an identical prime trial, rather than on a different or none prime trial. However, given that most of these same subjects (71%) reported that déjà vu occurred less frequently than once a month, this finding was intriguing.
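The “five times” and “roughly doubled” comparisons follow directly from the reported percentages, as the short calculation below shows. It is only a sketch using the Study 1 values quoted in this section; averaging the different and none conditions into a single baseline is an assumption made here for illustration, not an analysis reported by the authors.

```python
# Percentage of symbols claimed as seen before the experiment (Study 1).
claim_rates = {
    "novel": {"identical": 15, "different": 3, "none": 3},
    "low_familiarity": {"identical": 28, "different": 16, "none": 13},
}


def priming_ratio(rates):
    """Identical-prime rate divided by the mean of the two control rates.

    Pooling the two controls is an illustrative assumption.
    """
    baseline = (rates["different"] + rates["none"]) / 2
    return rates["identical"] / baseline


ratios = {kind: priming_ratio(rates) for kind, rates in claim_rates.items()}
for kind, ratio in ratios.items():
    print(f"{kind}: identical prime = {ratio:.2f} x control baseline")
```

For novel symbols the ratio is exactly 5; for low-familiarity symbols it is a little under 2, consistent with the wording in the text.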
2.3. Split Perception: Study 2

We conducted several follow-up investigations to Brown and Marsh (2009) that required a more complex evaluation of familiarity. Requiring that subjects assign any sense of increased familiarity only to pre-experimental
encounters may have been less sensitive to subtle changes in familiarity that might have occurred. What if the familiarity enhancement was modest and insufficiently intense for subjects to consider it as emanating from a prestudy exposure? In the first follow-up, we used the same design as Brown and Marsh (2009), except that subjects rated symbols on a more general familiarity scale of “have you ever encountered this design before?” (1 = definitely no; 6 = definitely yes).

Congruent with our published report, a brief exposure significantly increased familiarity ratings for both novel and low-familiarity symbols. For novel figures, mean familiarity for both the different (1.8) and none (2.1) prime conditions was significantly lower than that for the identical prime condition (4.3). As in Study 1, this effect replicated with low-familiarity symbols. Compared to the different (2.5) or none (2.7) prime conditions, a brief prior exposure to the symbol itself (identical prime) significantly increased rated familiarity (4.8).

To gather more detail on the familiarity attribution, on each trial where a symbol was rated as familiar (ratings 4, 5, and 6), subjects also assessed the familiarity source: (1) prior exposure during the study, (2) prestudy encounter, or (3) unsure. In the identical prime condition, subjects attributed their sense of familiarity most often to in-study exposure (81%) rather than to prestudy (13%) or unsure (6%). For the different and none prime conditions, the positive familiarity attributions were more evenly distributed between in-study (43%) and prestudy (41%), with a few unsure (16%) responses. This finding suggests that the familiarity enhancement generated by a quick glance will be primarily attributed to a recent within-experiment experience, if subjects are given this option.

Thus, our published report may actually underestimate the impact of our manipulation. If one has a strong feeling of having already seen this particular symbol, the predominant attribution may be to a recent exposure, earlier in the series of just-rated symbols. Perhaps subjects in Brown and Marsh (2009) were inclined to attribute the identical prime familiarity boost to a recent encounter within the study, and were thus less likely to attribute it to a prelab encounter. Such speculation aside, the most important finding in Study 2 is a replication of the familiarity enhancement found in Study 1. Interestingly, a postexperiment inquiry again revealed that about half (46%) of the subjects experienced déjà vu during the procedure.
2.4. Split Perception: Study 3

In Study 2, both the prime and the target symbol were presented foveally, in the center of the computer screen. We wondered whether the effect would change if the prime symbol was processed off to one side, in the parafoveal area. One explanation of déjà vu is that it results from an initial peripheral perception of one object while focusing on something
else near it. You drive to a restaurant for the first time, and as you approach the front door an unusual flowering plant beside the entrance captures your attention. When you then look directly at the distinctive doorway, you are struck by an unsettling feeling of familiarity. It is possible that the visual information (doorway) was briefly preprocessed in the parafoveal area while you were looking at the plant, and when this impression matched the subsequent fully processed view, a déjà vu resulted.

. . . it is very common for people to be in situations where there are many unattended stimuli outside their immediate focus of attention that are not consciously experienced . . . For this reason, the experimental conditions in studies in which unattended stimuli are presented at spatial locations removed from the current focus of attention more closely resemble the conditions under which visual stimuli are perceived in everyday situations. . . . (Merikle, Smilek, & Eastwood, 2001, p. 122)
Research on inattentional blindness lends credence to the potency of unattended, parafoveal stimuli (Mack, 2003; Mack & Rock, 1998). Participants perform a fairly simple visual task, such as judging which of the two arms of a briefly presented cross (vertical; horizontal) is longer. The procedure extends for many trials, and on a few of these trials another stimulus (a symbol, word, or letter) accompanies the cross, off to one side. Surprisingly, most participants fail to report seeing the additional item, although they show priming for this stimulus on a subsequent indirect memory test, indicating that it was processed without their awareness.

To evaluate the peripheral priming possibility for déjà vu, Study 3 modified the design used in Study 2. Rather than the identical or different symbol appearing in the same foveal location as the subsequently rated symbol, it appeared offset toward one of the four corners of the computer screen. The outcome essentially replicated Study 2. For novel symbols, the identical prime boosted the rated familiarity substantially (4.1) over the different (2.0) and none (1.8) prime conditions. For the low-familiarity symbols, symbols in the identical prime condition were again rated much more familiar (4.5) than those in either the different (2.4) or none (2.4) conditions.

Taken together, this series of studies supports the possibility that a perceptual double-take (i.e., a superficial glance followed by a close look) can elicit an exaggerated sense of familiarity for a stimulus. This enhancement occurred for both novel and low-familiarity symbols across three different studies. The boost in assessed familiarity was found with both an ambiguous (during vs. before experiment) source rating (Studies 2 and 3) and a pre-experiment source rating (Study 1; Brown & Marsh, 2009), with the latter finding more directly supporting the concept of déjà vu.
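To compare the size of the priming effect across the foveal (Study 2) and offset (Study 3) versions, the reported mean ratings can be reduced to a single difference score per symbol type. The sketch below uses only the means quoted in the text; the difference score itself (identical minus the average of the two control conditions) and the dictionary keys are assumptions introduced here for illustration.

```python
# Mean familiarity ratings (1 = definitely no, 6 = definitely yes).
mean_ratings = {
    "study_2_foveal": {
        "novel": {"identical": 4.3, "different": 1.8, "none": 2.1},
        "low_familiarity": {"identical": 4.8, "different": 2.5, "none": 2.7},
    },
    "study_3_offset": {
        "novel": {"identical": 4.1, "different": 2.0, "none": 1.8},
        "low_familiarity": {"identical": 4.5, "different": 2.4, "none": 2.4},
    },
}


def familiarity_boost(ratings):
    """Identical-prime mean minus the average of the two control means.

    This summary score is an illustrative assumption, not a reported statistic.
    """
    control = (ratings["different"] + ratings["none"]) / 2
    return round(ratings["identical"] - control, 2)


boosts = {
    study: {kind: familiarity_boost(r) for kind, r in kinds.items()}
    for study, kinds in mean_ratings.items()
}
for study, kinds in boosts.items():
    print(study, kinds)
```

On this score the boost is a little over two scale points in every cell, which is one way of seeing why the text describes Study 3 as essentially replicating Study 2.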
2.5. Superficial Glance = Shallow Processing?

Given that the split perception explanation seems viable, what mechanism(s) might underlie it? In other words, what forces a subjective temporal separation between these two adjacent perceptual experiences? One possibility is that the initial glance involves shallow processing, where only superficial physical attributes are extracted from the stimulus. And perhaps stimuli that are processed in a shallow manner seem older to us, when contrasted to deeply processed stimuli. We tested this by presenting a list of words, some of which were processed deeply (can you carry this?) and others shallowly (does it have an “e”?). After a short distractor task, subjects identified whether each word had appeared in the 1st, 2nd, or 3rd third of the input list.

There was a general bias to guess “middle third” for both shallow and deep words, probably reflecting a middle-of-the-road response default when unsure. However, there was a clear difference between deep and shallow items on whether subjects believed that they had been presented in the first third or last third of the list (Figure 2). Overall, 34% of deeply processed words were judged as more proximal (end of list; last 3rd), compared to 21% of shallow words. In contrast, distal (beginning of list; first 3rd) judgments related to level of processing in the opposite manner: 24% of deeply processed words seemed to have occurred earlier in the list (first 3rd) compared to 37% of the shallowly processed words. Remarkably, this bias remained consistent across items actually appearing in the 1st, 2nd, and 3rd thirds of the list.

Figure 2. Mean percentage of items in each list third judged as early (appearing in first third) or late (appearing in last third), for shallowly and deeply processed words.

This outcome can potentially clarify why the split perception experience leads to a sense of déjà vu. The initial, shallowly processed impression gets temporally pushed back (older), while the subsequent deep look gets pulled forward (recent). This contrast in the moment, with two impressions duplicated in immediate succession drifting in opposing temporal directions, may exaggerate the actual time separation, which then leads to a sense of déjà vu.
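The direction of the temporal bias can be summarized with a simple difference score, computed here from the overall percentages reported above. This is a sketch rather than the authors' analysis: the “net recency” score is an assumption introduced for illustration, and the middle-third percentage is inferred as the remainder, on the assumption that the three response options account for all judgments.

```python
# Overall percentage of words judged to come from the first or last third.
judgments = {
    "deep": {"early": 24, "late": 34},
    "shallow": {"early": 37, "late": 21},
}


def summarize(percentages):
    """Return the implied middle-third share and a net recency score.

    net_recency > 0 means items tend to feel recent; < 0 means they feel older.
    Both quantities are illustrative assumptions, not reported statistics.
    """
    return {
        "middle": 100 - percentages["early"] - percentages["late"],
        "net_recency": percentages["late"] - percentages["early"],
    }


summary = {level: summarize(p) for level, p in judgments.items()}
for level, values in summary.items():
    print(level, values)
```

Deeply processed words show a positive net recency score and shallowly processed words a negative one, while the implied middle-third share is the same for both, which fits the description of a shared default toward the middle combined with opposite drifts at the extremes.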
3. Implicit Memory Explanation

There are a number of different versions of the implicit memory interpretation of déjà vu. All are grounded in the assumption that a déjà vu occurs because some aspect of the current situation has actually been experienced before. When the present stimuli hook into previously stored memories that lack the temporal or contextual tags needed to consciously identify the source of “oldness,” a sense of familiarity is aroused whose origin cannot be explicitly identified. Several lines of research tie into this general explanation.
3.1. Episodic Experience

One of the most reasonable and straightforward interpretations for déjà vu is that a person actually has experienced this situation or setting before, but has simply forgotten it. Given the enormous amount of information that we process, it seems likely that there are stored memories of many different types of outdoor scenes, palaces, verbal phrases, plot themes, social situations, hotel lobbies, and melodies, many of which may have lost their explicit memory tag. When a current stimulus connects with one of the episodically disconnected and orphaned memories, this unbeknownst resurrection of the stored representation could yield a vague and unsettling sense of prior experience. Because the objective data that we sort through in the moment are insufficient to support this familiarity, we interpret it as a discomfiting memory illusion.

A marvelous commercial by Hotels.com (Deja View) (http://www.elsevierdirect.com/companions/9780123809063/Supplemental/material/1) illustrates this scenario. A couple enters a hotel room, and against a background of spooky music, the moderately distressed man says “I’ve been in this room before!” His nonchalant woman partner replies “What?” to which he emphatically repeats “I’ve been here before!” The woman quickly solves his quandary by reminding him that “You took the virtual tour on Hotels.com.” While this serves as a great relief to the man, it illustrates how
readily such information may become planted in our experiential memory at a shallow level, and then subsequently connected with the real situation that is playing out in front of us, causing momentary memorial distress.

3.1.1. Episodic Experience: Study 1

To model this possibility in the lab, we used our captive audience of undergraduate students to create a plausible memory dilemma (Brown & Marsh, 2008). Most college students visit numerous college campuses prior to their final selection, and we used this fact to help evoke a false sense of prior experience. Students signed up for a two-stage study. During the first, they saw a variety of different scenes: mountain ranges, courtyards, campus buildings, serene lakes, etc. Embedded in each was a small cross, and their task was to identify which quadrant of the picture this cross was located in. We pushed them along at a good clip, so that they would process the pictures in a relatively superficial manner. Mixed in among these pictures were some campus shots from a university that they were not attending. We verified, postexperimentally, that the “other” campus had not been visited, and excluded the handful of Duke students who had actually visited SMU and those SMU students who had toured Duke.

Our main objective was to plant unfamiliar campus images in the students’ memories, in a way that could subsequently evoke a false impression of an actual prior visit. To model déjà vu, it was important to ask not simply if the scene was familiar, but if the student had actually been to the location depicted in the photo. Both mundane and unique scenes from both campuses were included, because anecdotal reports suggest that déjà vu can occur in both ordinary circumstances (hanging out with friends, relaxing, watching TV) as well as unusual settings (Brown, 2004). This difference is illustrated by these two open-ended survey responses:

I was sitting in this guy’s apartment talking about something and I got little flashes like I had been there talking about the same thing and I know it never happened before.

I was going to a rock concert in downtown Fort Worth. When we got to the parking lot, I looked up and noticed all the buildings around me. At that moment, I felt as if I had experienced that exact same scene before, although I had never been to downtown Fort Worth.
Examples of these unique (chapel; famous monument) and mundane (dorms; academic classrooms) campus settings are shown in Figure 3. Presentation frequency (once or twice) was varied during the initial cross-detection phase. This manipulation did not have a theoretical underpinning, but was included to see if memory strength might influence false visit attributions.

Figure 3. Examples of unique and mundane campus locations used in Brown and Marsh (2008). [Panels: unique locations; mundane locations.]

After completing the rapid cross-detection task, subjects returned one week later for session 2, during which they viewed scenes from their home campus and the unfamiliar campus. Home-campus shots did not appear in session 1, but were added at session 2 to assure that each subject could respond that they had actually visited some of the locations. Each photo was shown briefly (half a second) to limit analytical processing, and subjects were instructed to respond quickly based on first impression. After each photo was presented and removed, subjects evaluated whether they had actually been at that particular location using a four-point scale: no, might, probably, definitely.

Visit ratings for the critical (away) campus shots were significantly higher for those exposed before, in session 1, compared to those that had not. However, there was no difference between scenes viewed once versus twice in session 1. As expected, mundane shots were given higher visitation ratings than unique shots, because there were fewer clues available to discount a possible visit. But the boost in visit ratings from prior exposure was consistent across unique and mundane scenes.

3.1.2. Episodic Experience: Study 2

These results were essentially replicated in a second study (Brown & Marsh, 2008, Experiment 2). Presentation frequency was again manipulated (one or two exposures in session 1), in addition to retention interval between
Alan S. Brown and Elizabeth J. Marsh
sessions 1 and 2: one versus three weeks. Prior exposure in session 1 again boosted subsequent personal visit assessments, and presentation frequency and retention interval had no effect on the degree of this enhancement. Similar to the split perception studies described earlier in this chapter (Sections 2.2–2.4) (Brown & Marsh, 2009), a postprocedural interview revealed that nearly half of the subjects admitted having a déjà vu experience sometime during the procedure: 46% in Experiment 1; 49% in Experiment 2. As explained earlier, we cannot determine which specific item(s) elicited déjà vu, as this would require an item-by-item query during the procedure. However, their general responses provide encouragement that this paradigm may model real-life déjà vu experiences. More specifically, déjà vu could occur when the present scene or setting duplicates one experienced before in the form of a magazine, movie, PowerPoint presentation, website, or newspaper.
3.2. Single-Element Familiarity Explanation
In the above experiments with campus scenes, we explored the possibility that déjà vu could stem from having seen an entire scene before. But another implicit-memory possibility is that déjà vu could be triggered when a small part of a scene is familiar. Imagine walking into a friend’s living room for the first time and being struck by a feeling of eerie familiarity. It is only later that you realize this familiarity stems from a lamp on her end table that is identical to one in the basement recreation room of your best friend during high school. The source of this intense familiarity (triggered by that single element) is not immediately identified and over-generalizes to the entire scene. Consider another, related example: you walk across campus when two people approach you, talking with each other. You definitely recognize the person on the left, but then feel like you must know the person with them but cannot figure out from where. Does your familiarity for person A affect your sense of familiarity for person B? Both of these examples involve the possible spill-over of familiarity from one element, whether it affects the familiarity of an entire scene (example 1) or another element (example 2).
3.2.1. Single-Element Familiarity: Study 1
We began with a laboratory investigation of the second example by asking whether the familiarity of one single element can “bleed over” and influence the familiarity evaluation of a second item (Brown & Marsh, 2007; Marsh & Brown, 2010). Would low-familiarity symbols, selected from our symbol pool from the split-perception studies (Brown & Marsh, 2009), increase or decrease in rated familiarity, depending on the familiarity level of the symbol that was shown with them? More specifically, could we bias subjects to give a higher rating if a high-familiarity symbol accompanied the target, and would subjects reduce a target symbol’s rated familiarity
if accompanied by a novel symbol? Two factors were manipulated: how long the target appeared on the screen (100 vs. 1000 ms), and whether the target appeared (a) alone, (b) with a novel symbol, or (c) with a high-familiarity symbol. The procedure for the first experiment is summarized in Figure 4. Subjects were told “your job is to decide how familiar the target symbol is. In other words, you are to judge how well you are acquainted with the target symbol in everyday life.” On both two-symbol and one-symbol trials, the judgment was made after the symbol(s) disappeared and a question mark appeared in the location of the to-be-rated symbol. In sum, a ready prompt was followed by the symbol(s), which were then briefly masked and replaced by a question mark indicating the target symbol. On the scale of 0 (very unfamiliar) to 5 (very familiar), mean performance on filler trials indicated that subjects were using the scale properly: novel = 0.80; high familiarity = 4.23. More importantly, test context mattered. Mean judged familiarity for a low-familiarity symbol was lower when accompanied by a novel (1.55) compared to a high-familiarity (2.10) symbol, and intermediate when presented alone (1.81). This effect did not depend upon symbol presentation time.
3.2.2. Single-Element Familiarity: Study 2
Study 1 required subjects to remember which symbol had been presented where. A question mark appeared in the location where the target had been, but the symbol itself was not in view for the judgment. Thus, subjects may have occasionally judged the wrong symbol because they had forgotten where it had been shown. To address this, the second experiment modified the procedure (see Figure 4): following a “ready” prompt, the symbol(s) appeared for 2 s
Figure 4 Experimental procedure used in single-element familiarity studies. Panel labels: Study 1 (Ready?; 2000 ms; 500 ms; 100 or 1000 ms; 500 ms; ?; Judgment) and Study 2 (Ready?; 2000 ms; 2000 ms; Judgment).
before a box appeared around the target. Replicating the first experiment, a low-familiarity symbol was judged to be more familiar (2.16) if paired with a high-familiarity symbol than if alone (1.97). However, the novel symbol no longer influenced familiarity judgments: the target (low-familiarity) symbol was equally familiar when tested alone (1.97) or with a novel symbol (2.00). This outcome suggests that familiarity is easier to enhance than to decrement, so in the remaining experiments in this series we focused on whether high-familiarity neighbors could pull target familiarity up. Our use of both more and less familiar neighbors in Study 1 was mainly for academic curiosity, to see whether symmetrical effects exist. Déjà vu relates primarily to increasing familiarity through a familiar accompanying element. Decrementing familiarity ties in with jamais vu, a lesser known phenomenon which is related to déjà vu and described later in this chapter (Section 6.4). However, as in real life, our jamais vu model appears to be less reliable than déjà vu.
3.2.3. Single-Element Familiarity: Study 3
One simple, but reasonable, alternative explanation for the boost in familiarity rating described above is that familiarity increases when two symbols are shown on the screen compared to one (control condition), and not because of the presence of a high-familiarity neighbor. To address this, we compared the effects of a high-familiarity neighbor symbol with a low-familiarity neighbor. A low-familiarity symbol accompanied by another low-familiarity symbol received a similar familiarity rating (1.81) to when it appeared alone (1.82). In contrast, pairing a low- with a high-familiarity symbol (M = 1.94) increased its perceived familiarity. Thus the earlier effects were not simply due to seeing two symbols at one time. Rather, a more familiar neighboring symbol increases perceived familiarity of a less familiar target.
3.2.4. Single-Element Familiarity: Study 4
We also tested a perceptual explanation of the familiarity effect. Perhaps the high-familiarity symbol changed the interpretation of the target symbol. For example, does a random squiggly symbol look more like a nameable object when paired with a familiar handicap symbol? To test this, we changed our dependent measure from rating familiarity to identifying the meaning of the symbol. Participants were told that we were interested in their ability to identify symbols, and that some would be very easy to identify and for others they would have no idea of the meaning. They were warned against guessing, and instructed to type “I don’t know” if they did not know the meaning of a symbol. The same procedure was used, with only the evaluation measure changed. Following a “ready” prompt, the symbol(s) appeared for 2 s. Then, a text box appeared and subjects answered the question “What does the target drawing mean to you?” We scored the data in two ways. First, we computed the proportion of symbols that subjects could label,
regardless of the nature of the label. Second, in the pair condition, we examined whether the label given to a target symbol was related to the meaning of the accompanying high-familiarity symbol. Overall, subjects were good at identifying the meaning of high-familiarity filler symbols, being correct 87% of the time. As expected, they were much less likely to ascribe meanings to low-familiarity targets, labeling just 33%. This also indicates that they followed the instruction not to guess or make up meanings. Critically, seeing meaning in low-familiarity target symbols was not influenced by pairing with a high-familiarity symbol. Subjects generated interpretations for 32% of alone targets and 33% of paired targets. Furthermore, when subjects did assign meaning to the target, it was rarely related to the high-familiarity neighbor (3%). These results suggest that the effects of the high-familiarity flanker were not due to influencing the interpretation of the target symbol. In short, having memory for part of a scene (the high-familiarity symbol, in our paradigm) can influence one’s feeling of familiarity for other elements of the scene. Less clear is how much this is under conscious control. If subjects are told not to let the familiarity of one object affect their judgment of another, can they avoid its influence? We are currently collecting these data, and our hunch is that subjects will be unable to control the influence of the familiar symbol, in the same way that people are unable to avoid attributing their emotions from one stimulus to another neutral one (Payne, Cheng, Govorun, & Stewart, 2005).
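The two scoring measures described for Study 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the condition labels "alone" and "paired", and the hand-coded related_to_neighbor flag are hypothetical and are not taken from the original scoring materials.

```python
from dataclasses import dataclass


@dataclass
class Response:
    condition: str                     # "alone" or "paired" (hypothetical labels)
    text: str                          # what the subject typed in the text box
    related_to_neighbor: bool = False  # hypothetical flag assigned by a human rater


def is_labeled(response):
    """True if the subject gave any label rather than "I don't know" or a blank."""
    answer = response.text.strip().lower().replace("’", "'")
    return answer not in {"", "i don't know", "i dont know"}


def labeling_rate(responses, condition):
    """Measure 1: proportion of targets labeled, regardless of the label given."""
    subset = [r for r in responses if r.condition == condition]
    if not subset:
        return float("nan")
    return sum(is_labeled(r) for r in subset) / len(subset)


def related_rate(responses):
    """Measure 2: among labeled paired targets, proportion related to the neighbor."""
    labeled = [r for r in responses if r.condition == "paired" and is_labeled(r)]
    if not labeled:
        return float("nan")
    return sum(r.related_to_neighbor for r in labeled) / len(labeled)
```

Comparing labeling_rate(data, "alone") with labeling_rate(data, "paired") corresponds to the 32% versus 33% comparison reported above, and related_rate(data) corresponds to the 3% figure.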
3.3. Gestalt Familiarity Explanation
In addition to seeing an entire scene before (episodic experience) or a piece of a scene (single-element familiarity), another type of implicit-memory explanation for the déjà vu experience is that the general framework of the current circumstance or setting resembles one experienced before. Assume that you are a college student making a trip to a new campus to see a high school buddy. During the drive through the main drag on campus, you are struck by an eerie sense of having been here before. What may be familiar is a general layout: a central quadrangle, with a white chapel on the left, a fountain in the middle, and two brick classroom buildings on the right. Although no specific feature is identical to one with which you are familiar, the general layout follows a well-etched mental template. As with other déjà vu interpretations, this one also reaches back over a century (Sander, 1874), and Dashiell (1937) includes a great street-scene visual illustration of how this could work (cf. Brown, 2004).
3.3.1. Familiarity without Identification Research
Cleary, Ryals, and Nomi (2009) designed a clever study to evaluate this gestalt model of déjà vu. But before describing this study, some background on Cleary’s (2004, 2008) research would help. In her study of familiarity-based recognition, or recognition without identification (Cleary, 2008), a general sense of familiarity appears to guide recognition decisions, even when we do not have access to the specific prior experience which elicits this feeling. To illustrate this, if subjects first study a list of celebrity names, and then provide celebrity names to face cues, subjects can discriminate between the celebrity names which did, versus did not, appear in the initial name list, even when they cannot produce the celebrity’s name in phase two (Cleary & Specker, 2007). It is as if the familiarity spread from the person’s name to their face, so that it received implicit activation. This activation was sufficient to support the recognition that it was connected with a prior experience, but insufficient to facilitate name retrieval. Recognition without identification also has been demonstrated with famous scenes. Similar to Cleary and Specker (2007), Cleary and Reyes (2009) had subjects first study names of famous landmarks and locations (Stonehenge, Taj Mahal), and then provide the names for pictures of such places. Among pictures that remained unnamed, subjects could discriminate those whose name had, versus had not, appeared on the prior list. This again illustrates that a sense of prior experience can be triggered by a face or edifice cue, even when the prior experience and specific studied name cannot be recalled.
3.3.2. Gestalt Familiarity Study
Cleary et al. (2009) constructed a direct test of the gestalt theory of déjà vu, using Cleary’s recognition without identification paradigm. Black-and-white line drawing stimuli depicting various scenes were constructed in pairs, resembling each other in overall configuration. A sample configural pair in Figure 5 depicts an arbor (left) and a castle drawbridge (right).
Figure 5 Configurally familiar scene pair from Cleary et al. (2009).
Subjects were asked to remember each study scene and the accompanying verbal description of it (arbor). At test, none of the original scenes were shown. Rather, half of the test scenes (castle drawbridge) configurally resembled one of the studied scenes and half did not. The configurally similar scene served as the memory cue, and subjects attempted to identify (provide the label for) the input list picture that it resembled. As before, when subjects were unable to recall a corresponding input scene they still showed evidence of recognition without identification. Familiarity ratings were higher for test scenes that resembled input scenes, compared to those that did not. After each familiarity decision, subjects were asked if they had experienced déjà vu, and these reports mirrored familiarity assessments: déjà vu occurred more often for test scenes resembling input scenes, compared to those with no resemblance. Given that these two ratings were always done in the same order (familiarity, then déjà vu), the familiarity rating may have biased the déjà vu rating. In Experiment 2a, Cleary et al. (2009) had subjects report only déjà vu experiences (no familiarity rating). As before, déjà vu was more likely with configurally related test scenes, compared to unrelated ones. Cleary et al. (2009) argue that their findings suggest that a single process underlies both déjà vu and familiarity. They base this speculation on two lines of evidence. First, configural resemblance produces similar effects for both déjà vu and familiarity. Second, a questionnaire study revealed that 79% of respondents define déjà vu as logical familiarity: re-experiencing something old that you know is old. Only 7% defined déjà vu as illogical familiarity: something new that feels old. This survey outcome should serve as a general caution about assuming that subjects doing déjà vu ratings actually understand the accurate or technical definition of the term.
3.4. Hypnosis
Banister and Zangwill (1941a, 1941b) attempted to elicit déjà vu experiences in the laboratory, to model the implicit memory explanation that déjà vu occurs because this particular experience has happened before but has been forgotten (Brown & Marsh, 2008). They presented pictures (Banister & Zangwill, 1941a) or odors (Banister & Zangwill, 1941b) to hypnotized subjects, followed by a posthypnotic suggestion to forget the encounter. One day later, in a normal waking state, subjects were tested about their recollection (and familiarity) for these same pictures or odors. While this approach holds promise, serious problems exist with this particular application (Brown, 2004). Recently, O’Connor, Barnier, and Cox (2008) conducted an investigation improving on this hypnosis design, using a unique puzzle task as the memory target. All subjects attempted to solve the puzzle while hypnotized. Some were given the posthypnotic suggestion to be amnesic about the puzzle, while others were told that the puzzle would later feel familiar. Later, during a nonhypnotized session, five of six
subjects in the familiarity group experienced a strong sense of déjà vu when encountering this puzzle, whereas none of six subjects in the amnesia group felt strong déjà vu. This study raises the tantalizing possibility that the sense of déjà vu can be recreated in a laboratory setting with the right parameters and procedures (hypnotic suggestion). Cleary (2008; Cleary et al., 2009) has echoed O’Connor et al.’s (2008) optimism, suggesting that given how much familiarity without recollection resembles déjà vu, we may eventually be able to reliably elicit déjà vu using laboratory manipulations which are proven to successfully affect familiarity ratings.
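To give a rough sense of how strong a five-of-six versus none-of-six split is with groups this small, the counts can be run through Fisher's exact test. The sketch below is our own illustration and is not an analysis reported in the source; the choice of test and the function name are assumptions, and only the four cell counts come from the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d

    def prob(k):
        # Probability that the top-left cell equals k, given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_observed = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


# Rows: familiarity group, amnesia group.
# Columns: strong deja vu, no strong deja vu.
p_value = fisher_exact_two_sided(5, 1, 0, 6)
```

With these counts the two-sided value is 14/924, or roughly 0.015, which is consistent with treating the group difference as unlikely to be a chance split, although twelve subjects is a very small sample.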
4. Physiological Explanation
Turning to the third class of explanations, one of the earliest interpretations of déjà vu is that it reflects an alteration in the normal brain functions that utilize multiple pathways of information transmission. Osborn (1884) speculated that the sensory signals transmitted from the eyes to the occipital area separate and follow different tracks to the right and left hemispheres. This information then merges together at the occipital lobe to produce one unified perceptual impression. On occasion, the messages become slightly asynchronous, producing a sensation of déjà vu. The slight temporal delay in one track results in two visual impressions rather than one as they arrive successively (rather than together) at their destination. The trailing sensation seems to be a duplication of the first. These transmissions become slightly dysphasic due to a neurological event, such as a slight synaptic deficiency at some point on one of the two pathways. The brain misinterprets this slight separation as reflecting temporally distinct experiences, and the logical interpretation is that the present experience duplicates one from an earlier time and place (Brown, 2004).
4.1. Neural Transmission Asynchrony
Current technology allows an experimental test of this pathway asynchrony. Bogdan Kostic at Colorado State University used brief visual presentations of a common stimulus (words; faces), sent separately to both the right and left hemispheres. An asynchronous presentation of an identical image to both the right (left visual field) and left (right visual field) hemispheres, offset slightly (20 ms apart), should result in an enhanced sense of familiarity. Kostic did find partial support for such familiarity enhancement with presentation asynchrony, but the results were not straightforward. A word presented in the right before the left visual field was judged to be significantly more familiar than
the reverse (left before right). Simultaneous presentation resulted in a familiarity rating intermediate between the two asynchronous conditions. Kostic speculates that the right-first asynchrony enhances familiarity, relative to left-first, due to the left hemisphere advantage in language processing. If this explanation is true, then nonverbal stimuli should result in left-first familiarity enhancement, compared to a right-first presentation. Unfortunately, face stimuli did not result in a left-first advantage, with no familiarity rating difference between asynchronous and simultaneous presentation. These findings are very intriguing, but Kostic points out that the length of the delay between presentations that he used (20 ms) may be too long, and that endogenous delays in the nervous system that produce this outcome may be much shorter.
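Because each visual field projects to the opposite hemisphere, the three presentation conditions are easy to mix up. The sketch below lays them out as onset times; it is an assumption-laden illustration, not Kostic's actual procedure. The condition names, field names, and functions are hypothetical, and only the 20 ms offset and the field-to-hemisphere mapping come from the text.

```python
from dataclasses import dataclass

SOA_MS = 20  # offset between the two presentations, as reported in the text


@dataclass(frozen=True)
class Condition:
    name: str
    lvf_onset_ms: int  # left visual field, which projects to the right hemisphere
    rvf_onset_ms: int  # right visual field, which projects to the left hemisphere


def make_conditions(soa_ms=SOA_MS):
    """Build the three presentation conditions described in the text."""
    return [
        Condition("right-first", lvf_onset_ms=soa_ms, rvf_onset_ms=0),
        Condition("simultaneous", lvf_onset_ms=0, rvf_onset_ms=0),
        Condition("left-first", lvf_onset_ms=0, rvf_onset_ms=soa_ms),
    ]


def leading_hemisphere(condition):
    """Return which hemisphere receives the stimulus first in a condition."""
    if condition.lvf_onset_ms == condition.rvf_onset_ms:
        return "both"
    return "right" if condition.lvf_onset_ms < condition.rvf_onset_ms else "left"
```

Under this layout the right-first condition, which produced the higher familiarity ratings for words, is the one in which the left hemisphere receives the stimulus first. That matches the language-processing account Kostic offers.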
4.2. Surgical Elimination of Déjà Vu
The earliest scientific research on déjà vu was based on the assumption that it indicates brain pathology, that is, that seizure activity currently exists or is likely to develop. This speculation originated from the observation that some individuals with temporal lobe epilepsy (TLE) experience déjà vu in their preseizure aura (Brown, 2003, 2004), but the accumulated data do not support a stronger conclusion of brain pathology. Despite this erroneous early assumption, research on TLEs has continued to provide useful evidence about the nature of recognition processes involved with false familiarity. Bowles, Crupi, Mirsattari, Pigott, Parrent, et al. (2007) describe a young woman who developed TLE in her preteen years, and her preseizure auras routinely included déjà vu. These seizures could not be managed by medication, and surgical correction was required. The surgery removed a brain tumor and surrounding tissue, which included the amygdala, entorhinal cortex, and perirhinal cortex. Both her seizures and déjà vu experiences were eliminated. But an interesting result of surgery is that her ability to assess familiarity was eliminated, while recollection was preserved. Using experimental tests involving list learning procedures with the remember/know task (Gardiner, Ramponi, & Richardson-Klavehn, 1998), the patient performed better than a control group on recollection (do you recall the item’s presentation?) while showing a pathological absence of familiarity (does the item seem familiar?). This was confirmed across four cognitive tasks using a variety of different encoding and response manipulations. The clear implication of Bowles et al. (2007) is that déjà vu is associated with a separate cognitive system that governs familiarity, apart from brain structures involved with contextually guided recognition evaluations.
4.3. Surgical Elicitation of Déjà Vu
A second study dovetails nicely with Bowles et al. (2007). Prior to surgically removing tissue in epileptics, surgeons often implant depth electrodes in various areas of the brain that appear to be the origin sites of seizure activity.
These electrodes can both stimulate and record electrical activity. While procedural sophistication has evolved over recent years, the accumulated findings have not provided a reasonably precise or replicable picture concerning where déjà vu experiences may originate (Brown, 2004). While déjà vu can be created through stimulation of electrodes planted in and around the temporal area, inconsistent results and procedural problems (e.g., spread of stimulation) cloud these findings. A recent study is notable for the reliability with which it was able to elicit déjà vu in TLEs. Bartolomei, Barbeau, Gavaret, Guye, McGonigal, et al. (2004) found that déjà vu experiences could be triggered via stimulation of the rhinal cortex in seven (of 24) patients, and that repeated stimulation produced the same déjà vu response. Replicable electrical elicitation of déjà vu was a first, and they were also able to differentiate between the perirhinal and entorhinal cortices. Recall that Bowles et al. (2007) (above) discovered that removal of both perirhinal and entorhinal cortices eliminated déjà vu (and familiarity) in their patient. Bartolomei et al. were able to differentiate between these two structures by finding that the entorhinal cortex is the key: 3% of perirhinal stimulations resulted in déjà vu, whereas 17% of entorhinal stimulations elicited déjà vu. A second investigation implicates other areas that may be involved in déjà vu, or at least capable of creating the sensation through indirect pathways via spillover activation. Kovacs, Auer, Balas, Zambo, Klivenyi, et al. (2009) present a case study where déjà vu was repeatedly elicited through stimulating the globus pallidus. Remarkably, this woman had never previously experienced déjà vu. However, there are several qualifications on this report. Déjà vu only occurred with a relatively high level of electrical stimulation, raising the possibility that the experience resulted from indirect activation of neighboring brain regions. Furthermore, the illusions only happened with her eyes open, and were reported only in response to a direct query. She did not volunteer reports of déjà vu, and only acknowledged it if asked. Data from this particular patient must also be qualified by an early brain injury that altered her normal hemispheric language lateralization. Thus, this patient provides additional evidence that déjà vu can be reliably elicited through stimulation of a single brain location, but the specific role of the globus pallidus needs further verification.
5. Reports in Anomalous Individuals
5.1. Blindness
Déjà vu research has primarily emphasized the visual dimension in anecdotal reports, theoretical speculation, and empirical demonstrations (Brown, 2004; Brown et al., 1994; Neppe, 1983). However, many reports involve an auditory component, particularly where a conversation seems eerily familiar:
. . . an impression that we have previously been in the place where we are at the moment, or a conviction that we have previously said the words we are now saying, while as a matter of fact we know that we cannot possibly have been in a given situation, nor have spoken the words. Angell (1908, p. 235)
The visual bias in déjà vu may stem from the fact that most cognitive research involves visual rather than auditory processing, thus naturally pulling theoretical speculation in this direction. With this context in mind, O’Connor and Moulin (2008) document déjà vu in a male who has been blind since birth, and who reports that “hearing and touch and smell often seem to intermingle in the déjà vu experiences” (p. 247). It would be very useful for our understanding of déjà vu if our current theoretical interpretations could be applied to, or tested in, other sensory modes. For example, could the split-perception paradigm that has been successful with visual materials (Brown & Marsh, 2009) extend to auditory identity priming? Would a brief and barely audible (at threshold) presentation of a word, just prior to a clear presentation, result in enhanced familiarity? Perhaps the single-element familiarity (Brown & Marsh, 2007) research with visual symbols could be modeled by presenting an auditory fragment (“bah”) preceding the full spoken version (“bottle”). And full spoken phrases, sentences, or short paragraphs might be a viable extension of the visual implicit memory demonstration of déjà vu (Brown & Marsh, 2008).
5.2. Chronic Déjà Vu
Two recent case studies report chronic déjà vu in four individuals who experience the sensation on essentially a daily basis. Given that déjà vu happens only a few times a year even in those most prone (Brown, 2004), a daily rate is extraordinary. This is even more exceptional because all persons in these reports are middle aged or older, an age range where déjà vu experiences are rare. In one report, O’Connor and Moulin (2008) document a 39-year-old TLE patient who experiences déjà vu up to three times per day, always associated with the preseizure aura. This annoyance motivated the patient to try active strategies to terminate the sensation: turning his attention to something else, or looking away from what he judged to be the eliciting visual stimulus. These efforts were to no avail, as déjà vu “follows my line of vision and hearing” (p. 145). O’Connor and Moulin (2008) suggest that this argues against a data-driven (bottom-up) etiology of déjà vu. They reason that if déjà vu were caused by visual sensations, then altering such stimulation should end déjà vu. Although this is a reasonable position, evidence against an external perceptual trigger does not prove that déjà vu can never occur through this route, only that an external stimulus is not the exclusive trigger for déjà vu. A second report of chronic déjà vu describes three elderly subjects, all 65 or older (Thompson, Moulin, Conway, & Jones, 2004), whose frequent
déjà vu experiences made everyday living problematic. They discontinued routine daily activities like watching TV, reading the newspaper, or listening to the radio because it felt as if they had seen or heard these before. Similar to O’Connor and Moulin (2008), Thompson et al. suggest that such cases demonstrate that déjà vu is a central nervous system dysfunction, unrelated to specific external perceptual triggers. Incidentally, each of these subjects had some brain pathology (atrophy; hemorrhage), and it is unclear how this might relate to the chronicity of the memory illusion. As a segue to the following section, Thompson et al. propose that their clinical observations suggest that déjà vu increases as one ages, a position counter to a large body of evidence (Brown, 2003, 2004). They further suggest that the prevalence of déjà vu is underreported because it gets lost in the higher incidence of many other more serious memory problems that pop up as one ages.
6. Continuing Issues
6.1. Aging
One of the biggest empirical puzzles about déjà vu is its decline with age. This systematic decrease is reflected in the percentage of individuals who admit to ever having a déjà vu experience (Chapman & Mensh, 1951), and for individuals who do have déjà vu the incidence of the experience declines across their life (Brown, 2003, 2004). Superficially, these findings appear contrary to general findings regarding aging and memory. Familiarity assessment seems to remain relatively stable with age, whereas the capacity to recollect specific temporal and contextual details about experiences decreases (Mantyla, 1993). Déjà vu represents a strong sense of subjective familiarity in the absence of any objective evidence, and these two functions should show greater divergence as we grow older. Thus, déjà vu should increase rather than decrease with age. What are some possible reasons for the observed decline? It may be a measurement issue, involving age-related changes in recall (déjà vu is more apt to be forgotten), response bias (older adults are more reluctant to admit to déjà vu), or cohort (older adults are less aware of the concept) (Brown, 2004). However, it is also possible that older adults learn to rely more on familiarity than recollection, given that the former memory function is more stable. Thus, they are more likely to dismiss a discrepancy between familiarity and the absence of recollection (Cleary, 2008). Also, older adults may be less attentive to details of their surroundings that could possibly trigger déjà vu, and they may also visit fewer places on a regular basis (and thus experience fewer possible triggers). Finally, in the face of an overall increase in memory difficulties, subtle issues like déjà vu may not be as noticeable. Incidentally, Thompson et al. (2004)
propose that déjà vécu, a variant of déjà vu where the present experience seems to have been lived through before, increases with age. They base this upon their impression of older adults who come to their memory clinic, and further propose that the experience is underreported by older adults (see above).
6.2. Dreams
Following the appearance of an article on déjà vu research in a major national newspaper, over 500 e-mails poured in. Most were diligently answered, even though the senders’ desire for a definitive explanation could not be satisfied. The most curious dimension of these reactions from the general public is that most felt that the “prior experience” had occurred in a dream. Survey data show that one in five college students agree with this dream-origin interpretation. This dream impression needs to be logically explained, in order to remove déjà vu from the realm of the occult (cf. Brown, 2004). Our best hunch is that the surreal impression created by déjà vu fits with the cognitive texture of a dream, rather than a real experience, and finding ways to specify this more empirically would be helpful in the development of research on déjà vu.
6.3. Single versus Multiple Causes
Several published reports openly challenge the notion that déjà vu is initiated by external stimuli, and suggest that it is only triggered by a biological dysfunction (O’Connor & Moulin, 2006, 2008; Thompson et al., 2004). All cognitive interpretations discussed earlier (split perception, implicit memory, single-element, gestalt) are predicated on the assumption that déjà vu is initiated by an external perceptual experience. The difference is whether that stimulus connects with itself from a few moments ago (split perception), the same scene experienced weeks or years ago (implicit memory), a piece of a prior real experience (single-element), or a familiar format (gestalt). The alternative is that déjà vu is all in the brain. Support for this alternative position is drawn from individuals where the déjà vu experience: (a) occurs with extraordinary frequency, (b) is not tied to the physical setting, and (c) cannot be ended or altered by willfully changing the perceptual input (O’Connor & Moulin, 2008; Thompson et al., 2004). We believe that there are multiple possible causes for déjà vu. Just as a stomach ache can have different causes (e.g., overconsumption, flu, food poisoning, medications, stress), the same is true of déjà vu (Brown, 2003, 2004). If a déjà vu experience can be identified as a likely result of one possible mechanism, this does not necessarily rule out others (cf. Cleary et al., 2009). Similarly, forgetting where you put your car keys could be
Alan S. Brown and Elizabeth J. Marsh
traced to biological (fatigue, stress, low blood sugar) as well as psychological (distraction, multitasking) circumstances. Proving one cause for a particular incident does not rule out other possibilities. There is a considerable amount of accumulating evidence supporting déjà vu as caused through data-driven procedures: split perception (Bernstein & Welch, 1991; Brown & Marsh, 2009; Jacoby & Whitehouse, 1989), implicit memory (Brown & Marsh, 2008), single-element familiarity (Brown & Marsh, 2007), and gestalt resemblance (Cleary et al., 2009). No theory of déjà vu should be eliminated as a precondition of accepting another. We are in an early phase in the exploration of this experience, and different interpretations can provide a rich source of ideas that may yield important findings about cognitive phenomena apart from déjà vu.
6.4. Jamais Vu
Normally, we experience a perfect alignment between objective and subjective recognition: things that we know feel familiar and settings/people that have not been experienced feel unfamiliar. Déjà vu is a mismatch between the two, with positive subjective recognition in the face of negative objective recognition. Jamais vu is the opposite: negative subjective recognition contrasted with positive objective recognition. For example, you walk into the dining room in the home that you grew up in, and it appears momentarily unfamiliar as if you are seeing it for the first time. Jamais vu is much rarer than déjà vu, and research on the subject is scant with only a few published reports on its nature or incidence (cf. Brown, 2004). Jamais vu was briefly noted in Jacoby and Whitehouse (1989) but their speculation did not get traction in subsequent research. The most captivating aspect of their study (discussed earlier) was that a brief presentation of a prime identical to the immediately succeeding target word enhanced its perceived familiarity. Another finding, however, caught the attention of Jacoby and Whitehouse. In their different prime condition, where the preceding prime word differed from the target, the likelihood of a false alarm decreased relative to the control (no word) condition. Jacoby and Whitehouse suggest that:
the processing of a test word is disrupted when its presentation is preceded by a nonmatching context word, and this reduction in fluency gives rise to a lack of familiarity, a feeling of strangeness. (p. 134)
Although found in their Experiment 1, this difference disappeared in Experiment 2. Nonetheless, if fluency enhancement can artificially enhance false positive recognition, why should not the opposite happen? This would provide a tidy symmetry to déjà and jamais vu, but such does not seem to be the case.
Recent Research on Déjà Vu
The lifetime incidence of jamais vu is much lower than that of déjà vu among college students. Whereas it is difficult to find an undergraduate who has not experienced déjà vu, barely a third of students admit to having experienced jamais vu (Brown, 2004). However, there is a more common experience that resembles jamais vu: word blindness. A survey at SMU revealed that most college students (60%; N = 167) have experienced a familiar word suddenly looking unrecognizable, so that it momentarily appears to be a nonword. Females report this more than males (56% vs. 38%) and older students (junior/senior) more than younger students (freshman/sophomore) (58% vs. 35%). Of those who admit to word blindness, most have it at least every few months. Words that respondents report becoming "blind" to are surprisingly simple, such as "were, through, is, of, mine, grow, from, actual." A few were longer ("preservation, statutory"), and most are abstract nouns or function words. Also related to jamais vu is semantic satiation, where the meaning of a word dissolves after repeated oral presentations or pronunciations. However, this is a poorer model because meaning dissolves only after forced repetition. Jamais vu, on the other hand, seems to occur without apparent repetition. Jamais vu may be more common than reported, but is not noticed as readily as is déjà vu. Perhaps when subjective unfamiliarity contrasts with objective familiarity, it is not as attention-grabbing as déjà vu or it can be dismissed more easily. Current evidence has not provided a clear link between the two phenomena (Brown, 2004), but this would be a fruitful avenue to pursue to help clarify the mechanisms underlying déjà vu.
7. Concluding Remarks
The déjà vu illusion has received considerable attention over the past century and has stimulated over 40 different interpretations (Brown, 2004). Empirical evaluations of some of these theoretical positions have recently appeared in the published literature. This chapter summarized tentative support for déjà vu as possibly caused by: two perceptions that occur in rapid succession, a momentarily inaccessible prior experience of the present scene, an overly generalized familiarity emanating from one portion of a scene, and a general-form match between the present and a past experience. Evidence from brain pathology and stimulation suggests that we may be close to identifying specific brain structures involved in this illusion of false positive recognition. Examining a subtle cognitive dysfunction like déjà vu among cognitively disturbed or medicated patients will always be difficult (Brown, 2004), but findings relating déjà vu to milder forms of cognitive dysfunction (e.g., dissociation; Adachi, Akanu, Adachi, Adachi, Ikeda,
et al., 2008) and medication side effects (Kalra, Chancellor, & Zeman, 2007) may elucidate biological and cognitive dimensions of the experience. In closing, the accumulating body of intriguing research on déjà vu will hopefully encourage us to spend more effort delving into memory illusions as a means of understanding normal memory function (cf. Roediger, 1996). These obtuse messages from the brain are potentially packed with fascinating secrets about cognitive function.
REFERENCES
Adachi, N., Akanu, N., Adachi, T., Adachi, Y., Ikeda, H., Ito, M., et al. (2008). Déjà vu experiences are rarely associated with pathological dissociation. Journal of Nervous and Mental Disease, 196, 417–419.
Angell, J. R. (1908). Psychology. New York: Henry Holt.
Banister, H., & Zangwill, O. L. (1941). Experimentally induced olfactory paramnesias. British Journal of Psychology, 32, 155–175.
Banister, H., & Zangwill, O. L. (1941). Experimentally induced visual paramnesias. British Journal of Psychology, 32, 30–51.
Bartolomei, F., Barbeau, E., Gavaret, M., Guye, M., McGonigal, A., Regis, J., et al. (2004). Cortical stimulation study of the role of rhinal cortex in deja vu and reminiscence of memories. Neurology, 63, 858–864.
Bernstein, I. H., & Welch, K. R. (1991). Awareness, false recognition, and the Jacoby–Whitehouse effect. Journal of Experimental Psychology: General, 120, 324–328.
Bowles, B., Crupi, C., Mirsattari, S. M., Pigott, S. E., Parrent, A. G., Pruessner, J. C., et al. (2007). Impaired familiarity with preserved recollection after anterior temporal-lobe resection that spares the hippocampus. Proceedings of the National Academy of Sciences of the United States of America, 104(41), 16382–16387.
Brown, A. S. (2003). A review of the déjà vu experience. Psychological Bulletin, 129, 394–413.
Brown, A. S. (2004). The déjà vu experience. New York: Psychology Press.
Brown, A. S., & Marsh, E. J. (2007). Object familiarity can be altered in the presence of other objects. Paper presented at the annual convention of the Psychonomic Society, Long Beach, CA.
Brown, A. S., & Marsh, E. J. (2008). Evoking false beliefs about autobiographical experience. Psychonomic Bulletin & Review, 15, 186–190.
Brown, A. S., & Marsh, E. J. (2009). Creating illusions of past encounter through brief exposure. Psychological Science, 20, 534–538.
Brown, A. S., Porter, C. L., & Nix, L. A. (1994). A questionnaire evaluation of the déjà vu experience. Paper presented at the annual convention of the Midwestern Psychological Association, Chicago, IL.
Chapman, A. H., & Mensh, I. N. (1951). Déjà vu experience and conscious fantasy in adults. Psychiatric Quarterly Supplement, 25, 163–175.
Cleary, A. M. (2004). Orthography, phonology, and meaning: Word features that give rise to feelings of familiarity in recognition. Psychonomic Bulletin & Review, 11, 446–451.
Cleary, A. M. (2008). Recognition memory, familiarity, and déjà vu experiences. Current Directions in Psychological Science, 17, 353–357.
Cleary, A. M., & Reyes, N. L. (2009). Scene recognition without identification. Acta Psychologica, 131, 53–62.
Cleary, A. M., Ryals, A. J., & Nomi, J. S. (2009). Can déjà vu result from similarity to a prior experience? Support for the similarity hypothesis of déjà vu. Psychonomic Bulletin & Review, 16, 1082–1088.
Cleary, A. M., & Specker, L. E. (2007). Recognition without face identification. Memory & Cognition, 35, 1610–1619.
Dashiell, J. F. (1937). Fundamentals of general psychology. Boston, MA: Houghton Mifflin Company.
Dickens, C. (1849). The personal history of David Copperfield. London: Penguin Books.
Funkhouser, A. T. (1983). A historical review of déjà vu. Parapsychological Journal of South Africa, 4, 11–24.
Gardiner, J. M., Ramponi, C., & Richardson-Klavehn, A. (1998). Experiences of remembering, knowing, and guessing. Consciousness and Cognition: An International Journal, 7, 1–26.
Gellatly, A., Banton, P., & Woods, C. (1995). Salience and awareness in the Jacoby–Whitehouse effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1374–1379.
Jacoby, L. L., & Whitehouse, K. (1989). An illusion of memory: False recognition influenced by unconscious perception. Journal of Experimental Psychology: General, 118, 126–135.
Jones, T. C., & Atchley, P. (2006). Conjunction errors, recollection-based rejections, and forgetting in a continuous recognition memory test: Little evidence for recollection. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 374–379.
Joordens, S., & Merikle, P. M. (1992). False recognition and perception without awareness. Memory and Cognition, 20, 151–159.
Kalra, S., Chancellor, A., & Zeman, A. (2007). Recurring déjà vu associated with 5-hydroxytryptophan. Acta Neuropsychiatrica, 19, 311–313.
Klinger, M. R. (2001). The roles of attention and awareness in the false recognition effect. American Journal of Psychology, 114, 93–114.
Kovacs, N., Auer, T., Balas, I., Zambo, K., Klivenyi, P., Horvath, K., et al. (2009). Neuroimaging and cognitive changes during déjà vu. Epilepsy & Behavior, 14(1), 190–196.
Mack, A. (2003). Inattentional blindness: Looking without seeing. Current Directions in Psychological Science, 12, 180–184.
Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: The MIT Press.
Mantyla, T. (1993). Knowing but not remembering: Adult age differences in recollective experience. Memory & Cognition, 21, 379–388.
Marsh, E. J., & Brown, A. S. (2010). Becoming familiar by the company you keep. Under review.
Merikle, P. M., Smilek, D., & Eastwood, J. D. (2001). Perception without awareness: Perspectives from cognitive psychology. Cognition, 79, 115–134.
Neppe, V. M. (1983). The psychology of déjà vu: Have I been here before? Johannesburg, South Africa: Witwatersrand University Press.
O'Connor, A. R., Barnier, A. J., & Cox, R. E. (2008). Déjà vu in the laboratory: A behavioral and experiential comparison of posthypnotic amnesia and posthypnotic familiarity. International Journal of Clinical and Experimental Hypnosis, 56, 425–450.
O'Connor, A. R., & Moulin, C. J. A. (2006). Normal patterns of déjà experience in a healthy, blind male: Challenging optical pathway delay theory. Brain and Cognition, 62, 246–249.
O'Connor, A. R., & Moulin, C. J. A. (2008). The persistence of erroneous familiarity in an epileptic male: Challenging perceptual theories of déjà vu activation. Brain and Cognition, 68, 144–147.
Osborn, H. F. (1884). Illusions of memory. North American Review, 138, 476–486.
Payne, B. K., Cheng, C. M., Govorun, O., & Stewart, B. (2005). An inkblot for attitudes: Affect misattribution as implicit measurement. Journal of Personality and Social Psychology, 89, 277–293.
Roediger, H. L. (1996). Memory illusions. Journal of Memory and Language, 35, 76–100.
Roediger, H. L., & McDermott, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 803–814.
Sander, W. (1874). Ueber Erinnerungstäuschungen. Archiv für Psychiatrie und Nervenkrankheiten, 4, 244–253.
Thompson, R. G., Moulin, C. J. A., Conway, M. A., & Jones, R. W. (2004). Persistent déjà vu: A disorder of memory. International Journal of Geriatric Psychiatry, 19, 906–907.
Titchener, E. B. (1928). A text-book of psychology. New York: Macmillan.
CHAPTER THREE
Spacing and Testing Effects: A Deeply Critical, Lengthy, and At Times Discursive Review of the Literature
Peter F. Delaney, Peter P. J. L. Verkoeijen, and Arie Spirgel

Contents
1. Introduction
2. A Field Guide to the Spacing Literature: Spotting Impostors
  2.1. Recency Effects
  2.2. Intentional Learning and Mixed Lists: Rehearsal Effects and Strategy-Switching
  2.3. Primacy and Recency Buffers: The Zero-Sum Effect
  2.4. Deficient-Processing Effects
  2.5. Incidental Learning and Mixed Lists: List-Strength Effects
  2.6. Summary: The Impostor Effects and Confounds in Spacing Designs
3. The Failure of Existing Spacing Theories
  3.1. Intention Invariance
  3.2. Age-Invariance
  3.3. Species Invariance
  3.4. The Glenberg Surface
  3.5. Deliberate Contextual Variability at the Item Level Doesn't Help
  3.6. Recognition Required for Spacing Benefits
  3.7. Semantic and Perceptual Priming Accounts for Cued-Memory Tasks
  3.8. Hybrid Accounts
  3.9. Summary: Theories and Key Phenomena
4. Extending a Context Plus Study-Phase Retrieval Account of Spacing Effects
  4.1. An Account of the List-Strength Effect Using SAM
  4.2. A Modified One-Shot Account of Spacing?
  4.3. Some Experiments Linking Context and Spacing
  4.4. Directed Forgetting as a List-Strength Phenomenon
  4.5. Summary and Untested Predictions of the Account
Psychology of Learning and Motivation, Volume 53 ISSN 0079-7421, DOI: 10.1016/S0079-7421(10)53003-2
© 2010 Elsevier Inc. All rights reserved.
5. The Testing Effect
  5.1. Early Research: Tests Slow Forgetting
  5.2. The Importance of Retention Interval
  5.3. The Return of Deficient-Processing Accounts
  5.4. Transfer-Appropriate Processing Accounts
  5.5. Retrieval Effort and Desirable Difficulty
  5.6. Why Does Testing Help More Than Restudy?
  5.7. Testing Effects for Integrated Stimuli
  5.8. Summary: The Testing Effect
6. Spacing and Testing in Educational Contexts
  6.1. Do Spacing and Testing Improve Learning or Just Memory?
  6.2. How Prevalent Are Spacing and Testing in Classroom Settings?
  6.3. How Can One Improve Learners' Use of Spacing and Testing?
  6.4. Are There Individual Differences in Spacing and Testing?
7. Conclusions
References
Abstract
What appears to be a simple pattern of results (distributed-study opportunities usually produce better memory than massed-study opportunities) turns out to be quite complicated. Many "impostor" effects such as rehearsal borrowing, strategy changes during study, recency effects, and item skipping complicate the interpretation of spacing experiments. We suggest some best practices for future experiments that diverge from the typical spacing experiments in the literature. Next, we outline the major theories that have been advanced to account for spacing studies while highlighting the critical experimental evidence that a theory of spacing must explain. We then propose a tentative verbal theory based on the SAM/REM model that utilizes contextual variability and study-phase retrieval to explain the major findings, as well as predict some novel results. Next, we outline the major phenomena supporting testing as superior to restudy on long-term retention tests, and review theories of the testing phenomenon, along with some possible boundary conditions. Finally, we suggest some ways that spacing and testing can be integrated into the classroom, and ask to what extent educators already capitalize on these phenomena. Along the way, we present several new experiments that shed light on various facets of the spacing and testing effects.
1. Introduction
This chapter reflects our best attempt to review the state of theoretical and empirical knowledge on the family of memory effects that deal with the impact of studying the same thing several times: the distributed-practice family. Extra study opportunities produce better memory, but how we distribute those study opportunities is also important for memory.
The distributed-practice family of effects comprises a variety of phenomena, including the spacing effect, lag effect, and testing effect. Cognitive psychologists have produced hundreds of papers over the last century arguing that there is a spacing effect, that is, a memory advantage to restudying something with a delay between the repetitions compared to immediate restudy. The spacing effect is often viewed as an instance of the broader lag effect, in which longer spacing intervals are associated with changes in later recall. Specifically, the lag effect reveals that short spacing results in lower recall relative to moderate spacing, and that recall begins to decline again at very long spacing. Finally, the spacing effect's first cousin is the testing effect, which refers to the advantage of testing an item relative to just studying it again. Thus, the distributed-practice family includes several of memory theory's favored children because of their obvious implications for improving education. We intend this chapter to serve as a comprehensive review of the spacing and testing literature and their associated theories, circa 2010. We are due for a long narrative review of the spacing literature anyway. This review, like many others, culminates with a theoretical proposal that attempts to explain the vast range of empirical results in the spacing literature. We also present some new data and draw attention to some recent papers whose importance might otherwise be missed. What we think our review contributes beyond that is a careful experimental analysis of the task used in spacing experiments: verbal list learning. No one is inherently excited about word lists, but they have been used in the preponderance of studies on the spacing effect, and therefore understanding what people are doing in these experiments is critical.
We will take the rather strange stance that there is a "real" spacing effect somewhere and that all of the other phenomena (e.g., rehearsal borrowing, strategy changes during study) are "impostors" that masquerade as the spacing effect. Just because many different phenomena have a similar observable outcome (namely, better memory for spaced repetitions than for massed repetitions) does not mean that all of these phenomena are the same. It would be like arguing that giving extra study time and asking people to process items for survival value are "really the same thing" because they both result in better memory for studied items. We cannot rely on similar outcomes in recall rates as the sole diagnostic criterion for identifying the spacing effect. For example, "deficient processing" accounts of spacing propose that when people encounter a massed repetition, they exert less encoding effort on the second presentation than they do for second spaced repetitions. Several studies have demonstrated that deficient processing does happen in some cases, and it produces a spacing effect. Furthermore, the deficient-processing effect can be discriminated from other spacing effects because it weakens the benefit of massed repetitions over single presentations rather than enhancing the recall of spaced items relative to massed repetitions. Therefore, although it is
phenomenologically similar, the deficient-processing effect is not the "real" spacing effect; it is an impostor. These impostors often produce effect sizes as large as or larger than the "real" spacing effect. Furthermore, they may operate in the same direction as the real spacing effect, thereby greatly exaggerating its impact, or they may operate in the opposite direction from the real spacing effect, canceling it out. Without a careful experimental analysis of participants' behavior during the verbal learning task, it is quite difficult to understand the circumstances under which the "real" spacing effect occurs and the circumstances under which it does not. This confusion has produced a bewildering thicket of experimental results that seemingly contradict one another. In this chapter, we do our very best to untangle the thicket on a briar-by-briar basis, identifying the impostor phenomena and providing guidelines for running future impostor-free spacing experiments. A crucial part of this effort involves the analysis of the strategies that participants use when they study lists of words. For the past 10 years, our laboratories have worked to understand what people do when they encounter an instruction to "study words for a later memory test" and how the strategies they choose interact, often in surprising ways, with the number of lists people study, whether items are repeated in a massed or spaced fashion, and whether massed and spaced repetitions are mixed together on the list or kept on separate lists. We think that very few experimental studies meet rigorous standards for comparing theoretical views about the "true" cause of spacing effects, because human participants do not cooperate with researchers by "just behaving normally" during memory experiments. Instead, they devise a variety of clever strategies for memorizing lists of words, and these strategies interact in surprising ways with the structure of the lists to affect memory.
For example, we will see that rote rehearsal strategies sometimes enhance and sometimes reduce the impact of spacing, depending on the structure of the list.
2. A Field Guide to the Spacing Literature: Spotting Impostors
There have been three major meta-analyses of the spacing literature conducted in the past decade (Cepeda, Pashler, Vul, Wixted, & Rohrer, 2006; Donovan & Radosevich, 1999; Janiszewski, Noel, & Sawyer, 2003), which produced conflicting results that depend on which studies were included. The most comprehensive meta-analysis of verbal learning was the most recent (Cepeda et al., 2006), which identified confounds in some earlier studies and included the largest number of studies. For each study, they assessed the lag between repetitions (i.e., how much time passes between
each repetition), and the retention interval (i.e., how much time passes between the last repetition and the test). For any given retention interval, there is an optimal lag between repetitions that maximizes memory. Shorter-than-optimal and longer-than-optimal lags between repetitions produce suboptimal memory. Furthermore, as the retention interval increases, so does the optimal lag between repetitions. Therefore, memory is a function of both the retention interval and the lag between repetitions. Finally, they found that the same pattern held for both free recall and cued-recall tests. Their analysis represents the best current conclusions regarding the spacing effect. However, their reliance on verbal learning data is problematic due to the large number of confounds present in existing spacing studies. Specifically, there are a variety of impostor spacing effects that deserve their own names, and should be carefully watched for in studies that attempt to measure the "true" spacing effect. Table 1 outlines the major phenomena we will review here that affect the conclusions of many spacing studies.
2.1. Recency Effects
It is fitting to begin with recency effects, an impostor that is so well known that it stars in virtually every introductory psychology textbook's discussion of memory. The problem with recency confounds in spacing studies is an old one in the literature, highlighted by Crowder's (1976) review. Specifically, because spaced items must occur in multiple locations on the list, their final presentation tends to be more recent than an equal number of massed items unless care is taken to equate the final positions. Because recent items are more easily recalled than older items, an artifactual spacing effect can be observed. One approach to solving this problem, whose discovery was attributed to Melton (1967) by Crowder, was to use primacy and recency "buffer" items that would not be tested, or just not counted for free recall. In fact, this approach was used earlier by Waugh (1962), but it is not terribly effective at controlling recency. Zimmerman (1975), for example, found an extended recency function that produced 20% higher recall for later-presented than earlier-presented items, even though he included primacy and recency buffers. He required participants to focus on only the current item, which eliminated the primacy effect, but resulted in an extended recency function. Even in recent work, recency control has been a problem. Toppino and Bloom (2002), in their Experiment 1, replicated an experiment of Greene (1989) that compared free recall following incidental and intentional learning. The lists contained some massed and some spaced items, with spaced items of varying lag. Greene tried to control for recency biases by counterbalancing the assignment of words to quadrants of the list.
Table 1. Five Impostors: Spacing-Like Phenomena.
1. The recency effect. Even if we control rehearsal, there is an extended recency function. Failing to account for this can artificially enhance the memory of spaced items, because their last presentations are more recent and therefore stronger.
2. Rehearsal-borrowing effects on mixed lists. Mixed lists encourage rehearsal borrowing, which artificially inflates the spacing effect on mixed lists relative to pure lists. The degree of borrowing varies depending on list structure as well, so one can create some super-spaced items unintentionally. This effect is often wrongly discounted as unimportant because spacing effects emerge in incidental learning, and because people often change encoding strategies during study (see Delaney & Knowles, 2005).
3. The zero-sum effect on pure lists. Because people rehearse during study, there is no guarantee, particularly with pure-list designs, that the primacy items won't receive differential practice on some types of lists compared to others. A spacing effect occurs on pure lists if you throw away the beginning of the list, but only because the beginning of the list benefits tremendously from displaced rehearsal on all-massed lists.
4. Deficient-processing effects. There is a family of deficient-processing effects, including the deficient-processing effect, in which processing is reduced; the Rose effect, in which people choose to spend less time on massed items when they have control over study time; and the speed effect, in which too-fast presentation rates encourage people to mass items or to skip spaced items.
5. List-strength effects. In free recall, there are output effects at recall that favor spaced items over massed items. These effects appear only on mixed lists and vanish on pure lists.
The Toppino and Bloom study was virtually an exact replication of the experiment, except that it more carefully controlled recency by controlling the position of the second presentation of words instead of just the quadrant. Surprisingly, this subtle change eliminated the spacing effect for incidental learning observed by Greene. The study highlights the fact that seemingly minor recency biases can inflate or deflate the magnitude of spacing effects, altering our conclusions about the magnitude of the spacing effect—or even its presence or absence under varying conditions.
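The positional bias behind the recency confound can be checked with a little arithmetic. The sketch below is an illustration only, not a reconstruction of any design discussed here: it assumes that the first presentation of a twice-presented item is equally likely to fall in any list position that leaves room for the second presentation, and it treats a massed item as a repetition at lag 1. The function name and the 20-position list are hypothetical choices.

```python
from fractions import Fraction


def mean_final_position(list_length, lag):
    """Mean 1-based position of an item's second presentation.

    Assumes (for illustration) that the first presentation is placed
    uniformly over every position that leaves room for the second one.
    """
    # Valid positions for the first presentation: 1 .. list_length - lag
    starts = range(1, list_length - lag + 1)
    finals = [start + lag for start in starts]
    return Fraction(sum(finals), len(finals))


massed = mean_final_position(20, lag=1)   # second presentation follows immediately
spaced = mean_final_position(20, lag=6)   # five other positions intervene
print(massed, spaced)
```

Under these assumptions the spaced item's last presentation falls, on average, later in the list than the massed item's, which is the kind of bias that must be removed by equating final positions rather than, say, list quadrants.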
2.2. Intentional Learning and Mixed Lists: Rehearsal Effects and Strategy-Switching
Our second spacing impostor is the rehearsal-borrowing effect. Like recency, rehearsal is a well-known phenomenon, but it also provides a convincing impostor spacing effect when rehearsal favors spaced items over massed items. It is a serious problem for most spacing studies, because
most spacing studies use a mixed-list design, meaning that they have massed repetitions and spaced repetitions on the same list. Furthermore, they use nonspecific instructions to study the words on the list for a later memory test, and therefore they do not really control how long people study each item. Such designs encourage rehearsal borrowing that redistributes study time away from massed items and awards it to spaced items. The obvious result of spending a much longer time in studying the spaced items than the massed items is that the spaced items are better remembered on a test. Hall (1992a) went so far as to revive the theory that rehearsal borrowing was the only mechanism necessary to explain the emergence of a spacing effect in most studies. The borrowing explanation was first advanced in the original Atkinson and Shiffrin (1968) "modal model" paper. Atkinson and Shiffrin argued that people comply with instructions to study a list of words by reading each item and then rehearsing earlier-presented items in a short-term memory buffer. Because the buffer had limited capacity, adding new items to the rehearsal buffer resulted in dropping some earlier words. The time in the buffer (equivalent to the number of rehearsals the item received) would then predict its later strength and hence probability of final recall. Such a mechanism would naturally produce a spacing effect and a lag effect, because spaced items (but not massed items) appear in multiple places on the list. The longer the lag between presentations, the more likely it was that the item had already received a "full run" through the buffer when it was next encountered. Upon being refreshed, it would get a new run through the buffer, receiving extra rehearsals. However, massed items appear in only one location on the list, and therefore get only one "full run" through the rehearsal buffer. The result is more rehearsals for spaced than for massed items.
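The buffer account above can be made concrete with a toy simulation. This is a minimal sketch under stated assumptions, not the Atkinson and Shiffrin model itself: the buffer always drops its oldest item when full, a re-presented item that is still in the buffer is simply moved back to the newest slot, and every item held in the buffer receives exactly one rehearsal per presentation step. The function name, the capacity of 3, and the item labels are all hypothetical.

```python
from collections import Counter, OrderedDict


def simulate_rehearsals(presentations, capacity):
    """Count rehearsals per item for a fixed-capacity rehearsal buffer."""
    buffer = OrderedDict()   # keys ordered from oldest to newest entry
    rehearsals = Counter()
    for item in presentations:
        if item in buffer:
            # Repetition of an item still being rehearsed: restart its run.
            buffer.move_to_end(item)
        else:
            if len(buffer) >= capacity:
                buffer.popitem(last=False)   # assumed rule: drop the oldest item
            buffer[item] = True
        # Assumed accounting: one rehearsal per held item per step.
        for held in buffer:
            rehearsals[held] += 1
    return rehearsals


# "M" is repeated immediately (massed); "S" is repeated after three other items (spaced).
study_list = ["M", "M", "a", "b", "S", "c", "d", "e", "S", "f", "g", "h"]
counts = simulate_rehearsals(study_list, capacity=3)
print(counts["M"], counts["S"], counts["a"])
```

In this toy run the spaced item has left the buffer before its repetition and so earns a second full run, whereas the massed item's repetition only extends its single run by one step. A probabilistic replacement rule would blur these counts but should preserve the ordering.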
Rundus (1971) verified this prediction using rehearse-aloud protocols, and discovered that the probability of rehearsing an item was directly predictive of its probability of later recall. He further showed that spaced items received more rehearsal than did massed items, demonstrating rehearsal borrowing. If borrowing is pervasive on mixed lists, we would expect that mixed lists greatly overestimate the true benefit of spacing. Furthermore, if Hall’s (1992a) contention were correct and rehearsal borrowing were the only mechanism necessary to explain the spacing effect, then spacing would be virtually useless as a learning tool. The goal of spacing practice is to improve memory for all of the to-be-learned items, not to selectively improve memory for a few of the items at the expense of the rest! Because of this concern, Hall used pure lists—that is, lists composed of only spaced items or only massed items—to see if the spacing effect would disappear once people could no longer borrow time from massed items to help the spaced items. In three experiments, he showed that studying pure lists eliminated the spacing effect on a free recall test, using presentation times ranging from 1 to 4 s per item. Furthermore, compared to a mixed list, the pure lists resulted in lower
Peter F. Delaney et al.
recall of spaced items and higher recall of massed items. The latter result is consistent with Hall’s argument that for mixed lists, rehearsal borrowing awards extra study time to the spaced items at the expense of the massed items. In another study, Hall (1992b) compared pure lists of spaced items with pure lists of once-presented items that were presented for the same total duration. At 2, 4, and 6 s per item presentation rates, he obtained no spacing advantages with free recall tests. Taken together, the results suggested that rehearsal borrowing might be a serious problem for our conclusions about the spacing effect, since virtually all of the studies in the literature use mixed-list designs together with intentional learning.
Two later studies seemed to overturn Hall’s (1992a) conclusions, however. An important paper by Toppino and Schneider (1999) demonstrated that you could still get spacing effects on pure lists, provided multiple study lists were employed (with a free recall test after each list). We will later see that the inclusion of multiple lists within the session is important because people change how they study throughout the course of an experiment. Toppino and Schneider also included a condition that used a mixed list, but where each half of the list was pure. That is, the first half of the list contained only spaced or only massed items, while the second half contained the opposite type of item. These “special” lists would presumably reduce the extent of rehearsal borrowing across item types (if that borrowing tended to come from recent items). Their most crucial evidence against the rehearsal-borrowing explanation was that the pure lists and the “special” mixed lists produced relatively similar spacing effects (8% for the “special” mixed lists and 7% for the pure lists). It is worth noting, however, that Hall found that “regular” mixed lists produced spacing effects roughly twice as large (14%).
A later paper by Kahana and Howard (2005) also obtained spacing effects in free recall using pure lists, and further demonstrated that the lag effect was present. Results such as these, especially when combined with earlier papers that obtained spacing effects using pure lists (Underwood, 1969, 1970; see Footnote 1), seemed to indicate that rehearsal was less important than Hall (1992a) had believed. However, more recent work has suggested that the story is more complicated, and we will discuss this more recent research next.
Footnote 1: Underwood’s (1969, 1970) studies were atypical, however, in that they used very slow presentation rates (10 s per item) and often many repetitions, which would tend to produce deficient processing effects; see below for more on deficient processing.
2.2.1. People Do Not All Rehearse, and They Change Strategies with Practice
Hall (1992a) assumed that most people comply with the instructions to study words for a later memory test by rehearsing. But do they really? Ironically, there are almost no studies that have asked the straightforward
Spacing and Testing Effects
behavioral question, “What do people do when you tell them to study words for a later memory test?” When we create cognitive models, we typically implicitly assume that people (a) all do pretty much the same thing, and (b) do pretty much the same thing from one trial to the next. To the first author, who was trained in the problem-solving tradition, these assumptions seemed rather flimsy. After all, rather ordinary people can obtain digit spans greater than 70 with a few months’ practice (e.g., Chase & Ericsson, 1981), and they rapidly discover better strategies than rote rehearsal. At the extreme, memory experts like the memorist Rajan can, in just a few days of practice, discover new mnemonic strategies to deal with memory tasks deliberately created to interfere with their existing mnemonic techniques (Ericsson, Delaney, Weaver, & Mahadevan, 2004).
We therefore conducted a series of studies using methods typically reserved for the thinking literature. We asked participants to study lists of words, but afterwards asked them to tell us what they were thinking as they studied the words. We then coded these verbal reports into strategy groups (Delaney & Knowles, 2005; Sahakyan & Delaney, 2003). It turns out that on the first list of words that people study, about 70% use a rote rehearsal strategy in which they read each item as it appears and then rehearse earlier items. However, rote rehearsal is not a terribly effective memory strategy, and if people receive a test after each list, they will often abandon rote rehearsal for something else. The second most frequent strategy after rehearsal was the story mnemonic (Bower & Clark, 1969; Drevenstedt & Bellezza, 1993; Reddy & Bellezza, 1983), in which people make up a story using all the words on the list. There are various other “deep” mnemonics that people use, like linking each word to their own personal experiences or making up sentences using each word.
On the first list, about 16% of participants used a deep encoding strategy. However, by the fourth study list, about equal numbers of people (43–44%) were using a deep strategy and the rote rehearsal strategy. Thus, when people study multiple lists, they tend to abandon rote rehearsal in favor of more effective strategies. Tests are one way to induce people to switch strategies. In fact, you do not even have to explicitly test people; metacognitive judgments or various disruptions of the rehearsal strategy between two lists also result in strategy changes (see Sahakyan & Delaney, 2003, 2005; Sahakyan, Delaney, & Kelley, 2004). Strategy changes favoring better encoding on later-studied lists may also work to ameliorate the deleterious effects of proactive interference build-up in cases when people are instructed to study word lists without any specific instructions on the strategy to use during study (Szpunar, McDermott, & Roediger, 2008). We have summarized the impact of encoding strategy on the magnitude of the spacing effect in Table 2, based on several recent studies conducted in our laboratories. Delaney and Knowles (2005) explored the role of study strategy in the spacing effect on pure lists of words. In Experiment 1, they
Table 2 Magnitude of the Spacing Effect in Free Recall by Encoding Strategy and List Type.
Strategy                      Mixed lists    Pure lists
Rehearse each item alone      Small          Small
Rehearse the items together   Large          Null
Story mnemonic                Large          Small
Note: Assuming a list of 32 items presented twice and free recall testing, a small effect is about a 6% spacing advantage, a large effect is around 15%, and a null effect is less than 2%. Mixed lists contain both spaced and massed items, while pure lists contain only spaced or massed items (but not both).
partitioned their data into participants who used rote rehearsal and those who used a “deep” encoding strategy like the story mnemonic. Replicating Hall (1992a), when people reported using rote rehearsal, there was no significant spacing effect on pure lists; at best, there was a small (1–2%) advantage. There was no spacing effect regardless of how many lists they had studied, provided they stuck with rote rehearsal throughout. However, for people who switched to a deep encoding strategy, the spacing effect emerged on pure lists. Thus, Delaney and Knowles concluded that Hall’s participants, who saw only a single list, were mostly using rote rehearsal, and thus showed no spacing effect. However, later papers like Toppino and Schneider’s (1999) study had people study multiple lists, which caused people to abandon the rote rehearsal strategy. Consequently, they obtained a significant spacing effect even on pure lists.
In a second experiment, Delaney and Knowles (2005) controlled the study strategy their participants used by instructing them to use either a rote rehearsal strategy or the story mnemonic. They again found no reliable spacing effect in the rote rehearsal condition, but a significant spacing effect in the story mnemonic condition, confirming their earlier results.
Paivio and Yuille (1969) had earlier shown similar strategy switching for cued recall. They found that participants often start by using a rehearsal strategy, but switch to a mediation or imagery-based strategy. Thus, the concern about the number of lists employed in spacing experiments, and the particular mix of strategies used, is not limited to free recall and single-item recognition experiments, although no one has specifically repeated the Delaney and Knowles (2005) experiments using cued recall.
Bahrick and Hall (2005) have argued for item-specific strategy changes in cued recall, such that when people see a pair again, if they retrieve their earlier association they will strengthen it. However, if they fail to retrieve that association, then they generate a new one. In a Darwinian selection/retention process, successful mediators are retained while unsuccessful ones are replaced, resulting in better memory following long spacing of items.
2.2.2. Rote Rehearsal and the Borrowing Hypothesis Revisited
In a recent paper, we examined the rote rehearsal strategy in order to learn how rehearsal interacts with list structure (Delaney & Verkoeijen, 2009). Specifically, we asked our participants to rehearse using the rote rehearsal strategy as described to us by people who used it in our earlier laboratory studies. Our participants described a process we called the rehearse-together strategy, in which they would read each word as it appeared on screen and then use any remaining time to rehearse earlier items. Consistent with the Delaney and Knowles (2005) studies, we found that the rehearse-together strategy resulted in a null spacing effect on pure lists. However, it resulted in a large spacing effect on mixed lists. The same results were obtained with both free recall and recognition tests.
In order to understand how rehearsing groups of items affected memory, we compared the rehearse-together conditions to a rehearse-alone condition, in which participants read each word and then repeated only that item until the next item appeared (see also Wright & Brelsford, 1978; Zimmerman, 1975). In several experiments, we found identical small spacing effects on pure and mixed lists using the rehearse-alone condition. The experiments are particularly dramatic because they show that the “real” spacing effect (as manifest in the rehearse-alone condition) can be doubled in magnitude on mixed lists and eliminated on pure lists simply by changing how people study the lists. Another way of saying this is that the rehearsal confounds in a typical spacing experiment are larger than the spacing effect that the experiments are designed to study.
An earlier study by Wright and Brelsford (1978) also compared rehearse-alone and rehearse-together instructions, although they used only the mixed-list conditions.
In Experiment 1, they compared rehearse-alone and rehearse-together using overt rehearsal, and obtained no spacing effect with rehearse-alone instructions, but a significant spacing effect with rehearse-together instructions. However, their results were vulnerable to a floor effect interpretation (see p. 637), and we found that a spacing effect does emerge on mixed lists with rehearse-alone; it is just smaller than in the rehearse-together condition (Delaney & Verkoeijen, 2009). In their Experiment 2, they let people rehearse covertly, which may have allowed some of them to violate the instructions. However, they found results more similar to ours in that they obtained a larger spacing effect for rehearse-together than for rehearse-alone. Furthermore, in their rehearse-together condition, the massed items were recalled at a rate similar to singletons, consistent with displaced rehearsal.
Why does the rehearse-together strategy affect memory so differently on pure and mixed lists? One part of the story is that rehearse-together strategies manipulate recency effects in interesting ways. In the rehearse-alone condition, we obtained an extended recency effect (better memory for the end of the list) and no primacy effect (better memory for the
beginning of the list). This is similar to what one observes when people do not expect a test and do not try to study the words at all. When people rehearse items together, we obtained both primacy and recency effects. People tend to rehearse early items on the list throughout the entire duration of the list, making them artificially recent (cf. Tan & Ward, 2000). Another effect is that rehearsing earlier-studied items turns them into a kind of spaced item. When we asked people to rehearse out loud, we found that they tended to rehearse spaced items more frequently than massed items on the mixed lists (see also Rundus, 1971). In contrast, on pure lists, massed items benefit because they are more likely to receive distributed rehearsal than they would if people focused only on the current item, making them functionally similar to spaced words.
2.2.3. Summary
In summary, research often fails to control encoding strategy in spacing experiments, which results in participants adopting increasingly better study strategies across lists. Because different encoding strategies result in different magnitudes of the spacing effect, averaging across multiple lists, even when the order is counterbalanced, can produce misleading estimates of the true effect size. Encoding strategies that encourage rehearsal borrowing tend to result in much larger spacing effects on mixed lists than on pure lists. In the typical studies conducted in the past, researchers have used mixed lists and intentional rehearsal, which encourage borrowing. Since the borrowing effect is as large as or larger than the actual spacing effect, such studies cannot provide accurate estimates of the true magnitude of the spacing effect.
2.3. Primacy and Recency Buffers: The Zero-Sum Effect
Our third impostor is also related to rehearsal borrowing, and we call it the “zero-sum effect” (Verkoeijen & Delaney, 2008). The zero-sum effect is a consequence of the common experimental practice of throwing away some of the items on the list and measuring recall of the rest. Waugh (1962) introduced the practice of including items at the beginning and end of the list (called primacy and recency buffers, respectively) that were not counted and served only to reduce the impact of primacy and recency biases on massed versus spaced comparisons. This practice has apparently been enforced by generations of spacing researchers, as it is used in the majority of studies. One of the unusual features of the Delaney and Knowles (2005) and Delaney and Verkoeijen (2009) studies is that they do not include any primacy or recency buffers. Consistent with our general position that everything spacing researchers think is good is really bad, we think primacy and recency buffers are problematic, especially if they are used on pure lists.
To understand why, it is important to first note that Toppino and Schneider (1999) showed that the serial position functions of pure-spaced and pure-massed lists differ in interesting ways. Specifically, the pure-massed lists show an enhanced primacy effect compared to the pure-spaced lists, resulting in a crossover interaction such that massing produced better memory for the beginning of the list while spacing produced better memory in the rest of the list. Toppino and Schneider termed this the enhanced primacy effect. However, we have already proposed that the spacing effect observed in Toppino and Schneider’s (1999) study reflected a mixture of strategies. When one plots the serial position function for the rehearse-together strategy, one obtains an enhanced primacy effect, but no overall spacing effect (Delaney & Knowles, 2005; Delaney & Verkoeijen, 2009). The serial position function for the story mnemonic produces no primacy and a weak recency effect, with a spacing effect throughout the list. If one mixes together some rehearse-together participants and some story mnemonic participants (as we think Toppino and Schneider’s study naturally did), one would obtain a function that displays enhanced primacy, but also has a spacing effect. As that is exactly the pattern they obtained, the strategy mixing seems quite plausible.
Just because one uses pure lists does not mean rehearsal borrowing stops; it just means participants cannot borrow from massed items to help spaced items. One can still rehearse some items more often than others. There are well-established rehearsal frequency differences that depend on serial position, such as the primacy effect, which results from extensive rehearsal of the early items on the list (e.g., Tan & Ward, 2000). On pure-massed lists, the extra rehearsal for primacy items is likely to be greater than for pure-spaced lists, because each of those primacy items is presented right away for twice as long.
On the pure-spaced lists, in contrast, primacy items are already being replaced with new items soon after they are introduced. According to this logic, the enhanced primacy effect on massed lists is a result of rehearsal patterns that strengthen items at the start of the list. A corollary of this argument is that the apparent spacing effect in the rest of the list might be due to rehearsal borrowing, such that the strong primacy-region items steal rehearsal time away from the rest of the list on the massed lists. To test this idea, Verkoeijen and Delaney (2008) recently conducted a series of pure-list spacing experiments in which we required participants to use the rehearse-together strategy. As in our earlier studies, the spacing effect was small and nonsignificant. Our next step was to plot the serial position functions and to ask whether people who showed a bigger enhanced primacy effect—that is, a bigger massing advantage in the first quadrant—were the same people who showed a bigger spacing effect throughout the list. To illustrate, Figure 1 shows two participants, A and B. Participant A shows a large enhanced primacy effect, because she focuses on rehearsing the beginning of the massed list to a greater extent than
[Figure 1: two panels, Participant A and Participant B, each with list quadrant (1–4) on the x-axis and separate curves for massed and spaced lists.]
Figure 1 The zero-sum hypothesis proposes that if you show a bigger enhanced primacy advantage (Quadrant 1 is better recalled on massed than spaced lists), then you will show a larger spacing advantage throughout Quadrants 2–4. Participant A rehearsed the beginning of the massed list quite a lot, resulting in lower recall of the rest of the massed list. Participant B showed a smaller enhanced primacy effect, and hence better recall of the rest of the massed list.
Participant B. However, this extra rehearsal of the beginning of the massed list comes at a cost; compared to Participant B, she shows less memory for the rest of the massed list, resulting in a spacing advantage throughout the rest of the list. Verkoeijen and Delaney called this the zero-sum hypothesis, as it suggests that the better you do on one part of the list, the worse you are likely to do on the rest of the list. Indeed, this is exactly the pattern we found: people who showed larger enhanced primacy effects were the same people who showed larger spacing effects in the rest of the list. People who showed little or no enhanced primacy effect also showed little or no spacing effect in the rest of the list, suggesting trade-offs in memory. Turning back to the issue of primacy and recency buffers, it should be clear that they are part of the list to be studied from the perspective of the participants. Before we throw those parts of the list away, we should check whether the list structure affects the recall of the primacy and recency buffer items. Pure lists cease to be pure if they have primacy and recency buffers, because those items then receive rehearsal. Primacy buffer items, for example, are likely to be rehearsed throughout the list. This effect can be magnified if they are followed by a large number of massed repetitions, during which people will continue to rehearse the primacy buffer items. At this time, we also have no way of knowing whether on mixed lists the primacy buffer items receive more rehearsal during massed than during spaced repetitions. Hence, we do not favor the inclusion of primacy and recency buffer items—which is unfortunately a feature of the majority of spacing studies.
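The two per-participant quantities compared in the zero-sum analysis above can be written out explicitly. The sketch below is our own illustration, not the scoring procedure used by Verkoeijen and Delaney (2008): it assumes recall is summarized as a proportion for each of four list quadrants, and the function name `zero_sum_indices` and all numbers are made up for the example.

```python
def zero_sum_indices(massed, spaced):
    """Return (enhanced_primacy, rest_spacing) from recall proportions by list quadrant.

    massed, spaced: recall proportions for Quadrants 1-4 of the pure-massed
    and pure-spaced lists, respectively.
    """
    if len(massed) != 4 or len(spaced) != 4:
        raise ValueError("expected recall proportions for four list quadrants")
    # Enhanced primacy: massing advantage in the first quadrant.
    enhanced_primacy = massed[0] - spaced[0]
    # Spacing advantage averaged over the remaining three quadrants.
    rest_spacing = sum(spaced[1:]) / 3 - sum(massed[1:]) / 3
    return enhanced_primacy, rest_spacing


# Hypothetical recall proportions (illustration only, not real data).
participant_a = zero_sum_indices(massed=[0.80, 0.30, 0.30, 0.40],
                                 spaced=[0.60, 0.45, 0.45, 0.55])
participant_b = zero_sum_indices(massed=[0.65, 0.42, 0.42, 0.50],
                                 spaced=[0.60, 0.45, 0.45, 0.55])
print(participant_a, participant_b)
```

In this made-up example, the participant with the larger enhanced primacy index also has the larger spacing advantage in Quadrants 2–4, which is the trade-off the zero-sum hypothesis predicts. With a real sample, one would examine the relationship between the two indices across participants.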
In summary, even designs that throw away the primacy and recency regions can result in rehearsal-borrowing effects that differ between spaced and massed lists. This is because people may not distribute practice to the primacy and recency regions equally in spaced and massed lists. Thus, rehearsal-borrowing problems persist even with pure lists. Many of these problems can be ameliorated by controlling encoding strategy and by measuring recall rates for the entire list, and not just a portion of it.
2.4. Deficient-Processing Effects
One of the earliest proposed explanations for the spacing effect involved deficient processing, which is our fourth impostor. The idea behind deficient-processing explanations was that the second time an item is encountered, processing the item is somehow easier than it was the first time. In verbal learning studies where people study individual words, there is not usually very much “processing” that people need to do; they read the word and activate its meaning. Deficient processing makes more sense when people need to generate something on each repetition. For example, if we ask people to rate a twice-presented word for pleasantness, they have no need to think about their answer on the second occurrence unless they have forgotten their original answer. An even clearer example of deficient processing was demonstrated by Jacoby (1978), who asked people to solve word puzzles that consisted of two words. The first word was a cue that helped participants to solve the puzzle, and the second word had some missing letters. For example, he might present shoe – F _ _ T, and the answer would be FOOT. Jacoby found that when people had recently seen the word FOOT, these puzzles became trivial, and later memory for the word was much lower on a surprise cued-recall test compared to puzzles they had solved themselves (see also Cuddy & Jacoby, 1982).
A classic demonstration of deficient processing was a study by Thios (1972), who presented participants with sentences whose subject and object were sometimes repeated in a later sentence. Repetitions either used the same “sense” of the subject and object, or a homographic “sense” of the subject and object.
For example, if participants read, “The electric drill cut into the cinder block,” then a same-sense repetition might be, “The hi-powered drill entered the masonry block.” A homographic repetition might be, “The fire drill emptied the city block.” After 80 sentences, they were cued with the subject words and had to recall the object words. The major result of the study was that there was a spacing effect in both conditions, but by comparing the massed repetitions to once-presented sentences, Thios determined that massed homographic repetitions improved memory more than did massed same-sense repetitions. The results suggest that sentences that were more dissimilar reduced the massed-item
processing deficit. (In contrast, for spaced items, the lag effect was larger with same-sense repetitions.) Similar results were reported by Dellarosa and Bourne (1985). In Experiment 1, they either repeated sentences verbatim or paraphrased them. They further varied the lag, using massed repetitions and spaced repetitions with lags out to eight sentences. Changing the surface form of the sentence improved memory for massed repetitions, but had small and inconsistent impact on spaced repetitions. In Experiment 2, sentences were repeated using either the same-gender voice or a different gender voice. Switching the gender of the speaker improved memory for the massed sentences substantially, but improved memory for spaced sentences only slightly. Both of these results are consistent with a deficient-processing explanation whereby identical or nearly identical repetitions provide little benefit to memory when they are repeated without any lag. Another source of deficient processing can be participants’ own choices about how long to study. Zimmerman (1975) gave participants the option to control the rate at which items appeared on screen for study. By hitting the space bar, they could terminate the presentation and move to the next item. He found that people would terminate study of massed items more quickly than they would spaced items, suggesting that people would intentionally induce deficient processing on the massed items. Furthermore, people terminated study of short-lag items sooner than long-lag items, producing a lag effect. A study conducted by Shaughnessy, Zimmerman, and Underwood (1972, Experiment 3) produced similar results. A recent study by Toppino, Cohen, Davis, and Moors (2009) raises another possibility for deficient processing—though in this case, for spaced repetitions. Toppino et al. manipulated the difficulty of study items, and showed that for more difficult items, participants often failed to fully perceive them at rapid presentation rates. 
Under these circumstances, they showed better memory for massed than spaced repetitions. The Toppino et al. study suggests that if the presentation rates in a typical spacing study are too fast, people may have no choice but to skip some of the items to cope with the fast pace. If so, they might favor massed items, which they feel they have time to process fully, and skip many of the spaced items. The item-skipping approach predicts that if the presentation time is very fast, you might observe a reverse spacing effect (i.e., better memory for massed items). It turns out that is exactly what one finds. Metcalfe and Kornell (2003) used Spanish–English word pairs to demonstrate that at a 0.5-s presentation rate, the spacing effect reverses itself, and at a 1-s presentation rate, it is a null effect (for further null spacing effects at 1-s presentation rates, see Waugh, 1963, 1967, 1970).
In sum, there are several conditions under which people will show marked deficient processing of massed items (e.g., the deficient-processing effect), and a few cases when they will show deficient processing of spaced
items (e.g., fast presentation). These results obviously complicate the interpretation of spacing effects observed in many experiments; just as with rehearsal effects, they can sometimes magnify and sometimes diminish the effects of spacing on learning.
2.5. Incidental Learning and Mixed Lists: List-Strength Effects
We would like to raise one final issue that is often important in considering spacing effects, and that is the presence of list-strength effects in free recall. However, far from being a negative feature of spacing experiments, we think list-strength effects provide important evidence regarding the source of spacing benefits. Therefore, while the list-strength effect makes the impostors list, we think it may be a consequence of the “true” spacing effect rather than a confound (more on that later, in Section 4).
The list-strength effect was first demonstrated by Tulving and Hastie (1972), who showed that items presented multiple times on a study list reduced recall of the once-presented items. This inhibitory effect was consistent with global memory models like SAM that assumed that repeated items accumulate context strength and that stronger items are therefore sampled more frequently when the context is used as a cue to retrieve them (Ratcliff, Clark, & Shiffrin, 1990). However, subsequent studies posed a problem for global memory models because they demonstrated convincingly that once rehearsal was controlled, recognition memory did not show global competition effects (e.g., Hirshman, 1995; Yonelinas, Hockley, & Murdock, 1992). A more general conclusion is that more difficult tasks that invoke recollective processes tend to show the list-strength effect (Diana & Reder, 2005; Murnane & Shiffrin, 1991; Norman, 2002). However, simple cued-recall or recognition tests are unlikely to show a list-strength effect (see also Bäuml, 1997).
The signature list-strength pattern is obtained by comparing recall on pure lists (i.e., all-spaced or all-massed lists) to mixed lists (i.e., lists with some spaced and some massed items). The list-strength effect consists of two effects when switching from pure to mixed lists. First, the spaced items show better recall on mixed than on pure lists.
Second, the massed items show poorer recall on mixed than on pure lists. If this sounds familiar by now, it is because it is exactly the pattern obtained by Delaney and Verkoeijen (2009) in our studies on rehearsal. The concern that covert rehearsal was responsible for earlier list-strength effects led to extreme attempts to control encoding, but the final resolution of this work seems to be that list-strength effects emerge in incidental learning (Sahakyan, Delaney, & Waldum, 2008; Yonelinas et al., 1992). We also obtained a list-strength effect for free recall but not for recognition when we forced participants to use a rehearse-alone strategy (Delaney & Verkoeijen). A further twist to this story is that the list-strength effect has been observed only with spaced repetitions (Malmberg & Shiffrin, 2005;
Sahakyan et al., 2008). Other methods of strengthening items, such as extra presentation time or deeper orienting tasks, increase recall of the stronger items, but do not produce the list-strength pattern; they produce only a main effect of strength, such that the strong items are recalled better than the weak items on both pure and mixed lists. Therefore, the list-strength effect can be equated with the spacing effect, and it directly predicts a larger spacing effect in free recall on mixed than on pure lists.
Perhaps the list-strength effect, like the other encoding effects mentioned in this section, is a confound that must be eliminated to understand the “real” spacing effect. However, another possibility is that the list-strength effect is an indicator of the true source of the spacing effect. Specifically, we will argue later that a theory that incorporates some assumptions about how context is stored with a trace and how different types of tests use context can provide a viable explanation of the spacing effect, once the encoding confounds described in this section are taken into account.
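The link between the list-strength signature and the size of the spacing effect on pure versus mixed lists is simple arithmetic, and can be sketched as follows. The function name `list_strength_summary` and the recall proportions are hypothetical, chosen only to illustrate the pattern; they are not data from any of the studies cited.

```python
def list_strength_summary(spaced_pure, massed_pure, spaced_mixed, massed_mixed):
    """Summarize the pure-versus-mixed pattern from four mean recall proportions."""
    pure_effect = spaced_pure - massed_pure      # spacing effect on pure lists
    mixed_effect = spaced_mixed - massed_mixed   # spacing effect on mixed lists
    # Signature: spaced items gain and massed items lose when moving
    # from pure to mixed lists.
    signature = spaced_mixed > spaced_pure and massed_mixed < massed_pure
    # Identity behind the prediction in the text:
    #   mixed_effect - pure_effect
    #     = (spaced_mixed - spaced_pure) + (massed_pure - massed_mixed),
    # and both terms are positive whenever the signature holds.
    return {
        "pure_spacing_effect": pure_effect,
        "mixed_spacing_effect": mixed_effect,
        "list_strength_signature": signature,
    }


# Hypothetical mean recall proportions (illustration only).
summary = list_strength_summary(spaced_pure=0.40, massed_pure=0.38,
                                spaced_mixed=0.46, massed_mixed=0.31)
print(summary)
```

Whenever the signature holds, the mixed-list spacing effect must exceed the pure-list spacing effect, which is why the list-strength effect predicts a larger spacing effect in free recall on mixed than on pure lists.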
2.6. Summary: The Impostor Effects and Confounds in Spacing Designs
Our review of the impostor phenomena provides a bleak view of the spacing literature as a whole. Based on the above review, the “ideal” study should use presentation rates slow enough that people do not skip items. It should control recency very carefully, as even small biases in favor of spaced items can inflate estimates of the magnitude of the spacing effect. It should use pure-list designs (and perhaps compare those designs to mixed lists), and preferably have no primacy and recency buffers. Furthermore, it should carefully control the strategies participants use to study, preferably by using incidental-learning procedures. How many of the hundreds of spacing studies have used a design of this type? The answer is vanishingly few. As we then consider the theories of spacing and the evidence against each of those theories, it may be worth keeping in mind that we are using flawed data to reject most of these theories, albeit lots of flawed data collected in multiple laboratories using multiple methods.
3. The Failure of Existing Spacing Theories Before indicating what theoretical position we favor, we will examine the successes and failures of earlier theories. We cannot explore every theoretical perspective ever advanced in our limited space, so we will focus on theories that have been seriously considered by at least one
Spacing and Testing Effects
researcher in the past 20 years. Furthermore, we will mostly restrict our review to accounts of what we termed the ‘‘real’’ spacing effect, trying to ignore the various impostor effects that produce benefits of spacing over massing, but that apply in limited circumstances. To evaluate the theories, we will lay out what we see as the most important phenomena that spacing theories need to explain. Table 3 lists these major phenomena. In some cases, we will note that a phenomenon, although important, may need to be replicated under controlled circumstances in order to be sure that it is real. By the end, we will be poised to offer our thrilling alternative.
3.1. Intention Invariance We already outlined (in Section 2) the rehearsal-borrowing effect. However, one of the earliest theories of spacing effects was that there was no ‘‘true’’ spacing effect, and it was all due to rehearsal borrowing (Atkinson & Shiffrin, 1968). While we agree that rehearsal borrowing is an important problem when interpreting the spacing literature, it cannot be the full explanation of the spacing effect because spacing effects still emerge robustly in incidental learning (Braun & Rubin, 1998; Challis, 1993; Glenberg & Smith, 1981; Greene, 1989; Paivio, 1974; Rose & Rowe, 1976; Sahakyan et al., 2008; Shaughnessy, 1976; Toppino & Bloom, 2002; Verkoeijen, Table 3
Major Spacing Phenomena.
1. Intention invariance. Spacing effects emerge with both incidental and intentional learning, using a wide range of materials.
2. Age invariance. Children (including infants), young adults, and older adults all show the spacing effect.
3. Species invariance. Everything from marine mollusks (Carew, Pinsker, & Kandel, 1972) to honeybees (Menzel et al., 2001) to mice (Scharf et al., 2002) shows spacing effects of some sort.
4. The Glenberg surface. The effect of lag is jointly determined by retention interval and type of test. Typically, the relationship between memory and lag is U-shaped, with the peak of the U-curve moving further to the right as the retention interval increases.
5. Manipulating contextual variability seldom helps recall. There are numerous failures to get multiple retrieval routes to help recall compared to a single repeated retrieval route.
6. Recognition is required. Items people fail to recognize on later repetitions show little or no spacing benefit.
7. Perceptual priming effects. A priming account might handle material that is not semantically coded, like faces and nonwords, but it can’t handle semantic information.
Peter F. Delaney et al.
Rikers, & Schmidt, 2005). All of these experiments used mixed lists, so it is not clear from the literature whether spacing effects emerge following incidental learning on pure lists or not. Each of them is therefore vulnerable to a list-strength effect critique. Additionally, some of the experiments demonstrating incidental-learning effects may be vulnerable to deficient-processing explanations. For example, Greene (1989) wrote that ‘‘. . . asking subjects to make the same response to an item every time it occurs. . . may lead the subject to base the response to a second occurrence on memory for the response to the first occurrence.’’ Indeed, Jensen and Freund (1981) conducted two experiments in which they compared incidentally-learned lists containing either a single semantic judgment (done twice) or two different semantic judgments (done once each). The lists were mixed with respect to spacing and massing, and also included once-presented items. In both studies, mixing encoding strategies lowered subsequent free recall of once-presented and spaced items relative to using only a single dimension. However, mixing encoding strategies actually helped massed items. Very similar results were obtained with children in the first, third, and sixth grades by Toppino and DeMesquita (1984). Such results suggest a possible switch cost for using two different encoding strategies, but also that massed items likely suffered from a processing deficit when rated twice on the same dimension. In other words, there was a deficient-processing effect for incidentally processed items when they were rated twice on the same dimension. There have been some attempts to argue that some incidental-learning instructions might encourage rehearsal-like processes that favor spaced items by forcing retrieval of earlier items. If people make ratings by comparing the current item to previously encountered items, for example, they would have to retrieve earlier-presented items.
Because spaced items occur in more places on the list, they are more likely to be recent at any given time, and therefore may be differentially often used as the basis of comparisons, thus strengthening them. Of course, there is absolutely no empirical evidence to support this, but the study is easy enough to conduct—simply ask participants to do a rating task and ask them to report whether they make their judgment by comparing the item to another word (and if so, which one), or if they are rating it without reference to any other items. Having tried the task out on ourselves, we suspect the latter is more common, but it may vary depending on the difficulty of the rating task such that more sensitive scales (e.g., 1–9) may result in more covert retrieval than less sensitive scales (e.g., yes/no). Incidental-learning tasks that encourage people to look for a rule in a sequence may be the most likely to show covert rehearsal effects (e.g., Greene, 1989; Paivio, 1974), as the task requires comparison across items. In sum, it would be nice if there were a clearer demonstration of incidental-learning effects that could not be attributed to any of the impostors outlined in Section 2. While the balance of evidence seems to suggest
there is a ‘‘true’’ spacing effect in incidental learning, there is as yet no convincing demonstration using pure lists. However, in an unpublished study, Delaney and Verkoeijen asked 85 participants to view two lists of 32 medium-frequency nouns. Each word was presented twice, for 2 s on each presentation, with a 1-s interstimulus interval. The study used a 3 (Lag: massed, spaced lag 2, spaced lag 12) × 2 (Intentionality: incidental vs. intentional) design, with intentionality manipulated within subjects and lag manipulated between subjects. That is, every participant saw two pure lists, one of which was learned incidentally and the other intentionally. The order of these lists was counterbalanced so that half of the people received intentional instructions first and the other half received incidental instructions first. The incidental-learning instructions told participants to indicate for each word either (a) whether it was man-made or not, if they saw an ‘‘mm’’ symbol; or (b) whether it was pleasant or not, if they saw a ‘‘;-)’’ symbol. They always received one of the instructions on the first presentation and the other on the second presentation. The intentional-learning instructions told them to rehearse the words aloud in order to learn the list. At the end of each list, there was a free recall test. Participants gave no indication that they expected the test, but of course it is always possible that they expected it. To summarize the results, there was a spacing effect in the incidental condition, but not in the intentional condition. Figure 2 shows the pattern of recall. Consistent with our other studies (Delaney & Knowles, 2005; Delaney &
[Figure 2 plot: proportion recall (y-axis, 0.00 to 0.40) by lag (Massed, Spaced-2, Spaced-12) for Intentional and Incidental lists.]
Figure 2 Proportion recall as a function of the lag between repetitions on pure lists for lists learned either via rote rehearsal (intentional) or incidentally. From an unpublished study by Delaney and Verkoeijen.
Verkoeijen, 2009), pure lists with instructions to study via rehearsal produced no overall spacing effects. In contrast, incidental learning produced a significant spacing effect, although the lag effect was not significant (F < 1). Therefore, it seems that spacing effects do emerge on pure lists when studied incidentally, though we did not obtain a significant lag effect.
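The mixed design described above (lag between subjects, intentionality within subjects, list order counterbalanced) can be made concrete with a short sketch. The rotation scheme, the function name `assign_conditions`, and the dictionary keys are illustrative assumptions; the text does not say how participants were actually assigned to cells.

```python
from itertools import cycle, product

# Between-subjects factor: lag between the two presentations of each word.
LAGS = ("massed", "spaced-2", "spaced-12")
# Within-subjects factor: both instruction types, in counterbalanced order.
ORDERS = (("intentional", "incidental"), ("incidental", "intentional"))


def assign_conditions(n_participants):
    """Rotate participants through the 3 (lag) x 2 (list order) cells.

    Hypothetical assignment rule: simple rotation keeps cell sizes within
    one participant of each other. The original study may have used a
    different (e.g., randomized) procedure.
    """
    cells = cycle(product(LAGS, ORDERS))
    return [
        {"participant": pid, "lag": lag, "list_order": order}
        for pid, (lag, order) in zip(range(1, n_participants + 1), cells)
    ]


plan = assign_conditions(85)  # 85 participants, as reported in the text
```

Because 85 is not a multiple of six, a rotation like this cannot yield perfectly equal cells; it only keeps them as balanced as possible.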
3.2. Age-Invariance A second clue that rehearsal cannot fully explain the spacing effect is that it occurs throughout the lifespan, even in children too young to rehearse. There are now numerous studies showing that the spacing effect emerges in children, using both recognition (Cahill & Toppino, 1993; Toppino, Kasserman, & Mracek, 1991; Vlach, Sandhofer, & Kornell, 2008) and free recall (Seabrook, Brown, & Solity, 2005; Toppino, 1993; Toppino & DeMesquita, 1984; Toppino & DiGeorge, 1984; Wilson, 1976). It persists over 48 h, at least in recognition (Cahill & Toppino, 1993). Although one study failed to obtain the effect with preschoolers (Toppino & DiGeorge, 1984), many later studies obtained it with preschool-age children (e.g., Rea & Modigliani, 1987; Toppino, 1991, 1993; Toppino et al., 1991). Furthermore, the effect occurs with spacing lags up to 1 day for autobiographical events (Price, Connolly, & Gordon, 2006). These studies are important in part because preschool children are too young to implement a rehearsal strategy, and therefore the results cannot be attributed to rehearsal biases. Even infants show the spacing effect. Using habituation, Cornell (1980) showed babies a photo four times, with the repeated exposures spaced either ‘‘massed-like’’ with 3 s between viewings, or ‘‘spaced-like’’ with 60 s between viewings. The baby would then see the same photo again, along with a novel photo. Because babies usually like to look at novel things, they would be expected to spend less time looking at the previously seen photo if they remembered it better. In fact, babies looked longer at the massed-like photos than they did at the spaced-like photos, suggesting they had better memory for the spaced-like photos. This was true when the delay until the test was 1 min, 5 min, or 1 h. (An added advantage of the infant design is that it is not vulnerable to a list-strength effect interpretation.)
Habituation is probably mediated by a kind of perceptual priming, suggesting that perceptual priming may be important for the spacing effect, especially with nonsemantic materials—a point we will return to later. Another infant study used operant conditioning of a foot kick in response to a toy mobile in 8-week-old infants (Vander Linde, Morrongiello, & Rovee-Collier, 1985). On a final test two weeks later, the response was retained better when 18 min of training were split into three sessions separated by 1 or 2 days compared to 18 min on a single day. The effect was quite large, with 48-h spacing resulting in an average of 25 kicks on the final test as compared to only 15 for massed study. As operant
conditioning relies on motor responses, it is unlikely to be due only to perceptual priming. What about older adults? Perhaps unsurprisingly, older adults show spacing effects roughly comparable to those of young adults (Balota, Duchek, & Paullin, 1989; Kausler, Wiley, & Phillips, 1990). Benjamin and Craik (2001) found that for both older and younger adults, spacing made it easier to discriminate studied from unstudied items than massing did. However, when two lists were studied and the task was to respond only to the items from one of the lists (that is, when a source judgment was required), older adults were more likely to mistakenly endorse items from the wrong list. Younger adults showed no such trend. The study suggests that while item memory is improved with spacing in both older and younger adults, older adults do not show a spacing effect for source memory. In sum, the spacing effect seems to emerge throughout the lifespan and with many types of materials, which suggests that simple strategic explanations are insufficient to account for the results. Results like these suggest that very basic neural phenomena could be involved in producing the spacing effect.
3.3. Species Invariance A further piece of evidence that spacing effects might arise from basic memory processes comes from comparative psychological studies. One interesting study by Menzel, Manz, Menzel, and Greggers (2001), for example, used classical conditioning procedures to condition honeybees to extend the proboscis (in response to various stimuli such as carnations, propionic acid, and hexanol). They varied the spacing between acquisition trials to produce massed trials ( [...] 3.44, p 0.001, Cohen’s d 1.00; see Figure 3). Consequently, as seen in Figure 2, taxonomic similarity inferences were much more frequent than extrinsic similarity inferences which in turn were much more frequent than causal inferences (F(2,46) = 141.35, p < 0.0001, ηp² = 0.86). Not surprisingly, taxonomic inferences were also more frequent than nontaxonomic inferences (0.75 vs. 0.25, t(23) = 9.03, p < 0.0001, Cohen’s d = 3.11). Thus, in both an absolute sense and a relative sense, reasoning about genes greatly increased the likelihood of generating taxonomic inferences. Interestingly, a close look at Table 3 reveals that the increase in taxonomic reasoning about genes was not due to an increase in category-based inferences, which actually decreased in frequency (albeit not reliably). Rather, the increase in taxonomic reasoning stemmed from an increase in inferences based on perceptual similarity (t(46) = 2.88, p = 0.006, Cohen’s d = 0.83) and behavioral similarity (t(46) > 4.61, p < 0.0001, Cohen’s d = 1.33) relative to substance. This suggests that rather than simply falling back on category membership, participants may have attempted to connect the hypothetical gene with specific perceptual or behavioral attributes of premise species, and then base projections on those specific attributes.
For example, one participant projected a gene from humpback whale/squirrel to ‘‘opossum, mole, gray mouse, dolphin: they are all gray in color’’ and from raccoon/pelican to ‘‘squirrel, seagull, pigeon: these are animals that rummage through things.’’ This raises the interesting possibility that when people think about genes, they give more weight to their potential to give rise to certain observable characteristics than to their general association with a taxonomic class. In other words, people in this task seemed to be projecting ‘‘gray color genes’’ or ‘‘rummaging genes’’ rather than ‘‘mammal genes’’ or ‘‘bird genes.’’
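As a reading aid, the effect sizes reported above can be related to their test statistics with two standard conversions: Cohen's d from an independent-samples t, and partial eta squared from an F ratio. The sketch below assumes two independent groups of 24 participants each for the between-condition comparisons; that group size is inferred from df = 46 and is not stated in this section, and the function names are illustrative.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Cohen's d for an independent-samples t test: d = t * sqrt(1/n1 + 1/n2)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared from an F ratio: (F * df1) / (F * df1 + df2)."""
    return (f * df_effect) / (f * df_effect + df_error)


# Assumed group sizes (24 + 24 gives df = 46); values of t and F are from the text.
d_perceptual = cohens_d_from_t(2.88, 24, 24)      # rounds to 0.83
d_behavioral = cohens_d_from_t(4.61, 24, 24)      # rounds to 1.33
eta_p2 = partial_eta_squared(141.35, 2, 46)       # rounds to 0.86
```

Under these assumptions the conversions reproduce the reported values for the two between-condition comparisons and for the omnibus F test, which suggests the statistics are internally consistent.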
John D. Coley and Nadya Y. Vasilyeva
In sum, property had large effects on the relative frequency with which different inferences were generated. Inferences about substance mirrored those of Study One. Inferences about disease were more complex and multidimensional than for other properties; relative to substance, inferences about disease were more likely to be causal, and less likely to be taxonomic, although all three types of inferences were seen as equally appropriate. In contrast, inferences about genes were strongly biased toward taxonomic similarity. We next examine effects of premise relations on inference generation, and in particular, the degree to which premise relations and property interact in constraining inference generation. 3.2.2.3. Effects of Premise Relations One motivation for conducting Study Two was to be more careful in our manipulations of ecological relations among premise species. As such, we strove to choose pairs that were related via predation and shared habitat, pairs that were related via shared habitat only, and unrelated pairs. Results of posttests suggested that although participants viewed the relatedness of the premise pairs in the manner in which we intended, individual variability in salience of relations both within and between our planned item classes was again larger than we anticipated. Therefore, as in Study One, we decided to trust our participants’ beliefs about premise relatedness rather than our a priori expectations, and to construe the salience of taxonomic, habitat, and predation relations between each premise pair as continuous variables (ranging from weak to strong, based on participants’ ratings) rather than as categorical variables (present/absent) as originally conceived. We present multiple regression analyses comparable to those in Study One—using salience of shared habitat, predation, and taxonomic relations to predict item-wise frequency of each type of inference—rather than ANOVA. 
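The item-wise regression approach described above can be sketched as follows: three salience ratings per premise pair serve as predictors, the frequency of one inference type is the outcome, and all variables are z-scored so that the fitted weights are standardized regression coefficients. This is a minimal pure-Python illustration under those assumptions; the function names are hypothetical and the demonstration data are made up, not taken from the study.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a list to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def standardized_betas(predictors, outcome):
    """Return (standardized coefficients, R squared) for a multiple regression."""
    zx = [zscore(p) for p in predictors]
    zy = zscore(outcome)
    n = len(zy)
    k = len(zx)
    # Correlations among predictors, and between each predictor and the outcome.
    rxx = [[sum(a * b for a, b in zip(zx[i], zx[j])) / n for j in range(k)]
           for i in range(k)]
    rxy = [sum(a * b for a, b in zip(zx[i], zy)) / n for i in range(k)]
    betas = solve(rxx, rxy)
    r_squared = sum(b * r for b, r in zip(betas, rxy))
    return betas, r_squared


# Made-up salience ratings for six hypothetical premise pairs.
taxonomic = [1, 2, 3, 4, 5, 6]
habitat = [2, 1, 4, 3, 6, 5]
predation = [1, 0, 0, 1, 1, 0]
# Made-up outcome that depends only on taxonomic and predation salience.
inference_freq = [2 * t - p for t, p in zip(taxonomic, predation)]

betas, r_squared = standardized_betas([taxonomic, habitat, predation], inference_freq)
```

In this toy example the outcome is an exact linear function of two predictors, so the fit is perfect and the habitat coefficient is zero up to rounding error; real item-wise data would of course be noisier.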
First, we averaged across property conditions to get an overall picture of how premise relations predicted inferences. Based on the results of Study One, we expected different inferences to be sensitive to different premise relations; of interest was whether Study Two replicated the specific relations between premises and inferences we observed in Study One. Results of this analysis are presented in Figure 4. Two things are notable in Figure 4. First, the way in which premise relations facilitated inferences was identical to what we observed in Study One. Second, unlike in Study One, the salience of shared habitat rendered taxonomic inferences less likely. Specifically, the frequency of taxonomic inferences increased with taxonomic salience, decreased with the salience of shared habitat, but was unrelated to the salience of predation relations (R² = 0.68, p < 0.0001). In contrast, extrinsic inferences were positively related to the salience of shared habitat, but unrelated to taxonomic salience (R² = 0.31, p = 0.007). The negative relation between the salience of predatory relations and the frequency of inferences based on
Generating Inductive Inferences
[Figure 4 plot: standardized regression coefficients (y-axis, –1.0 to 1.0) for three predictors (same biological family, same habitat, one eats the other) by inference type (taxonomic, extrinsic, causal).]
Figure 4 Relations between salience of premise relations and frequency of taxonomic, extrinsic, and causal inferences averaged across property conditions, Study Two. (Note: **p < 0.005, ***p < 0.0005.)
extrinsic similarity observed in Study One was marginally significant overall; as we shall see, this particular relation varied by property. Finally, causal inferences were positively related to salience of predatory relations, but unrelated to salience of taxonomic relations or shared habitat (R² = 0.69, p < 0.0001) (as discussed below, this relation also varied somewhat with property). In sum, as in Study One, we observed a tight coupling between the salience of relations among premise categories and inferences drawn from those categories. Taxonomic relations promoted taxonomic inferences, shared habitat promoted extrinsic inferences, and predation relations promoted causal inferences. It is notable that, unlike in Study One, the salience of shared habitat strongly inhibited taxonomic inferences, suggesting that in the presence of a salient alternative relation, the appeal of taxonomic inferences faded. One possible explanation is that taxonomic inferences serve as a default, and when people notice a salient habitat relation they may tend to believe that this is
what was being specifically ‘‘communicated’’ to them by this premise pair (according to the relevance theory of Medin et al., 2003) rendering them less likely to make a default taxonomic inference. Alternatively, the presence of salient habitat relations may have led participants to develop alternative contextual hypotheses that reduced the strength of taxonomic hypotheses, consistent with findings of McDonald et al. (1996). In either case, since we do not see consistent reciprocal effects of taxonomic relations on other types of inferences, we can speculate that people may have an internal ‘‘relevance ranking’’ of different relations, with contextual relations ranked fairly high. 3.2.2.4. Does Property Influence how Premise Relations Generate Inferences? So far, results show clear effects of property and of premise relations on generation of inductive inferences. However, we were particularly interested in whether these effects were independent of each other, or whether the way premise relations led participants to generate inferences varied by property. To examine this question, we performed separate multiple regressions on item-wise salience and inference scores for each property condition. Standardized regression coefficients are presented in Figure 5; below we discuss results for each type of inference in turn.
3.2.2.4.1. Taxonomic inferences As seen in Figure 5, taxonomic inferences increased with salience of taxonomic relations between premise categories, decreased with the salience of shared habitat, and were unaffected by the salience of predation relations in all three property conditions (Substance: R² = 0.48, p < 0.0001; Disease: R² = 0.55, p < 0.0001; Gene: R² = 0.32, p = 0.005). This suggests that the property being projected had little influence on the way in which premise relations licensed taxonomic inferences. Although the absolute level of taxonomic inferences varied from 74% for gene to 45% for disease, in all cases, salient taxonomic relations among premises facilitated the generation of taxonomic inferences, whereas salience of shared habitat inhibited them. Thus, property and premise relations exerted independent effects on taxonomic inferences. 3.2.2.4.2. Extrinsic inferences Extrinsic inferences were more weakly predicted by premise relations, and the nature of the relationship varied by property. As depicted in Figure 5, for those reasoning about substance, frequency of extrinsic inferences increased with the salience of shared habitat, but was unrelated to the salience of taxonomic and predation relations (R² = 0.31, p = 0.007) whereas for gene, extrinsic inferences increased with the salience of shared habitat, and decreased with the salience of both taxonomic and predation relations (R² = 0.35, p = 0.003). This pattern suggests that, unlike for taxonomic inferences, property changed the way premise relations promoted extrinsic inferences. While any detailed explanation of this pattern of results would be speculation, results clearly
[Figure 5 plots: three panels (taxonomic, extrinsic, and causal inferences), each showing standardized regression coefficients (y-axis, –1.0 to 1.0) for three predictors (same biological family, same habitat, one eats the other) by property (substance, disease, gene).]
Figure 5 Relations between salience of premise relations and frequency of taxonomic, extrinsic, and causal inferences in each property condition, Study Two. (Note: *p < 0.05, **p < 0.005, ***p < 0.0005.)
demonstrate the interplay of background knowledge about distribution of properties on the one hand, and salient relations among premise categories on the other. In contrast, for disease, frequency of extrinsic inferences was unrelated to any premise relations (R² = 0.08, p = 0.422). However, it is important to point out that even though extrinsic inferences about disease were not predicted by premise relations, their frequency was nevertheless relatively high. Thus, disease appears to independently promote extrinsic inferences. Such a pattern could be due to participants relying on a general theory, or overhypothesis (Goodman, 1955), stating that diseases are distributed via
spatial or contextual relations, which would make extrinsic inferences appealing regardless of premise relations. 3.2.2.4.3. Causal inferences As seen in Figure 5, the way in which premise relations predicted causal inferences also varied by property, but less so. In all property conditions, generation of causal inferences increased with the salience of predation relations, and was unrelated to salience of shared habitat. Additionally, for participants reasoning about disease (but not substance or gene), causal inferences decreased with the salience of taxonomic relations (Substance: R² = 0.52, p < 0.0001; Disease: R² = 0.68, p < 0.0001; Gene: R² = 0.19, p = 0.075). In sum, causal reasoning was consistently promoted by salience of predation relations between premise categories, but unrelated to salience of shared habitat. This suggests that contextual similarity was necessary but not sufficient to promote causal inferences, which were rendered particularly tempting when participants were reminded of predator–prey interactions among premise species. This reminding may have provided a salient causal mechanism to explain a shared property. Even for those reasoning about genes, despite the relative dearth of causal inferences (3%), such inferences were still positively predicted by the salience of predation relations among premise species. Although effects of property on the kinds of knowledge recruited to guide causal inferences were not dramatic, they confirm that the nature of the property can influence the way premise relations are used to guide inference generation.
3.3. Summary: Effects of Property on Inference Generation Results of Study Two show that property influenced inference generation at two levels. First, naïve theories about the nature of the properties affected the relative frequency with which participants generated taxonomic, extrinsic, and causal inferences. Reasoning about substance replicated Study One, whereas reasoning about genes strongly biased participants toward taxonomic inferences, and reasoning about disease promoted causal reasoning, but also resulted in a more complex and multidimensional inference pattern. Second, property influenced both the degree to which relations are recruited to guide inferences and the quality of the effects of premise relations on inferences, creating a property-specific facilitation/inhibition profile. In addition, for extrinsic and causal inferences, the effects of premise relations varied by property, whereas for taxonomic inferences, they did not. Finally, Study Two also replicated the overall distribution of inferences, and the effects of premise relations on inference generation, from Study One. Salient taxonomic relations increased taxonomic inferences, salient habitat relations increased extrinsic inferences, and salient predation
relations increased causal inferences. The one departure from Study One was the finding that salience of shared habitat consistently inhibited taxonomic inferences.
4. Inference Generation: Conclusions and Implications In two experiments utilizing a novel open-ended induction task we have demonstrated that salient relations among premise categories, and the nature of the property being projected, both guide and constrain the ways in which people generate inductive inferences about novel properties of animals. In this section, we summarize our main findings about the process of inference generation and discuss possible implications for the broader study of inductive reasoning.
4.1. What Have We Learned About Inference Generation? In contrast to traditional methods used in the study of inductive inference, which require participants to evaluate the strength of inductive arguments, participants in our open-ended induction task generated their own inferences from the premise categories and properties we supplied. This approach encouraged them to generate a variety of inferences. Not surprisingly, taxonomic inferences—based on common category membership or shared intrinsic features—were generated most frequently (e.g., an inference from lemming/snowy owl to ‘‘other species of owl and similar species of lemming because of biological similarities between similar animals’’ or from tiger/clownfish to ‘‘a zebra because clownfish and tigers both have stripes. A zebra also has stripes’’). Extrinsic inferences—based on shared situational or contextual features—were also quite common (e.g., an inference from lobster/tuna to ‘‘crabs, catfish, salmon, oysters, shrimp, because they all live in similar environmental conditions’’). Perhaps, most striking was the finding that 20% of inferences generated by participants were based on causal relatedness or interaction (e.g., projecting a substance from salmon/black bear to ‘‘other bears and fish, because the bear might get substance A in their bloodstream by eating salmon, which also has substance A. So any other animal that eats salmon would probably have it also’’ or projecting a disease from ant/anteater to ‘‘birds because the disease may come from the ants themselves. By eating them the anteater got the disease, as would birds’’). Clearly, a broad range of knowledge is used in the process of generating inductive inferences. Moreover, the type of knowledge used to generate inferences varied systematically with the specifics of each inductive problem. Salient relations
among premise categories had a pronounced effect on the nature of inferences generated from those categories. Participants often explicitly referred to relations among premise species to explain their inferences. For example, one participant projected a substance from humpback whale/squirrel to ‘‘other mammals because whales and squirrels are both mammals.’’ Another projected a substance from owl/deer to ‘‘rabbit because all are found in woods.’’ A third projected a substance from elephant/crocodile to ‘‘rhino, hippo, alligator because all have tough, thick skins. Maybe substance E has to do with producing leathery skin.’’ Even more telling was the fact that many participants found themselves at a loss to generate an inference from an unrelated premise pair. The response of one participant, when confronted with the bullfrog/chipmunk pair, was typical: ‘‘No clue. I can’t think of a relationship between the two.’’ Indeed, the links between premise relations and inferences were quite specific. The salience of taxonomic relatedness consistently predicted taxonomic inferences, the salience of shared habitat consistently predicted extrinsic inferences, and the salience of predation relations consistently predicted causal inferences. Premise relations also had inhibitory effects. Most strikingly, salience of shared habitat reliably (in Study Two, at least) inhibited taxonomic inferences. In addition to premise relations, property also had a large effect on the inferences participants generated. One way in which property influenced inference generation was to invoke naïve theories about how kinds of properties are likely to be distributed or transmitted. Substance served as a more or less neutral property; taxonomic and nontaxonomic inferences about substances were equally frequent (although more specifically, taxonomic inferences were more frequent than extrinsic inferences, which were more frequent than causal inferences).
Compared to substance, participants reasoning about novel genes were biased in the direction of taxonomic inferences, whereas those reasoning about novel diseases were biased in the direction of causal inferences. Even more strikingly, property influenced what relations among premise categories were seen as relevant. To illustrate, in the gene condition participants responding to the lion/zebra item tended to generate taxonomic inferences like ‘‘tiger, gazelle, horse, because they all have 4 legs, with similar features,’’ and ‘‘tigers and giraffes, because tigers and lions are similar animals, and zebras and giraffes are similar animals.’’ In contrast, in the disease condition participants tended to generate causal inferences from the same pair, like ‘‘hyenas, and lion prey, because lion could have gotten the disease from eating the zebra and spread it to any other animal it came in contact with,’’ and ‘‘Tigers/scavengers that eat zebras because zebras may carry the disease.’’ Thus, not only did different properties engender different inferences from the very same premise pair, but they also rendered different relations among the premise categories salient. Reasoning about genes rendered taxonomic knowledge salient because of what we believe about
genes and how they work; therefore, what seemed most relevant about lions and zebras is that both are quadrupedal mammals. In contrast, reasoning about disease rendered knowledge of spatiotemporal interactions salient because of what we believe about diseases and how they work; therefore, what seemed most salient about lions and zebras is the fact that lions eat zebras. One final and striking finding was the frequency with which people generated vague inferences (e.g., one participant projected a substance from leaf-cutter ant/anteater to ‘‘an animal that is a predator to an anteater. I can’t think of any ‘cause I’m not an animal expert. Anteater eats ants, and they both have this substance. So I assume whatever eats an anteater will have it too or receive it by eating it.’’) Inferences like this were quite common and reinforce the idea that people can generate sophisticated and subtle inferences based on framework theories, often despite the lack of specific knowledge. Indeed, this pattern of response is strongly reminiscent of the idea of overhypotheses. Goodman (1955) suggested that people possess abstract beliefs describing the scope of properties, and that these beliefs could constrain possible hypotheses about how properties could be projected. When one of our participants projected a novel gene from leafcutter ant/anteater to ‘‘other animals in the same family as the anteater and leaf cutter ant, because related animals have similar genes,’’ they unwittingly exemplified this idea perfectly. In sum, our results suggest that people generate inductive inferences by extracting salient relations from premise categories in light of what they understand about the property being projected, and then drawing inferences consistent with those relations. This process emphasizes the degree to which categorical induction is both flexible and knowledge-driven. We next consider the broader implications of these findings.
4.2. Implications
Taken together, these results show that salient relations derived from comparison of premise categories, in concert with knowledge activated by the property being projected, provide important constraints on the generation of inductive inferences. In some sense, these results should be reassuring in that they reinforce findings that have emerged from the use of argument evaluation. We knew that property influenced how people evaluate arguments (e.g., Heit & Rubinstein, 1994; Kalish & Gelman, 1992; Ross & Murphy, 1999; Shafto & Coley, 2003; Shafto, Coley, & Baldwin, 2007), and now we know it also influences how they generate inferences. We knew that premise relations had an impact on argument evaluation (McDonald et al., 1996; Medin et al., 2003), and now we know they also have an impact on inference generation. In other words, the picture of inductive reasoning that emerges from considering inference generation in
addition to argument evaluation seems to be a coherent one. However, we believe that our perspective has also highlighted aspects of inductive reasoning that might otherwise have remained in the shadows.
4.2.1. Salience of Taxonomy in Category-Based Induction
Traditional accounts have emphasized the role of taxonomic similarity in evaluating category-based inductive arguments. In contrast, our results clearly show that when generating inferences, participants spontaneously appealed to extrinsic similarity and causal relatedness as often as taxonomic similarity. In particular, the prevalence of causal reasoning in these experiments is surprising given previous research showing such reasoning is common among experts, but rare among folk biological novices like the undergraduates who participated in these experiments (e.g., Coley, Shafto, et al., 2005; Coley, Vitkin, et al., 2005; Coley et al., 1999). Past research, utilizing argument evaluation, has shown that experts tend to flexibly utilize knowledge of taxonomic, extrinsic, and causal relations, whereas novices are strongly biased toward taxonomic inferences (e.g., López et al., 1997; Shafto & Coley, 2003). As discussed above, forced-choice or argument-evaluation tasks require participants to recognize relations between given premise and conclusion categories. In contrast, our task allowed participants to generate their own inferences, and the way we coded responses gave participants credit for the knowledge underlying their inferences, even if it was vague (e.g., projecting a disease from parrot/toucan to ‘‘other birds that live in the tropical climates’’ or from newt/box turtle to ‘‘other creatures that eat newts and box turtles. . .’’) or factually incorrect (e.g., projecting a substance from snowy owl/lemming to ‘‘all owls because lemmings and snowy owls are both owls’’ or a disease from penguin/herring to ‘‘ostriches and guinea hens, because ostriches and penguins both can’t fly, and I’m not sure what a herring is but I think it might be related to a guinea hen’’). Thus, despite the lack of specific factual knowledge about tropical birds, what might eat a box turtle, or what a lemming is, this format enabled participants to nevertheless generate and articulate relatively sophisticated causal and extrinsic inferences. This suggests that the relative paucity of ecological and causal reasoning among folk-biologically naïve participants in previous research may be due in part to the fact that they were being asked to recognize such relations, rather than generate them.
Besides potentially taking taxonomic inferences down a peg or two, our results also have implications for the salience of taxonomic relations. A number of studies have shown that taxonomic knowledge dominates other conceptual relations in terms of salience, speed of access (e.g., Ross & Murphy, 1999; Vitkin et al., 2005), and use in guiding inductive inferences (e.g., Shafto, Coley, & Baldwin, 2007). In contrast, our results provide little evidence that taxonomic relations between premise categories are privileged in terms of their impact on inference generation. Indeed, if
anything, we observed the opposite; the presence of salient ecological relations among premises was more likely to suppress taxonomic reasoning than vice versa. We have several thoughts on these findings. First, because they were generating their own inferences, rather than evaluating our best guesses as to what they deem plausible arguments, participants were not constrained by lack of specific knowledge (nor, indeed, by facts or reality). As such, informationally vague yet causally sophisticated inferences, which would not be detectable in an argument evaluation paradigm, were relatively common. Second, because our task did not involve any time pressure or speeded responding (and was in fact deliberately reflective, in that participants were asked to explain their inferences as well as generate them), baseline differences in knowledge accessibility (Shafto, Coley, & Baldwin, 2007; Shafto, Coley, & Vitkin, 2007) were probably not a factor. In other words, the results of tasks involving time pressure suggest that taxonomic knowledge might be initially more accessible, but our results suggest that given sufficient time, other knowledge is readily recruited to guide inductive inferences. Third, inference generation may involve stronger differentiation between specific kinds of relatedness than argument evaluation does. Beyond assessing whether the premises are sufficiently related in a general way consistent with the projected property (as required for argument evaluation), our participants had to generate novel hypotheses and then articulate the relationships between premises and their hypotheses. Internally labeling taxonomic and ecological relations among premises as such might promote discounting of irrelevant relations and focus attention on more relevant relations.
If taxonomic relations are highly salient, yet on some occasions they are viewed as irrelevant for projection, such a selective approach would diminish the effect of taxonomic relatedness on inference generation compared to argument evaluation. Finally, it may also be that, more generally, taxonomic and relational categories have differing cognitive functions. Ross and Murphy (1999) point out that in the domain of food, taxonomic categories—based on intrinsic properties—are useful for categorization and identification, whereas script categories—based on habitual co-occurrence in space and time—are useful for generating solutions to problems like ‘‘what should I have for breakfast?’’ Likewise, in folk biology, relational categories like pond animals, or even noncategorical relations like predator–prey, may be especially useful for generating solutions to problems like ‘‘what other species are likely to have this substance/disease’’ because they embody relations seen as causally relevant for explaining how such properties could come to be shared among species. As such, by focusing on inference generation we may have tapped into precisely the kind of cognitive task that such categories are most useful for.
4.2.2. Challenges for Models of Category-Based Induction
4.2.2.1. What Needs to be Explained?
Our findings expand the range of inductive phenomena that any successful theory of inductive reasoning needs to explain. First of all, any successful model has to incorporate a variety of potential bases for inductive inference; at the very least, these must include both taxonomic and extrinsic similarity, and causal relations, but we make no claim about whether this list is exhaustive.[5] We emphatically reinforce the point (made elsewhere, e.g., Coley, Shafto, et al., 2005; Coley, Vitkin, et al., 2005; Lassaline, 1996; Medin et al., 2003) that similarity alone, no matter how flexibly conceived, cannot be sufficient to explain inductive reasoning. Second, any theory of inductive reasoning must take into account the fact that the kinds of knowledge used to generate an inference depend critically on the property being projected and on salient relations among premise categories. We think that the kind of models being developed by Tenenbaum and colleagues (e.g., Griffiths & Tenenbaum, 2005; Shafto et al., 2008; Tenenbaum et al., 2006), which rely on property to indicate which knowledge structure might be most relevant for assessing a given argument, are a step in the right direction. However, our results suggest that not only does property influence the kinds of knowledge upon which participants base their inferences, but it also influences their interpretation of relations among premise categories, and the way in which those relations influence inferences. Any successful theory of inductive reasoning must take into account this interplay between domain knowledge, beliefs about premise relatedness, and beliefs about the property. Finally, any successful theory must take into account the fact that inferences vary widely in their specificity.
This is reminiscent of Keil’s (Keil, 2003; Rozenblit & Keil, 2002) proposals about the ‘‘illusion of explanatory depth’’ in the sense that participants probably do not have a detailed understanding about mechanisms of epidemiology or genetics, or about specifics of food webs, but that relatively abstract and cursory framework principles can nevertheless effectively guide inductive reasoning. Likewise, this finding fits with Coley and colleagues’ work on hierarchical induction (Coley et al., 1997, 2004). Although they investigated a very different issue (namely the degree to which knowledge of concepts at different levels of abstraction corresponded to the relative strength of inferences to those concepts), they found that the level at which participants expected category members to share novel properties differed from participants’ knowledge of actual properties shared by category members. Coley et al. (2004) conclude that ‘‘inductive inference is driven by expectations about conceptual structure that go beyond what is known about particular category members’’ (p. 249). Our findings on inference generation support that conclusion.
[5] It may well be exhaustive in the context of folk biological inductive reasoning, but for other domains such as reasoning about artifacts or about social categories, no doubt other kinds of inferences might be generated.
4.2.2.2. Focus on Process
By focusing on how people generate inductive inferences (how they use what they know to make sensible guesses about what they do not know), we hope to direct some attention to the little-studied but critical issue of process in inductive reasoning. In most previous work on inductive inference, the target behavior has been evaluation of a complete argument or choice from among a limited set of alternatives. As such, the questions to be explained, and therefore the natural and appropriate goals of empirical and theoretical investigation, have concerned factors that predict argument strength ratings or choices. These have tended to focus on characteristics of arguments (and implicitly or explicitly, the interactions of these characteristics with the knowledge of the reasoner) that render them strong or weak (e.g., McDonald et al., 1996; Medin et al., 2003; Osherson et al., 1990; Sloman, 1993). There is no reason why studies of argument evaluation cannot in principle examine process issues; indeed, a few have done so. For instance, Shafto, Coley, and Baldwin (2007) showed that time pressure lowers strength ratings for inductive arguments based on extrinsic relations, but has no effect on arguments based on taxonomic relations. Likewise, Feeney et al. (2010) have shown that premise reading times are related to the changes in argument strength brought about by those premises; larger changes in argument strength are associated with longer reading times, and presumably deeper processing. Rather, the focus on process inherent in the inference generation approach is more a difference in emphasis.
When the target behavior is inference generation, rather than argument evaluation, the questions to be explained concern what inferences are generated under different conditions, why they are generated, and how they are generated. These questions naturally focus attention on the characteristics of the process of inductive inference. We have assumed that the process of inference generation involves accessing knowledge about premise categories and the property being projected, making decisions about what knowledge is relevant, and then generating an actual response. Clearly, there are many processes involved in even this cursory description. These include searching semantic memory for relevant knowledge, comparing premise categories for salient relations, accessing explanatory theories about the nature of the property, and potentially searching for relevant conclusion categories once a basis for inference has been determined, to name just a few. At the moment, we lack answers about the role of any of these processes in inference generation. We do not, however, lack questions. For example,
what is the mechanism by which properties constrain inference generation? Do they focus the search at a relatively early point and thereby limit the candidate inferences that are evaluated? Or do they serve mainly to cull an exhaustive list of possible inferences generated via premise comparison down to a few likely candidates? We hope that our initial look at inference generation prepares the ground and invites further work examining the processes that underlie flexible inductive reasoning.
4.3. Conclusions
At the risk of repeating ourselves, we cannot and should not base a psychology of inductive reasoning solely on studies of argument evaluation. In the hope of putting ‘‘reasoning’’ back into the study of category-based induction, we have presented an initial look at how people generate inductive inferences. Our results show that salient relations derived from comparison of premise categories, in concert with knowledge about the property being projected, provide important constraints on the generation of inductive inferences. We have also shown that such inferences vary widely in their specificity, and make contact with a broad range of real-world knowledge. In taking this approach, we hope to draw attention to the process of inductive reasoning as well as the outcome, and to emphasize the knowledge-driven and creative nature of human inductive inference.
ACKNOWLEDGMENTS
This chapter is based upon work supported by the National Science Foundation under Grant No. 0236338. We are indebted to Anna Vitkin and Allison Baker for their important contributions to the research reported here. We thank Brett Hayes and Gregory Murphy for careful and thoughtful comments on previous incarnations of this paper. We are especially grateful to Kaitlyn Amato, Yui Anzai, Nicole Ciampanelli, Lindsey Davis, Konstantin Feigin, Ruiwen Hu, Janelle LaMarche, Brianna Roche, Claire Seaton, Carissa Shafto, Courtney Steller, and Jennelle Yopchick for their Herculean efforts to collect and code the data reported here.
REFERENCES
Chomsky, N. (1980). Rules and representations. Oxford: Basil Blackwell.
Coley, J. D., Hayes, B., Lawson, C., & Moloney, M. (2004). Knowledge, expectations, and inductive inferences within conceptual hierarchies. Cognition, 90, 217–253.
Coley, J. D., Medin, D. L., & Atran, S. (1997). Does rank have its privilege? Inductive inferences within folkbiological taxonomies. Cognition, 64, 73–112.
Coley, J. D., Medin, D. L., Proffitt, J. B., Lynch, E. B., & Atran, S. (1999). Inductive reasoning in folkbiological thought. In D. L. Medin & S. Atran (Eds.), Folkbiology (pp. 205–232). Cambridge, MA: MIT Press.
Coley, J. D., Shafto, P., Stepanova, O., & Barraff, E. (2005). Knowledge and category-based induction. In W. Ahn, R. L. Goldstone, B. C. Love, A. B. Markman, & P. Wolff (Eds.), Categorization inside and outside the laboratory: Essays in honor of Douglas L. Medin (pp. 69–85). Washington, DC: American Psychological Association.
Coley, J. D., Vitkin, A. Z., Seaton, C. E., & Yopchick, J. E. (2005). Effects of experience on relational inferences in children: The case of folk biology. In B. G. Bara, L. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th annual conference of the Cognitive Science Society (pp. 471–475). Mahwah, NJ: Lawrence Erlbaum Associates.
Coley, J. D., Vitkin, A. Z., Vasilyeva, N. Y., & Amato, K. (2007). Experience increases flexible ecological reasoning. Paper presented at the 15th Biennial Conference of the Australasian Human Development Association. Sydney, New South Wales, Australia.
Feeney, A., Coley, J. D., & Crisp, A. (2010). The relevance theory of category-based induction: Evidence from garden path arguments. Journal of Experimental Psychology: Learning, Memory and Cognition, 36.
Feeney, A., Shafto, P., & Dunning, D. (2007). Who is susceptible to the conjunction fallacy in category-based induction? Psychonomic Bulletin & Review, 14, 884–889.
Gelman, S. A. (2003). The essential child: Origins of essentialism in everyday thought. New York: Oxford University Press.
Gelman, S. A., & Coley, J. D. (1990). The importance of knowing dodo is a bird: Categories and inferences in 2-year-old children. Developmental Psychology, 26, 796–804.
Goodman, N. (1955). Fact, fiction, and forecast. Indianapolis, IN: Bobbs-Merrill.
Griffiths, T. L., & Tenenbaum, J. B. (2005). Structure and strength in causal induction. Cognitive Psychology, 51, 354–384.
Heibeck, T., & Markman, E. (1987). Word learning in children: An examination of fast mapping. Child Development, 58, 1021–1024.
Heit, E. (2000). Properties of inductive reasoning. Psychonomic Bulletin & Review, 7, 569–592.
Heit, E., & Feeney, A. (2005). Relations between premise similarity and inductive strength. Psychonomic Bulletin & Review, 12, 340–344.
Heit, E., & Rubinstein, J. (1994). Similarity and property effects in inductive reasoning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 411–422.
Kalish, C. W., & Gelman, S. A. (1992). On wooden pillows: Multiple classifications and children’s category-based induction. Child Development, 75, 1871–1885.
Keil, F. C. (1981). Constraints on knowledge and cognitive development. Psychological Review, 88, 197–227.
Keil, F. C. (2003). Folkscience: Coarse interpretations of a complex reality. Trends in Cognitive Sciences, 7, 368–373.
Kemp, C., Perfors, A., & Tenenbaum, J. B. (2007). Learning overhypotheses with hierarchical Bayesian models. Developmental Science, 10, 307–321.
Lassaline, M. E. (1996). Structural alignment in induction and similarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 754–770.
Lin, E. L., & Murphy, G. L. (2001). Thematic relations in adults’ concepts. Journal of Experimental Psychology: General, 130, 3–28.
López, A., Atran, S., Coley, J. D., Medin, D., & Smith, E. E. (1997). The tree of life: Universal and cultural features of folkbiological taxonomies and inductions. Cognitive Psychology, 32, 251–295.
McDonald, J., Samuels, M., & Rispoli, J. (1996). A hypothesis-assessment model of categorical argument strength. Cognition, 59, 199–217.
Medin, D., Coley, J. D., Storms, G., & Hayes, B. (2003). A relevance theory of induction. Psychonomic Bulletin & Review, 10, 517–532.
Muratore, T. M., & Coley, J. D. (2009). The role of knowledge in folk biological induction. Paper presented at the international conference on Biological Understanding and Theory of Mind. Reims, France.
Nguyen, S. P., & Murphy, G. L. (2003). An apple is more than just a fruit: Cross-classification in children’s concepts. Child Development, 6, 1783–1806.
Osherson, D. N., Smith, E. E., Wilkie, O., Lopez, A., & Shafir, E. (1990). Category-based induction. Psychological Review, 97, 185–200.
Proffitt, J. B., Coley, J. D., & Medin, D. L. (2000). Expertise and category-based induction. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 811–828.
Rips, L. J. (1975). Inductive judgments about natural categories. Journal of Verbal Learning and Verbal Behavior, 14, 665–681.
Ross, B. H., & Murphy, G. L. (1999). Food for thought: Cross-classification and category organization in a complex real-world domain. Cognitive Psychology, 38, 495–553.
Rozenblit, L. R., & Keil, F. C. (2002). The misunderstood limits of folk science: An illusion of explanatory depth. Cognitive Science, 26, 521–562.
Shafto, P., & Coley, J. D. (2003). Development of categorization and reasoning in the natural world: Novices to experts, naïve similarity to ecological knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 641–649.
Shafto, P., Coley, J. D., & Baldwin, D. (2007). Effects of time pressure on context-sensitive property induction. Psychonomic Bulletin & Review, 14, 890–894.
Shafto, P., Coley, J. D., & Vitkin, A. (2007). Availability in category-based induction. In A. Feeney & E. Heit (Eds.), Inductive reasoning: Experimental, developmental, and computational approaches (pp. 114–136). Cambridge University Press.
Shafto, P., Kemp, C., Bonawitz, E. B., Coley, J. D., & Tenenbaum, J. B. (2008). Inductive reasoning about causally transmitted properties. Cognition, 109, 175–192.
Sloman, S. A. (1993). Feature-based induction. Cognitive Psychology, 25, 231–280.
Sloman, S. A. (1994). When explanations compete: The role of explanatory coherence on judgments of likelihood. Cognition, 52, 1–21.
Sloutsky, V. M., & Fisher, A. V. (2004). Induction and categorization in young children: A similarity-based model. Journal of Experimental Psychology: General, 133, 166–188.
Tenenbaum, J. B., Griffiths, T. L., & Kemp, C. (2006). Theory-based Bayesian models of inductive learning and reasoning. Trends in Cognitive Sciences, 10, 309–318.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207–232.
Vitkin, A., Coley, J. D., & Feigin, K. (2005). Accessibility of taxonomic and script knowledge in the domain of food. Paper presented at the 46th annual meeting of the Psychonomic Society, Toronto.
Vitkin, A. Z., Vasilyeva, N. Y., & Coley, J. D. (2007). Experience and the development of flexible inductive reasoning in biology. Paper presented at the annual meeting of British Psychological Society 2007 Developmental Section Conference. Plymouth, UK.
CHAPTER SIX
From Uncertainly Exact to Certainly Vague: Epistemic Uncertainty and Approximation in Science and Engineering Problem Solving
Christian D. Schunn
Contents
1. Introduction
2. Linguistic Pragmatics of Uncertainty and Approximation
3. Coding Approximation and Uncertainty from Speech
3.1. Conversation Coding in Engineering Design Team Meetings
3.2. Conversation and Interview Coding in Science and Applied Science Data Analysis
4. Coding Uncertainty from Gestures
5. Uncertainty, Approximation, and Expertise
6. From Uncertainty to Approximation via Spatial Reasoning
6.1. Uncertainty and Verbally Coded Spatial Transformations in Basic and Applied Science
6.2. Association of Uncertainty and Approximation with Spatial Gestures in Basic Science
6.3. From Approximation to Uncertainty via Mental Simulations in Engineering Design
7. Summary and Discussion
8. Future Directions
Acknowledgments
References
Abstract
Epistemic uncertainty is a huge area of scholarship. It has captured the minds of scholars in psychology and many domain-specific studies of reasoning and problem solving. What does it mean to resolve uncertainty? This chapter explores the idea that resolution of uncertainty in complex science and engineering fields frequently ends with approximations rather than precise answers. The chapter begins by examining language to motivate the core distinction between uncertainty and approximation. Then, the chapter explores whether the distinction can be defended empirically in reliable and valid coding of speech and gesture data in multiple science and engineering domains. Novice/Expert changes in uncertainty and approximation levels are also explored. Finally, three examinations of temporal patterns of co-occurrence with uncertainty and approximation are presented in multiple problem-solving domains to provide an overall model of uncertainty being transformed to approximation through spatial reasoning and mental simulations.
Psychology of Learning and Motivation, Volume 53. ISSN 0079-7421, DOI: 10.1016/S0079-7421(10)53006-8. © 2010 Elsevier Inc. All rights reserved.
1. Introduction
Studies of behavior in the real world have consistently found that uncertainty has a large influence on behavior. For example, there is a whole subdiscipline of naturalistic decision making focused on judgment under uncertainty (e.g., Klein, 1989). Indeed, there are many pragmatic implications for better understanding uncertainty. For example, the ways in which experts reason about uncertainty in future forecasts under different actions, the ways in which experts choose to communicate this uncertainty to the voting public or the future voting public (in schools), and the ways in which the public understands the uncertainty will all influence critical decisions being made by politicians today (Friday, 2003). While much progress has been made, there is still much to be learned about how uncertainty influences behavior.
There are several taxonomies of uncertainty types in existence. Some come from judgment and decision-making research in psychology (Berkeley & Humphreys, 1982; Howell & Burnett, 1978; Kahneman & Tversky, 1982; Krivohlavy, 1970; Lipshitz & Strauss, 1997; Musgrave & Gerritz, 1968; Trope, 1978). Others come from a broad array of particular disciplines, such as geography (Abbaspour, Delavar, & Batouli, 2003), ecology (Regan, Colyvan, & Burgman, 2002; Regan, Hope, & Ferson, 2002), finance (Rowe, 1994), management (Priem, Love, & Shaffer, 2002), geospatial information systems (Plewe, 2002), law (Walker, 1991, 1998), acoustics (Egan, Schulman, & Greenberg, 1961), medicine (Brashers et al., 2003; Hall, 2002), consumer choice (Sheer & Cline, 1995; Urbany, Dickson, & Wilkie, 1989), driving behavior (Vlek & Hendrickx, 1988), educational research (Webster & Bond, 2002), negotiation (Bottom, 1998), military tactics (Cohen, Freeman, & Thompson, 1998), and statistics. The sheer number of such domain-specific accounts makes clear how complex and central uncertainty resolution is to problem solving.
These taxonomies typically emphasize the different sources of uncertainty—reasons why a problem solver might be uncertain. A different issue from the sources of informational uncertainty (objective ambiguity in the existing information) is psychological uncertainty
(Jousselme, Maupin, & Bossé, 2003), the internal feeling of being uncertain about information which may or may not be objectively uncertain. Presumably, it is this internal state that directly influences behavior: making choices (Kahneman & Tversky, 1982), avoiding situations, or driving new problem solving aimed at reducing the uncertainty levels (Trickett, Trafton, & Schunn, 2009). Of course the underlying source of informational uncertainty may also influence behaviors aimed at reducing the psychological uncertainty. For example, Lipshitz and Strauss (1997) found that decision makers react differently to three different types of uncertainty: inadequate understanding, incomplete information, and undifferentiated alternatives. Inadequate understanding is addressed by collecting more information; incomplete information is typically addressed through assumption-based reasoning; and undifferentiated alternatives are resolved by weighing pros and cons in more depth. But the question remains: what is the psychological nature of the uncertainty itself?
In this chapter, I would like to argue for a distinction not previously emphasized in discussions of uncertainty: the difference between psychological uncertainty and psychological approximation, referred to as uncertainty and approximation for the rest of the chapter. Uncertainty is the lack of knowledge about possible states (e.g., is the temperature 18 °C or 19 °C?). Approximation declares a state as falling within a range (e.g., the temperature is between 18 °C and 19 °C). At first blush, this distinction appears bizarre and without conceptual merit. From an information-theoretic or logical perspective, there is no difference between the two. However, I will argue that this distinction is a critical psychological distinction in science and engineering problem solving.
I will show that uncertainty and approximation are discriminable constructs in behavior, that they systematically occur in different places, and that common problem-solving strategies in science and engineering serve primarily to convert from uncertainty into approximation. Thus, to ignore this seemingly nondistinction is to ignore a core feature of very important types of problem solving. Further, psychological research coding uncertainty from speech or gestures will likely falsely include approximation behaviors with uncertainty behaviors unless the distinction between uncertainty and approximation is salient.
2. Linguistic Pragmatics of Uncertainty and Approximation
To first provide some intuitions regarding the difference between uncertainty and approximation, consider the following everyday conversational examples, focusing on the responses given by speaker 2.
(1) Speaker 1: How old is she?
    Speaker 2: 40? She was born in January of 1969.
(2) Speaker 1: How old is she?
    Speaker 2: Early forties.
(3) Speaker 1: How old is she?
    Speaker 2: Forty plus or minus 2.
(4) Speaker 1: How old is she?
    Speaker 2: Early forties?
In (1), speaker 2 has all the information required to provide a precise answer to the question, actually provides a precise answer (40) that is accurate (in 2009), and yet is psychologically uncertain, as noted in providing an answer in a question format. By contrast, in (2), speaker 2 provides an approximate answer (early forties), but with no indicated psychological uncertainty. Example (3) is a more academic-speak response with the same key characteristics as (2): approximation but no indicated uncertainty. Example (4) shows that one can have approximation and uncertainty. From a pragmatics perspective, speaker 2’s responses in (2) and (3) are quite reasonable in that they answer the question with precision that is likely sufficient for speaker 1’s needs and they set clear bounds on the possible actual values. By contrast, speaker 2’s response in (1) of ‘‘40?’’ does not set bounds on the possible actual values, leaving open the possibility of a much wider range of actual age.
Human languages contain many categorical terms that represent approximations on quantitative entities. For example, 50s, 19th century, teenage, early childhood, average height, room temperature, steep, and next door represent approximate quantities of age, time, height, temperature, slope, and location. Moreover, each of those terms represents approximations that are much more approximate than humans can perceive psychologically. That is, we could think and express ourselves more precisely than with those terms, but we on occasion choose not to.
Interestingly, both uncertainty and approximation can be indicated through the use of hedge words added to more precise terms, although the two use different hedge words.
Consider the following two examples.
(5) Speaker 1: How old is she? Speaker 2: Maybe 40.
(6) Speaker 1: How old is she? Speaker 2: Almost 40.
In (5), speaker 2 uses the hedge “maybe” to indicate uncertainty in the precise response with no provided bounds on how far the answer could be off, whereas in (6), speaker 2 uses the hedge “almost” to indicate approximation in the precise response, and pragmatic conventions suggest the age is less than 40 and unlikely to be more than 1 or 2 years below 40 (i.e., it might be 38 or 39). Overall there appear to be many more ways of expressing
From Uncertainly Exact to Certainly Vague
uncertainty through hedge words than through direct terms indicating approximate or uncertain quantities, perhaps reflecting subdimensions of uncertainty (e.g., probability distributions or average versus peak intensity) or approximations that do not have convenient linguistic labels (e.g., temperatures between 14 and 16 °C, or ages between 43 and 45). As a result, our coding from speech tends to focus on hedge words.
The examples above have generally focused on uncertainty and approximation cases that are not informationally equivalent in that the possible range for the uncertainty cases was larger than the possible range for the approximation cases. There are two important points to note about this observation. First, the definitional difference is NOT about relative ambiguity in quantity. Reverse cases are possible: one could be uncertain whether the temperature is 14 or 15 °C and one could assert an approximation of 13–18 °C. Uncertainty is about psychologically not knowing something, whereas approximation is about asserting a range. Second, it happens to be the case that problem solving tends to reduce the possible range for which one is uncertain to a smaller range that is the approximation. For example, a problem solver might begin with an uncertainty of a very general form (what is the temperature?) or of a wide range (what is the temperature, but knowing that it is a Fall afternoon in New York) and then, through some data collection from various sources and reasoning, finish with a smaller possible range of 14–16 °C. In other words, problem solving (especially in engineering and science, for which some level of precision is required) serves to move information ambiguity from unacceptable levels to acceptable levels for the task at hand. This point will be further examined in Section 6.
3. Coding Approximation and Uncertainty from Speech
In a different sense of pragmatics, the distinction of approximation versus uncertainty is useful to psychologists (or various other scientists of behavior) only if the distinction can be made reliably from observed behavior and is associated with interesting patterns of behavior. Focusing on the first issue, in a number of projects we have found that uncertainty and approximation can be reliably coded from free speech, either in the form of think-alouds during problem solving or in the form of natural conversations.
3.1. Conversation Coding in Engineering Design Team Meetings
In Christensen and Schunn (2009), we coded for uncertainty and approximation from the many hours of conversation transcripts of an innovative engineering design team during their weekly design team meetings. Our
approach to coding uncertainty and approximation was syntactical with verification, building on a hedge-word uncertainty coding approach developed with Trickett, Trafton, Saner, & Schunn (2007). Examples of uncertainty hedge words are “probably,” “sort of,” “guess,” “maybe,” “possibly,” “don’t know,” “[don’t] think,” “[not] certain,” and “believe.” Examples of approximation hedge words are “pretty much,” “virtually,” “generally,” “frequently,” “usually,” “normally,” “basically,” and “almost.” (Actually, we searched for the Danish equivalents of these terms, as the team being studied was Danish.) In both the uncertainty and approximation cases, each instance of a hedge word was examined to make sure it was being used in an uncertainty or approximation sense; if so, the segment containing the hedge word was coded as “uncertainty present” or “approximation present.” Interrater reliability for this approach was extremely high, with kappas of 0.95 for uncertainty coding and 0.96 for approximation coding.
As a simple validation of each construct and the distinction between the two, we also looked at the adjacency relationships between codes from one transcript segment to the next. The assumption is that mental states of uncertainty or approximation are “sticky” in that they will tend to continue longer in time than just one segment. Uncertainty and approximation are conceptualized as being about particular quantities and thus co-occurrence will not be perfect, but conversations will tend to continue regarding a given quantity, so there should be some continuity. As can be seen in Table 1, this continuity was clearly shown for both approximation and uncertainty (both trends are statistically significant). Further, taking into account the base rates of uncertainty and approximation, there was no tendency for approximation to immediately follow uncertainty or vice versa.
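The first, syntactic pass of this coding approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists are the English examples quoted above (the study itself searched Danish equivalents), the bracketed terms are simplified to their negated forms, and the function name `flag_candidates` is hypothetical. The sketch only finds candidate hedge words; the verification step, in which a coder confirms that each hit is used in an uncertainty or approximation sense, remains a human judgment.

```python
import re

# English hedge words quoted in the text. Assumption: "[don't] think" and
# "[not] certain" are simplified here to their negated forms.
UNCERTAINTY_HEDGES = ["probably", "sort of", "guess", "maybe", "possibly",
                      "don't know", "don't think", "not certain", "believe"]
APPROXIMATION_HEDGES = ["pretty much", "virtually", "generally", "frequently",
                        "usually", "normally", "basically", "almost"]


def _compile(phrases):
    # Whole-word, case-insensitive matching; longer phrases are tried first.
    ordered = sorted(phrases, key=len, reverse=True)
    body = "|".join(re.escape(p) for p in ordered)
    return re.compile(r"(?<!\w)(?:" + body + r")(?!\w)", re.IGNORECASE)


_UNCERTAINTY_RE = _compile(UNCERTAINTY_HEDGES)
_APPROXIMATION_RE = _compile(APPROXIMATION_HEDGES)


def flag_candidates(segment):
    """Return the hedge words found in one transcript segment, by category.

    The hits are candidates only: a human coder must still verify that each
    one is used in an uncertainty or approximation sense.
    """
    text = segment.replace("\u2019", "'")  # normalize curly apostrophes
    return {
        "uncertainty": [m.group(0).lower() for m in _UNCERTAINTY_RE.finditer(text)],
        "approximation": [m.group(0).lower() for m in _APPROXIMATION_RE.finditer(text)],
    }
```

For instance, `flag_candidates("Maybe it is almost 40")` flags "maybe" as an uncertainty candidate and "almost" as an approximation candidate, mirroring examples (5) and (6).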
3.2. Conversation and Interview Coding in Science and Applied Science Data Analysis
Another project involved a similar coding procedure applied to two different domains of science and two different domains of applied science (Schunn, Saner, Kirschenbaum, Trafton, & Littleton, 2007; Trickett et al., 2009).

Table 1 Rates of Uncertainty and Approximation in the Next Transcript Segment as a Function of their Presence in a Given Segment.

Current segment            Uncertainty in next segment   Approximation in next segment
Uncertainty (n = 247)      16%                           4%
Approximation (n = 308)    3%                            8%
Neither (n = 5616)         5%                            4%
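
The adjacency analysis summarized in Table 1 amounts to counting, for each type of current segment, how often each code appears in the segment that follows. A minimal Python sketch is given below. The input format (one set of codes per segment, with "U" for uncertainty and "A" for approximation) and the function name `next_segment_rates` are hypothetical, and letting a segment that carries both codes count toward both rows is an assumption; the text does not say how such segments were handled.

```python
from collections import Counter


def next_segment_rates(codes):
    """Compute next-segment code rates by current-segment type.

    codes: list of sets, one per transcript segment in order, each a subset
    of {"U", "A"} (uncertainty present, approximation present).
    Returns {row: {"U": rate, "A": rate, "n": count}} for the rows
    "uncertainty", "approximation", and "neither" that actually occur.
    """
    totals = Counter()
    follows = Counter()
    for current, following in zip(codes, codes[1:]):
        rows = []
        if "U" in current:
            rows.append("uncertainty")
        if "A" in current:
            rows.append("approximation")
        if not rows:
            rows.append("neither")
        # Assumption: a segment with both codes contributes to both rows.
        for row in rows:
            totals[row] += 1
            for code in ("U", "A"):
                if code in following:
                    follows[(row, code)] += 1
    return {
        row: {"U": follows[(row, "U")] / n, "A": follows[(row, "A")] / n, "n": n}
        for row, n in totals.items()
    }
```

The "neither" row plays the role of the base rate against which the stickiness of each code is judged.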
The first domain involved conversations of earth scientists working at the Jet Propulsion Lab analyzing data as it came down from Mars from two robotic rovers, the Mars Exploration Rovers. The coded conversations were of impromptu meetings held throughout the day between groups of 2–10 scientists from several different disciplines (soil and rock scientists, geochemists, geologists, and atmosphere scientists). There were a number of video cameras off to the sides of the large data analysis rooms. The scientists had given informed consent for this video collection, but the cameras were relatively small, discreetly located, and constantly present. Thus, the scientists generally forgot about the existence of the cameras, and the transcripts likely capture very typical problem-solving behaviors in this context.
The remaining three domains were 13 cognitive neuroscientists analyzing fMRI data (fMRI), 18 meteorologists making weather predictions (Weather), and 22 navy officers localizing an enemy submarine using only passive sonar (Submarine). These datasets involved cued think-alouds of novices (apprentices in the domain, not random undergraduates), intermediates, and experts. Participants were videotaped as they analyzed their data on computers (their own data in the case of fMRI, canned data in the case of Weather and Submarine). After 30–45 min of data analysis, they were then shown three or four different minute-long snippets of the videotape that corresponded to critical decision-making moments during data analysis. The scientists were asked to explain what they knew and did not know at that moment in time. Sometimes problem solvers given think-aloud instructions fall silent exactly at the interesting moments in time, especially when the task is long and complex. This cued-recall method was designed to capture additional information about these more interesting moments.
Across these four domains, we used the same hedge-word technique for coding uncertainty and approximation from the transcribed speech. In all cases, we obtained interrater reliability kappas of greater than 0.8 for both uncertainty and approximation. The know/do not know probes in the fMRI, Weather, and Submarine domain studies provide another validation of the distinction between uncertainty and approximation (and coding was done blind to question context). One would expect that there would be more uncertainty speech cues in response to the ‘‘what did you not know?’’ question than in response to the ‘‘what did you know?’’ question. An opposite pattern is expected for approximation. Figure 1 presents the results. In all three domains, the predicted pattern was obtained and statistically significant for both uncertainty and approximation codes. Thus, uncertainty versus approximation is a distinction that can be made reliably in various science and engineering settings from verbal data in the form of think-alouds or natural conversations. Simple patterns in the data clearly suggest that uncertainty and approximation are temporally coherent
Figure 1 The proportion of speech containing uncertainty or approximation (with SE bars). [Plot: y-axis, proportion of segments; bars for the Know and Not know questions; panels for fMRI (n = 13), Weather (n = 18), and Submarine (n = 22), each showing uncertainty and approximation.]
within categories and temporally dissociable across categories. Finally, uncertainty and approximation speech appears under expected conditions.
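The interrater reliabilities reported in this section are kappas, that is, agreement corrected for the agreement expected by chance. As a minimal sketch, assuming two raters who each assign one categorical code per segment and assuming Cohen's formulation (the text does not say which kappa variant was computed), kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected from each rater's marginal code frequencies. The function name `cohens_kappa` is illustrative.

```python
import math


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same segments.

    rater_a, rater_b: equal-length lists of category labels, for example
    1 for "uncertainty present" and 0 for "uncertainty absent".
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of segments")
    n = len(rater_a)
    # Observed agreement: proportion of segments given the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed
    # over categories.
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return math.nan
    return (observed - expected) / (1.0 - expected)
```

Because hedge-word codes are rare relative to uncoded segments, raw percent agreement would be inflated by the many segments both raters leave uncoded; kappa discounts that.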
4. Coding Uncertainty from Gestures
In science and engineering, much of the data is inherently visual–spatial or is displayed in spatial format (e.g., graphs of temperature varying with time). Thus, much of the uncertainty and approximation are expressed about visual–spatial quantities. Because science and engineering have formalized much if not all of the quantities and relationships in symbolic formats (e.g., terms for particular quantitative data patterns, equations to represent quantitative data patterns), much can be studied from coding speech from conversations and think-alouds. However, it is likely that considerable representing, reasoning, and problem solving in science and engineering is also happening in a visual–spatial, nonverbal format. How does one measure internal problem solving on visual–spatial content? All measures of mental representations and problem solving are necessarily indirect. Verbal report is one general source of data regarding mental
representation and problem solving. However, for visual–spatial content, it is a suspect source, as verbal data are generally thought to capture the contents of verbal working memory, not spatial working memory (Ericsson & Simon, 1993). Retrospective or intermittent drawings can be another source of data. However, many people are not very skilled in drawing, and it is likely that such drawings would influence reasoning more than verbal protocols would because the drawing process is much less automatic and the results of the process are more permanent (i.e., the drawing is an object that can itself be used in problem solving). Scientists and engineers do draw (by hand or via a computer) regularly, but not densely enough in time to constitute a good online measure of thinking. A third approach is to use spontaneous gestures. In addition to serving as a communicative act between speaker and listener, spontaneous gestures are thought to be an online measure of mental representations much like verbal protocols (Alibali, Bassok, Solomon, Syc, & Goldin-Meadow, 1999; Alibali & Goldin-Meadow, 1993; McNeill, 1992). In spatial tasks, in fact, it is disruptive to the problem solver to prevent gesturing from occurring.
In a later section, I will consider more complex representational content of gestures. But first, I want to focus on gestures as a direct measure of uncertainty or approximation. There are a number of taxonomies of gesture. One common distinction (McNeill, 1992) is between beat gestures (rhythmic, repetitive gestures often co-timed with speech), deictic gestures (pointing to things in the world around the speaker such as the clock on the wall over there), iconic gestures (gestures that are literal physical presentations of things absent, such as a hand shape holding an implied glass), and metaphoric gestures (a spatial representation of a nonspatial object, such as pointing behind oneself to represent back in time).
All of these gestures can have many phases (McNeill, 2005): preparation (optional), prestroke hold (optional), stroke (obligatory), stroke hold (obligatory if the stroke is static), poststroke hold (optional), and a retraction (optional). Uncertainty gestures are typically a wiggling movement in the stroke of an iconic or metaphoric gesture that represents some quantity (i.e., a stroke that normally would be static). For example, a pinch indicating a size may be accompanied by wavering the size or wiggling the hand. In this way, the uncertainty gesture is discriminable from a beat gesture: an uncertainty gesture of this type has content beyond the movement, whereas the beat gesture does not (i.e., the hand does not indicate a size or distance or volume). However, another common form of an uncertainty gesture involves a shoulder shrug. In this case, one must rely on speech or perhaps another gesture to determine which quantity is producing uncertainty. We have not yet coded approximation gestures, but I could easily imagine width of gestures indicating the approximations on quantities (e.g., between fingers of one hand or between hands). Further, I could
easily imagine that some of the wiggling gestures that we previously coded as uncertainty gestures might actually be approximation gestures (e.g., specific movement between particular points). In this section, the uncertainty gesture data are used as a cross-validation: do uncertainty gestures co-occur with uncertainty speech (and less so with approximation speech)? It is important to note, however, that speech and gesture need not always line up perfectly. Speech-gesture mismatches do happen and are not thought to be simply noise in interpretation; rather they are thought to signal coactivation of competing ideas/strategies (Alibali & Goldin-Meadow, 1993; Alibali et al., 1999). We examined the overlap between uncertainty gesture and speech in the four science/applied science domains. Figure 2 presents the percentage of segments with uncertainty gestures when the segment has speech uncertainty or speech approximation present/absent in the three domains with cued-recall think-alouds. The first thing to note is that uncertainty gestures are relatively less common than uncertainty or approximation speech codes. The second thing to note is the strong cross-validation across all three domains: uncertainty gestures occurred much more often when uncertainty
Figure 2 Percentage of segments (with SE bars) with uncertainty gestures as a function of uncertainty and approximation appearing in the speech segment for the domains of fMRI, Weather, and Submarine. The “Present” cases each involve approximately 300 segments and the “Absent” cases each involve approximately 1000 segments.
speech occurred (ps < 0.01 in all three), whereas uncertainty gestures had no consistent relationship to whether approximation speech occurred (only the Weather difference is statistically significant, p < 0.05, and in the reverse direction from the uncertainty speech pattern). In the naturalistic science conversation Mars data, 5.3% of segments with an uncertainty speech code had an uncertainty gesture, in comparison to 2.7% of segments without an uncertainty speech code (χ²(1) = 16.0, p < 0.001); in other words, uncertainty gestures occur twice as often in the context of uncertainty speech. There is an association between approximation statements and uncertainty gestures (χ²(1) = 6, p < 0.02), but the association is weaker; uncertainty gestures are only 50% more likely to appear in the context of approximation speech than without approximation speech. Overall, then, uncertainty speech and uncertainty gesture are clearly related, whereas uncertainty gesture and approximation speech have a smaller, ambiguous relationship, perhaps reflecting some miscoding of approximation gestures as uncertainty gestures.
To further validate that there is indeed something called an uncertainty gesture that signals an internal state of uncertainty, we can examine gesture data from the fMRI, Weather, and Submarine domains, focusing on the relative frequency of uncertainty gestures in response to the Know and Not know questions. In all three domains, 2% of segments co-occurred with an uncertainty gesture during the response to the Know question. In response to the Not know question, the rate of uncertainty gestures increased significantly (ps < 0.05) and generally more than doubled (5% fMRI, 8% Weather, and 4% Submarine).
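The comparisons reported for the Mars data are tests of association in a 2 × 2 table (speech code present or absent, by uncertainty gesture present or absent). A minimal Python sketch of such a test follows, assuming a Pearson chi-square without continuity correction; the text does not state which variant was used, and the function name is illustrative. The counts in the usage note are invented solely to give rates of 5.3% and 2.7% and are not the study's data, so the resulting statistic should not be expected to match the reported value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for the table

                         gesture   no gesture
        code present        a          b
        code absent         c          d

    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts (NOT the study's data): 53 of 1000 segments with
# uncertainty speech and 270 of 10000 segments without it show a gesture.
stat, p = chi_square_2x2(53, 947, 270, 9730)
rate_ratio = (53 / 1000) / (270 / 10000)  # about 1.96, i.e., roughly twice as often
```

The rate ratio, rather than the chi-square statistic, is what corresponds to the statement that uncertainty gestures occur about twice as often in the context of uncertainty speech.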
5. Uncertainty, Approximation, and Expertise
With multimodal affirmation of the somewhat surprising distinction between uncertainty and approximation in hand, we can now explore a third pragmatic question: whether the distinction plays a useful role in explaining behavior, in this case behavior of scientists and engineers. One intuition might be that uncertainty and approximation should differ by expertise levels, with experts showing more approximation and less uncertainty. Indeed, some expertise literature focuses on the amazing swiftness with which experts can see problems in terms of solution features and solve problems (Chase & Simon, 1973; Chi, Feltovich, & Glaser, 1981; Gobet & Simon, 1996; Larkin, McDermott, Simon, & Simon, 1980). However, much of the expertise literature making those claims focuses on well-defined problems, such as simple physics problems, that are purely education tasks rather than problems an expert would actually encounter. The actual life of an engineer and scientist is much less clear-cut. Indeed, experts in
most domains deal with a very uncertain world, hence the large focus on decision making under uncertainty within naturalistic decision-making research. While an expert certainly can produce better solutions and in less time than novices in the much more ill-defined contexts of real science and engineering problem solving (Moss, Kotovsky, & Cagan, 2006; Schunn & Anderson, 1999; Voss, Tyler, & Yengo, 1983), it is not a matter of recognition of simple solutions for the expert. Issues involving uncertainty must be recognized and then resolved through complex processes, like mental simulation. It may be that novices do not even recognize what is uncertain about the current situation, treating initial point estimates as fact rather than estimates.
The fMRI, Weather, and Submarine cued-recall dataset provides an opportunity to look at expertise effects on rates of uncertainty and approximation across domains to look for consistent patterns. We defined novices as those individuals having already learned enough of the task basics to be able to complete the analysis tasks on their own (e.g., analyze an fMRI dataset, make a weather prediction). Experts were those at the top performance levels. Intermediates were those with considerable experience beyond novice levels, but far from expert levels in that domain. In our participant pool for that study, only fMRI involved all three performance levels. The Weather data included novices (juniors and seniors in weather forecasting school) and experts, and the Submarine data had only intermediates and experts (both were submarine officers with field experience, but to varying degrees). Figure 3A presents the levels of uncertainty speech across the expertise levels in each domain, and Figure 3B presents the levels of approximation speech across the expertise levels in each domain. There are a few statistically significant differences, but no consistent differences across the three domains.
For example, in the submarine domain, the experts have the highest levels of uncertainty, whereas in the Weather domain they have the lowest. In all three domains, the differences by expertise level are small. The best overall conclusion to draw is that recognizing uncertainty may itself be a kind of expertise and the frequency of uncertainty comments will involve two opposing trends as a function of expertise: (1) experts likely recognize more facets of uncertainty and (2) experts are better able to resolve the uncertainty. How those opposing trends balance in aggregate will depend on the complexities of the task at hand. That is, I doubt that even a whole domain will have general patterns by expertise level on amount of uncertainty as some tasks within the domain will involve more detection challenges and others will involve more resolution challenges. In support of this idea that there are recognition and resolution elements to uncertainty, one can divide a problem-solving session into two halves (early and late). If experts recognize uncertainty more readily and then are able to resolve it, we would expect their uncertainty levels to go down over
Figure 3 The percentage of segments (with SE bars) showing (A) uncertainty speech and (B) approximation speech as a function of domain and expertise levels. [Plot: y-axis, % of segments; x-axis, Novice, Intermediate, Expert; series, fMRI, Weather, Submarine.]
time. By contrast, if novices are struggling to even see the issues of uncertainty and are less able to resolve these uncertainties, then we would expect novices’ uncertainty levels to go up over time. Figure 4 presents relevant uncertainty speech data from the fMRI domain. We see that uncertainty levels do go up for novices and intermediates whereas they go down (directionally but not statistically significant) for experts. Similar (small) interactions of early/late by expertise levels on uncertainty levels could also be seen in the other domains.
Figure 4 The percentage of segments (with SE bars) in the fMRI domain showing uncertainty speech as a function of early and late minutes of problem solving and expertise levels. Ns for each percentage vary between 130 and 300 segments of speech. [Plot: y-axis, % of segments; x-axis, Novice, Intermediate, Expert; series, Early and Late.]
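
The early/late comparison behind Figure 4 can be sketched as a simple split of each session. The Python below is a rough illustration only: it splits by segment count, whereas the figure caption refers to early and late minutes, so the split rule, the input format, and the function name `uncertainty_by_half` are all assumptions.

```python
def uncertainty_by_half(flags):
    """Return (early_rate, late_rate) of uncertainty speech for one session.

    flags: list of booleans in session order, True where the segment was
    coded as containing uncertainty speech. Assumption: the session is split
    at the midpoint by segment count; with an odd number of segments, the
    middle segment falls in the late half.
    """
    if len(flags) < 2:
        raise ValueError("need at least two segments to form two halves")
    mid = len(flags) // 2
    early, late = flags[:mid], flags[mid:]
    return sum(early) / len(early), sum(late) / len(late)
```

Under the recognition-and-resolution account, one would expect the late rate to exceed the early rate for novices and to fall below it for experts.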
Of course, including more fine-grained coding of uncertainty detection and resolution strategies in this analysis of expertise effects on uncertainty and approximation would provide a more conclusive perspective on why uncertainty is not clearly associated with expertise and appears to be changing in different ways over time. We have done this coding in all four science/applied science domains. We specifically looked at what indicators were used to identify sources of uncertainty. For example, uncertainty becomes apparent when different data sources (as in two weather models) produce conflicting results, or when one data source produces seemingly impossible results (as in brain activation outside the skull). A number of such general indicators could be found. We also looked at the strategies used to resolve the uncertainty. It turns out that there are a very large number of such general strategies that can be observed, some more spatial in form, others less spatial. There are some expertise differences by strategy within each of the domains, but the differences are not consistent across the domains, probably because different strategies are differentially effective within each domain. In sum, uncertainty and approximation have a complex relationship to expertise levels rather than a simple linear trend, and the relationship likely depends upon the ease with which uncertainty is detectable and resolvable in a given setting given available tools and strategies.
6. From Uncertainty to Approximation via Spatial Reasoning
Thus far, I have focused on the differences between uncertainty and approximation: how they are not the same. Now I would like to focus on the positive relationship that they have to one another. In particular, the theoretical assertion that I would like to make is that uncertainty and approximation have an input/output relationship to one another with spatial reasoning lying in between, at least in science and engineering problem solving. The next three sections build up the evidence for this theoretical assertion. Section 6.1 examines verbal protocol evidence that uncertainty leads to mental spatial transformations. Section 6.2 examines gesture data to establish the relative temporal relationship of uncertainty, approximation, and spatial mental representations. Section 6.3 focuses on a particular kind of spatial problem solving that appears to be used to move from uncertainty to approximation in problem solving.
6.1. Uncertainty and Verbally Coded Spatial Transformations in Basic and Applied Science
In Trickett et al. (2007), we used the syntactic approach to coding uncertainty in speech and then also coded the speech for the presence of spatial transformations. Spatial transformations are operations a person mentally performs on an internal representation or an external visualization (on paper or computer screen). Typical spatial transformations are creating a mental image, adding or deleting features to an image, rotating or moving an object, or making comparisons between different views. Table 2 provides examples of uncertainty codes and spatial transformations from utterances. In one study, we examined the relative co-occurrence of spatial transformations with uncertainty in speech for an expert (over 16 years of experience) making a weather forecast while giving a think-aloud (approximately 50 min of speech to analyze). We found that the rate of spatial transformations was almost twice as high during speech with uncertainty markers than in speech without uncertainty markers. Follow-up work with more experts and novices (although still trained in weather forecasting) found that both experts and novices showed this pattern but the effect is much larger in experts than in novices. In the second study of Trickett et al. (2007), a more rigorous test was conducted using the fMRI and Weather cued-recall dataset described earlier, but in a slightly different way. Here, spatial transformations were coded from the think-aloud speech of the problem solver doing the initial fMRI data analysis or weather forecast. Then relative levels of uncertainty were coded
Table 2 Examples of Spatial Transformations and Certain and Uncertain Utterances with Indications of Uncertainty in Bold and Spatial Transformation in Italics (adapted from Trickett et al., 2007).

Utterance: Nogaps [a mathematical model] has some precipitation over the Vancouver/Canada border (while viewing a visualization)
Code: Certain
Spatial transformations (ST): No ST

Utterance: This is valid today
Code: Certain
Spatial transformations (ST): No ST

Utterance: Possibly some rain over Port Angeles
Code: Uncertain
Spatial transformations (ST): No ST

Utterance: And then uh, at Port Angeles, there’s gonna be some rain up at the north, and if that sort of sneaks down, we could see a little bit of restriction of visibility, but only down to 5 miles at the worst
Code: Uncertain
Spatial transformations (ST): ST: mentally moving rain [sneaks down]

Utterance: I don’t think the uh front’s gonna get to Whidbey Island, but it should be sitting right about over Port Angeles right around 0Z this evening
Code: Uncertain
Spatial transformations (ST): ST: mentally moving front/animation
from the responses to the cued-recall questions. A minute of problem solving was determined to be a high-uncertainty minute if the cued-recall phase for that minute generated a high percentage of uncertainty speech codes, whereas the minute of problem solving was determined to be a low-uncertainty minute if the cued-recall phase generated a low percentage of uncertainty speech codes. Thus, spatial transformations and relative uncertainty levels were coded from different datasets (and also by coders from different labs). Further, in our prior analyses, uncertainty speech was coded at the utterance-by-utterance level, whereas the underlying uncertainty is likely more pervasive (i.e., the speech codes may be considered as the tip of the uncertainty iceberg). This designation of entire minutes as being high or low uncertainty addresses this issue. Indeed, using this approach to examining uncertainty against spatial transformations, we found that spatial transformations were over four times more frequent during high-uncertainty minutes than during low-uncertainty minutes.
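The minute-level comparison described above can be sketched in Python as follows. The text does not state the cutoff that separated a "high" from a "low" percentage of uncertainty codes, so the median split used here is purely an assumption, as are the dictionary-based input format and the function names.

```python
from statistics import median


def split_minutes(uncertainty_pct):
    """Split minutes into high- and low-uncertainty sets.

    uncertainty_pct: {minute_id: percentage of cued-recall segments for that
    minute that carry an uncertainty speech code}.
    Assumption: a median split, with ties assigned to the low set; the actual
    cutoff used in the study is not given in the text.
    """
    cut = median(uncertainty_pct.values())
    high = {m for m, pct in uncertainty_pct.items() if pct > cut}
    low = {m for m, pct in uncertainty_pct.items() if pct <= cut}
    return high, low


def mean_transformations(minutes, st_counts):
    """Mean number of spatial transformations per minute.

    st_counts: {minute_id: transformations coded from the think-aloud speech
    for that minute}, which is a separate data source from the cued recall.
    """
    if not minutes:
        raise ValueError("no minutes in this group")
    return sum(st_counts[m] for m in minutes) / len(minutes)
```

The point of the design is that the two dictionaries come from different data sources (cued-recall responses versus the original think-aloud), so the association cannot arise from coding the same utterances twice.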
6.2. Association of Uncertainty and Approximation with Spatial Gestures in Basic Science
In addition to potentially capturing uncertainty or approximation in thinking, gestures can also capture spatial problem solving. If spatial problem solving takes place between uncertainty and approximation, then we should
see more spatial gestures between uncertainty and approximation. But what kind of gestures should we expect to see? There are many different kinds of spatial representations. The spatial reasoning literatures (in cognitive psychology, developmental psychology, and cognitive neuroscience) frequently make distinctions between large scale and small scale, egocentric and exocentric (or allocentric), and between two-dimensional and three-dimensional visual–spatial representations. The work described in the prior section suggests that spatial transformations are frequently used by problem solvers to resolve uncertainty. The cognitive neuroscience literature has suggested for multiple decades that a ventral (‘‘what’’ or object type information) and dorsal (‘‘where’’ or object location information) pathway is a critical distinction in thinking about visual–spatial processing (Ungerleider & Mishkin, 1982). Later work (for a review, see Kosslyn, Ganis, & Thompson, 2001) has suggested that the parietal lobe (part of the where pathway) is heavily involved during spatial transformations (e.g., during mental rotation). Other neuroscience work has suggested that the parietal lobes are specifically involved in small 3D representations of space (Previc, 1998). By inference, one would expect to see high numbers of small 3D manipulation gestures following uncertainty speech and preceding approximation speech if mental transformations are doing the work of going from uncertainty to approximation and these gestures map onto mental transformations of this type. We have tested exactly this prediction in the Mars data described earlier. In addition to coding uncertainty gestures, we also coded for several other kinds of spatial and nonspatial gestures. The most common spatial gesture was small-scale 3D gestures. Based on a theoretical framework I have developed elsewhere (Harrison & Schunn, 2002), these are called manipulative gestures. 
Specifically, manipulative gestures are gestures that place objects and activity in a nearby space, such that the problem solver can actually manipulate or place the imaginary objects. Examples of manipulative gestures included one-handed gestures of a brain region (a cupped hand facing up) and two-handed gestures showing dust billowing over a small crater lip (the left hand flat and held still at an angle to represent the crater lip and the right hand swooping over the left with fingers wiggling to show the billowing dust). Gestures in which the hand shape suggests placing or holding as opposed to strictly pointing were also coded as manipulative. To examine the relative temporal arrangement of uncertainty speech and manipulative gestures, we divided speech segments into several different types: segments with uncertainty speech (exact), segments that have uncertainty 1–5 segments before the current one (before), segments that have uncertainty 1–5 segments after the current one (after), segments with both before and after relationships (both), and then segments not near uncertainty speech (distant), which can be thought to establish a base rate of spatial gestures. We then examined the rate of manipulative gestures during each of
these segment types. The same analysis was also done for gestures’ temporal relationship to approximation speech codes. Figure 5 presents the results of this analysis. Focusing on manipulative gestures relative to uncertainty speech, the highest rates of manipulative gestures occur when the uncertainty speech occurs before the current segment. The ‘‘during’’ cases (both and exact) have lower rates of manipulative gestures, and the after case has a manipulative gesture rate similar to segments distant from any uncertainty speech. Thus, it appears that uncertainty speech occurs primarily before manipulative gestures and not after. For approximation speech and manipulative gestures, a different pattern appears. Here manipulative gestures are elevated anywhere near approximation speech, but particularly right during it. Thus, the approximation representations appear to occur simultaneously with the spatial transformation work. Overall, these data are consistent with the view that uncertainty leads to spatial transformations that produce approximation results.
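The segment classification used in this analysis can be sketched in code. The following is a minimal illustration, not the analysis script used for the Mars data: the function names, the boolean-list input format, and the rule that a segment containing the speech code is always labeled exact (even when other coded segments are nearby) are all assumptions made for the example.

```python
from collections import defaultdict

WINDOW = 5  # the chapter uses a window of 1-5 segments


def classify_segments(has_code, window=WINDOW):
    """Label each segment by where the speech code falls relative to it.

    has_code: list of booleans, one per speech segment, True when the
    segment contains the speech code of interest (e.g., uncertainty speech).
    """
    labels = []
    for i, coded in enumerate(has_code):
        if coded:
            # Assumption: a coded segment is "exact" regardless of neighbors.
            labels.append("exact")
            continue
        # The code occurred in one of the preceding `window` segments.
        before = any(has_code[max(0, i - window):i])
        # The code occurs in one of the following `window` segments.
        after = any(has_code[i + 1:i + 1 + window])
        if before and after:
            labels.append("both")
        elif before:
            labels.append("before")
        elif after:
            labels.append("after")
        else:
            labels.append("distant")
    return labels


def gesture_rates(has_code, has_gesture, window=WINDOW):
    """Proportion of segments with a manipulative gesture, per segment type."""
    if len(has_code) != len(has_gesture):
        raise ValueError("both inputs need one entry per segment")
    totals = defaultdict(int)
    hits = defaultdict(int)
    for label, gesture in zip(classify_segments(has_code, window), has_gesture):
        totals[label] += 1
        hits[label] += int(gesture)
    return {label: hits[label] / totals[label] for label in totals}
```

Under this labeling, "before" means that the speech code preceded the current segment, which is the case in which the chapter reports the highest rate of manipulative gestures relative to uncertainty speech.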
6.3. From Approximation to Uncertainty via Mental Simulations in Engineering Design
The Christensen and Schunn (2009) examination of uncertainty and approximation in engineering design also examined the temporal relationships of uncertainty and approximation relative to mental problem solving.
[Figure 5 plot: y-axis, proportion of manipulative gestures (0 to 0.4); x-axis, location of speech code (Distant, Before (1–5), Both, Exact, After (1–5)); series, Uncertainty and Approximation.]
Figure 5 The proportion of speech segments (with SE bars) with manipulative gestures as a function of whether uncertainty speech occurred before (within five segments), after (within five segments), both before and after, or exactly in the segment. Each proportion is based on between 300 and 600 segments of data, except the ‘‘distant’’ proportions, which are based on 1300 segments.
In particular, we focused on a kind of problem solving that was quite frequent in engineering design team meetings: mental simulations. These mental simulations happened approximately once every two minutes on average during the meetings. In the part of the meetings in which the conversation was focused on active design of the product (vs. future planning), the rate went up to one per minute. The coding scheme for mental simulations was adapted from the coding scheme developed by Trickett and Trafton (2007) for coding scientist mental simulations. A mental simulation is a mentally constructed model of a situation, building upon objects in memory or mental modifications of objects currently present. A defining feature of a mental simulation is that something is “running,” that is, that the process alters the representation. The simulation is not just asking a “what if” question. It also provides an answer about whether something will work, what a resulting feature will be, etc. Mental simulations involve a sequence of three critical elements: creating an initial representation, running the representation (elements or functions are changed, added, or deleted), and a final changed representation. Each segment was coded as “mental simulation” or “no mental simulation,” along with the separate steps. Table 3 presents an example mental simulation from the transcripts coded into three components. The interrater reliability for coding mental simulations was quite high, kappa = 0.9.

Table 3 An Example Mental Simulation from the Engineering Design Domain (from Christensen & Schunn, 2009).
Step: Utterance
Initial representation: Could you add something so that you couldn’t close this thing because there would be something in the way when you try to fold this way. . .
Run: But if this thing goes this way, then it is in a position to allow the ear to enter. . . But then I just don’t know how it should be folded. . . ’cause if it is folded this way then it will come out here. . .then it should be folded unevenly somehow. . .You should fold it oblique.
Changed representation: It wouldn’t make any difference one way or the other. It would fold the same way, and come out on this side the same way.

[Figure 6 plot: y-axis, percentage of segments (0% to 25%); x-axis, mental simulation sequential steps (Initial representation, Simulation run, Resulting representation); series, Uncertainty and Approximation.]
Figure 6 Percentage of segments with uncertainty and approximation by mental simulation sequential step, with SE bars (from Christensen & Schunn, 2009).

Figure 6 presents the rate of uncertainty and approximation speech as a function of step during a mental simulation. The base rate of (speech coded) uncertainty is 8% in this dataset. The rate of uncertainty speech was statistically significantly higher than the base rate at the initial representation and during the simulation run, but not during the resulting representation. By contrast, approximation speech was at baseline levels (3%) at the initial representation step, and rose to significantly higher levels by the resulting representation. Thus, the temporal patterns are perfectly consistent with the hypothesis that mental simulations have the effect of turning uncertainty into approximations. More recently, Linden and Christensen (2009) coded for uncertainty and mental simulations in a different engineering design dataset and found exactly the same results: a reduction of uncertainty in the initial representation down to base levels of uncertainty by the resulting representation state of the mental simulation.
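For readers unfamiliar with the reliability statistic reported for the mental simulation codes, the following sketch shows how a kappa value is computed from two coders' labels. It assumes Cohen's kappa for two coders (the chapter says only "kappa"), and the function name and the toy labels are hypothetical; it is not the reliability analysis actually run on the transcripts.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders over the same segments."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty segment list")
    n = len(coder_a)
    # Observed agreement: share of segments given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement if each coder assigned labels independently at
    # his or her own base rates.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)
```

With four segments labeled (sim, sim, none, none) by one coder and (sim, none, none, none) by the other, observed agreement is 0.75, expected agreement is 0.5, and kappa is 0.5. A value of 0.9 therefore indicates agreement far above what the coders' base rates alone would produce.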
7. Summary and Discussion
Epistemic uncertainty is a huge area of scholarship. It has captured the minds of scholars in psychology and many domain-specific studies of reasoning and problem solving, presumably because uncertainty is ubiquitous or nearly so in real-world problem solving. With all the rich distinctions that could be made about uncertainty, I began this chapter with a
different psychological distinction that seemed on first inspection a nondistinction: the distinction between vaguely uncertain and certainly vague. Indeed, when I began empirical investigations into uncertainty problem solving, I assumed that uncertainty was the start state and precise certainty was the end state. That is, I assumed that a problem solver moved along a continuum of precision, with initial states involving little precision and final states involving high precision. Yet, examination of problem-solving transcripts hinted at a different transformation: from uncertainty to imprecision, or, as I call it now, approximation. In the early coding work on the uncertainty/approximation distinction, we had many arguments within the lab about what the distinction even was and how it could be coded with any conceptual integrity. Yet, the initial intuition about the need for such a distinction appeared to have merit. The distinction can be defined psychologically, even though the logical or information theoretic definitions are lacking. More importantly perhaps to researchers who are empirically rather than philosophically oriented, the distinction could be coded reliably in real problem-solving transcripts, and cross-validation investigations were also very successful. Of course, subtle demand characteristics of context might have created these distinctions in the minds of the coders. For example, it is hard to hide from the coders the context of participants being asked what they knew versus what they did not know. Even with the questions themselves being hidden, the participants often repeat the question verbatim or with minor rephrasings. However, the same pattern was observed in many different datasets, which involved many different coders (spread across labs in different cities), and cross-validations of different forms.
Furthermore, we focused on more syntactic approaches to coding uncertainty and approximation to reduce the possibility that situational demand characteristics significantly determined our results. Finally, we did not find that expertise levels had clear associations with uncertainty levels, even though some of the coders had strong expectations that there would be such patterns. Thus, effects of coder expectations on coding behavior were not strong enough to create results through expectations alone. Perhaps most persuasive and interesting are the patterns of uncertainty and approximation against reasoning indicators. We found clear temporal patterns: (1) uncertainty invokes mental spatial transformations; (2) spatial gestures seem to reside between verbal uncertainty and verbal approximation; and (3) mental simulations seem to reside between verbal uncertainty and verbal approximation. Before declaring victory in this appeal for a new general distinction, I want to return to the information theoretic/logical basis or nonbasis for the distinction. In cognitive science, there is a general view that cognition is but computation. Further, considerable recent theorizing has focused on the optimality or rationality of human cognition (Anderson, 1990; Gigerenzer, 2000; Griffiths & Tenenbaum, 2006). It should make the reader nervous to
accept a distinction as the basis of rational, expert problem solving when the computational/logical basis of the distinction is fundamentally flawed. As I noted before, the definition for uncertainty and approximation could not be made simply on the basis of logical informational ambiguity. That is, one could imagine uncertain cases that had less ambiguity than other approximation cases. However, I also noted that, empirically, problem-solving processes would generally reduce the underlying ambiguity as the problem solver moved from uncertainty to approximation. Therein lies the true rational basis of this mode of processing. The computational work of Forbus (1997) building running conceptual simulations (called qualitative reasoning) shows how approximate quantitative answers can be derived from incomplete information. To extend a computational framework to the current proposal, the idea is as follows. A problem solver is working on a task and discovers that the informational ambiguity is above some threshold such that a critical decision/inference cannot be made (e.g., will a design choice produce a satisfactory outcome?). A state of uncertainty is thus taken on, which motivates problem-solving processes (such as spatial transformations or mental simulations) to reduce the underlying ambiguity. When the ambiguity is sufficiently reduced to enable decision making, then the resulting ambiguity is declared an approximation. I should also add an important caveat. While uncertainty frequently resolves in approximation in science and engineering, I am not claiming that it always results in approximation; sometimes it just ends in more uncertainty and the problem solvers move on, and sometimes it even ends in precise certainty. Although the world of scientists and engineers is complex enough that exact, certain values are not the norm, it does happen. A final caveat involves my focus on science and engineering. 
Many psychologists avoid rich real domains because of the difficulties in obtaining access to participants and the complexities of studying real tasks. Those psychologists who do study real domains tend to pick a particular domain to study. I have presented data from many different domains, including several basic sciences, several applied sciences, and engineering design. Hopefully, the case is now persuasively made for science and engineering. But certainly the space of domains involving informational uncertainty is much broader still. I suspect that similar distinctions will be relevant in these other domains, but that remains an empirical question for others to examine.
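The threshold account proposed in this section (ambiguity above a decision threshold triggers uncertainty, ambiguity-reducing processes run, and the residual ambiguity is declared an approximation once a decision becomes possible) can be made concrete with a small sketch. Everything specific here is an assumption introduced for illustration: the numeric ambiguity scale, the threshold value, the `reduce_step` callback standing in for a spatial transformation or mental simulation, and the step limit. It is not a model that was fit to any of the data reported above.

```python
def resolve(ambiguity, threshold, reduce_step, max_steps=10):
    """Follow one quantity through the hypothesized uncertainty cycle.

    Returns a (state, remaining_ambiguity) pair, where state is one of
    "accepted", "certain", "approximation", or "uncertain".
    """
    if ambiguity <= threshold:
        # Ambiguity is low enough to decide: no psychological uncertainty.
        return "accepted", ambiguity
    # Ambiguity blocks the decision, so a state of uncertainty is taken on
    # and ambiguity-reducing processes are applied.
    for _ in range(max_steps):
        ambiguity = reduce_step(ambiguity)
        if ambiguity == 0:
            # The occasional case that ends in precise certainty.
            return "certain", ambiguity
        if ambiguity <= threshold:
            # Enough ambiguity removed to decide: declare an approximation.
            return "approximation", ambiguity
    # The problem solver gives up and moves on, still uncertain.
    return "uncertain", ambiguity
```

For example, with a hypothetical starting ambiguity of 0.8, a threshold of 0.3, and a process that halves ambiguity on each step, the sketch ends in an approximation after two steps. A process that reduces nothing leaves the problem solver uncertain, which matches the caveat that uncertainty does not always resolve.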
8. Future Directions
I have attempted to provide a simple and rational account for what problem solvers do with uncertain information, but many questions remain. For example, we have at best a very incomplete understanding of how
information uncertainty is detected by the problem solver. In science and engineering, the problem solver might encounter hundreds to thousands of quantities, all of which may involve some uncertainty, and yet psychological uncertainty is not triggered for all of those values. Most values are simply accepted. What raises the uncertainty hairs of the problem solver in these complex settings? A related question involves what it means, exactly, to have the uncertainty hairs raised. We know that information ambiguity is troubling to problem solvers. It motivates them to reduce the ambiguity, and the ambiguity-reduction procedures appear to be useful for problem-solving success. But this behavioral description does not precisely unpack the mental state of uncertainty. Is it purely cognitive or does it have a core emotional component? Does it have underlying phenomenological primitives or is psychological uncertainty a foundational concept? As mentioned in Section 1, we know that uncertainty derived from different factors produces different behaviors, but that, by itself, does not answer the phenomenological question. Cognitive neuroscience may provide some interesting data on this front. We know that relative predictability of outcomes is a key variable in predicting the reactions of certain brain areas (e.g., the anterior cingulate cortex or the basal ganglia), and this relative predictability is heavily implicated in learning. Another future direction involves the qualitative. Thus far I have emphasized psychological uncertainty about quantitative dimensions. What about qualitative dimensions? Perhaps the enemy will come by plane or by train. Perhaps it will snow or it will rain. Psychological uncertainty is clearly relevant to these qualitative ambiguities. What about approximation?
Let us briefly consider some of the hedge words that we used for coding approximation in speech: “pretty much,” “virtually,” “generally,” “frequently,” “usually,” “normally,” “basically,” and “almost.” All of these hedges could be applied to the qualitative ambiguities in enemy transportation method or precipitation type. Semantically, those qualifiers would be ones of probability, which is a quantitative dimension attached to discrete qualitative states. Indeed, many of the things that were coded in our datasets as approximations involved these sorts of probabilistic hedges to qualitative issues. The task for future research is to fathom whether approximation on quantities and approximate probabilities on qualities are actually the same basic thing.
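A syntactic pass over transcript segments for the hedge words listed above might look like the following sketch. It only illustrates keyword-based flagging; the coding reported in this chapter was carried out by trained human coders, and the function name, the regular expression, and the example sentence are assumptions made for the example.

```python
import re

# Hedge words listed in the chapter as cues for approximation speech.
APPROXIMATION_HEDGES = [
    "pretty much", "virtually", "generally", "frequently",
    "usually", "normally", "basically", "almost",
]

# Whole-word, case-insensitive match on any of the hedges.
_HEDGE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(h) for h in APPROXIMATION_HEDGES) + r")\b",
    re.IGNORECASE,
)


def find_hedges(segment):
    """Return the approximation hedges found in one speech segment, in order."""
    return [match.group(1).lower() for match in _HEDGE_PATTERN.finditer(segment)]


def is_candidate_approximation(segment):
    """Flag a segment for human review if it contains at least one hedge."""
    return bool(find_hedges(segment))
```

Applied to "The enemy usually comes by train, and it will almost certainly snow," the sketch returns the hedges "usually" and "almost," both attached to qualitative states. This is the kind of case in which the hedge expresses an approximate probability rather than an approximate quantity.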
ACKNOWLEDGMENTS
The reported projects were supported by grants from ONR (N000140610053, N000140210113, and N000140310061) and NSF (SBE-0738071). These projects were intense collaborations with Bo Christensen, Greg Trafton, Susan Trickett, Susan Kirschenbaum, Lelyn Saner, and Tsunhin Wong, and involved additional hard coding work by Melanie Shoup and Mike Knepp.
REFERENCES
Abbaspour, R. A., Delavar, M. R., & Batouli, R. (2003). The issue of uncertainty propagation in spatial decision making. In K. Virrantaus & H. Tveite (Eds.), Proceedings of the Scandinavian research conference on geographical information science (pp. 57–65). Helsinki, Finland: Helsinki University of Technology.
Alibali, M. W., Bassok, M., Solomon, K. O., Syc, S. E., & Goldin-Meadow, S. (1999). Illuminating mental representations through speech and gesture. Psychological Science, 10(4), 327–333.
Alibali, M. W., & Goldin-Meadow, S. (1993). Gesture-speech mismatch and mechanisms of learning: What the hands reveal about a child’s state of mind. Cognitive Psychology, 25(4), 468–523.
Anderson, J. R. (1990). The adaptive character of thought. Hillsdale, NJ: L. Erlbaum Associates.
Berkeley, D., & Humphreys, P. (1982). Structuring decision problems and the “bias heuristic”. Acta Psychologica, 50(3), 201–252.
Bottom, W. P. (1998). Negotiator risk: Sources of uncertainty and the impact of reference points on negotiated agreements. Organizational Behavior and Human Decision Processes, 76(2), 89–112.
Brashers, D. E., Neidig, J. L., Russell, J. A., Cardillo, L. W., Haas, S. M., Dobbs, L., et al. (2003). The medical, personal, and social causes of uncertainty in HIV illness. Issues in Mental Health Nursing, 24, 497–522.
Chase, W. G., & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4, 55–81.
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121–152.
Christensen, B. T., & Schunn, C. D. (2009). The role and impact of mental simulation in design. Applied Cognitive Psychology, 23(3), 327–344.
Cohen, M. S., Freeman, J. T., & Thompson, B. (1998). Critical thinking skills in tactical decision making: A model and a training strategy. In J. A. Cannon-Bowers & E. Salas (Eds.), Making decisions under stress: Implications for individual and team training (pp. 155–189). Washington, DC: American Psychological Association.
Egan, J. P., Schulman, A. I., & Greenberg, G. E. (1961). Memory for waveform and time uncertainty in auditory detection. Journal of the Acoustical Society of America, 33, 779–781.
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (2nd ed.). Cambridge, MA: MIT Press.
Forbus, K. D. (1997). Qualitative reasoning. The Computer Science and Engineering Handbook, 715–733.
Friday, E. W. (2003). Communicating uncertainties in weather and climate information: A workshop summary. Washington, DC: National Academies Press.
Gigerenzer, G. (2000). Adaptive thinking: Rationality in the real world. Oxford: Oxford University Press.
Gobet, F., & Simon, H. A. (1996). Recall of random and distorted chess positions: Implications for the theory of expertise. Memory & Cognition, 24(4), 493–503.
Griffiths, T. L., & Tenenbaum, J. B. (2006). Optimal predictions in everyday cognition. Psychological Science, 17(9), 767–773.
Hall, K. H. (2002). Reviewing intuitive decision-making and uncertainty: The implications for medical education. Medical Education, 36, 216–224.
Harrison, A. M., & Schunn, C. D. (2002). ACT-R/S: A computational and neurologically inspired model of spatial reasoning. Paper presented at the 24th annual meeting of the cognitive science society. Mahwah, NJ: Erlbaum.
Howell, W. C., & Burnett, S. A. (1978). Uncertainty measurement: A cognitive taxonomy. Organizational Behavior and Human Decision Processes, 22(1), 45–68.
Jousselme, A.-L., Maupin, P., & Bossé, E. (2003). Uncertainty in a situation analysis perspective. Paper presented at the 6th annual conference on information fusion, Cairns, Australia.
Kahneman, D., & Tversky, A. (1982). Variants of uncertainty. Cognition, 11(2), 143–157.
Klein, G. A. (1989). Strategies of decision making. Military Review, 56–64 (May).
Kosslyn, S. M., Ganis, G., & Thompson, W. L. (2001). Neural foundations of imagery. Nature Reviews Neuroscience, 2, 635–642.
Krivohlavy, J. (1970). Subjective probability in experimental games. Acta Psychologica, 34(2–3), 229–240.
Larkin, J. H., McDermott, J., Simon, D., & Simon, H. (1980). Expert and novice performance in solving physics problems. Science, 208, 140–156.
Linden, J. B., & Christensen, B. T. (2009). Analogical reasoning and mental simulation in design: Two strategies linked to uncertainty resolution. Design Studies, 3, 169–186.
Lipshitz, R., & Strauss, O. (1997). Coping with uncertainty: A naturalistic decision-making analysis. Organizational Behavior and Human Decision Processes, 69(2), 149–163.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago, IL: University of Chicago Press.
McNeill, D. (2005). Gesture and thought. Chicago: University of Chicago Press.
Moss, J., Kotovsky, K., & Cagan, J. (2006). The role of functionality in the mental representations of engineering students: Some differences in the early stages of expertise. Cognitive Science, 30(1), 65–93.
Musgrave, B. S., & Gerritz, K. (1968). Effects of form of internal structure on recall and matching with prose passages. Journal of Verbal Learning and Verbal Behavior, 7(6), 1088–1094.
Plewe, B. (2002). The nature of uncertainty in historical geographic information. Transactions in GIS, 6(4), 431–456.
Previc, F. H. (1998). The neuropsychology of 3-D space. Psychological Bulletin, 124(2), 123–164.
Priem, R. L., Love, L. G., & Shaffer, M. A. (2002). Executives’ perceptions of uncertainty sources: A numerical taxonomy and underlying dimensions. Journal of Management, 28(6), 725–746.
Regan, H. M., Colyvan, M., & Burgman, M. A. (2002). A taxonomy and treatment of uncertainty for ecology and conservation biology. Ecological Applications, 12(2), 618–628.
Regan, H. M., Hope, B. K., & Ferson, S. (2002). Analysis and portrayal of uncertainty in a food web exposure model. Human and Ecological Risk Assessment, 8(7), 1757–1777.
Rowe, W. D. (1994). Understanding uncertainty. Risk Analysis, 14, 743–750.
Schunn, C. D., & Anderson, J. R. (1999). The generality/specificity of expertise in scientific reasoning. Cognitive Science, 23(3), 337–370.
Schunn, C. D., Saner, L. D., Kirschenbaum, S. K., Trafton, J. G., & Littleton, E. B. (2007). Complex visual data analysis, uncertainty, and representation. In M. C. Lovett & P. Shah (Eds.), Thinking with data. Mahwah, NJ: Erlbaum.
Sheer, V. C., & Cline, R. J. (1995). Testing a model of perceived information adequacy and uncertainty reduction in physician/patient interactions. Journal of Applied Communication Research, 23, 44–59.
Trickett, S. B., & Trafton, J. G. (2007). “What if. . .”: The use of conceptual simulations in scientific reasoning. Cognitive Science, 31(5), 843–875.
Trickett, S. B., Trafton, J. G., Saner, L. D., & Schunn, C. D. (2007). “I don’t know what is going on there”: The use of spatial transformations to deal with and resolve uncertainty in complex visualizations. In M. C. Lovett & P. Shah (Eds.), Thinking with data. Mahwah, NJ: Erlbaum.
Trickett, S. B., Trafton, J. G., & Schunn, C. D. (2009). How do scientists respond to anomalies? Different strategies used in basic and applied science. Topics in Cognitive Science, 1, 711–729.
Trope, Y. (1978). Inferences of personal characteristics on the basis of information retrieved from one’s memory. Journal of Personality and Social Psychology, 36(2), 93–106.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior. Cambridge, MA: MIT Press.
Urbany, J. E., Dickson, P. R., & Wilkie, W. L. (1989). Buyer uncertainty and information search. Journal of Consumer Research, 16(2), 208–215.
Vlek, C., & Hendrickx, L. (1988). Statistical risk versus personal control as conceptual bases for evaluating (traffic) safety. In T. Rothengatter & R. de Bruin (Eds.), Road user behaviour: Theory and research (pp. 139–151). Assen, Netherlands: Van Gorcum & Co B.V.
Voss, J. F., Tyler, S. W., & Yengo, L. A. (1983). Individual differences in the solving of social science problems. In R. F. Dillon & R. R. Schmeck (Eds.), Individual differences in cognition, Vol. 1 (pp. 205–232). New York: Academic.
Walker, V. R. (1991). The siren songs of science: Toward a taxonomy of scientific uncertainty for decision makers. Connecticut Law Review, 23, 567.
Walker, V. R. (1998). Risk regulation and the “faces” of uncertainty. Risk: Health, Safety, & Environment, 9, 27–38.
Webster, A., & Bond, T. (2002). Structuring uncertainty: Developing an ethical framework for professional practice in educational psychology. Educational and Child Psychology, 19(1), 16–29.
CHAPTER SEVEN
Event Perception: A Theory and Its Application to Clinical Neuroscience
Jeffrey M. Zacks and Jesse Q. Sargent
Contents
1. Introduction
2. Event Segmentation Theory
2.1. Prior Evidence
3. Schizophrenia
3.1. Cognitive Deficits
3.2. Schizophrenia and Event Segmentation
4. Obsessive-Compulsive Disorder
4.1. Cognitive Disturbances
4.2. Obsessive-Compulsive Disorder and Event Segmentation
5. Parkinson’s Disease
5.1. Cognitive Deficits
5.2. Parkinson’s Disease and Event Segmentation
6. Lesions of the Prefrontal Cortex
6.1. Cognitive Deficits
6.2. Prefrontal Lesions and Event Segmentation
7. Aging
7.1. Prefrontal Cortex
7.2. Midbrain Neuromodulatory Systems
7.3. Episodic Memory and Situation Model Construction
7.4. Aging and Event Segmentation
8. Alzheimer’s Disease
8.1. Brain Changes and Cognitive Deficits
8.2. Alzheimer’s Disease and Event Segmentation
9. Conclusions
Acknowledgments
References
Abstract
The chunking of continuous ongoing activity into discrete events is a central component of perception and cognition. It plays important roles in attention, cognitive control, and memory. Here, we review a theory of how the mind/brain segments ongoing activity into meaningful events. The theory proposes that event segmentation arises because perceptual systems make predictions about the near future. These predictions are guided by working memory representations, and when predictions fail memory representations are updated. We apply the theory to six conditions in clinical neuroscience: schizophrenia, obsessive-compulsive disorder, Parkinson’s disease, lesions of the prefrontal cortex, aging, and Alzheimer’s disease. This analysis makes novel suggestions for interventions to address these conditions, and points the way to new avenues of research.
Psychology of Learning and Motivation, Volume 53. ISSN 0079-7421, DOI: 10.1016/S0079-7421(10)53007-X. © 2010 Elsevier Inc. All rights reserved.
1. Introduction
For humans to experience the world as structured, the brain must organize the cacophonous wash of information that comes in through the senses. One powerful organizational principle is chunking: grouping a contiguous region of the input space under one representation. Chunking occurs at many stages in the central nervous system. Close to the sensory surfaces, a single neuron in the primary visual cortex may code for information gathered from many individual photoreceptor cells in the retina and thus represent an extended, coherent feature in the visual environment, such as a line at a certain orientation (Hubel & Wiesel, 1968; Marr & Ullman, 1981). At later stages, information in memory appears to be organized so that some specific items are associated or grouped with other specific items. For example, items that are learned in close temporal proximity are more likely to be recognized later if they are again presented in close temporal proximity (e.g., Faust, Balota, & Spieler, 2001; Underwood, 1957). In this chapter, we present a theory of how the human brain chunks the continuous stream of experience associated with everyday life into discrete episodes, or events. If asked to recall yesterday’s activities, one might organize the description into separate chunks such as going to the grocery store, doing the laundry and calling a friend. We propose that this organization is not just the result of conscious efforts to present an orderly description. Rather, as a function of perceiving and experiencing those episodes, a network of specific brain regions automatically inserts boundaries between discrete events as they occur. As a fundamental element of normal perceptual processing, this type of segmentation is suggested to be at the center of attention, action control, online memory updating, and episodic memory encoding.
Because these functions are so important for everyday functioning, conditions that affect event segmentation may produce significant changes in cognition. Accordingly, this chapter investigates event segmentation in relation to cognitive disorders in which everyday event understanding is disturbed. We hope this will prove useful in organizing the facts about
deficits of higher level cognition, providing testable hypotheses, and suggesting specific interventions that might not emerge from other theoretical perspectives. We begin with an overview of the theory and then consider how it may be useful in efforts to understand several neuropsychological disorders and the cognitive changes associated with healthy aging.
2. Event Segmentation Theory
Event Segmentation Theory (EST) describes how and why our nervous systems segment ongoing experience into discrete episodes (Zacks, Speer, Swallow, Braver, & Reynolds, 2007; see also Kurby & Zacks, 2008; Swallow & Zacks, 2008). For example, consider what might happen during a typical visit to a coffee shop: you wait in line, you give your order, you pay, you put cream in your coffee, you leave. Different people will generate somewhat different lists of activities, but all are able to describe experience across time as organized into distinct units and overall there will be considerable agreement across individuals regarding what those units are. EST proposes this happens because, as part of normal perceptual processing, humans automatically segment episodes into units. In fact, EST suggests that the ongoing segmentation of experience is at the center of cognitive control, working memory (WM) updating, and storage and retrieval from episodic memory. The core components of EST, corresponding hypothesized neurophysiological structures, and the basic flow of information are illustrated in Figure 1. Reference to Figure 1 may be helpful as we describe the components of EST and review some of the relevant empirical evidence below. For a more detailed presentation of the neurocognitive account, see Zacks et al. (2007). For a more detailed computational presentation and computer simulation results, see Reynolds, Zacks, and Braver (2007). EST starts from the supposition that some of the most important products of perception and comprehension are predictions about what will happen in the near future. Prediction is front and center in many contemporary accounts of perceptual processing (Enns & Lleras, 2008), learning (Schultz & Dickinson, 2000), and language (Elman, 2009). Good predictions are adaptive because they allow one to plan actions more successfully (e.g., avoiding hazards or intercepting desired objects).
Also, good predictions can facilitate efficient perceptual processing. For example, if a pitcher winds up and completes a throwing motion, the perceptual system anticipates that the ball will fly out of the pitcher’s hand toward home plate. In the absence of such anticipation, perceiving the ball whizzing through the air would be much more difficult—in fact, one might miss it altogether! According to EST, prediction is abetted by WM representations called event models. Event models may be thought of as representations of
[Figure 1 diagram. Recoverable box labels: Sensory inputs (A1, V1, S1, ...); Perceptual processing (IT, MT+, pSTS, ...); Event models (lateral PFC); Event schemata (lateral PFC); Predicted future inputs; Error detection. The structure labels “SN, VTA, LC” and “ACC, ...” also appear in the diagram.]
Figure 1 Schematic depiction of the model, with hypotheses about the neurophysiological structures corresponding to the different components of the model. Thin gray arrows indicate the flow of information between processing areas, which are proposed to be due to long-range excitatory projections. Dashed lines indicate projections that lead to the resetting of event models. PFC, prefrontal cortex; IT, inferotemporal cortex; MT+, human MT complex; pSTS, posterior superior temporal sulcus; ACC, anterior cingulate cortex; SN, substantia nigra; VTA, ventral tegmental area; LC, locus coeruleus; A1, primary auditory cortex; S1, primary somatosensory cortex; V1, primary visual cortex. (Adapted with permission from Zacks et al., 2007.)
what-is-happening-now. EST suggests that all perceptual input is processed in the context of a currently activated conception of what-is-happening-now. Our conceptualization of event models borrows heavily from work on situation models in discourse comprehension (e.g., Zwaan & Radvansky, 1998). Event models represent those aspects of a situation that are consistent within an event, while ignoring those aspects that vary haphazardly from moment to moment. Such representations are helpful not only for prediction but also because they allow the disambiguation of ambiguous sensory information and the filling-in of missing information. For example, at a baseball game an event model would represent the location of the baseball while it is hidden in the pitcher’s glove. We have proposed that event models are maintained in lateral prefrontal cortex (PFC). Event models combine current perceptual information with information acquired very recently in the present context, and with patterns of information learned over a lifetime of experience. For example, if you have never seen a baseball game, the first time the pitcher sets up to throw, you may have very little idea where the ball will go. As the pitch count goes up, your expectation that each upcoming pitch will go to home plate increases. However, if you are an experienced baseball fan, each pitch in an at-bat is perceived in the context of an event model informed by relatively stable long-term semantic memory about what happens at ball games. In EST, these long-term weight
Event Perception: A Theory and Its Application to Clinical Neuroscience
based representations are referred to as event schemata. In contrast, event models are activation-based WM representations. So, the content of an event model may overlap at any given time with a particular event schema, but when an event model ceases to have predictive value, it can be rapidly and completely updated to reflect the changing situation. We propose that event schemata as well as event models are implemented by the lateral PFC. A number of studies suggest that representations of events are maintained in the anterior, lateral PFC (e.g., Grafman, 1995; Schwartz et al., 1995; Wood & Grafman, 2003). We review some of this evidence in more detail in Section 6. The exact nature of the interaction between event models and event schemata is currently a topic of active research. So, while event models may be informed by current perceptual information, they can also influence how the perceptual system processes that incoming information (see Figure 1). For example, as described above, information provided by event models allows the visual system to anticipate the flight of a baseball before it is released by the pitcher. However, event models may facilitate processing of all types of sensory information across numerous, distributed brain regions. Perceptual analysis is accomplished by hierarchically organized neural systems specialized for vision, hearing, touch, and the other sensory modalities. For example, in the visual system (Felleman & Van Essen, 1991), information is initially represented in terms of simple local visual features in the early visual areas (V1 and V2, in the posterior occipital cortex). Successive processing stages form representations that are increasingly extended in space and time. Two broad streams process information important for object identification and for motor control relatively separately (Goodale, 1993). 
Features relevant to object identity and category are differentially represented in inferior temporal cortex (IT), whereas features related to motion and grasping are differentially represented in dorsal regions including the human MT complex (MT+) and the posterior superior temporal sulcus (pSTS). Although there is communication between the streams and massive feedback throughout the system, these systems can be described as hierarchically organized, following a rough posterior-to-anterior spatial organization. Many of the classical studies characterizing these perceptual systems were conducted in nonhuman primates and relied on radically simplified stimuli. However, recent neuroimaging studies have shown similar responses in these areas across individuals during movie viewing (Bartels & Zeki, 2004; Hasson, Nir, Levy, Fuhrmann, & Malach, 2004; Hasson, Yang, Vallines, Heeger, & Rubin, 2008). EST proposes that event models bias processing in these streams. As we shall see shortly, EST also proposes that the updating of event models regulates processing over time in these streams. A critical feature of event models is that they need to be protected from moment-to-moment changes in sensory and perceptual information. Updating one’s event model to delete the baseball when it disappeared
from sight would clearly be counterproductive. However, event models have to be updated eventually in order to be useful—the baseball game model will not be helpful at a gas station! The question is, when and how can event models be updated adaptively? EST’s answer is that event models are updated in response to transient increases in prediction error, mediated by systems in the anterior cingulate cortex (ACC) and midbrain neuromodulatory systems. The ACC maintains predictions and constantly compares them to actual inputs, producing an online error signal. Studies have shown this region to be sensitive to the commission of overt errors and to covertly measured cognitive conflict (e.g., Botvinick, Braver, Barch, Carter, & Cohen, 2001) and to the learning of sequential behaviors (Koechlin, Danek, Burnod, & Grafman, 2002; Procyk, Tanaka, & Joseph, 2000). When prediction error increases suddenly, this is detected by monitoring systems in the midbrain, which broadcast a global reset signal to the cortex. This system may include dopamine-based signaling subserved by the substantia nigra (SN) and ventral tegmental area (VTA) and norepinephrine-based signaling subserved by the locus coeruleus (LC). Neurons in the SN and VTA are sensitive to errors in reward prediction (e.g., Schultz, 1998). Dopamine cells in the SN and VTA project broadly to the frontal cortex, both directly and through the striatum, providing a mechanism for a reset signal such as is posited by EST. The LC has been implicated in regulating the sensitivity of an organism to external stimuli (e.g., Usher, Cohen, Servan-Schreiber, Rajkowski, & Aston-Jones, 1999). It also has broad connections to the cortex, these based on norepinephrine rather than dopamine. The reset signal transiently opens an input gate on the event models, exposing them to the early stages of sensory and perceptual processing (see Figure 1). 
This produces a short burst of increased activity in the perceptual processing stream and the event models settle into new states. As the event models are updated, predictions become more adaptive and errors decrease. The system returns to a stable configuration. A schematic representation of the temporal dynamics of the error-based updating process is shown in Figure 2. According to this account, event segmentation is an ongoing concomitant of everyday experience, which happens without intent or necessarily awareness. The processing that occurs at event boundaries can be viewed both as focal attention and as memory updating. An appropriate (stable) event model is a WM buffer whose outputs bias processing in that stream. The opening up of event models’ inputs is a form of focal attention, and the settling into a new state is a form of memory updating. Event segmentation in and of itself is not the goal of the system; instead, it is a by-product of mechanisms evolved in support of a more efficient, predictive perceptual system. Important for thinking about how EST applies to daily experience, it is suggested that event segmentation occurs simultaneously at multiple time scales. Consider the coffee shop example given above. If one’s event model for going to a coffee shop generates predictions consistent with all the distinct units of activity typically involved (e.g., waiting, ordering, paying),
[Figure 2 plot: prediction error (vertical axis) against time (horizontal axis). Annotations: prediction error low, event models stable; an unpredicted change occurs, prediction error increases; event models updated; prediction error returns to low level.]
Figure 2 Temporal dynamics of event segmentation. Most of the time prediction error is relatively low and event models are stable. As a model becomes less adaptive, prediction error increases. In response, information from sensory and perceptual processing is gated into the model, updating its contents. After updating, error declines and the model settles into a new state.
then no error signal would be generated, the model would be stable throughout the episode, and no event boundaries would occur. So, how does EST explain the segmentation of going to a coffee shop into distinct units of activity? We may consider events as hierarchical representations. The event ‘‘going to a coffee shop’’ is at a higher level in the hierarchy than the events ‘‘waiting in line’’ and ‘‘ordering’’. Lower level aspects of an event representation are sensitive to prediction error signals integrated over shorter time scales. So, when it comes time to place an order, the ‘‘waiting in line’’ level of the hierarchical event representation generates some degree of prediction error. That hierarchical level becomes unstable until the ‘‘ordering’’ model is instantiated at which point the error signal decreases. Meanwhile, at a higher level of the event representation, ‘‘going to a coffee shop’’ is insensitive to such short lived error signals. Higher levels are sensitive to error signals integrated over longer time scales. When one leaves the coffee shop, it is likely that there will be a more prolonged increase in error. The resulting prolonged error signal causes instability at a higher level of the hierarchical representation and ‘‘going to a coffee shop’’ is abandoned for a more adaptive model. In accordance with this explanation, we would expect models at higher hierarchical levels to make less specific predictions. Also, we would expect boundaries between events at higher hierarchical levels to align with boundaries between events at lower levels.
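The multi-timescale account above can be illustrated with a toy sketch. This is not the simulation reported by Reynolds, Zacks, and Braver (2007); it simply assumes that each hierarchical level averages prediction error over its own window and registers a boundary when that average first rises above a threshold. The function name `segment`, the window sizes, the threshold, and the error values are all hypothetical choices made for illustration.

```python
def segment(errors, window, threshold):
    """Return the time steps at which a gate opens.

    Assumption (illustrative only): a level integrates prediction error as a
    running mean over its last `window` samples and marks one boundary each
    time that mean rises above `threshold`.
    """
    boundaries = []
    gate_open = False
    for t in range(len(errors)):
        recent = errors[max(0, t - window + 1): t + 1]
        level = sum(recent) / len(recent)
        if level > threshold and not gate_open:
            boundaries.append(t)  # error just crossed the threshold
        gate_open = level > threshold
    return boundaries


# Hypothetical error trace: two brief spikes (e.g., waiting -> ordering,
# ordering -> paying) followed by a prolonged rise (leaving the shop).
errors = [0.1] * 5 + [0.9] + [0.1] * 5 + [0.9] + [0.1] * 5 + [0.9] * 4 + [0.1] * 5

fine = segment(errors, window=1, threshold=0.6)    # short integration window
coarse = segment(errors, window=4, threshold=0.6)  # long integration window
print("fine boundaries:", fine)
print("coarse boundaries:", coarse)
```

With these made-up values the short-window level marks all three changes (steps 5, 11, and 17), whereas the long-window level ignores the brief spikes and marks only the prolonged rise (step 19, two steps after the fine boundary because it integrates over more samples).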
2.1. Prior Evidence
EST makes a number of claims about behavior and brain function, some of which are consistent with previous research and some of which have been tested directly. First, EST predicts that event segmentation is an ongoing part of normal perceptual processing. Evidence for this proposal comes from
behavioral and functional magnetic resonance imaging (fMRI) studies. In a typical event segmentation paradigm, participants watch movies of actors engaged in everyday activities (e.g., doing laundry) and are instructed to press a button whenever they believe one meaningful unit of activity has ended and another has begun (Newtson, 1973). When instructions direct attention to larger (coarse grain) or smaller (fine grain) units of activity, the behavioral data are thought to reflect ongoing event segmentation at higher or lower levels of hierarchical event representation. Studies have demonstrated that segmentation of videos using this method shows both stable intersubject agreement, and stable individual differences over a period of more than a year (Newtson, 1976; Speer, Swallow, & Zacks, 2003; Zacks et al., 2007). Furthermore, observers spontaneously group fine-grained event boundaries into hierarchically organized coarse-grained events (Newtson, 1976; Zacks, Tversky, & Iyer, 2001). That is, coarse grain boundaries tend to correspond to a subset of fine grain boundaries, which supports the view that event segmentation occurs simultaneously at multiple time scales. The reliability and structure of the data from the segmentation task support the suggestion that this paradigm is capturing an ongoing feature of normal perception. Ultimately, however, these results prove only that individuals can segment ongoing experience into units. Evidence that individuals do segment experience in the course of normal day-to-day perception comes from neurophysiological studies. Using fMRI, Zacks et al. (2001) first monitored participants’ brain activity during passive viewing of simple movies. Afterward, participants segmented the movies by indicating whenever, in their view, one meaningful unit of activity had ended and another had begun. 
During passive viewing, a collection of regions transiently increased in activity at those moments that viewers later identified as event boundaries. These regions included areas in lateral posterior cortex (including the inferior and superior temporal sulci and ventral temporal cortex), medial posterior cortex (including the cuneus and precuneus), and lateral frontal cortex. Similar results have been generated using several variations of this general paradigm (Speer, Zacks, & Reynolds, 2007; Speer et al., 2003; Zacks, Swallow, Vettel, & McAvoy, 2006). Second, EST predicts that perceptual processing increases at event boundaries. The fact that brain activity transiently increases at event boundaries is consistent with this prediction—particularly suggestive are the increases in posterior regions associated with perceptual processing. It has been shown that memory for perceptual details at or around event boundaries is better than that for details associated with event middles (Newtson & Engquist, 1976; Schwan, Garsoffky, & Hesse, 2000). Also, EST suggests that if the surface structure of events is consistent with the underlying event structure, then event segmentation mechanisms should operate more efficiently, and again, memory for the episode should improve. This too has
been borne out in the laboratory (e.g., Schwan & Garsoffky, 2004). For example, Boltz (1992) showed participants a feature film with no commercial breaks, breaks that corresponded to event boundaries, or with breaks placed at nonboundaries. Recall of activity and memory for the temporal order of events in the movie was improved by the breaks at event boundaries and reduced by the breaks at nonboundaries. Further support for the suggestion that segmenting events in a manner that corresponds to their intrinsic structure improves memory for those events comes from a study of individual differences. Zacks, Speer, Vettel, and Jacoby (2006) found that group-typical segmentation of movies, which may be assumed to reflect intrinsic structure, predicted better performance on subsequent memory tests after controlling for overall cognitive level. Another prediction of EST is that information associated with the current event model, and thus active in WM, should be more accessible than information associated with a previously active model. When using text material, event boundaries can be induced by imposing a change such as a temporal break (e.g., ‘‘. . .a day later. . .’’) or a shift of spatial location (e.g., ‘‘the detective burst into the room’’). Such shifts result in the perception of an event boundary for films as well (Zacks, Speer, & Reynolds, 2009). Numerous studies using text comprehension have shown results consistent with this prediction (e.g., Bower & Rinck, 2001; Zwaan & Radvansky, 1998). For example, Speer and Zacks (2005) required participants to read narratives and showed that memory for items in the narrative was lower when a temporal break intervened between the mention of the item and the test. Similar results have recently been obtained with movies (Swallow, Zacks, & Abrams, 2009). 
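The hierarchical relation described earlier in this section, in which coarse-grained boundaries tend to coincide with a subset of fine-grained boundaries, can be made concrete with a simple alignment score. The sketch below is an assumption-laden illustration rather than the analysis used in the cited studies: the function name `aligned_fraction`, the tolerance, and the button-press times are all hypothetical.

```python
def aligned_fraction(coarse, fine, tolerance):
    """Fraction of coarse boundaries within `tolerance` seconds of a fine boundary.

    Illustrative measure only; the cited studies may quantify alignment differently.
    """
    if not coarse:
        raise ValueError("need at least one coarse boundary")
    hits = sum(
        1 for c in coarse
        if any(abs(c - f) <= tolerance for f in fine)
    )
    return hits / len(coarse)


# Hypothetical button-press times (seconds) from one viewer.
fine_presses = [4.0, 9.5, 15.0, 21.0, 26.5, 33.0]
coarse_presses = [9.0, 21.5, 30.0]

score = aligned_fraction(coarse_presses, fine_presses, tolerance=1.0)
print(f"aligned fraction: {score:.2f}")
```

Here two of the three coarse boundaries fall within one second of a fine boundary, giving a score of about 0.67. A fuller analysis would also need to compare the observed value with the alignment expected by chance.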
In sum, EST proposes that predictions about the near future are guided by WM representations of the current event, which are updated in response to transient increases in prediction error. This updating includes upregulation of the perceptual processing pathways feeding into event models. The experience of an error spike and consequent updating is perceived as a boundary between meaningful events. Thus, event segmentation is an ongoing perceptual mechanism standing at the center of attention, cognitive control, and memory. It is subserved by a distributed set of brain mechanisms described above (see Figure 1). If one or more of these is selectively affected by a disorder or age-related process, it may have substantial consequences for cognition. In the following sections, we apply EST to the analysis of six conditions in clinical neuroscience. We have selected six conditions based on the overlap between the neurocognitive mechanisms implicated in each and the mechanisms of event segmentation as proposed by EST. The six are: schizophrenia, obsessive-compulsive disorder (OCD), Parkinson’s disease (PD), lesions of the PFC, aging, and AD. Our selections are necessarily heuristic and surely incomplete. However, we think the analysis shows the
potential for EST to provide new insights regarding major cognitive deficits associated with these disorders.
3. Schizophrenia
Schizophrenia is a developmental neurocognitive disorder that affects approximately 1% of adults (Bresnahan et al., 2000). In most cases, it is diagnosed in early adulthood and has consequences throughout adult life. Schizophrenia has classically been characterized by positive symptoms, which include hallucinations, delusions, and paranoia, and by negative symptoms, which include flattened affect, reduced volition, and anhedonia. However, it has become increasingly clear that cognitive impairments are a prominent part of the disease, and that these have profound effects on people’s lives. In a review and meta-analysis, Green, Kern, Braff, and Mintz (2000) examined the relations between cognitive deficits and functional outcomes. The cognitive variables studied included secondary (long-term) verbal memory, immediate verbal memory, executive control, and vigilance. The functional outcome measures included success in psychosocial skill acquisition, social problem-solving, and daily functioning such as occupational functioning and independent living. All of the cognitive variables were related to functional outcomes, accounting for 20–40% of the variance across individuals. Thus, cognitive performance is a major predictor of the ability of persons with schizophrenia to maintain employment, build social ties, and live independently.
The etiology and pathophysiology of schizophrenia are complex and not fully understood. Schizophrenia selectively affects the PFC, as well as the hippocampus and thalamus (Harrison, 1999). The neurotransmitter dopamine has been shown to play a major role in the disorder, though its functions are still not fully known. Dopamine’s role in schizophrenia has been reviewed by Guillin, Abi-Dargham, and Laruelle (2007). Early research focused on the D2 dopamine receptor, which is widely expressed in the midbrain where large numbers of dopamine neurons are found.
In a classic set of studies, Creese, Burt, and Snyder (1976) discovered that the effectiveness of antipsychotic medications in treating the positive symptoms of schizophrenia was correlated with their ability to occupy D2 receptor sites. Current theory holds that abnormally high activity of D2 receptors in the midbrain reduces the effectiveness of glutamate, an excitatory neurotransmitter. More recently, attention has focused on D1 receptors in the PFC. D1 receptors have been found to have reduced activity in the dorsolateral PFC, possibly as a compensatory response to chronic
overstimulation. Reduced D1 responsiveness interferes with inhibitory local signaling based on the neurotransmitter GABA.
3.1. Cognitive Deficits
The cognitive deficits in schizophrenia are specialized rather than global. For example, implicit memory appears to be relatively well preserved (Clare, McKenna, Mortimer, & Baddeley, 1993). However, WM—the ability to store and manipulate information over short durations—is impaired. Barch (2005) reviewed the data on WM impairments in schizophrenia and proposed that a specific component of WM is affected by schizophrenia. Baddeley’s (1986) WM theory proposes that WM is implemented by a set of passive storage systems and a central executive that manages the updating and transformation of information in the storage systems. Barch argued that the data suggest no impairment in the maintenance of auditory information, possible impairment of visuospatial information, and substantial impairment of the central executive. This central executive impairment is associated with functional differences in the PFC. In particular, Barch (2006) and colleagues have proposed that the ability to maintain cognitive representations of task set and use them to guide behavior is impaired in schizophrenia. This conception of the central executive is consistent with Baddeley’s (2000) recent proposal of an episodic buffer, a component of the central executive that maintains integrated multimodal representations of the current behavioral episode.
3.2. Schizophrenia and Event Segmentation
In terms of EST, the neurochemical and cognitive disturbances identified in schizophrenia could produce two different effects on event understanding. If D2 hyperactivity impairs the effectiveness of long-range excitatory projections from the midbrain, this would be expected to impair event model updating. If D1 hypoactivity in the PFC affects the maintenance and use of information, this should be reflected as an impaired ability to maintain information in event models. This proposal fits with the behavioral findings that the central executive may be impaired in schizophrenia. In particular, it is consistent with the proposal that task set representations are affected by the disease. Both of these possibilities—impaired event model updating and impaired maintenance—would be expected to lead to deficits in event segmentation.
There is very little direct evidence on event perception in schizophrenia, but the existing data support the existence of an event segmentation deficit. Zalla, Verlut, Franck, Puzenat, and Sirigu (2004) asked outpatients with schizophrenia and healthy controls to view movies of everyday activities and segment them into fine and coarse events. Patients and controls
identified similar numbers of events, and their fine-grained event boundaries were located in similar locations. However, the patients tended to identify coarse-grained boundaries in normatively incorrect locations. This tendency was correlated with schizophrenic symptomatology. In a pilot study in our laboratory (Zacks & Barch, unpublished data), we replicated the finding that persons with schizophrenia segmented in a less normative fashion than healthy controls. Persons with schizophrenia also showed impaired memory for the temporal order of events, and impaired recognition memory for pictures taken from the events. In the future, it will be important to follow up these results to determine if schizophrenia produces a selective deficit in event model updating or maintenance. If updating is selectively impaired, it may be possible to remediate this by teaching explicit strategies for identifying event boundaries, or by explicitly highlighting event boundaries in texts or films. If event model maintenance is selectively impaired, it may be possible to intervene by teaching explicit strategies to rehearse key information such as characters and task goals, or by providing external memory aids to support maintenance. Finally, in the meta-analysis described above, Green et al. (2000) note that the mechanism by which the cognitive deficits associated with schizophrenia lead to lower functional outcome scores remains unclear. Here, it is interesting to consider that event segmentation mechanisms may be closely related to performance on functional outcome measures. For example, in one functional outcome measure, patients are required to interpret vignettes depicting interpersonal interactions. The ability to form and maintain appropriate event models would seem to be central to this task. This suggests that measures of event segmentation ability might be particularly informative regarding the ability of schizophrenics to function independently in society. 
In sum, cognitive dysfunction is a salient component of schizophrenia and a major predictor of the disease’s impact on a person. Neurophysiological studies implicate the dopaminergic system, including midbrain D2 receptors and prefrontal D1 receptors. Disruption of either system could produce disorders of event segmentation and memory. The limited available evidence supports the existence of such disorders and suggests points of intervention to remediate them.
4. Obsessive-Compulsive Disorder
OCD is a psychiatric condition characterized by persistent intrusive thoughts and compulsive behaviors. The obsessive thoughts often have to do with threats to safety and threats of contamination. Compulsive behaviors often relate to alleviating these threats (e.g., compulsive washing associated with obsessive thoughts about dirt or disease), but in some cases
the behaviors appear to be unrelated to the obsessive concerns (Boyer & Lienard, 2008). For a time, a dominant view of the neurochemical mechanism of OCD was that it was caused by hypoactivity of the neurotransmitter serotonin. This was motivated largely by the finding that the symptoms of OCD were ameliorated by serotonin reuptake inhibitors (SRIs), antidepressant drugs that strengthen the effects of serotonin in the synapse. However, the effects of SRIs are widespread and complex, and some studies that have directly intervened in the action of serotonin have cast doubt on its being the primary causal mechanism. As a result, some attention recently has focused on a possible role of dopamine in OCD (Fornaro et al., 2009). Another possibility that has been proposed is that a circuit involving the orbitofrontal cortex (OFC), the SN, and the basal ganglia is dysregulated, leading innate motor programs to be triggered inappropriately (Rapoport, 1990), or to hypersensitivity of attentional systems to environmental threats (Saxena & Rauch, 2000). Both serotonin and dopamine play important roles in this circuit. Huey et al. (2008) have presented a psychological and neuroanatomical model of OCD that is particularly relevant to the current discussion because of the central role played by event representations. This model suggests that the PFC supports goal-oriented, structured sequences of events (structured event complexes, or SECs). Once this type of event representation is activated, a network of neural systems involved in reward and emotional processing (e.g., OFC, limbic system) supports a motivational signal, experienced as anxiety, that abates upon completion of the SEC. The authors suggest that OCD patients do not experience the relief from anxiety normally associated with completion of an SEC. Obsession is the behavioral manifestation of the neural signal that an SEC has been “left hanging.”
4.1. Cognitive Disturbances
Evans and Leckman (2006) have recently reviewed the epidemiology, symptomatology, and neurophysiology of OCD. They note that the intrusive thoughts associated with OCD are similar to those experienced by healthy controls—they are just more frequent and more difficult to dismiss. Obsessive behaviors vary over the lifespan in healthy persons, being most prominent in early childhood (2–6), at puberty, and after becoming a new parent. Early childhood and puberty are also the peak times of onset of clinical OCD. Together, these patterns suggest that persons with OCD do not have disordered representations of events, objects, or persons; rather they have a disruption in the ability to control the unwanted influence of some of these representations. Evans and Leckman propose that OCD arises from the dysregulation of evolutionarily adaptive systems for threat monitoring and avoidance.
What is the nature of this dysregulation? Current accounts propose hyperactive monitoring of overt behavioral errors or of covert conflict between information processing streams (van Veen & Carter, 2002). Degree of obsessive thought is correlated with errors on the Wisconsin Card Sort Test (WCST). The WCST requires one to sort cards according to the number, shape, or color of objects on the card. One must discover a rule for sorting cards based on feedback, and then adapt one’s performance when the experimenter covertly changes the rule. This task would seem to be quite sensitive to dysfunction in error detection mechanisms. However, there is no evidence that increased WCST errors among OCD patients are related to error or conflict processing, and the relationship could reflect broader cognitive impairments. There are stronger data linking OCD to selective deficits in motor inhibition and response suppression (Evans & Leckman, 2006). However, the strongest data come from studies that have measured neurophysiological responses to situations that produce errors or high conflict. These studies have focused on the ACC. As we have described in Section 2, the ACC is associated with monitoring conflict between information streams, and in EST it is proposed to support the evaluation of prediction error. In one study, Gehring, Himle, and Nisenson (2000) asked persons with OCD and healthy controls to perform the Stroop task. In this task, participants are shown color words printed in various ink colors (either congruent or incongruent with the color named by the word) and asked to name the ink color, rather than to read the word. This requires suppressing a prepotent response to read the word and produces slow responses or errors, depending on the task constraints. Gehring and colleagues used electroencephalography to measure a correlate of error processing, the error-related negativity (ERN), during task performance. 
The ERN is a negative-going wave that is found just after people commit errors in simple cognitive tasks. It is strongest over frontocentral electrodes, and is thought to originate in the ACC. The Stroop task requires a high degree of cognitive control and leads to frequent errors, accompanied by ERNs. In persons with OCD, these responses were exaggerated. In a functional MRI experiment from this group, persons with OCD and controls performed a flanker task, in which they had to respond to the identity of the central character in an array while ignoring the characters on either side. Like the Stroop task, the flanker task produces a conflict between response tendencies driven by the target stimulus information and those driven by the to-be-ignored information. Also, like the Stroop task, it produces many errors. Both the control and patient groups showed increased activity in the dorsal portion of the ACC on trials in which they made errors. However, the OCD patients also showed significant increases in the rostral ACC. Overt errors do not appear to be necessary to produce activation in the ACC, nor to dissociate the neurophysiological response of control and OCD participants to task performance (van Veen & Carter, 2002). In one study, Ursu, Stenger, Shear,
Jones, and Carter (2003) asked participants with OCD and healthy controls to perform a version of the continuous performance task. In this task, participants view a sequence of alphabetic characters and are asked to respond only when a particular two-step subsequence is presented (e.g., an A followed by an X). The stimulus set was constructed such that when the first character (A) appeared, the second (X) was quite likely. This establishes a strong prepotent tendency to respond to the following character. Thus, if the following character is a nontarget (e.g., Y), there is a conflict between the prepotent response and the correct nonresponse. On such trials, persons with OCD showed larger responses in the ACC than controls—even when they successfully withheld their responses. A striking feature of all three of these studies—using three different tasks—is that the behavioral performance of persons with OCD did not differ substantively from the controls. Thus, the neurophysiological markers of exaggerated error or conflict processing were present even when there was no evidence of ‘‘compulsive’’ behavioral performance.
4.2. Obsessive-Compulsive Disorder and Event Segmentation
In terms of EST, we consider three possible pathways by which OCD could be related to event segmentation. One possibility is that compulsive behavior results from attending to or integrating prediction error signals at an abnormally short time scale, which should cause one to experience events as segmented at an abnormally fine grain. Boyer and Lienard (2008) propose that this is the case and that it accounts for the ritualized character of compulsive behaviors. If one attends to events on a very fine time scale, one should neglect their relation to larger events and the larger goals of one’s actions (Vallacher & Wegner, 1987). Boyer and Lienard propose that this shift to a fine grain of event segmentation is adaptive because it occupies WM, which reduces the intrusion of obsessive thoughts. A recent study by Zor et al. (2009) provides support for this possibility. In this study, participants with OCD were videotaped performing activities that formed the basis for their compulsive behavior, for example, filling a pet’s bowl, lighting a cigarette, or blowing one’s nose. For each patient participant, a control participant was videotaped performing the same activities. The low-level actions (e.g., checking the bowl’s position, waving hands) were coded from each videotape. The patient group performed many more actions than the controls and repeated actions more often. Importantly, these “extra” actions tended to be idiosyncratic and apparently nonfunctional, such as waving one’s hands when filling a pet’s bowl. This result suggests that the patients were attending to the activity at a low level that neglected the goal relevance of the individual actions. A straightforward prediction from this proposal is that patients with OCD should segment activity into finer-grained events than control participants. This could be tested using the
behavioral tasks described previously. (One caveat is worth mentioning: Grain of segmentation in explicit event-marking tasks is quite sensitive to instructions and participants’ interpretations of those instructions. These effects presumably affect the output processes involved in performing the explicit task, such as deciding when to press a response key and executing the response, rather than affecting the ongoing segmentation process. So, if comparing patients to controls, one would want to minimize task demands that could affect segmentation grain and use converging measures to help distinguish between differences in the mechanisms of ongoing segmentation and differences in task-specific output processes.) A second possibility is that obsessions and compulsive behavior result from a chronically high prediction error signal or a too-low threshold for error-based gating. (In EST, these two components are not uniquely identifiable, because raising the mean error signal has the same effect on gating as lowering the gating threshold, and vice versa.) This accords with OCD patients’ frequent report that things “just don’t feel right.” It is also consistent with the findings of exaggerated error and conflict signals in the ACC, described above. Exaggerated prediction error responses could result in more frequent activation of the error-based gating mechanism and therefore more frequent event boundaries. At the same time, if prediction error signals are chronically elevated, this would reduce the ability of the error-based updating system to distinguish between intervals of low and high prediction error. This in turn should produce unreliable, idiosyncratic event segmentation. Thus, this proposal predicts that the segmentation of patients with OCD should be more idiosyncratic than that of controls as well as finer grained. 
Failing to segment activity into the proper event units would be expected to reduce the effectiveness of actions and could produce the sorts of interruptions and perseverations reported by Zor et al. (2009). A final possibility is that persons with OCD have disordered event schemata. Schemata are long-term memory representations implemented by synaptic weight changes. These representations reflect commonly activated event models. For example, the event schema representing making toast is built up over a lifetime of experience making toast. Now, suppose that one began performing some nonfunctional perseverative behavior, such as tapping the toaster three times, whenever one made toast. Eventually, the schema for making toast would include this nonfunctional behavior. In this case, the presence of the toaster-tapping in the event schema would represent a source of compulsive behavior, in addition to whatever other sources may exist. Although this does not explain the initial appearance of the compulsive behavior, it may be a mechanism by which such behavior becomes difficult to expunge. In sum, the clinical profile and pathophysiology of OCD suggest it may involve a dysregulation of mechanisms for monitoring error or conflict. Such dysregulation could affect event segmentation directly or indirectly,
through its long-term impacts on event schemata. If so, event segmentation measures may prove valuable for better understanding the mechanisms of OCD or for diagnosing it. We believe this is a fertile area for future research.
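The identifiability point made in this section, that a shift in the mean prediction error and an opposite shift in the gating threshold cannot be told apart, can be illustrated with a toy example. The sketch below assumes a deliberately simplified gate (an event boundary occurs whenever error exceeds a fixed threshold) and uses made-up integer error values in arbitrary units; EST's actual gating mechanism is richer than this.

```python
def gate_events(errors, threshold):
    """Indices at which prediction error exceeds the gating threshold."""
    return [i for i, e in enumerate(errors) if e > threshold]

# Hypothetical prediction-error trace (arbitrary units, illustrative only).
errors = [1, 2, 9, 2, 4, 11, 2, 5]
shift = 4

baseline = gate_events(errors, threshold=8)

# Chronically elevated error signal, threshold unchanged.
elevated_error = gate_events([e + shift for e in errors], threshold=8)

# Normal error signal, threshold lowered by the same amount.
lowered_threshold = gate_events(errors, threshold=8 - shift)

# Elevated error AND raised threshold: the two changes cancel.
both_shifted = gate_events([e + shift for e in errors], threshold=8 + shift)

print(baseline)           # boundaries under normal gating
print(elevated_error)     # more boundaries
print(lowered_threshold)  # the same extra boundaries
print(both_shifted)       # identical to baseline
```

Because `elevated_error` and `lowered_threshold` yield the same boundaries, segmentation behavior alone could not distinguish the two accounts in this toy setting; both simply predict more frequent event boundaries than the baseline.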
5. Parkinson’s Disease
PD is a neurological condition characterized by a disorder of movement (Binder, Hirokawa, & Windhorst, 2009). Patients with PD have tremor, rigidity, postural instability, and bradykinesia (difficulty initiating movements combined with slow movement execution). Bradykinesia is often the most debilitating motor symptom. PD is diagnosed by a cluster of these motor symptoms combined with a finding that the individual responds to medications that increase the effectiveness of dopamine. PD is also associated with nonmotor symptoms including loss of smell, depression, anxiety, autonomic dysfunction, sleep disturbance, and cognitive deficits. The etiology of PD is not fully understood; most likely, PD can arise through multiple pathways (Olanow & Tatton, 1999). It occurs occasionally in middle age, but becomes more prevalent after age 60. The motor symptoms are the result of a dramatic reduction in the projection of dopamine cells from the SN to the striatum. These pathways form part of a set of thalamocortical loops, which are thought to be important for the online control of movement and cognition. However, PD is also associated with more diffuse lesions to subcortical and cortical structures. PD often is accompanied by frank dementia, particularly in the later stage of the disease (Aarsland et al., 2001). Dementia in PD is characterized by deficits in executive control and visuospatial processing, and by personality disorder (particularly depression). The mechanisms of PD dementia are not well understood. The fact that it is not well controlled by dopamine agonist medications suggests that PD dementia may be caused by lesions other than those to the dopamine cells in the SN described above. Because this dementia is relatively global and its mechanisms are not currently well understood, its relevance to event perception is limited. In earlier or milder cases, the cognitive deficits are more focal and therefore may be more informative. 
Thus, we focus here on cognitive deficits in PD patients without dementia.
5.1. Cognitive Deficits
In a comprehensive review, Taylor and Saint-Cyr (1995) described the primary cognitive deficit in PD as a selective impairment in choosing among action plans when the environment provides cues for multiple potential action plans. For example, patients with PD typically are impaired on the WCST. A key characteristic of this task is that the cues provided by the card
underdetermine the correct response. PD patients are also impaired on a version of the Tower of Hanoi task. The Tower of Hanoi is a puzzle in which participants must move a stack of discs of various sizes from one of three pegs to another, subject to two rules: One can only move one disc at a time, and a larger disc can never be placed on a smaller disc. In this case, several moves are possible on each turn and the participant must hold multiple evaluations in mind to select a better move. Taylor and Saint-Cyr propose that the cognitive deficits can be understood neurobiologically in terms of two thalamocortical loops projecting from the SN. Both loops project through the basal ganglia to the cortex, primarily the PFC. Whereas the motor dysfunction in PD may be due to damage to projections from the SN to the caudate nucleus of the basal ganglia, the cognitive deficit may be due to projections from the SN to the putamen, as well as direct projections to the supplementary motor area and the dorsolateral PFC. A recent study focused, in particular, on cognitive deficits in PD patients without dementia (Green et al., 2002). Patients and controls were administered a battery of cognitive tests. Patients had relatively preserved short-term memory span and long-term recognition memory. However, impairments were frequently observed in the WCST and in fluency tasks (e.g., naming as many animals as possible within a 1-min interval). These deficits were interpreted as reflecting damage to the “cognitive” thalamocortical loops. However, patients also were frequently impaired on judgments of line orientation and the acquisition of new verbal memories; these deficits do not fit as well with this interpretation. Persons with PD are impaired at learning new sequential behaviors (Seger, 1994). For example, in the serial reaction time task, participants are cued to press one of several keys by the onset of a light above the key. 
Trials follow each other rapidly, and a repeating sequence of keys can be embedded in the string of trials. Performance improves over time for two reasons: Participants become faster at responding to the light, and they learn to anticipate the sequence of keypresses. This can be seen by contrasting the condition with the repeating sequence to a condition in which each light follows the previous one randomly. Performance in this control condition improves somewhat with practice, but not as much as in the sequential condition. Sequence learning in the serial reaction time task often occurs without participants becoming aware of the repeating sequence, particularly if the sequence is relatively long. Patients with PD show substantially reduced sequence learning in this and related tasks. In sum, in the early stages of PD, there may be a relatively selective deficit due to selective damage to the thalamocortical loops. This may result in impairments of action selection when multiple potential actions are possible, and in learning associations among actions in these conditions.
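The logic of the serial reaction time task described above (a repeating sequence condition contrasted with a random control condition) can be sketched as follows. This is an illustrative sketch only: the function names, the four-key layout, the example sequence, and the reaction times are all hypothetical, and the difference score shown is one simple way to express sequence learning rather than the analysis used in any particular study.

```python
import random
from statistics import mean

def srt_trials(n_trials, sequence=None, n_keys=4, seed=0):
    """Key cued on each trial: a repeating `sequence` if given, else random keys."""
    if sequence is not None:
        return [sequence[i % len(sequence)] for i in range(n_trials)]
    rng = random.Random(seed)
    return [rng.randrange(n_keys) for _ in range(n_trials)]

def sequence_learning_score(rt_random, rt_sequence):
    """Mean RT in the random condition minus mean RT in the sequence condition (ms).

    General practice effects appear in both conditions, so the difference
    isolates the benefit of anticipating the repeating sequence.
    """
    return mean(rt_random) - mean(rt_sequence)

# Hypothetical 10-item repeating sequence and a random control block.
sequence_block = srt_trials(40, sequence=[0, 2, 1, 3, 2, 0, 3, 1, 2, 3])
random_block = srt_trials(40)

# Made-up late-practice reaction times, for illustration only.
print(sequence_learning_score(rt_random=[500, 520], rt_sequence=[430, 450]))
```

On this measure, the reduced sequence learning reported for patients with PD would show up as a smaller difference score than that of controls.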
5.2. Parkinson’s Disease and Event Segmentation
In terms of EST, a primary lesion to the dopaminergic projections from the SN would be expected to produce a deficit in updating event models. The deficits of PD patients in the WCST accord well with this possibility. However, we will see that similar deficits can be produced by lesions to frontal cortex, which we interpret as selectively affecting event model maintenance or event schemata (see Section 2). This task is not well suited to teasing apart event model updating from maintenance. Similarly, impairments in action selection and sequential learning are consistent with a deficit in event model updating, but do not discriminate this possibility from numerous others. A pair of studies by Zalla and colleagues (Zalla et al., 1998, 2000) strongly suggests that event schemata are intact in patients with PD, dissociating their performance from that of patients with prefrontal lesions. In one study (Zalla et al., 1998), patients with PD were given cards describing steps in everyday activities such as toasting bread and going to the movies. On each trial, 20 cards were given, 5 for each of 4 activities. Participants were asked to sort the cards such that the steps for each activity were segregated and ordered. Whereas frontal lesion patients frequently mixed steps from the different activities, PD patients were able to segregate the activities and order the cards. However, their performance was quite slow, and when distractor steps were included (which did not belong to any of the activities), PD patients were less able to set these aside. Zalla et al. concluded that whereas the frontal lobe patients had deficient event representations, the PD patients had intact event knowledge but had difficulty shifting their cognitive set in order to deploy that knowledge efficiently in the task. 
In the second study (Zalla et al., 2000), PD patients and patients with frontal lobe lesions were asked to generate lists of the steps involved in a similar set of everyday activities. Again, the frontal lobe group showed evidence of impaired event knowledge, producing fewer correct steps and failing to place them in the correct order. The PD group showed neither impairment. However, they were less able to identify which steps were important for completing the activity. Zalla et al. interpret this as an impairment in action selection. Little is known about event segmentation in PD. EST predicts that if cortical updating due to dopamine signaling is impaired, then patients with PD should show disorganized event segmentation. This should be evident in segmentation behavior: Patients with PD should show reduced segmentation agreement. Moreover, patients with PD should show reduced evoked brain responses at event boundaries, reflecting reduced updating. This should hold whether normative event boundaries or those identified by the patient are used to estimate the evoked responses.
Thus, the deficits in event understanding observed in PD are consistent with the hypothesis that dopamine-based updating is impaired in this disorder. However, the tasks that have been used thus far do not differentiate this possibility from the possibility that event model maintenance may be impaired. This is an important question for future research.
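The prediction that patients with PD should show reduced segmentation agreement can be made concrete with a simple computation. The sketch below assumes one plausible way of scoring agreement: divide the movie into time bins, mark the bins in which each viewer placed a boundary, and correlate one viewer's binary vector with the proportion of a comparison group who marked each bin. The function names, the 1-s bin size, and the boundary times are hypothetical, and published measures typically add a rescaling step that is omitted here.

```python
from math import ceil, sqrt

def bin_boundaries(times, duration, bin_size=1.0):
    """Binary vector with a 1 in every time bin containing a boundary."""
    n_bins = ceil(duration / bin_size)
    vec = [0] * n_bins
    for t in times:
        vec[min(int(t // bin_size), n_bins - 1)] = 1
    return vec

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def segmentation_agreement(individual, group, duration, bin_size=1.0):
    """Correlate one viewer's boundaries with the group's per-bin proportions."""
    ind = bin_boundaries(individual, duration, bin_size)
    binned = [bin_boundaries(g, duration, bin_size) for g in group]
    norm = [sum(col) / len(binned) for col in zip(*binned)]
    return pearson(ind, norm)

# Hypothetical boundary times (seconds) for a 10-s clip.
group = [[2.1, 5.3, 8.7], [2.4, 5.1, 8.2], [2.9, 5.8]]
typical = segmentation_agreement([2.5, 5.5, 8.5], group, duration=10)
idiosyncratic = segmentation_agreement([0.5, 3.5, 6.5], group, duration=10)
print(typical > idiosyncratic)
```

A viewer whose boundaries fall where most others place theirs scores high; disorganized segmentation of the kind EST predicts for impaired updating would yield lower scores.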
6. Lesions of the Prefrontal Cortex
Lesions to the PFC produce cognitive disturbances that are at once subtle and profound. On the one hand, prefrontal lesions rarely produce dramatic deficits in sensation, perception, or movement control (though lesions to the immediately posterior parts of frontal cortex produce profound motor deficits). On the other hand, prefrontal lesions often produce disorders of intentional action that interfere greatly with everyday functioning. There is a large literature on the cognitive deficits associated with prefrontal lesions (for reviews, see Fuster, 1997; Grafman, 1995). Here, we focus on those aspects of cognition that are most relevant for event understanding.
6.1. Cognitive Deficits
Persons with prefrontal lesions frequently suffer from particular forms of apraxia, or disorder of action. Whereas persons with posterior lesions are more likely to experience apraxias in which they are unable to pick up objects or perform body movements on command, persons with prefrontal lesions often have intact ability to perform simple actions but deficits in the ability to organize these actions effectively. Schwartz and her colleagues have described this as action disorganization syndrome (Schwartz, 2006; Schwartz et al., 1995). One potential cause of action disorganization syndrome is damage to the long-term memory representations supporting structured action. Evidence that such knowledge depends critically on the PFC comes from several sources. Grafman and colleagues have suggested the PFC stores representations of typical actions called structured event complexes (SECs; e.g., Grafman, 1995, 1999; Sirigu et al., 1998). SECs correspond closely to the event schemata described above (see Section 2). They are structured representations that capture information about the actions that make up an activity, their relations, the social structure of the activity, and the activity’s characteristic physical setting and objects. It is posited not only that SECs are stored in PFC, but that they are stored with category-specific localization. Using fMRI, Wood and Grafman (2003) showed that when participants made classification judgments about whether single words belonged to particular semantic categories, PFC activation patterns differed from those observed when judgments were made about whether action phrases
belonged to particular SECs (e.g., going out to dinner). Furthermore, patterns of PFC activation differed depending on whether the classified items were social in nature or not. Similar results have been reported by Zanini and colleagues (Zanini, 2008; Zanini, Rumiati, & Shallice, 2002). One of the factors that distinguish SECs from other types of memory representations is the inclusion of information regarding the sequencing of behaviors over time. For example, Sirigu et al. (1996) examined the selection and temporal organization of actions among normal controls and patients with lesions to either the PFC or to more posterior regions. This study used the same paradigm that Zalla et al. (1998, 2000) used to measure event knowledge in patients with PD (see Section 5). Participants were given cards printed with the steps in a set of four everyday activities, and asked to sort the cards to separate the activities and place the steps in order. Patients with PFC lesions were more likely to place steps out of order, and more likely to intrude steps from one activity into another. Research by Humphreys and colleagues has directly compared action observation and action performance, suggesting that a common deficit in event knowledge can impair both (Humphreys & Forde, 1998; Humphreys, Forde, & Riddoch, 2001). A second potential cause of action disorganization syndrome is disruption to the ability to maintain representations of one’s current actions and goals online. It has been proposed that the PFC maintains representations of one’s current goals and task (e.g., Miller & Cohen, 2001; Mushiake et al., 2009). This proposal is based in part on the finding that PFC neurons exhibit sustained firing during memory and other tasks (e.g., Fuster & Alexander, 1971; Levy & Goldman-Rakic, 2000). In human fMRI studies, sustained activity is found in PFC when participants attempt to maintain information over a delay (Wager & Smith, 2003). 
Some individual cells in monkey PFC are sensitive to which task the monkey is to perform, independent of the sensory input (e.g., Muhammed, Wallis, & Miller, 2006), and in human fMRI experiments PFC is sensitive to the complexity and timescale of task instructions (Koechlin & Summerfield, 2007). Norman and Shallice (1986) proposed a model in which the posterior cortex stores representations of low-level actions, and the PFC is selectively involved when multiple low-level actions compete for activation. In these cases, competition has to be resolved using event knowledge and maintenance of current goals. This theory has been implemented recently as a computational model, which can reproduce the qualitative features of action disorganization syndrome (Cooper & Shallice, 2000, 2006). Another very different computational model proposes that goal maintenance and competition resolution are combined in a single processing framework that uses similarity structure learned from previous experiences to resolve competition (Botvinick & Plaut, 2004, 2006). Although they differ dramatically in their computational architecture, both models propose that knowledge
about event structure and maintenance of current task information is subserved by the PFC. More generally, the available data support the view that PFC is important both for long-term knowledge about events and for the online maintenance of task and goal information. An important open question is whether these two functions are neurophysiologically dissociated. In terms of EST, event knowledge, or SECs, corresponds to event schemata, and current task and goal representations correspond to event models.
6.2. Prefrontal Lesions and Event Segmentation
The data on cognitive deficits associated with prefrontal lesions have two straightforward implications for event segmentation. First, according to EST, impairments to event schemata should reduce one’s ability to use previous experience to form adaptive event models. Thus, disordered event schemata should reduce one’s ability to use knowledge to support WM and long-term memory encoding. This is not a terribly original conclusion; it is one shared with many current theories of WM and long-term memory. More specific to EST, impaired event schemata should negatively affect one’s ability to identify normative event boundaries, particularly for activities that are familiar and thus should have strong support from schemata in control participants. Second, EST proposes that impairments to event models should affect event segmentation and memory because impaired event models should be less effective in biasing predictions. Memory for recently encountered information should be particularly impaired, specifically information encountered within the current event. Segmentation should be broadly impaired. Again, the conclusion that memory should be impaired is not original, but the conclusion that event segmentation should be affected is. Importantly, disruption of event schemata and disruption of event models should produce two qualitatively different event segmentation deficits. Disordered event schemata should produce stronger impairments for more familiar activities, that is, those for which one has a schema. Disordered event models should produce global impairments in event segmentation. More speculatively, one might guess that disordered event schemata would selectively impair segmentation at coarser temporal grains, because coarse-grained segmentation may be more sensitive to top-down influence (Zacks & Tversky, 2001). If both event schemata and event models were impaired, one would expect to see both types of impairment. 
To our knowledge, there has been only one study of event segmentation in patients with frontal lobe lesions (Zalla, Pradat-Diehl, & Sirigu, 2003). In this experiment, participants with PFC lesions and healthy controls segmented two short movies of everyday activities at coarse and fine temporal grains. The patient group did not differ significantly from the controls in their fine segmentation, but their coarse segmentation was less
well ordered and delayed relative to the controls. The fact that coarse segmentation was selectively affected suggests, albeit weakly, that these patients had impaired event schemata. The fact that fine segmentation did not show obvious impairment suggests, again weakly, that the patient group’s event models may have been intact. Clearly, there is a need for more data on the effect of PFC lesions on event segmentation. It would be particularly valuable to vary the familiarity of the activities to be segmented, and to directly compare event knowledge with segmentation. If impairments in segmentation track impairments in event knowledge and both are caused by PFC lesions, this would support EST’s proposal that event schemata subserved by the PFC contribute to forming adaptive event models. It also would be valuable to combine event segmentation measures with measures of the memory functions of event models. The available data strongly suggest that memory for within-event information is impaired by PFC lesions (e.g., Müller & Knight, 2006). If the degree of this impairment tracks impairment in event segmentation, this would support EST’s proposal that event models bias perceptual prediction. More specifically, memory impairments should predict segmentation impairments above and beyond impairments attributable to deficits in event knowledge. In sum, lesions to the PFC are likely to be of profound consequence for event segmentation. Although there are few data that bear directly on this possibility, those that exist are consistent with it. This is important in its own right, but also is important for thinking about other conditions that affect the PFC. We turn now to two such circumstances: adult aging and AD.
7. Aging
While clearly neither a neuropsychological nor a cognitive disorder, normal aging has been associated with a host of changes in brain and behavior. Perhaps the most concrete age-related change is reduction in brain weight and volume. Postmortem studies show that total brain weight declines by about 2% per decade over the progression from early to late adulthood (Kemper, 1994), and in vivo volumetric MRI studies show median correlations between brain volume and age to be about 0.5 (Raz, 1996). However, reduced volume, and age-related changes in general, occur differentially across different brain regions. Here, we will focus on changes in two brain systems that are relevant to event understanding: the PFC and neuromodulatory systems in the midbrain.
7.1. Prefrontal Cortex
Although some regions (e.g., primary sensory cortices) show very little age-associated shrinkage, reduction in PFC volume is severe (e.g., Raz et al., 1997). Perhaps more meaningful, age-associated reductions in synaptic density and dendritic arborization (e.g., Liu, Erikson, & Brun, 1996) and in resting cerebral blood flow (e.g., Shaw et al., 1984) are greatest in the PFC. Evidence that physiological changes are most pronounced in PFC accords with findings showing age-related cognitive deficits specifically in tasks that are thought to depend on the PFC. For example, WM tasks measure ability to maintain information in a readily available state while simultaneously performing other cognitive operations of varying complexity. Numerous studies have shown age-related deficits in WM tasks (e.g., Belleville, Rouleau, & Caza, 1998; Hartman, Dumas, & Nielsen, 2001; Verhaeghen & Salthouse, 1997; see Hasher & Zacks, 1988, for review). Damage to lateral PFC regions impairs performance on a range of WM tasks (e.g., Baldo & Shimamura, 2000; D’Esposito & Postle, 1999; Goldman-Rakic, 1987; Hartley et al., 1998). Also, neuroimaging studies show that, during the retention interval of WM tasks, dorsolateral PFC increases in activity as the degree of concurrent information processing increases (see Cabeza & Nyberg, 2000; D’Esposito & Postle, 1999; D’Esposito et al., 1998, for reviews). Attentional control, which also shows age-related decline, is another specific cognitive function thought to depend on PFC (see Posner & Peterson, 1990, for a review). There are a number of different tasks that are used to measure attentional control. Selective attention tasks require deployment of attention to a particular channel (e.g., left or right ear). 
Focused attention tasks might require maintenance of attention on a particular target or region of space, while divided attention tasks might tap the ability to monitor several stimuli at once, or to rapidly switch attention between multiple targets. Performance on tasks that assess attentional control declines with age. For example, divided attention costs have been shown to be greater in older than in younger adults (e.g., Hartley, 1992, 1993). Hasher, Zacks, and colleagues have presented an inhibition-deficit view of cognitive aging. According to this view, many age-related cognitive deficits are due to a decreased ability to limit access to WM and delete unwanted information from WM (e.g., Hasher & Zacks, 1988; Hasher, Zacks, & May, 1999; Zacks & Hasher, 1994). For example, participants were presented with italicized passages containing distracting text (in regular font) and instructed to read the italicized and ignore the regular font text (Connelly, Hasher, & Zacks, 1991). Older adults showed slower reading times and poorer comprehension, indicating reduced ability to focus attention on only the relevant portions of the text. Providing support for this interpretation, recent results show that older adults actually retain more of the to-be-ignored material as evidenced by
implicit memory tests (Thomas & Hasher, 2009, submitted for publication). Recent work by Hasher and colleagues suggests that declines in prefrontally mediated inhibition of distracting information are responsible for age-related declines in episodic memory (Healey, Campbell, & Hasher, 2008; Stevens, Hasher, Chiew, & Grady, 2008). In sum, the volume and structural integrity of the PFC decline with age. These declines are associated with reduced WM capacity and attentional control.
7.2. Midbrain Neuromodulatory Systems
Neuromodulatory systems whose neurons have cell bodies in the midbrain may undergo significant age-related changes. As described in Section 2, neurons in the anterior LC signal with norepinephrine, project broadly to the forebrain, and may code error signals. These cells show attrition with age (e.g., Chan-Palay & Asan, 1989a,b; McGeer & McGeer, 1989). Evidence of age-related decreases in the dopamine system comes from several findings. First, postmortem studies have shown an age-related decrease in the number of dopamine neurons (Fearnley & Lees, 1991). Also, D2 receptor binding in the striatum has been shown to decline with age (Sakata, Farooqui, & Prasad, 1992). In a particularly relevant study (Volkow et al., 1998), striatal D2 receptor binding in adults ranging from 24 to 86 years of age was assessed using positron emission tomography (PET). A cognitive battery including the WCST was also administered. Consistent with previous findings, D2 receptor binding in caudate and putamen decreased with age. In addition, a significant relationship between receptor binding and cognitive performance remained even after controlling for the effects of age. This strengthens the observed relationship between decreased dopaminergic system activity and cognitive deficits. In sum, midbrain neuromodulatory systems involved in signaling errors show age-related declines, and these may be related to changes in cognitive function.
7.3. Episodic Memory and Situation Model Construction
We have established that age-related differences in WM and attentional control are substantial and have been associated with differences in specific brain structures. Age-related differences in episodic memory are also substantial. However, the medial temporal lobes, which are critical to episodic memory formation (Squire & Zola-Morgan, 1991), undergo minimal change with healthy aging (Head, Snyder, Girton, Morris, & Buckner, 2005; Raz, 2000). One possibility is that age-related declines in episodic memory are due to changes in controlled processing during encoding and retrieval, which may be mediated by the PFC (Healey et al., 2008; Stevens et al., 2008).
Older adults have particular difficulty remembering contextual aspects of studied material. For example, memory is poorer for perceptual details such as the color, case, or font in which target material appeared (e.g., Kausler & Puckett, 1981; Naveh-Benjamin & Craik, 1995), location of target material (e.g., Chalfonte & Johnson, 1996; Uttl & Graf, 1993), temporal order of target material (Dumas & Hartman, 2003; Kausler, Salthouse, & Saults, 1988), and even whether the target material was presented visually or auditorily (Light, La Voie, Valencia-Laver, Albertson-Owens, & Mead, 1992). Accordingly, older adults are also less likely to correctly identify the source of a memory, for example, whether the stimulus was seen or imagined (e.g., Norman & Schacter, 1997). This age-related deficit in source memory has been tied to differences in activity in the PFC (e.g., Swick, Senkfor, & Van Petten, 2006). Although aging is associated with significant deficits in memory for events, particularly for their contextual details, some aspects of event memory show striking preservation. There is evidence that reading and comprehending prose is facilitated by the construction of situation models, and that older adults rely on situation models during comprehension as much as younger adults. Situation models are higher level representations that describe the gist of the situation described in the text (e.g., Zwaan & Radvansky, 1998). For example, reading the sentence “She entered the hotel lobby” might result in the formation of a situation model wherein there is a hotel lobby with a reception desk and elevators, even though these contextual details were not in the text. Rather, they were supplied by semantic memory for what makes up a hotel lobby. Zwaan and Radvansky (1998) distinguish between a current model, which represents the current state of affairs and is updated at boundaries between events, and an integrated model of the current event together with all the previous ones. 
The final integrated model (or complete model) determines later episodic memory. In the terms of EST, current models correspond to event models, and semantic memory for events is provided by event schemata. Studies have shown that these situation models are maintained and updated to similar extents by younger and older adults. For example, in a study by Morrow, Leirer, Altieri, and Fitzsimmons (1994), younger and older participants read narratives that described a protagonist moving from room to room. When reading was interrupted by probe questions about certain objects mentioned in the texts, answers were faster and more accurate for objects that were closer to the protagonist’s current location, for both younger and older adults. This suggests that readers in both age groups maintained spatial situation models that were updated to reflect the protagonist’s current location. While a number of studies on discourse processing show older adults are able to construct and maintain situation models (e.g., Radvansky & Curiel, 1998; Radvansky, Zacks, & Hasher, 1996), some suggest that the use of such models may be more demanding
Event Perception: A Theory and Its Application to Clinical Neuroscience
for older adults (Morrow et al., 1994; Morrow, Stine-Morrow, Leirer, Andrassy, & Kahn, 1997). It is important to distinguish between the proposal that older adults rely heavily on situation models and the proposal that situation model processing is unaffected by aging. The data seem clear that older adults rely at least as heavily on situation models as younger adults. One possibility is that older adults’ construction and use of situation models is relatively intact, and reliance on them is an adaptive response to compensate for deficits in other processing domains (Radvansky & Dijkstra, 2007). However, it is also possible that older adults’ situation models are impaired but still exert a heavy influence on comprehension. This could come about because older adults prioritize global gist in comprehension over the processing of fine details (Stine-Morrow, Gagne, Morrow, & DeWall, 2004). It could also come about because it is difficult to implement comprehension strategies that do not rely heavily on situation models, even if they would be adaptive. In our view, the currently available data provide strong evidence that older adults rely heavily on situation models, but are less convincing in showing that those situation models are not negatively impacted by aging.
7.4. Aging and Event Segmentation
The neurocognitive changes associated with aging make contact with the mechanisms of event segmentation at multiple points. The data reviewed above suggest three ways in which event segmentation may change with aging. The first two possibilities follow directly from the preceding discussion of the effects of PFC lesions on event understanding (see “Frontal Lobe Lesions,” above). First, PFC dysfunction may indicate that event model maintenance is impaired in aging. As illustrated previously, both WM and attentional control are associated with PFC function. Moreover, current theories suggest that attentional control plays a central role in determining WM capacity (Baddeley, 1986; Kane et al., 2004; McCabe, Roediger, McDaniel, Balota, & Hambrick, 2006). But what is attentional control? One view is that attentional control is the ability to maintain task-relevant information in the face of distracting sensory stimulation (Darowski, Helder, Zacks, Hasher, & Hambrick, 2008). Another view is that attentional control is the ability to maintain a representation of one’s current task and goals (Braver & Cohen, 2001; Miller & Cohen, 2001). These proposals lead to the suggestion that changes in the ability to maintain appropriate event models and update them adaptively could be at the core of age-related differences in attentional control, accounting for some of the age differences in cognitive performance. Second, PFC dysfunction may indicate that event schemata are impaired with aging. This is possible, but seems less likely than the possibility that event models are impaired. One reason to doubt that event schemata are
impaired in older adults is that other domains of semantic knowledge, such as those measured by vocabulary tests, show no impairments—rather, older adults often show better performance than younger adults (Verhaeghen, 2003). Further, older adults’ scripts for everyday events do not differ systematically in their structure or content from those of younger adults (Rosen, Caplan, Sheesley, Rodriguez, & Grafman, 2003). Finally, as we have shown above, older adults appear to make as heavy use of situational knowledge as do younger adults in text comprehension and memory. Third, reductions in the efficacy of the D2 or norepinephrine systems could produce deficits in the ability to update event models in response to spikes in prediction error. Deficits in either prediction error calculation or in error-based updating would be expected to introduce noise into the timecourse of event model updating. Although a simple change to the system, such a deficit would have cascading effects: If event models are updated at inappropriate times they will form less adaptive representations of the current situation. These representations should be reflected in poorer comprehension and performance online, and in poorer later memory. We believe that the available data suggest most strongly the possibility of age-related declines in the maintenance of event models, in their updating in response to prediction error spikes, or both. Either possibility predicts that event segmentation should become less reliable and less adaptive with age. Support for this proposal comes from a study using the event segmentation paradigm described above (Zacks, Speer, et al., 2006). Older and younger participants watched movies of actors engaged in everyday activities (e.g., making a bed) and indicated when they believed one natural meaningful unit of activity had ended and another had begun. 
Then participants performed an order memory task in which they were given 12 cards with still pictures taken from each movie, randomly ordered, and asked to sort them into the order in which they had occurred in the movie. Participants also performed a recognition memory task for each movie. On each trial, participants were shown one picture from the movie they had viewed and one picture from a similar movie, and asked to choose the picture from the movie they had seen. Finally, participants also completed a psychometric battery, including a measure of semantic memory for event order, the Picture Arrangement subtest of the WAIS (Wechsler, 1997). In the Picture Arrangement test, participants are given a set of cartoon drawings for a common activity (e.g., going fishing) and asked to sort them into the order in which they typically occur. Thus, whereas the order memory test is a measure of one’s episodic memory for the order of events in a particular experienced activity, the Picture Arrangement test is a measure of semantic knowledge about how events typically unfold. This may be said to measure the accuracy and depth of participants’ event schemata. There were no systematic differences in boundary location between older and younger adults, which allowed calculation of segmentation
[Figure 3. Left panel: order memory error plotted against segmentation agreement. Right panel: recognition memory accuracy plotted against segmentation agreement. Data points are marked separately for healthy older adults and older adults with mild dementia.]
Figure 3 Event segmentation in older adults correlates with memory for event order (left) and recognition memory (right). For recognition memory, this relationship remained statistically significant after controlling for clinical dementia status and psychometric performance. (Data from Zacks, Speer, et al., 2006; Zacks, Swallow, et al., 2006.)
agreement scores by comparing each individual’s segmentation to that of the sample as a whole. Segmentation agreement was lower for older adults than younger adults; in other words, older adults’ segmentation was more variable. Older adults also had poorer order memory and recognition memory. Most importantly, for older adults, after controlling for global psychometric performance, segmentation agreement was significantly correlated with memory scores. Thus, those older adults who identified boundaries in a more normative fashion showed better memory for the movies (see Figure 3). Although further work is needed, it appears that age-related dysfunction of event segmentation mechanisms may be a causal factor in age-related episodic memory problems. Picture Arrangement scores were significantly lower for the older than for the younger adults, and among the older adults these scores correlated with memory scores. One possibility is that in addition to maintenance problems, semantic event schemata information available to inform event models is reduced in older adults. However, as reviewed previously there is some evidence that semantic information about events is preserved with aging. Another possibility is that the variance shared between Picture Arrangement scores and episodic memory scores reflects not knowledge about events, but shared cognitive operations between the Picture Arrangement task and the memory tasks. In particular, both the order memory test and the Picture Arrangement test require participants to sort cards with pictures on them in temporal order. This may depend heavily on WM and attentional control.
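The segmentation agreement score described above can be illustrated with a short sketch. The text states only that each individual's segmentation is compared to that of the sample as a whole; the specific choices here (1-s time bins, a Pearson correlation, a leave-one-out group norm, and the function names `bin_boundaries`, `pearson_r`, and `segmentation_agreement`) are assumptions made for illustration and are not necessarily the procedure used by Zacks, Speer, et al. (2006). The boundary times are invented toy data.

```python
import math

def bin_boundaries(boundary_times, duration, bin_size=1.0):
    """Convert boundary times (in seconds) to a 0/1 list, one entry per time bin."""
    n_bins = math.ceil(duration / bin_size)
    binned = [0.0] * n_bins
    for t in boundary_times:
        binned[min(int(t // bin_size), n_bins - 1)] = 1.0
    return binned

def pearson_r(x, y):
    """Plain Pearson correlation; returns NaN if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)

def segmentation_agreement(all_boundaries, duration, bin_size=1.0):
    """Correlate each viewer's binned boundaries with the norm of the other viewers.

    Assumption: the group norm is the proportion of the *other* viewers who
    marked a boundary in each bin (leave-one-out), so that a viewer's own
    responses do not inflate his or her score.
    """
    binned = [bin_boundaries(b, duration, bin_size) for b in all_boundaries]
    scores = []
    for i, own in enumerate(binned):
        others = [v for j, v in enumerate(binned) if j != i]
        group_norm = [sum(col) / len(others) for col in zip(*others)]
        scores.append(pearson_r(own, group_norm))
    return scores

# Toy data: boundary times (s) marked by four hypothetical viewers of a 10-s movie.
viewers = [
    [2.1, 5.0, 8.2],   # viewers 0-2 segment at similar moments
    [2.4, 5.3, 8.0],
    [2.0, 5.5, 8.9],
    [1.0, 3.7, 6.5],   # viewer 3 segments idiosyncratically
]
scores = segmentation_agreement(viewers, duration=10.0)
```

In this toy sample the three viewers who mark boundaries at similar moments receive high agreement scores and the idiosyncratic viewer receives a low one, which is the sense in which lower agreement reflects more variable segmentation.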
Current work in our laboratory is further exploring the relations between age, event segmentation, and memory (Kurby & Zacks, under review). As noted previously, observers spontaneously group fine-grained events hierarchically into coarse events (Zacks et al. 2001; see Section 2). This hierarchical organization is weakened in older adults compared to younger adults. Moreover, like segmentation agreement, hierarchical organization predicts subsequent memory within the older adult group. In sum, these experiments show that older adults do not segment activity in as reliable or as organized a fashion as younger adults. Across individuals, the ability to segment well predicts later memory performance. This is consistent with EST’s proposal that event models partly determine episodic memory encoding. However, the available data do not do much to tell us which of the systems affected by aging is responsible for the changes in event segmentation and memory performance. We noted previously that event model maintenance and error-based updating are good candidates for mechanisms that undergo changes due to the aging process. In future research, it will be important to test directly which of these mechanisms is affected. One possibility is to measure evoked brain responses at event boundaries during passive viewing. We know that a substantial subset of healthy older adults, when asked to perform an explicit segmentation task, segment events at less normative and less effective points in time. We have proposed that the posterior cortical responses at event boundaries may reflect the consequences of error-based updating (Speer et al., 2003). If older adults have impaired updating, this would predict that older adults with poor event segmentation would show reduced responses in these areas at event boundaries during passive viewing. Alternatively, if event model maintenance is impaired, this would predict intact responses at event boundaries during passive viewing. 
Another possibility is to directly measure the activity of the error signaling system. Current research in our laboratory is characterizing the response of this system in younger adults using fMRI (Kurby, Zacks, & Haroutunian, 2009); in the future we plan to extend these studies to explore age differences.
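The idea that a noisy error signal leads to event models being updated at inappropriate times can be made concrete with a toy simulation. This is a minimal sketch under strong simplifying assumptions, not the Reynolds et al. (2007) model: the "event model" is a single stored number, the prediction error is the absolute difference between the current observation and that number, and impairment is modeled as Gaussian noise added to the error signal. The function name `segment_by_prediction_error` and all parameter values are hypothetical.

```python
import random

def segment_by_prediction_error(observations, threshold, error_noise_sd=0.0, seed=0):
    """Return the time steps at which a toy event model is updated.

    The event model is the value stored at the last update. When the
    (possibly noisy) prediction error exceeds the threshold, the gate
    opens, the model is replaced by the current observation, and an
    event boundary is recorded.
    """
    rng = random.Random(seed)
    model = observations[0]
    boundaries = []
    for t in range(1, len(observations)):
        error = abs(observations[t] - model)
        if error_noise_sd > 0:
            # Hypothetical impairment: the error signal itself is noisy.
            error += rng.gauss(0.0, error_noise_sd)
        if error > threshold:
            model = observations[t]  # error-based updating of the event model
            boundaries.append(t)
    return boundaries

# Three toy "events", each with stable features; the true changes occur at t = 5 and t = 10.
activity = [0.0] * 5 + [3.0] * 5 + [6.0] * 5

intact = segment_by_prediction_error(activity, threshold=1.0)
noisy_runs = [
    segment_by_prediction_error(activity, threshold=1.0, error_noise_sd=2.0, seed=s)
    for s in range(20)
]
```

With a clean error signal the toy system updates exactly at the two true changes. With a noisy error signal, boundaries tend to be spurious, late, or missed, and they differ from run to run, which parallels the lower segmentation agreement reported for older adults. The sketch does not distinguish impaired updating from impaired maintenance; it only illustrates the first.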
8. Alzheimer’s Disease
AD is a progressive neurodegenerative disease associated with old age. The earliest neuropsychological symptoms typically cited are deficits in episodic memory (e.g., Huff et al., 1987; Welsh, Butters, Hughes, Mohs, & Heyman, 1992). However, more recently it has been suggested that attentional control deficits may be observed even earlier in the disease course (e.g., Balota & Faust, 2001; Tse, Balota, Moynan, Duchek, & Jacoby, in press). As the disease progresses, memory is affected more globally
and eventually all higher order cognitive processes break down resulting in symptoms such as disorientation and loss of speech. In some respects, the changes in behavior associated with early AD resemble accelerations in the changes associated with normal aging (Storandt & Beaudreau, 2004). For example, episodic memory problems are associated with both normal aging and AD, and it is primarily the more severe memory loss that distinguishes AD. In fact, a recent study showed that 20–40% of a sample of healthy older adults had the neuropathological markers of AD and that even in this sample, the degree to which these markers were present at autopsy was correlated with premorbid cognitive function (Price et al., 2009). This raises interesting questions about the relationship between the cognitive and brain changes associated with normal aging and those associated with early-stage AD. However, cognitive deficits that are qualitatively unique to AD have also been identified (e.g., Johnson, Storandt, & Balota, 2003). In any case, research has shown a clear pattern of brain changes and cognitive deficits associated with AD.
8.1. Brain Changes and Cognitive Deficits
Definitive diagnosis of AD requires postmortem identification of characteristic intraneuronal neurofibrillary changes (tangles) and extracellular amyloid deposits (plaques) in the brain. Through postmortem examination of healthy and diseased brains, Braak and Braak (1991) identified six stages of AD development on the basis of the distribution pattern of neurofibrillary tangles (NFTs) and neuropil threads (NTs). Stage I is associated with the appearance of NFTs and NTs in the entorhinal cortex in the medial temporal lobe. In stage II, the hippocampus is also affected. Stages III and IV are marked by denser accumulation of markers in these areas and some spreading to other limbic structures. In stage V, neocortical association areas are affected and by stage VI primary cortical areas are affected as well. The potential causal relationship between the appearance of these neuropathological markers and the clinical course of AD is complex and not fully understood. For example, although amyloid plaques are commonly thought to be causally related to AD (e.g., Hardy & Higgins, 1992), they are found in significant percentages of cognitively normal older adults (e.g., Arriagada, Marzloff, & Hyman, 1992; Mintun et al., 2006; Price et al., 2009; Sperling et al., 2009). However, there is little doubt that the progression of AD is marked by the accumulation of these markers in specific areas (e.g., Berg et al., 1998; Martin et al., 1987; Price & Morris, 2004), and that the presence of these markers in a particular region is associated with neuronal dysfunction in that region (e.g., Berg et al., 1998; Hardy, 2002). For example, Kanne, Balota, McKeel, Storandt, and Morris (1998) showed evidence that accumulation of cored senile plaques (late-stage amyloid deposits) in specific brain areas was associated with deficits on specific cognitive tasks believed to
involve those areas. A large sample of participants with mild and very mild AD completed a cognitive test battery. A factor analysis identified three factors: a mental control/frontal factor, a memory-verbal/temporal factor, and a visuospatial/parietal factor. Forty-one of these participants came to autopsy an average of 5.1 years after testing. The relative density of senile plaques in each region was correlated with performance on that region’s putative corresponding psychometric factor. This study provides some support for the idea that the cognitive changes associated with AD provide indicators of which structures are accumulating neuropathological markers and failing in their functional duties. Further support comes from imaging techniques that allow antemortem examination of AD-related brain changes. In vivo amyloid deposition can be examined using a radiological contrast compound (C-PIB) that binds specifically to amyloid plaques and can be imaged using PET. For example, Klunk et al. (2004) showed that AD is associated with C-PIB uptake in the frontal cortex, particularly the medial portion, in temporal and occipital cortices, and in the striatum as well. Using fluorodeoxyglucose PET (FDG-PET) to examine patterns of glucose metabolism in the brain, the authors also showed that these regions were associated with reduced glucose metabolism. Subsequently, Buckner et al. (2005) presented converging measures showing AD pathology in a similar network of brain regions. In addition to atrophy in the MTL, early AD was associated with atrophy (as identified by structural MRI), amyloid deposition, and reduced metabolism in precuneus, posterior cingulate, and lateral temporal and parietal regions. It is noteworthy that atrophy in the MTL and precuneus was observed in very early stages, and even in healthy converters who were not diagnosed until later. Work involving radiological contrast compounds that bind to NFTs is in very early stages of development.
Already, there is some evidence that binding of compounds with an affinity for both plaques and tangles across temporal, parietal, posterior cingulate, and frontal regions differentiates between normal controls and AD patients better than FDG-PET or brain volume as measured by MRI (Small et al., 2006). As the in vivo imaging of amyloid plaques and NFTs improves, a clearer picture of the relationship between the accumulation of these markers in specific areas and the clinical course of AD will emerge (for more see Hardy & Higgins, 1992; Price & Morris, 2004). The general progression of AD neuropathology identified by Braak and Braak (1991), from medial temporal structures, throughout the limbic system, cortical association areas, and eventually to the entire neocortex is supported by imaging studies of brain volume (Devanand et al., 2007; Henneman et al., 2009) and metabolism (Dickerson & Sperling, 2008; Li et al., 2008). This is in keeping with the observation of episodic memory deficits in early AD (e.g., Huff et al., 1987; Welsh et al., 1992). However,
there is also evidence suggesting that the precuneus shows atrophy, and the medial frontal cortex accumulates amyloid very early in the disease course (e.g., Buckner et al., 2005). Both of these regions have been associated with attention (e.g., Mao, Zhou, Zhou, & Han, 2007; Nagahama et al., 1999; Thienel et al., 2009). Accordingly, deficits in attentional control are observed in very early-stage AD (Perry & Hodges, 1999; Rizzo, Anderson, Dawson, Myers, & Ball, 2000; Tse et al., in press) and even identify healthy older adults who will subsequently convert to AD (e.g., Balota et al., in press; Twamley, Ropacki, & Bondi, 2006). Although the neurophysiological correlates of changes in attention in AD are not currently well understood (Hirao et al., 2005; Johnson et al., 1998), the literature does indicate that changes in attention and the precuneus, as well as changes in memory and the MTL, may characterize early and even preclinical AD. Recently, researchers have been particularly interested in a network of regions that show greater activity during rest or in passive control conditions than during focused cognitive tasks. These include a set of midline regions in the anterior and posterior cortex and regions in lateral parietal cortex. Dubbed the “default mode network” (DMN; Raichle et al., 2001), this network has been proposed to subserve a set of tasks performed on an ongoing basis to sustain normal functioning. Interestingly, the brain regions identified above as particularly vulnerable to early amyloid deposition (i.e., MTL, medial parietal and prefrontal areas) show considerable overlap with the DMN. The DMN appears to increase in activity during episodic and autobiographical memory retrieval, and decrease in activity when attention to external stimuli is required (e.g., Shulman et al., 1997; Svoboda, McKinnon, & Levine, 2006; Wagner, Shannon, Kahn, & Buckner, 2005).
Within the DMN, AD patients show increased amyloid accumulation and disrupted neural activity, for example, decreased connectivity (e.g., Bai et al., 2008; Buckner et al., 2005; Greicius, Srivastava, Reiss, & Menon, 2004). Even in older adults without dementia, high levels of amyloid deposition in the DMN have been associated with abnormal neural activity in this network during memory tasks as measured by fMRI (Sperling et al., 2009). While work relating the DMN to AD is in early stages of development, results to date support the connection between biomarker deposition in the DMN and cognitive dysfunction observed in AD. In sum, evidence suggests that the MTL and the precuneus are affected earliest in the course of AD, followed by other cortical regions such as the posterior cingulate, temporoparietal region and the medial frontal cortex (e.g., Buckner et al., 2005). These brain changes correspond, at least partially, to the cognitive changes in the disease: Episodic memory and attention are selectively affected early on; further deterioration in these areas is observed in the middle stages, and in the late stages cognition is globally impaired.
8.2. Alzheimer’s Disease and Event Segmentation
This progression suggests that the effects of early-stage AD on event segmentation should resemble exaggerated versions of the effects of aging. Event segmentation itself may be little affected by selective lesions to the MTL memory system. However, such lesions predict that event segmentation has an exaggerated effect on memory accessibility. Among healthy adults, the ability to remember details from a narrative is reduced if the narrative includes a change likely to trigger an event boundary (e.g., temporal or spatial shift) since the mention of such details (e.g., Speer & Zacks, 2005). Given the importance of the MTL for retrieval of items no longer maintained in WM, or no longer in the current event model, we would expect even poorer memory for details requiring retrieval across event boundaries among early AD patients. There is also reason to believe that AD-related neuropathology in medial posterior regions, particularly the precuneus and the posterior cingulate, would have negative consequences for event segmentation mechanisms. As described previously, research in our laboratory suggests that these regions are part of a network involved in event segmentation, which shows transient increases when perceivers experience event boundaries during comprehension (see Section 2 above). We suggest that these posterior regions may be important either for detecting changes in the various dimensions that define events (e.g., time, space, actors, goals), or in providing inputs to event models when error-based gating mechanisms update a current event model. Either way, AD-related dysfunction in the posterior cingulate and precuneus might be expected to interfere with the updating of event models that no longer provide accurate predictions. Given that event models serve to guide attention, this could manifest as the type of attention problems observed in very early AD.
Although we have focused on how AD-related brain changes might affect event segmentation mechanisms, it is also possible that such mechanisms might be preserved, particularly earlier in the disease course. This possibility is supported by the fact that there is relatively little overlap between the brain regions associated with EST (see Figure 1) and those affected by early-stage AD pathology described above. Previous work in our laboratory with older adults, both healthy and with very mild AD, suggests that individual differences in event segmentation predict event memory independently of clinical dementia status (Zacks, Speer, et al., 2006). Work is currently underway, using larger sample sizes, which will enable us to ask whether the strength of the relationship between event segmentation and event memory varies across levels of clinical dementia status. If this relationship is as strong among early-stage AD patients as among healthy older adults, this would suggest that some mechanisms of event segmentation are independent of those degraded in the early stages of the disease. The finding that mechanisms of event segmentation
are robust against the moderate neural lesions of early-stage AD would have an important clinical application: Event segmentation would be an attractive target for training to remediate memory deficits. One possibility is that deliberate attention to event segmentation itself will improve memory encoding. In addition, imaging data will afford the opportunity to ask whether structural integrity in certain brain regions mediates the relation between event segmentation and memory. According to EST, effects in PFC would suggest that early dementia affects either the formation of event models or the use of event knowledge. Effects in posterior cortex would suggest early dementia affects either the processes of detecting an event boundary or of updating an event model. In the later stages of AD, damage to neural integrity is widespread, and deficits in cognition are comparably broad. Early in the disease progression, the encoding of new memories is affected but the retrieval of previously learned material is preserved (e.g., Huff et al., 1987; Welsh et al., 1992). As the disease progresses, access to autobiographical memories declines. In the later stages, even the most overlearned semantic associations are lost. At this point, in addition to the frontal maintenance problems discussed above, it is likely that reliable event schemata are no longer available or accessible. Accordingly, the perceptual guidance provided by event models is likely to be severely limited. This represents a fundamental breakdown of the event segmentation system and would have wide ranging deleterious consequences such as those observed in advanced AD, for example, disorientation. However, at this stage in the disease, global cognitive function has deteriorated to the point where drawing connections to EST may be of limited value. In sum, the brain changes associated with early AD may lead to attention and memory problems by way of disruption of event segmentation mechanisms. 
Alternatively, it may be that event segmentation abilities, or certain aspects thereof, are relatively well preserved in AD. In the latter case, clinical efforts to maximize the cognitive burden carried by particularly well-preserved event segmentation mechanisms may reduce attention and memory problems. Work is currently underway that will begin to address these possibilities.
9. Conclusions
We have reviewed a complex and diverse set of clinical neuroscientific circumstances, and there are many more we have had to leave to the side for lack of space. A heuristic overview of the pattern of deficits we have observed is provided in Table 1. We would like to emphasize that the set of mechanisms we have examined, as well as the set of conditions, is selective. For example, we have not discussed the role of the medial temporal episodic memory system (Cohen & Eichenbaum, 1995) in event understanding and
Table 1. Overview of Potential Event Segmentation Mechanism Impairments.
Condition | Sensory-perceptual processing | Prediction monitoring | Error-based updating | Event models | Event schemata
Schizophrenia | 0 | 0 | + | ++ | 0
Obsessive-compulsive disorder | 0 | + | + | 0 | +
Parkinson’s disease | 0 | 0 | + | 0 | 0
Frontal lobe lesions | 0 | 0 | 0 | + | ++
Aging | 0 | 0 | + | + | 0
Alzheimer’s disease | 0 | + | + | + | +
+: Suggestive evidence for impairment; ++: strong evidence for impairment; 0: not yet tested.
memory, nor have we considered persons who experience anterograde amnesia after damage to this system. For none of the conditions we have examined do we find evidence for deficits in sensory-perceptual processing. However, in other conditions—for example, visual form agnosia or motion blindness—deficits in sensory-perceptual processing are clearly evident and likely have important consequences for event segmentation. We believe the picture that emerges from this review underwrites a strong message: The mechanisms of event segmentation provide a valuable framework for understanding cognitive dysfunction. This provides an exciting leverage point for clinical diagnosis and treatment. People, including those of us who are aging or coping with a neurological or neuropsychiatric condition, tend to care about their ability to comprehend the everyday events around them, to remember those events later, and to plan adaptive actions. Theory-driven interventions that may improve event comprehension and memory have the potential to substantially improve quality of life. As we have described throughout the chapter, researchers coming from a range of theoretical perspectives are applying such interventions to a range of clinical problems. We are hopeful that the current chapter illustrates how EST may contribute to this effort. However, the basic science base underlying such interventions needs extending on at least two fronts. First, there is an urgent need for many more data on event understanding in clinical populations and in healthy aging. One can draw inferences about the mechanisms of event segmentation from the available data concerning attention, memory, and performance. However, such inferences are necessarily weak and invite direct verification. Second, there is a need for formal models that make fine-grained predictions about the consequences of specific neurological changes for specific aspects of event segmentation and memory. 
An initial step in this direction was taken with the computational model of Reynolds et al. (2007). This model was a connectionist implementation of the core architecture of EST. It would be valuable to extend this model to produce moment-by-moment predictions for event perception and memory. Virtual lesions could then be applied to the model, and the model’s performance could be directly compared with that of patients from the groups discussed here. Such comparisons would provide powerful means to constrain theories of event understanding and to characterize the cognitive dysfunction in these conditions. Clearly, there is much work to be done. We believe this is an exciting time for researchers studying deficits in higher cognition. New landscapes of theory and methods are opening up—the lens of event segmentation that we have applied here can encompass only a small field of view over this terrain. Basic scientists who wish to better understand how people comprehend and remember the everyday events that make up their lives have a lot to gain by taking up this exploration. Those with disorders of event perception also stand to benefit from this endeavor.
ACKNOWLEDGMENTS
Preparation of this chapter was supported in part by NIH grants R01-MH70674 and R01-AG031150 and by NSF grant BCS-0236651, all to Jeff Zacks. The authors thank Dave Balota, Jordan Grafman, Joe Magliano, G. A. Radvansky, and Rose Zacks for helpful comments on the manuscript.
REFERENCES
Aarsland, D., Andersen, K., Larsen, J., Lolk, A., Nielsen, H., & Kragh-Sorensen, P. (2001). Risk of dementia in Parkinson’s disease: A community-based, prospective study. Neurology, 56, 730–736.
Arriagada, P. V., Marzloff, K., & Hyman, B. T. (1992). Distribution of Alzheimer type pathologic changes in nondemented elderly individuals matches the pattern in Alzheimer’s disease. Neurology, 42, 1681–1688.
Baddeley, A. (1986). Dementia and working memory. Quarterly Journal of Experimental Psychology, 38A, 603–618.
Baddeley, A. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4(11), 417–423.
Bai, F., Zhang, Z., Yu, H., Shi, Y., Yuan, Y., Zhu, W., et al. (2008). Default-mode network activity distinguishes amnestic type mild cognitive impairment from healthy aging: A combined structural and resting-state functional MRI study. Neuroscience Letters, 438, 111–115.
Baldo, J. V., & Shimamura, A. P. (2000). Spatial and color working memory in patients with lateral prefrontal cortex lesions. Psychobiology, 28, 156–167.
Balota, D. A., & Faust, M. (2001). Attention in dementia of the Alzheimer’s type. In F. Boller & S. Cappa (Eds.), Handbook of neuropsychology, Vol. 6 (pp. 51–80). Elsevier Science No. 2.
Balota, D. A., Tse, C. S., Hutchison, K. A., Spieler, D. H., Duchek, J. M., & Morris, J. C. (2010). Predicting conversion to dementia of the Alzheimer type in a healthy control sample: The power of errors in Stroop color naming. Psychology and Aging, 25, 208–218.
Barch, D. M. (2005). The relationships between cognition, motivation and emotion in schizophrenia: How much and how little we know. Schizophrenia Bulletin, 31, 875–881.
Barch, D. (2006). What can research on schizophrenia tell us about the cognitive neuroscience of working memory? Neuroscience, 139, 73–84.
Bartels, A., & Zeki, S. (2004). Functional brain mapping during free viewing of natural scenes. Human Brain Mapping, 21, 75–85.
Belleville, S., Rouleau, N., & Caza, N. (1998). Effect of normal aging on the manipulation of information in working memory. Memory & Cognition, 26(3), 572–583.
Berg, L., McKeel, D. W., Miller, J. P., Storandt, M., Rubin, E. H., Morris, J. C., et al. (1998). Clinicopathologic studies in cognitively healthy aging and Alzheimer’s disease: Relation of histologic markers to dementia severity, age, sex, and apoE genotype. Archives of Neurology, 55(3), 326–335.
Binder, M. D., Hirokawa, N., & Windhorst, U. (2009). Encyclopedia of neuroscience. Berlin, Heidelberg: Springer.
Boltz, M. (1992). Temporal accent structure and the remembering of filmed narratives. Journal of Experimental Psychology: Human Perception and Performance, 18, 90–105.
Botvinick, M., Braver, T., Barch, D., Carter, C., & Cohen, J. (2001). Conflict monitoring and cognitive control. Psychological Review, 108(3), 624–652.
Event Perception: A Theory and Its Application to Clinical Neuroscience
Botvinick, M. M., & Plaut, D. C. (2004). Doing without schema hierarchies: A recurrent connectionist approach to routine sequential action and its pathologies. Psychological Review, 111, 394–429.
Botvinick, M., & Plaut, D. (2006). Short-term memory for serial order: A recurrent neural network model. Psychological Review, 113, 201–233.
Bower, G. H., & Rinck, M. (2001). Selecting one among many referents in spatial situation models. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 81–98.
Boyer, P., & Lienard, P. (2008). Ritual behavior in obsessive and normal individuals: Moderating anxiety and reorganizing the flow of action. Current Directions in Psychological Science, 17, 291–294.
Braak, H., & Braak, E. (1991). Neuropathological stageing of Alzheimer-related changes. Acta Neuropathologica (Berlin), 82, 239–259.
Braver, T. S., & Cohen, J. D. (2001). Working memory, cognitive control, and the prefrontal cortex: Computational and empirical studies. Cognitive Processing, 2, 25–55.
Bresnahan, M. A., Brown, A. S., Schaefer, C. A., Begg, M. D., Wyatt, R. J., & Susser, E. S. (2000). Incidence and cumulative risk of treated schizophrenia in the prenatal determinants of schizophrenia study. Schizophrenia Bulletin, 26, 297–308.
Buckner, R. L., Snyder, A. Z., Shannon, B. J., LaRossa, G., Sachs, R., Fotenos, A. F., et al. (2005). Molecular, structural, and functional characterization of Alzheimer’s disease: Evidence for a relationship between default activity, amyloid, and memory. Journal of Neuroscience, 25, 7709–7717.
Cabeza, R., & Nyberg, L. (2000). Imaging cognition II: An empirical review of 275 PET and fMRI studies. Journal of Cognitive Neuroscience, 12, 1–47.
Chalfonte, B. L., & Johnson, M. K. (1996). Feature memory and binding in younger and older adults. Memory & Cognition, 24, 403–416.
Chan-Palay, V., & Asan, E. (1989a). Alterations in catecholamine neurons of the locus coeruleus in senile dementia of the Alzheimer type and in Parkinson’s disease with and without dementia and depression. Journal of Comparative Neurology, 287(3), 373–392.
Chan-Palay, V., & Asan, E. (1989b). Quantitation of catecholamine neurons in the locus coeruleus in human brains of normal young and older adults and in depression. Journal of Comparative Neurology, 287(3), 357–372.
Clare, L., McKenna, P. J., Mortimer, A. M., & Baddeley, A. D. (1993). Memory in schizophrenia: What is impaired and what is preserved? Neuropsychologia, 31(11), 1225–1241.
Cohen, N. J., & Eichenbaum, H. (1995). Memory, amnesia, and the hippocampal system. Cambridge, MA: MIT Press.
Connelly, S. L., Hasher, L., & Zacks, R. T. (1991). Age and reading: The impact of distraction. Psychology and Aging, 6, 533–541.
Cooper, R., & Shallice, T. (2000). Contention scheduling and the control of routine activities. Cognitive Neuropsychology, 17, 297–338.
Cooper, R., & Shallice, T. (2006). Hierarchical schemas and goals in the control of sequential behavior. Psychological Review, 113, 887–916.
Creese, I., Burt, D. R., & Snyder, S. H. (1976). Dopamine receptor binding predicts clinical and pharmacological potencies of antischizophrenic drugs. Science, 19, 481–483.
Darowski, E. S., Helder, E., Zacks, R. T., Hasher, L., & Hambrick, D. Z. (2008). Age-related differences in cognition: The role of distraction control. Neuropsychology, 22, 638–644.
D’Esposito, M., Aguirre, G. K., Zarahn, E., Ballard, D., Shin, R. K., & Lease, J. (1998). Functional MRI studies of spatial and non-spatial working memory. Cognitive Brain Research, 7, 1–13.
D’Esposito, M., & Postle, B. R. (1999). The dependence of span and delayed response performance on prefrontal cortex. Neuropsychologia, 37, 1303–1315.
Devanand, D. P., Pradhaban, G., Liu, X., Khandji, A., De Santi, S., Segal, S., et al. (2007). Hippocampal and entorhinal atrophy in mild cognitive impairment: Prediction of Alzheimer disease. Neurology, 68(11), 828–836.
Dickerson, B. C., & Sperling, R. A. (2008). Functional abnormalities of the medial temporal lobe memory system in mild cognitive impairment and Alzheimer’s disease: Insights from functional MRI studies. Neuropsychologia, 46, 1624–1635.
Dumas, J., & Hartman, M. (2003). Adult age differences in temporal and item memory. Psychology and Aging, 3, 573–586.
Elman, J. L. (2009). On the meaning of words and dinosaur bones: Lexical knowledge without a lexicon. Cognitive Science, 33, 547–582.
Enns, J., & Lleras, A. (2008). What’s next? New evidence for prediction in human vision. Trends in Cognitive Sciences, 12, 327–333.
Evans, D. W., & Leckman, J. F. (2006). Origins of obsessive-compulsive disorder: Developmental and evolutionary perspectives. In D. Cicchetti & D. Cohen (Eds.), The handbook of developmental psychopathology (2nd ed.). New York: Wiley.
Faust, M. E., Balota, D. A., & Spieler, D. H. (2001). Building episodic connections: Changes in episodic priming with age and dementia. Neuropsychology, 15(4), 626–637.
Fearnley, J. M., & Lees, A. J. (1991). Ageing and Parkinson’s disease: Substantia nigra regional selectivity. Brain, 114, 2283–2301.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1, 1–47.
Fornaro, M., Gabrielli, F., Albano, C., Fornaro, S., Rizzato, S., Mattei, C., et al. (2009). Obsessive-compulsive disorder and related disorders: A comprehensive survey. Annals of General Psychiatry, 8, 13.
Fuster, J. M. (1997). The prefrontal cortex: Anatomy, physiology, and neuropsychology of the frontal lobe. Philadelphia: Lippincott-Raven.
Fuster, J. M., & Alexander, G. E. (1971). Neuron activity related to short term memory. Science, 173, 652–654.
Gehring, W. J., Himle, J., & Nisenson, L. G. (2000). Action-monitoring dysfunction in obsessive-compulsive disorder. Psychological Science, 11(1), 1–6.
Goldman-Rakic, P. S. (1987). Circuitry of primate prefrontal cortex and regulation of behavior by representational memory. In F. Plum & V. Mountcastle (Eds.), Handbook of Physiology, Vol. 5 (pp. 373–417). Bethesda, MD: American Physiological Society.
Goodale, M. A. (1993). Visual pathways supporting perception and action in the primate cerebral cortex. Current Opinion in Neurobiology, 3, 578–585.
Grafman, J. (1995). Similarities and distinctions among current models of prefrontal cortical functions. Annals of the New York Academy of Sciences, 769, 337–368.
Grafman, J. (1999). Experimental assessment of adult frontal lobe function. In B. L. Miller & J. L. Cummings (Eds.), The human frontal lobes: Functions and disorders (pp. 321–344). New York: Guilford Press.
Green, M., Kern, R., Braff, D., & Mintz, J. (2000). Neurocognitive deficits and functional outcome in schizophrenia: Are we measuring the “right stuff”? Schizophrenia Bulletin, 26, 119–136.
Greicius, M. D., Srivastava, G., Reiss, A. L., & Menon, V. (2004). Default-mode network activity distinguishes Alzheimer’s disease from healthy aging: Evidence from functional MRI. Proceedings of the National Academy of Sciences of the United States of America, 101, 4637–4642.
Guillin, O., Abi-Dargham, A., & Laruelle, M. (2007). Neurobiology of dopamine in schizophrenia. International Review of Neurobiology, 78, 1–39.
Hardy, J. (2002). The amyloid hypothesis of Alzheimer’s disease: Progress and problems on the road to therapeutics. Science, 297, 353–356.
Hardy, J. A., & Higgins, G. A. (1992). Alzheimer’s disease: The amyloid cascade hypothesis. Science, 256, 184–185.
Harrison, P. J. (1999). The neuropathology of schizophrenia: A critical review of the data and their interpretation. Brain, 122, 593–624.
Hartley, A. A. (1992). Attention. In F. I. M. Craik & T. A. Salthouse (Eds.), The handbook of aging and cognition (pp. 3–49). Hillsdale, NJ: Lawrence Erlbaum Associates.
Hartley, A. A. (1993). Evidence for the selective preservation of spatial selective attention in old age. Psychology and Aging, 8, 371–379.
Hartley, A. A., Speer, N., Jonides, J., Reuter-Lorenz, P., Smith, E. E., Marshuetz, C., et al. (1998). Do age-related impairments in specific working memory systems result in greater reliance on the central executive? In: Cognitive Neuroscience Society annual meeting abstract program: A supplement of the Journal of Cognitive Neuroscience, 88.
Hartman, M., Dumas, J., & Nielsen, C. (2001). Age differences in updating working memory: Evidence from the delayed matching to sample task. Aging, Neuropsychology, and Cognition, 8, 14–35.
Hasher, L., & Zacks, R. T. (1988). Working memory, comprehension and aging: A review and a new view. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (pp. 193–225). San Diego, CA: Academic Press.
Hasher, L., Zacks, R. T., & May, C. P. (1999). Inhibitory control, circadian arousal, and age. In D. Gopher & A. Koriat (Eds.), Attention & performance, XVII, cognitive regulation of performance: Interaction of theory and application (pp. 653–675). Cambridge, MA: MIT Press.
Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., & Malach, R. (2004). Intersubject synchronization of cortical activity during natural vision. Science, 303(5664), 1634–1640.
Hasson, U., Yang, E., Vallines, I., Heeger, D. J., & Rubin, N. (2008). A hierarchy of temporal receptive windows in human cortex. Journal of Neuroscience, 28, 2539–2550.
Head, D., Snyder, A. Z., Girton, L. E., Morris, J. C., & Buckner, R. L. (2005). Frontal–hippocampal double dissociation between normal aging and Alzheimer’s disease. Cerebral Cortex, 15, 732–739.
Healey, M. K., Campbell, K. L., & Hasher, L. (2008). Cognitive aging and increased distractibility: Costs and potential benefits. Progress in Brain Research, 169, 353–363.
Henneman, W. J. P., Sluimer, J. D., Barnes, J., van der Flier, W. M., Sluimer, I. C., Fox, N. C., et al. (2009). Hippocampal atrophy rates in Alzheimer disease: Added value over whole brain volume measures. Neurology, 72, 999–1007.
Hirao, K., Ohnishi, T., Hirata, Y., Yamashita, F., Mori, T., Moriguchi, Y., et al. (2005). The prediction of rapid conversion to Alzheimer’s disease in mild cognitive impairment using regional cerebral blood flow. Neuroimage, 28(4), 1014–1021.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195(1), 215–243.
Huey, E. D., Zahn, R., Krueger, F., Moll, J., Kapogiannis, D., Wasserman, E. M., et al. (2008). A psychological and neuroanatomical model of obsessive-compulsive disorder. The Journal of Neuropsychiatry and Clinical Neurosciences, 20, 390–408.
Huff, F. J., Becker, J. T., Bell, S. H., Nebbes, R. D., Holland, A. L., & Boller, F. (1987). Cognitive deficits and clinical diagnosis of Alzheimer’s disease. Neurology, 37, 1119–1124.
Humphreys, G. W., & Forde, E. M. E. (1998). Disordered action schema and action disorganisation syndrome. Cognitive Neuropsychology, 15, 771–811.
Humphreys, G. W., Forde, E. M. E., & Riddoch, M. J. (2001). The planning and execution of everyday actions. In The handbook of cognitive neuropsychology: What deficits reveal about the human mind (pp. 565–589). Philadelphia: Psychology Press.
Johnson, D. K., Storandt, M., & Balota, D. A. (2003). A discourse analysis of logical memory recall in normal aging and in dementia of the Alzheimer’s type. Neuropsychologia, 17, 82–92.
Johnson, K. A., Jones, K., Holman, B. L., Becker, J. A., Spiers, P. A., Satlin, A., et al. (1998). Preclinical prediction of Alzheimer’s disease using SPECT. Neurology, 50, 1563–1571.
Kane, M., Hambrick, D., Tuholski, S., Wilhelm, O., Payne, T., & Engle, R. (2004). The generality of working memory capacity: A latent-variable approach to verbal and visuospatial memory span and reasoning. Journal of Experimental Psychology: General, 133, 189–217.
Kanne, S. M., Balota, D. A., McKeel, D., Storandt, M., & Morris, J. (1998). Relating anatomy to function in Alzheimer’s disease: Neuropsychological profiles predict regional neuropathology five years later. American Academy of Neurology, 50, 979–985.
Kausler, D. H., & Puckett, J. M. (1981). Adult age differences in memory for sex of voice. Journal of Gerontology, 36, 44–50.
Kausler, D. H., Salthouse, T. A., & Saults, J. S. (1988). Temporal memory over the adult lifespan. American Journal of Psychology, 101, 207–215.
Kemper, T. L. (1994). Neuroanatomical and neuropathological changes during aging and in dementia. In M. L. Albert & E. J. E. Knoepfel (Eds.), Clinical neurology of aging (2nd ed., pp. 3–67). New York: Oxford University Press.
Klunk, W. E., Engler, H., Nordberg, A., Wang, Y., Blomqvist, G., Holt, D. P., et al. (2004). Imaging brain amyloid in Alzheimer’s disease with Pittsburgh compound B. Annals of Neurology, 55, 306–319.
Koechlin, E., Danek, A., Burnod, Y., & Grafman, J. (2002). Medial prefrontal and subcortical mechanisms underlying the acquisition of motor and cognitive action sequences in humans. Neuron, 35(2), 371–381.
Koechlin, E., & Summerfield, C. (2007). An information theoretical approach to prefrontal executive function. Trends in Cognitive Sciences, 11, 229–235.
Kurby, C. A., & Zacks, J. M. (2008). Segmentation in the perception and memory of events. Trends in Cognitive Sciences, 12, 72–79.
Kurby, C. A., & Zacks, J. M. (under review). Age differences in the perception of hierarchical structure in events. Journal of Experimental Psychology: Human Perception and Performance.
Kurby, C. A., Zacks, J. M., & Haroutunian, N. (2009). Event boundaries and everyday clairvoyance. Poster presentation (#3156) at the annual meeting of the Psychonomic Society, Boston, MA.
Levy, R., & Goldman-Rakic, P. S. (2000). Segregation of working memory functions within the dorsolateral prefrontal cortex. Experimental Brain Research, 133, 23–32.
Li, Y., Rinne, J. O., Mosconi, L., Pirraglia, E., Rusinek, H., DeSanti, S., et al. (2008). Regional analysis of FDG and PIB-PET images in normal aging, mild cognitive impairment, and Alzheimer’s disease. European Journal of Nuclear Medicine and Molecular Imaging, 35(12), 2169–2181.
Light, L. L., La Voie, D., Valencia-Laver, D., Albertson-Owens, S. A., & Mead, G. (1992). Direct and indirect measures of memory for modality in younger and older adults. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1284–1297.
Liu, X., Erikson, C., & Brun, A. (1996). Cortical synaptic changes and gliosis in normal aging, Alzheimer’s disease and frontal lobe degeneration. Dementia, 7, 128–134.
Mao, L., Zhou, B., Zhou, W., & Han, S. (2007). Neural correlates of covert orienting of visual spatial attention along vertical and horizontal dimensions. Brain Research, 1136(1), 142–153.
Marr, D., & Ullman, S. (1981). Directional selectivity and its use in early visual processing. Proceedings of the Royal Society B: Biological Sciences, 211(1183), 151–180.
Martin, E., Wilson, R., Penn, R., Fox, J. H., Clasen, R. A., & Savoy, S. M. (1987). Cortical biopsy results in Alzheimer’s disease: Correlation with cognitive deficits. Neurology, 37(7), 1201–1204.
McCabe, D. P., Roediger, H. L., McDaniel, M. M., Balota, D. A., & Hambrick, D. Z. (2006). The relationship between working memory capacity and frontal-lobe functioning: An adult life span study. In: Biennial Cognitive Aging Conference. Atlanta, GA.
McGeer, P. L., & McGeer, E. G. (1989). Amino acid neurotransmitters. In G. J. Siegel, B. W. Agranoff, R. W. Albers, & P. W. Molinoff (Eds.), Basic neurochemistry: Molecular, cellular, and medical aspects (pp. 311–332). New York: Raven.
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202.
Mintun, M. A., Larossa, G. N., Sheline, Y. I., Dence, C. S., Lee, S. Y., Mach, R. H., et al. (2006). [11C]PIB in a nondemented population: Potential antecedent marker of Alzheimer disease. Neurology, 67, 446–452.
Morrow, D. G., Leirer, V. O., Altieri, P. A., & Fitzsimmons, C. (1994). Age differences in creating spatial mental models from narratives. Language and Cognitive Processes, 9, 203–220.
Morrow, D. G., Stine-Morrow, E. A. L., Leirer, V. O., Andrassy, J. M., & Kahn, J. (1997). The role of reader age and focus of attention in creating situation models from narratives. Journal of Gerontology: Psychological Sciences, 52B, 73–80.
Muhammad, R., Wallis, J. D., & Miller, E. K. (2006). A comparison of abstract rules in the prefrontal cortex, premotor cortex, inferior temporal cortex, and striatum. Journal of Cognitive Neuroscience, 18, 974–989.
Müller, N., & Knight, R. (2006). The functional neuroanatomy of working memory: Contributions of human brain lesion studies. Neuroscience, 139, 51–58.
Mushiake, H., Sakamoto, K., Saito, N., Inui, T., Aihara, K., & Tanji, J. (2009). Involvement of the prefrontal cortex in problem solving. International Review of Neurobiology, 85, 1–11.
Nagahama, N., Okada, T., Katsumi, Y., Hayashi, T., Yamauchi, H., Sawamoto, N., et al. (1999). Transient neural activity in the medial superior frontal gyrus and precuneus time locked with attention shift between object features. NeuroImage, 10, 193–199.
Naveh-Benjamin, M., & Craik, F. I. M. (1995). Memory for context and its use in item memory: Comparisons of younger and older persons. Psychology and Aging, 10, 284–293.
Newtson, D. (1973). Attribution and the unit of perception of ongoing behavior. Journal of Personality and Social Psychology, 28, 28–38.
Newtson, D. (1976). Foundations of attribution: The perception of ongoing behavior (pp. 223–248). Hillsdale, New Jersey: Lawrence Erlbaum Associates.
Newtson, D., & Engquist, G. (1976). The perceptual organization of ongoing behavior. Journal of Experimental Social Psychology, 12, 436–450.
Norman, D. A., & Shallice, T. (1986). Attention to action: Willed and automatic control of behaviour. In R. Davidson, G. Schwartz, & D. Shapiro (Eds.), Consciousness and self regulation: Advances in research and theory, Vol. 4 (pp. 1–18). New York: Plenum.
Norman, K. A., & Schacter, D. L. (1997). False recognition in older and younger adults: Exploring the characteristics of illusory memories. Memory & Cognition, 25, 838–848.
Olanow, C., & Tatton, W. (1999). Etiology and pathogenesis of Parkinson’s disease. Annual Review of Neuroscience, 22, 123–144.
Perry, R. J., & Hodges, J. R. (1999). Attention and executive deficits in Alzheimer’s disease: A critical review. Brain, 122, 383–404.
Posner, M. I., & Peterson, S. E. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13, 25–42.
Price, J. L., & Morris, J. C. (2004). So what if tangles precede plaques? Neurobiology of Aging, 25, 721–723.
Price, J. L., McKeel, D. W., Jr, Buckles, V. D., Roe, C. M., Xiong, C., Grundman, M., et al. (2009). Neuropathology of nondemented aging: Presumptive evidence for preclinical Alzheimer disease. Neurobiology of Aging, 30(7), 1026–1036.
Procyk, E., Tanaka, Y. L., & Joseph, J. P. (2000). Anterior cingulate activity during routine and non-routine sequential behaviors in macaques. Nature Neuroscience, 3(5), 502–508.
Radvansky, G. A., & Curiel, J. M. (1998). Narrative comprehension and aging: The fate of completed goal information. Psychology and Aging, 13, 69–79.
Radvansky, G. A., & Dijkstra, K. (2007). Aging and situation model processing. Psychonomic Bulletin & Review, 14, 1027–1042.
Radvansky, G. A., Zacks, R. T., & Hasher, L. (1996). Fact retrieval in younger and older adults: The role of mental models. Psychology and Aging, 11, 258–271.
Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., & Shulman, G. L. (2001). A default mode of brain function. Proceedings of the National Academy of Sciences of the United States of America, 98, 676–682.
Rapoport, J. L. (1990). Obsessive compulsive disorder and basal ganglia dysfunction. Psychological Medicine, 20, 465–469.
Raz, N. (1996). Neuroanatomy of aging brain: Evidence from structural MRI. In E. D. Bigler (Ed.), Neuroimaging II: Clinical applications (pp. 153–182). New York: Academic Press.
Raz, N. (2000). Aging of the brain and its impact on cognitive performance: Integration of structural and functional findings. Handbook of Aging and Cognition, 2, 1–90.
Raz, N., Gunning, F. M., Head, D., Dupuis, J. H., McQuain, J. M., Briggs, S. D., et al. (1997). Selective aging of human cerebral cortex observed in vivo: Differential vulnerability of the prefrontal gray matter. Cerebral Cortex, 7, 268–282.
Reynolds, J. R., Zacks, J. M., & Braver, T. S. (2007). A computational model of event segmentation from perceptual prediction. Cognitive Science, 31, 613–643.
Rizzo, M., Anderson, S. W., Dawson, J., Myers, R., & Ball, K. (2000). Visual attention impairments in Alzheimer’s disease. Neurology, 54, 1954–1959.
Rosen, V., Caplan, L., Sheesley, L., Rodriguez, R., & Grafman, J. (2003). An examination of daily activities and their scripts across the adult lifespan. Behavioral Research Methods, Instruments & Computers, 35, 32–48.
Sakata, M., Farooqui, S. M., & Prasad, C. (1992). Post transcriptional regulation of loss of rat striatal D2 dopamine receptor during aging. Brain Research, 575, 309–314.
Saxena, S., & Rauch, S. L. (2000). Functional neuroimaging and the neuroanatomy of obsessive-compulsive disorder. Psychiatric Clinics of North America, 23, 563–586.
Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80, 1–27.
Schultz, W., & Dickinson, A. (2000). Neuronal coding of prediction errors. Annual Review of Neuroscience, 23, 473–500.
Schwan, S., & Garsoffky, B. (2004). The cognitive representation of filmic event summaries. Applied Cognitive Psychology, 18, 37–55.
Schwan, S., Garsoffky, B., & Hesse, F. W. (2000). Do film cuts facilitate the perceptual and cognitive organization of activity sequences? Memory & Cognition, 28(2), 214–223.
Schwartz, M. F. (2006). The cognitive neuropsychology of everyday action and planning. Cognitive Neuropsychology, 23, 202–221.
Schwartz, M. F., Montgomery, M. W., Fitzpatrick-DeSalme, E. J., Ochipa, C., Coslett, H., & Mayer, N. (1995). Analysis of a disorder of everyday action. Cognitive Neuropsychology, 12, 863–892.
Seger, C. A. (1994). Implicit learning. Psychological Bulletin, 115, 163–196.
Shaw, T. G., Mortel, K. F., Meyer, J. S., Rogers, R. L., Hardenberg, J., & Cutaia, M. M. (1984). Cerebral blood flow changes in benign aging and cerebrovascular disease. Neurology, 34, 855–862.
Shulman, G. L., Fiez, J. A., Corbetta, M., Buckner, R. L., Miezin, F. M., Raichle, M. E., et al. (1997). Common blood flow changes across visual tasks: Decreases in cerebral cortex. Journal of Cognitive Neuroscience, 9, 648–663.
Sirigu, A., Cohen, L., Zalla, T., Pradat-Diehl, P., VanEeckhout, P., Grafman, J., et al. (1998). Distinct frontal regions for processing sentence syntax and story grammar. Cortex, 34, 771–778.
Sirigu, A., Zalla, T., Pillon, B., Grafman, J., Agid, Y., & Dubois, B. (1996). Encoding of sequence and boundaries of scripts following prefrontal lesions. Cortex, 32, 297–310.
Small, G. W., Kepe, V., Ercoli, L. M., Siddarth, P., Bookheimer, S. Y., Miller, K. J., et al. (2006). PET of brain amyloid and tau in mild cognitive impairment. New England Journal of Medicine, 355, 2652–2663.
Speer, N. K., Swallow, K. M., & Zacks, J. M. (2003). Activation of human motion processing areas during event perception. Cognitive, Affective & Behavioral Neuroscience, 3, 335–345.
Speer, N. K., & Zacks, J. M. (2005). Temporal changes as event boundaries: Processing and memory consequences of narrative time shifts. Journal of Memory and Language, 53, 125–140.
Speer, N. K., Zacks, J. M., & Reynolds, J. R. (2007). Human brain activity time-locked to narrative event boundaries. Psychological Science, 18, 449–455.
Sperling, R. A., LaViolette, P. S., O’Keefe, K., O’Brien, J., Rentz, D. M., Pihlajamaki, M., et al. (2009). Amyloid deposition is associated with impaired default network function in older persons without dementia. Neuron, 63, 178–188.
Squire, L., & Zola-Morgan, S. (1991). The medial temporal lobe memory system. Science, 253, 1380–1386.
Stevens, W. D., Hasher, L., Chiew, K. S., & Grady, C. L. (2008). A neural mechanism underlying memory failure in older adults. Journal of Neuroscience, 28(48), 12820–12824.
Stine-Morrow, E. A. L., Gagne, D. D., Morrow, D. G., & DeWall, B. (2004). Age differences in rereading. Memory & Cognition, 32, 696–710.
Storandt, M., & Beaudreau, S. (2004). Do reaction time measures enhance diagnosis of early-stage dementia of the Alzheimer type? Archives of Clinical Neurology, 19, 119–124.
Svoboda, E., McKinnon, M. C., & Levine, B. (2006). The functional neuroanatomy of autobiographical memory: A meta-analysis. Neuropsychologia, 44, 2189–2208.
Swallow, K. M., & Zacks, J. M. (2008). Sequences learned without awareness can orient attention during the perception of human activity. Psychonomic Bulletin & Review, 15(1), 116–122.
Swallow, K. M., Zacks, J. M., & Abrams, R. A. (2009). Event boundaries in perception affect memory encoding and updating. Journal of Experimental Psychology: General, 138, 236–257.
Swick, D., Senkfor, A. J., & Van Petten, C. (2006). Source memory retrieval is affected by aging and prefrontal lesions: Behavioral and ERP evidence. Brain Research, 1107(1), 161–176.
Taylor, A. E., & Saint-Cyr, J. A. (1995). The neuropsychology of Parkinson’s disease. Brain and Cognition, 28, 281–296.
Thienel, R., Voss, B., Kellermann, T., Reske, M., Halfter, S., Sheldrick, A. J., et al. (2009). Nicotinic antagonist effects on functional attention networks. International Journal of Neuropsychopharmacology, 12(10), 1295–1305.
Tse, C. S., Balota, D. A., Moynan, S. C., Duchek, J. M., & Jacoby, L. L. (2010). The utility of placing recollection in opposition to familiarity in early discrimination of healthy aging and very mild dementia of the Alzheimer’s type. Neuropsychology, 24(1), 49–67.
Twamley, E. W., Ropacki, S. A., & Bondi, M. W. (2006). Neuropsychological and neuroimaging changes in preclinical Alzheimer’s disease. Journal of the International Neuropsychological Society, 12, 707–735.
Underwood, B. J. (1957). Interference and forgetting. Psychological Review, 64, 49–64.
Ursu, S., Stenger, V. A., Shear, M. K., Jones, M. R., & Carter, C. S. (2003). Overactive action monitoring in obsessive-compulsive disorder: Evidence from functional magnetic resonance imaging. Psychological Science, 14, 347–353.
Usher, M., Cohen, J. D., Servan-Schreiber, D., Rajkowski, J., & Aston-Jones, G. (1999). The role of the locus coeruleus in the regulation of cognitive performance. Science, 283, 549–554.
Uttl, B., & Graf, P. (1993). Episodic spatial memory in adulthood. Psychology and Aging, 8, 257–273.
Vallacher, R. R., & Wegner, D. M. (1987). What do people think they’re doing? Action identification and human behavior. Psychological Review, 94, 3–15.
van Veen, V., & Carter, C. S. (2002). The anterior cingulate as a conflict monitor: fMRI and ERP studies. Physiology & Behavior, 77, 477–482.
Verhaeghen, P. (2003). Aging and vocabulary score: A meta-analysis. Psychology and Aging, 18, 332–339.
Verhaeghen, P., & Salthouse, T. A. (1997). Meta-analyses of age-cognition relations in adulthood: Estimates of linear and non-linear age effects and structural models. Psychological Bulletin, 122, 231–249.
Volkow, N. D., Gur, R. C., Wang, G. J., Fowler, J. S., Moberg, P. J., Ding, Y. S., et al. (1998). Association between decline in brain dopamine activity with age and cognitive and motor impairment in healthy individuals. American Journal of Psychiatry, 155(3), 344–349.
Wager, T. D., & Smith, E. E. (2003). Neuroimaging studies of working memory: A meta-analysis. Cognitive, Affective & Behavioral Neuroscience, 3, 255–274.
Wagner, A. D., Shannon, B. J., Kahn, I., & Buckner, R. L. (2005). Parietal lobe contributions to episodic memory retrieval. Trends in Cognitive Sciences, 9, 445–453.
Wechsler, D. (1997). Wechsler Adult Intelligence Scale (3rd ed.). San Antonio, TX: The Psychological Corporation.
Welsh, K. A., Butters, N., Hughes, J. P., Mohs, R. C., & Heyman, A. (1992). Detection and staging of dementia in Alzheimer’s disease: Use of the neuropsychological measures developed for the consortium to establish a registry for Alzheimer’s disease. Archives of Neurology, 49, 448–452.
Wood, J. N., & Grafman, J. (2003). Human prefrontal cortex: Processing and representational perspectives. Nature Reviews Neuroscience, 4, 139–147.
Zacks, J. M., Braver, T. S., Sheridan, M. A., Donaldson, D. I., Snyder, A. Z., Ollinger, J. M., et al. (2001). Human brain activity time-locked to perceptual event boundaries. Nature Neuroscience, 4, 651–655.
Zacks, J., Speer, N., & Reynolds, J. R. (2009). Segmentation in reading and film comprehension. Journal of Experimental Psychology: General, 138, 307–327.
Zacks, J. M., Speer, N. K., Swallow, K. M., Braver, T. S., & Reynolds, J. R. (2007). Event perception: A mind/brain perspective. Psychological Bulletin, 133, 273–293.
Zacks, J. M., Speer, N. K., Vettel, J. M., & Jacoby, L. L. (2006). Event understanding and memory in healthy aging and dementia of the Alzheimer type. Psychology and Aging, 21, 466–482.
Zacks, J. M., Swallow, K. M., Vettel, J. M., & McAvoy, M. P. (2006). Visual motion and the neural correlates of event perception. Brain Research, 1076, 150–162.
Zacks, J. M., & Tversky, B. (2001). Event structure in perception and conception. Psychological Bulletin, 127, 3–21.
Zacks, J. M., Tversky, B., & Iyer, G. (2001). Perceiving, remembering, and communicating structure in events. Journal of Experimental Psychology: General, 130, 29–58.
Zacks, R. T., & Hasher, L. (1994). Directed ignoring: Inhibitory regulation of working memory. In D. Dagenbach & T. H. Carr (Eds.), Inhibitory mechanisms in attention, memory, and language (pp. 241–264). New York, NY: Academic Press.
Zalla, T., Pradat-Diehl, P., & Sirigu, A. (2003). Perception of action boundaries in patients with frontal lobe damage. Neuropsychologia, 41, 1619–1627.
Zalla, T., Sirigu, A., Pillon, B., Dubois, B., Agid, Y., & Grafman, J. (2000). How patients with Parkinson’s disease retrieve and manage cognitive event knowledge. Cortex, 36, 163–179.
Zalla, T., Sirigu, A., Pillon, B., Dubois, B., Grafman, J., & Agid, Y. (1998). Deficient in evaluating pre-determinated sequences of script events in patients with Parkinson’s disease. Cortex, 34, 621–628.
Event Perception: A Theory and Its Application to Clinical Neuroscience
299
Zalla, T., Verlut, I., Franck, N., Puzenat, D., & Sirigu, A. (2004). Perception of dynamic action in patients with schizophrenia. Psychiatry Research, 128, 39–51. Zanini, S. (2008). Generalised script sequencing deficits following frontal lobe lesions. Cortex, 44, 140–149. Zanini, S., Rumiati, R., & Shallice, T. (2002). Action sequencing deficit following frontal lobe lesion. Neurocase, 8, 88–99. Zor, R., Keren, H., Hermesh, H., Szechtman, H., Mort, J., & Eilam, D. (2009). Obsessivecompulsive disorder: A disorder of pessimal (non-functional) motor behavior. Acta Psychiatrica Scandinavica, 120, 288–298. Zwaan, R. A., & Radvansky, G. A. (1998). Situation models in language comprehension and memory. Psychological Bulletin, 123(2), 162–185.
CHAPTER EIGHT
Two Minds, One Dialog: Coordinating Speaking and Understanding
Susan E. Brennan, Alexia Galati, and Anna K. Kuhlen

Contents
1. Introduction: The Joint Nature of Language Processing
2. Dialog: Beyond Transcripts
3. Process Models of Dialog
3.1. The Message Model
3.2. Two-Stage Models
3.3. The Collaborative View and the Grounding Model
4. The Role of Cues in Grounding
5. Partner-Specific Processing
5.1. Global and Local Adaptations
5.2. Speakers Adapt Utterances for Their Addressees
5.3. Addressees Adapt Utterance Interpretations to Speakers
5.4. Simple or “One-Bit” Partner Models
6. Neural Bases of Partner-Adapted Processing
6.1. Mirroring
6.2. Theory of Mind
6.3. Distinguishing a Partner’s Perspective from One’s Own: The Role of Executive Control
6.4. Mentalizing Versus Mirroring
6.5. Cues Hypothesized to Support Partner-Adapted Processing
7. Conclusions
Acknowledgments
References
Abstract
In this chapter, we consider communication as a joint activity in which two or more interlocutors share or synchronize aspects of their private mental states and act together in the world. We summarize key experimental evidence from our own and others’ research on how speakers and addressees take one another into account while they are processing language. Under some circumstances, production and comprehension are adjusted to a partner’s perspective or characteristics in the early moments of processing, in a flexible and probabilistic fashion. We advocate studying the coordination and integration of cognitive products and processes both between and within the minds of interlocutors. We then discuss recent evidence from electrophysiology and imaging studies (relevant to Theory of Mind and to mirroring) that has begun to illuminate brain networks that underlie the coordination of joint and individual processing during communication.

Psychology of Learning and Motivation, Volume 53. ISSN 0079-7421, DOI: 10.1016/S0079-7421(10)53008-1. © 2010 Elsevier Inc. All rights reserved.
1. Introduction: The Joint Nature of Language Processing

The scientific study of language has been shaped by the assumption that the human language faculty evolved for thinking rather than for communicating (e.g., Chomsky, 1965, 1980). This “language-as-product” tradition takes language itself as the object of study, focusing on grammatical knowledge and the core processes for recovering linguistic structure from sentences. This common focus has given generations of psycholinguists and other cognitive scientists license to concentrate on the study of the linguistic representation and processing in the mind and brain of a lone (and largely generic) native speaker, independent of context. As a result, a great deal is known about how individuals store, organize, and access knowledge in the mental lexicon; how individuals parse sentences and resolve syntactic ambiguity; and how individuals plan and articulate utterances.

But there is more to language processing than these (seemingly) autonomous processes, as has been demonstrated by those who work within the “language-as-action” tradition (e.g., Brennan & Clark, 1996; Clark, 1992; Clark & Wilkes-Gibbs, 1986; Fussell & Krauss, 1989, 1991, 1992; Glucksberg, Krauss, & Weisberg, 1966; Hanna, Tanenhaus, & Trueswell, 2003; Krauss, 1987; Schober & Clark, 1989).

Consider three students, Leah, Dale, and Adam, who are trying to recall a scene from an excerpt of a movie¹ that they recently watched together, in which the protagonist is forced to wear an odd and embarrassing object:

...
Leah: um. . . then he gets punished or whatever?
Dale: what was that, a wreath or—
Leah: yeah it was some kind of browny—
Adam: yeah it was some kind of straw thing or something
Leah: mhm
Dale: around his neck
Leah: so that everybody knew what he did or something?
Adam: straw wreath
Dale: yeah
. . .
(excerpted from Brennan & Ohaeri, 1999)

¹ The scene comes from a John Sayles movie, The Secret of Roan Inish.
Even though this transcript bears little resemblance to the idealized sentences typical of playwrights’ scripts, psycholinguists’ stimuli, or linguists’ grammaticality judgments, it unfolds in an orderly way. The three partners rapidly succeed in establishing consensus as they share a focus of attention, cue one another’s memories, and ratify one another’s proposals about what to include in the product they are constructing together: their joint memory of the event. In doing this, they even complete one another’s utterances. The product represented by this transcript reflects a process by which both memory recall and speaking are grounded in action conducted jointly, rather than achieved by minds working alone. Such data from studies of language-as-action (Clark, 1992; Tanenhaus & Trueswell, 2004) focus on language use in physical or communicative contexts. This particular spontaneous exchange comes from a large corpus recorded in an experimental study of collaborative recollection (Ekeocha & Brennan, 2008). It is so typical of everyday conversation as to seem rather unremarkable, and yet at the same time it displays a level of coordination between partners that is astonishing in its virtuosity.

There is a growing trend within cognitive science to examine human cognition in social contexts, either pairwise or in small groups. This includes recall of memories (e.g., Ekeocha & Brennan, 2008; Harris, Paterson, & Kemp, 2008; Hollingshead, 1998; Weldon & Bellinger, 1997), collaborative visual search (e.g., Brennan, Chen, Dickinson, Neider, & Zelinsky, 2007; Neider, Chen, Dickinson, Brennan, & Zelinsky, 2005), decision making (e.g., Kiesler & Sproull, 1992; Wiley & Jensen, 2006), learning (e.g., Wiley & Bailey, 2006), two-person motor activities (e.g., Sebanz, Bekkering, & Knoblich, 2006; Sebanz & Knoblich, 2009), and of course, psycholinguistic processing in dialog.
Some have argued that processing may be qualitatively different in the context of dialog than in monologue because both speech comprehension and speech planning systems are active at once (e.g., Pickering & Garrod, 2004). Others argue that, at least initially, language processes in dialog are identical to language processes in monologue because conversational partners process language from their own “egocentric” perspectives in which early processing is encapsulated from partner-specific information (e.g., Barr & Keysar, 2002; Keysar, Barr, Balin, & Brauner, 2000; Keysar, Barr, Balin, & Paek, 1998; Kronmüller & Barr, 2007), followed by a second stage in which they can take their partner’s perspective into account. We take the view that processing in dialog can be explained by ordinary memory processes (Horton & Gerrig, 2002, 2005a, 2005b; Metzing & Brennan, 2003) and argue that these processes need not be encapsulated, but under some circumstances, are adapted flexibly and rapidly to the perspective of a conversational partner.
In addition to the coordination that takes place interpersonally, between partners, language processes are also coordinated intrapersonally, within the mind of an individual with many processes conducted in parallel: For instance, an individual speaker simultaneously plans and articulates an utterance while monitoring an addressee’s reactions, and an individual addressee simultaneously listens to and interprets an utterance moment by moment while preparing what to say next, or even how to contribute to what the speaker is saying. This appears to require that various subprocesses of planning, parsing, interpretation, articulation, and monitoring must be able to share information and influence one another in a rather fine-grained way. Even though key capabilities that make human communication possible—such as the language faculty itself, the ability to mentalize about another person’s mental state (or Theory of Mind—ToM), and the ability to respond rapidly and automatically to sensorimotor cues from human motion, speech, and other behaviors—may to some extent be supported by neural circuits thought to be distinct (Van Overwalle & Baetens, 2009), behavioral evidence suggests that there is close integration of these underlying processes (and their products), both within and between the minds of interlocutors. This, we argue, is what the study of language processing should aim to map, model, and explain. In this chapter, we consider language processing in communicative contexts as a joint activity in which two or more interlocutors share or synchronize aspects of their private mental states and act together in the world. We summarize key experimental evidence from our own and others’ research on how speakers and addressees take one another into account during communication. Under some circumstances, interlocutors can adjust to information about a partner’s characteristics, needs, or knowledge in the early moments of processing. 
The accumulating evidence suggests that cognitive processing is probabilistic and flexible in how it adapts to partner-specific information (Brennan & Hanna, 2009; Jurafsky, 1996; MacDonald, 1994; Tanenhaus & Trueswell, 1995). We then discuss the evidence from electrophysiology and imaging studies that has begun to illuminate the neural architecture supporting joint and individual processing during communication.
2. Dialog: Beyond Transcripts

As evident from the example of the three students recalling a movie together, the process of coordinating meaning leaves behind striking evidence in the dialog transcript. A transcript is an analyzable product that can provide evidence about how interpersonal coordination unfolds, as one utterance seems to shape what is said next. Transcripts show that successive
utterances produced by interlocutors often display recognizable contingency. One speaker may complete another’s utterance by adding an installment that seamlessly continues its syntactic structure, as in our opening example (for studies of collaborative completions, see DuBois, 1974; Lerner, 1996; Wilkes-Gibbs, 1986). Many important descriptive insights about structural phenomena in conversation such as turn-taking, repair, and co-construction of utterances have been presented by ethnomethodologists who analyze detailed transcripts of naturally occurring conversations (e.g., Goodwin, 1981; Jefferson, 1973; Sacks, Schegloff, & Jefferson, 1974). Although a transcript can be informative, it is only an artifact of the processes that generate it; people who overhear a conversation (including those who analyze it later) may not understand it in the same way that participants do (Kraut, Lewis, & Swezey, 1982; Schober & Clark, 1989). Psycholinguists who study dialog are interested in systematically probing the processes from which a transcript emerges. To understand what people might intend when they say what they say, psychologists (e.g., Clark, 1992; Glucksberg et al., 1966) have wrestled conversation into the laboratory in order to test hypotheses about language use and processing (often inspired by insights from conversation analysts). Experimental control and reliability are achieved by assigning different pairs of subjects to complete the same task in which they refer to, look at, pick up, and move objects. By observing such task-oriented dialog, the experimenter has access not only to the transcript, but also to physical evidence of what speakers mean and what addressees understand. 
This has led to conclusions about the underlying cognitive mechanisms of phenomena such as lexical choice and variability, perspective taking, distribution of initiative, conversational repair, the accumulation of common ground between partners, and audience design, or tailoring an utterance to a particular partner.

Consider these three excerpts from the transcript of a referential communication experiment in which two naïve partners could hear but not see each other (Stellmann & Brennan, 1993). Partners A and B each had a duplicate set of 12 cards displaying abstract geometric objects. The matcher (B) needed to arrange his cards in the same order as the director’s (A’s) cards. They did this for the first time in Trial 1, after which the cards were scrambled and matched again repeatedly (Trials 2 and 3):

Trial 1:
A: ah boy this one ah boy alright it looks kinda like, on the right top there’s a square that looks diagonal
B: uh huh
A: and you have sort of another like rectangle shape, the– like a triangle, angled, and on the bottom it’s ah I don’t know what that is, glass shaped
B: alright I think I got it
A: it’s almost like a person kind of in a weird way
B: yeah like like a monk praying or something
A: right yeah good great
B: alright I got it
(etc. – they match about a dozen other cards)

Trial 2:
B: 9 is that monk praying
A: yup
(etc. – they match other cards)

Trial 3:
A: number 4 is the monk
B: ok
This matching task elicits data about interlocutors’ spontaneous productions (from the transcript) and interpretations (from observing physical evidence provided by when and where the matcher moves the cards). The combination of behavioral evidence in the context of an experimentally controlled setting, synchronized with speech documented in the transcript, has provided powerful evidence for common ground or partially and mutually shared mental representations that presumably accumulate in the minds of both partners as they interact (whether in a laboratory experiment or in everyday conversation). Grounding enables partners to achieve a joint perspective on an object, such that referring to it becomes more efficient over time. The process of grounding typically results in entrainment, or convergence and synchronization between partners on various linguistic and paralinguistic levels—including in wording, syntax, speaking rate, gestures, eye-gaze fixations, body position, postural sway, and sometimes pronunciation (e.g., Branigan, Pickering, & Cleland, 2000; Brennan & Clark, 1996; Giles & Powesland, 1975; Levelt & Kelter, 1982; Shockley, Richardson, & Dale, 2009). Transcripts of different pairs of partners referring repeatedly to the same object demonstrate that there is less variability in the wording and perspectives associated with objects within a particular dialog than between dialogs (Brennan & Clark, 1996). In one experiment, 13 pairs each created, entrained on, and consistently reused one of 13 different perspectives for the geometric tangram figure in Figure 1 (Stellmann & Brennan, 1993). The perspective that two interlocutors ground during a dialog, then, is another kind of joint product that emerges from interpersonal interaction. At the same time, interlocutors who share a communicative goal can be flexible in revising jointly achieved perspectives when necessary. 
And they can be extremely flexible in what they are willing to negotiate an expression or even a single word to mean.
“A bat”
“The candle”
“The anchor”
“The rocket ship”
“The Olympic torch”
“The Canada symbol”
“The symmetrical one”
“Shapes on top of shapes”
“The one with all the shapes”
“The bird diving straight down”
“The airplane flying straight down”
“The angel upside down with sleeves”
“The man jumping in the air with bell bottoms on”
Figure 1 Perspectives vary across conversations.
Although a transcript can vividly illustrate some of these interpersonal products of interactive dialog, it often says little about how language processing unfolds incrementally and intrapersonally (within the mind of a participant). A major methodological advance has been the “visual worlds” paradigm pioneered by Tanenhaus, Spivey-Knowlton, Eberhard, and Sedivy (1995). This experimental paradigm measures the looking behavior of listeners who wear unobtrusive, head-mounted eye trackers while hearing prerecorded or scripted utterances that refer to visible objects; it measures indirect evidence of processing at a fine temporal grain, computed from the proportions of looks to an object within a defined epoch, in order to uncover the time course of lexical, prosodic, syntactic, semantic, and pragmatic processing (e.g., Altmann & Kamide, 2007). Some recent studies have merged the visual worlds eyetracking paradigm with referential communication tasks done jointly by two spontaneously interacting partners (e.g., Brown-Schmidt, 2009; Brown-Schmidt, Gunlogson, & Tanenhaus, 2008; Hanna & Brennan, 2007; Kraljic & Brennan, 2005). This approach has the potential to uncover not only how processing unfolds online within an individual engaged in dialog, but also how processing is coordinated incrementally between individuals.
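The dependent measure mentioned above, the proportion of looks to an object within a defined epoch, can be illustrated with a short computation. The following is only a minimal sketch under assumed conventions: the sample format (trial, time in ms, fixated object), the function name `proportion_of_looks`, and the 100 ms bin width are hypothetical and are not taken from any of the studies cited.

```python
from collections import defaultdict


def proportion_of_looks(samples, target, epoch_start_ms, epoch_end_ms, bin_ms=100):
    """Return {bin_start_ms: proportion of gaze samples on `target`}.

    `samples` is an iterable of (trial_id, time_ms, fixated_object) tuples,
    with time_ms measured from a common event such as the onset of the
    referring expression (an assumed, hypothetical data format).
    """
    # bin_start -> [samples on the target, all samples in the bin]
    counts = defaultdict(lambda: [0, 0])
    for _trial_id, time_ms, fixated_object in samples:
        if not (epoch_start_ms <= time_ms < epoch_end_ms):
            continue  # ignore samples outside the epoch of interest
        offset = (time_ms - epoch_start_ms) // bin_ms
        bin_start = epoch_start_ms + offset * bin_ms
        counts[bin_start][1] += 1
        if fixated_object == target:
            counts[bin_start][0] += 1
    # Pool across trials: target samples divided by all samples, per bin
    return {b: hits / total for b, (hits, total) in sorted(counts.items())}


# Toy data: two trials, gaze sampled at a few time points after word onset
toy_samples = [
    (1, 0, "monk"), (1, 50, "bat"), (1, 100, "monk"), (1, 150, "monk"),
    (2, 20, "bat"), (2, 120, "monk"),
]
curve = proportion_of_looks(toy_samples, "monk", 0, 200)
```

Plotting such a curve for a target and its competitors over successive bins is one way to visualize how quickly looks converge on the intended referent; pooling samples across trials, as done here, is a simplification, and per-trial or per-participant aggregation would be an equally reasonable choice.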
3. Process Models of Dialog

What is the nature of dialog? All experimental studies of collaborative cognition rely on some notion, often entirely implicit, of what it means to participate in a dialog or to otherwise process information along with a partner (Kuhlen & Brennan, 2008). Some studies rely on the mere presence of one or more partners who may not be allowed to interact; this approach
presumes that the effect of interpersonal collaboration is strictly motivational. Others allow a partner to contribute to the interaction only once, which decouples coordination processes from language processing. These approaches seem to assume that collaboration is based on a unidirectional exchange of information: While one conversational partner speaks the other listens passively. Some studies control the timing, order, or kinds of contributions that partners may make during a task (e.g., Basden, Basden, & Henry, 2000; Wright & Klumpp, 2004); while this may be desirable for controlling variation due to behavioral contingencies, it removes partners’ ability to take initiative, treats what may be meaningful coordinating signals as noise, and probably rules out any but the simplest sorts of coordination of the processes under study. Some psycholinguistic studies of dialog gain control by using confederates (whether human or simulated). But unless a confederate is doing the task for real, with actual communicative needs, the confederate’s behavior can differ in troubling ways from the spontaneous behavior of a naïve partner. For instance, when a confederate plays the role of an addressee over and over in a study about speech production, she may know what the speaker is about to say better than the speaker himself does, and her feedback and nonverbal cues, if not carefully characterized and controlled, are very likely to communicate her lack of a need for information (Brennan & Williams, 1995; Kuhlen & Brennan, 2010; Lockridge & Brennan, 2002). For that reason, we are wary of using confederates in the addressee role unless they are actually doing a task with the subject.
Most of our studies of language use and processing have used pairs of truly naïve speakers and addressees (e.g., Bortfeld & Brennan, 1997; Brennan, 1990, 1995, 2004; Brennan & Clark, 1996; Brennan & Ohaeri, 1999; Brennan et al., 2007; Ekeocha & Brennan, 2008; Galati & Brennan, 2010a; Hanna & Brennan, 2007; Kraljic & Brennan, 2005; Lockridge & Brennan, 2002). Some have had copresent confederate speakers who interact mostly spontaneously with naïve addressees, producing only certain critical utterances according to a partial script (e.g., Hwang, Brennan, & Huffman, 2007; Metzing & Brennan, 2003). A few have used prerecorded utterances but without any pretense that a live speaker is present (e.g., Perryman & Brennan, 2009). The point is that one partner’s behavior shapes another’s during dialog or during collaboration more generally (Kuhlen & Brennan, 2010), and this should be acknowledged when confederates are employed (Kuhlen & Brennan, 2008). In this section, we describe three influential views of processing in dialog, each of which makes quite different assumptions about its essential aspects.
3.1. The Message Model

The message model of communication (or as Pickering & Garrod call it in their 2004 critique, the autonomous transmission model) is intuitively plausible and widely assumed among the cognitive sciences (e.g., Akmajian,
Demers, & Harnish, 1987). This model is derived from information theory (MacKay, 1983; Shannon & Weaver, 1949; Wiener, 1965), in which information is defined in probabilistic terms; what is less probable is more informative. Communication involves the transmission and reception of information, which flows at a particular rate through a channel. One agent, a sender, encodes a message into a language and transmits it to another, a recipient, who decodes it; the two agents can communicate as long as they both have the same set of encoding and decoding rules (e.g., a language). Feedback (e.g., ‘‘backchannels’’ in conversation; Yngve, 1970) regulates the flow of information. The message model is consistent with the conduit metaphor (see critique by Reddy, 1979), in which words are treated like packages of meaning sent by speakers to listeners. It is difficult to think formally about communication without invoking the conduit metaphor and other information theoretic terms (Eden, 1983). The approach represented by the message model decouples coordination from language per se, and it does not require that one partner recognizes an intention to communicate in the other. It has been used to model interactions between humans, between nonhumans, between mechanical processes, and between humans and machines (Wiener, 1965). But it is difficult to see how the message model could explain the tightly coordinated exchange among Leah, Dale, and Adam, in which their contributions defy relegating them to roles of sender or receiver, and meanings have no simple mapping but are negotiated so fluidly and flexibly. As these three recall the movie together, they coauthor a jointly recalled and articulated product (rather than formulating and sending signals autonomously). They all recognize a common goal. 
And in the first trial from the ‘‘monk praying’’ example, Partner A was the one who knew the identity of the target objects (and so should be considered to be the sender of the message), and yet it is B (the recipient) who ended up proposing the perspective that they entrain upon. As Figure 1 illustrates, there is no predictable mapping of perspective or label to object. We argue (as do Reddy, 1979; Schober, 1998) that words do not ‘‘contain’’ their meanings; even labels for common objects that are highly conventional can turn out to be negotiable. This means that there is no guaranteed 1:1 mapping of meaning to word, even for basic level terms. As Brennan and Clark (1996) showed in a series of referential communication studies, once speakers have entrained upon a perspective for a common object (e.g., calling a shoe the man’s loafer to distinguish it from other shoes), they often continue to use the over-informative term even when this level of detail is no longer necessary (when the man’s loafer is the only shoe). In fact, native speakers of English may even produce wildly nonidiomatic referring expressions (e.g., the chair in which I shake my body for a rocking chair or the chair with five little tires on the bottom for an office chair) to maintain a perspective that has been mutually achieved with a non-native speaker (Bortfeld & Brennan, 1997). The message model does not account
for such flexibility. Because we are interested in understanding how people coordinate joint actions interpersonally and how they coordinate joint action with language processing intrapersonally, we find that the message model presents an unsatisfying view of communication.
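The information-theoretic assumptions behind the message model can be made concrete in a few lines. The sketch below is a hedged illustration, not a model taken from the literature: `surprisal_bits` computes the standard quantity -log2(p), so that less probable signals carry more bits, while `encode` and `decode` stand in for a sender and a recipient who communicate through a codebook. The function names and the toy codebooks are assumptions introduced here for illustration.

```python
import math


def surprisal_bits(probability):
    """Information content of an event in bits: less probable, more informative."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


def encode(meanings, codebook):
    """Sender: map each intended meaning to a signal using the codebook."""
    return [codebook[meaning] for meaning in meanings]


def decode(signals, codebook):
    """Recipient: map each signal back to a meaning (None if it is unknown)."""
    inverse = {signal: meaning for meaning, signal in codebook.items()}
    return [inverse.get(signal) for signal in signals]


# Hypothetical codebooks: two pairs who settled on different labels
# for the same tangram figure, in the spirit of Figure 1.
pair_one = {"figure 4": "the monk"}
pair_two = {"figure 4": "the candle"}

sent = encode(["figure 4"], pair_one)
understood_by_partner = decode(sent, pair_one)    # shared rules: recovered
understood_by_outsider = decode(sent, pair_two)   # different rules: lost
```

Under these assumptions, transmission succeeds only when both parties already hold the same fixed mapping. The sketch has no step in which a recipient proposes a label that the sender then adopts, which is the kind of negotiation seen in the “monk praying” exchange.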
3.2. Two-Stage Models

Several accounts of cognitive processing in dialog can be grouped together because they presume that processing is conducted in two distinct stages. According to the “interactive alignment” model (Pickering & Garrod, 2004), language processing in a dialog setting is fundamentally different from language processing in monologue because in dialog, both the speech production and speech comprehension systems are active at once, with the two systems assumed to have parity of representations. The interactive alignment model further assumes that interlocutors routinely come to achieve shared mental representations through a “direct” process of priming. Priming is proposed as the mechanism that explains convergent linguistic behaviors both between and within interlocutors such as lexical entrainment, shared perspectives, and the reuse of syntactic forms. According to this account, interlocutors converge on shared terms (such as in our earlier “monk praying” example) simply because one partner’s utterance primes another’s. Interpersonally, alignment is claimed to be direct and automatic. As the basis for such imitation, Pickering and Garrod (p. 188) invoke the human mirror system (to be discussed in Section 6), as well as the fact that the same brain areas (Brodmann’s Areas 44 and 45; see Iacoboni et al., 1999) are implicated in both language processing and imitation. On Pickering and Garrod’s view, processing in dialog defaults to what is assumed to be automatic and inflexible, driven by priming. The interactive alignment model is compatible with two-stage proposals by Keysar and colleagues (e.g., the “monitoring and adjustment” theory: Horton & Keysar, 1996 and “perspective adjustment” theory: Keysar, Barr, & Horton, 1998) that assume that early processes in dialog are unable to take account of a partner.
On these proposals, interlocutors often share the same context, knowledge, or informational needs, so that what appears to be audience design (when one partner seems to take the other’s knowledge or mental state into account) is actually done for the self (Brown & Dell, 1987). As with the interactive alignment model, the first stage of these models is fast, automatic, and encapsulated from all but ‘‘egocentric’’ information, followed by an inferential stage that can accommodate partner-specific information, but more slowly. On these approaches, such mentalizing about a partner (or deploying ‘‘full common ground’’ to plan or process an utterance) is thought to be computationally expensive (e.g., Pickering & Garrod, 2004, p. 180), and therefore either optional or else
invoked only when necessary for a repair: ‘‘normal conversation does not routinely require modeling the interlocutor’s mind’’ (Pickering & Garrod, 2004, p. 180). The interactive alignment theory further assumes that, intrapersonally or within the mind of an individual, priming at one level of linguistic processing (e.g., phonological) leads directly to alignment at another level (e.g., lexical representation), and that this automatically results in shared representations between partners at all levels of linguistic processing (Pickering & Garrod, 2004). But for this proposal to work, both interlocutors would have to be exact copies of one another. The problem is that presumably any conceptual networks that undergo priming within an individual’s mind will have been sculpted by their idiosyncratic experiences and memories, and so it seems unlikely that shared meanings can be reached simply by priming (see Schober, 2004 for a related critique). Priming is simply the underlying currency by which language and memory are purchased, with multiple elements being primed at a given moment. As we will argue in Section 5.3, priming is not a satisfying explanation for convergent behaviors such as entrainment because such behaviors have a partner-specific component. Note that not all of the theories that assign a prominent role to priming in order to account for convergent behavior agree that priming results in shared mental representations. In the ‘‘coordinative structures’’ proposal (Shockley et al., 2009), which focuses on convergent behaviors such as gaze patterns, body sway, and postural coordination, the authors argue that at least for these behavioral adjustments, executive control (and presumably mentalizing) does not play a role (p. 315) since these behaviors happen too rapidly, and since postural mimicry and sway are largely unconscious. 
The question remains, then, whether linguistic and communicative behaviors can also be aligned at multiple levels of linguistic processing without involving executive control and without achieving aligned mental representations.
3.3. The Collaborative View and the Grounding Model

Like the interactive alignment model, the grounding model views dialog as fundamentally different from monologue, but for different reasons (see Clark & Brennan, 1991 for discussion; see Cahn & Brennan, 1999; Clark & Schaefer, 1989 for formal models of grounding). According to this view, spoken communication is conducted not only as a kind of joint activity, but as a collaboration (Clark, 1992; Clark & Wilkes-Gibbs, 1986). On this view, words do not “contain” meanings, there are no “default” contexts, and entrainment and understanding are not automatic byproducts of priming. Rather, communicative signals are intended to be recognized as such by communicating partners. Meanings are coordinated
through grounding, the interactive process by which people in dialog seek and provide evidence that they understand one another (Brennan, 1990, 2004). Evidence used for grounding can be explicit, such as a backchannel response (uhuh) or clarification question, or it can be implicit, such as displaying continuing attentiveness via eye contact or continuing with a next relevant utterance. Interlocutors spontaneously provide evidence of what they themselves understand; they also monitor one another for such evidence, and when it is not forthcoming (or else not what they expect), they seek it out. Depending on their purposes and the task at hand, they set higher or lower grounding criteria for the form, strength, and amount of evidence they seek or provide at any particular point (Brennan, 1990, 2004; Clark & Brennan, 1991; Clark & Schaefer, 1989; Clark & Wilkes-Gibbs, 1986; Wilkes-Gibbs, 1986). According to Clark and Schaefer’s (1989) grounding model, Partner A cannot know whether her utterance (“number 4 is the monk”) constitutes a contribution to the conversation (and to the common ground she is accruing with Partner B) until there is some evidence, verbal or nonverbal, about how (or whether) Partner B has heard and understood it (“ok”). On this model, each contribution to a conversation has a presentation phase (an utterance) and an acceptance phase (the evidence that comes after it). A speaker evaluates her addressee’s response against the response she expected; she can then refashion her utterance and re-present it, or even revise her original intention so that it now converges with the one her addressee seems to be recognizing or proposing. Elsewhere we have conceptualized grounding as a process of joint hypothesis testing (Brennan, 1990, 2004), by which an addressee also forms incremental interpretations or meaning hypotheses as an utterance unfolds (Krauss, 1987) and then tests and revises them as more evidence accrues.
From the speaker’s perspective, the unfolding utterance embodies her hypothesis about what she believes might induce her addressee to recognize and take up her intention at a particular moment. Experimental studies of grounding often observe pairs of interlocutors doing a joint task, such as matching duplicate objects (as with the three trials in our previous example in which Partners A and B became increasingly efficient while discussing tangram figures). What began as a provisional, complex, and possibly incoherent proposal for a suitable perspective on an object (Trial 1 in our previous example) was ratified during the grounding process; both partners converged on an efficient and streamlined label for a perspective built on their common ground (Trials 2 and 3). Both took responsibility for making sure communication succeeds, not just Partner A (the one who knew the target configuration):
A: it’s almost like a person kind of in a weird way
B: yeah like like a monk praying or something
Two Minds, One Dialog: Coordinating Speaking and Understanding
According to the assumptions of the message model, which assumes that communication is about one person who has information transmitting it to another who does not have it, this should not happen. According to the collaborative view, this is not unusual. Sometimes it is not clear whether partner-adapted processing is due to cues produced during the grounding process or to the explicit representation of a partner’s perspective. An early study that documented partner-adapted referring during referential communication (Brennan & Clark, 1996) had pairs of naïve speakers establish referential precedents during spontaneous conversation (e.g., using the high heel, to distinguish one shoe from several); after that, speakers either continued to interact with the same partner or else were paired with a new one to match the same objects. When continuing with the same partners, speakers continued to use the same terms they had entrained upon even when this was over-informative (e.g., when there was only one shoe in the set). But they tended to switch to the unadorned basic level term (e.g., shoe) when interacting with a brand new partner who had not matched the objects before. This partner-specific effect may have been shaped by speakers mentalizing about what their partners knew, by cues that partners presented about their knowledge or needs during the dialog, or by both of these factors in combination. These two sources of information may be independent, or they may interact.
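The presentation and acceptance phases of the contribution model described earlier in this section can be sketched in a few lines of Python. This is only an illustrative toy, not Clark and Schaefer's (1989) formal model: the class name `Contribution`, the integer `criterion` (standing in for a grounding criterion), and the three state labels are all assumptions introduced here for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Contribution:
    """One contribution: a presentation plus the evidence that follows it."""
    presentation: str
    criterion: int = 1            # hypothetical: pieces of positive evidence required
    positive_evidence: int = 0
    history: List[str] = field(default_factory=list)

    def receive(self, evidence: str, signals_understanding: bool) -> str:
        """Acceptance phase: record the partner's evidence and report the state."""
        self.history.append(evidence)
        if not signals_understanding:
            return "repair"       # speaker may re-present or revise her intention
        self.positive_evidence += 1
        if self.positive_evidence >= self.criterion:
            return "grounded"
        return "pending"

    def re_present(self, new_presentation: str) -> None:
        """Presentation phase again: refashion the utterance and start over."""
        self.presentation = new_presentation
        self.positive_evidence = 0


# Low grounding criterion: a single "ok" is enough.
low = Contribution("number 4 is the monk")
print(low.receive("ok", True))

# Higher criterion: more evidence is sought before moving on.
high = Contribution("number 4 is the monk", criterion=2)
print(high.receive("continued eye contact", True))
print(high.receive("ok", True))
```

Representing the grounding criterion as a simple count is a deliberate simplification; the text describes criteria that vary in the form and strength of evidence as well as its amount.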
4. The Role of Cues in Grounding
Experimental work within the grounding framework has focused on coordination by examining the role of nonlinguistic and nonverbal cues, including elements that other traditions have considered mere noise: either a product not worth studying or one too difficult to study systematically. These elements include paralinguistic cues (both verbal and nonverbal) such as acknowledgments or eye contact (Schober & Clark, 1989). Paralinguistic cues may be used in a variety of ways, such as to display an addressee’s continued attention to (or confusion about, or alignment with) an utterance, to signal a speaker’s degree of commitment toward what she is saying, to invite an addressee to participate in completing an utterance, to capture the addressee’s attention, to display a speaker’s awareness of a speech disfluency or other problem in speaking, or to initiate or invite a repair (e.g., Brennan & Williams, 1995; Clark & Fox Tree, 2002; Goodwin, 1981). Additional evidence of a partner’s understanding comes from incremental progress in whatever joint task interlocutors are doing (Brennan, 1990). During the process of grounding, interlocutors produce and monitor paralinguistic cues and monitor one another’s instrumental behavior in order to seek and provide evidence that they understand one another.
We propose that the use of such cues in grounding facilitates the kind of intrapersonal ‘‘mind reading’’ needed for interlocutors to conclude that they are both talking about the same thing. These paralinguistic signals (track 2 or secondary signals; Clark, 1994, 1996) provide information about the ongoing utterance itself (as distinct from track 1 signals, which encode the ‘‘official business’’ of the utterance; Clark, 1994, 1996). The interactive alignment model (Pickering & Garrod, 2004), along with its cousins (Barr & Keysar, 2002; Dell & Brown, 1991; Horton & Keysar, 1996; Keysar, Barr, & Horton, 1998), ignores any early or automatic role that such cues may play in shaping language processing in dialog (largely ruling out the kind of flexible collaboration that such signals could help achieve, and instead focusing on what is achieved by automatic, ‘‘dumb’’ priming). Most versions of the message model allow a role for backchannel cues limited to regulating the rate of information flow rather than modeling how the evidence provided by a partner may collaboratively shape the incremental products of dialog. Of the models we have reviewed here, only the grounding model assigns a major role to such cues. Are such cues really communicative? An essential aspect of communication is the ability of one person to recognize another’s intention to communicate. This, according to Grice (1957), is what differentiates natural information (e.g., smoke is a symptom caused by fire) from non-natural (e.g., a smoke signal may be recognizable as an intentional communicative act). What starts out developmentally as a natural cue, such as a cry of pure distress produced by a baby who is hungry, develops into an intentional display intended to be communicative, as when a child cries to get her parents’ attention. 
Although savvy parents can tell the difference, sometimes the distinction between natural and non-natural cues is ambiguous (see Harding, 1982 for more on relevant cues in development). A cue may serve both communicative and instrumental purposes; it is not always easy to differentiate communicative from noncommunicative behavior. Consider the production of um and uh, short elements sometimes known as ‘‘fillers.’’ Clark and Fox Tree (2002) have argued that such signals are communicative, that they can facilitate processing, and in fact, that um contrasts with uh in much the same way that lexical items do. However, facilitation may be due to the time that elapses while the filler is produced rather than to its phonetic form (Brennan & Schober, 2001). Moreover, a cue can facilitate processing for an addressee without being communicative. Consider three criteria that must be met for a cue to be ‘‘communicative’’ (proposed by Brennan & Williams, 1995): Criterion 1. The cue must be potentially informative; that is, it must encode information. Criterion 2. The addressee must be able to process the cue and recover the information.
Criterion 3. Finally, the cue must be able to be modified by the speaker’s intentions. This does not require that the speaker be consciously aware of planning or modifying the cue per se, but only that the cue be shaped by the speaker’s intentions toward the addressee or what they are doing together. We acknowledge that some paralinguistic cues may be produced communicatively while others may not be; nevertheless, even the cues that do not meet Criterion 3 can still serve a coordinating function, helping partners in conversation seek and provide evidence about what each other intends and understands. Consider the phenomenon of ‘‘Feeling of Knowing’’ (Hart, 1965), the metalinguistic ability to assess one’s own knowledge. Speakers can display their confidence (or lack thereof) when they answer a question, via the latency to their answer, the use of rising intonation, a filler such as uh or um, and self-speech (Smith & Clark, 1993). Speakers who display uncertainty while recalling an answer or certainty when saying ‘‘I don’t know’’ are likely to fail to recognize the answer later on a multiple choice test. This satisfies Criterion 1; the paralinguistic cue displays reliable information about what the speaker really knows. It turns out that these cues are also interpretable by addressees (as a ‘‘Feeling of Another’s Knowing,’’ Brennan & Williams, 1995; Swerts & Krahmer, 2005), satisfying Criterion 2 and potentially aiding coordination. However, such cues may simply emerge from the speakers’ own ease or difficulty in recalling, planning, and articulating an answer; whether they are actually communicative or not depends on whether speakers modify the cues based on their intentions toward their addressees. 
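The logic of the three criteria can be made explicit with a small Python sketch. It is a reading aid under stated assumptions, not a procedure from Brennan and Williams (1995): the `Cue` class, its field names, the `classify` function, and the labels it returns are hypothetical, and the example values are chosen only for illustration (whether a given cue actually meets Criterion 3 is an empirical question).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    name: str
    informative: bool           # Criterion 1: the cue encodes information
    recoverable: bool           # Criterion 2: the addressee can recover it
    intention_sensitive: bool   # Criterion 3: shaped by the speaker's intentions


def classify(cue: Cue) -> str:
    """Label a cue according to which of the three criteria it meets."""
    if cue.informative and cue.recoverable and cue.intention_sensitive:
        return "communicative"
    if cue.informative and cue.recoverable:
        # Fails Criterion 3 but can still help partners coordinate.
        return "coordinating"
    return "neither"


# Illustrative values only; they are not findings reported in the text.
examples = [
    Cue("cue meeting all three criteria", True, True, True),
    Cue("cue meeting only Criteria 1 and 2", True, True, False),
    Cue("cue the addressee cannot process", True, False, False),
]
for cue in examples:
    print(cue.name, "->", classify(cue))
```

The middle case is the one the text emphasizes: a cue that is informative and interpretable may support coordination even if it is not produced communicatively.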
One way to test for Criterion 3 is to have speakers answer questions that are either sincere (the speaker knows that the partner who asked the question does not know the answer) or rhetorical (the speaker knows that the partner knows the answer, similar to a student answering a question posed by a teacher; Brennan & Kipp, 1996; Brennan, Kuhlen, & Ratra, 2010). So far we have focused our discussion of cues on their potential as interpersonal signals in the process of grounding, as revealed in dialog transcripts. In the next section, we consider evidence for partner-specific impacts as revealed by the time course of eye gaze and other behaviors synchronized with linguistic evidence.
5. Partner-Specific Processing
It is clear from the evidence in a dialog’s transcript that speakers tailor their utterances to what they know about addressees, and that addressees tailor their interpretations to what they know about speakers. What is not so clear is how and when they do this. The models of interactive
communication described in Section 3 make very different predictions about partner-adapted processing. Recall that according to the message model, processing language in dialog is not so different from processing in monologue; interlocutors take discrete turns, with one listening while the other is speaking and vice versa. Partner-adapted processing is not an issue because words map simply onto meanings; rules of encoding and decoding guarantee successful communication, as long as the transmission channel is not noisy or otherwise defective. The recognition of communicative intention is beside the point. According to the interactive alignment model, processing in dialog is distinctly different from processing in monologue, with an individual’s production and comprehension systems both active at the same time during dialog, so that processing is assisted by an assumed parity between representations for speaking and representations for interpretation. One interlocutor’s behavior primes another’s, such that convergence of their mental representations is largely automatic. Like the two-stage interactive alignment model, the monitoring and adjustment model predicts that processing, at least initially, is automatic and inflexible; people with different perspectives or knowledge default to processing in a way that is not adapted to a partner, and they take account of ‘‘full common ground’’ only later (if ever), as a kind of slow inference or repair. Grounding, on the other hand, assigns an essential role to recognizing and signaling communicative intent; dialog can be viewed as a highly coordinated hypothesis-testing activity that individuals engage in together, where one partner’s presentation (their hypothesis of what their partner will understand) plays a dual role by providing the other person with evidence of how the previous utterance has been understood. Products such as utterances and perspectives are jointly constructed. 
This sort of model supposes that partner-specific processing is flexible and ‘‘smart,’’ as well as highly incremental. In Section 5, we consider experimental evidence about the products and timing of partner-adapted processing in dialog. We discuss some of our own and others’ behavioral and eye-tracking data that are relevant to the agenda of uncovering a cognitive architecture that could support such effects.
5.1. Global and Local Adaptations
It is useful to categorize partner-specific information into two sources: (1) information from a more or less global model of a partner or their characteristics, mentally represented from prior personal experience, from expectations, or else from a stereotype, and (2) feedback that becomes available locally online, from cues that emerge as the dialog unfolds. The first source of information involves some degree of mentalizing about the partner and their intentions. It is available in some form at the start of
the dialog (whether in detailed or else quite rudimentary form), and it may or may not be updated as the dialog unfolds. The second source consists of evidence emerging during the interaction about the context or the partner’s needs, perceived from verbal and nonverbal cues. Whether a particular kind of cue evokes mentalizing, and when such mentalizing might occur, depends on the attributions made to the cue (as we will see presently). Presumably if a cue satisfies all the criteria to be considered as communicative (including being able to be mediated by intention, as outlined in Section 4), mentalizing is involved; if the cue satisfies only the first two (is informative and can be perceived), then it may support interpersonal coordination but not involve mentalizing. Both global and local sources of partner-specific information have the potential to guide production of utterances. In one study (Brennan, 1991), students were led to believe they were interacting via text with either a remotely located student or else a computer that could interpret natural language; the task was to retrieve information to fill in the missing cells of a spreadsheet database about hypothetical students and their characteristics. The answers were provided by a confederate (blind to whether she was assumed to be human or computer), were entirely rule-based, and in a given dialog, took the form of either short elliptical and telegraphic turns, or else complete sentences that reused syntax and word choice from the students’ original questions. Those who believed they were communicating with a natural language interface began the dialogs by typing telegraphic utterances, whereas those who believed they were communicating with a remotely located person began with longer, grammatical sentences. 
But this global force for audience design was trumped midway through the session by the remote partner’s online feedback; by the end of the sessions, students’ questions converged in form with their partners’ answers (to either short utterances or complete sentences), regardless of whether the partner was believed to be human or computer. Although this pattern of adaptation was true for some kinds of measures (e.g., lexical choice and syntactic form), it was not true for all measures. For instance, students used third-person pronouns relevant to the task equally often in all conditions (e.g., Where does he work?), showing that they expected their (human or computer) partner to model connectedness of utterances within the dialog context, but they rarely used first- or second-person pronouns with computer partners compared with human partners (e.g., Can you tell me whether. . .?), suggesting that they did not expect to have social context with computers. Often, local cues (e.g., feedback about the informational needs of a conversational partner) corroborate the information available through global cues (e.g., about a partner’s identity). This can make it challenging to tease apart effects of these two potentially independent factors, and most studies do not attempt to do so. In a recent study (Kuhlen & Brennan, 2010), we teased apart expectations about a partner from cues. Speakers
learned jokes in the form of brief stories and told them to addressees who also were naïve subjects. The instructions led speakers to expect either attentive addressees (who would have to retell the jokes later), or distracted addressees (working on a secondary task while listening to the jokes). As expected, attentive addressees gave more feedback than distracted addressees. Thus, while (globally) expecting attentive or distracted addressees, some speakers encountered behavior contrary to their expectation (based on local cues in the form of addressee feedback). We found that the tellings of the jokes were shaped both by speakers’ expectations and by addressees’ cues. Speakers with attentive addressees told the jokes with more vivid detail than those with distracted addressees, but only when they expected attentive addressees. Speakers with distracted addressees put less time into the task than did those with attentive addressees, but only when they had expected the distracted addressees to be attentive (when the initial expectation did not match the unfolding evidence). These results suggest that feedback cues are interpreted against prior expectations or attributions about a partner. A similar pattern of partner-specific adaptations was found in speakers’ speech-accompanying gestures (Kuhlen, Galati, & Brennan, 2010). Independent of adjustments made in speaking, speakers gestured more frequently when their expectations were consistent with addressees’ feedback, supporting the idea that speakers put more effort into narrating when their global expectations of addressees’ needs are matched by local cues provided by addressees in the interaction. Moreover, speakers used more gestures that were produced in the body’s periphery when narrating to attentive addressees whom they had also expected to be attentive, supporting the idea that consistency between local and global cues is associated with more vivid narration. 
These results suggest that global information established prior to the interaction is updated by local cues provided within the interaction in a highly interactive manner, resulting in a cascade of adjustments in speakers’ narrating style that affects both speech and gesture. A clear example of cues intended by one partner to be recognized by the other as communicative (and recognized by the other partner as such) comes from Brennan’s (1990) study (reported in Brennan, 2004). Pairs of subjects in adjoining cubicles discussed target locations on identical maps displayed on networked computer screens. The task was for the matcher to get his car icon parked in the same target location displayed on the director’s screen. In one condition, the director could visually monitor the progress of the matcher’s car; in the other, she could not. In both conditions, they could talk freely; in both, the matcher saw only his car icon displayed over the map. Over 80 trials with different targets, whether the director could see the matcher’s movements toggled every 10 trials (and the matcher was informed of this switch at the start of each block of 10 trials). So the director had local cues of what the matcher understood, updated moment by moment, while
the matcher had only global information (that he needed to keep in memory) about what his partner could see. When they could not visually monitor the matchers’ progress in the task, directors proposed descriptions in installments, and matchers responded verbally to clarify, modify, and eventually, ratify descriptions of the target location. Meaning was established incrementally and opportunistically, with both partners sharing the responsibility for doing so (as with the earlier dialog about the tangram that looked like a monk). The matcher’s icon typically arrived at the correct target location early in the trial; but they still needed additional verbal turns during which they grounded their meaning. It was up to the matcher to propose when he thought he understood well enough for current purposes and go on to the next trial. In contrast, when the director could monitor the matcher’s icon’s movements, she took the responsibility for determining when the matcher indeed understood the target location, and since this was based on direct visual evidence, she took responsibility for proposing when to go on to the next trial, sometimes suspending speaking midword as soon as the matcher reached the target, as here (note: asterisks denote overlapping speech):
Director: ok now we’re gonna go over to M-Memorial Church? and park right in Memor- right there that’s *good.*
Matcher: *that’s* rude to park in the church.
Director: hheh heh
Grounding with visual evidence was much more efficient, although partners adjusted their effort so that performance was equally accurate with and without visual evidence. What is particularly striking is that even though matchers’ screens appeared the same to them regardless of what condition they were in (there were no cues to remind them of what directors could see), they easily adapted to what they knew about their unseen partners’ perceptual context by providing or withholding backchannels; when they knew the directors could see their cars, they used their icon moves not only as instrumental acts for doing the task, but also as communicative acts (Brennan, 2004). Each time the visual evidence condition toggled, matchers adapted to this global partner-specific information immediately (almost always without discussion). Directors packaged location descriptions into installments and grounded these with the online local cues provided by matchers’ icon movements. So in this study, directors used
local cues provided moment by moment by their partners; these were verbal when they could not see their partners’ moves, and visual when they could. At the same time, matchers, who were aware when their moves could or could not be seen, used that simple bit of information to guide whether to produce backchannels or not.
5.2. Speakers Adapt Utterances for Their Addressees
Interlocutors often share considerable context beyond being speakers of the same language, including that due to previously established common ground or to being copresent in the same perceptual environment. Therefore, what might appear to be a case of a speaker tailoring an utterance to an addressee’s needs or knowledge may occur simply because that is what is easiest for the speaker to do. For example, within a discourse the first articulation of a word (when it represents new and sometimes unpredictable information) tends to be longer in duration and more intelligible than repeated mentions of the same word (or other uses in which it is more predictable) (Bard et al., 2000; Fowler & Housum, 1987; Lieberman, 1963; McAllister, Potts, Mason, & Marchant, 1994; Samuel & Troicki, 1998). Listeners can pick up on attenuation as a marker of information status, such that when they hear an initially ambiguous word that is destressed, they assume that it refers to the given item in an array (that also includes a new distractor with the same phonological onset); but when the word is stressed, they assume that it refers to the new item (Dahan, Tanenhaus, & Chambers, 2002). The question is whether variations due to attenuation are communicative, for the benefit of the addressee (as assumed by Nooteboom, 1991; Samuel & Troicki, 1998), or whether this is a generic sort of variation produced automatically by speakers (Dell & Brown, 1991) that would likely occur without any addressee present. To establish that a variation in speaking is not egocentric but is produced truly as a form of audience design, ‘‘for’’ a partner, the perspectives of the speaker and the addressee must be distinguishable (for discussion, see Keysar, 1997; Lockridge & Brennan, 2002). 
Moreover, the speaker must be aware of her addressee’s distinct perspective or needs in time to incorporate this information into speaking; if relevant information about the addressee’s distinct perspective is not available in time, then a failure to incorporate it does not constitute a fair test of whether the early stages of speaking are egocentric (Horton & Gerrig, 2005a, 2005b; Kraljic & Brennan, 2005). When telling stories, speakers leave out some details and include others; for example, they are more likely to mention atypical instruments and omit typical ones (which are implicitly associated with a particular verb or situation). A study by Brown and Dell (1987) tested whether this typicality effect is egocentric, or else driven by the needs of particular addressees. Eighty speakers read silently and then recounted aloud to a confederate addressee
very short stories in which an instrument (either typical or atypical in association with a main verb) played a key role; the confederate either had or did not have a picture illustrating the main action and instrument (and the speaker subject knew what the addressee could see). Whether the addressee could see the instrument or not had no effect on whether and how speakers mentioned it; Brown and Dell concluded that the typicality effect was not an adjustment to the addressee’s needs, but simply automatic for the speakers. However, their addressees (both of them) heard the same stories over and over, so actually knew them better than the speakers did; it is possible that the cues they provided signaled this. A subsequent study by Lockridge and Brennan (2002) had speakers tell similar stories, but to naïve addressees who had never heard the stories before, and who saw or did not see the pictures. Speakers were more likely to mention atypical instruments, to mention them early (within the same clause as the action verb), and to mark them as indefinite, when speaking to addressees without pictures than to addressees with pictures. This suggests that when addressees have real needs (and presumably signal them somehow), speakers take this into account in the syntactic choices they make early in an utterance (Lockridge & Brennan). In another study, we examined the extent to which speakers attenuated elements of a longer story ‘‘for’’ themselves or ‘‘for’’ their addressees (Galati & Brennan, 2010a). Twenty naïve speakers spontaneously told and retold the same Road Runner cartoon story twice to one naïve addressee and once to another (counterbalanced for order: Addressee1/Addressee1/Addressee2 or Addressee1/Addressee2/Addressee1). This design enabled us to tease apart tellings of the story that were new versus old to speakers from those that were new versus old to addressees. 
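To see why this counterbalanced design separates the two perspectives, it can help to label each telling by whether the story is new to the speaker and whether it is new to the addressee. The short Python sketch below does this for the two orders named above; the function `label_tellings` and the labels `A1` and `A2` are hypothetical names introduced for illustration and are not part of the original study materials.

```python
def label_tellings(order):
    """Label each telling in one order as new or old to speaker and addressee."""
    heard = set()   # addressees who have already heard the story
    rows = []
    for index, addressee in enumerate(order):
        rows.append({
            "telling": index + 1,
            "addressee": addressee,
            "new_to_speaker": index == 0,               # only the first telling is new to the speaker
            "new_to_addressee": addressee not in heard,
        })
        heard.add(addressee)
    return rows


# The two counterbalanced orders described in the text.
ORDERS = [("A1", "A1", "A2"), ("A1", "A2", "A1")]

for order in ORDERS:
    print("/".join(order))
    for row in label_tellings(order):
        print("  ", row)
```

In both orders the second and third tellings are old to the speaker, but exactly one of them is new to its addressee, so any difference between those two tellings can be attributed to the addressee's status rather than to the speaker's own experience with the story.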
We found that attenuation was mainly due to whether the material was new or old to the addressee rather than to the speaker; stories retold to the same (old) addressee were attenuated compared to those retold to the new addressee. This was true for a variety of linguistic units, including number of words, amount of detail, and number of events realized in the stories. Although lexically identical expressions by the same speaker were no different in length when addressed to a new versus an old addressee, expressions that had been addressed to new partners were more intelligible to a later group of listeners than when they had been addressed to addressees who had heard them before. This study provides strong evidence that attenuation is driven at least in part by the needs of addressees (in fact, it found little if any evidence for speaker-driven attenuation). The findings contrast sharply with those of Bard et al. (2000), who found that attenuation in articulation of repeated expressions depended on speakers’ experience rather than addressees’ (although it should be noted that their study did not tease apart speakers’ from addressees’ perspectives; all addressees were hearing the expressions for the first time). We found a similar pattern of partner-specific attenuation in these speakers’ gestures (Galati & Brennan, 2010b). Speakers produced fewer
representational gestures overall in retellings to old addressees than to new addressees. The gestures produced in stories retold to old addressees were also smaller and less precise than those retold to new addressees (a for-the-addressee effect), although gestures were also attenuated over time (the only comparison from this experimental corpus that showed any for-the-speaker effect). These data support the conclusion that gesture production is guided by both the needs of addressees and automatic processes by which speakers do what is easiest for themselves. Although Bard et al. (2000) (in their measures of duration and intelligibility) found no audience design effect at the grain of pronunciation of repeated words, Bard and Aylett (2000) did find audience design at the grain of referring expressions; their speakers marked expressions as definite when appropriate given the addressee’s knowledge. The authors proposed a ‘‘dual-process model’’ in which automatic processes are modular and cannot take partner-specific context into account while other, more flexible processes can. But given the audience design effects on articulation that we found in Galati and Brennan (2010a), the modularity claim seems hard to defend. It may be that audience design effects on articulation are either produced inconsistently or that they are difficult to detect. On the other hand, a pattern of variable findings would be consistent with a system whose architecture allowed information to be incorporated into planning in a probabilistic (constraint-based) fashion (e.g., Jurafsky, 1996; MacDonald, 1994; Tanenhaus & Trueswell, 1995). A claim of modularity based on a null finding of audience design might be convincing if every stone has been overturned, and if the information in question is available early enough to impact planning (for discussion, see Brennan & Hanna, 2009; Kraljic & Brennan, 2005). Variability in pronunciation is influenced by multiple factors. Hwang et al. 
(2007) examined the extent to which articulation may be governed by priming as well as by a conversational partner’s communicative needs, using Korean-born speakers of English as a second language (L2). Ambiguities arise when non-native speakers fail to make L2 phonetic contrasts that are absent in their native language (L1). Korean speakers lack the voicing contrast b/p (‘‘mob’’ vs. ‘‘mop’’) and the vowel contrast æ/ɛ (‘‘pat’’ vs. ‘‘pet’’), so that when they speak Korean-accented English, the first words in each of these pairs are likely to be neutralized to sound like the second words. In two referential communication experiments, subjects who were Korean speakers of English spontaneously produced target words (e.g., ‘‘mob’’). A confederate partner either primed the target words with a rhyming word (e.g., asking ‘‘What is below hob?’’) or did not prime them, and the referential contexts required pragmatically distinguishing two contrasting words (‘‘mob’’ adjacent to ‘‘mop’’ in the array), or did not. The Korean speakers produced more English-like phonetic targets in both the priming and pragmatic conditions (vowel duration was used to
signal both contrasts). Moreover, Korean speakers were primed to make the disambiguating contrast when interacting with an English speaker but not with another Korean speaker of English. These results show that Korean speakers speaking English (L2) can be led to produce a phonetic contrast that they do not have in L1 both when they are primed to do so and when their addressees need them to do so to resolve an ambiguous expression. Sections 5.1 and 5.2 have reviewed some studies of audience design in which interlocutors with distinct perspectives incorporate their partners’ knowledge or needs rather than ignoring them or taking them into account at a late stage of processing. But we have not yet addressed the question of how perspectives (whether of self or other) are suppressed, selected, or updated moment by moment.
5.3. Addressees Adapt Utterance Interpretations to Speakers
According to the grounding framework, just as speakers design utterances for their addressees, addressees interpret utterances in the context of what they know about speakers. This means that the same words may be interpreted differently depending on who utters them. In a referential communication experiment that incorporated interaction between confederate speakers and naïve addressees, addressees’ initial looks to familiar target objects (that they had previously grounded during interaction with a speaker) were delayed by a few hundred ms when the same speaker uttered an entirely new expression for the familiar object, but not when a new speaker uttered the same new expression (Metzing & Brennan, 2003). The conclusion was that speakers and addressees ground ‘‘conceptual pacts’’ or shared perspectives that are not only partner-specific but also quite flexible: Addressees were quick to abandon the precedent of a familiar expression when interacting with a new speaker; their first looks to the target were not delayed when the new speaker used the new expression. This finding, that addressees experience interference or slowed processing when a conceptual pact (previously grounded with a particular speaker) is broken, has been replicated with young children, who show the effect when a speaker abandons a precedent for a new term without any apparent reason, but not when a new speaker introduces a new term (Matthews, Lieven, & Tomasello, 2008). These findings and related findings (e.g., Brown-Schmidt, 2009; Nadig & Sedivy, 2002) are incompatible with interactive alignment theory that seeks to explain convergence from priming alone (Pickering & Garrod, 2004), and in which the speaker’s identity should not matter. Addressees do not inflexibly map expressions onto referents; within a pragmatic context (Grice, 1975), the identity of the speaker can be part of what is represented.
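The partner-specificity of conceptual pacts can be made concrete with a small sketch. The class below is purely illustrative and is not a model proposed in the studies cited; the names `PactMemory`, `ground`, and `breaks_pact`, and the example expressions, are assumptions introduced here. It shows only the logical pattern in the findings: a new expression counts as a broken precedent when it comes from the speaker with whom the old expression was grounded, and not when it comes from a new speaker.

```python
class PactMemory:
    """Toy record of which expression was grounded with which speaker.

    Hypothetical illustration only; not an implementation from the literature.
    """

    def __init__(self):
        # Maps (speaker, referent) to the expression grounded with that speaker.
        self._pacts = {}

    def ground(self, speaker, referent, expression):
        """Store the expression that this speaker and the addressee settled on."""
        self._pacts[(speaker, referent)] = expression

    def breaks_pact(self, speaker, referent, expression):
        """Return True only if this same speaker has a different precedent."""
        precedent = self._pacts.get((speaker, referent))
        return precedent is not None and precedent != expression


memory = PactMemory()
memory.ground("speaker_A", "object_1", "the shiny cylinder")

# Same speaker, new expression: the precedent is broken.
same_speaker = memory.breaks_pact("speaker_A", "object_1", "the silver pipe")
# New speaker, same new expression: no precedent exists with this speaker.
new_speaker = memory.breaks_pact("speaker_B", "object_1", "the silver pipe")
```

Keying the record on the speaker as well as the referent is the design choice that matters here: a record keyed on the referent alone would predict interference for both speakers, which is not the pattern reported.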
Finally, global information (specifically about a speaker) can interact with local information (from cues that emerge during dialog or speaking). That is, listeners interpret cues against the attributions that they make about
those cues (Kuhlen & Brennan, 2010). For instance, when listeners hear a speaker’s disfluency just before a referring expression, they interpret it online as evidence that the speaker is in the process of saying something difficult (Arnold, Tanenhaus, Altmann, & Fagnano, 2004)—unless they have a stable attribution for the disfluency (the speaker has agnosia; Arnold, Hudson-Kam, & Tanenhaus, 2007).
5.4. Simple or ‘‘One-Bit’’ Partner Models
It may be no coincidence that experiments that show audience design early in processing involve partner-specific information that is not only clear, but also already-computed and quite simple. In such experiments, what a partner needs is often captured by only two alternatives: my partner can see what I’m doing, or not (Brennan, 2004; Nadig & Sedivy, 2002); my partner can reach the object she’s talking about, or not (Hanna & Tanenhaus, 2004); my partner has a picture of what we’re discussing, or not (Lockridge & Brennan, 2002); my partner and I have spoken about this before, or not (Galati & Brennan, 2010a; Matthews et al., 2008; Metzing & Brennan, 2003); my partner is currently gazing at this object, or not (Hanna & Brennan, 2007); my partner needs to distinguish this referent from a competitor, or not (Hwang et al., 2007); my partner is a young child, as opposed to older (Shatz & Gelman, 1973); or my partner is a native speaker of English, or not (Bortfeld & Brennan, 1997). In these situations, an interlocutor may represent information in working memory about a partner’s state as a simple either/or cue that can be flexibly updated as the situation changes. The findings of audience design in these situations demonstrate that a ‘‘partner model’’ need not entail a detailed record of all of the knowledge one partner has about what the other is likely to know (as well as what the other does not know, as pointed out in a critique by Polichak & Gerrig, 1998). In contrast, a simple ‘‘one-bit’’ model that does not require complex inferences or elaborate maintenance or updating could facilitate rapid partner-adapted processing, even when two partners have distinct perspectives (Brennan & Hanna, 2009; Galati & Brennan, 2010a).
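As a rough illustration of how little state a ‘‘one-bit’’ partner model requires, consider the following sketch. It is a toy under stated assumptions, not an implementation from any of the studies cited: the class `OneBitPartnerModel`, the function `choose_expression`, and the example expressions are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class OneBitPartnerModel:
    """A single either/or fact about a partner (hypothetical illustration)."""

    cue: str             # e.g., "partner can see the object"
    value: bool = False  # the one bit of partner-specific state

    def update(self, observed: bool) -> bool:
        """Overwrite the bit with the latest observation.

        Returns True if the stored value changed.
        """
        changed = observed != self.value
        self.value = observed
        return changed


def choose_expression(partner_can_see: OneBitPartnerModel) -> str:
    """Toy audience design: be brief if the partner can see the referent."""
    if partner_can_see.value:
        return "that one"
    return "the small red triangle on the left"


model = OneBitPartnerModel("partner can see the object")
before = choose_expression(model)  # partner cannot see: full description
model.update(True)                 # a cue arrives: partner can now see
after = choose_expression(model)   # partner can see: brief expression
```

Updating is a single overwrite with no inference chain, which is the property the text appeals to when it suggests that such a model could influence processing early.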
In the next section, we consider evidence from brain imaging studies about the neural circuits that may support partner-adapted processing, both by interpreting local cues and by maintaining simple models of interlocutors’ intentions, perspectives, or communicative needs.
6. Neural Bases of Partner-Adapted Processing
Our cognitive/behavioral research program has followed the assumption that partner-specific adaptation during communication can be explained by general principles of memory and cognitive processing, rather
than by special cognitive modules that either give priority to an egocentric perspective (Horton & Keysar, 1996; Keysar, Barr, Balin, et al., 1998; Keysar, Barr, & Horton, 1998; Keysar et al., 2000; Pickering & Garrod, 2004) or automatically restrict referential interpretation to what is in common ground (a position attributed to Clark & Carlson, 1981 by Barr & Keysar, 2002). Our studies and others that allow for spontaneous interaction between interlocutors (e.g., Brown-Schmidt, 2009; Brown-Schmidt et al., 2008; Hanna & Brennan, 2007; Kraljic & Brennan, 2005) demonstrate that partner-specific effects can emerge early in processing, and show no evidence for modular or two-stage (early egocentric, late partner-specific) processing models. We find that the evidence supports a cognitive architecture for language processing and communication that combines the available information in a parallel, constraint-based, and probabilistic fashion (Brennan & Hanna, 2009; Horton & Gerrig, 2002, 2005b; MacDonald, 1994; Metzing & Brennan, 2003; Tanenhaus & Trueswell, 1995). However, the behavioral evidence does not tell us precisely how such flexible, partner-adapted processing is achieved in the brain. Imaging studies have revealed multiple neural circuits that appear to aid and abet everyday communication. These circuits handle a wide variety of cues and functions. Cues relevant to communication include gesture, eye gaze, nonlinguistic verbal cues, contrastive stress and other prosodic cues, and disfluencies. Relevant functions that may make use of these cues include speaking, linguistic parsing, postural and motor coordination during joint action, monitoring a partner’s orientation or attention, evoking person stereotypes and other world knowledge, and last but certainly not least, mentalizing about a partner’s intentions or beliefs (Theory of Mind).
Mapping the circuits that underlie these functions and discovering how these functions could work together requires deploying cognitive/behavioral tasks that preserve the essential aspects of communication. In this section, we discuss some recent and intriguing findings about the neural underpinnings of language and communicative processing that are potentially relevant to a more complete account of adaptive processing.
6.1. Mirroring
The idea that the production of speech relies on the same motor routines and representations as the interpretation of speech has been around for a long time (Galantucci, Fowler, & Turvey, 2006; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). Over the past decade and a half, much evidence has accumulated that people perceive and understand the actions of others by relying on their own motor routines, using a common coding for both. Individual mirror neurons, activated both when an action is performed and when it is observed, have been identified in primates (di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992) and
are presumed to exist in humans (Iacoboni et al., 1999). The human ‘‘mirror system’’ comprises a network that includes areas in the premotor cortex (PMC) and parietal cortex (in particular, the anterior intraparietal sulcus, aIPS), with input from the posterior superior temporal sulcus (pSTS) (Van Overwalle & Baetens, 2009). The existence of a more or less direct perception–action link is proposed to help people detect each other’s goals by helping them simulate another’s state, as ‘‘nature’s way of getting the observer into the same ‘mental shoes’ as the target’’ (Gallese & Goldman, 1998, p. 497). As Sebanz and Knoblich (2008) have pointed out, the mirror system has been misunderstood by some as being an inflexible mechanism that automatically supports mimicry, and hyped by others as being the explanation for all of social cognition. Recent accounts by these authors and others (e.g., Bekkering et al., 2009) argue that the truth is somewhere in between: the mirror system is recruited for rapid processing of a wide variety of cues and provides input to many kinds of processes, including those that support language, communication, and other forms of joint action. The perception–action links that the mirror system provides do not support only mirroring; arguably, most of the actions that people do jointly involve complementary or noncongruent actions rather than imitative or congruent ones, and the mirror system is more active during the preparation of complementary than imitative actions (Newman-Norlund, van Schie, van Zuijlen, & Bekkering, 2007).
Currently, the value of the mirror system is presumed to be in facilitating the recognition of a partner’s goal and the monitoring of outcomes of actions, rather than literally reproducing specific action primitives; otherwise it would be of limited use, as much of the time people would not be well served by an imitation reflex and since the mapping of action to goal is not 1:1 (for discussion, see Bekkering et al.; Van Overwalle & Baetens, 2009). In this more flexible role—as a way to understand (but not mimic) the perspective of another by simulation—the mirror system could support partner-adapted processing by monitoring cues about a partner or about the objects in a task and rapidly updating the state of a simple or ‘‘one-bit’’ partner model. This would explain why interlocutors sometimes adapt rapidly to their partners’ needs or knowledge rather than defaulting to behavior that appears to be egocentric.
6.2. Theory of Mind
Designing an utterance or action with regard to what a partner knows, or recognizing an utterance or other action as communicative (Grice, 1957, 1975), presumably involves mentalizing, or attributing intention to another.
Mentalizing involves neural circuitry that is usually thought to include (1) the medial prefrontal cortex (mPFC),2 (2) the bilateral temporoparietal junction (TPJ),3 and (3) the precuneus (BA 7)4 (e.g., Ciaramidaro et al., 2007; Van Overwalle & Baetens, 2009; Vogeley et al., 2001). These areas are often considered to be core parts of a ToM network, activated during tasks that require taking into account another person’s mental state. The classic ToM task (as tested on children in various stages of development) involves having a child witness an actor learning of the location of a hidden object, witness the object being rehidden in a different location unknown to the actor, and then predict where the actor will look for the object (Wimmer & Perner, 1983). The majority of imaging studies that aim to probe the ToM network in adults have been conducted in noninteractive settings in which subjects in an fMRI scanner read text stories about characters’ true or false beliefs, perspectives, intentions, or motivations, compared to texts about characters’ physical characteristics or objects (that do not require ToM to understand; Saxe & Kanwisher, 2003). Generally speaking, mentalizing has been proposed to be a fast, automatic process rather than a slow, inferential one (Kampe, Frith, & Frith, 2003; Scholl & Leslie, 1999). This is consistent with the behavioral findings reported earlier, that interlocutors adapt processing to their partners from the early moments of processing. Within the ToM mentalizing circuit, the TPJ appears to be implicated whenever there are early and automatic inferences about another’s goal, with the mPFC implicated during inferences about another’s traits that unfold more slowly (Van der Cruyssen, Van Duynslaeger, Cortoos, & Van Overwalle, 2009; Van Duynslaeger, Van Overwalle, & Verstraeten, 2007; see Van Overwalle & Baetens, 2009 for discussion). 
So ToM as a network may underlie not only immediate partner-adapted processing (in the TPJ region), but also the slower, inferential, adjustments to a partner that may unfold after an initially ‘‘egocentric’’ response (in the mPFC).
6.2.1. Distinguishing Kinds of Intentions: Private, Social, and Communicative
Some of the variability in findings about the shape of the network hypothesized to underlie ToM may be due to lack of precision in fMRI imaging, and some may be due to limitations in the kinds of intentions depicted in the stimulus stories. A study by Ciaramidaro et al. (2007) took a nuanced look at the neural bases of ToM, by having individuals in the scanner read short comic strips that distinguished (1) the private intentions of characters from their social intentions toward other characters, and (2) within these social
2 Some studies label this Theory of Mind area as the anterior paracingulate cortex within the mPFC.
3 Although note that some studies label this as the posterior STS, which extends to the TPJ.
4 Some studies (Gallagher & Frith, 2003; Gallagher et al., 2002; Kampe et al., 2003) implicate the temporal poles (BA 38) in the ToM circuit.
intentions, communicative from noncommunicative intentions. The ToM areas mPFC and TPJ were both found to be crucial, but were activated differentially depending on the kind of intention being recognized. The right TPJ and precuneus were active in the processing of all types of prior intentions, with the anterior paracingulate cortex in the mPFC and the left TPJ active when processing social intention; in fact, in these comparisons the left TPJ was active only when processing communicative intention. The evidence from this study suggests four (rather than three) core parts for the ToM network, with distinct roles for both the left and right TPJ areas. It is possible that previous studies that failed to find a clear role for the left TPJ during mentalizing used stimuli that did not require recognizing communicative intentions; the authors suggest that the left TPJ may be a fourth ToM area activated by the recognition of intentions that are specifically intended as communicative.
6.2.2. Joint Activation During Interpersonal Interaction
While mentalizing about the intentions of characters in a story almost certainly overlaps with the mentalizing involved in thinking about an interlocutor’s knowledge or communicative needs, a reading task probably misses some of the essential aspects of interacting with a partner in dialog. For instance, most ToM stimulus texts are written about characters in the third person rather than the first or second person, and most fMRI scanner tasks do not probe contingently unfolding social interaction between partners, with a few notable exceptions. A few studies have used interactive games with real or simulated partners. In one series, neural activation was examined while pairs of partners played a ‘‘tacit communication game’’ (Noordzij et al., 2009) in which ‘‘senders’’ invented new ways of conveying their communicative intentions to ‘‘receivers’’ using entirely graphical means.
Senders had to figure out how to move icons so that receivers could distinguish instrumental moves from moves intended to instruct them about where they should move their own icons. The perspectives of the two partners were known (by both) to be different, with one person’s icon being inherently more ambiguous than the other (a triangle that could be oriented in three ways, a rectangle that could be oriented in two ways, or a circle for which orientation did not matter). In this task, communication was interactive, incremental, and graphical; both communicative and control trials evoked identical motor actions and graphics so that activation related to the planning and interpretation of communicative intent could be distinguished from that related to noncommunicative signals, visual motion, and hand movements. In each session, fMRI data were collected from either the sender or the receiver. Remarkably, during communicative trials senders and receivers both showed activation in one of the same ToM regions: the right pSTS, but not in the left pSTS. This right activation was modulated by the degree of ambiguity in
the communicative signal (e.g., a sender’s circle could not easily depict how a receiver should orient their own triangle), but not by visual appearance or sensorimotor complexity. In addition to the right pSTS, mentalizing about communicative intent coactivated the mPFC. That the same ToM circuitry implicated in recognizing a partner’s (the sender’s) intention is also implicated in predicting how best to signal one’s own intention to a partner (the receiver) suggests that there is a kind of functional parity between signaling one’s own and interpreting a partner’s intentionality. One puzzle in comparing this study with the previous one (Ciaramidaro et al., 2007) is that Noordzij et al. (2009) reported no differential activation whatsoever for communicative action in the left pSTS (which extends into the left TPJ, the region where Ciaramidaro et al. did find activation associated with communicative intention). Whether this apparent inconsistency is due to a task difference remains to be settled. Noordzij et al.’s interactive task differentiated first- and second-person communicative intentions from instrumental acts, whereas Ciaramidaro et al.’s reading task differentiated third-person communicative intentions from other (ToM-associated) intentions. The interactive task required participants to generate communicative intentions as well as to recognize them, whereas they needed only to recognize them in the reading task. And the interactive task used graphical communication, whereas the reading task used language. There are so few imaging studies of communicative intention that it is difficult to interpret the implications of these task differences, but one speculative possibility is that the left TPJ might link ToM activation to language processing networks in the left temporal lobe.
6.2.3. Interactions with Human Versus Computer Partners
ToM is associated with predicting the behavior of conspecifics (e.g., Ciaramidaro et al., 2007; Van Overwalle & Baetens, 2009).
But does it matter whether an interacting partner is human or computer? Several imaging studies have been conducted using tasks in which subjects interacted with computers or human partners (or else ones they believed to be human) in a prisoner’s dilemma or other payoff game. In one such investigation (Gallagher, Jack, Roepstorff, & Frith, 2002), subjects who believed they were playing a (competitive) rock–paper–scissors game with either a computer or another person showed more activation in only one of the ToM areas with human than computer partners, the anterior paracingulate cortex (mPFC). In another investigation (Rilling, Sanfey, Aronson, Nystrom, & Cohen, 2004), subjects playing interactive games and receiving feedback from supposed partners showed activation in two of the main ToM areas, the mPFC and posterior STS; these areas were activated whether subjects believed their partners were human or computer. The cues that subjects received during the sessions were identical (and automatically generated) in both partner conditions, and ToM was activated in both
kinds of sessions, but activation was higher when subjects believed they interacted with humans (Rilling et al., 2004). This difference in activation may reflect partner-adapted processing that distinguishes human from machine partners, or it may emerge simply from different levels of engagement in the task; but either way, it documents the influence of the same sort of global partner-identity variable that has emerged in behavioral studies (e.g., Brennan, 1991). Recent studies by Krach et al. (2008, 2009) have found consistent results, with activation in the mPFC and right TPJ when interactive games were played with (supposed) human or computer partners; however in Krach et al. (2009), the first of these ToM areas was more activated when the partner was believed to be human than computer. When people played with one of four kinds of (simulated) partners (human, anthropomorphic robot, functional robot, or computer process), there was more activation in both of these ToM areas, the more human-like the partner (Krach et al., 2008). So the difference between interacting with a human partner and a computer partner may be quantitative rather than qualitative (at least for part of the ToM network). These studies suggest to us that under some circumstances ToM processing may be flexible enough to be able to model varieties of an intelligent partner’s ‘‘mind’’ that need not even be human, an idea relevant to the field of ‘‘intelligent’’ computer–human interaction (Don, Brennan, Laurel, & Schneiderman, 1992).
6.3. Distinguishing a Partner’s Perspective from One’s Own: The Role of Executive Control
Stimulus stories that require recognizing a single character’s intention presumably require less complex mentalizing than referential communication studies that require distinguishing two perspectives (e.g., one’s own from one’s partner’s or one’s private knowledge from common ground shared with a partner), especially when the two perspectives may in fact be inconsistent (Galati & Brennan, 2010a; Hanna et al., 2003; Metzing & Brennan, 2003; Nadig & Sedivy, 2002). Distinguishing privately held information from common ground presumably requires such mentalizing, as well as executive control to select the appropriate perspective and/or to suppress the inappropriate one. In addition, during dynamic communicative interaction, there is the challenge of keeping track of how a partner’s perspective (or else common ground) changes over time. Imaging studies show that the mentalizing network is recruited when people explicitly prevent themselves from imitating another’s behavior (Van Overwalle & Baetens, 2009), perhaps facilitating the differentiation of self from other (Brass, Derrfuss, & von Cramon, 2005; Brass, Zysset, & von Cramon, 2001; for discussion, see Van Overwalle & Baetens). A study by Vogeley et al. (2001) attempted to distinguish egocentric (SELF) processing from ToM by comparing activation associated with stories about the
intentions of another person to stories about the reader’s own perspective. Consistent with other studies, Vogeley et al. found ToM to implicate the mPFC.5 But reasoning about one’s own perspective led to additional activation in the right inferior temporoparietal cortex that did not appear to be associated with ToM (Vogeley et al., 2001). These authors conclude that the right TPJ ‘‘is involved in computing an egocentric reference frame’’ (p. 179), and that ToM and SELF interact in the right prefrontal cortex, an area that has been associated with executive control processes. To the extent that taking another’s perspective requires inhibiting one’s own, executive control seems to play a role by inhibiting responses that are either overlearned or imitative (Brass et al., 2005). Concerning imitation, there is some evidence that what has been proposed by some to be a largely automatic tendency to imitate (governed by the mirror system; see, e.g., Pickering & Garrod, 2004) is routinely mediated by executive control, so that people can avoid imitating others when such behavior might be costly or inappropriate. Imitative finger gestures are actually initiated more quickly when working memory load is increased (with a two-back task) than without such load (Van Leeuwen, van Baaren, Martin, Dijksterhuis, & Bekkering, 2009), suggesting that executive control is the rule (for restraining this sort of imitation from the start) rather than the exception (for adjusting this behavior later in planning). More evidence for the importance of executive control in suppressing egocentric behavior comes from Brown-Schmidt’s (2009) visual world eye-tracking study of communication. To test the role of executive control, individual differences were first measured using a Stroop task.
Then subjects interacted with a confederate partner to do a referential communication task that included both shared and privileged information; subjects had to differentiate what they knew from what the partner knew. Interaction was mostly unscripted, with the confederate partner asking the subject for information using expressions that were temporarily ambiguous between an object they could both see and one that only the subject could see. Some of the time, immediately after the partner asked for information, their display would disappear so that the task would be interrupted before the subject could respond (thus interrupting the grounding process), and 2 s later, the display would reappear and the task would resume. This innovative manipulation aimed to test whether subjects closely monitored the grounding process in order to keep track of what their partners were actually likely to know. The findings were clear: Subjects who were better at suppressing Stroop interference were better able to restrict themselves to considering shared (rather than private) information in the early moments of responding to their partner’s temporarily ambiguous questions. And they
5 Vogeley et al. (2001) also found ToM activation in the left temporopolar cortex.
were better able to keep track of which expressions had been verbally grounded (and could therefore be assumed to be in common ground) as opposed to which had been uttered but interrupted before being grounded (these were treated as referring to information that was still private). This is a remarkable demonstration of not only the role of executive control in perspective taking, but also the ability of interlocutors to keep detailed track of the mutual knowledge product resulting from the grounding process. If these kinds of interactive tasks could be probed with imaging, the workings of the ToM network might be further clarified. It may be possible to use imaging to delineate a role for the mentalizing system in influencing executive control over other neural circuits (including those associated with the mirror system). Such findings would be consistent with the choice and timing evidence from our and Brown-Schmidt’s experiments and could provide a mechanism by which partner-adapted information that has already been perceived or computed could have an early impact.
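The bookkeeping that subjects in Brown-Schmidt’s (2009) task appear to perform, separating expressions that were grounded from expressions that were uttered but interrupted, can be sketched as a simple state tracker. This is a hypothetical illustration only; the class `GroundingTracker` and its method names are assumptions introduced here and make no claim about the underlying cognitive or neural mechanism.

```python
class GroundingTracker:
    """Toy tracker for what has entered common ground (hypothetical sketch)."""

    def __init__(self):
        self.pending = set()        # uttered, grounding not yet complete
        self.common_ground = set()  # uttered and grounded with the partner

    def utter(self, expression):
        """An expression has been uttered but not yet grounded."""
        self.pending.add(expression)

    def acknowledge(self, expression):
        """Grounding completes: the expression moves into common ground."""
        if expression in self.pending:
            self.pending.remove(expression)
            self.common_ground.add(expression)

    def interrupt(self, expression):
        """Grounding is cut off: the expression is dropped, not shared."""
        self.pending.discard(expression)

    def is_shared(self, expression):
        """Only grounded expressions count as mutual knowledge."""
        return expression in self.common_ground


tracker = GroundingTracker()
tracker.utter("the cow with the hat")
tracker.acknowledge("the cow with the hat")   # grounded
tracker.utter("the pig with the shoes")
tracker.interrupt("the pig with the shoes")   # interrupted before grounding
```

The point of the sketch is that uttering alone is not sufficient for an expression to count as shared; only a completed grounding step moves it into common ground.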
6.4. Mentalizing Versus Mirroring
Recall that the goal of this review is to better understand how speakers and addressees take one another into account during processing. The behavioral evidence of adaptive processing that we wish to explain emerges from not only cues that unfold during interaction (locally driven) but also simple models of a partner (globally driven; see Section 5.1). This distinction can be mapped onto its neural counterpart, the mirror system (driven by sensorimotor resonance by which one partner simulates another’s perspective) versus the mentalizing system (which involves more conceptual perspective taking). How might the mentalizing and mirroring systems work together to support flexible partner-adapted processing? The answer is not clear. In a comprehensive meta-analysis of over 200 fMRI studies, Van Overwalle & Baetens (2009) considered three possibilities: (1) that mentalizing and mirroring might show anatomical overlap and share a functional core, (2) that they might not overlap but both be active during the same sorts of tasks, or (3) that they might be activated independently. They found the mirroring and mentalizing systems to be ‘‘rarely concurrently active’’ (p. 564), and so concluded that they are complementary, with neither subserving the other. This conclusion does not seem like the end of the story, however. These authors acknowledge ‘‘the lack of clear anatomical definitions for the pSTS and the TPJ’’ and warn that the overlap in their patterns of activation ‘‘cautions against making any strong distinction between them’’ (p. 568). Recall that the TPJ is implicated in rapid mentalizing. However, the seeds of an answer may exist in Noordzij et al.’s (2009) study, which aimed to distinguish mentalizing from mirror networks. Here, the right pSTS was activated not only in recognizing communicative actions, but also in planning actions intended to be recognized as
communicative (see Section 6.2.2). The right pSTS, traditionally associated with the mirror system, appeared to participate in a ToM pattern of activation that included mPFC activation, as well as coinciding with the deactivation of the mirror system’s sensorimotor areas (which were most deactivated during planning communicative action). Unfortunately this study is too new to have been covered in the meta-analysis; however, it gives us two reasons to question Van Overwalle & Baetens’ conclusion that mirroring and mentalizing are independent. First, it may have been premature to conclude that pSTS activation is indicative only of mirroring and not of mentalizing (especially given Van Overwalle & Baetens’ own caveat), and so the two systems may share a functional core after all. Second, it is probable that few if any studies in the meta-analysis involved interactive communication between partners (the analysis did not include the other studies deploying interactive tasks that we have surveyed here: Krach et al., 2009; Rilling et al., 2004). So it may be that deploying measures that preserve essential aspects of communicative interaction (e.g., Suda et al., 2010) along with tasks that evoke recognition and planning of communicative intentions could show more clearly how these two essential networks might work together.
6.5. Cues Hypothesized to Support Partner-Adapted Processing
In this section, we consider several cues relevant to spoken communication. As we have argued from eye-tracking and other behavioral evidence (e.g., Brennan & Hanna, 2009; Metzing & Brennan, 2003), partner-adapted processing can be both rapid and flexible. Thus it makes sense to investigate not only mentalizing as a facilitator of such behaviors, but also the role of cues or local signals about a partner’s needs. Just as affordances in the environment appear to directly support behavior (Gibson, 1977; Norman, 2002), the evidence that unfolds either as feedback from a partner or progress in a joint task could shape an individual’s behavior ‘‘for the partner.’’ Reconsider (from Section 4) the three criteria for a cue to be ‘‘communicative’’: it must be informative, it must be able to be perceived, and it should be able to be modified by the originator’s intentions (Brennan & Williams, 1995). It can be a challenge to set up behavioral studies of communication that satisfy the last criterion. The ‘‘tacit communication game’’ of Noordzij et al. (2009) accomplished this quite well and found a clear dissociation for moves that signal intention versus (instrumental) moves that do not, when the moves employ otherwise identical perceptual/motor actions. As the neural network(s) associated with processing communicative intentions (whether from local cues or global knowledge) become better understood, imaging may be able to illuminate communicative processing in ways that are impossible with behavioral studies alone.
6.5.1. Processing Cues That Initiate Social Interaction
A dialog begins when one partner recognizes another’s intention to communicate. Calling a partner’s name (an auditory cue) and making eye contact (a visual cue) signal the initiation of social interaction. Both of these cues activate the mPFC (in particular, the right paracingulate cortex) and the left temporal pole of an addressee (Kampe et al., 2003), suggesting that these regions are part of a multimodal circuit that supports recognizing a partner’s intention to communicate.
6.5.2. Voice Cues to Partner Identity
Because fMRI studies address anatomical localization but not event-related timing, it is particularly useful to consider electrophysiological evidence from event-related potentials (ERPs) in order to consider the time course with which partner-specific information may have an effect. New evidence from electrophysiological data demonstrates that listeners integrate the content of an utterance with stereotypic information about its speaker from the earliest moments of utterance processing (Van Berkum, van den Brink, Tesink, Kos, & Hagoort, 2008). In Van Berkum et al.’s study, listeners heard utterances (in Dutch) whose content was either congruent or incongruent with stereotypes evoked by the voices in which they were spoken, such as: statements odd for a child speaker but not for an adult (Every evening I drink some wine before I go to sleep), odd for a man but not a woman (I recently had a check-up at the gynecologist in the hospital), or odd for a speaker with a lower-class accent but not an upper-class one (In my free time I enjoy listening to piano music by Chopin). Voice-incongruent utterances evoked reliable N400 waves right from the acoustic onsets of relevant words, at the same early point in time as lexically based semantic anomalies evoke N400s when other semantic information is integrated (Van Berkum et al., 2008).
It is remarkable that this incongruity effect of utterance content and speaker stereotype was cued entirely by prerecorded voices (with each presented in a block). It is certainly possible that physical copresence with an interacting speaker in dialog could yield even stronger partner-specific effects, if ERP could be used in this kind of situation. Van Berkum and colleagues next localized this speaker-specific effect. Generally speaking, recognition of a speaker’s identity and characteristics (as evident in the voice) is associated with activation in the right anterior superior temporal sulcus (STS) or temporal pole. Presumably that area provides inputs into language processing in Broca’s Area (BA 44 and 45 in the inferior frontal gyrus, IFG). An fMRI study using the same stimuli as Van Berkum et al. (2008) found more activation in the left IFG (or Brodmann’s Areas 45/47) as well as the right IFG (BA 47) for voice-incongruent sentences than for voice-congruent sentences (although IFG was activated for both kinds of sentences; Tesink et al., 2008). This was interpreted as
reflecting effort to unify lexico-semantic information from the utterance with the world knowledge stereotype evoked by the speaker’s voice. Sentences in which voice and message were coherent led to enhanced activation in the bilateral superior temporal cortex (STC, BA 22 extending into BA 41), the right lingual gyrus (BA 18), and the right posterior cingulate cortex (PCC, BA 29). These regions were construed to form a “unification network” for combining linguistic and extralinguistic information, with STC activation proposed to be specific to the congruence between voice and message (as opposed to semantic coherence in general; Tesink et al., 2008). This study did not report any activation in the ToM network. Finally, autism is associated with (and sometimes diagnosed by) ToM deficits. In another study, Tesink and colleagues tested listeners with and without autism spectrum disorder (ASD) using the same voice-incongruent and congruent stimuli. Again, the listeners were able to detect the voice-incongruent messages, showing more activation in the right IFG (BA 47) for speaker-incongruent than congruent messages (Tesink et al., 2008, 2009). However, this activation was stronger in listeners with ASD than without; their increased right hemisphere activation in this area over that of non-ASD listeners was interpreted as evidence of compensation, or more effortful processing (perhaps due to difficulty in evoking stereotypes). In addition, non-ASD listeners showed more activity than did ASD listeners in the right ventral mPFC (BA 10) and right ACC (anterior cingulate cortex, BA 24/32) regions (Tesink et al., 2009).
7. Conclusions
Psycholinguistic studies of dialog that preserve as many of the natural aspects of spontaneous interpersonal communication as possible (while at the same time achieving sufficient control) have found evidence that speakers and addressees can adapt to each other from the early moments of processing. That is, processing need not be encapsulated from relevant partner-specific information that is straightforward and known in advance. Under some circumstances, speakers can adjust immediately to their addressees’ needs or perspectives, even when these are distinct from their own. We propose the following as useful design considerations for experimental studies that aim to uncover the cognitive and/or neural bases of language processing in communicative contexts, and in particular, partner-specific processing:
• To the extent that an experimental task affords behavioral, eye-tracking, or imaging evidence that can be measured independently from evidence in the stimulus events or transcript, this gives the experimenter a window into subjects’ cognitive processing.
• The “language game” that subjects are asked to play should be well characterized and staged such that it does not exclude the behavior that it aims to study. To this end, imaging studies with tasks that require subjects to communicate should yield valid data about the kind of processing that underlies language-as-action.
• Especially useful is evidence that unfolds moment by moment and can be synchronized with events or a transcript, or that can be collected from two interacting partners and synchronized.
• To experimentally distinguish “for-the-self” from “for-the-other” processing, partners doing a joint task must (at least at some point in the task) have perspectives, needs, or knowledge states that can be operationally distinguished from each other’s.
• Unless the goal is to study perspective taking under cognitive load, information about one partner’s needs must be available to the other partner in a timely enough fashion to be incorporated into speech planning, articulation, or interpretation; otherwise, one cannot conclude that behavior that seems to be egocentric is actually egocentric.
• It may be useful for an experimental design to distinguish local (sensorimotor) cues from global cues that are updated less often, or at least to take this distinction into account.
• It may be useful to characterize cues as to whether they consist of signals intended to be recognized as communicative (in the Gricean sense), or whether they are simply informative. This may determine whether they activate the mentalizing system.
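As one illustration of what synchronizing moment-by-moment evidence with a transcript could involve, the following is a minimal sketch in Python. It is not a procedure taken from the studies discussed here. The function name, the data layout (millisecond timestamps paired with labels), and the sample values are all hypothetical, and the sketch assumes that both data streams share a common clock.

```python
from bisect import bisect_right


def align_samples_to_events(samples, events):
    """Label each timestamped sample with the most recent event at or before it.

    samples: list of (time_ms, value) tuples, e.g., gaze fixations (hypothetical layout).
    events:  list of (onset_ms, label) tuples sorted by onset, e.g., word onsets
             from a time-stamped transcript (hypothetical layout).
    Returns a list of (time_ms, value, label_or_None) tuples.
    """
    onsets = [onset for onset, _ in events]
    aligned = []
    for time_ms, value in samples:
        # Index of the last event whose onset is <= this sample's time.
        idx = bisect_right(onsets, time_ms) - 1
        label = events[idx][1] if idx >= 0 else None
        aligned.append((time_ms, value, label))
    return aligned


# Hypothetical data: word onsets from a transcript and gaze samples from an addressee.
transcript = [(0, "the"), (180, "red"), (420, "candle")]
gaze = [(-50, "elsewhere"), (100, "speaker_face"), (200, "red_cup"), (450, "red_candle")]

aligned = align_samples_to_events(gaze, transcript)
for row in aligned:
    print(row)
```

The same alignment could be run separately on each partner's data stream, so that the two partners' behavior can be compared event by event. A sample that precedes the first transcript event is labeled None rather than being assigned to a word.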
When thinking about how to model partner-adapted processing, it is productive to consider fMRI and electrophysiology data alongside eye-tracking and behavioral studies of communication. We anticipate that timing data from electrophysiology studies and anatomical data from imaging studies have potential to clarify process models that would otherwise be ambiguous. Each approach can shape and inform the kinds of questions that the other can ask, as well as the kinds of cognitive models that it makes sense to propose. Ultimately, plausible cognitive models must be guided by neurological constraints. The distinction between local cues and global partner models that we have developed in our behavioral studies seems to map naturally onto the mirror system and the mentalizing network, respectively. Our findings about how local and global sources of information shape one another to achieve partner-adapted processing lead us to seek out ways in which the mirror and mentalizing systems coexist in the service of language and communicative processing. Executive control appears to play an important role in both kinds of systems: for instance, to inhibit mimicry in the mirror system when necessary, and to select, suppress, or update a global perspective, especially when more than one perspective is implicated in the context (e.g., self vs. other).
The mirror system automatically processes social cues that are sensorimotor in nature (e.g., voice, gaze, body motion, backchannels), whereas ToM underlies more conceptual modeling of a partner’s perspectives, needs, and intentions. It remains to be established whether and how these circuits interact. But given the range of processes they support and the likely importance of these processes in interpersonal communication, we expect that they do interact. Previous imaging studies (e.g., as surveyed by Van Overwalle & Baetens, 2009) have failed to clearly establish how they may work together, but this does not mean they are independent, especially since many of the tasks currently in use (especially for ToM) are based on an impoverished notion of what constitutes dialog. Most of the tasks employed so far in ToM studies have not involved interpersonal interaction (or first- or second-person communicative intent); progress could accelerate with more sophistication in the kinds of language tasks that imagers employ. Another challenge is that sometimes it is difficult to determine exactly which anatomical areas are activated in a particular study. There is much that is unknown about the potential connectivity among regions and about the time course of their activation. And it is extremely difficult to stage an experiment in a scanner that involves speaking; perhaps new experimental techniques, such as near-infrared spectroscopy (Suda et al., 2010), will make it easier to use tasks that preserve the essence of spoken (or even face-to-face) dialog. We also expect that new evidence from imaging studies will help to clarify how ToM and mirroring neural circuits work in concert with those traditionally associated with language, with profound implications for neural models of joint processing both within and between the minds of language users.
Understanding how brain networks interact may promote a more nuanced understanding of why communication failures occur, of individual differences in perspective taking, and of the neural basis of communication deficits. In closing, we suggest that to study language use based entirely on individual cognitive processes is to overlook a ubiquitous and astonishing human skill: the coordination of the behavior and mental states of interacting individuals. Interpersonal coordination is so pervasive that it is worthy of scientific investigation in its own right. This skill proceeds in parallel (and is closely integrated) with traditional psycholinguistic processing. For that reason, we advocate studying language processing along with interpersonal coordination in order to understand what it is that minds actually do when communicating.
ACKNOWLEDGMENTS
We thank Richard Gerrig, Arthur Aron, and Hoi-Chung Leung for their comments and the Gesture Focus Group for many helpful discussions. This material is based upon work supported by NSF under Grants IIS-0527585 and ITR-0325188. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
REFERENCES
Akmajian, A., Demers, R. A., & Harnish, R. M. (1987). Linguistics: An introduction to language and communication (2nd ed.). Cambridge, MA: MIT Press.
Altmann, G. T. M., & Kamide, Y. (2007). The real-time mediation of visual attention by language and world knowledge: Linking anticipatory (and other) eye movements to linguistic processing. Journal of Memory and Language, 57, 502–518.
Arnold, J. E., Hudson-Kam, C. L., & Tanenhaus, M. K. (2007). If you say thee uh—you’re describing something hard: The on-line attribution of disfluency during reference comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 914–930.
Arnold, J. E., Tanenhaus, M. K., Altmann, R., & Fagnano, M. (2004). The old and thee, uh, new. Psychological Science, 15, 578–581.
Bard, E. G., Anderson, A. H., Sotillo, C., Aylett, M., Doherty-Sneddon, G., & Newlands, A. (2000). Controlling the intelligibility of referring expressions in dialogue. Journal of Memory and Language, 42, 1–22.
Bard, E. G., & Aylett, M. (2000). Accessibility, duration, and modeling the listener in spoken dialogue. In: Proceedings of Gotalog 2000, 4th Workshop on the Semantics and Pragmatics of Dialogue, Gotalog, Sweden: Gothenburg University.
Barr, D. J., & Keysar, B. (2002). Anchoring comprehension in linguistic precedents. Journal of Memory and Language, 46, 391–418.
Basden, B. H., Basden, D. R., & Henry, S. (2000). Costs and benefits of collaborative remembering. Applied Cognitive Psychology, 14, 497–507.
Bekkering, H., de Bruijn, E. R. A., Cuijpers, R. H., Newman-Norlund, R., van Schie, H. T., & Meulenbroek, R. (2009). Joint action: Neurocognitive mechanisms supporting human interaction. Topics in Cognitive Science (Special Issue on Joint Action), 1, 340–352.
Bortfeld, H., & Brennan, S. E. (1997). Use and acquisition of idiomatic expressions in referring by native and non-native speakers. Discourse Processes, 23, 119–147.
Branigan, H. P., Pickering, M. J., & Cleland, A. A. (2000). Syntactic coordination in dialogue. Cognition, 75, B13–B25.
Brass, M., Derrfuss, J., & von Cramon, D. Y. (2005). The inhibition of imitative and overlearned responses: A functional double dissociation. Neuropsychologia, 43, 89–98.
Brass, M., Zysset, S., & von Cramon, D. Y. (2001). The inhibition of imitative response tendencies. NeuroImage, 14, 1416–1423.
Brennan, S. E. (1990). Speaking and providing evidence for mutual understanding. Unpublished doctoral dissertation, Stanford University, Stanford, CA.
Brennan, S. E. (1991). Conversation with and through computers. User Modeling and User-Adapted Interaction, 1, 67–86.
Brennan, S. E. (1995). Centering attention in discourse. Language and Cognitive Processes, 10, 137–167.
Brennan, S. E. (2004). How conversation is shaped by visual and spoken evidence. In J. Trueswell & M. Tanenhaus (Eds.), Approaches to studying world-situated language use: Bridging the language-as-product and language-as-action traditions (pp. 95–129). Cambridge, MA: MIT Press.
Brennan, S. E., Chen, X., Dickinson, C. A., Neider, M. B., & Zelinsky, G. J. (2007). Coordinating cognition: The costs and benefits of shared gaze during collaborative search. Cognition, 106, 1465–1477.
Brennan, S. E., & Clark, H. H. (1996). Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 6, 1482–1493.
Brennan, S. E., & Hanna, J. E. (2009). Partner-specific adaptation in dialogue. Topics in Cognitive Science (Special Issue on Joint Action), 1, 274–291.
Brennan, S. E., & Kipp, E. G. (1996). An addressee’s knowledge affects a speaker’s use of fillers in question-answering. In: Abstracts of the Psychonomic Society, 37th Annual Meeting (p. 24), Chicago, IL.
Brennan, S. E., Kuhlen, A. K., & Ratra, B. (2010). Audience design in answering rhetorical versus sincere questions (in preparation).
Brennan, S. E., & Ohaeri, J. O. (1999). Why do electronic conversations seem less polite? The costs and benefits of hedging. In: Proceedings, International Joint Conference on Work Activities, Coordination, and Collaboration (WACC’99) (pp. 227–235), San Francisco, CA: ACM.
Brennan, S. E., & Schober, M. F. (2001). How listeners compensate for disfluencies in spontaneous speech. Journal of Memory and Language, 44, 274–296.
Brennan, S. E., & Williams, M. (1995). The feeling of another’s knowing: Prosody and filled pauses as cues to listeners about the metacognitive states of speakers. Journal of Memory and Language, 34, 383–398.
Brown, P. M., & Dell, G. S. (1987). Adapting production to comprehension: The explicit mention of instruments. Cognitive Psychology, 19, 441–472.
Brown-Schmidt, S. (2009). The role of executive function in perspective taking during online language comprehension. Psychonomic Bulletin & Review, 16, 893–900.
Brown-Schmidt, S., Gunlogson, C., & Tanenhaus, M. K. (2008). Addressees distinguish shared from private information when interpreting questions during interactive conversation. Cognition, 107, 1122–1134.
Cahn, J. E., & Brennan, S. E. (1999). A psychological model of grounding and repair in dialog. In: Proceedings, AAAI Fall Symposium on Psychological Models of Communication in Collaborative Systems (pp. 25–33), North Falmouth, MA: American Association for Artificial Intelligence.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1980). Rules and representations. Behavioral and Brain Sciences, 3(1), 1–62.
Ciaramidaro, A., Adenzato, M., Enrici, I., Erk, S., Pia, L., Bara, B. G., et al. (2007). The intentional network: How the brain reads varieties of intentions. Neuropsychologia, 45, 3105–3113.
Clark, H. H. (1992). Arenas of language use. Chicago, IL: University of Chicago Press.
Clark, H. H. (1994). Managing problems in speaking. Speech Communication, 15, 243–250.
Clark, H. H. (1996). Using language. Cambridge MA: Cambridge University Press.
Clark, H. H., & Brennan, S. E. (1991). Grounding in communication. In L. B. Resnick, J. Levine, & S. D. Teasley (Eds.), Perspectives on socially shared cognition (pp. 127–149). Washington, DC: APA. Reprinted in R. M. Baecker (Ed.), Groupware and computer-supported cooperative work: Assisting human–human collaboration (pp. 222–233). San Mateo, CA: Morgan Kaufman Publishers.
Clark, H. H., & Carlson, T. B. (1981). Context for comprehension. In J. Long & A. Baddeley (Eds.), Attention and performance, Vol. IX (pp. 313–330). Hillsdale, NJ: Erlbaum.
Clark, H. H., & Fox Tree, J. E. (2002). Using uh and um in spontaneous speaking. Cognition, 84, 73–111.
Clark, H. H., & Schaefer, E. F. (1989). Contributing to discourse. Cognitive Science, 13, 259–294.
Clark, H. H., & Wilkes-Gibbs, D. (1986). Referring as a collaborative process. Cognition, 22, 1–39.
Dahan, D., Tanenhaus, M. K., & Chambers, C. G. (2002). Accent and reference resolution in spoken language comprehension. Journal of Memory and Language, 47, 292–314.
Dell, G. S., & Brown, P. M. (1991). Mechanisms for listener-adaptation in language production: Limiting the role of the ‘Model of the Listener’. In D. Napoli & J. Kegl (Eds.), Bridges between psychology and linguistics: A Swarthmore Festschrift for Lila Gleitman (pp. 105–129). Hillsdale, NJ: Erlbaum.
di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.
Don, A., Brennan, S., Laurel, B., & Schneiderman, B. (1992). Anthropomorphism: From Eliza to Terminator 2. In: Proceedings, CHI’92, Human Factors in Computing Systems, Monterey, CA (pp. 67–70).
DuBois, J. (1974). Syntax in mid-sentence. In C. Fillmore, G. Lakoff, & R. Lakoff (Eds.), Berkeley studies in syntax and semantics, Vol. 1 (pp. III.1–III.25). Berkeley, CA: University of California.
Eden, M. (1983). Cybernetics. In F. Machlup & U. Mansfield (Eds.), The study of information: Interdisciplinary messages (pp. 409–439). New York, NY: John Wiley & Sons.
Ekeocha, J. O., & Brennan, S. E. (2008). Collaborative recall in face-to-face and electronic groups. Memory, 16, 245–261.
Fowler, C. A., & Housum, J. (1987). Talkers signaling ‘new’ and ‘old’ words in speech and listeners’ perception and use of the distinction. Journal of Memory and Language, 26, 489–504.
Fussell, S. R., & Krauss, R. M. (1989). Understanding friends and strangers: The effects of audience design on message comprehension. European Journal of Social Psychology, 19, 509–525.
Fussell, S. R., & Krauss, R. M. (1991). Accuracy and bias in estimates of others’ knowledge. European Journal of Social Psychology, 21, 445–454.
Fussell, S. R., & Krauss, R. M. (1992). Coordination of knowledge in communication: Effects of speakers’ assumptions about what others know. Journal of Personality and Social Psychology, 62, 378–391.
Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin & Review, 13(3), 361–377.
Galati, A., & Brennan, S. E. (2010a). Attenuating information in spoken communication: For the speaker, or for the addressee? Journal of Memory and Language, 62, 35–51.
Galati, A., & Brennan, S. E. (2010b). Audience design in the production of gesture (under review).
Gallagher, H. L., & Frith, C. D. (2003). Functional imaging of ‘theory of mind’. Trends in Cognitive Sciences, 7, 77–83.
Gallagher, H. L., Jack, A. I., Roepstorff, A., & Frith, C. D. (2002). Imaging the intentional stance in a competitive game. NeuroImage, 16, 814–821.
Gallese, V., & Goldman, A. (1998). Mirror neurons and the simulation theory of mindreading. Trends in Cognitive Sciences, 2, 493–501.
Gibson, J. J. (1977). The theory of affordances. In R. Shaw & J. Bransford (Eds.), Perceiving, acting, and knowing. Hillsdale, NJ: Lawrence Erlbaum Associates.
Giles, H., & Powesland, P. F. (1975). Speech styles and social evaluation. London: Academic Press.
Glucksberg, S., Krauss, R., & Weisberg, R. (1966). Referential communication in nursery school children: Method and some preliminary findings. Journal of Experimental Child Psychology, 3, 333–342.
Goodwin, C. (1981). Conversational organization: Interaction between speakers and hearers. New York, NY: Academic Press.
Grice, H. P. (1957). Meaning. Philosophical Review, 66, 377–388.
Grice, H. P. (1975). Logic and conversation (from the William James lectures, Harvard University, 1967). In P. Cole & J. Morgan (Eds.), Syntax and semantics 3: Speech acts (pp. 41–58). New York, NY: Academic Press.
Hanna, J. E., & Brennan, S. E. (2007). Speakers’ eye gaze disambiguates referring expressions early during face-to-face conversation. Journal of Memory and Language, 57, 596–615.
Hanna, J. E., & Tanenhaus, M. K. (2004). Pragmatic effects on reference resolution in a collaborative task: Evidence from eye movements. Cognitive Science, 28, 105–115.
Hanna, J. E., Tanenhaus, M. K., & Trueswell, J. C. (2003). The effects of common ground and perspective on domains of referential interpretation. Journal of Memory and Language, 49, 43–61.
Harding, C. (1982). Development of the intention to communicate. Human Development, 25, 140–151.
Harris, C. B., Paterson, H. M., & Kemp, R. I. (2008). Collaborative recall and collective memory: What happens when we remember together? Memory, 16, 213–230.
Hart, J. T. (1965). Memory and the feeling-of-knowing experience. Journal of Educational Psychology, 56, 208–216.
Hollingshead, A. B. (1998). Retrieval processes in transactive memory systems. Journal of Personality and Social Psychology, 74, 659–671.
Horton, W. S., & Gerrig, R. J. (2002). Speaker’s experiences and audience design: Knowing when and knowing how to adjust utterances to addressees. Journal of Memory and Language, 47, 589–606.
Horton, W. S., & Gerrig, R. J. (2005a). Conversational common ground and memory processes in language production. Discourse Processes, 40, 1–35.
Horton, W. S., & Gerrig, R. J. (2005b). The impact of memory demands on audience design during language production. Cognition, 96, 127–142.
Horton, W. S., & Keysar, B. (1996). When do speakers take into account common ground? Cognition, 59, 91–117.
Hwang, J., Brennan, S. E., & Huffman, M. K. (2007). How non-native speakers make phonetic adjustments to partners in dialogue. In: Abstracts of the Psychonomic Society, 48th Annual Meeting (p. 88), Long Beach, CA.
Iacoboni, M., Woods, R. P., Brass, M., Bekkering, H., Mazziotta, J. C., & Rizzolatti, G. (1999). Cortical mechanisms of human imitation. Science, 286, 2526–2528.
Jefferson, G. (1973). A case of precision timing in ordinary conversation: Overlapped tag-positioned address terms in closing sequences. Semiotica, 9, 47–96.
Jurafsky, D. (1996). A probabilistic model of lexical and syntactic disambiguation. Cognitive Science, 20, 137–194.
Kampe, K. K. W., Frith, C. D., & Frith, U. (2003). “Hey John”: Signals conveying communicative intention toward the self activate brain regions associated with “mentalizing”, regardless of modality. Journal of Neuroscience, 23, 5258–5263.
Keysar, B. (1997). Unconfounding common ground. Discourse Processes, 24, 253–270.
Keysar, B., Barr, D. J., Balin, J. A., & Brauner, J. S. (2000). Taking perspective in conversation: The role of mutual knowledge in comprehension. Psychological Science, 11, 32–38.
Keysar, B., Barr, D. J., Balin, J. A., & Paek, T. S. (1998). Definite reference and mutual knowledge: Process models of common ground in comprehension. Journal of Memory and Language, 39, 1–20.
Keysar, B., Barr, D. J., & Horton, W. S. (1998). The egocentric bias of language use: Insights from a processing approach. Current Directions in Psychological Science, 7, 46–50.
Kiesler, S., & Sproull, L. (1992). Group decision making and communication technology. Organizational Behavior and Human Decision Processes, 52, 96–123.
Krach, S., Blümel, I., Marjoram, D., Lataster, T., Krabbendam, L., Weber, J., et al. (2009). Are women better mindreaders? Sex differences in neural correlates of mentalizing detected with functional MRI. BMC Neuroscience, 10, 9.
Krach, S., Hegel, F., Wrede, B., Sagerer, G., Binkofski, F., & Kircher, T. (2008). Can machines think? Interaction and perspective taking with robots investigated via fMRI. PLoS ONE, 3, e2597.
Kraljic, T., & Brennan, S. E. (2005). Using prosody and optional words to disambiguate utterances: For the speaker or for the addressee? Cognitive Psychology, 50, 194–231.
Krauss, R. M. (1987). The role of the listener: Addressee influences on message formulation. Journal of Language and Social Psychology, 6, 81–98.
Kraut, R. E., Lewis, S. H., & Swezey, L. W. (1982). Listener responsiveness and the coordination of conversation. Journal of Personality and Social Psychology, 43, 718–731.
Kronmüller, E., & Barr, D. J. (2007). Perspective-free pragmatics: Broken precedents and the recovery-from-preemption hypothesis. Journal of Memory and Language, 56, 436–455.
Kuhlen, A. K., & Brennan, S. E. (2008). Addressees shape speaking: When confederates may be hazardous to your data. In: Abstracts of the Psychonomic Society, 49th Annual Meeting (p. 6), Chicago, IL.
Kuhlen, A. K., & Brennan, S. E. (2010). Anticipating distracted addressees: How speakers’ expectations and addressees’ feedback influence storytelling. Discourse Processes, (in press).
Kuhlen, A. K., Galati, A., & Brennan, S. E. (2010). Gesturing integrates top-down and bottom-up information: Effects of speakers’ expectations and addressees’ feedback (under review).
Lerner, G. H. (1996). On the “semi-permeable” character of grammatical units in conversation: Conditional entry into the turn space of another speaker. In E. Ochs, E. A. Schegloff, & S. Thompson (Eds.), Interaction and grammar (pp. 238–276). Cambridge, MA: Cambridge University Press.
Levelt, W. J. M., & Kelter, S. (1982). Surface form and memory in question answering. Cognitive Psychology, 14, 78–106.
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of speech code. Psychological Review, 74, 431–461.
Lieberman, P. (1963). Some effects of context on the intelligibility of hearing and deaf children’s speech. Language and Speech, 24, 255–264.
Lockridge, C. B., & Brennan, S. E. (2002). Addressees’ needs influence speakers’ early syntactic choices. Psychonomic Bulletin & Review, 9, 550–557.
MacDonald, M. C. (1994). Probabilistic constraints and syntactic ambiguity resolution. Language and Cognitive Processes, 9, 157–201.
MacKay, D. M. (1983). The wider scope of information theory. In F. Machlup & U. Mansfield (Eds.), The study of information: Interdisciplinary messages (pp. 485–492). New York, NY: John Wiley & Sons.
Matthews, D. E., Lieven, E. V. M., & Tomasello, M. (2008). What’s in a manner of speaking? Children’s sensitivity to partner-specific referential precedents. In: Proceedings of the LONDIAL Workshop on the Semantics and Pragmatics of Dialog, London, UK.
McAllister, J., Potts, A., Mason, K., & Marchant, G. (1994). Word duration in monologue and dialogue speech. Language and Speech, 37, 393–405.
Metzing, C., & Brennan, S. E. (2003). When conceptual pacts are broken: Partner-specific effects in the comprehension of referring expressions. Journal of Memory and Language, 49, 201–213.
Nadig, A. S., & Sedivy, J. S. (2002). Evidence of perspective-taking constraints in children’s online reference resolution. Psychological Science, 13, 329–336.
Neider, M. B., Chen, X., Dickinson, C. A., Brennan, S. E., & Zelinsky, G. J. (2005). Sharing eyegaze is better than speaking in a time-critical consensus task. In: Abstracts of the Psychonomic Society, 46th Annual Meeting (p. 72), Toronto, Canada.
Newman-Norlund, R. D., van Schie, H. T., van Zuijlen, A. M. J., & Bekkering, H. (2007). The mirror neuron system is more active during complementary compared with imitative action. Nature Neuroscience, 10, 817–818.
Noordzij, M. L., Newman-Norlund, S. E., de Ruiter, J. P., Hagoort, P., Levinson, S. C., & Toni, I. (2009). Brain mechanisms underlying human communication. Frontiers in Human Neuroscience, 3, 1–13.
Nooteboom, S. G. (1991). Perceptual goals of speech production. In: Proceedings of the 12th International Congress of Phonetic Sciences, Aix-en-Provence, France, August 19–24 (pp. 107–110), Vol. 1.
Norman, D. A. (2002). The design of everyday things. New York, NY: Basic Books.
Perryman, G. A., & Brennan, S. E. (2009). Effects of multiple speakers (copresent or not) on dialog context. In: Abstracts of the Psychonomic Society, 50th Annual Meeting, Boston, MA.
Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27, 167–226.
Polichak, J. W., & Gerrig, R. J. (1998). Common ground and everyday language use: Comments on Horton and Keysar (1996). Cognition, 66, 183–189.
Reddy, M. J. (1979). The conduit metaphor—A case of frame conflict in our language about language. In A. Ortony (Ed.), Metaphor and thought (pp. 284–297). Cambridge, UK: Cambridge University Press.
Rilling, J. K., Sanfey, A. G., Aronson, J. A., Nystrom, L. E., & Cohen, J. D. (2004). The neural correlates of theory of mind within interpersonal interactions. NeuroImage, 22, 1694–1703.
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking in conversation. Language, 50, 696–735.
Samuel, S. G., & Troicki, M. (1998). Articulation quality is inversely related to redundancy when children or adults have verbal control. Journal of Memory and Language, 39, 175–194.
Saxe, R., & Kanwisher, N. (2003). People thinking about thinking people: The role of the temporo-parietal junction in “theory of mind”. NeuroImage, 19, 1835–1842.
Schober, M. F. (1998). Conversational evidence for rethinking meaning. Social Research, 65, 511–534.
Schober, M. F. (2004). Just how aligned are interlocutors’ representations? Behavioral and Brain Sciences, 27, 209–210.
Schober, M. F., & Clark, H. H. (1989). Understanding by addressees and overhearers. Cognitive Psychology, 21, 211–232.
Scholl, B. J., & Leslie, A. M. (1999). Modularity, development and ‘theory of mind’. Mind & Language, 14, 131–153.
Sebanz, N., Bekkering, H., & Knoblich, G. (2006). Joint action: Bodies and minds moving together. Trends in Cognitive Sciences, 10, 70–76.
Sebanz, N., & Knoblich, G. (2008). From mirroring to joint action. In I. Wachsmuth, M. Lenzen, & G. Knoblich (Eds.), Embodied communication (pp. 129–150). Oxford: Oxford University Press.
Sebanz, N., & Knoblich, G. (2009). Prediction in joint action: What, when, and where. Topics in Cognitive Science, 1, 353–367.
Shannon, C., & Weaver, W. (1949). The mathematical theory of communication. Urbana, IL: University of Illinois Press.
Shatz, M., & Gelman, R. (1973). The development of communication skills: Modifications in the speech of young children as a function of listener. Monographs of the Society for Research in Child Development, 38, 1–38.
Shockley, K., Richardson, D., & Dale, R. (2009). Conversation and coordinative structures. Topics in Cognitive Science, 1, 305–319.
Smith, V. L., & Clark, H. H. (1993). On the course of answering questions. Journal of Memory and Language, 32, 25–38.
Stellmann, P., & Brennan, S. E. (1993). Flexible perspective-setting in conversation. In: Abstracts of the Psychonomic Society, 34th Annual Meeting (p. 20), Washington, DC.
Suda, M., Takei, Y., Aoyama, Y., Narita, K., Sato, T., Fukuda, M., et al. (2010). Frontopolar activation during face-to-face conversation: An in situ study using near-infrared spectroscopy. Neuropsychologia, 48, 441–447.
Swerts, M., & Krahmer, E. (2005). Audiovisual prosody and feeling of knowing. Journal of Memory and Language, 53, 81–94.
Tanenhaus, M. K., Spivey-Knowlton, M., Eberhard, K., & Sedivy, J. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268, 1632–1634.
344
Susan E. Brennan et al.
Tanenhaus, M. K., & Trueswell, J. C. (1995). Sentence comprehension. In J. Miller & P. Eimas (Eds.), Handbook of perception and cognition, Vol. 11: Speech language and communication. San Diego, CA: Academic Press. Tanenhaus, M. K., & Trueswell, J. C. (2004). Eye movements as a tool for bridging the language-as-product and language-as-action traditions. In J. C. Trueswell & M. K. Tanenhaus (Eds.), Approaches to studying world-situated language use: Bridging the languageas-product and language-action traditions (pp. 3–37). Cambridge, MA: MIT Press. Tesink, C. M. J. Y., Buitelaar, J. K., Petersson, K. M., van der Gaag, R. J., Kan, C. C., Tendolkar, I., et al. (2009). Neural correlates of pragmatic language comprehension in autism spectrum disorders. Brain, 132, 1941–1952. Tesink, C. M. J. Y., Petersson, K. M., van Berkum, J. J. A., van den Brink, D., Buitelaar, J. K., & Hagoort, P. (2008). Unification of speaker and meaning in language comprehension: An fMRI study. Journal of Cognitive Neuroscience, 21, 2085–2099. Van Berkum, J. J. A., van den Brink, D., Tesink, C. M. J. Y., Kos, M., & Hagoort, P. (2008). The neural integration of speaker and message. Journal of Cognitive Neuroscience, 20, 580–591. Van der Cruyssen, L., Van Duynslaeger, M., Cortoos, A., & Van Overwalle, F. (2009). ERP time course and brain areas of spontaneous and intentional goal inferences. Social Neuroscience, 4, 165–184. Van Duynslaeger, M., Van Overwalle, F., & Verstraeten, E. (2007). Electrophysiological time course and brain areas of spontaneous and intentional trait inferences. Social Cognitive and Affective Neuroscience, 2, 174–188. Van Leeuwen, M. L., van Baaren, R. B., Martin, D., Dijksterhuis, A., & Bekkering, H. (2009). Executive functioning and imitation: Increasing working memory load facilitates behavioural imitation. Neuropsychologia, 47, 3265–3270. Van Overwalle, F., & Baetens, K. (2009). Understanding others’ actions and goals by mirror and mentalizing systems: A meta-analysis. 
NeuroImage, 48, 564–584. Vogeley, K., Bussfeld, P., Newen, A., Herrmann, S., Happe, F., Falkai, P., et al. (2001). Mind reading: Neural mechanisms of theory of mind and self perspective. NeuroImage, 14, 170–181. Weldon, M. S., & Bellinger, K. D. (1997). Collective memory: Collaborative and individual processes in remembering. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1160–1175. Wiener, N. (1965). Cybernetics—or control and communication in the animal and the machine (2nd ed.). Cambridge, MA: The MIT Press (originally published in 1948). Wiley, J., & Bailey, J. (2006). Effects of collaboration and argumentation on learning from web pages. In A. M. O’Donnell, C. E. Hmelo-Silver, & G. Erkens (Eds.), Collaborative learning, reasoning, and technology (pp. 297–321). Hillsdale, NJ: Erlbaum. Wiley, J., & Jensen, M. (2006). When three heads are better than two. In: Proceedings, CogSci 2006, 28th Annual Conference of the Cognitive Science Society (p. 3275), Vancouver, CA: Cognitive Science Society. Wilkes-Gibbs, D. (1986). Collaborative processes of language use in conversation. Unpublished doctoral dissertation, Stanford University, Stanford, CA. Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children’s understanding of deception. Cognition, 13, 103–128. Wright, D. B., & Klumpp, A. (2004). Collaborative inhibition is due to the product, not the process, of recalling in groups. Psychonomic Bulletin & Review, 11, 1080–1083. Yngve, V. H. (1970). On getting a word in edgewise. In: Papers from the 6th Regional Meeting of the Chicago Linguistic Society (pp. 567–578), Chicago, IL: Chicago Linguistic Institute.
CHAPTER NINE
Retrieving Personal Names, Referring Expressions, and Terms of Address
Zenzi M. Griffin
Contents
1. Introduction
2. Psychological Research on Personal Name Production
2.1. How Difficult Are Personal Names?
2.2. Why Are Personal Names So Difficult?
3. Personal Names and Reference Across Cultures
3.1. What Are Names Like Cross-Culturally?
3.2. How Are People Referred to?
4. Direct Address in Spoken Language
4.1. Forms of Direct Address
4.2. Factors Influencing Choice of Address Form
5. Conclusion
Acknowledgments
References
Abstract
Why is it more difficult to recall the names of celebrities and old acquaintances than other words that one seldom uses? Several factors related to the way information about people is structured and how words are produced conspire to make personal names particularly difficult to retrieve. In contrast, expressions such as descriptive nicknames, kinship terms, and titles appear easier to retrieve. A review of how people are named, referred to, and addressed across cultures and situations suggests that there is a broad range in the relative difficulty of producing terms and that several social variables must be considered in a full account of name retrieval.
Psychology of Learning and Motivation, Volume 53. ISSN 0079-7421, DOI: 10.1016/S0079-7421(10)53009-3. © 2010 Elsevier Inc. All rights reserved.
1. Introduction
Personal names are among the most difficult words to learn and retrieve. At the same time, they are often extremely important for social interactions. Several properties of personal names contribute to their difficulty. Rather than enter into the seemingly endless debate about the “meaning” of personal names (for novel perspectives, see, e.g., Allerton, 1987; Lévi-Strauss, 1966; Miller & Johnson-Laird, 1976), the question to be addressed here is how they are retrieved in speaking. That is, what processes and retrieval cues do speakers use to generate personal names, and how do these cues differ, if at all, from those used to retrieve other words?
There is a large existing literature on the use of names as opposed to third-person pronouns such as he¹ or she for referring to individuals in text and speech (for review, see Arnold, 2008). Much of this literature concentrates on the ability of a comprehender to recognize whom a speaker refers to and the extent to which other factors drive a speaker’s choice of referring expression (e.g., Arnold & Griffin, 2007). In contrast, no work in psycholinguistics has addressed the use of names as forms of direct address (i.e., vocative usage), as when speakers say the name of the person they are talking to, as in How’s it going, Brian? Moreover, people do not always address or refer to people by name. Speakers often use other terms such as titles (e.g., Doctor), kinship terms (Mom), nicknames (Fridge), endearments (Sweetie), and pronouns (you). Another goal of this chapter is to explore when these other forms of direct address are used and develop hypotheses about how they are retrieved.
First, I review the literature on the learning and retrieval of personal names in comparison to other words such as object names (see also Valentine, Brennen, & Brédart, 1996). As most of the work on name production has been carried out in the United States and United Kingdom, the generalizations and assumptions about the forms of personal names are not universally valid. For the initial portion of this chapter, I nevertheless discuss this work as if it does generalize broadly.
Then I briefly review naming and address systems from other cultures, drawing heavily on work from anthropology and sociolinguistics, and hypothesize about the relative ease of production for referring expressions and forms of direct address in these systems. Based on the information currently available, the terms used in other cultures may be easier to retrieve than the types of names typically studied or they may be difficult for different reasons.
2. Psychological Research on Personal Name Production
2.1. How Difficult Are Personal Names?
Complaints about the difficulty of learning and remembering personal names are common (e.g., Cohen & Faulkner, 1986). On occasion, speakers fail to retrieve the names of familiar people although they have identified them correctly and can provide specific information about them. Such tip-of-the-tongue (TOT) states occur when someone is confident that he/she knows the name of a person or concept but nonetheless cannot retrieve it. When in a TOT state, speakers often know the first sound or letter of the sought-for name and how many syllables it has (Brown & McNeill, 1966). In one study, 130 adults were asked to record each instance of TOT states over 4 weeks (Burke, MacKay, Worthley, & Wade, 1991). Although TOT states were relatively uncommon (averaging one occurrence per week for college-age adults), the majority of these retrieval failures were for proper names, and typically for acquaintances known for several years but with whom the speaker had not been in recent contact. Other diary studies report similar TOT frequencies for personal names (Cohen & Faulkner, 1986; Gollan, Bonanni, & Montoya, 2005; Young, Hay, & Ellis, 1985).
In laboratory studies too, speakers are more likely to report TOT states for the names of familiar celebrities than for familiar objects (e.g., Gollan et al., 2005). Unfortunately, it is not possible to equate the familiarity of objects and faces (Semenza, 2006) or labels² and celebrity names on all of the variables that one would want. Even taking this into account, personal names appear much more vulnerable to forgetting or retrieval blocks. Alas, for example, students forget the names of famous cognitive psychologists faster than they forget facts and concepts from cognitive psychology (Conway, Cohen, & Stanhope, 1991). Patients suffering from cognitive impairments tend to show greater decrements in their ability to retrieve proper names than common nouns (for review, see Yasuda, Nakamura, & Beckman, 2000). Even the cognitive deficits associated with extremely high altitude may impair recall of first names while sparing memory for common nouns (Pelamatti, Pascotto, & Semenza, 2003).³
Many studies have compared learning names for unfamiliar faces with the learning of other information such as occupations, hometowns, or hobbies. Again and again these studies show that personal names take longer to learn and are less likely to be recalled than other information (Cohen & Faulkner, 1986; Crook & West, 1990; McWeeny, Young, Hay, & Ellis, 1987). Personal names truly do seem particularly difficult to learn and retrieve.
¹ Italics will be used to indicate reference to the form of a word rather than its referent. Quotes denote usage.
² Typically the literature refers to object names, but these will be referred to as labels here in an attempt at clarity.
³ In this study, all of the common nouns came from different categories (e.g., body parts, vegetables, furniture) whereas nothing was done to make the personal names more distinctive. So, differences in recall may be due to greater item similarity within the list of personal names than within the list of common nouns. Note that this was not the case for the original study to use this technique with personal names (Hittmair-Delazer, Denes, Semenza, & Mantovan, 1994).
2.2. Why Are Personal Names So Difficult?
Speaking involves conceptualizing the content of an utterance, selecting a term to use, retrieving the sounds of the selected word, and planning and executing the motor movements to articulate it (see, e.g., Griffin & Ferreira, 2006; Levelt, 1989; Meyer & Belke, 2007). Difficulties can arise at any of these stages but only the stages preceding articulatory planning may account for TOT states. Theorists and researchers have noted several ways in which personal names differ from other words. Some of these differences appear real and are likely to lead to slower or less successful retrieval of personal names. Other proposed differences have not yet been substantiated; others appear unlikely to hinder word retrieval. The evidence for these proposals is reviewed in the following section.
2.2.1. Individuality, Uniqueness, and Arbitrariness
One of the most common assertions about proper nouns as opposed to object names is that the former pick out individuals (or tokens) rather than categorizing them as exemplars of a type (e.g., Hittmair-Delazer et al., 1994; Semenza & Zettin, 1989). This one-to-one relationship is often cited as a reason that personal names are more difficult to produce than object names and has been incorporated into most models of face naming to account for the retrieval difficulty of proper nouns (Burke et al., 1991; Burton, Bruce, & Johnston, 1990; Valentine et al., 1996). One prediction, then, is that other unique information should be equally difficult to retrieve. Harris and Kay (1995) methodically evaluated the extent to which a patient with proper-name anomia was impaired in retrieving information about people. Consistent with her diagnosis, the patient categorized and sorted celebrity photos perfectly and was unable to name many photos of celebrities and even photos of her friends. However, she had no problems generating unique information about celebrities (e.g., Why was Salman Rushdie in the news?)
or her friends (such as which seat they habitually occupied at daycare). Her ability to recall identifying information about individuals, but not their names suggests that uniqueness alone is not to blame for the vulnerability of personal names to retrieval problems (see also Hittmair-Delazer et al., 1994). Furthermore, not all proper nouns appear as difficult to retrieve as personal names are. For example, place names (e.g., New York, London, Sweden) and the names of monuments and masterpieces (e.g., the Statue of Liberty, the Mona Lisa) are often less impaired by brain damage than personal names are (Ghika-Schmid & Nater, 2003; Saetti, Marangolo, De Renzi, Rinaldi, & Lattanzi, 1999; Warrington & Clegg, 1993). A review of over 10 cases of proper-name anomia showed that place names were only impaired for the patients with the most severe celebrity face-naming impairments (Hanley & Kay, 1998). As the researchers note, this could be due to place names (at least those that were used in these particular tests) being acquired
earlier and used more often than the celebrity names to which they were compared. Other researchers have hypothesized that such differences may be due to famous geographical locations often being more meaningful as reflected in their use as adjectives such as Parisian, and therefore having more connections between conceptual and word representations (Cohen & Burke, 1993).⁴ Alternatively, the existence of adjectival word representations for place names may support the retrieval of nominal word representations (Semenza, 1997). So, despite the individuality and relative arbitrariness of place names, it seems that they may acquire meaningful associations to their names that support their retrieval. Note that the Mona Lisa is also used as a modifier, as in a Mona Lisa smile, and monuments often have meaningful words in them, such as statue and liberty, which may help their recall. That place names are often less impaired than personal names are further suggests that individuality and uniqueness alone do not make personal names particularly difficult to recall. Interestingly, one ability that seems just as impaired as retrieving personal names in proper-name anomia⁵ is retrieval of previously well-known telephone numbers and addresses (Harris & Kay, 1995; Saetti et al., 1999). Telephone numbers and addresses are extremely arbitrary, suggesting that arbitrariness and meaningfulness are important contributors to retrieval difficulty.⁶
⁴ Names of famous people can also be used as modifiers (Cohen & Burke, 1993), but this may be less often the case for the celebrities used in face-naming studies.
⁵ Many patients with proper-name anomia also have extreme difficulty in paired associate learning for unrelated words, implying that arbitrariness could be an issue (Hittmair-Delazer et al., 1994), but there are several exceptions (Ghika-Schmid & Nater, 2003; Saetti et al., 1999). As Saetti et al. point out, it is likely that participants vary in their tendency and ability to spontaneously generate meaningful associations between unrelated words.
⁶ Building names such as the Eiffel Tower are not generally extended to anything else and may have fewer associations than other famous unique things. Two experiments involving people with closed-head injuries and controls found that celebrities and buildings (matched on rated ease of naming) were equally slow and inaccurate in naming (Milders, 2000).
2.2.2. Word Forms
Learning someone’s occupation is typically a matter of linking a familiar concept and familiar word form (e.g., psychologist) to a new individual. Although one comes across new occupations occasionally (perhaps psycholinguist), new occupations are encountered far less frequently than novel first or last names. One study found that only 2% of first names in Oklahoma were meaningful words such as Ruby, Faith, or Violet (Alford, 1988). Moreover, even novel occupations typically contain familiar components (e.g., psycho-, lingu-, -ist) that make their forms and meanings somewhat familiar. In contrast, Brennen (1993) argues that learning someone’s name often involves learning a new word form (e.g., Zenzi) in addition to an association between the new name and the unfamiliar individual.
Indeed, people are much more successful at learning associations between names and faces when the forms of the names are common (e.g., Mr. Mitchell ) rather than so uncommon as to be unique (e.g., Mr. Marland; James & Fogler, 2007). Also, consistent with Brennen’s argument, when novel word forms such as ryman and crumpler are used for occupations in learning studies, occupations show no advantage in recall over surnames (Bruce, Burton, & Walker, 1994; Cohen, 1990b). In language production, words that share sounds with many other words are easier to retrieve than those that are more unique (e.g., Dell & Gordon, 2003; Harley & Bown, 1998; Vitevitch, 2002). Brennen (1993) posited that personal names vary much more widely in their phonological forms than other words do. Intuitively this assertion seems plausible. Greater variation may be due to many first and last names often coming from other languages that allow different sound sequences (e.g., Knut, Antje), due to the wide range of choices for first names among English speakers (e.g., using city names, combining parts of both parents’ first names into one for a child), or simply the combinations of sounds in even common names perhaps differing from those in other words (e.g., for a monosyllable, there are relatively few words that differ from John by one sound). Further, this possible difference between the forms of personal names and other words entails that the same amount of partial phonological information will be more constraining when trying to retrieve an occupation than a person’s name (Brennen). For example, the first syllable /bei/ is compatible with only a few occupations (baker, bailiff) but many potential and plausible surnames (Bay, Bader, Baker, Bale, Baines, Bates, Beattie, Bateman, etc.). Such a difference could make a TOT state for a surname harder to resolve than one for an occupation or other class of words with more regular forms. 
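Brennen's point about partial phonological information can be illustrated with a toy lookup. The following Python sketch is only an illustration under stated assumptions: the two mini-lexicons are hypothetical, the phoneme strings are rough transcriptions supplied here (not taken from the chapter), and the surnames are grouped under /bei/ only because the text lists them that way. It counts how many entries in each word class remain compatible with the same known first syllable.

```python
# Hypothetical mini-lexicons. The phoneme strings are rough, illustrative
# transcriptions (an assumption), following the text's grouping under /bei/.
OCCUPATIONS = {
    "baker": "beɪkər",
    "bailiff": "beɪlɪf",
    "banker": "bæŋkər",
    "barber": "bɑrbər",
    "builder": "bɪldər",
    "butcher": "bʊtʃər",
}
SURNAMES = {
    "Bay": "beɪ",
    "Bader": "beɪdər",
    "Baker": "beɪkər",
    "Bale": "beɪl",
    "Baines": "beɪnz",
    "Bates": "beɪts",
    "Beattie": "beɪti",
    "Bateman": "beɪtmən",
    "Bell": "bɛl",
    "Brown": "braʊn",
}


def candidates(lexicon, partial):
    """Return the words whose transcription starts with the known sounds."""
    return sorted(w for w, phones in lexicon.items() if phones.startswith(partial))


# The same partial cue leaves few occupations but many surnames in play.
occupation_matches = candidates(OCCUPATIONS, "beɪ")
surname_matches = candidates(SURNAMES, "beɪ")
print(len(occupation_matches), occupation_matches)
print(len(surname_matches), surname_matches)
```

In this toy setup the cue narrows the occupations to two candidates but leaves eight surnames, which is the sense in which the same partial information is less constraining for names. Real lexicons would of course be far larger, and the proportions here carry no empirical weight.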
When phonological neighbors are defined as words that differ by just one sound, the phonological neighborhoods of multisyllabic words tend to be less dense than the neighborhoods for monosyllables (Harley & Bown, 1998). Even with other definitions of neighborhoods and attempting to control for the important variables that covary with length, longer words appear more vulnerable to speech errors (Goldrick, Folk, & Rapp, 2010). First and last names are often multisyllabic (e.g., Cassandra and Cooper, respectively). An implication of being longer, with sparser phonological neighborhoods, would be that personal names receive less support from similar words during retrieval and are therefore more prone to TOTs and speech errors, just as these factors affect common nouns (Goldrick et al.; Harley & Bown). Consistent with this, retrieval failures occur more often for the names of celebrities who are known by three-part names (e.g., Martin Luther King) than equally familiar celebrities who are known by two-part names (e.g., Clint Eastwood; Hanley & Chapman, 2008).
In sum, personal names may be longer and more phonotactically diverse than other types of words. These two variables are known to affect word retrieval for common nouns and may be part of what makes personal names particularly difficult to learn and retrieve.
2.2.3. Descriptiveness and Meaning
Experimental work on name learning shows that when properties of word forms are controlled by only using name-occupation homonyms, it is still more difficult to learn someone’s name than their occupation. That is, it is easier to recall that a new person in a photo is a potter by profession than that her last name is Potter (James, 2004; McWeeny et al., 1987). One account for this paradox is that learning someone’s occupation provides a wealth of information (e.g., a potter is likely to be artistic, good at working with her hands, and not squeamish about dirt) whereas learning a name provides less information. Indeed, first names typically provide little information beyond statistical clues about sex (e.g., Cassidy, Kelly, & Sharoni, 1999), age cohort (e.g., Todd & Robert, 2009), and ethnicity (e.g., Rymes, 1996). Surnames provide even less information.
Cohen (1990b) tested the extent to which surnames are treated as meaningless by comparing their recall to that of real and novel possessions in a learning study. So, for a particular photographed face, participants would hear, This man is called Mr. Hobbs. He is a pilot. He has a blick/dog (with the order of properties counterbalanced). So, the novel possessions had no meaning connected to them to aid in recall. As usual, participants recalled occupations more accurately than surnames. Names and nonword possessions were recalled equally poorly, whereas real-word possessions were recalled as accurately as occupations were. Cohen’s result is consistent with the idea that people treat surnames as meaningless. Also consistent with this, participants sometimes report spontaneously using mental imagery to learn occupations but do not seem to do so spontaneously for names (Bruce et al., 1994; McWeeny et al., 1987).
In another experiment, Cohen found that participants were better able to learn homonymic surnames such as Baker when they were paired with meaningless occupation names like ryman than when paired with real occupations like lawyer.⁷ Cohen argued that the mismatch in information contained in potentially meaningful surnames and real occupations (or other properties) discourages people from treating surnames as meaningful. In other words, it is difficult to imagine that Potter is a potter while remembering that she is a psychologist.
That said, studies of name learning indicate that instructions to try “to give meaning” to names increase recall (Brooks, Friedman, Gibson, & Yesavage, 1993; Milders, Deelman, & Berg, 1998; Morris, Fritz, Jackson, Nichol, & Roberts, 2005).⁸ This suggests that meaningful associations for a proper noun facilitate its retrieval (Warrington & Clegg, 1993) even though such associations are likely to be nonsensical and possibly inconsistent with the person’s other characteristics.
⁷ Ideally, one would like to see control conditions with matched familiar but meaningless surnames like Hobbes.
⁸ Unfortunately, this strategy is difficult to apply outside of the lab (Morris et al., 2005).
Of course, surnames such as Potter and Baker were originally bestowed on people who had such occupations. The greater ease of recall for real occupations than surnames suggests that such names were easier to retrieve when they were descriptive (i.e., when Potter was a potter) than they are now. Likewise, a case study of an anomic aphasic is consistent with the idea that descriptive names are easier to recall than less descriptive ones. The patient was described as having preserved information about familiar people, good comprehension, and fluent grammatical speech, but impaired ability to retrieve personal names despite good recognition ability for faces (Flude, Ellis, & Kay, 1989). Notably, among the few names or parts of names he was able to produce were “The Queen Mother,” “Princess” for Princess Anne, and “Prince of” for the Prince of Wales. That is, when names included titles or roles, they were more readily retrieved.
Other studies further indicate that descriptive names are retrieved more readily than nondescriptive ones. For example, people find it easier to name pictures of well-known cartoon characters that have descriptive names such as the Pink Panther and Spider-man than matched characters with nondescriptive names like Homer Simpson and Garfield (Brédart & Valentine, 1998; Fogler & James, 2007). Along the same lines, the only two famous individuals (out of 22) that an Italian proper-name anomic (Hittmair-Delazer et al., 1994) was able to name from descriptions had descriptive names: Superman and Batman. At the same time, he was able to provide additional specific information about 20 of the individuals and match a name to each face perfectly from a choice of three names.
So, personal names that are descriptive of their bearers are more easily retrieved than nondescriptive names. Unfortunately, many names are not descriptive. However, this does not mean that speakers do not make use of information about a person to retrieve the person’s name.
2.2.4. Features and Representational Structure
In general, when a speaker accidentally substitutes one word for another, the substituting word is highly related in meaning to the intended word. A representative substitution is “It’s at the bottom – I mean – the top of the stack of books” (Fromkin, 1971). Such observations are used to argue that the first stage of word production involves selecting word representations based on correspondence to semantic features (for review, see Griffin & Ferreira, 2006). Although personal names provide little information about their bearers, there is evidence that properties of their owners are used to retrieve them. Substitution errors typically result in the name of a person with shared characteristics. For example, errors in celebrity face naming by both college students and anomics tend to be names of people who share nationality and profession, such as calling President Kennedy “Reagan” or “Eisenhower” or calling Elizabeth I “Mary of Scotland” (Brédart & Valentine, 1992; Cipolotti, McNeil, & Warrington, 1993; Lucchelli, Muggia, & Spinnler, 1997).⁹ In a name-learning experiment where occupations and other information were also presented, learners’ naming errors showed a tendency to confuse the names of individuals who had the same occupation (Fraas et al., 2002). Physical similarity also may play a role even when there is no reason to suspect that the substitution is due to mistaken identity. For example, survey respondents’ parents were more likely to mistakenly call them by the name of their siblings when they were of the same sex, close in age, or similar in appearance (Griffin & Wangerman, 2008). When parents accidentally addressed their children by names other than those of siblings, the source of the substitution was nearly always another family member such as the speaker’s own sibling, spouse, or a family pet. In sum, the substitution errors for personal names suggest that the characteristics and roles of a person are used to retrieve the person’s name.
However, the representations of people and objects are likely to differ in several ways that may affect the ease of word retrieval. Retrieving a name for something requires distinguishing it from other similar things and the features that are unique to it are important for achieving this. Connectionist networks trained to map between concepts and word forms develop stronger connections between word forms and distinctive features than between word forms and shared features (e.g., Cree, McNorgan, & McRae, 2006). This translates into faster retrieval for items with more distinctive features. In the context of retrieving names based on photographs or line drawings, visual distinctiveness is important.
For example, rushing people to label objects disproportionately increased substitution errors for objects belonging to visually similar categories such as animals, fruits, and vegetables, rather than artifacts (Lloyd-Jones & Nettlemill, 2007). Substitutions tended to be visually and semantically related to the target, as in asparagus for celery. Indicating that retrieval of personal names is likewise sensitive to visual similarity, faces that are rated as more distinctive are named faster than less distinctive ones (e.g., Jack Nicholson vs. Mel Gibson; Valentine & Moore, 1995). In labeling animals, the speaker just needs to distinguish between categories such as lions and tigers. In naming faces, speakers must make a more fine-grained distinction, within the category of human. As a consequence, faces should be slower to activate appropriate name representations, be more prone to errors in name selection, and be more susceptible to impairment, just as the literature on proper naming deficits in brain-damaged patients suggests (Yasuda et al., 2000).
⁹ However, within the category of musicians, US versus UK nationality is not strong enough to produce release from proactive interference unless the categories are explicitly mentioned (Darling & Valentine, 2005).
If people tend to share many nonvisual features, this could also contribute to difficulty retrieving their names. When people are asked to list the features of various object concepts, the distributions of features provided for animals and artifacts differ systematically. Animals tend to have fewer distinctive or informative features as well as more highly correlated features (i.e., features that co-occur across multiple concepts) than artifacts do (e.g., Cree & McRae, 2003; Cree et al., 2006; Tyler, Moss, Durrant-Peatfield, & Levy, 2000). These differences between feature distributions have been hypothesized to account for category-specific deficits in general semantic knowledge and word retrieval (e.g., greater impairment for processing animals than artifacts) with brain damage (McRae, de Sa, & Seidenberg, 1997; Tyler et al.). So, an important question is how the distribution of features for individual people compares to other categories.
A learning study by Cohen (1990a) further illustrates the importance of shared versus distinctive features in name learning. Participants studied a list of 18 statements that each linked an attribute to a common male first name with the goal of being able to supply the correct name when given the associated attributes. One condition paired each name with a unique attribute (no-fan), while in another condition, each name was paired with three unique attributes (fan-in). In the crossed-fan condition, each name was paired with two attributes that were each linked to one other name (see Figure 1). Once participants could go through the entire list of attributes and generate the correct names for all of them, they performed a speeded attribute verification test (e.g., George likes cats: true or false?)
and then an untimed cued recall test (e.g., __________ likes beer and is a doctor). In both tests, performance was equally fast and highly accurate for the two conditions in which names were paired with unique attributes (fan-in and no-fan), despite the difference in the number of attributes associated with each name. Participants were slowest and least accurate in verifying and retrieving names in the crossed-fan condition with its intermediate number of attributes per name. Thus, sharing attributes with others impaired name retrieval even though the combination of attributes linked to each individual was unique. Another learning study found that uncommon occupations such as jockey were better recalled than common ones like secretary when participants were cued by a photo (Stanhope & Cohen, 1993).10 In real life, people seem likely to share
10. List composition modulates this effect. Uncommon stimuli must be in the minority for the effect to appear (Stanhope & Cohen, 1993).
[Figure 1: diagram linking example names (David, Tim, Alan, John, Peter) to attributes (likes beer, is a doctor, likes films, has a bike, is a waiter, has a car, is thin) in the crossed-fan, no-fan, and fan-in conditions.]
Figure 1 Diagram of conditions in Cohen’s (1990b) fan effect experiment. Recall was poorest in the crossed-fan condition.
many characteristics and have few, if any, unique attributes, and this may hinder retrieval of personal names. Certainly work in social psychology shows that people who share important features are perceived as similar and may be confused (e.g., Andersen & Berk, 1998; Fitzsimons & Shah, 2009). However, further work is required to establish the extent to which differences in distinctive features may account for differences between retrieving personal and object names. One potential complication in comparing distinctive properties across categories is that features that appear relatively distinctive for a person are likely to involve their roles in a relational structure. For example, Ragni Lantz has the distinction of being the one and only person who is my mother. However, the property of being a mother is not distinctive at all. Insofar as representing the unique property of being my mother relies on the nonunique relation <mother-of-X>, it might not function as a distinctive feature or at least not as effectively as, say, a feature like <moos> does for identifying cows. Recall that in the name substitution errors summarized above, the characteristics shared by the target person and owner of the substituting name were often extrinsic to the person (i.e., shared social roles such as governing the same country, having the same occupation, or belonging to the same family) rather than intrinsic features such as hair color or personality traits. The mental representations for relational properties differ from those of feature-based ones in many ways (Markman & Stilwell, 2001) that are likely to impact word retrieval. For example, relational concepts provide second order partitions of the world that are more complex and acquired later than categories based on intrinsic feature distinctions (Gentner & Kurtz, 2005). Furthermore, Gentner and Kurtz point out that labeling isolated objects emphasizes intrinsic features rather than relational ones. 
For example, a hammer is primarily for pounding nails but when it is presented by itself, its physical appearance provides the primary cues for identification and word retrieval. By extension to face
naming, if relational information is more critical in person representations and the retrieval of personal names, one should expect relatively poorer retrieval of personal names than of object labels even if other variables could be held equal. Note that this argument does not bear on whether mental representations for people are richer than those for objects, but rather posits that the information that distinguishes among people is likely to be represented in a more complex fashion than that which distinguishes objects.

A priori, the results of priming and interference studies should be helpful for investigating the mental representations for people. Many studies have investigated priming between celebrities in face naming and found weak categorical priming for people but strong associative priming (e.g., Brennen & Bruce, 1991; Carson & Burton, 2001; Darling & Valentine, 2005; Young, Ellis, Flude, McWeeny, & Hay, 1986; Young, Flude, Hellawell, & Ellis, 1994), which differs from what is typically seen for labeling objects (e.g., Lupker, 1979). However, interpreting the results of these celebrity-naming experiments is difficult. One major limitation is that categorical relatedness for people is typically defined as sharing an occupation or nationality in such studies. So, actors such as Tom Hanks and Demi Moore should affect the naming of John Wayne more than they would the naming of a politician (Carson & Burton, 2001). Although occupations or nationalities are clear categories, they may not be terribly important for how people conceptualize celebrities (e.g., witness the porous boundary between performer and politician, and the many successful Canadians on American TV). Association between celebrities is typically due to marriage or other co-occurrence, which intuitively results in stronger associations than those found between objects.
In addition, as Carson and Burton (2001) pointed out, association often was confounded with categorical relatedness in celebrity-naming studies. So, for example, Prince Charles was associated with Princess Di and Stan Laurel with Oliver Hardy (pairs that also share a category), whereas for objects, a mouse was associated with cheese. Without some way of quantifying categorical or associative relatedness, there is no sound basis for comparing the degree of priming between people with the priming between objects.

Another source of information about the semantic representations of words comes from their distribution in texts. For example, the words lemon and lime tend to appear in similar sentence contexts with words like pie, tree, squeeze, and wedge, while the word laptop does not. Large-scale analyses of word co-occurrence patterns yield vector representations that reflect these distribution patterns (Landauer & Dumais, 1997; Lund & Burgess, 1996). The more similar the vectors for two words, the more similar their meanings. For example, such measures of similarity have predicted the magnitude of priming from one word to another in a word recognition task (Lund & Burgess) and distinguished synonyms from foils on a vocabulary test (Landauer & Dumais). Such representations can be calculated for proper
names that appear in texts as well as for common nouns (Burgess & Conley, 1999). The most similar words for 20 personal names (e.g., Thomas) were compared to the most similar words for 20 common nouns of the same frequency as the names (e.g., dollar). The difference between the representations of personal names and their neighbors was smaller than the difference between common nouns and their neighbors. In other words, fewer features distinguished the use of personal names than that of common nouns. This result supports the idea that the representations of people that are used for name retrieval are less distinctive than those of other nouns.

In summary, another source of the greater susceptibility of personal names to retrieval deficits may lie in how people are mentally represented. One likely source of difficulty is a predominance of shared features and dearth of distinctive ones. In addition, many distinguishing features may be extrinsic to the person, involving their relationships to other concepts or people, which may make their semantic representations more elaborate in a way that hampers retrieval.

2.2.5. Multiple Names

Researchers have suggested that one of the reasons personal names are so difficult to retrieve is that there are so few alternative ways to refer to individuals (e.g., Brédart, 1993; Cohen & Faulkner, 1986). That is, people often have a given first name, a middle name that only their parents and the government know, and a surname. Unlike objects, they do not have synonyms (e.g., sofa for couch), superordinate terms (furniture), or subordinate category names (sectional) that may substitute. Even insofar as a speaker knows different parts of a person’s name, conventions seem likely to preclude the use of an alternative such as suddenly addressing an acquaintance by their last name instead of their first name. Of course, some people are known by truncated11 versions of their given names (e.g., Vic for Victor).
However, given the phonological overlap between long and truncated name forms, if one version can be retrieved, the other might be easily retrieved as well. In contrast, when someone is known by a nickname or pseudonym, speakers may not know the given name at all and, even if they do, it may not function well as a reference or form of address (e.g., Aceto, 2002; Dorian, 1970). That said, a study of male college students found that the more intimate a student was with someone, the more names he used for him (Brown & Ford, 1964). For example, a close friend of James Scoggin would call him “Scoggin,” “Jim,” “James,” and “Scoggs.”

Although intuitively it might seem that more choices are to be preferred over few, the reverse is often the case for ease of production. For example, the time needed to retrieve an object label increases with its number of
11. I reserve the term nickname for names that are not standard truncations of given names; for example, calling someone named Robert Red counts as a nickname, but calling him Bob does not.
context-appropriate labels (e.g., Bates et al., 2003). For example, people take longer to label an object that can be called TV or television than to label a matched object with a single, dominant label like tooth. This influence of codability or name agreement is greater than that of other variables such as word frequency, age of acquisition, phonological neighborhood size, and so on (Bates et al.; Bonin, Chalard, Méot, & Fayol, 2002; Snodgrass & Yuditsky, 1996). Furthermore, relative to monolinguals, bilinguals are more prone to TOT states for common nouns for which they may know or at least recognize labels across two languages. However, in both elicited and naturally occurring speech, monolinguals and bilinguals are equally prone to TOT states for proper names, for which they probably have one representation (Gollan et al., 2005). Moreover, common words do not even need to be relatively synonymous to compete for selection. For example, sharing a superordinate category allows giraffe and zebra to compete (for review, see Griffin & Ferreira, 2006; Vitkovitch, Humphreys, & Lloyd-Jones, 1993).

There is surprisingly little evidence on whether multiple names have a similar effect in retrieving personal names, and what evidence exists is equivocal. To study the effect of having multiple names, some studies have made use of famous actors who play famous characters. A series of celebrity face-naming studies (Brédart, 1993) compared name retrieval for actors who were strongly associated with a particular character (e.g., Harrison Ford playing Indiana Jones) with equally familiar actors who were not associated with any particular character name (e.g., Woody Allen). A norming study indicated that the names of the actors and the characters were similar in familiarity. If having two names associated with a face were similar to having two labels associated with an object, one would expect slower and less successful naming for Harrison Ford.
However, the proportions of correct responses and retrieval failures were similar for retrieving the actors’ names from the two groups.12 It may have been the case that the instruction to name the actor rather than the character successfully eliminated interference from the character’s name. Counter to that hypothesis, when participants were free to produce either actors’ or characters’ names as responses, they were still significantly more successful and suffered fewer TOT states for celebrities with a famous character than for those without. Although character names are not synonymous with actors’ names, this last result suggests that having more names to choose among is facilitatory for person naming although it is not for object naming. Moreover, the same benefit of multiple known names occurred in a comparison between the same set of famous actors with famous characters (Harrison Ford/Indiana Jones) and photographs of actors whose names were not known but who played well-known characters (Richard Anderson as MacGyver). Insofar as
12. Another study failed to replicate this result and found a cost for having multiple names. However, the familiarity of the actors’ names was not controlled across conditions although other important factors were (Stevenage & Lewis, 2005).
characters’ names can be considered as close to actors’ names as multiple labels for an object are to one another, these studies suggest a discrepancy between the effect of multiple names on object labeling and face naming. However, another series of studies manipulated the availability of alternative names for actors and objects and found evidence of competition among multiple names for both types of words. On target trials in an initial experiment, participants named a photograph of an actor (e.g., John Cleese) after providing one of three responses to the same photograph of the actor on an earlier trial (Valentine, Hollis, & Moore, 1999). Having previously produced both the actor’s name and the name of a famous character that the actor played (e.g., John Cleese and Basil Fawlty) dramatically slowed latencies to later produce the actor’s name alone, relative to conditions in which participants previously either named the actor alone or in addition to the name of the TV program (e.g., John Cleese and Fawlty Towers). The conclusion was that the associated character’s name competed with the actor’s name for selection. Because the names of TV programs do not fall into the category of personal names, they did not compete with the actors’ names despite also being produced in response to the photograph. As the design of that experiment differed from existing studies of object labeling, somewhat analogous experiments were carried out with objects that differed in their labels in British and American English (Valentine & Darling, 2006). For example, the same object is labeled a lorry in British English and a truck in American English. For practice using the names, objects were repeatedly presented with labels written below. 
Responses were significantly slower and less accurate when participants had been trained to use two different labels during practice rather than one, regardless of whether they were asked to only use British labels at test or were free to use either British or American ones. So, common nouns from different dialects showed a multiple name competition effect just as celebrity names did using a similar paradigm. A clear difference between the studies that showed no cost of having multiple names for a face as opposed to those that did is whether participants were asked to switch responses used for repeated stimuli. Participants in a TOT study (Cross & Burke, 2004) generated names from celebrity photographs and based on descriptions with word stems (e.g., The flower girl from the musical ‘My Fair Lady’ whom Prof. Higgins transforms into a fashionable lady presentable to society, Eli____ Do____). Unlike those in the John Cleese experiments, these participants were more likely to successfully name a celebrity’s photograph (e.g., Audrey Hepburn) after previously generating an associated character name from a fill-in-the-blank question (Eliza Doolittle) than after an unrelated name. Altogether, this suggests that repeating stimuli and asking for different responses may be necessary to get competition between multiple personal names. In contrast, for objects, researchers consistently find a cost for having multiple potential labels regardless of whether the objects required multiple responses within the study or not.
It is not obvious which experimental situation is more likely to generalize to the everyday use of personal names.

2.2.6. Name Frequency and Age of Acquisition

For objects, the impact of greater word usage is relatively simple. People are faster and more accurate at labeling objects that have more frequently used labels (Oldfield & Wingfield, 1964). Frequency of use is typically estimated from how often a word appears in a corpus of text or speech. Frequently used words are typically learned at earlier ages, so much work has tried to determine whether effects that were attributed to frequency of use were actually due to differences in age of acquisition (Carroll & White, 1973). Researchers normally use ratings to estimate how early words are learned and these ratings tend to correlate highly with the ages at which children can accurately label objects (Morrison, Chappell, & Ellis, 1997). Although frequency and age of acquisition are highly correlated, evidence suggests that both variables affect retrieval (see, e.g., Brysbaert & Ghyselinck, 2006).

People are faster to produce earlier acquired personal names and have fewer TOT states for them (Bonin, Perret, Méot, Ferrand, & Mermillod, 2008; Moore & Valentine, 1998). However, establishing the role of usage frequency for personal names is complicated. Early diary studies found that people reported TOT states most often for names of friends and acquaintances (Cohen & Faulkner, 1986). This suggested a paradox in which the commonly used names are the most prone to problems, which would be the opposite of the frequency effect found for object labels. The researchers suggested that the apparent paradox could simply be due to the high frequency of retrieval attempts for the names of friends and acquaintances. That is, one rarely attempts to retrieve names of relatively unfamiliar people and so there are fewer occasions for failure.
The results of an experiment that controlled for the number of retrieval attempts supported this explanation (Brédart, 1996). Less familiar celebrities elicited more TOT states than more familiar ones did. Furthermore, a speaker may simply forget infrequently used names entirely, in which case they will not result in TOT states. Unfortunately, the initial observation has been referred to as a reverse frequency or reverse familiarity effect on some occasions (Brédart; Cohen, 1990a), creating some confusion about how frequency affects the retrieval of personal names.

Another difficulty is that only the frequency of surnames has been used as a measure. When dependent measures are based on the production of first and last names, one might not expect to see an effect of surname frequency (Bonin et al., 2008). On the other hand, it is a bit odd to produce bare surnames for celebrities who are known by their first and last names, so one might not expect robust effects in surname production.

Then there is the issue of how to estimate name frequency. Researchers tend to use how often a surname appears in a telephone book or census (James, 2004; James & Fogler, 2007; Moore & Valentine, 1998; Valentine &
Moore, 1995). This measure is probably correlated with frequency of exposure because the more Johnsons there are, the more likely one is to hear the name Johnson. However, relatively few people are referred to by the name Madonna, yet the name is used relatively frequently (albeit periodically). Frequency measures for object labels reflect how often the words appear in print or speech rather than how many objects bear the label. So, the variable typically referred to as name frequency in the context of personal names is actually more a reflection of name ambiguity (e.g., Cohen, 1990a). Researchers have found that participants are faster and more accurate at naming celebrities that they rate as more familiar, that is, ones that they have encountered more frequently than others (Moore & Valentine). However, rated familiarity appears to conflate exposure to the person with exposure to the name, so it is a very rough measure.

In summary, age of acquisition and familiarity of personal names affect how quickly and accurately they may be retrieved just as they do for object labels, but researchers have not tested a measure of frequency of use for names that resembles the measure used for object labels. Many personal names may be more difficult to retrieve than common nouns if their forms are acquired by speakers later in life (Brennen, 1993). However, it has not been established whether personal names typically differ from object names and other words in age of acquisition, familiarity, or frequency, so they may or may not be at a disadvantage in this regard.

2.2.7. Name Ambiguity

A prominent difference between personal names and common nouns is that a personal name may be shared by many completely unrelated individuals. For example, I know more than one person named Dan. For common nouns, the most analogous case is that of homonyms such as bank meaning financial institution or the shore of a river.
Just as meeting one Dan will do nothing to help you recognize another one, learning one meaning of bank will not help recognition of the other. Having multiple unrelated meanings affects the speed and accuracy of producing a word. That is, the processing of homonyms (bank) and homophones (week/weak) differs from the processing of words with only one meaning or related senses (e.g., balcony). The speed and accuracy of producing a homophone is affected by how often the unintended meaning is typically used (Dell, 1990; Jescheniak & Levelt, 1994; Jescheniak, Meyer, & Levelt, 2003).13 Priming studies further indicate that having a shared form has processing consequences. For example, the word dance is related to one meaning of ball but not the bouncy, round meaning.14
13. The degree to which this holds is controversial (Caramazza, Costa, Miozzo, & Bi, 2001).
14. Lemma or abstract word representations connect meanings, syntactic information, and phonological forms. Different homonym meanings are assumed to have different abstract word representations that likely feed into shared phonological forms (Dell, 1990).
When participants named a picture of a round ball, hearing dance as a distractor sped up their responses relative to hearing an unrelated distractor word (Cutting & Ferreira, 1999). Presumably, dance activates the dance meaning of ball, which spreads activation to a shared phonological form /ball/, allowing it to be produced more rapidly. Homophone priming also occurs when personal names and common nouns share a form. Producing the word pit as in cherry pit in response to a definition increased the probability of correctly retrieving Brad Pitt’s last name and reduced TOT states (Burke, Locantore, Austin, & Chae, 2004). It is not entirely clear whether these effects are due to multiple meaning representations becoming active when one is intended or just having multiple inputs converging on a single phonological form, but either way, there are implications for shared personal names.

Retrieval of a person’s name should be affected by having the same form as another person’s name. For example, the ease of producing the name Dan should be sensitive to the use of the name for all Dans even if they do not share higher-level representations (i.e., person or word representations). To the extent that person representations that share a name are processed like the unrelated meanings of homophones, priming the characteristics of one Dan could make the name of another Dan faster and more likely to be retrieved. However, two humans have much more in common than two homophone concepts, so it may be premature to make predictions that assume that they are equally unrelated.

Words that are similar in meaning like lion and tiger compete, slowing retrieval. Their level of competition is modulated by their degree of similarity (Vigliocco, Vinson, Damian, & Levelt, 2002). Substitution errors in the retrieval of personal names suggest that the names of similar people may also compete for selection (e.g., Brédart, 1993; Griffin & Wangerman, 2008).
If so, perhaps paradoxically, the representations for two different individuals sharing the same name may interfere with one another, slowing retrieval and making it less likely to succeed (see Figure 2). Indeed, noun phrase representations for Kate Bush and George Bush are predicted to inhibit one another in Node Structure Theory (Valentine & Moore, 1995).

To recap, retrieving a name that is shared by many individuals should be different from retrieving a unique name. Currently, it is not clear what the effect of name ambiguity is, independent of other variables that tend to covary with it, such as frequency. So, name ambiguity may or may not be a disadvantage for personal names relative to other words.

2.2.8. Vocabulary Size and Age

The longer a person lives, the bigger their vocabulary tends to get, particularly their knowledge of uncommon words (Alwin & McCammon, 2001; Verhaeghan, 2003). This vocabulary difference at least partially explains age-related increases in TOT states for common
[Figure 2: diagram with four levels of representation (feature, person, word, form). Feature labels: likes travel, has a Ph.D., rows, cycles, is vegan, is thin. Person labels: person1 through person5. Name labels: Dan, ?, Fahad, David, Andrew, Drew.]
Figure 2 Simplified diagram of representations for five men. If person representations compete with one another for selection, representations for individuals with the same name (person 1 and 2) may interfere with each other more than representations for equally similar individuals with different names (e.g., person 3 and 4). Note that there could be different word representations for people with the same name (i.e., two Dan word representations) and competition could occur at this level alone or in addition to the person level.
nouns (Bock, 1977; Dahlgren, 1998; Gollan & Brown, 2006). Just as people learn more uncommon words with age, they also learn more personal names. Higher numbers of known personal names may result in proactive interference in learning new names or greater interference between known names during retrieval (Brooks et al., 1993). Another contributor to poorer learning and retrieval of proper names among older adults may be their use of less effective mnemonics for learning names or decreased abilities to carry out mnemonics in real-world interactions (Brooks et al.). So, changes in knowledge and processing that are associated with increasing age tend to make personal names harder to retrieve.

2.2.9. Summary

People are highly similar, both visually and in what can be predicated of them. Their distinguishing features often involve complex relationships (e.g., Barack Obama is not the first president in the world, nor is he the first US president or first multiracial/Black president, but he is the first official multiracial/Black US president). In other words, the perceptual and semantic spaces for people may be very dense with only complex
constellations of cues to distinguish them. So, the retrieval cues for people’s names may overlap with one another more than the cues for other categories of words do, making word selection more difficult. Even the names of countries and cities seem more meaningful than most personal names as suggested by their seemingly greater use as informative modifiers (e.g., a New York minute or an Austin music festival) and their higher recall in learning studies (Cohen & Faulkner, 1986). Furthermore, many personal names may be at a disadvantage due to later age of acquisition, lower frequency of use, greater length, greater phonological diversity, and higher ambiguity relative to common nouns (Brennen, 1993; Cohen, 1990a). However, studies have not directly compared word types on these dimensions. In addition, having multiple names for the same referent may or may not be more common for people than for objects and the effect of multiple names for people on retrieval is currently unclear. So, there are many possible reasons why personal names should be more difficult to retrieve than other words or information. Surprisingly, the fact that personal names pick out individuals rather than labeling categories does not seem to be a strong contributing factor to their difficulty.
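The idea that a semantic space can be dense, with neighbors that are hard to tell apart, can be illustrated with the kind of co-occurrence vectors discussed earlier in this section (Landauer & Dumais, 1997; Lund & Burgess, 1996). The following is only a minimal sketch under stated assumptions: the toy corpus is invented, the function names (cooccurrence_vectors, cosine) and the two-word window are hypothetical choices made for illustration, and simple window counts stand in for the much larger analyses in the cited work, which this sketch does not reproduce.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Count, for each word, the words found within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Context = neighbors on both sides, excluding the word itself.
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy corpus (an assumption for illustration, not data from any study).
TOY_CORPUS = [
    "squeeze the lemon over the pie",
    "squeeze the lime over the pie",
    "a lemon tree and a lime tree",
    "charge the laptop on the desk",
]

vectors = cooccurrence_vectors(TOY_CORPUS)
sim_lemon_lime = cosine(vectors["lemon"], vectors["lime"])
sim_lemon_laptop = cosine(vectors["lemon"], vectors["laptop"])

# Words used in similar contexts end up with more similar vectors.
print(sim_lemon_lime > sim_lemon_laptop)
```

With real corpora, the same similarity measure could be used to ask how close a name's nearest neighbors are compared with those of a frequency-matched common noun, which is the kind of comparison reported by Burgess and Conley (1999).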
3. Personal Names and Reference Across Cultures

The naming system in some societies seems to have evolved primarily in order to differentiate individuals, in which case other means must be found to categorize them, to place them in the social matrix. In other societies, the naming system seems to have evolved primarily to categorize individuals, so that additional means must be found to differentiate them. (Alford, 1988, p. 69)
Speakers seem to take into account their audience’s knowledge when determining how to refer to things (e.g., Olson, 1970). The emphasis in psycholinguistic studies of reference has been on how speakers successfully differentiate between potential referents. First mentions of a person are typically in the form of a description (A guy) or some form of name (Steve), while subsequent references are more likely to use third-person pronouns such as he (see Smith, Noda, Andrews, & Jucker, 2005; Stivers, Enfield, & Levinson, 2007). So, the words we use to refer to a person vary considerably even within a single conversation. Although it is implicit in contrasting forms, less attention has been paid to the category information conveyed by different forms beyond their specificity. However, a referring expression for a person conveys a great deal of information about the referent, as well as the speaker and the relationship between the two (e.g., Befu & Norbeck, 1958).
Generalizations about personal names up until this point have primarily reflected the types of names one finds in British and American mainstream culture. These properties do not generalize across the world or even across communities within the United Kingdom and the United States. Sociolinguists and anthropologists have documented various naming practices and identified many social and pragmatic factors that influence choice of referring expression (for review, see Alford, 1988; Stivers et al., 2007). In this section, I briefly discuss naming practices and speculate about their potential consequences for production based on the properties of the names. To preview the results, possession of multiple names and use of descriptive names or nicknames is much more common than one would expect based on the assumptions made in the existing psychological literature on name retrieval.
3.1. What Are Names Like Cross-Culturally?

Alford (1988) studied naming practices across a probability sample of 60 nonindustrialized societies. He found that people had a first or given name in every society sampled, but in small communities, individuals often had only one component to their names. At the other extreme, in 5% of societies, names had four or more components.

3.1.1. Family and Clan Names

Consistent with the use of names to categorize their bearers, many names indicated family or clan membership. Family surnames appeared in 33% of Alford’s (1988) sample and names that conveyed clan or lineage in 15%. Among many Native American and Australian tribes, people receive names that are associated with their clan’s totem (Lévi-Strauss, 1966). For example, members of an Osage Black Bear clan were dubbed, “Flashing-eyes (of the black bear), Tracks-on-the-prairies, Ground-cleared-of-grass, Black-bear-woman, Fat-on-the-skin of the black bear” (p. 173). In China, family-related information is marked on multiple names of an individual (Yau, 1996). In addition to a shared family name, the first character of a given name is traditionally taken from a family poem and shared by all male members of a generation. In Los Angeles, a complete gang name includes both a nickname and gang affiliation, where the nickname may be descriptive or even refer to a senior gang member who helped violently initiate the person into the gang (Rymes, 1996). As Lévi-Strauss argues, when names identify individuals as members of a class, the type/token distinction between proper names and common nouns is blurred. In using surnames to convey family membership, names in the United States and United Kingdom are similar to those in many other cultures.
3.1.2. Descriptive and Meaningful Names
In two-thirds of Alford’s (1988) sampled societies, children typically received meaningful names. The most popular source for a meaningful name is the physical or behavioral characteristics of a child, making the name descriptive as well. In some communities, bestowing a name is customarily delayed for years after a child is born in order to allow characteristics to emerge. Animal names are also common, as it is hoped that children will take on their desirable characteristics. Children also receive names based on circumstances or events at the time of their birth. For example, among the Nuer of Sudan (Evans-Pritchard, 1948/1964), a child born during a drought was named Reath [drought]. Another Nuer child was named Met [to deceive], because the child’s father bent the truth while courting the child’s mother. Derogatory protective names were used in 21% of societies to ward off bad luck (Alford). Names that are tailored to individuals in this way may be more likely to be unique. All else being equal, descriptive names and meaningful names should be easier to learn and less prone to TOT states than nondescriptive and meaningless names. Just as the pinkness of the Pink Panther facilitated recall of his name (Brédart & Valentine, 1998; Fogler & James, 2007), the mental representation of a person should provide good retrieval cues for a descriptive personal name. Having a story behind an episode-based name should make it memorable by creating a rich mental representation with many retrieval cues. At the same time though, meaningful names may be prone to semantically related word substitutions in a way that nonmeaningful names are not. For example, someone named Blizzard may be prone to being called Snowfall by mistake. Alternatively, the meanings of meaningful names may cease to be processed after the person and name become familiar (Brennen, 2000).
On the beneficial side, the word or words that comprise meaningful names are part of speakers’ normal vocabulary, so the names have forms that are similar in frequency and age of acquisition to those of common nouns. Furthermore, unique names avoid potential problems associated with name ambiguity. Thus, given names in many societies are likely to be more memorable and accessible than the ones typically studied by psychologists, but may be relatively more prone to semantic errors.

3.1.3. Nonunique and Multiple Names
In societies with small sets of first and last names to draw on, many people end up sharing their entire names. Alford (1988) found that some form of alternative name or nickname was commonly used in 75% of societies, and particularly in those where personal names did not uniquely specify individuals. For example, in a few small communities in Ireland, people so commonly share legal first and last names like Catherine Mullen and Pádraig Ó Conghaile that their names are useless for reference (Lele, 2009).
Instead, referring expressions (bynames) include the Gaelic version of a person’s first name followed by either an ancestor’s name (e.g., Pádraig Mháire Mhóir [Pádraig descendant of Big Mary]) or a prominent property of the referent (e.g., Jockan Ruá [Red-headed Jockan]). A similar situation and solution arise in some villages in the Scottish Highlands, where, for example, there were once 13 William Mackays in one school and three surnames shared by the majority of the inhabitants (Dorian, 1970). Likewise, legal first and last names fail to uniquely identify people in some Caribbean communities, so nicknames and license plate numbers are used for reference instead (Manning, 1974). Although some of the Caribbean nicknames may be nonsense words or arbitrarily connected to their bearers, they are often based on the individual’s personality, physical characteristics, or experiences. The Kamsá of Southwestern Colombia also have legal names that are shared by many members of the community (McDowell, 1981). So, they instead refer to one another within the community using ugly names that pick out distinctive characteristics (e.g., height or weight) or behaviors (e.g., a man who called all vegetables “yuca” was referred to by the word for the yuca plant). Descriptive and episode-based nicknames should be functionally the same as descriptive and episode-based names, and should be easily learned and retrieved. In communities where ancestors are well known, adding an ancestor to a name may make it more memorable by adding retrieval cues. On the other hand, a name composed of two nondescriptive personal names may be more difficult to retrieve than a single name because either name may suffer retrieval failure. Recall that celebrities with three-part names resulted in more TOT states than those with two-part names (Hanley & Chapman, 2008).
Moreover, Gaelic bynames place the ambiguous name first in the sequence, followed by the name that differentiates between potential referents. As a result, the initial name may cue the wrong second name, delaying retrieval or resulting in speech errors (Sevald & Dell, 1994). On the positive side, when legal names are so ambiguous that other names are nearly always used for reference instead, many people may only know a person’s nickname (see Aceto, 2002 for further examples). If so, there will be no processing consequences associated with having multiple names for a person. Other naming practices also result in an individual having multiple names. For example, among the Nuer, children often receive one personal name from their father’s side of the family and then another from their mother’s side (Evans-Pritchard, 1948/1964). An effort is made to make the names semantically related. For example, a child named Mun [earth] by one side was named Tiop [earth mixed with manure and ashes] by the other side of the family. A particular speaker is only likely to produce whichever name is appropriate for their side of the family or the village that they are in. However, the semantic relatedness and knowledge that both names apply to the same referent may cause the names to interfere with one another in production.
Belonging to multiple cultures and using more than one language can also result in multiple names for individuals. For example, a given or legal name in a spoken language may not be easily expressed in a sign language of the deaf. As a result, signers dub people with name signs. Within the first weeks of starting school in Greece, the United States, or China, deaf children from hearing families typically receive a name sign from an older schoolmate or an assertive peer, and then often carry the name sign for life (Kourbetis & Hoffmeister, 2002; Supalla, 1992; Yau, 1996). Name signs based on physical characteristics are very common across signed languages. For example, a study of 200 users of Greek sign language (Kourbetis & Hoffmeister) found that 92% of them had descriptive names, of which 55% expressed physical characteristics like having a scar or curly hair and 21% expressed personality traits. Likewise, a survey of users of the sign language of the Netherlands reported that 75% had descriptive name signs, such as the sign for smile for someone who smiled frequently (Schuit, 2009). Name signs may also be nondescriptive, based on a nondescriptive given name. For example, Supalla’s name sign in American sign language was an S handshape for Samuel that moved from one side of the chin to the other to distinguish it from the name sign of his brother Steve, which was an S handshape touching the side of the chin twice. However, in French sign language, Supalla was given a name sign denoting his pointed nose. Celebrities receive name signs if their given names are too unwieldy to fingerspell. For example, the name sign for Mao Zedong in Chinese sign language is composed of the sign for hair, which is the literal meaning of Mao, plus a gesture that alludes to his facial mole (Yau). In many societies, people receive an additional name or undergo a name change when their social relationships or status change (Kendall, 1980).
The most familiar version of this in the United States and United Kingdom is the custom of women taking on their husbands’ surnames in place of their original surname or in addition to their surname. Often individuals receive a new name or nickname explicitly as part of entering adulthood or around adolescence. These may be related to the person’s characteristics or related to objects strongly associated with the person, as in Nuer ox-names (Evans-Pritchard, 1948/1964). In a third of Alford’s (1988) sample, parents take on the name of a child (teknonymy), being called the equivalent of father-of-X or mother-of-X. Among some Africans, X, the child referred to in a teknonym, is the oldest one that still lives at home. As a result, a parent may go through a sequence of names over a few years. The Penan of Borneo even have a default name that can be used in a teknonym until a child receives a name (Needham, 1954). The Penan also take new names when a family member dies. The death name specifies the relationship between the referent and the dead family member(s) as teknonyms do. So, a man may go from being referred to as Tama Jalong [father of Jalong] to Uyung Jalong [first born child Jalong is dead].
In sum, many cultures have naming systems that result in an individual having multiple names either sequentially, simultaneously, or both (see also Rymes, 1996). Name changes are highly likely to slow name retrieval and decrease accuracy, as old names interfere with new ones. As reviewed earlier, the effect of simultaneous multiple names is unclear, and it is likely to be modulated by the phonological and semantic relationships between names, their relative frequencies, their descriptiveness, and the strength of contextual cues associated with different uses.
3.2. How Are People Referred to?
Names serve different purposes. So, even though a person may possess a name with three or more components, it is unlikely that the entire name will actually be used for more than occasional official documents or ceremonies. One question is how individuals are referred to when people who know them discuss them. In Alford’s (1988) sample, people were primarily referred to by some portion of their given name in 46% of societies, by possessed kinship terms (e.g., my uncle, your aunt) in 46%, and by nickname in the remainder. Use of kinship terms was more common when personal names did not already contain genealogical information in the form of surnames or patronyms (e.g., John’s son). Not surprisingly, frequent users of kinship terms tended to be kin-centered societies (see also Stivers et al., 2007). Because kinship terms depend on the meaningful, systematic relationships between people, they could be relatively easy to retrieve. People readily conceptualize messages relative to themselves (e.g., Ertel, 1977; Keysar, Barr, & Horton, 1998; MacWhinney, 1977), so the relationship between a referent and the speaker should be particularly salient and easy to represent. On the other hand, although a speaker might only refer to her aunt as my aunt, she must be able to recognize the references made to that person using a wide array of other kinship terms (e.g., mother, sister). Not only are all of these terms associated with the individual, but they are also all semantically related by being labels for female family members. So, again, the meaningfulness of the term should facilitate its retrieval, but at the same time it introduces a number of semantically related terms that may interfere with it. Moreover, it may be advantageous in conversation to associate an individual with the addressee or someone other than oneself, for example by saying the equivalent of Your sister is causing a scene rather than My aunt (Stivers et al., 2007).
However, retrieving a form that associates a relative to someone other than the speaker is likely to take extra time and increase the risk of speech error. In some cases, a further complication is the presence of taboos on name use. Having unique names in a society is associated with having name taboos (Alford, 1988). Unique names are often treated as extremely intimate and
sometimes sacred. Because they may be so evocative of the person named, in the wrong hands they may be used for bad magic and hence their use is avoided on most occasions (like US social security numbers). To the extent that thinking of a person activates a taboo name, speakers may have difficulty producing an alternative form and the anxiety about violating a taboo may increase the interference between names. Analyses of conversations suggest that speakers prefer to produce short referring expressions and let the addressee provide feedback if the referent is not recognized rather than produce unnecessarily elaborate descriptions (Sacks & Schegloff, 1979). However, speakers need to take into consideration not only whether an addressee is likely to identify the referent of a particular referring expression, but also whether the expression is an appropriate way to refer to the person given the addressee (e.g., Allerton, 1996; Murphy, 1988; Stivers et al., 2007). When speaking to someone of a lower status, it is common for a higher status speaker to refer to another person using the form of reference or address that would be appropriate for the lower status addressee to use rather than the speaker’s own term of address (Dickey, 1997). So, for example, a professor refers to another professor by title and last name (e.g., Dr. Markman) when speaking to an undergraduate, but by a shortened version of the professor’s name under normal circumstances (Art). Another example is when a parent refers to their child’s other parent as the equivalent of Mommy or Daddy only when speaking to their child or in its presence (Befu & Norbeck, 1958; Dickey). These situations share some characteristics with situations in the experimental work on perspective taking and audience design in object reference (see Brennan & Hanna, 2009). That work suggests that overcoming one’s own perspective may be effortful but vary with cultural practice (e.g., Keysar et al., 1998; Wu & Keysar, 2007). 
The broader social context of speech is also important. Some expressions should only be used if the referent is not present. Derogatory nicknames are the most obvious example (Crozier, 2002; McDowell, 1981). As a result, the moment when a person’s physical presence is likely to make their nickname most easily retrievable is also the moment when use of the nickname is least appropriate. Even if the referent is not present, social context matters. Dorian (1970) remarks on the difficulty of determining whether an addressee was someone who would take offense when a byname was used to refer to another person and the difficulty of coming up with another form of reference. On the other hand, speakers may avoid using first names or nicknames as referring expressions when speaking to someone who is less intimate with the referent (Dickey, 1997; Murphy, 1988). Whether a speaker is in the presence of in-group members and can be considered a member of the in-group is very important in selecting a referring expression and avoiding conflict (see Allerton, 1996).
In summary, whether or not an addressee can identify whom a speaker is referring to may be less important than making sure that the referring expression used is appropriate. Among other considerations, choice of an appropriate expression depends on the speaker’s relationship to the referent, the speaker’s relationship to the addressee, the relationship between the addressee and the referent, the presence of overhearers including the referent, the social context, and what the speaker wishes to express or emphasize about the referent and their relationship. Although this seems like it might require a great deal of calculation, it is not clear how much explicit reasoning is actually involved in selecting a referring expression. Contextual cues and implicit memory processes may help make appropriate expressions available (Horton & Gerrig, 2005). Further research is needed but support comes from the finding that even common nouns that have previously been used in conversation with a person are more quickly retrieved in the presence of the same person than with a different person, although the speaker is not speaking to them (Horton, 2007). The next section considers the forms used when the referent is the addressee.
4. Direct Address in Spoken Language
Vocatives are terms that refer to the addressee of an utterance, that is, the person to whom a speaker speaks (Zwicky, 1974). Speakers often address people by name when trying to get their attention[15] (e.g., Hey Jennifer, over here!) and when distinguishing them as the intended addressees of an utterance rather than others within earshot. Not surprisingly, these are usually the first two functions listed for vocatives (Leech, 1999; Zwicky). These functions probably account for vocatives being used more frequently in multiparty conversations than in two-party ones (Leech). Similarly, vocatives occur more often in utterances that introduce a change in topic than in those that continue with the same topic (Wilson & Zeitlyn, 1995). What is more interesting, however, is the third function of vocatives, which has to do with establishing and maintaining social bonds (Leech, 1999; Zwicky, 1974). When addressing someone for this function, the form of address, such as a title, name, endearment, or nickname, is critical. In American culture, one risks offense by avoiding use of someone’s name because it often means that the name (and by implication the person) has been forgotten (e.g., Fiske, 1978). Indeed, one feels like a callous jerk when greeted by name and unable to reciprocate. In other societies, addressing someone by their name may be an affront, particularly when the addressee is one of one’s in-laws (Alford, 1988; Kasanga, 2009). Speech communities vary considerably in their preferred forms of address as well as the conventions that dictate their canonical use. Many variables, including the nature of the relationship between interlocutors, determine the form of address.

[15] Writers vary considerably in the functions they consider as a vocative use and their willingness to use the term vocative in the absence of vocative case marking. Here I will use direct address and vocative for all reference to a speaker’s addressee.
4.1. Forms of Direct Address
Forms of direct address differ in many ways from referring expressions. Grammatically, they typically occur at the boundaries of an utterance (Leech, 1999; McCormick & Richardson, 2006). Particularly when the speaker already has the addressee’s attention, the form of address does not need to clearly specify the person as a referring expression would. Address forms are discussed below roughly in their order of specificity. Nonspecific terms may be more easily retrieved than more specific ones, because they are likely to be used more frequently (e.g., you vs. man vs. John) and because they require less information be accurately retrieved about the addressee. However, richer, more meaningful representations of potential referents may support retrieval of more specific terms.

4.1.1. Address Avoidance and Second-Person Pronouns
Because forms of address are so laden with social meaning (Zwicky, 1974), it is often tempting to avoid them altogether to prevent a social blunder. Indeed, address avoidance in English is common when people are unsure about the appropriate form of address (Ervin-Tripp, 1972). For example, one study found that advanced graduate students were more likely than starting graduate students to avoid explicitly addressing faculty by name (Little & Gelles, 1975). Presumably, the advanced students were past the point where they could comfortably call many faculty members by title and surname (e.g., Dr. Peña) as the starting graduate students did, but not at a point where they felt comfortable using first names. Even if one avoids addressing someone by name or title, one is likely to require a second-person pronoun eventually in a conversation, even if it is simply to ask Would you like fries with that? Modern English speakers are lucky in this regard. You is a high-frequency word that works for both individuals and groups. One need know nothing about the addressee to use it.
A speaker will not reveal much about his or her relationship to the addressee by the word you alone (although the degree of politeness in the remaining utterance is likely to provide clues about social distance). That said, “Hey you!” as an attention getter is considered rude and there are constructions that seem to exist just to help speakers avoid such pronouns (Brown & Levinson, 1987).
The situation is far more complex in languages that have multiple second-person pronouns such as French, Russian, and German, where the form may vary systematically with the degree of intimacy between speaker and addressee as well as their relative status (Brown & Gilman, 1960/1970). Standards for second-person pronoun use changed dramatically in Europe over the twentieth century (e.g., Paulston, 1976). Traditionally, the T-pronoun (tu, ty, du) is informal and used with intimate equals in a casual setting, such as close friends. The V-pronoun (vous, vy, Sie) is formal and required for those of higher status due to wealth, occupation, and age, and in more formal settings. However, even when a speaker is close enough to an addressee to use the T-pronoun, the speaker might switch to the V-pronoun to express respect or contempt (see also Braun, 1988; Ervin-Tripp, 1972). So, while the second-person pronoun in English is insensitive to the subtleties of social interaction, the second-person pronouns of other languages may require speakers to take into account their relationship to the addressee and what they wish to express about that relationship in the utterance. Although this would be habitual for a native speaker, one would expect that the selection of a pronoun would be slowed when cues conflicted.

4.1.2. Insults and Endearments
When someone cuts you off in traffic, terms of address may readily come to mind. Indeed, curses may not only come to mind, but even be uttered involuntarily. An early writer on aphasia noted how well preserved the ability to swear often was in cases of severe brain damage (Jackson, 1866/1958). All that seems needed to generate many insults is indignation. Tailoring the form of the curse to the properties of the addressee (e.g., gender, race, nationality, language) may be optional. On the other end of the emotional spectrum, endearments may likewise be relatively indifferent to the individual characteristics of the addressee.
So, depending on the speaker’s preference, out may come Sweetie, Love, Honey, Snookums, Honeybunny, etc. Alas, psycholinguistic data on insult and endearment production are lacking, but the particularly strong emotions that are involved are likely to facilitate production relative to other forms of address.

4.1.3. Familiarizers and Fictive Kinship Terms
Forms of address such as pal, buddy, chum, man, dude, comrade, or mate appear to be used primarily with strangers to reduce social distance and express solidarity (Brown & Levinson, 1987). For example, there is a famous song from the Great Depression, “Brother, can you spare a dime?” including a verse with “Buddy, can you spare a dime?” (Harburg & Gorney, 1931). Although some familiarizers appear gender-specific, many are used by both genders to address both genders anyway. Indeed, the features of the addressee seem less relevant to their retrieval than the friendly sentiment that
the speaker wants to convey. A survey of mate usage among Australians suggested that young men and women saw it “as a friendly term or as a term of endearment, used within a relaxed, informal or casual context” to address primarily men but also women (Rendle-Short, 2009, p. 253). Men also were likely to use mate when they had forgotten the addressee’s name. Many also just said they used it “out of habit,” which further suggests that it is easily retrievable. Familiarizers seem relatively specific to speech communities and age cohorts, so it seems unlikely that a particular speaker would experience interference from buddy when retrieving dude. Furthermore, familiarizers may often be components of idiomatic expressions so that the other parts of the expression support their retrieval. In many societies, fictive kinship terms are frequently used to address people. For example, a Nuer man will typically call an older man outside of his family the equivalent of father and a younger man my son (Evans-Pritchard, 1948/1964). A study of 74 youths from various countries in Africa found that they used fictive kinship terms such as aunt and uncle about 75% of the time when addressing older adults outside of their families (Kasanga, 2009). In the United States, one seems most often to hear bro, brother, sister, and pop. Such terms should be quite easy to retrieve because they only require knowing the gender and relative age of the addressee, and perhaps whether they belong to one’s cultural group. For example, Navajo traditionally use fictive kinship terms corresponding to my aunt for women older than themselves and corresponding to my grandmother for yet older looking Navajo women (Fiske, 1978). When kinship terms are used so often for address (albeit with different addressees), their forms should be quite easy to retrieve.

4.1.4. Occupational Titles and Other Categorizations
Occupational titles such as Doctor and President as well as military rank (e.g., Private, General) are clear reminders of social roles. Use of a title may indicate that the speaker acknowledges the addressee’s role or that the speaker expects the addressee to live up to the expectations associated with the role. Brown and Levinson (1987) noted that aside from greetings and such, Former Assistant Attorney General Henry Petersen only addressed President Nixon as Mr. President when expressing very sensitive topics such as giving bad news, assurances, suggestions, and asking about touchy subjects like indictments. Occupational titles are meaningful words. The features associated with the concept of doctor (e.g., wearing a stethoscope) can be used to identify people as doctors and retrieve the form of address Doctor. On the other hand, occupations are relational concepts so their conceptual representations are more complex than those for simple objects and this may affect word retrieval. In addition to their meaningfulness, the word forms for occupational titles are often acquired during childhood and may be quite
high in frequency. When used as forms of address, occupational titles may have fewer potential competitors than when used as category labels or for reference. For example, one can address all types of physicians, veterinarians, and dentists as Doctor whereas addressing them by their specialties is not an option (e.g., *Thank you, Dentist). In Jordanian Arabic, anyone who has made the pilgrimage to Mecca may be addressed as hadze or hadzi [pilgrim] (Braun, 1988). Like occupational titles, address forms that are related to simple categorizations should be relatively easy to retrieve. In sum, these categorization-like terms should be much easier to retrieve than personal names but may be more difficult than more general solidarity terms like mate or dude because they are more specific.

4.1.5. Kinship Terms
Forms of Mommy and Daddy are typically among the first 10 words that children learn across cultures (Tardif et al., 2008). Kinship terms are the dominant form of address for 49% of Alford’s (1988) sampled societies. Among English speakers, kinship terms are mostly used to address members of generations older than the speaker (e.g., Dad, Grandma) whereas first names are typically used for members of the same generation or younger, such as one’s siblings and children (Allerton, 1996; Dickey, 1997). In many other cultures, kinship terms extend to a larger group of people, are more specific (i.e., specifying matrilineal or patrilineal descent, or birth order), and are used more frequently than in the United Kingdom and United States. In some communities, speakers occasionally address people using the kinship term that the addressee would normally use to address the speaker (i.e., inverse-kinship terms, bipolar terms). For example, in Kuwaiti Arabic, a father might address his daughter as /yuba/ [my father] when cajoling or placating her (Yassin, 1977). Other inverse-kinship terms in Kuwaiti can be used to express affection, mild rebuke, or condescension.
Such address inversions further emphasize the relationship between the speaker and the addressee by using a term that takes the addressee’s perspective. Braun (1988) noted that address inversion clearly indicates that forms of direct address are not about identifying addressees but rather about emphasizing the relationship between speaker and addressee. With kin, speakers often know an addressee’s name even if it is not appropriate to use it in a specific community or context. So, the availability of multiple forms of address may affect retrieval times. Moreover, as mentioned earlier, multiple related kin terms may become available and compete for selection. That said, the forms for addressing kin (e.g., Mom, Mother), referring to kin (my mom, your mother), or categorizing family relationships (a mom, a mother) are often similar or identical within a language, which, in addition to their early acquisition, should make them easy to retrieve.
4.1.6. Nicknames
Nicknames were the primary form of address in 19% of Alford’s sample. Across cultures, nicknames tend to be based on a person’s physical characteristics, characteristic behaviors, embarrassing episodes, or humorous variations on a person’s given name (e.g., Aceto, 2002; Manning, 1974). In many communities, men receive nicknames more often than women do (Kendall, 1980). Members of the same age cohort are the ones who tend to coin and use nicknames (e.g., Evans-Pritchard, 1948/1964). Drawing attention to personal information is either associated with acceptance and intimacy as when nicknames are used among friends, or aggression and hostility as when they are part of name calling and bullying (see Crozier & Skliopidou, 2002). Nicknames that are intended to be hurtful tend to pick out the most distinctive physical aspect of the person, especially weight (Crozier & Dimmock, 1999). Racial labels and animal names are also common in name calling. Because nicknames are so often descriptive, they should be easier to retrieve than nondescriptive personal names are, but unlike kin terms and occupational titles, they are particular to an individual so they are likely used less frequently. To the extent that they use common words or morphemes (e.g., Shorty, Red), the forms of nicknames should be easier to retrieve than personal names are.

4.1.7. Names and Teknonyms
Names were the primary form of address in 32% of Alford’s (1988) sample. As reviewed earlier, the difficulty of learning and retrieving names depends on variables such as their descriptiveness, meaningfulness, frequency of use, application to multiple people, and perhaps the number of alternative names for the addressee.
Teknonyms such as Abu Ali [father of Ali] may be somewhat more difficult than personal names to retrieve, because they require retrieving the name of the relative whom the speaker is not currently addressing in addition to verbalizing the relationship between the addressee and the relative. On the other hand, in addressing someone as father-of-X, there may be freedom to choose the name of whichever child is easiest to retrieve to form the teknonym (Evans-Pritchard, 1948/1964). Thus, if a person had four children, a speaker may have four potential ways of creating a teknonym. Also, a teknonym may be a better match for how an individual is conceptualized than their personal name is. For example, the father of a friend may be primarily thought of in terms of the friend and paternal relationship, so Abu Ali would be easier to retrieve than a title with a last name such as Mr. Smith.
4.2. Factors Influencing Choice of Address Form
In a large corpus study of vocatives (excluding you) in British and American everyday speech, full or truncated first names (e.g., James or Jimmy) were used in 64% of instances and a title with a surname (Mr. Spock) in a further
2% (Leech, 1999). The remainder of uses were familiarizers such as dude (14%), kinship terms (10%), and endearments (5%). In contrast, an analysis of speech in academic settings (which included you) found that the most commonly used vocatives were group terms such as guys, followed closely by second-person pronouns such as you and you guys (McCormick & Richardson, 2006). About 15% of address forms were names, 7% honorifics (sir), 6.5% familiarizers (dude), and 4% endearments (baby). An analysis of an American middle-class family dinner conversation found that children addressed their parents almost equally often with you and kinship terms (Mom, Dad), whereas parents used you consistently with each other as well as the majority of the time in addressing their children (Wilson & Zeitlyn, 1995). The children addressed each other equally often by first name and with you. Naturally, the relative frequency of different forms of address depends heavily on the contexts that speech is drawn from.
In a classic paper, Brown and Ford (1964) described some criteria that affected the choice of address for Americans in the mid-twentieth century. Upon introduction, people of similar status started off addressing each other by title and last name (Mr. Braddock) and then quickly tended to shift to reciprocal use of first names. Nonreciprocal forms of address are still common when there is a difference in status due to age or occupation between speakers such as student–teacher, employee–employer, and server–customer. The movie The Graduate (1967) included wonderful examples of asymmetrical address. Although the college-age children have known their parents’ friends their entire lives, they address them by title and last name (Mr. or Mrs. Robinson) and in return are addressed by first name. So, the older woman, Mrs. Robinson, addresses her young lover as “Benjamin,” while he persists in calling her “Mrs. Robinson.” Increased intimacy typically results in reciprocal use of first names (Brown & Ford, 1964), but the continued mismatch in address forms emphasizes the status difference and emotional distance between the two.
Strangers or new acquaintances typically have a single form of address. Across cultures, as a relationship grows more intimate, there is often a shift in the form of address used (Befu & Norbeck, 1958; Brown & Ford, 1964; Kasanga, 2009). When there is a difference in status, the higher status person may need to invite the lower status one to use a more familiar form of address. Germans and Swedes even had an informal ceremony to mark the occasion of shifting from the use of formal second-person pronouns to the more familiar pronoun (Brown & Ford, 1964; Paulston, 1976). Use of forms that are associated with intimacy is not taken lightly. Imagine that a stranger addressed you by a pet name that only a significant other used. As Beidelman (1974) points out, “Unwarranted use of a name would thus represent an invasion of a person’s social space, esteem, dignity, privacy, and therefore an abuse” (p. 282). Unfortunately, cultural differences in address terms can be difficult to reconcile and may result in discomfort or offense (Bargiela et al., 2002).
Zenzi M. Griffin
As people become closer, the number of forms they use to address one another often increases (Brown & Ford, 1964; Jonz, 1975). Brown and Ford asked 32 male undergraduates at MIT to each list four men of approximately the same age that he had met about one year before and all the different names he used to address them. A greater number of names used to address a man was associated with greater self-disclosure to the addressee. The researchers compared this to other areas of vocabulary where greater importance and interest in a domain leads to finer lexical distinctions within it (e.g., skiers have more terms for snow than nonskiers). In part, this increase in address forms is likely due to interacting with the person under a broader range of circumstances. Having multiple names for an individual provides a rich means of expressing variations in how the person is considered (Brown, 1959/1970). To the extent that contextual or emotional cues selectively activate particular address forms, having more forms available may not be a problem. Further complicating matters, people might use formal terms of address for someone in overt speech but more intimate terms when thinking about the person, particularly as an object of secret affection (Friedrich, 1972). In such cases, it seems particularly likely that covert forms would interfere with retrieving overt ones.
Shifts in vocative forms convey the speaker’s current attitude about the relationship, the addressee, or the message to be conveyed (e.g., Rendle-Short, 2009). A clear example of this is the use of endearments like Sweetie or Love, particularly when providing emotional support. In the United States, parents typically address their children by first name or a truncated version of it. However, they may address their children by their full names (first, middle, and last) when displeased by their behavior (Brown & Ford, 1964).
Although the use of full names is canonically associated with formal occasions and politeness, it connotes distance when used by chiding parents. Likewise, college students report that they are more likely to use formal terms like Mother and Father rather than Mom and Dad when in conflict with their parents than in casual situations (Lewis, 1965). Several researchers note that speakers shift to more intimate or kinship terms for family members when requesting something from an addressee (Ervin-Tripp, 1972; Kendall, 1980). An anthropologist’s informant expressed the motivation succinctly, “If you call someone maya [parallel cousin], they have to treat you right” (Kendall, 1980, p. 266). In contrast, speakers may switch to more polite forms such as title and last name when requesting something of nonkin (Brown & Levinson, 1979; Jonz, 1975). Indeed, use of address forms is an important part of softening threats to the “face” of one’s addressee (Brown & Levinson, 1987). In emergency situations, there is no time for considering the merits of alternative forms of address. Jonz (1975) reported that bare titles were particularly likely to be used in the military when under fire. Although the use of first names and diminutives is standardly associated with intimacy and the use of terms like buddy or mate often implies solidarity
and equality, these terms may also be used to connote condescension or hostility (Brown & Ford, 1964; Ervin-Tripp, 1972; Rendle-Short, 2009). So, the form that a vocative takes not only reflects the respective social roles of the speaker and the addressee, but also communicates the speaker’s current attitude about the relationship and the addressee. At present, far too little is known about the production processes behind such ironic use of words to speculate about the processing involved (but see Hancock, 2004).
Interestingly, name signs are not used for direct address among signers (Schuit, 2009). Second-person pronouns are in the lexicons of signed languages, but I have as yet found no data about variations in politeness for these or other variations in address forms.
This summary only scratches the surface of work that has been done on personal address (see, e.g., Braun, 1988; Philipsen & Huspek, 1985). Very little is known about how these considerations come into play in online word production. Clearly, understanding the processes that underlie the production of address terms will require the field to consider speakers’ intentions, emotions, attitudes, and social interactions at a deeper level than it previously has.
5. Conclusion
The study of how speakers retrieve personal names has thus far primarily addressed a small subset of name forms (mostly nondescriptive surnames) in relatively unrepresentative domains of name usage (labeling faces and referring in narration). On the other hand, abundant information is available from anthropology and sociolinguistics about naming systems and the variables that affect the choice of terms for personal reference and address. Given the recent shift to studying language processes in communicative settings and the increase in experimental methods for doing so (for review, see Griffin & Crew, 2010), the field is ripe for beginning to address the production of socially relevant language in communicative contexts. Social psychology can provide information about person perception, but challenges will lie in developing production models that can capture the importance and subtleties of social relationships, communicative intentions, and discourse factors.
ACKNOWLEDGMENTS
Thanks to Tamar Gollan, Brian Ross, and Cognition and Communication Lab members Callan Cooper, Cassandra Jacobs, and Madeline Clark for comments on drafts of this chapter, and to the many people who have discussed names with me. I am especially grateful to the people who stimulated my interest in this area by emailing me to ask about the cause of substitution errors for personal names.
REFERENCES
Aceto, M. (2002). Ethnic personal names and multiple identities in Anglophone Caribbean speech communities in Latin America. Language in Society, 31(4), 577–608.
Alford, R. D. (1988). Naming and identity: A cross-cultural study of personal naming practice. New Haven, CT: HRAF Press.
Allerton, D. J. (1987). The linguistic and sociolinguistic status of proper names: What are they, and who do they belong to? Journal of Pragmatics, 11(1), 61–92.
Allerton, D. J. (1996). Proper names and definite descriptions with the same reference: A pragmatic choice for language users. Journal of Pragmatics, 25(5), 621–633.
Alwin, D. F., & McCammon, R. J. (2001). Aging, cohorts, and verbal ability. Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 56(3), S151–S161.
Andersen, S. M., & Berk, M. S. (1998). The social-cognitive model of transference: Experiencing past relationships in the present. Current Directions in Psychological Science, 7(4), 109–115.
Arnold, J. E. (2008). Reference production: Production-internal and addressee-oriented processes. Language and Cognitive Processes, 23(4), 495–527.
Arnold, J. E., & Griffin, Z. M. (2007). The effect of additional characters on choice of referring expression: Everyone counts. Journal of Memory and Language, 56, 521–536.
Bargiela, F., Boz, C., Gokzadze, L., Hamza, A., Mills, S., & Rukhadze, N. (2002). Ethnocentrism, politeness and naming strategies. In: Working Papers on the Web, Vol. 3: Linguistic Politeness and Context. Retrieved December 1, 2009, from http://extra.shu.ac.uk/wpw/politeness/bargiela.htm.
Bates, E., D’Amico, S., Jacobsen, T., Székely, A., Andonova, E., Devescovi, A., et al. (2003). Timed picture naming in seven languages. Psychonomic Bulletin & Review, 10(2), 344–380.
Befu, H., & Norbeck, E. (1958). Japanese usages of terms of relationship. Southwestern Journal of Anthropology, 14(1), 66–86.
Beidelman, T. O. (1974). Kaguru names and naming. Journal of Anthropological Research, 30(4), 281–293.
Bock, J. K. (1977). The effect of a pragmatic presupposition on syntactic structure in question answering. Journal of Verbal Learning and Verbal Behavior, 16, 723–734.
Bonin, P., Chalard, M., Méot, A., & Fayol, M. (2002). The determinants of spoken and written picture naming latencies. British Journal of Psychology, 93(1), 89–114.
Bonin, P., Perret, C., Méot, A., Ferrand, L., & Mermillod, M. (2008). Psycholinguistic norms and face naming times for photographs of celebrities in French. Behavior Research Methods, 40(1), 137–146.
Braun, F. (1988). Terms of address: Problems of patterns and usage in various languages and cultures. Berlin: Mouton de Gruyter.
Brédart, S. (1993). Retrieval failures in face naming. In G. Cohen & D. M. Burke (Eds.), Memory for proper names (pp. 351–366). Hillsdale, NJ: Lawrence Erlbaum Associates.
Brédart, S. (1996). Person familiarity and name-retrieval failures: How are they related? Cahiers de Psychologie Cognitive/Current Psychology of Cognition, 15(1), 113–120.
Brédart, S., & Valentine, T. (1992). From Monroe to Moreau: An analysis of face naming errors. Cognition, 45(3), 187–223.
Brédart, S., & Valentine, T. (1998). Descriptiveness and proper name retrieval. Memory, 6(2), 199–206.
Brennan, S. E., & Hanna, J. E. (2009). Partner-specific adaptation in dialog. Topics in Cognitive Science, 1(2), 274–291.
Brennen, T. (1993). The difficulty with recalling people’s names: The plausible phonology hypothesis. In G. Cohen & D. M. Burke (Eds.), Memory for proper names (pp. 409–431). Hillsdale, NJ: Lawrence Erlbaum Associates.
Brennen, T. (2000). On the meaning of personal names: A view from cognitive psychology. Names, 48(2), 139–146.
Brennen, T., & Bruce, V. (1991). Context effects in the processing of familiar faces. Psychological Research, 53, 296–304.
Brooks, J. O., III, Friedman, L., Gibson, J. M., & Yesavage, J. A. (1993). Spontaneous mnemonic strategies used by older and younger adults to remember proper names. In G. Cohen & D. M. Burke (Eds.), Memory for proper names (pp. 393–407). Hillsdale, NJ: Lawrence Erlbaum Associates.
Brown, P., & Levinson, S. C. (1979). Social structure, groups and interaction. In H. Giles & K. R. Scherer (Eds.), Social markers in speech (pp. 291–341). Cambridge, UK: Cambridge University Press.
Brown, P., & Levinson, S. C. (1987). Politeness: Some universals in language usage. Cambridge, UK: Cambridge University Press.
Brown, R. (1959/1970). A review of Nabokov’s Lolita. In R. Brown (Ed.), Psycholinguistics: Selected papers by Roger Brown (pp. 370–376). New York, NY: Free Press.
Brown, R., & Ford, M. (1964). Address in American English. In D. Hymes (Ed.), Language in culture and society: A reader in linguistics and anthropology (pp. 234–244). New York, NY: Harper & Row.
Brown, R., & Gilman, A. (1960/1970). Pronouns of power and solidarity. In R. Brown (Ed.), Psycholinguistics: Selected papers by Roger Brown. New York, NY: Free Press.
Brown, R., & McNeill, D. (1966). The “tip-of-the-tongue” phenomenon. Journal of Verbal Learning and Verbal Behavior, 5, 325–337.
Bruce, V., Burton, A. M., & Walker, S. (1994). Testing the models? New data and commentary on Stanhope & Cohen (1993). British Journal of Psychology, 85(3), 335–349.
Brysbaert, M., & Ghyselinck, M. (2006). The effect of age of acquisition: Partly frequency related, partly frequency independent. Visual Cognition, 13(7), 992–1011.
Burgess, C., & Conley, P. (1999). Representing proper names and objects in a common semantic space: A computational model. Brain and Cognition, 40(1), 67–70.
Burke, D. M., Locantore, J. K., Austin, A. A., & Chae, B. (2004). Cherry pit primes Brad Pitt: Homophone priming effects on young and older adults’ production of proper names. Psychological Science, 15(3), 164–170.
Burke, D. M., MacKay, D. G., Worthley, J. S., & Wade, E. (1991). On the tip of the tongue: What causes word finding failures in young and older adults? Journal of Memory and Language, 30(5), 542–579.
Burton, A. M., Bruce, V., & Johnston, R. A. (1990). Understanding face recognition with an interactive activation model. British Journal of Psychology, 81, 361–380.
Caramazza, A., Costa, A., Miozzo, M., & Bi, Y. C. (2001). The specific-word frequency effect: Implications for the representation of homophones in speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(6), 1430–1450.
Carroll, J. B., & White, M. N. (1973). Word frequency and age of acquisition as determinants of picture-naming latency. Quarterly Journal of Experimental Psychology, 25, 85–95.
Carson, D. R., & Burton, A. M. (2001). Semantic priming of person recognition: Categorial priming may be a weaker form of the associative priming effect. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 54(4), 1155–1179.
Cassidy, K. W., Kelly, M. H., & Sharoni, L. J. (1999). Inferring gender from name phonology. Journal of Experimental Psychology: General, 128(3), 362–381.
Cipolotti, L., McNeil, J. E., & Warrington, E. K. (1993). Spared written naming of proper nouns: A case report. In G. Cohen & D. M. Burke (Eds.), Memory for proper names (pp. 289–311). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Cohen, G. (1990a). Recognition and retrieval of proper names: Age differences in the fan effect. European Journal of Cognitive Psychology, 2(3), 193–204.
Cohen, G. (1990b). Why is it difficult to put names to faces? British Journal of Psychology, 81(3), 287–297.
Cohen, G., & Burke, D. M. (1993). Memory for proper names: A review. In G. Cohen & D. M. Burke (Eds.), Memory for proper names (pp. 249–263). Hillsdale, NJ: Lawrence Erlbaum Associates.
Cohen, G., & Faulkner, D. (1986). Memory for proper names: Age differences in retrieval. British Journal of Developmental Psychology, 4(2), 187–197.
Conway, M. A., Cohen, G., & Stanhope, N. (1991). On the very long-term retention of knowledge acquired through formal education: Twelve years of cognitive psychology. Journal of Experimental Psychology: General, 120(4), 395–409.
Cree, G. S., McNorgan, C., & McRae, K. (2006). Distinctive features hold a privileged status in the computation of word meaning: Implications for theories of semantic memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32(4), 643–658.
Cree, G. S., & McRae, K. (2003). Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese, and cello (and many other such concrete nouns). Journal of Experimental Psychology: General, 132(2), 163–201.
Crook, T. H., & West, R. L. (1990). Name recall performance across the adult life-span. British Journal of Psychology, 81(3), 335–349.
Cross, E. S., & Burke, D. M. (2004). Do alternative names block young and older adults’ retrieval of proper names? Brain and Language, 89(1), 174–181.
Crozier, W. R., & Dimmock, P. S. (1999). Name-calling and nicknames in a sample of primary school children. British Journal of Educational Psychology, 69, 505–516.
Crozier, W. R., & Skliopidou, E. (2002). Adult recollections of name-calling at school. Educational Psychology, 22(1), 113–124.
Cutting, J. C., & Ferreira, V. S. (1999). Semantic and phonological information flow in the production lexicon. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 318–344.
Dahlgren, D. J. (1998). Impact of knowledge and age on tip-of-the-tongue rates. Experimental Aging Research, 24(2), 139–153.
Darling, S., & Valentine, T. (2005). The categorical structure of semantic memory for famous people: A new approach using release from proactive interference. Cognition, 96(1), 35–65.
Dell, G. S. (1990). Effects of frequency and vocabulary type on phonological speech errors. Language and Cognitive Processes, 5, 313–349.
Dell, G. S., & Gordon, J. K. (2003). Neighbors in the lexicon: Friends or foes? In N. O. Schiller & A. S. Meyer (Eds.), Phonetics and phonology in language comprehension and production (pp. 9–38). Berlin: Mouton de Gruyter.
Dickey, E. (1997). Forms of address and terms of reference. Journal of Linguistics, 33(2), 255–274.
Dorian, N. C. (1970). A substitute name system in the Scottish Highlands. American Anthropologist, 72(2), 303–319.
Ertel, S. (1977). Where do the subjects of sentences come from? In S. Rosenberg (Ed.), Sentence production: Developments in research and theory (pp. 141–167). Hillsdale, NJ: Lawrence Erlbaum Associates.
Ervin-Tripp, S. M. (1972). Alternation and co-occurrence. In J. J. Gumperz & D. Hymes (Eds.), Directions in sociolinguistics: The ethnography of communication (pp. 218–250). New York, NY: Holt, Rinehart and Winston.
Evans-Pritchard, E. E. (1948/1964). Nuer modes of address. In D. Hymes (Ed.), Language in culture and society (pp. 221–227). New York, NY: Harper & Row.
Fiske, S. (1978). Rules of address: Navajo women in Los Angeles. Journal of Anthropological Research, 34(1), 72–91.
Fitzsimons, G. M., & Shah, J. Y. (2009). Confusing one instrumental other for another: Goal effects on social categorization. Psychological Science, 20(12), 1468–1472.
Flude, B. M., Ellis, A. W., & Kay, J. (1989). Face processing and name retrieval in an anomic aphasic: Names are stored separately from semantic information about familiar people. Brain and Cognition, 11(1), 60–72.
Fraas, M., Lockwood, J., Neils-Strunjas, J., Shidler, M., Krikorian, R., & Weiler, E. (2002). ‘What’s his name?’ A comparison of elderly participants’ and undergraduate students’ misnamings. Archives of Gerontology and Geriatrics, 34(2), 155–165.
Fogler, K. A., & James, L. E. (2007). Charlie Brown versus Snow White: The effects of descriptiveness on young and older adults’ retrieval of proper names. Journals of Gerontology: Series B: Psychological Sciences and Social Sciences, 62(4), 201–207.
Friedrich, P. (1972). Social context and semantic feature: The Russian pronominal usage. In J. Gumperz & D. Hymes (Eds.), Directions in sociolinguistics (pp. 273–300). New York, NY: Holt, Rinehart and Winston.
Fromkin, V. A. (1971). The non-anomalous nature of anomalous utterances. Language, 47, 27–52.
Gentner, D., & Kurtz, K. (2005). Relational categories. In W. K. Ahn, R. L. Goldstone, B. C. Love, A. B. Markman, & P. W. Wolff (Eds.), Categorization inside and outside the lab (pp. 151–175). Washington, DC: American Psychological Association.
Ghika-Schmid, F., & Nater, B. (2003). Anomia for people’s names, a restricted form of transient epileptic amnesia. European Journal of Neurology, 10(6), 651–654.
Goldrick, M., Folk, J. R., & Rapp, B. (2010). Mrs. Malaprop’s neighborhood: Using word errors to reveal neighborhood structure. Journal of Memory and Language, 62, 113–134.
Gollan, T. H., Bonanni, M. P., & Montoya, R. (2005). Proper names get stuck on bilingual and monolingual speakers’ tip of the tongue equally often. Neuropsychology, 19(3), 278–287.
Gollan, T. H., & Brown, A. S. (2006). From tip-of-the-tongue (TOT) data to theoretical implications in two steps: When more TOTs means better retrieval. Journal of Experimental Psychology: General, 135(3), 462–483.
Griffin, Z. M., & Crew, C. (2010). Research in language production. In M. Spivey, M. Joanisse, & K. McRae (Eds.), Cambridge handbook of psycholinguistics. Cambridge, UK: Cambridge University Press (in press).
Griffin, Z. M., & Ferreira, V. S. (2006). Properties of spoken language production. In M. J. Traxler & M. A. Gernsbacher (Eds.), Handbook of psycholinguistics (2nd ed., pp. 21–59). London: Elsevier.
Griffin, Z. M., & Wangerman, T. (2008). “Lisa, Patty, Selma, Snowball . . . Maggie!” Names that parents call their children by mistake. Poster presented at the 5th International Workshop on Language Production, Annapolis, MD.
Hancock, J. T. (2004). Verbal irony use in face-to-face and computer-mediated conversations. Journal of Language and Social Psychology, 23(4), 447–463.
Hanley, J. R., & Chapman, E. (2008). Partial knowledge in a tip-of-the-tongue state about two- and three-word proper names. Psychonomic Bulletin & Review, 15(1), 156–160.
Hanley, J. R., & Kay, J. (1998). Proper name anomia and anomia for the names of people: Functionally dissociable impairments? Cortex, 34(1), 155–158.
Harburg, E. Y., & Gorney, J. (1931). Brother, can you spare a dime. (lyrics by Yip Harburg, music by Jay Gorney).
Harley, T. A., & Bown, H. E. (1998). What causes a tip-of-the-tongue state? Evidence for lexical neighborhood effects in speech production. British Journal of Psychology, 89, 151–174.
Harris, D. M., & Kay, J. (1995). I recognize your face but I can’t remember your name: Is it because names are unique? British Journal of Psychology, 86(3), 345–358.
Hittmair-Delazer, M., Denes, G., Semenza, C., & Mantovan, M. C. (1994). Anomia for people’s names. Neuropsychologia, 32(4), 465–476.
Horton, W. S. (2007). The influence of partner-specific memory associations on language production: Evidence from picture naming. Language and Cognitive Processes, 22(7), 1114–1139.
Horton, W. S., & Gerrig, R. J. (2005). Conversational common ground and memory processes in language production. Discourse Processes, 40(1), 1–35.
Jackson, H. J. (1866/1958). Notes on the physiology and pathology of language. In J. Taylor (Ed.), Selected writings of John Hughlings Jackson. London: Staples Press (originally published in 1866, Vol. 2, pp. 121–128).
James, L. E. (2004). Meeting Mr. Farmer versus Meeting a Farmer: Specific effects of aging on learning proper names. Psychology and Aging, 19(3), 515–522.
James, L. E., & Fogler, K. A. (2007). Meeting Mr. Davis vs Mr. Davin: Effects of name frequency on learning proper names in young and older adults. Memory, 15(4), 366–374.
Jescheniak, J.-D., & Levelt, W. J. M. (1994). Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 824–843.
Jescheniak, J. D., Meyer, A. S., & Levelt, W. J. M. (2003). Specific-word frequency is not all that counts in speech production: Comments on Caramazza, Costa, et al. (2001) and new experimental data. Journal of Experimental Psychology: Learning, Memory, & Cognition, 29(3), 432–438.
Jonz, J. G. (1975). Situated address in the United States Marine Corps. Anthropological Linguistics, 17(2), 68–77.
Kasanga, L. A. (2009). Language socialization: The naming of non-kin adults by African children and preadolescents in intercultural encounters. Intercultural Pragmatics, 6(1), 85–114.
Kendall, M. B. (1980). Exegesis and translation: Northern Yuman names as texts. Journal of Anthropological Research, 36(3), 261–273.
Keysar, B., Barr, D. J., & Horton, W. S. (1998). The egocentric basis of language use: Insights from a processing approach. Current Directions in Psychological Science, 7(2), 46–50.
Kourbetis, V., & Hoffmeister, R. J. (2002). Name signs in Greek sign language. American Annals of the Deaf, 147(3), 35–43.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211–240.
Leech, G. (1999). The distribution and function of vocatives in American and British English conversation. In H. Hasselgård & S. Oksefjell (Eds.), Out of corpora: Studies in honour of Stig Johansson (pp. 107–118). Amsterdam: Rodopi.
Lele, V. (2009). “It’s not really a nickname, it’s a method”: Local names, state intimates, and kinship register in the Irish Gaeltacht. Journal of Linguistic Anthropology, 19(1), 101–116.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Lévi-Strauss, C. (1966). The savage mind. Chicago, IL: University of Chicago Press.
Lewis, L. S. (1965). Terms of address for parents and some clues about social relationships in the American family. The Family Life Coordinator, 14(2), 43–46.
Little, C. B., & Gelles, R. J. (1975). The social psychological implications of form of address. Sociometry, 38(4), 573–586.
Lloyd-Jones, T. J., & Nettlemill, M. (2007). Sources of error in picture naming under time pressure. Memory & Cognition, 35(4), 816–836.
Lucchelli, F., Muggia, S., & Spinnler, H. (1997). Selective proper name anomia: A case involving only contemporary celebrities. Cognitive Neuropsychology, 14(6), 881–900.
Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28(2), 203–208.
Lupker, S. J. (1979). The semantic nature of response competition in the picture–word interference task. Memory & Cognition, 7, 485–495.
MacWhinney, B. (1977). Starting points. Language, 53, 152–168.
Manning, F. E. (1974). Nicknames and number plates in the British West Indies. Journal of American Folklore, 87(344), 123–132.
Markman, A. B., & Stilwell, C. H. (2001). Role-governed categories. Journal of Experimental & Theoretical Artificial Intelligence, 13, 329–358.
McCormick, J., & Richardson, S. (2006). Vocatives in MICASE [Electronic Version]. MICASE Kibbitzers, 12. Retrieved November 18, 2008, from http://micase.elicorpora.info/micase-kibbitzers/12-vocatives-in-micase.
McDowell, J. H. (1981). Toward a semiotics of nicknaming: The Kamsá example. Journal of American Folklore, 94(371), 1–18.
McRae, K., de Sa, V. R., & Seidenberg, M. S. (1997). On the nature and scope of featural representations of word meaning. Journal of Experimental Psychology: General, 126(2), 99–130.
McWeeny, K. H., Young, A. W., Hay, D. C., & Ellis, A. W. (1987). Putting names to faces. British Journal of Psychology, 78(2), 143–149.
Meyer, A. S., & Belke, E. (2007). Word form retrieval in language production. In M. G. Gaskell (Ed.), Oxford handbook of psycholinguistics (pp. 471–487). Oxford: Oxford University Press.
Milders, M. (2000). Naming famous faces and buildings. Cortex, 36(1), 139–145.
Milders, M., Deelman, B., & Berg, I. (1998). Rehabilitation of memory for people’s names. Memory, 6(1), 21–36.
Miller, G. A., & Johnson-Laird, P. N. (1976). Language and Perception. Cambridge, MA: Harvard University Press.
Moore, V., & Valentine, T. (1998). The effect of age of acquisition on speed and accuracy of naming famous faces. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 51(3), 485–513.
Morris, P. E., Fritz, C. O., Jackson, L., Nichol, E., & Roberts, E. (2005). Strategies for learning proper names: Expanding retrieval practice, meaning and imagery. Applied Cognitive Psychology, 19(6), 779–798.
Morrison, C. M., Chappell, T. D., & Ellis, A. W. (1997). Age of acquisition norms for a large set of object names and their relation to adult estimates and other variables. Quarterly Journal of Experimental Psychology A, 50, 528–559.
Murphy, G. L. (1988). Personal reference in English. Language in Society, 17(3), 317–349.
Needham, R. (1954). The system of teknonyms and death-names of the Penan. Southwestern Journal of Anthropology, 10(4), 416–431.
Oldfield, R. C., & Wingfield, A. (1964). The time it takes to name an object. Nature, 202, 1031–1032.
Olson, D. R. (1970). Language and thought: Aspects of a cognitive theory of semantics. Psychological Review, 77, 257–273.
Paulston, C. B. (1976). Pronouns of address in Swedish: Social class semantics and a changing system. Language in Society, 5(3), 359–386.
Pelamatti, G., Pascotto, M., & Semenza, C. (2003). Verbal free recall in high altitude: Proper names vs common names. Cortex, 39(1), 97–103.
Philipsen, G., & Huspek, M. (1985). A bibliography of sociolinguistic studies of personal address. Anthropological Linguistics, 27(1), 94–101.
Rendle-Short, J. (2009). The address term mate in Australian English: Is it still a masculine term? Australian Journal of Linguistics, 29(2), 245–268.
Rymes, B. (1996). Naming as social practice: The case of Little Creeper from Diamond Street. Language in Society, 25(2), 237–260.
Sacks, H., & Schegloff, E. A. (1979). Two preferences in the organization of reference to persons in conversation and their interaction. In G. Psathas (Ed.), Everyday language: Studies in ethnomethodology (pp. 15–21). New York, NY: Halsted (Irvington). Saetti, M. C., Marangolo, P., De Renzi, E., Rinaldi, M. C., & Lattanzi, E. (1999). The nature of the disorder underlying the inability to retrieve proper names. Cortex, 35(5), 675–685. Schuit, J. (2009). What’s in a name sign? Name signs in sign language of the Netherlands (NGT). In A. Ender, M. Matter & F. Tissot (Eds.), Proceedings der 39 Studentischen Tagung Sprachwissenschaft (StuTS), (pp. 21-34). Bern: Universitaet Bern Arbeitspapiere. Semenza, C. (1997). Proper-name-specific aphasias. In H. Goodglass & A. Wingfield (Eds.), Anomia: Neuroanatomical and cognitive correlates (pp. 115–134). San Diego, CA: Academic Press. Semenza, C. (2006). Retrieval pathways for common and proper names. Cortex, 42(6), 884–891. Semenza, C., & Zettin, M. (1989). Evidence from aphasia for the role of proper names as pure referring expressions. Nature, 342(6250), 678–679. Sevald, C. A., & Dell, G. S. (1994). The sequential cuing effect in speech production. Cognition, 53(2), 91–127. Smith, S. W., Noda, H. P., Andrews, S., & Jucker, A. H. (2005). Setting the stage: How speakers prepare listeners for the introduction of referents in dialogues and monologues. Journal of Pragmatics, 37(11), 1865–1895. Snodgrass, J. G., & Yuditsky, T. (1996). Naming times for the Snodgrass and Vanderwart pictures. Behavior Research Methods, Instruments, & Computers, 28, 516–536. Stanhope, N., & Cohen, G. (1993). Retrieval of proper names: Testing the models. British Journal of Psychology, 84(1), 51–65. Stevenage, S. V., & Lewis, H. G. (2005). By which name should I call thee? The consequences of having multiple names. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 58(8), 1447–1461. Stivers, T., Enfield, N. J., & Levinson, S. 
C. (2007). Person reference in interaction. In N. J. Enfield & T. Stivers (Eds.), Person reference in interaction: Linguistic, cultural, and social perspectives (pp. 1–20). Cambridge, UK: Cambridge University Press. Supalla, S. J. (1992). The book of name signs: Naming in American Sign Language. San Diego, CA: Dawn Sign Press. Tardif, T., Fletcher, P., Liang, W. L., Zhang, Z. X., Kaciroti, N., & Marchman, V. A. (2008). Baby’s first 10 words. Developmental Psychology, 44(4), 929–938. Todd, M. G., & Robert, L. G. (2009). How you named your child: Understanding the relationship between individual decision making and collective outcomes. Topics in Cognitive Science, 1(4), 651–674. Tyler, L. K., Moss, H. E., Durrant-Peatfield, M. R., & Levy, J. P. (2000). Conceptual structure and the structure of concepts: A distributed account of category-specific deficits. Brain and Language, 75(2), 195–231. Valentine, T., Brennen, T., & Brédart, S. (1996). The cognitive psychology of proper names: On the importance of being Ernest. London: Routledge. Valentine, T., & Darling, S. (2006). Competitor effects in naming objects and famous faces. European Journal of Cognitive Psychology, 18(5), 686–707. Valentine, T., Hollis, J., & Moore, V. (1999). The nominal competitor effect: When one name is better than two. In M. Hahn & S. C. Stoness (Eds.), Proceedings of the 21st Annual Meeting of the Cognitive Science Society, (pp. 749–754). Mahwah, NJ: Lawrence Erlbaum Associates. Valentine, T., & Moore, V. (1995). Naming faces: The effects of facial distinctiveness and surname frequency. Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 48(4), 849–878.
Verhaeghen, P. (2003). Aging and vocabulary scores: A meta-analysis. Psychology and Aging, 18, 332–339. Vigliocco, G., Vinson, D. P., Damian, M. F., & Levelt, W. J. M. (2002). Semantic distance effects on object and action naming. Cognition, 85, B61–B69. Vitevitch, M. S. (2002). The influence of phonological similarity neighborhoods on speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28(4), 735–747. Vitkovitch, M., Humphreys, G. W., & Lloyd-Jones, T. J. (1993). On naming a giraffe a zebra: Picture naming errors across different object categories. Journal of Experimental Psychology: Learning, Memory, & Cognition, 19(2), 243–259. Warrington, E. K., & Clegg, F. (1993). Selective preservation of place names in an aphasic patient: A short report. In G. Cohen & D. M. Burke (Eds.), Memory for proper names (pp. 281–288). Hillsdale, NJ: Lawrence Erlbaum Associates. Wilson, A. J., & Zeitlyn, D. (1995). The distribution of person-referring expressions in natural conversation. Research on Language and Social Interaction, 28(1), 61–92. Wu, S., & Keysar, B. (2007). The effect of culture on perspective taking. Psychological Science, 18(7), 600–606. Yassin, M. A. F. (1977). Bi-polar terms of address in Kuwaiti Arabic. Bulletin of the School of Oriental and African Studies, University of London, 40(2), 297–301. Yasuda, K., Nakamura, T., & Beckman, B. (2000). Review. Aphasiology, 14(11), 1067–1089. Yau, S.-C. (1996). The weight of tradition in the formation of the name signs of the deaf in China. Diogenes, 44(3), 55–65. Young, A. W., Ellis, A. W., Flude, B. M., McWeeny, K. H., & Hay, D. C. (1986). Face–name interference. Journal of Experimental Psychology: Human Perception and Performance, 12, 466–475. Young, A. W., Flude, B. M., Hellawell, D. J., & Ellis, A. W. (1994). The nature of semantic priming effects in the recognition of familiar people. British Journal of Psychology, 85, 393–411. Young, A. W., Hay, D. C., & Ellis, A. W. (1985). 
The faces that launched a thousand slips: Everyday difficulties and errors in recognizing people. British Journal of Psychology, 76(4), 495–523. Zwicky, A. (1974). Hey, whatsyourname!. Chicago Linguistic Society, 10, 787–801.
Subject Index
A Action disorganization syndrome. See Prefrontal cortex (PFC) Adaptive memory domain-specific mnemonic process, potential candidates, 3–4 memory theory and nature’s criterion encoding–retrieval match, 13–14 episodic future thought, 16–18 levels of processing, 14–16 rational analysis, memory, 18–20 stone-age brain, remembering, 20–21 ancestral priorities, survival processing, 23–24 cognitive adaptations, 21–23 domain-specific knowledge systems, 25 mnemonic adaptations, 24 multiple module, 26 shallow perceptual dimensions, 26 survival processing paradigm emotional processing, 8–9 proportion correct recall, words, 6–7 scenarios, 5 special adaptation, 11–12 thematic processing, 9–11 taboo words, 4 temporal context, 2 Age-invariance, 84–85 Aging episodic memory and situation model, 277–279 event segmentation, 279–282 midbrain neuromodulatory systems, 277 prefrontal cortex, 276–277 Alzheimer’s disease (AD) brain changes and cognitive deficits, 283–285 event segmentation, 286–287 symptoms, 282–283 Anterior cingulate cortex (ACC), 258 Argument evaluation, 185, 191–193, 203–204 B Blindness, 54–55 C Category-based induction inference. See also Inductive inference generation cognitive functions, 221
forced-choice/argument-evaluation, 219 induction process, 184 inductive reasoning, 221–222 premise categories, 223 taxonomic relations, 220 Causal inferences, 215–216 Chronic déjà vu, 55–56 Creativity implications, 173–174 retrieving analogies brainstorming, 155–156 incubation/preparedness effects, 157 social factors, 156 Cued-recall method, 233 D Deficient-processing effects homographic repetition, 77 presentation rate, 78 same-sense repetition, 77 testing effect, 117–119 word puzzles, 77 Déjà vu research aging, 56–57 dreams, 57 implicit memory explanation episodic experience, 43–46 gestalt familiarity explanation, 49–51 hypnosis, 51–52 single-element familiarity explanation, 46–49 Jamais vu, 58–59 physiological explanation neural transmission asynchrony, 52–53 surgical elicitation, 53–54 surgical elimination, 53 reincarnation and extra sensory perception, 35 reports, anomalous individuals blindness, 54–55 chronic déjà vu, 55–56 single vs. multiple causes, 57–58 split perception Jacoby and Whitehouse’s design, 37–38 modern cognitive science, 36 peripheral priming possibility, 41 pre-experiment source rating, 41 superficial glance, shallow processing, 42–43 symbols, 38–41
Dialog processing confederation, 308 different perspectives, 306–307 grounding process, 306 process model collaborative view, 311–313 grounding, 311–313 message, 308–310 two-stage, 310–311 transcripts, 304–307 Direct address, spoken language address avoidance and second-person pronouns, 372–373 address form choice, factors, 376–379 familiarizers and fictive kinship terms, 373–374 insults and endearments, 373 kinship terms, 375 names and teknonyms, 376 nicknames, 376 occupational titles, 374–375 social bonds, 371 vocatives, 371 Domain knowledge acquisition implications, 175–176 retrieving analogies complex declarative learning, 158 progressive alignment, 159 salient surface property, 159 social guidance, 159 tradeoff, 158 Dual-process model, 322 E Episodic memory, 277–279 Epistemic uncertainty and approximation demand characteristics, 247 distinction, 246–247 expertise function, 238 vs. novices, 238–239 submarine domain, 238 uncertainty detection and resolution strategies, 240 gestures coding cross-validation, 236 taxonomies, 235 uncertainty speech code, 237 visual–spatial content, 234–235 linguistic pragmatics, 229–231 patterns, 247 psychological uncertainty vs. approximation, 229 qualitative reasoning, 248 spatial reasoning mental simulations, engineering design, 244–246
spatial gestures, 242–244 verbally coded spatial transformations, 241–242 speech coding conversation and interview coding, science data analysis, 232–234 conversation coding, engineering design team, 231–232 types, 229 Error-related negativity (ERN), 266 Event perception aging episodic memory and situation model, 277–279 midbrain neuromodulatory systems, 277 possibilities, 279–282 prefrontal cortex, 276–277 Alzheimer’s disease (AD) brain changes and cognitive deficits, 283–285 symptoms, 282–283 event segmentation theory (EST) behavior and brain function, 259–260 components, 256 event models, 255 event schemata, 257 predictions, 255 temporal dynamics, 258–259 working memory (WM) representation, 261 obsessive-compulsive disorder (OCD) cognitive disturbances, 265–267 neurochemical mechanism, 265 possibilities, 267–269 serotonin and dopamine, 265 Parkinson’s disease (PD) cognitive deficits, 269–270 symptoms, 269 prefrontal cortex (PFC) cognitive deficits, 272–274 prefrontal lesions, 274–275 Schizophrenia cognitive deficits, 263 cognitive dysfunction, 264 neurotransmitter dopamine, 262 Event Segmentation Theory (EST) aging, 279–282 Alzheimer’s Disease (AD), 286–287 behavior and brain function, 259–260 components, 256 event models, 255 event schemata, 257 Obsessive-Compulsive Disorder (OCD), 267–269 overview of, 288 Parkinson’s Disease, 271–272 predictions, 255 prefrontal cortex (PFC), 274–275
Schizophrenia, 263–264 temporal dynamics, 258–259 WM representation, 261 Extrinsic inferences, 215 F Fitness-relevant processing domain-specific mnemonic process, potential candidates, 3–4 survival processing paradigm emotional processing, 8–9 proportion correct recall, words, 6–7 scenarios, 5 special adaptation, 11–12 thematic processing, 9–11 taboo words, 4 G Glenberg surface, 81, 86–87, 90, 103, 107 Grounding model, 311–313 Guide inductive reasoning, 204 H High-familiarity symbols, 38–39 I Implicit memory interpretation episodic experience, 43–46 gestalt familiarity explanation, 49–51 hypnosis, 51–52 single-element familiarity explanation, 46–49 Inductive inference generation categorical induction process, 184 category-based induction cognitive functions, 221 forced-choice/argument-evaluation, 219 inductive reasoning, 221–222 premise categories, 223 taxonomic relations, 220 causal relations, 189–190, 202 contextual relations, 202 extrinsic similarity, 188–189 induction and relations, 187 novel properties, 191 open-ended method, 186–187 premise relations effects argument evaluation, 191–193 coding, 196–198 multiple regressions analyses, 200 privileged taxonomic inferences, 201 relative frequency inferences, 198–199 relative salience of conceptual relations, 193–194 research design and procedure, 195–196 property effects argument evaluation, 203–204
causal inferences, 215–216 coding, 206–207 extrinsic inferences, 215 gene, 210–211 premise pair, 212 relative frequency inferences, 207–208 research design and procedure, 206 salience shared habitat, 213 substance and disease, 208–210 taxonomic inferences, 214–215 salient relations, 191 salient spatiotemporal or causal relations, 195 salient taxonomic relations, 195 taxonomic similarity, 187–188 Intention invariance incidental learning effect, 82–84 intentional-learning, 83 rehearsal borrowing, 81 J Jamais vu, 58–59 L Lag effect, 65, 69–70, 78, 86–87, 101–103, 107 Language processing dialog collaborative view, 311–313 confederation, 308 different perspectives, 306–307 grounding process, 306 message model, 308–310 transcripts, 304–307 two-stage models, 310–311 partner-adapted processing human vs. computer partner interactions, 329–330 joint activation, 328–329 mentalizing vs. mirroring system, 332–333 mirroring system, 325–326 private, social, and communicative intentions, 327–328 processing cues, 334 role of executive control, 330–332 voice cues, 334–335 partner-specific processing addressees adapt utterance, 323–324 global and local adaptations, 316–320 “one-bit” partner models, 324 speakers adapt utterances, 320–323 role of cue, 313–315 Linguistic pragmatics, 229–231 List-strength effects, 68, 107–108 incidental learning and mixed lists, 79–80 SAM/REM model, 104–106, 111 Low-familiarity symbols, 38–41
M
Message model, 308–310 Mirroring systems human “mirror system”, 326 vs. mentalizing, 332–333 N Naïve theories, 216 Neural transmission asynchrony, 52–53 Novel open-ended induction task, 185, 190, 216 Novel symbols, 38–41 O Obsessive-compulsive disorder (OCD) cognitive disturbances, 265–267 event segmentation, 267–269 neurochemical mechanism, 265 serotonin and dopamine, 265 One-bit partner models, 324 P Parkinson’s disease (PD) cognitive deficits, 269–270 event segmentation, 270–271 symptoms, 269 thalamocortical loops, 269 Partner-adapted processing cues hypothesize processing cues, 334 voice cues, 334–335 mentalizing vs. mirroring system, 332–333 mirroring system, 325–326 role of executive control, 330–332 theory of mind (ToM) human vs. computer partner interactions, 329–330 joint activation, 328–329 private, social, and communicative intentions, 327–328 Partner-specific processing addressees adapt utterance, 323–324 global and local adaptations, 316–320 “one-bit” partner models, 324 speakers adapt utterances, 320–323 Personal names contextual cues and implicit memory process, 371 derogatory nicknames, 370 descriptive and meaningful names, 366 family and clan names, 365 kinship terms, 369 nonunique and multiple names descriptive and episode-based nicknames, 367 Gaelic version, first name, 367 name signs, 368
nicknames and license plate numbers, reference, 367 teknonymy, 368 tip-of-the-tongue (TOT) states, 367 psychological research cognitive impairment, patients, 347 cognitive psychology, 347 descriptiveness and meaning, 351–352 features and representational structure, 352–357 individuality, uniqueness, and arbitrariness, 348–349 multiple names, 357–360 name ambiguity, 361–362 name frequency and acquisition age, 360–361 tip-of-the-tongue (TOT) state, 347–348 vocabulary size and age, 362–363 word forms, 349–351 taboo, 369–370 unique names, 369–370 Predator–prey relations, 204 Prefrontal cortex (PFC) aging, 276–277 cognitive deficits, 272–274 event segmentation, 274–275 Premise relations effects argument evaluation, 191–193 coding, 196–198 multiple regressions analyses, 200 premise pair, 212 privileged taxonomic inferences, 201 relative frequency inferences, 198–199 relative salience of conceptual relations, 193–194 research design and procedure, 195–196 salience shared habitat, 213 Problem solving implications, 173–174 retrieving analogies base rates, 154 domain experts, 154 higher quality retrievals, 155 learning phase, 152 mathematics test scores, 154 potential generalization, 153 self-explanation, 153 transfer effects, 153 Process model grounding, 311–313 message, 308–310 two-stage, 310–311 R Real spacing effect, 65–66, 73, 80–81. See also Spacing effect Recency effects, 67–68, 73, 108
Rehearsal-borrowing effect “deep” mnemonics, 71 encoding strategy, 71–72 later memory test, 69–70 “modal model” paper, 69 rehearse-aloud protocols, 69 rote rehearsal and borrowing hypothesis, 73–74 story mnemonic, 71–72 Retrieving analogies ambiguity and contextual variability, 161–162 context specificity, 161–162 creativity brainstorming, 155–156 incubation/preparedness effects, 157 social factors, 156 domain knowledge acquisition complex declarative learning, 158 progressive alignment, 159 salient surface property, 159 social guidance, 159 tradeoff, 158 encoding specificity, 160, 163 exemplars, encode, 160 generic encodings, 165 LISA model, 164 problem solving base rates, 154 domain experts, 154 higher quality retrievals, 155 learning phase, 152 mathematics test scores, 154 potential generalization, 153 self-explanation, 153 transfer effects, 153 retrieval time autobiographical memory, 168–169 controlled memory set studies, 170–171 MAC/FAC simulation modeling, 171–172 role bindings, 164 Reverse spacing effect, 78, 98. See also Spacing effect S Salient spatiotemporal/causal relations, 195 Salient taxonomic relations, 195 Schizophrenia cognitive deficits, 263 event segmentation, 263–264 neurotransmitter dopamine, 262 Situation model, 277–279 Spacing effect age-invariance, 84–85 contextual variability, 87–91 Glenberg surface, 86–87 hybrid accounts, 101–103 intention invariance
incidental learning effect, 82–84 intentional-learning, 83 rehearsal borrowing, 81 pedagogical ecology approach, 137 recognition, spacing benefits final free recall test, 91 incidental-learning condition, 93 semantic and perceptual priming accounts, cued-memory tasks, 95–101 species invariance, 85–86 spotting impostors deficient-processing effects, 77–79 impostor effects and confounds, 80 list-strength effects, 79–80 primacy and recency buffers (see Zero-sum effect) recency effects, 67–68 rehearsal effects and strategy-switching, 68–74 study-phase retrieval account context strength, 107 cued-recall spacing effects, 106 FRAN, model, 108 incidental background stimuli, 108 list-strength effect, 104–106, 109–111 SAM/REM model, 106, 111 U-shaped curve, 108 verbal theory, 106–107 and testing, educational contexts advocacy, 127 distant transfer, 129–130 educational outcome improvement, 127 in-class discussions, 131 individual differences, 134–135 learner improvement, 131–134 remembering and learning, 128 rote memory, 127–128 theories and key phenomena, 103–104 Spatial reasoning mental simulations, engineering design, 244–246 spatial gestures mental transformations, 243 speech segment, 243–244 spatial transformations cued-recall phase, 242 fMRI data analysis, 241 Species invariance, 81, 85–86 Split perception, déjà vu Jacoby and Whitehouse’s design, 37–38 modern cognitive science, 36 peripheral priming possibility, 41 pre-experiment source rating, 41 superficial glance, shallow processing, 42–43 symbols, 38–41 Study-phase retrieval account context strength, 107 cued-recall spacing effects, 106
FRAN, model, 108 incidental background stimuli, 108 list-strength effect, 104–106, 109–111 SAM/REM model, 107, 112 U-shaped curve, 108 verbal theory, 106–107 Survival processing paradigm emotional processing, 8–9 proportion correct recall, words, 6–7 scenarios, 5 special adaptation, 11–12 thematic processing, 9–11 T Taxonomic inferences, 214–215 Testing effect deficient-processing accounts, 117–119 encoding variability, 122–123 forgetting rate, 115 integrated stimuli, 124–125 learning aid, educational practice, 113 pedagogical ecology approach, 137 recitation method, 113 restudy condition, 122 retention interval, 115–117 retrieval effort and desirable-difficulties framework, 121–122 and spacing, educational contexts advocacy, 127 distant transfer, 129–130 educational outcome improvement, 127 in-class discussions, 131
individual differences, 134–135 learner improvement, 131–134 remembering and learning, 128 rote memory, 127–128 transfer-appropriate processing accounts ACT-R theory, 120 free recall test, 119 long retention interval, 119 Theory of mind (ToM) human vs. computer partner interactions, 329–330 joint activation, 328–329 private, social, and communicative intentions, 327–328 Transfer-appropriate processing accounts ACT-R theory, 120 free recall test, 119 long retention interval, 119 U Uncertainty/approximation distinction, 247 V Visual–spatial processing, 243 W Wisconsin card sort test (WCST), 266 Working memory (WM) theory, 263 Z Zero-sum effect, 68, 74–77
CONTENTS OF RECENT VOLUMES
Volume 40 Different Organization of Concepts and Meaning Systems in the Two Cerebral Hemispheres Dahlia W. Zaidel The Causal Status Effect in Categorization: An Overview Woo-kyoung Ahn and Nancy S. Kim Remembering as a Social Process Mary Susan Weldon Neurocognitive Foundations of Human Memory Ken A. Paller Structural Influences on Implicit and Explicit Sequence Learning Tim Curran, Michael D. Smith, Joseph M. DiFranco, and Aaron T. Daggy Recall Processes in Recognition Memory Caren M. Rotello Reward Learning: Reinforcement, Incentives, and Expectations Kent C. Berridge Spatial Diagrams: Key Instruments in the Toolbox for Thought Laura R. Novick Reinforcement and Punishment in the Prisoner’s Dilemma Game Howard Rachlin, Jay Brown, and Forest Baker Index
Volume 41 Categorization and Reasoning in Relation to Culture and Expertise Douglas L. Medin, Norbert Ross, Scott Atran, Russell C. Burnett, and Sergey V. Blok On the Computational basis of Learning and Cognition: Arguments from LSA Thomas K. Landauer Multimedia Learning Richard E. Mayer Memory Systems and Perceptual Categorization Thomas J. Palmeri and Marci A. Flanery
Conscious Intentions in the Control of Skilled Mental Activity Richard A. Carlson Brain Imaging Autobiographical Memory Martin A. Conway, Christopher W. Pleydell-Pearce, Sharon Whitecross, and Helen Sharpe The Continued Influence of Misinformation in Memory: What Makes Corrections Effective? Colleen M. Seifert Making Sense and Nonsense of Experience: Attributions in Memory and Judgment Colleen M. Kelley and Matthew G. Rhodes Real-World Estimation: Estimation Modes and Seeding Effects Norman R. Brown Index
Volume 42 Memory and Learning in Figure–Ground Perception Mary A. Peterson and Emily Skow-Grant Spatial and Visual Working Memory: A Mental Workspace Robert H. Logie Scene Perception and Memory Marvin M. Chun Spatial Representations and Spatial Updating Ranxiao Frances Wang Selective Visual Attention and Visual Search: Behavioral and Neural Mechanisms Joy J. Geng and Marlene Behrmann Categorizing and Perceiving Objects: Exploring a Continuum of Information Use Philippe G. Schyns From Vision to Action and Action to Vision: A Convergent Route Approach to Vision, Action, and Attention Glyn W. Humphreys and M. Jane Riddoch Eye Movements and Visual Cognitive Suppression David E. Irwin
What Makes Change Blindness Interesting? Daniel J. Simons and Daniel T. Levin Index
Volume 43 Ecological Validity and the Study of Concepts Gregory L. Murphy Social Embodiment Lawrence W. Barsalou, Paula M. Niedenthal, Aron K. Barbey, and Jennifer A. Ruppert The Body’s Contribution to Language Arthur M. Glenberg and Michael P. Kaschak Using Spatial Language Laura A. Carlson In Opposition to Inhibition Colin M. MacLeod, Michael D. Dodd, Erin D. Sheard, Daryl E. Wilson, and Uri Bibi Evolution of Human Cognitive Architecture John Sweller Cognitive Plasticity and Aging Arthur F. Kramer and Sherry L. Willis Index
Volume 44 Goal-Based Accessibility of Entities within Situation Models Mike Rinck and Gordon H. Bower The Immersed Experiencer: Toward an Embodied Theory of Language Comprehension Rolf A. Zwaan Speech Errors and Language Production: Neuropsychological and Connectionist Perspectives Gary S. Dell and Jason M. Sullivan Psycholinguistically Speaking: Some Matters of Meaning, Marking, and Morphing Kathryn Bock Executive Attention, Working Memory Capacity, and a Two-Factor Theory of Cognitive Control Randall W. Engle and Michael J. Kane Relational Perception and Cognition: Implications for Cognitive Architecture and the Perceptual-Cognitive Interface Collin Green and John E. Hummel An Exemplar Model for Perceptual Categorization of Events Koen Lamberts
On the Perception of Consistency Yaakov Kareev Causal Invariance in Reasoning and Learning Steven Sloman and David A. Lagnado Index
Volume 45 Exemplar Models in the Study of Natural Language Concepts Gert Storms Semantic Memory: Some Insights From Feature-Based Connectionist Attractor Networks Ken McRae On the Continuity of Mind: Toward a Dynamical Account of Cognition Michael J. Spivey and Rick Dale Action and Memory Peter Dixon and Scott Glover Self-Generation and Memory Neil W. Mulligan and Jeffrey P. Lozito Aging, Metacognition, and Cognitive Control Christopher Hertzog and John Dunlosky The Psychopharmacology of Memory and Cognition: Promises, Pitfalls, and a Methodological Framework Elliot Hirshman Index
Volume 46 The Role of the Basal Ganglia in Category Learning F. Gregory Ashby and John M. Ennis Knowledge, Development, and Category Learning Brett K. Hayes Concepts as Prototypes James A. Hampton An Analysis of Prospective Memory Richard L. Marsh, Gabriel I. Cook, and Jason L. Hicks Accessing Recent Events Brian McElree SIMPLE: Further Applications of a Local Distinctiveness Model of Memory Ian Neath and Gordon D. A. Brown What is Musical Prosody? Caroline Palmer and Sean Hutchins Index
Volume 47 Relations and Categories Viviana A. Zelizer and Charles Tilly Learning Linguistic Patterns Adele E. Goldberg Understanding the Art of Design: Tools for the Next Edisonian Innovators Kristin L. Wood and Julie S. Linsey Categorizing the Social World: Affect, Motivation, and Self-Regulation Galen V. Bodenhausen, Andrew R. Todd, and Andrew P. Becker Reconsidering the Role of Structure in Vision Elan Barenholtz and Michael J. Tarr Conversation as a Site of Category Learning and Category Use Dale J. Barr and Edmundo Kronmüller Using Classification to Understand the Motivation-Learning Interface W. Todd Maddox, Arthur B. Markman, and Grant C. Baldwin Index
Volume 48 The Strategic Regulation of Memory Accuracy and Informativeness Morris Goldsmith and Asher Koriat Response Bias in Recognition Memory Caren M. Rotello and Neil A. Macmillan What Constitutes a Model of Item-Based Memory Decisions? Ian G. Dobbins and Sanghoon Han Prospective Memory and Metamemory: The Skilled Use of Basic Attentional and Memory Processes Gilles O. Einstein and Mark A. McDaniel Memory is More Than Just Remembering: Strategic Control of Encoding, Accessing Memory, and Making Decisions Aaron S. Benjamin The Adaptive and Strategic Use of Memory by Older Adults: Evaluative Processing and Value-Directed Remembering Alan D. Castel Experience is a Double-Edged Sword: A Computational Model of the Encoding/Retrieval Trade-Off With Familiarity
Lynne M. Reder, Christopher Paynter, Rachel A. Diana, Jiquan Ngiam, and Daniel Dickison Toward an Understanding of Individual Differences In Episodic Memory: Modeling The Dynamics of Recognition Memory Kenneth J. Malmberg Memory as a Fully Integrated Aspect of Skilled and Expert Performance K. Anders Ericsson and Roy W. Roring Index
Volume 49 Short-term Memory: New Data and a Model Stephan Lewandowsky and Simon Farrell Theory and Measurement of Working Memory Capacity Limits Nelson Cowan, Candice C. Morey, Zhijian Chen, Amanda L. Gilchrist, and J. Scott Saults What Goes with What? Development of Perceptual Grouping in Infancy Paul C. Quinn, Ramesh S. Bhatt, and Angela Hayden Co-Constructing Conceptual Domains Through Family Conversations and Activities Maureen Callanan and Araceli Valle The Concrete Substrates of Abstract Rule Use Bradley C. Love, Marc Tomlinson, and Todd M. Gureckis Ambiguity, Accessibility, and a Division of Labor for Communicative Success Victor S. Ferreira Lexical Expertise and Reading Skill Sally Andrews Index
Volume 50 Causal Models: The Representational Infrastructure for Moral Judgment Steven A. Sloman, Philip M. Fernbach, and Scott Ewing Moral Grammar and Intuitive Jurisprudence: A Formal Model of Unconscious Moral and Legal Knowledge John Mikhail Law, Psychology, and Morality Kenworthey Bilz and Janice Nadler
Protected Values and Omission Bias as Deontological Judgments Jonathan Baron and Ilana Ritov Attending to Moral Values Rumen Iliev, Sonya Sachdeva, Daniel M. Bartels, Craig Joseph, Satoru Suzuki, and Douglas L. Medin Noninstrumental Reasoning over Sacred Values: An Indonesian Case Study Jeremy Ginges and Scott Atran Development and Dual Processes in Moral Reasoning: A Fuzzy-trace Theory Approach Valerie F. Reyna and Wanda Casillas Moral Identity, Moral Functioning, and the Development of Moral Character Darcia Narvaez and Daniel K. Lapsley “Fools Rush In”: A JDM Perspective on the Role of Emotions in Decisions, Moral and Otherwise Terry Connolly and David Hardman Motivated Moral Reasoning Peter H. Ditto, David A. Pizarro, and David Tannenbaum In the Mind of the Perceiver: Psychological Implications of Moral Conviction Christopher W. Bauman and Linda J. Skitka Index
Volume 51 Time for Meaning: Electrophysiology Provides Insights into the Dynamics of Representation and Processing in Semantic Memory Kara D. Federmeier and Sarah Laszlo Design for a Working Memory Klaus Oberauer When Emotion Intensifies Memory Interference Mara Mather Mathematical Cognition and the Problem Size Effect Mark H. Ashcraft and Michelle M. Guillaume Highlighting: A Canonical Experiment John K. Kruschke
The Emergence of Intention Attribution in Infancy Amanda L. Woodward, Jessica A. Sommerville, Sarah Gerson, Annette M. E. Henderson, and Jennifer Buresh Reader Participation in the Experience of Narrative Richard J. Gerrig and Matthew E. Jacovina Aging, Self-Regulation, and Learning from Text Elizabeth A. L. Stine-Morrow and Lisa M. S. Miller Toward a Comprehensive Model of Comprehension Danielle S. McNamara and Joe Magliano Index
Volume 52 Naming Artifacts: Patterns and Processes Barbara C. Malt Causal-Based Categorization: A Review Bob Rehder The Influence of Verbal and Nonverbal Processing on Category Learning John Paul Minda and Sarah J. Miles The Many Roads to Prominence: Understanding Emphasis in Conversation Duane G. Watson Defining and Investigating Automaticity in Reading Comprehension Katherine A. Rawson Rethinking Scene Perception: A Multisource Model Helene Intraub Components of Spatial Intelligence Mary Hegarty Toward an Integrative Theory of Hypothesis Generation, Probability Judgment, and Hypothesis Testing Michael Dougherty, Rick Thomas, and Nicholas Lange The Self-Organization of Cognitive Structure James A. Dixon, Damian G. Stephen, Rebecca Boncoddo, and Jason Anastas Index