THE PSYCHOLOGY OF LEARNING AND MOTIVATION Advances in Research and Theory
VOLUME 48
Skill and Strategy in Memory Use A Volume In THE PSYCHOLOGY OF LEARNING AND MOTIVATION Advances in Research and Theory
Edited by AARON S. BENJAMIN BECKMAN INSTITUTE AND DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN URBANA, ILLINOIS
BRIAN H. ROSS BECKMAN INSTITUTE AND DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN URBANA, ILLINOIS
AMSTERDAM • BOSTON • HEIDELBERG • LONDON NEW YORK • OXFORD • PARIS • SAN DIEGO SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO Academic Press is an imprint of Elsevier
84 Theobald’s Road, London WC1X 8RR, UK Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands Linacre House, Jordan Hill, Oxford OX2 8DP, UK 30 Corporate Drive, Suite 400, Burlington, MA 01803, USA 525 B Street, Suite 1900, San Diego, CA 92101-4495, USA
First edition 2007
Copyright © 2007 Elsevier Inc. All rights reserved
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher
Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; email: [email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://elsevier.com/locate/permissions, and selecting Obtaining permission to use Elsevier material
Notice
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made
ISBN: 978-0-12-373607-9
ISSN: 0079-7421
For information on all Academic Press publications visit our website at books.elsevier.com
Printed and bound in USA
07 08 09 10 11 10 9 8 7 6 5 4 3 2 1
CONTENTS

Contributors
Preface

THE STRATEGIC REGULATION OF MEMORY ACCURACY AND INFORMATIVENESS
Morris Goldsmith and Asher Koriat
I. Introduction
II. The Strategic Control of Memory Reporting: A Metacognitive Framework
III. Applications of the Framework
IV. Expanding the Framework: Control of Memory Grain Size
V. Toward an Integrated Model of Grain Size and Report Option
VI. Conclusion
References

RESPONSE BIAS IN RECOGNITION MEMORY
Caren M. Rotello and Neil A. Macmillan
I. Introduction
II. Measuring Response Bias
III. Explaining Response Bias
IV. Between-Group Criterion Differences
V. Between-List Criterion Differences
VI. Within-Test Criterion Shifts
VII. An Interim Summary
VIII. Distribution Shifts Masquerading as Criterion Shifts
IX. Designs with Multiple Responses
X. Conclusions and Recommendations
References

WHAT CONSTITUTES A MODEL OF ITEM-BASED MEMORY DECISIONS?
Ian G. Dobbins and Sanghoon Han
I. Introduction
II. The Characteristics and Neural Substrates of Item-Based Memory Decisions
III. Conclusion—What Constitutes a Model of Item-Based Memory Decisions?
References

PROSPECTIVE MEMORY AND METAMEMORY: THE SKILLED USE OF BASIC ATTENTIONAL AND MEMORY PROCESSES
Gilles O. Einstein and Mark A. McDaniel
I. Introduction
II. What Is Different About Prospective Memory?
III. Is There a Specialized Prospective Memory System?
IV. Using Basic Memory and Attentional Processes in the Service of Prospective Memory
V. The Multiprocess Theory: Contextual Factors Determining the Utility of Each Process
VI. Metamemory and Prospective Memory
VII. Summary and Future Directions
References

MEMORY IS MORE THAN JUST REMEMBERING: STRATEGIC CONTROL OF ENCODING, ACCESSING MEMORY, AND MAKING DECISIONS
Aaron S. Benjamin
I. Introduction
II. Interacting with Memory
III. Strategic Decisions About Encoding
IV. Strategic Decisions About Memory Access
V. Postaccess Decision Processes
VI. Conclusions
References

THE ADAPTIVE AND STRATEGIC USE OF MEMORY BY OLDER ADULTS: EVALUATIVE PROCESSING AND VALUE-DIRECTED REMEMBERING
Alan D. Castel
I. Overview
II. A Selective Review of the Research on Memory and Lifespan Development
III. Strategic Control and Value as Memory Modifiers for Older Adults
IV. Model, Review and New View of Value, Memory, and Aging
V. Implications of Value on Memory and Aging
VI. Summary and Conclusions
References

EXPERIENCE IS A DOUBLE-EDGED SWORD: A COMPUTATIONAL MODEL OF THE ENCODING/RETRIEVAL TRADE-OFF WITH FAMILIARITY
Lynne M. Reder, Christopher Paynter, Rachel A. Diana, Jiquan Ngiam, and Daniel Dickison
I. Introduction
II. When and Why Experience Adversely Affects Memory Retrieval
III. When and Why Experience Facilitates Memory Encoding
IV. General Discussion
V. Summary and Conclusions
References

TOWARD AN UNDERSTANDING OF INDIVIDUAL DIFFERENCES IN EPISODIC MEMORY: MODELING THE DYNAMICS OF RECOGNITION MEMORY
Kenneth J. Malmberg
I. A Possible Relationship Between the Speed-Accuracy Trade-Off and Individual Differences in Associative Recognition
II. Traditional Testing Procedures
III. Classical Models of Associative Recognition
IV. Modeling the Accuracy and Latency of Associative Recognition
V. Conclusions
References

MEMORY AS A FULLY INTEGRATED ASPECT OF SKILLED AND EXPERT PERFORMANCE
K. Anders Ericsson and Roy W. Roring
I. Introduction
II. Outline of the Chapter
III. Summary
IV. Conclusion
References

Index
Contents of Recent Volumes
CONTRIBUTORS
Numbers in parentheses indicate the pages on which the authors’ contributions begin.
Aaron S. Benjamin (175), Department of Psychology, University of Illinois, Champaign, Illinois 61820
Alan D. Castel (225), Department of Psychology, University of California, Los Angeles, California 90095
Rachel A. Diana (271), Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
Daniel Dickison (271), Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
Ian G. Dobbins (95), Department of Psychology and Neuroscience, Duke University, Durham, North Carolina 27708
Gilles O. Einstein (145), Department of Psychology, Furman University, Greenville, South Carolina 29613
K. Anders Ericsson (351), Department of Psychology, Florida State University, Tallahassee, Florida 32306
Morris Goldsmith (1), Department of Psychology, University of Haifa, Haifa 31905, Israel
Sanghoon Han (95), Department of Psychology and Neuroscience, Duke University, Durham, North Carolina 27708
Asher Koriat (1), Department of Psychology, University of Haifa, Haifa 31905, Israel
Neil A. Macmillan (61), Department of Psychology, University of Massachusetts, Amherst, Massachusetts 01003
Kenneth J. Malmberg (313), Department of Psychology, University of South Florida, Tampa, Florida 33620
Mark A. McDaniel (145), Department of Psychology, Washington University, St. Louis, Missouri 63130
Jiquan Ngiam (271), Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
Christopher Paynter (271), Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
Lynne M. Reder (271), Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
Roy W. Roring (351), Department of Psychology, Florida State University, Tallahassee, Florida 32306
Caren M. Rotello (61), Department of Psychology, University of Massachusetts, Amherst, Massachusetts 01003
PREFACE
To use memory effectively, we must do much more than shuttle information into and out of storage. Much of our use of memory is actually the action of higher-level decision making on the inputs to and the outputs from memory stores. The central premise of this volume is that the many capabilities of memory reflect not the action and interaction of multiple memory systems but rather the myriad of ways in which memory queries can be strategically devised for the task at hand and the degree to which the products of memory can be flexibly acted upon. The chapters here review research that demonstrates how we select strategies for querying memory effectively, how we successfully remember to perform intended actions, how the skillful use of encoding and retrieval strategies can moderate memory deficits and support expertise, and how we accommodate our responses and monitor our output in order to satisfy situational demands while making optimal use of the information that our memory provides us with. This perspective, which emphasizes the control processes that govern memory use, contrasts with many current and traditional views of memory, which appeal to an ever-increasing set of distinct memory systems or separable memory processes. This view of memory use as skilled performance is uniquely able to accommodate within a single framework the recent growth in a number of sociologically distinct but conceptually unified areas of study, including (but not limited to) metamemory, recognition memory, prospective memory, and individual differences in memory. Much traditional research seeks to explicitly reduce or eliminate the influence of higher-order strategic processes on memory tasks, but the research presented here embraces the interactive nature of memory and higher-level cognition. It is at the union between memory and these other cognitive domains that fascinating and previously unexplored questions arise.
How do students decide whether they have mastered materials for an upcoming test, and what actions do
they engage in to efficiently reach that goal? When asked where I live, how do I decide whether to respond with a street address or a city name? What factors influence the likelihood of a selection of a particular party from a criminal lineup in addition to actual memory for the perpetrator? How do I make sure to remember to pick up dinner on the way home? Why do my friends who are expert golfers have better memory for courses they have played, and how they played them, than I do? These and similar questions about memory use are ones that can be addressed only by considering the contexts in which memory is used rather than by trying to disembody the study of memory from strategic control processes that govern it. Traditional memory research has offered little more than a promissory note toward the resolution of such questions, and this volume represents our attempt to bring together the work of a number of prominent researchers who have made headway on these problems by broadening the scope of what memory researchers are willing to address and consider. One theme that can be detected throughout is an explicit consideration of sources of variability in performance that the typical memory task is designed to eliminate, avoid, or control for. Understanding how factors such as study time and response grain size vary with experimental manipulations or individual differences takes a big step forward in understanding memory in vivo, and it is here that the primary action of this volume takes place. The first chapter in this volume considers the strategies people use in responding to memory queries. Such strategies include choices about whether and how to access memory, and how to translate the retrieved products into overt responses. Goldsmith and Koriat show how people modulate their answers to questions by withholding (choosing not to respond) and by trading off informativeness with accuracy (choosing an appropriate grain size for their response). 
The next few chapters address a similar topic in tasks that involve memory judgment, rather than the production of a response. In such tasks, subjects typically have to endorse or reject a memory probe as having been experienced in a particular context, and the burden on the subjects is to decide how much evidence is enough. Rotello and Macmillan review the literature on how, and how well, people place and adjust such criteria, and also consider the successes and failures of the different approaches researchers can take in trying to separate the influences of memory from the influence of response strategies. Dobbins and Han similarly address what variables influence criteria and also how simple unidimensional criterion-based models of memory judgments fail in important and informative ways.
Einstein and McDaniel extend the consideration of strategic processes to the cases of prospective memory—the ability to remember to perform actions in the future. They show that people leverage basic attentional, mnemonic, and metamnemonic processes in order to improve their chances of successfully completing prospective demands that they think are particularly difficult or important. The chapter by Benjamin takes as its central conceit the idea that memory itself is a somewhat fixed and impermeable entity, and provides an overview of the range of memory behavior—from encoding through response—that differs across people and situations because of applications of “memory skill.” Castel extends this reasoning to understanding memory use in elderly people. His chapter reviews ways in which elderly people employ strategies that offset the consequences of their declining memory fidelity, and how doing so decreases the magnitude of the deficits they face.

The final section of this volume reviews the role of memory strategies in expertise and individual differences. Reder, Paynter, Diana, Ngiam, and Dickison argue that expertise in a domain can simultaneously facilitate and obstruct the efficient use of memory, and show how relevant effects can be summarized and understood with the application of a symbolic model. Within the context of a different set of models, Malmberg considers individual differences in strategy use on associative recognition, and reviews how lessons from that particular task might be more generally applied to understanding individual differences in strategy use in recognition. In the final chapter, Ericsson and Roring discuss how expertise develops with practice, and how particular memory skills develop in parallel with such expertise. They also outline a particular methodological approach to studying these experts fruitfully.
At the heart of each of these chapters is an acknowledgment of the very important role that strategies and skill play in memory performance, regardless of whether that performance is on a controlled laboratory test or in a dynamic, real-world context. In each case, people use extramnemonic processes to aid them in reaching mnemonic and intellectual goals. Those extramnemonic processes should not be the exclusive province of those who worry about the ecological validity of their work; the chapters in this volume are a testament to the experimental rigor that can be applied to tasks that nonetheless allow the subject additional degrees of control. Those “nuisance” variables are important aspects of memory performance that have been thoughtfully but quietly spirited away from experimental consideration in much research. This volume shows that a careful consideration of what exactly is lost by controlling for self-controlled behavior in learning and remembering can provide insight into new problems and new solutions, ones that are outside the scope of traditional memory research.

Aaron S. Benjamin and Brian H. Ross
THE STRATEGIC REGULATION OF MEMORY ACCURACY AND INFORMATIVENESS
Morris Goldsmith and Asher Koriat

I. Introduction
When things happen to us, we talk about them. Events do not just happen in words, but that is our primary means of conveying them. When we talk, we do not just recount events one by one in serial order as in a memory experiment. . . . We tell things differently to different audiences and for different ends. (Tversky & Marsh, 2000, pp. 1–2)
An important development in experimental memory research over the past two decades has been the extension of that research to include phenomena and processes that are characteristic of the richness and complexity of memory in real‐life settings. Regardless of the controversies that have accompanied this development, the everyday‐naturalistic approach has greatly enriched the study of memory, yielding new experimental paradigms, novel theoretical approaches, and valuable insights. The central thesis of the present chapter is that particularly in real‐life situations, but also to some extent in the laboratory, rememberers strategically regulate the quality and amount of information that they report from memory in accordance with two generally competing goals: accuracy and informativeness. They do so by deciding which items of information to report and which to withhold, and by controlling the level of precision or graininess of the information that they report. These decisions can have a substantial effect on memory performance. In this chapter, we present a current snapshot of the metacognitive framework that we developed for investigating this regulation, reviewing related work in which some of the essential aspects of the strategic regulation of memory reporting in real‐life contexts have been brought into the laboratory for controlled experimental study.

THE PSYCHOLOGY OF LEARNING AND MOTIVATION VOL. 48 DOI: 10.1016/S0079-7421(07)48001-X
Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00

A. EVERYDAY VERSUS LABORATORY APPROACHES TO MEMORY
Our interest in the strategic regulation of memory reporting stemmed initially from an attempt to clarify some apparent inconsistencies that emerged when comparing laboratory‐based findings regarding memory performance with results obtained in naturalistic contexts (Koriat & Goldsmith, 1994). As is well known, there has been a long and sometimes heated debate between proponents of the traditional, laboratory‐based study of memory and those who favor the ecological study of memory in naturalistic settings (see, e.g., January 1991 issue of American Psychologist). Our analysis of the discussions surrounding this debate (Koriat & Goldsmith, 1996a) revealed three dimensions along which the controversy generally revolved: what memory phenomena should be studied (real‐life phenomena vs list‐learning phenomena), how they should be studied (ecological validity vs experimental control), and where (real world vs laboratory). In addition, however, we argued that there seems to be a more fundamental breach underlying these issues that can account for some of the apparent correlation between the “what,” “where,” and “how” aspects: Underlying the everyday memory approach is a different way of thinking about memory, a different memory metaphor, than that underlying the traditional study of memory. We labeled these metaphors the correspondence and storehouse metaphors, respectively. The contrast between the two metaphors provides the metatheoretical foundation for distinguishing two essentially different treatments of memory. As detailed below, in comparison with the traditional, storehouse approach, the correspondence‐oriented, everyday approach has engendered (1) an increased focus on the reliability or unreliability of memory in capturing past events, (2) a greater recognition of the active role of the rememberer in controlling memory performance, and (3) a stronger emphasis on the role of subjective‐phenomenological experience in remembering.

1. Focus on Accuracy
The traditional laboratory approach to the study of memory has followed Ebbinghaus (1895) in adopting a quantity‐oriented conception. In this conception, memory is seen as a storehouse into which discrete items of information are initially deposited and then later retrieved (Roediger, 1980). Memory is then evaluated in terms of the number of items that can be recovered after some retention interval. This approach to memory underlies the traditional list‐learning paradigm that continues to produce much of the data that appear in scientific journals. In contrast, the recent upsurge of interest in everyday memory phenomena implies a different conception of memory. In this conception (following Bartlett, 1932), memory is viewed as a representation or reconstruction of past experience, and hence is evaluated in terms of its faithfulness to past events rather than in terms of the mere number of input items that can be recovered. Embodied in this conception is a correspondence rather than a storehouse metaphor of memory (Koriat & Goldsmith, 1996a,b). The correspondence metaphor, with its emphasis on memory accuracy, is apparent in such varied topics as eyewitness testimony, autobiographical memory, spatial memory, memory distortions and fabrications, false memory, memory and metamemory illusions, and schema‐based errors. As reviewed in Koriat, Goldsmith, and Pansky (2000), the growing body of work on memory accuracy and distortion has produced a plethora of new paradigms and findings, as well as some specific accuracy‐oriented theories that attempt to explain them.

2. Active Role of the Rememberer
The interest in everyday memory has led also to a greater emphasis on the functions of memory in real‐life contexts and on the active role of the rememberer in putting memory to use in the service of personal goals. Most prominently, Neisser (1996, p. 204) has proposed that remembering should be viewed as a form of purposive action. In his words:

Remembering is a kind of doing. Like other kinds of doing, it is purposive, personal, and particular: (1) It is purposive because it is done with a specific goal in mind; often that goal is to tell the truth about some past event, but on other occasions it may be to entertain, to impress, or to reassure. (2) It is personal because it is done by a specific individual and bears the stamp of that individual’s characteristic way of doing and telling. (3) It is particular because it is done on a specific occasion, in a way that reflects the particular opportunities and demands that the occasion may afford.
Neisser’s proposal (see also Winograd, 1994, 1996), together with the idea that memory constructions are “skillfully built from available parts to serve specific purposes” (Neisser, 1996, p. 204), not only promotes a functional perspective in the study of memory but also implies a greater emphasis on self‐controlled, regulatory processes in remembering. This emphasis can be seen in an expanded notion of retrieval and remembering (Norman & Schacter, 1996; Winograd, 1996; Koriat, Goldsmith, & Halamish, in press) and in work emphasizing the metacognitive processes of monitoring and control that mediate memory performance (Goldsmith & Koriat, 1999; Koriat & Goldsmith, 1996b). Complex evaluative and decisional processes used to avoid memory errors or to escape illusions of familiarity have been emphasized by many authors (Burgess & Shallice, 1996; Goldsmith & Koriat, 1999; Kelley & Jacoby, 1996; Schacter, Norman, & Koutstaal, 1998). The operation of these processes is particularly crucial in real‐life situations (e.g., eyewitness testimony) in which a premium is generally placed on accurate reporting. Personal control has not figured prominently in traditional laboratory memory research, perhaps because of its incompatibility with the desire for strict experimental control (Banaji & Crowder, 1989; Nelson & Narens, 1994). Thus, the common approach has been to limit personal control over memory reporting as much as possible (e.g., by using forced‐report techniques; Erdelyi & Becker, 1974), or else to attempt to “correct” for it by using techniques such as those provided by the signal‐detection methodology (Lockhart & Murdock, 1970) or standard correction‐for‐guessing formulas (Cronbach, 1984). This approach essentially treats personal control as a methodological nuisance that must be eliminated. However, once we acknowledge that personal control over memory reporting is an intrinsic aspect of real‐life remembering (see below), then participants must be allowed such control, but at the same time the underlying dynamics and performance consequences of this control should be systematically investigated.

3. Emphasis on Subjective Experience
The focus on memory accuracy and correspondence in real‐life remembering has been accompanied by increased interest in the phenomenal qualities of recollective experience. Such qualities have attracted little interest in traditional quantity‐oriented memory research. Accuracy‐oriented research, in contrast, often involves the assumption that the phenomenal qualities of remembering provide diagnostic clues that are used by rememberers (as well as by observers) for discriminating between genuine and false memories (Conway, Collins, Gathercole, & Anderson, 1996; Koriat, 1995; Ross, 1997). For example, this assumption is central to the source‐monitoring framework (Mitchell & Johnson, 2000). In this framework, such properties as perceptual vividness and amount of contextual detail are assumed to help rememberers in specifying the origin of mental experiences. Subjective experience has been examined in connection with autobiographical memories (Brewer, 1992; Conway et al., 1996), false recall (Payne, Jacoby, & Lambert, 2004; Roediger & McDermott, 1995; Schacter, Verfaellie, & Pradere, 1996), post‐event misinformation (Zaragoza & Mitchell, 1996), flashbulb memories (Conway, 1995), eyewitness testimony (Fruzzetti, Toland, Teller, & Loftus, 1992), and fluency attributions and misattributions (Kelley & Jacoby, 1998).
In metacognition research, various types of metacognitive feelings, such as the sense of familiarity, the feeling of knowing, and subjective confidence, have been assumed to guide the regulation of search and retrieval processes (Benjamin & Bjork, 1996; Koriat & Levy‐Sadot, 1999; Koriat, Ma’ayan, & Nussinson, 2006; Son & Schwartz, 2002). No longer regarded as a mere epiphenomenon, subjective experience is thus treated as an integral component of the process of remembering (Johnson, 1997; Kelley & Jacoby, 2000; Koriat et al., 2000; Schacter et al., 1998).
B. COMPETING GOALS OF MEMORY REPORTING: ACCURACY VERSUS INFORMATIVENESS
The traditional storehouse metaphor of memory implies a clear goal for the rememberer: to reproduce as much of the originally stored information as possible. This is the essence of the instructions provided to participants in typical list‐learning experiments. In contrast, as just discussed, the goals of remembering in everyday life are complex and varied and, in addition, these may be partially or wholly conflicting. Hence, a great deal of skill and sophistication may be required of the rememberer in negotiating between the different goals and in finding an expedient compromise. In this chapter, we focus on two prominent memory goals that are tied to the storehouse and correspondence metaphors, respectively: quantity, or more generally, informativeness, and accuracy. In real‐life situations, these will often be pursued in the service of other, higher‐order goals. Importantly, the two goals are generally conflicting. Consider, for example, a courtroom witness who has sworn to “tell the whole truth and nothing but the truth.” Even if the witness is sincere in trying to uphold this oath, given the fallibility of memory, it is generally not possible to satisfy both of the implied commitments simultaneously: To avoid false testimony, the witness may choose to refrain from providing information that she feels unsure about. This, however, will tend to reduce the amount of information that she provides the court. Alternatively, she may choose to phrase her answers at a level of generality at which they are unlikely to be wrong. Once again, however, the increased accuracy will come at the expense of informativeness. In what follows, we present work that examines how rememberers control their memory reporting in the wake of generally competing demands for accuracy and informativeness, and the consequences of this control for their memory performance.
Two types of control are considered: The first, control of report option (Koriat & Goldsmith, 1994, 1996b), involves the decision to volunteer or to withhold particular items of information. The second, control of grain size (Goldsmith, Koriat, & Pansky, 2005;
Goldsmith and Koriat
Goldsmith, Koriat, & Weinberg‐Eliezer, 2002), involves choosing the level of precision or coarseness of an answer, when it is provided.
II. The Strategic Control of Memory Reporting: A Metacognitive Framework
In order to bring the essential aspects of the strategic regulation of memory reporting into the laboratory, we adopted an item‐based approach that allows the examination of memory quantity and memory accuracy performance within a common framework. In this framework, the two memory properties are distinguished in terms of input‐bound and output‐bound measures, respectively (Koriat & Goldsmith, 1994, 1996b). Traditionally, measures of memory performance have been calculated conditional on the input by expressing the number of items recalled or recognized as the proportion or percentage of the total number of items presented. Such measures reflect the amount of presented or studied information that has been retained and is currently accessible. This type of assessment follows naturally from the storehouse metaphor. Memory performance, however, can also be assessed using output‐bound measures in which the number of correct items recalled is expressed as a proportion or percentage of the total number of items reported. Such measures reflect the accuracy of the memory report, in terms of the probability that a reported item is correct. Consider, for example, a participant (witness) who is presented with 25 words (items of information), and in a recall test reports 12 words (provides answers to 12 questions), 10 of which are correct and 2 are commission errors (wrong). Input‐bound memory quantity performance in that case is .40 (10/25), that is, 40% of the input‐study items have been successfully recalled. In contrast, output‐bound memory accuracy is .83 (10/12). That is, 83% of the output‐recalled items (answers) are, in fact, correct. This latter measure uniquely reflects the dependability of the information that is reported—the degree to which each reported item can be trusted to be correct. 
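The two measures can be made concrete with a short computational sketch. The Python fragment below is an illustration only: the names `ReportScores` and `score_report` are hypothetical and do not come from the original work. It reproduces the worked example above (25 items presented, 12 reported, 10 correct) and, for comparison, a forced‐report case in which all 25 items receive an answer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReportScores:
    quantity: float            # input-bound: correct / presented
    accuracy: Optional[float]  # output-bound: correct / reported (None if nothing reported)


def score_report(n_presented: int, n_reported: int, n_correct: int) -> ReportScores:
    """Score one memory report (hypothetical helper, for illustration only)."""
    if n_presented <= 0:
        raise ValueError("n_presented must be positive")
    if not 0 <= n_correct <= min(n_reported, n_presented):
        raise ValueError("n_correct must lie between 0 and min(n_reported, n_presented)")
    quantity = n_correct / n_presented
    accuracy = n_correct / n_reported if n_reported > 0 else None
    return ReportScores(quantity, accuracy)


free = score_report(n_presented=25, n_reported=12, n_correct=10)
forced = score_report(n_presented=25, n_reported=25, n_correct=10)
print(f"free report:   quantity={free.quantity:.2f}, accuracy={free.accuracy:.2f}")
print(f"forced report: quantity={forced.quantity:.2f}, accuracy={forced.accuracy:.2f}")
```

When every presented item receives an answer, the two denominators are the same and the two scores coincide; they can diverge only when some answers are withheld.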
Essentially, then, whereas the input‐bound quantity measure holds the rememberer responsible for what he or she fails to report, the output‐bound accuracy measure holds the person accountable only for what he or she does report. Importantly, output‐bound accuracy and input‐bound quantity measures can be distinguished operationally only when rememberers are given the option of free report. On forced‐report tests, such as forced‐choice recognition or (less commonly) forced recall, in which participants are required to provide a substantive response to each and every test item, the input‐bound quantity and output‐bound accuracy percentages are necessarily equivalent.
This is because the number of output items is the same as the number of input items (Koriat & Goldsmith, 1994, 1996a). For example, if a participant gets 10 out of 25 choices correct on a forced‐choice recognition test, we may conclude either that the probability of correctly recognizing an input item is .40 (input‐bound quantity) or that the probability that a reported item is correct is .40 (output‐bound accuracy). The difference between the two measures is entirely a matter of interpretation: whether one intends to measure quantity or accuracy. In contrast, on free‐report tests, such as cued or free recall, participants are allowed to omit items from the memory report or, equivalently, to respond “don’t know” if they feel they do not remember an item. In this case, the number of output items may be far fewer than the number of input items. The option of free report is essential when the focus is on output‐bound memory accuracy. Just as an eyewitness cannot be expected to uphold the oath to tell “nothing but the truth” under forced‐report conditions, neither does it make sense to hold participants accountable for the errors that they make under such conditions. Indeed, only under free‐report conditions, when rememberers have the option to respond “don’t know,” can we assume that they are actually committed to the accuracy of their memory output. Clearly, in real‐life (and most laboratory) settings, rememberers do not simply spew out all of the items of information that come to mind. In fact, as will be seen below, the option to screen out incorrect answers is an important means by which rememberers regulate the quality and quantity of their memory output in real‐life settings. How can the strategic regulation of memory performance in free‐report situations be conceptualized and investigated? In searching for a viable research approach, we first turned to signal‐detection theory (SDT; Green & Swets, 1966; Swets, Tanner, & Birdsall, 1961).
Of course, SDT has been very influential in bringing to the fore the role of subject‐controlled processes in memory responding (Lockhart & Murdock, 1970; Norman & Wickelgren, 1969). That framework and its associated Type‐1 analyses have been used extensively to investigate the decision processes underlying forced‐report recognition memory: Participants in the standard old/new recognition paradigm are assumed to set a response criterion on a continuum of memory strength in order to decide whether to respond “old” (studied) or “new” (foil) to any given test item. Depending on various further assumptions, two indexes are typically derived: a measure of retention, d′, and a measure of criterion level, β. Unfortunately, however, the traditional signal‐detection approach (Type‐1 analysis) is not very helpful in dealing with the decision process underlying free‐report memory performance, that is, with the decision whether to report an answer or to abstain. Therefore, our approach to the problem was to
extend the basic logic underlying SDT to free‐report situations (as others have done; see Klatzky & Erdelyi, 1985; and see Higham, 2002, for an application of Type‐2 SDT analyses, discussed in Section II.D), but also to augment that logic with concepts and methods borrowed from the study of metacognition.
A. THE BASIC MODEL: CONTROL OF REPORT OPTION
Fig. 1. A schematic model of the strategic regulation of memory accuracy and memory quantity performance, utilizing the option of free report. The upward and downward pointing arrows on the right of the figure signify positive and negative performance outcomes. (Adapted from Koriat & Goldsmith, 1996b.) [Flowchart: an input query prompts retrieval of a best‐candidate answer from LTM (Retrieval); the answer's probability of being correct, Pa, is assessed (Monitoring); a report criterion probability, Prc, is set according to situational demands/payoffs, and the answer is volunteered if Pa ≥ Prc and withheld otherwise (Control); volunteering or withholding a correct or an incorrect answer determines the accuracy (ACC) and quantity (QTY) outcomes (Performance).]
Figure 1 presents a simple model of how metamemory processes are used to regulate memory accuracy and quantity performance under free‐report conditions (Koriat & Goldsmith, 1996b). The model is deliberately schematic, focusing on the manner in which metacognitive processes at the reporting stage affect the ultimate memory performance (cf. the distinction between “ecphory” and “conversion” in Tulving, 1983). Thus, in addition to an unspecified retrieval (or ecphory, reconstruction, and so forth) mechanism, we posit a monitoring mechanism that is used to subjectively assess the correctness of potential memory responses, and a control mechanism that determines whether to volunteer the best available candidate answer (for similar models, see Barnes, Nelson, Dunlosky, Mazzoni, & Narens, 1999; Higham, 2002). The control mechanism operates by setting a report criterion on the monitoring output: The answer is volunteered if its assessed probability of being correct passes the criterion, but is withheld otherwise. The criterion is set on the basis of implicit or explicit payoffs, that is, the perceived gain for providing correct information relative to the cost of providing incorrect information. Although the model is simple, its implications for memory performance are not. In fact, as will now be explained, within this metacognitive framework, free‐report memory performance depends on four contributing factors:
1. Overall retention: The amount of correct information (i.e., the number of correct candidate answers) that can be retrieved.
2. Monitoring effectiveness: The extent to which the assessed probabilities (subjective confidence judgments) successfully differentiate correct from incorrect candidate answers.
3. Control sensitivity: The extent to which the volunteering or withholding of answers is in fact based on the monitoring output.
4. Report criterion setting: The report‐criterion probability (Prc) above which answers are volunteered, below which they are withheld.
The general assumption is that although people cannot increase the quantity of correct information that they retrieve (e.g., Nilsson, 1987), they can enhance the accuracy of the information that they report by withholding answers that are likely to be incorrect.
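The report rule at the heart of the model can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the authors' implementation: control sensitivity is taken to be perfect (every answer with Pa ≥ Prc is volunteered), and the candidate answers, their assessed probabilities, and the function name `apply_report_criterion` are all hypothetical.

```python
from typing import List, Optional, Tuple

# (assessed probability Pa, whether the answer is actually correct)
Candidate = Tuple[float, bool]


def apply_report_criterion(candidates: List[Candidate], prc: float) -> Tuple[float, Optional[float]]:
    """Volunteer every answer with Pa >= Prc.

    Returns (input-bound quantity, output-bound accuracy); accuracy is None
    when no answer is volunteered.
    """
    if not candidates:
        raise ValueError("at least one candidate answer is required")
    volunteered = [is_correct for pa, is_correct in candidates if pa >= prc]
    n_correct = sum(volunteered)
    quantity = n_correct / len(candidates)
    accuracy = n_correct / len(volunteered) if volunteered else None
    return quantity, accuracy


# Hypothetical best-candidate answers for six questions (made-up data).
answers = [(0.9, True), (0.8, True), (0.6, False), (0.5, True), (0.2, False), (0.1, False)]

for prc in (0.0, 0.5, 0.7):
    quantity, accuracy = apply_report_criterion(answers, prc)
    print(f"Prc={prc:.1f}: quantity={quantity:.2f}, accuracy={accuracy:.2f}")
```

With these made‐up data, raising Prc from 0 to .5 screens out two errors at no cost in quantity, whereas raising it further to .7 removes the remaining error but also withholds one correct answer.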
Hence, the most basic prediction is for a quantity‐accuracy trade‐off: In general, raising the report criterion should result in fewer volunteered answers, a higher percentage of which are correct (increased output‐bound accuracy), but a lower number of which are correct (decreased input‐bound quantity). Because raising the report criterion is assumed to increase accuracy at the expense of quantity, the strategic control of memory performance requires the rememberer to weigh the relative payoffs for accuracy and quantity in reaching an appropriate criterion setting. Of course, this assumes that the participant does in fact volunteer and withhold information on the basis of subjective confidence. This is an assumption that is shared with the SDT framework, but our framework allows for variations in the strength of the relationship between subjective
experience and behavior, and treats this as a free parameter in explaining free‐report memory performance. The prediction of a quantity‐accuracy trade‐off also assumes that the participant’s probability assessments are reasonably, but not perfectly, diagnostic of the correctness of the candidate answers. The importance of this assumption has largely gone unnoticed. Indeed, although monitoring effectiveness has attracted much attention among students of metacognition (Koriat, 2007; Metcalfe & Shimamura, 1994; Schwartz, 1994), its performance consequences have only recently begun to be investigated (Barnes et al., 1999; Bjork, 1994; Thiede, Anderson, & Therriault, 2003). The critical contribution of monitoring effectiveness to both memory accuracy and memory quantity performance emerged in several simulation analyses based on the model (Koriat & Goldsmith, 1996b).
Fig. 2. Simulated memory quantity and memory accuracy performance (proportion correct) plotted as a function of response criterion level, assuming three different levels of monitoring effectiveness (see text for explanation). (Adapted from Koriat & Goldsmith, 1996b.) [Plot: criterion level (0.0 to 1.0) on the horizontal axis and memory performance (0.0 to 1.0) on the vertical axis, with separate accuracy and quantity curves for three monitoring conditions: A, perfect discrimination; B, prototypical; C, no discrimination.]
Let us assume a testing situation in which 50% of a participant’s candidate answers are correct (varying this percentage does not change the basic pattern of results), and manipulate both monitoring effectiveness and report criterion. Figure 2 depicts the accuracy and quantity performance that should ensue under the model from the use of various report criteria, assuming three different levels of monitoring effectiveness. Consider first the “prototypical” monitoring condition (Plot B). In this condition the participant’s confidence judgments are assumed to be uniformly distributed across 11 levels, ranging from 0 (certainly wrong) to 1.0 (certainly right). In addition, these judgments are assumed to be perfectly calibrated, that is, 20% of the answers with confidence (assessed probability) of .20 are correct, 30% of the answers with confidence of .30 are correct, and so forth. Under these conditions, raising the report criterion from 0 (forced report) to 1.0 yields the prototypical quantity‐accuracy trade‐off: Accuracy increases but quantity decreases as the criterion becomes more strict. Now, however, consider the plot for the “no discrimination” monitoring condition (Plot C) in which the participant’s confidence judgments bear no relationship to the actual correctness of the answers. The participant may believe that his or her judgments are diagnostic, but in fact the probability that an answer is correct is .50 regardless of the confidence attached to it. In this extreme case, the participant is unable to enhance his or her memory performance at all by exercising the option of free report: Raising the report criterion does not increase accuracy performance, but simply decreases quantity performance. Finally, consider the “perfect discrimination” condition (Plot A) in which the participant discriminates perfectly between correct and incorrect candidate answers.1 Here, all correct answers are assigned a subjective probability of 1.0, and all incorrect answers are assigned a probability of 0.
In that case, the ideal situation is reached in which the option of free report allows the participant to achieve 100% accuracy with no cost in quantity: For any criterion level greater than 0 (forced report), the participant will volunteer only correct answers and withhold only incorrect answers.
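The logic of these simulations can be sketched as follows. The Python fragment is a rough reconstruction under stated assumptions rather than the original simulation code: expected values are computed analytically instead of being sampled; the “no discrimination” condition is assumed to share the uniform 11‐level confidence distribution of the prototypical condition; the “perfect discrimination” condition places half of the answers at 0 and half at 1.0; and all names (`CONDITIONS`, `expected_performance`) are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Optional, Tuple

# A monitoring condition maps each confidence level to
# (share of candidate answers at that level, probability that such an answer is correct).
Condition = Dict[Fraction, Tuple[Fraction, Fraction]]

LEVELS = [Fraction(i, 10) for i in range(11)]  # 0, .1, ..., 1.0 (exact arithmetic)
HALF = Fraction(1, 2)

CONDITIONS: Dict[str, Condition] = {
    "A perfect": {Fraction(0): (HALF, Fraction(0)), Fraction(1): (HALF, Fraction(1))},
    "B prototypical": {p: (Fraction(1, 11), p) for p in LEVELS},
    "C none": {p: (Fraction(1, 11), HALF) for p in LEVELS},
}


def expected_performance(condition: Condition, criterion: Fraction) -> Tuple[float, Optional[float]]:
    """Expected (quantity, accuracy) when every answer with confidence >= criterion is volunteered."""
    reported = sum(share for level, (share, _) in condition.items() if level >= criterion)
    correct = sum(share * p_ok for level, (share, p_ok) in condition.items() if level >= criterion)
    accuracy = float(correct / reported) if reported else None
    return float(correct), accuracy


for name, condition in CONDITIONS.items():
    for criterion in (Fraction(0), Fraction(1, 2), Fraction(1)):
        quantity, accuracy = expected_performance(condition, criterion)
        print(f"{name:15s} criterion={float(criterion):.1f} "
              f"quantity={quantity:.2f} accuracy={accuracy:.2f}")
```

Exact fractions are used for the confidence levels so that comparisons such as "confidence ≥ .3" are not disturbed by floating‐point rounding. In all three conditions, forced report (criterion = 0) yields .50 for both measures, so any differences at higher criteria reflect monitoring and control rather than retention.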
1 It is important to distinguish between two different indices of monitoring effectiveness, calibration and resolution (or discrimination accuracy; see, e.g., Lichtenstein et al., 1982; Nelson, 1996; Yaniv, Yates, & Smith, 1991). Calibration captures the absolute correspondence between subjective probabilities and the actual proportions correct. Perfect calibration, however, does not entail perfect monitoring effectiveness at the level of the individual answers. For instance, although a subject may be well calibrated in that, for example, among all items assigned a probability of .60, exactly .60 are correct, this in fact means that the subjective monitoring is not effective enough to differentiate the 60% correct responses from the 40% incorrect responses included in this category. Thus, it is discrimination accuracy (relative correspondence) that is more critical for the effective operation of the control mechanism: When assessed probabilities are polarized between the 0 (certainly wrong) and 1.0 (certainly right) categories, perfect calibration entails perfect discrimination accuracy at the level of individual items. Note, however, that the same discrimination accuracy would be obtained even when the probability values assigned to the two categories were, say, .40 and .41, in which case calibration would be very poor.
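The distinction drawn in this footnote can be illustrated numerically. The sketch below rests on assumptions that are not part of the original text: resolution is indexed by the gamma correlation between confidence and correctness, calibration by a frequency‐weighted mean absolute difference between each confidence level and the proportion correct at that level (a simple index chosen here purely for illustration), and the three data sets are made up.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Optional, Sequence


def goodman_kruskal_gamma(confidence: Sequence[float], correct: Sequence[bool]) -> Optional[float]:
    """Gamma = (concordant - discordant) / (concordant + discordant); None if all pairs are tied."""
    concordant = discordant = 0
    for (c1, k1), (c2, k2) in combinations(zip(confidence, correct), 2):
        sign = (c1 - c2) * (k1 - k2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


def calibration_error(confidence: Sequence[float], correct: Sequence[bool]) -> float:
    """Weighted mean |confidence level - proportion correct at that level| (illustrative index)."""
    by_level: Dict[float, List[bool]] = defaultdict(list)
    for c, k in zip(confidence, correct):
        by_level[c].append(k)
    n = len(confidence)
    return sum(len(v) / n * abs(level - sum(v) / len(v)) for level, v in by_level.items())


# Three hypothetical response patterns.
polarized = ([1.0, 1.0, 1.0, 0.0, 0.0, 0.0], [True, True, True, False, False, False])
compressed = ([0.41, 0.41, 0.41, 0.40, 0.40, 0.40], [True, True, True, False, False, False])
single_level = ([0.6] * 5, [True, True, True, False, False])

for name, (conf, ok) in [("polarized", polarized), ("compressed", compressed),
                         ("single level", single_level)]:
    print(name, goodman_kruskal_gamma(conf, ok), round(calibration_error(conf, ok), 3))
```

In this sketch the polarized and compressed patterns both give a gamma of 1.0, but only the former is well calibrated; the single‐level pattern is perfectly calibrated yet offers no basis for discriminating correct from incorrect answers (gamma is undefined).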
These simulations help illustrate the role of two critical factors within the proposed framework: monitoring effectiveness and accuracy motivation. With regard to monitoring effectiveness, clearly some ability to distinguish between correct and incorrect candidate answers is necessary for the control of memory reporting to yield any benefits at all. Moreover, as this ability improves, greater increases in accuracy can be achieved at lower costs in quantity, so that at the extreme, when monitoring effectiveness is perfect, there is no quantity‐accuracy trade‐off at all. As far as accuracy motivation is concerned, one can generally increase the accuracy of a memory report by employing a more conservative report criterion. However, under most monitoring conditions, enhancing one’s accuracy becomes relatively costly in terms of quantity performance as the criterion level is raised (note the accelerated drop in quantity on the prototypical plot in Fig. 2). Thus, simply giving a person the option of free report may allow a fairly large accuracy improvement to be achieved without much loss of quantity, but placing a larger premium on accuracy should lead to a more serious quantity reduction. More generally, when considering free‐report memory performance, it is both necessary and useful to distinguish between the independent contributions of retention, monitoring, and control. Overall retention (50%, as indexed by forced‐report performance at criterion = 0) was the same for all three conditions in Fig. 2. Yet the observed levels of free‐report performance could vary dramatically, depending on both the participant’s control policy (criterion level) and degree of monitoring effectiveness. We will return to these points again with regard to the empirical results, considered next.
B. EMPIRICAL EVIDENCE
Do the monitoring and control processes in fact operate in the postulated manner? To test the basic assumptions of the model, we developed a special two‐phase procedure, referred to as the quantity‐accuracy profile (QAP) methodology (see Section II.C below). In the first experiment (Koriat & Goldsmith, 1996b, Experiment 1), a general knowledge test was administered to participants in either a recall or a recognition format. The participants initially took the test under forced‐report instructions (Phase 1) and provided confidence judgments regarding the correctness of each answer. Immediately afterward, they took the same test again under free‐report instructions (Phase 2), with either a moderate accuracy incentive (receiving a monetary bonus for correct answers but paying an equal penalty for wrong answers) or a strong accuracy incentive (in which the penalty was 10 times greater than the bonus). This procedure enabled us to trace the links postulated by the model (see Fig. 1) between retrieval, monitoring, control, and memory performance:
Retrieval (recall or recognition) was tapped by treating the forced‐report answers provided in Phase 1 as representing the participant’s best candidate response for each item. Monitoring was tapped by eliciting each confidence judgment as a subjective probability assessment (Pa) associated with each best‐candidate answer. This allowed monitoring effectiveness to be evaluated. Control was tapped by examining which answers were volunteered or withheld on Phase 2. This allowed us to determine the sensitivity of the control policy to the monitoring output, and to derive a best‐fit estimate of the Prc set by each participant.2 In addition, a comparison of the estimated report criteria for the two incentive conditions allowed an examination of the predicted effects of accuracy incentive on the participants’ control policy. Finally, the design allowed us to evaluate the contribution of monitoring and control processes to the ultimate free‐report memory accuracy and memory quantity performance. The results accorded well with the model. First, participants were found to be fairly effective in monitoring the correctness of their answers. Within‐subject Kruskal–Goodman gamma correlations between confidence and correctness (see Nelson, 1984) averaged .87 for recall and .68 for recognition. Second, the tendency to report an answer was strongly tied to confidence in the answer. In fact, the gamma correlations between confidence and volunteering averaged .97 for recall and .93 for recognition! Third, participants who were given the strong accuracy incentive were more selective in their reporting, adopting a stricter criterion than those given the more moderate incentive: They volunteered fewer answers on the average (45%) than did the moderate‐incentive participants (52%), and mean confidence for those answers (.93) was higher than those volunteered by the moderate‐incentive participants (.84).
In addition, the report criterion estimates averaged .84 for the strong‐incentive participants versus .61 for the moderate‐incentive participants. Finally, by employing these monitoring and control processes, participants in both incentive conditions were able to enhance their free‐report accuracy performance relative to forced report. However, a quantity‐accuracy trade‐off was observed both in comparing free‐ and forced‐report performance, and in
2 The procedure for estimating the report criterion (Prc) set by each participant is as follows: For each participant, each assessed‐probability‐correct (confidence) level from 0 to 1.0 is evaluated as a possible Prc. The model predicts that all items with assessed probability correct greater than or equal to the candidate Prc will be volunteered, and that all other answers will be withheld. The proportion of the participant’s actual volunteering and withholding decisions that correspond to the predicted decisions for each candidate Prc are calculated, and the candidate Prc that yields the highest proportion of correctly predicted report decisions (with fit rates generally averaging over 90%) is chosen as the Prc estimate. If a range of values yields an equally good fit, the average of these estimates may be chosen (though in the study referred to here, we used the upper bound instead).
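The estimation procedure described in this footnote can be sketched in Python as follows. This is a minimal reading of the verbal description, not the authors' code: confidence is assumed to be expressed on an 11‐point scale (0, .1, ..., 1.0), the function name `estimate_prc` and the `tie_rule` parameter are hypothetical, and the example data are made up.

```python
from typing import List, Sequence, Tuple


def estimate_prc(confidence: Sequence[float], volunteered: Sequence[bool],
                 tie_rule: str = "mean") -> Tuple[float, float]:
    """Return (estimated Prc, fit rate) for one participant.

    Each level 0, .1, ..., 1.0 is tried as a candidate Prc; the model predicts
    that an answer is volunteered if and only if its confidence >= candidate.
    """
    if len(confidence) != len(volunteered) or len(confidence) == 0:
        raise ValueError("confidence and volunteered must be non-empty and of equal length")
    # Work in integer tenths so that comparisons are not affected by float rounding.
    tenths = [round(c * 10) for c in confidence]
    fits: List[Tuple[int, float]] = []
    for candidate in range(11):
        hits = sum((c >= candidate) == v for c, v in zip(tenths, volunteered))
        fits.append((candidate, hits / len(tenths)))
    best_fit = max(fit for _, fit in fits)
    best = [candidate for candidate, fit in fits if fit == best_fit]
    if tie_rule == "mean":       # average of the equally good candidates
        prc = sum(best) / len(best) / 10
    elif tie_rule == "upper":    # upper bound of the equally good candidates
        prc = max(best) / 10
    else:
        raise ValueError("tie_rule must be 'mean' or 'upper'")
    return prc, best_fit


# Hypothetical participant: answers held with confidence .9 or more are volunteered.
conf = [1.0, 0.9, 0.5, 0.2]
vol = [True, True, False, False]
print(estimate_prc(conf, vol, tie_rule="mean"))
print(estimate_prc(conf, vol, tie_rule="upper"))
```

In this example every candidate Prc from .6 to .9 predicts all four report decisions, so the two tie rules give different estimates (the average versus the upper bound of that range).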
Fig. 3. Results from Koriat and Goldsmith (1996b, Experiment 1). Free‐report quantity and accuracy performance (percent correct) as a function of test format (recall vs recognition) and accuracy incentive (strong vs moderate). The means are adjusted for initial differences between the incentive groups in forced‐report performance, which is also presented for each test format. [Values shown. Recall: free‐report accuracy 87.1 (strong incentive) and 77.5 (moderate incentive); free‐report quantity 36.3 (strong) and 40.5 (moderate); forced report 46.3. Recognition: free‐report accuracy 90.3 (strong) and 85.3 (moderate); free‐report quantity 41.9 (strong) and 49.1 (moderate); forced report 61.5.]
comparing performance under the two incentive conditions (see Fig. 3).3 Consistent with the simulation analyses, the quantity cost of the improved accuracy increased in relative terms when a higher criterion was employed: Whereas under a moderate accuracy incentive, the option of free report enabled participants to enhance their accuracy substantially at a relatively low cost in quantity performance (a 64% accuracy improvement achieved at a 19% quantity cost for recall; a 33% accuracy improvement achieved at a 26% quantity cost for recognition), the introduction of a stronger accuracy incentive resulted in a further increase in accuracy, but now at a relatively high quantity cost (a further 12% accuracy improvement achieved at a 10% quantity cost for recall; a 6% accuracy improvement achieved at a 15% quantity cost for recognition; based on adjusted means). A second experiment evaluated the role of monitoring effectiveness (Koriat & Goldsmith, 1996b, Experiment 2). That experiment used the same procedure as in the first experiment (recall and moderate incentive only), but in addition, monitoring effectiveness was manipulated within participant by using two different sets of general knowledge items: One set (the “poor” monitoring condition) consisted of items for which the participants’ confidence judgments were expected to be generally uncorrelated with the correctness of their answers (Fischhoff, Slovic, & Lichtenstein, 1977; Gigerenzer, Hoffrage, & Kleinbölting, 1991; Koriat, 1995), whereas the other set (the “good” monitoring condition) consisted of more typical items, for which the participants’ monitoring was expected to be more effective. The success of the manipulation can be verified by examining Fig. 4. Participants based their volunteering decisions heavily on their monitoring output in both monitoring conditions, presumably because they lacked any better predictor. Thus, the gamma correlations between confidence and volunteering averaged .95 and .88 for the good‐ and poor‐monitoring conditions, respectively. More importantly, even when the two sets were matched on retention (by adding some very difficult items to the good‐monitoring set) so that forced‐report performance was equivalent, the good‐monitoring condition allowed participants to attain a far superior joint level of free‐report accuracy and quantity performance: Much better accuracy performance was achieved while maintaining equivalent quantity performance, compared to the poor‐monitoring condition (see Fig. 5). These results, then, reinforce the earlier simulation results in highlighting the criticality of monitoring effectiveness for free‐report memory performance. When participants’ monitoring effectiveness is good, the option of free report can allow them to achieve high levels of accuracy. In other situations, however, participants’ monitoring may be undiagnostic (or perhaps even counterdiagnostic, see Benjamin, Bjork, & Schwartz, 1998) to the point of being useless.
3 Due to sampling error, subjects in the high‐incentive condition yielded a higher forced‐report quantity score (57.3%) than did the moderate‐incentive subjects (52.5%). This difference was partialed out in an analysis of covariance to determine the effects of the incentive manipulation on free‐report accuracy and quantity performance.
Participants still control their memory reporting according to their monitoring output, but the attained level of free‐report accuracy may be little better than when participants are denied the option of deciding which answers to volunteer (for similar results using an associative interference manipulation, see Kelley & Sahakyan, 2003; Rhodes & Kelley, 2005). Of particular importance is the demonstration that monitoring effectiveness can affect memory performance independent of memory “retention” (cf. Fig. 2). Even when retention, as indexed by forced‐report quantity performance, was equated across the good‐ and poor‐monitoring subtests in Experiment 2, the joint levels of free‐report accuracy and quantity performance were far superior for the good‐monitoring subtest than for the poor‐monitoring subtest. Clearly, then, free‐report memory performance depends on the effective operation of metacognitive processes that are simply not tapped by forced‐report performance. Results from several other studies also suggest a dissociation between monitoring and retention. For example, Kelley and Lindsay (1993) observed
Fig. 4. Calibration plots for the (A) good‐monitoring and (B) poor‐monitoring conditions in Koriat and Goldsmith (1996b, Experiment 2). The frequency of judgments in each category appears beside each data point, and the mean within‐subject gamma correlations between assessed probability correct and actual correctness is also presented. [Plots: assessed probability (0.0 to 1.0) on the horizontal axis and proportion correct (0.0 to 1.0) on the vertical axis. Panel A, good‐monitoring condition: mean gamma = 0.90. Panel B, poor‐monitoring condition: mean gamma = 0.26.]
Fig. 5. Results from Koriat and Goldsmith (1996b, Experiment 2). Mean quantity and accuracy performance (percent correct) for the good‐monitoring condition, the poor‐monitoring condition, and the good‐monitoring condition after matching it to the poor‐monitoring condition on retention (by including a subset of difficult items). [Values shown. Good monitoring: forced report 27.9; free‐report accuracy 75.0; free‐report quantity 22.3. Poor monitoring: forced report 11.8; free‐report accuracy 21.0; free‐report quantity 7.6. Good monitoring, matched retention: forced report 11.2; free‐report accuracy 63.0; free‐report quantity 8.6.]
that advance priming of potential answers to general information questions increased the ease of access to these answers, raising subjective confidence regardless of whether those answers were right or wrong. Similarly, research investigating the cue‐familiarity account of the feeling of knowing indicates that feeling‐of‐knowing judgments can be enhanced by advance priming of the cue, again even when such priming has no effect on actual memory quantity performance (Reder & Ritter, 1992; Schwartz & Metcalfe, 1992). Finally, Chandler (1994) found that exposing participants to an additional set of pictures similar to the studied set increased their confidence ratings on a subsequent forced‐choice recognition test, while in fact their actual performance was impaired. Such dissociations serve to emphasize a basic difference between our proposed framework for conceptualizing the strategic regulation of memory reporting and the well‐known (Type‐1) SDT approach to memory. Type‐1 SDT does not address the separate contributions of memory retention (or memory strength) and monitoring effectiveness to memory performance. In that approach, subjective confidence and memory strength are generally treated as synonymous (Chandler, 1994), and in fact, confidence is often used to index memory strength (Lockhart & Murdock, 1970; Parks, 1966; cf. Van Zandt, 2000). Thus, in the forced‐report old/new paradigm to which
signal‐detection methods are typically applied, “control” is isolated in terms of the parameter β, yet “retention” (overall memory strength) and “monitoring effectiveness” (the extent to which the participant’s confidence distinguishes “old” from “new” items) cannot be operationally or conceptually separated: Both are equally valid interpretations of d′ (Lockhart & Murdock, 1970). By contrast, in our proposed framework for conceptualizing free‐report performance, these latter two aspects (as well as control) are given a separate standing: A person may have effective monitoring, yet very poor retention, or vice versa. Furthermore, poor free‐report memory performance, for instance, could derive from poor retention, poor monitoring, an inappropriate control policy, or all three. The conceptual separation of these components of free‐report performance has important implications. At the theoretical level, it calls for more serious efforts to incorporate monitoring and control processes, as well as encoding, storage, and retrieval processes, into our theories and models of memory. At the same time, however, acknowledgment of the potential effects of metamemory processes on memory performance raises an important assessment issue: How should such effects be handled when assessing memory performance? Our approach has been illustrated by the experimental procedure and analyses utilized in the experiments just reported. In the following section, we explicate and expand on this assessment methodology.
C. QAP METHODOLOGY
How can one sensibly evaluate a person’s memory if memory performance, particularly memory accuracy, is under the person’s control? The approach that we developed (Koriat & Goldsmith, 1996b) incorporates metacognitive processes into the assessment of memory performance, while isolating and evaluating their independent contributions to free‐report memory quantity and accuracy performance. Thus, rather than deriving a single “point‐estimate” index of memory performance, our QAP methodology provides a profile of measures that capture various aspects of each participant’s cognitive and metacognitive performance in a particular memory task. In addition, the methodology also allows one to examine (by way of simulation) the potential free‐report accuracy and quantity performance that participants might achieve, given their particular levels of memory retention and monitoring effectiveness. The core of the procedure is the two‐phase, forced‐free paradigm, combined with the elicitation of confidence judgments in the forced‐report phase, which was just described in connection with our empirical studies. The role of the forced‐report phase is to provide information about memory retention or retrieval which is, as much as possible, unaffected by metacognitive report processes. The role of the free‐report phase, beyond that of indicating the actual levels of accuracy and quantity performance that are achieved under free‐report
conditions, is to provide information about control: the extent to which the report decision is coupled to one’s monitoring (control sensitivity), and the strictness or liberality of the report criterion used by the participant (Prc). This is done in conjunction with the confidence judgments that are collected in the forced‐report phase. The additional role of the confidence judgments, of course, is to provide information about monitoring per se: its absolute levels, its calibration (e.g., over/underconfidence), and the extent to which it discriminates between correct and incorrect candidate answers (monitoring resolution). Overall, although the specifics may vary according to one’s research goals, our metacognitive free‐forced paradigm and associated QAP methodology allow the derivation of up to 10 different measures for each participant (see Table I). In our own work, we have used several variations of the general procedure. Initially, in tasks involving general knowledge questions, we chose to collect the forced‐ and free‐report data in two separate phases, having the participants answer the same set of questions twice: first under forced‐report instructions and then again under free‐report instructions (or in reverse order). In other experiments, particularly those involving episodic memory tasks, the answers from the initial forced‐report phase were carried over to the subsequent free‐report phase in which participants simply marked which items they would like to volunteer for points under the specified payoff schedule. Alternatively, however, the free‐ and forced‐report data can be collected on an item‐by‐item basis, by first forcing the participant to provide an answer, then eliciting a confidence judgment, and finally, having the participant decide whether to volunteer the answer or not (Kelley & Sahakyan, 2003).
Each variation of this paradigm has advantages and disadvantages, but the pattern of results obtained across different variations appears to be quite consistent. An additional component of the methodology and its use within the overall assessment approach still require explication. Similar to the manner in which the plotting of ROC curves in the SDT approach yields additional information beyond what is evident in the parameter values d′ and β alone (cf. Higham, 2007), so too with our approach, one can gain additional information by using the answers and associated confidence judgments collected in the forced‐report phase to plot the joint levels of free‐report quantity and accuracy performance that would ensue from the application of various report criteria to the participants’ candidate answers. Like ROC curves, these QAP curves (Koriat & Goldsmith, 1996b; see also Goldsmith & Koriat, 1999)⁴ allow one to generalize
4 In our previous presentations of the QAP methodology, it was not entirely clear whether the label “QAP” pertained to our entire methodology or only to the plotted QAP curves. We hope now to correct this problem by adding “curve” or “plot” as a qualifier when referring to this particular component of the overall methodology.
TABLE I
SUMMARY OF MAIN QUANTITY‐ACCURACY PROFILE (QAP) MEASURES

| Measure | Type | Description | Phase |
|---|---|---|---|
| Retention (or retrieval or ecphory) | Memory | Proportion or percentage of forced‐report answers that are correct | Forced |
| Monitoring resolution (or discrimination accuracy or relative monitoring) | Monitoring | Within‐individual gamma correlation between confidence (assessed probability correct) in each answer and the correctness of each answer (Nelson, 1984, 1996), or alternative measures such as ANDI (Yaniv et al., 1991) | Forced |
| Monitoring calibration (or absolute monitoring): over/underconfidence | Monitoring | Difference between mean assessed probability correct and proportion correct (positive values reflect overconfidence; Lichtenstein et al., 1982) | Forced |
| Monitoring calibration (or absolute monitoring): squared‐ or absolute‐value deviations | Monitoring | Mean squared‐ or absolute‐value difference between the mean assessed probability correct and proportion correct of each confidence category used in plotting a calibration curve (e.g., Fig. 4; see Lichtenstein et al., 1982) | Forced |
| Control sensitivity | Control | Within‐individual gamma correlation between confidence (assessed probability correct) in each answer and whether or not it was volunteered (see also Prc fit rate) | Forced + Free |
| Report criterion (Prc) estimate | Control | Estimate of each participant’s report criterion (assessed probability level) that yields the maximum fit (fit rate) with his or her actual report decisions (see Footnote 2 for details) | Forced + Free |
| Prc fit rate (or fit ratio) | Control | The proportion of each participant’s actual volunteering decisions that are compatible with the derived Prc estimate, and which is maximized by this estimate (see Footnote 2 for more details). This can also be used as an index of control sensitivity | Forced + Free |
| Control effectiveness | Control | Absolute value of the difference between the estimated Prc for each participant and the optimal Prc, identified as the Prc level that would maximize the participants’ payoff (see Section II.C for details) | Forced + Free |
| Free‐report quantity (input‐bound) | Performance | Proportion of correct reported answers out of the total number of questions (or studied items) | Free |
| Free‐report accuracy (output‐bound) | Performance | Proportion of correct reported answers out of the number of answers that were volunteered | Free |

The table includes typical and alternate names of the measures (in parentheses), the type of cognitive or metacognitive component that they address, a description of how they are calculated, and the source of the experimental data (forced‐report or free‐report phase) from which they are derived.
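As an informal illustration of how several of the measures in Table I might be computed from a single participant’s data, the following Python sketch calculates monitoring resolution, over/underconfidence, control sensitivity, and a report‐criterion estimate with its fit rate. The function names, the grid of candidate criteria (0 to 1.0 in steps of .1), and the tie‐breaking rule (the lowest best‐fitting criterion is kept) are assumptions made for this example only; the actual estimation procedure is the one described in Footnote 2, which this sketch does not claim to reproduce.

```python
from itertools import combinations


def gamma(xs, ys):
    """Goodman-Kruskal gamma: (concordant - discordant) / (concordant + discordant).

    Pairs tied on either variable are ignored.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else float("nan")


def over_underconfidence(confidence, correct):
    """Mean assessed probability correct minus proportion correct.

    Positive values indicate overconfidence.
    """
    return sum(confidence) / len(confidence) - sum(correct) / len(correct)


def estimate_prc(confidence, volunteered, candidates=None):
    """Return (criterion, fit_rate) for the best-fitting report criterion.

    Assumed decision rule: an answer is volunteered if and only if its
    confidence is greater than or equal to the criterion. On ties, the
    lowest candidate criterion is kept (an assumption of this sketch).
    """
    if candidates is None:
        candidates = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
    best = None
    for prc in candidates:
        matches = sum((c >= prc) == bool(v) for c, v in zip(confidence, volunteered))
        fit_rate = matches / len(confidence)
        if best is None or fit_rate > best[1]:
            best = (prc, fit_rate)
    return best


# Hypothetical forced-report data for one participant (six items).
confidence = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]   # assessed probability correct
correct = [1, 1, 0, 1, 0, 0]                  # 1 = correct answer
volunteered = [1, 1, 1, 0, 0, 0]              # 1 = volunteered in free report

retention = sum(correct) / len(correct)
resolution = gamma(confidence, correct)
bias = over_underconfidence(confidence, correct)
control_sensitivity = gamma(confidence, volunteered)
prc, fit_rate = estimate_prc(confidence, volunteered)
```

In this made‐up example the participant volunteers exactly the three highest‐confidence answers, so control sensitivity is perfect and a criterion of .5 reproduces every report decision, whereas monitoring resolution is imperfect because one low‐confidence answer was in fact correct.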
[Figure 6 plot: y‐axis, Performance (%), 0 to 100; x‐axis, Report criterion (Prc), 0.0 to 1.0; curves GM‐ACC, PM‐ACC, GM‐QTY, and PM‐QTY; plotted points GM‐MOD, PM‐MOD, PM‐STRONG, and GM‐STRONG.]
Fig. 6. Illustrative quantity‐accuracy profile (QAP) curves for two groups of participants exhibiting different levels of monitoring effectiveness. Potential free‐report memory quantity and memory accuracy performance (mean percent correct) is plotted as a function of criterion level for each group. The mean free‐report quantity and accuracy scores actually achieved by the participants, subdivided according to the operative level of accuracy incentive (high vs low), are also plotted as open squares or triangles at the point on the x‐axis corresponding to the criterion estimate for that subgroup. GM, good‐monitoring group; PM, poor‐monitoring group; ACC, accuracy performance; QTY, quantity performance; STRONG, strong accuracy incentive; MOD, moderate accuracy incentive.
beyond the participants’ actual free‐report quantity and accuracy performance, to other potential levels of performance, including “optimal” levels (if explicit payoffs for quantity and accuracy have been specified). To illustrate the method, and how it can be used in conjunction with the other QAP components, in Fig. 6 we present two new QAP curves derived using data from the recall condition of our earlier study (Koriat & Goldsmith, 1996b, Experiment 1). The plots compare the potential quantity and accuracy performance of eight “good‐monitoring” participants, those falling in the top quartile of monitoring effectiveness (mean gamma correlation between confidence and correctness of individual items = .95; range: .93–1.0) with
eight “poor‐monitoring” participants, comprising the bottom quartile of monitoring effectiveness (mean gamma correlation = .75; range: .65–.83).⁵ These QAP curves were derived as follows: For each participant, confidence data from the initial forced‐report phase were used to calculate the input‐bound quantity scores and the output‐bound accuracy scores (plotted on the y‐axis) that would result from the application of 11 different potential report‐criterion settings (Prc; plotted on the x‐axis), ranging from 0 (equivalent to forced report) to 1.0. That is, we assumed that all items with assessed probability correct greater than or equal to each Prc would be volunteered, and calculated the quantity and accuracy scores for that Prc accordingly. The means of these scores at each Prc are plotted separately for each of the two groups of participants. In addition, the actual quantity and accuracy scores achieved by the participants in the free‐report phase appear as bullets above the mean estimated criterion level for those participants (based on their actual volunteering behavior; see Footnote 2), subdivided further into those operating under the moderate (1:1 bonus‐penalty ratio) versus strong (1:10 bonus‐penalty ratio) accuracy incentives, described earlier. What type of information can be gleaned from these QAP curves? In terms of forced‐report performance (Prc = 0), the two groups of participants are virtually indistinguishable. Thus, the memory performance ability of the participants in the two groups would be evaluated as “equivalent” under the traditional assessment approach, which often uses forced reporting in attempting to eliminate the contribution of participant‐controlled processes.
Yet, one can immediately see that as soon as the participants are given the freedom to control their own memory reporting, the higher level of monitoring effectiveness of the good‐monitoring participants allows them to achieve substantially better performance than the poor‐monitoring participants. In fact, the joint levels of accuracy and quantity performance that can be achieved by the good‐monitoring participants are superior to those attainable by the poor‐monitoring participants across the range of potential free‐report criterion settings (Prc > 0). Also, consistent with the results of the earlier simulation analyses, although the poor‐monitoring participants can utilize the option of free report to achieve fairly high levels of output‐bound accuracy performance, in doing so, they must pay a higher price in quantity performance than do the good‐monitoring participants. That is, the good‐monitoring participants exhibit a shallower quantity‐accuracy trade‐off pattern than do their poor‐monitoring counterparts. Hence, the memory
5 Note that the level of monitoring effectiveness exhibited by these “poor‐monitoring” participants is still rather good. By comparison, mean gamma was .26 in the poor‐monitoring condition of Experiment 2 in that same study (see Fig. 3).
abilities of the participants in the two groups are clearly not equivalent under conditions that allow them to regulate their own memory reporting. While the QAP curves provide important information about the potential levels of memory accuracy and quantity performance that can be achieved by participants given their specific levels of retention and monitoring effectiveness, they can also be supplemented by information about the actual volunteering policy of the participants under free‐report conditions and the actual performance levels that ensue from that policy. This information is derived from the free‐report phase of the test procedure. As can be seen, there is a rather good correspondence between the actual free‐report quantity and accuracy scores, and the simulated scores based on the forced‐report data. The deviations that occur reflect the fact that the participants did not volunteer and withhold their answers entirely in line with the model (the Prc fit rate averaging 93% for this sample) and the fact that in this illustration, the QAP curves are based on the forced‐report data of the participants from both incentive conditions, whereas the actual performance results reflect a subset of those participants, operating under a specific incentive condition. Examination of the actual free‐report performance of the participants brings to the fore the contribution of accuracy motivation: In both monitoring groups, participants strategically regulated their memory reporting according to the operative level of accuracy incentive, with a stricter criterion being adopted when reporting under the strong accuracy incentive than under the moderate accuracy incentive. Interestingly, the good‐monitoring participants were much more sensitive to the incentive manipulation than were the poor‐monitoring participants, showing a much larger difference in report criteria between the two incentive conditions.
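The derivation of the QAP curves described above (applying each of 11 potential report criteria to the forced‐report confidence data and computing the resulting input‐bound quantity and output‐bound accuracy) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name and data are hypothetical, and returning `None` for accuracy when no answer would be volunteered is a choice made here, since the text does not say how that case was handled.

```python
def qap_curve(confidence, correct, criteria=None):
    """Simulate free-report quantity and accuracy at each report criterion.

    confidence: assessed probability correct for each forced-report answer
    correct:    1 if the forced-report answer is correct, else 0
    Returns a list of (prc, quantity, accuracy) tuples.
    """
    if criteria is None:
        criteria = [i / 10 for i in range(11)]  # 0.0 (forced report) to 1.0
    n_items = len(confidence)
    curve = []
    for prc in criteria:
        # Assume every answer with confidence >= prc would be volunteered.
        volunteered = [ok for conf, ok in zip(confidence, correct) if conf >= prc]
        n_correct = sum(volunteered)
        quantity = n_correct / n_items  # input-bound: out of all items
        # Output-bound accuracy is undefined when nothing is volunteered.
        accuracy = n_correct / len(volunteered) if volunteered else None
        curve.append((prc, quantity, accuracy))
    return curve


# Hypothetical forced-report data for one participant (six items).
confidence = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
correct = [1, 1, 0, 1, 0, 0]

for prc, quantity, accuracy in qap_curve(confidence, correct):
    print(prc, quantity, accuracy)
```

Averaging such per‐participant curves within a group at each criterion level would yield group curves of the kind plotted in Fig. 6; at a criterion of 0 the quantity and accuracy scores coincide with forced‐report performance.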
Which group achieved the best actual memory performance, and to what extent can this be attributed to more effective report regulation? Consider the results from the participants in the moderate‐incentive condition. Whereas free‐report accuracy was higher for the good‐monitoring than for the poor‐monitoring participants, quantity performance was slightly higher for the poor‐monitoring participants. In such a case, one cannot determine, without further assumptions, which level of joint quantity and accuracy performance is better. The answer depends on how much weight is given to accuracy relative to quantity. For example, when testifying on the witness stand in a high‐stakes trial, we would tend to give a relatively high weight to the accuracy of the information that is reported. In the early stages of a criminal investigation, however, in order to generate as many leads as possible, one might be more interested in the amount of (correct) information that can be elicited than in the amount of false information (false leads) that is produced. One way of resolving this problem in memory research is to weight the participants’ quantity and
accuracy performance according to the accuracy incentive payoff schedule under which the participant was operating. By considering the operative payoff schedule, not only do we gain a principled way of combining accuracy and quantity performance into a single performance score, we can also examine the effectiveness of the participants’ control of reporting, in particular, the efficiency of the participants in choosing a report criterion that would maximize the “utility” of their memory performance. Figure 7 presents a set of payoff curves corresponding to the QAP plots in Fig. 6, but now describing the potential monetary bonus that could be achieved by the participants in the two monitoring groups under each of the two manipulated payoff schedules (incentive conditions), with the calculations based on the joint levels of quantity and accuracy performance
[Figure 7 plot: y‐axis, Payoff (points), −70 to 30; x‐axis, Report criterion (Prc), 0.0 to 1.0; curves GM‐MOD, PM‐MOD, GM‐STRONG, and PM‐STRONG; plotted points GM‐MOD, PM‐MOD, PM‐STRONG, and GM‐STRONG.]
Fig. 7. Illustrative QAP payoff curves corresponding to the QAP performance curves in Fig. 6. Simulated mean free‐report point earnings are plotted as a function of criterion level for the participants in each monitoring level × accuracy incentive subgroup. The mean point‐payoff actually achieved by the participants, according to the applicable incentive payoff scheme, is also plotted as open squares or triangles at the point on the x‐axis corresponding to the criterion estimate for that subgroup. The y‐axis is truncated to improve readability. GM, good‐monitoring group; PM, poor‐monitoring group; ACC, accuracy performance; QTY, quantity performance; STRONG, strong accuracy incentive; MOD, moderate accuracy incentive.
yielded at each potential report criterion level. We see a pattern very similar to the one observed in the preceding analysis: Under each payoff scheme (incentive condition), the potential performance payoff is higher for the good‐monitoring group than for the poor‐monitoring group across the entire range of possible report criterion settings. We also see, however, that the actual performance payoff depends on the rememberer’s choice of report criterion setting, particularly in the high‐incentive condition. In fact, by finding the maximum payoff that could be achieved by each participant and the corresponding criterion setting (based on each participant’s individual QAP data), we can identify the optimal criterion setting for each participant and the mean optimal criterion for each group or condition. The rememberer’s control effectiveness can then be evaluated in terms of the difference between the actual payoff and the optimal payoff, and, correspondingly, between the actual (estimated) criterion setting and the optimal criterion setting. Doing this for the specific participants in our illustrative sample, we find that for the good‐monitoring participants, the mean optimal criterion settings were .48 and .63 in the moderate‐ and high‐incentive conditions, respectively, compared to the actual (estimated) criterion settings of .23 and .77, respectively. (This cannot be seen in Fig. 7, which combines the data from participants in both incentive conditions.) Thus, these participants’ control policy was overly liberal in the moderate‐incentive condition and overly conservative in the high‐incentive condition, causing them to earn 3 points less than the optimal moderate‐incentive payoff and 5 points less than the optimal high‐incentive payoff.
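The payoff analysis just described can be sketched in Python as follows. The sketch assumes a simple point scheme in which each correct volunteered answer earns `bonus` points and each incorrect volunteered answer loses `penalty` points, with withheld answers scoring zero, so that the 1:1 and 1:10 bonus‐penalty ratios correspond to `(1, 1)` and `(1, 10)`. The function names, the data, and the rule of keeping the lowest criterion among equally good ones are assumptions of this example. The last function shows one way to arrive at the “normative” criteria of .50 and .91 mentioned in Footnote 6: under perfect calibration, volunteering an answer with assessed probability p has expected value p × bonus − (1 − p) × penalty, which is nonnegative when p ≥ penalty / (bonus + penalty).

```python
def payoff_curve(confidence, correct, bonus, penalty, criteria=None):
    """Simulated point payoff at each report criterion.

    Assumed scheme: +bonus per correct volunteered answer, -penalty per
    incorrect volunteered answer, 0 for withheld answers.
    Returns a list of (prc, payoff) tuples.
    """
    if criteria is None:
        criteria = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
    curve = []
    for prc in criteria:
        gained = sum(bonus for c, ok in zip(confidence, correct) if c >= prc and ok)
        lost = sum(penalty for c, ok in zip(confidence, correct) if c >= prc and not ok)
        curve.append((prc, gained - lost))
    return curve


def optimal_criterion(curve):
    """Return the (prc, payoff) point with the highest payoff.

    max() keeps the first maximum, i.e. the lowest such criterion.
    """
    return max(curve, key=lambda point: point[1])


def normative_criterion(bonus, penalty):
    """Lowest assessed probability with nonnegative expected value."""
    return penalty / (bonus + penalty)


# Hypothetical forced-report data for one participant (six items).
confidence = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
correct = [1, 1, 0, 1, 0, 0]

moderate_best = optimal_criterion(payoff_curve(confidence, correct, bonus=1, penalty=1))
strong_best = optimal_criterion(payoff_curve(confidence, correct, bonus=1, penalty=10))

# Control effectiveness: distance between an estimated criterion and the optimum.
estimated_prc = 0.5  # hypothetical estimate from the free-report decisions
control_effectiveness = abs(estimated_prc - strong_best[0])
```

With these made‐up data, the best simulated criterion moves from .4 under the moderate incentive to .7 under the strong incentive, illustrating why a stricter criterion pays off when errors are penalized more heavily.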
By comparison, the mean optimal criterion settings for the poor‐monitoring participants were .68 and 1.0 in the moderate‐ and high‐incentive conditions, respectively, compared to the actual (estimated) criterion settings of .39 and .57, respectively. Thus, these participants’ control policy was overly liberal in both incentive conditions, causing them to earn 6 points less than the optimal moderate‐incentive payoff and 29 points less than the optimal high‐incentive payoff. Note that the good‐monitoring participants were not only more effective in their monitoring, they were also more effective in maximizing their performance by setting an appropriate report criterion than were the poor‐monitoring participants, both in terms of the absolute deviation between the actual and optimal criterion settings, and in terms of the difference between actual and optimal payoffs.⁶ The preceding example was designed to illustrate the type of information that can be gained using the QAP assessment approach, and the potential
6 Note that one can also evaluate the participants’ performance with respect to “normative” report criteria by which all (and only) answers with a nonnegative subjective expected value are volunteered. Assuming perfect calibration, these are .50 and .91 in the moderate‐ and high‐incentive conditions, respectively.
utility of evaluating the strategic regulation of memory performance within a decision‐theoretic framework. In general, QAP analyses can be used to separate and examine the effects of different variables on memory retention, monitoring, and control in a manner similar to the way Type‐1 SDT methods allow one to distinguish differential effects on d′ and β. Individual differences and the effects of various factors on the retention and accessibility of information can be examined with respect to forced‐report performance. Differences and effects on monitoring effectiveness can be examined in terms of calibration and resolution indexes. Differences in control sensitivity, report criterion, and control effectiveness can also be examined. Finally, the contribution of each of these factors to both actual and potential free‐report accuracy and quantity performance can be isolated and compared between conditions or individuals (for further examples of QAP comparisons, see Goldsmith & Koriat, 1999; Koriat & Goldsmith, 1996a,b).
D. QAP OR TYPE‐2 SDT?
We have already discussed the similarities and differences between our general approach and the Type‐1 SDT approach. However, a variant of our QAP methodology involving Type‐2 SDT measures has been put forward by Higham (2002) and used in several subsequent studies (Higham, 2007; Higham & Gerrard, 2005; Higham & Tam, 2005, 2006).⁷ It is worthwhile, therefore, to briefly discuss some of the similarities and differences between his methodology and ours (for further discussion, see Higham, 2002). The psychological model underlying Higham’s adaptation of the Type‐2 SDT framework to analyze free‐report performance is essentially the same as ours. However, some of the performance measures and methods of analysis are different. Like us, Higham assumes the existence of an initial retrieval stage, in which candidate answers are generated, followed by a monitoring and control stage, in which the candidate answers are evaluated and then either volunteered or withheld. As in the QAP methodology, a two‐phase,
7 There has been a great deal of confusion over the years concerning the difference between Type‐1 and Type‐2 SDT tasks and analyses (for helpful clarifications, see Galvin, Podd, Drga, & Whitmore, 2003; Healy & Jones, 1973). In a Type‐1 SDT task, an observer decides which of two events, defined independently of the observer, has occurred (e.g., whether a stimulus display contains a target or just noise; whether a presented recognition test item appeared earlier in the study list or not). In a Type‐2 SDT task, an observer decides which of her Type‐1 decisions are correct and which are incorrect. In other words, whereas the Type‐1 task taps cognitive performance, the Type‐2 task taps metacognitive performance. Hence, the Type‐2 SDT parameters d′ (or A′) and β (or B″D) lose their usual Type‐1 interpretation: In particular, when the participants’ decisions concern their own memory responses, d′ (or A′) no longer indexes memory, but rather metamemory (monitoring effectiveness). Because Type‐2 SDT no longer offers an index of retention cleaned of response bias, an additional (non‐SDT) measure must be introduced for this purpose (e.g., forced‐report performance; see following discussion).
forced‐free procedure is used to gain information about both stages, and as with QAP, the initial retrieval‐generation component is indexed in terms of forced‐report performance. The monitoring and control components, however, are measured differently. Monitoring effectiveness is measured in terms of the relationship between the free‐report decision (volunteer/withhold) and the correctness of the items, calculated using the (Type‐2) nonparametric SDT measure, A′ (Grier, 1971). Control, or report bias, is measured using the complementary SDT measure, B″D (Donaldson, 1992), which reflects the tendency to volunteer rather than withhold one’s answers, corrected for differences in their overall accuracy. In the Type‐2 approach, then, report control is used to tap monitoring by treating the “volunteer” and “withhold” decisions as reflecting high and low confidence, respectively. In contrast, the QAP methodology taps monitoring directly through confidence judgments. This difference is not arbitrary: Unlike SDT, our framework allows that rememberers may differ in the extent to which they control their memory reporting on the basis of subjective confidence, and that these differences may be interesting sources of variance in free‐report memory performance. Therefore, the QAP methodology provides an independent measure of this relationship (control sensitivity). Although the very strong correlations between confidence and reporting obtained with college students, described earlier, might seem to make this dissociation superfluous, in later sections (Sections III.C and III.D) we describe evidence suggesting that control sensitivity may in fact be an important factor to consider in explaining population differences in memory performance. A second difference concerns the control policy.
The B″D measure used in Type‐2 analyses is a measure of report bias: For example, if two participants have exactly the same number of correct candidate answers available for reporting (i.e., equivalent forced‐report performance), and one of them has a higher volunteering rate than the other, this difference will be reflected in a lower B″D measure. Note, however, that in terms of our framework, the B″D measure does not distinguish between the setting of a lower (more liberal) report criterion (Prc), and the alternative possibility, that the increased volunteering rate stems from overconfidence in the correctness of one’s answers (i.e., a confidence “distribution shift”; see, e.g., Higham & Tam, 2005, Experiment 2; cf. Wixted & Stretch, 2000). We, in keeping with the decision‐making and metacognition literatures (Lichtenstein, Fischhoff, & Phillips, 1982; Nelson, 1996), consider over/underconfidence to be an aspect of monitoring rather than of control. This is why in our studies we generally use at least two measures of monitoring effectiveness (one for calibration, and one for resolution; Lichtenstein et al., 1982; Nelson, 1996; see Table I). We reserve the theoretical notions of control policy and report criterion setting for the idea that independent of how calibrated one is in monitoring
the correctness of one’s candidate answers, one might be risk‐averse, and withhold the answers, or risk‐seeking, and volunteer them. In sum, whereas our theoretical framework and accompanying methodology make a distinction between five components (and subcomponents) that contribute to free‐report performance, namely retrieval, monitoring resolution (relative monitoring), calibration (absolute monitoring), control sensitivity, and control policy (report criterion or Prc), Higham’s Type‐2 approach distinguishes only three: retrieval, monitoring resolution, and report bias (over/underconfidence + control policy). In addition to these measurement differences, however, there also seems to be a fundamental difference between the two approaches in the way in which output‐bound memory accuracy is treated. In our approach, output‐bound accuracy is on an equal footing with input‐bound quantity as a property of interest in its own right. In contrast, Higham’s use of the Type‐2 SDT approach resembles the traditional use of Type‐1 SDT methods, which were generally used to “purify” the measure of memory quantity performance from potential “contaminants” such as response bias. Accuracy (i.e., the false alarm rate) was of interest primarily in order to correct the hit rate for response bias. The same appears to hold for Higham’s use of the Type‐2 methodology in studying free‐report performance. For example, in the studies in which his Type‐2 methodology has been applied, output‐bound accuracy was not even reported. Despite these differences, we emphasize that Higham’s Type‐2 SDT approach has proven to be very valuable in shedding light on the contribution of monitoring and control processes to free‐report memory quantity performance (see Section III), and appears to constitute a viable complement to our preferred QAP method, depending on the research context and one’s research goals.
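For readers who wish to see how the Type‐2 measures discussed in this section might be computed, the following Python sketch derives Type‐2 hit and false‐alarm rates from the volunteer/withhold decisions and then applies the nonparametric formulas for A′ and B″D as they are commonly stated in the SDT literature. The chapter does not spell these formulas out, so the expressions below, the function names, and the data are assumptions of the example rather than a description of Higham’s exact procedure. In particular, the below‐chance branch of A′ is the commonly used extension of Grier’s formula, and no correction is applied for rates of exactly 0 or 1, for which the formulas can be undefined.

```python
def type2_rates(correct, volunteered):
    """Type-2 hit and false-alarm rates from free-report decisions.

    Hit rate:         proportion of correct candidate answers that were volunteered.
    False-alarm rate: proportion of incorrect candidate answers that were volunteered.
    Assumes at least one correct and one incorrect candidate answer.
    """
    n_correct = sum(1 for ok in correct if ok)
    n_incorrect = len(correct) - n_correct
    hits = sum(1 for ok, v in zip(correct, volunteered) if ok and v)
    false_alarms = sum(1 for ok, v in zip(correct, volunteered) if not ok and v)
    return hits / n_correct, false_alarms / n_incorrect


def a_prime(h, f):
    """Nonparametric sensitivity A' (0.5 = chance, 1.0 = perfect)."""
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance case: mirror image of the formula above.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


def b_double_prime_d(h, f):
    """Nonparametric bias B''D; lower values mean a more liberal report policy.

    Undefined (division by zero) when h = 1 and f = 0, or h = 0 and f = 1.
    """
    miss_cr = (1 - h) * (1 - f)
    hit_fa = h * f
    return (miss_cr - hit_fa) / (miss_cr + hit_fa)


# Hypothetical data: correctness of six forced-report answers and whether
# each was volunteered in the free-report phase.
correct = [1, 1, 0, 1, 0, 0]
volunteered = [1, 1, 1, 0, 0, 0]

hit_rate, fa_rate = type2_rates(correct, volunteered)
monitoring = a_prime(hit_rate, fa_rate)
report_bias = b_double_prime_d(hit_rate, fa_rate)
```

Note that these measures use only the report decisions and the correctness of the answers; unlike the QAP measures, they require no confidence judgments, which is why they cannot separate over/underconfidence from the report criterion.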
In the following section, we review work that demonstrates how the general framework that is common to both of our approaches, as well as the more specific assessment methods, can be applied to a variety of research questions and domains.
III. Applications of the Framework
The quantity‐oriented research tradition has identified many important variables that strongly affect memory quantity performance. These include study time, massed versus spaced practice, test format (recall vs recognition), depth of encoding, list organization, encoding‐retrieval interactions, retention interval, and so forth. The accuracy‐oriented approach has focused on other factors such as those involved in producing misinformation effects, reconstructive errors, memory misattributions, and source confusions. Both
approaches have also examined individual and population differences. One advantage of our theoretical framework is that it facilitates the merging of issues and findings from the two research traditions, allowing one to examine the effects of various factors on memory performance from both a quantity‐oriented and an accuracy‐oriented perspective. A second advantage is that armed with the QAP (or Type‐2 SDT) methodology, one can not only determine whether a particular factor affects memory accuracy or quantity performance, but also shed light on the underlying processes that might mediate such effects. The application of this framework to examine how rememberers use metacognitive monitoring and control processes to regulate the accuracy and quantity of what they report from memory has yielded new insights with regard to several important memory topics and phenomena, such as (1) the effectiveness of different questioning and testing procedures in eliciting accurate memory reports, (2) the credibility of children’s witness testimony, (3) memory decline in old age, (4) cognitive and metacognitive impairments related to schizophrenia and psychoactive medication, (5) encoding–retrieval interactions and the encoding specificity principle, and (6) psychometric and scholastic testing. Each of these topics will be considered briefly in turn.
A. THE RECALL‐RECOGNITION PARADOX
A prominent variable in memory research is test format, recall versus recognition. This variable has been studied in both traditional, quantity‐oriented research and in more naturalistic, accuracy‐oriented research, with opposing implications: Whereas the general finding from decades of laboratory research (Brown, 1976) is that recognition testing is superior to recall testing in eliciting a greater quantity of correct information from memory, the established wisdom in eyewitness research is that recognition is inferior to recall in eliciting accurate information from rememberers (Hilgard & Loftus, 1979; Neisser, 1988). The latter position stems in part from the belief that directed questioning or recognition testing can have contaminating effects on memory (Brown, Deffenbacher, & Sturgill, 1977; Gorenstein & Ellsworth, 1980; Lipton, 1977). Thus, the general recommendation is to elicit information initially in a free‐narrative format before moving on to directed questioning, and, even then, to place greater faith in the former (see Fisher, Geiselman, & Raymond, 1987; Hilgard & Loftus, 1979). Koriat and Goldsmith (1994), however, showed that this recall‐recognition paradox actually stems from the common confounding in research practice between test format (recall vs recognition) and report option (free vs forced): Typically, in recognition testing, participants are forced either to choose between several alternatives or to make a yes–no decision regarding each
and every item (i.e., forced report), whereas in recall testing participants have the freedom to withhold information that they are unsure about (free report). Comparing performance on a free-recognition test (in which participants had the option to respond “don’t know” to individual items) to a free-recall test, Koriat and Goldsmith (1994) found that recognition quantity performance was still superior to recall, but now recognition accuracy was as high as or even higher than recall accuracy. Thus, although the superior memory quantity performance of forced-recognition over free-recall testing does appear to stem from the test-format difference (selection superior to production), the generally superior accuracy of free recall over forced recognition appears to stem entirely from report option (free superior to forced). These initial results were obtained using general knowledge questions, but the same pattern was also observed using a standard list-learning paradigm (Koriat & Goldsmith, 1994, Experiment 2), and in a developmental study, using more naturalistic episodic stimuli (Koriat, Goldsmith, Schneider, & Nakash-Dura, 2001). A subsequent examination of the underlying memory and metamemory components of recall and recognition performance using the QAP procedure (Koriat & Goldsmith, 1996b) indicated that although monitoring effectiveness was in fact somewhat lower for recognition than for recall testing, this disadvantage was more than compensated for by superior memory access and the adoption of a more conservative report criterion under recognition testing. Based on their results, Koriat and Goldsmith (1994, 1996b) concluded that free recognition may actually be a generally superior testing procedure compared to free recall because it elicits better quantity performance with no reduction in accuracy.
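The distinction between report option and the two performance measures can be made concrete with a minimal sketch. It assumes that each item has a best-candidate answer with an assessed probability of being correct, and that under free report an answer is volunteered only if that probability meets a report criterion. The names `Answer`, `forced_report`, and `free_report`, and all of the data values, are hypothetical illustrations and are not taken from the studies cited above.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    correct: bool      # whether the best-candidate answer is in fact correct
    confidence: float  # assessed probability correct, between 0 and 1


def forced_report(answers):
    """Every item is answered, so the two measures share one denominator."""
    n_correct = sum(a.correct for a in answers)
    quantity = n_correct / len(answers)  # input-bound: correct / items tested
    accuracy = n_correct / len(answers)  # output-bound: correct / items answered
    return quantity, accuracy


def free_report(answers, criterion):
    """Volunteer only answers whose confidence meets the report criterion."""
    volunteered = [a for a in answers if a.confidence >= criterion]
    n_correct = sum(a.correct for a in volunteered)
    quantity = n_correct / len(answers)  # input-bound
    accuracy = n_correct / len(volunteered) if volunteered else None  # output-bound
    return quantity, accuracy


# Hypothetical data: ten items, six of which have a correct best candidate.
answers = [
    Answer(True, 0.90), Answer(True, 0.80), Answer(True, 0.95), Answer(True, 0.70),
    Answer(True, 0.40), Answer(False, 0.30), Answer(False, 0.75), Answer(False, 0.20),
    Answer(True, 0.85), Answer(False, 0.50),
]

print(forced_report(answers))      # quantity and accuracy are both 0.6
print(free_report(answers, 0.65))  # quantity 0.5, accuracy 5/6
```

With these illustrative values, the option to withhold raises output-bound accuracy from 0.60 to about 0.83 while input-bound quantity falls from 0.60 to 0.50, which is the general shape of the quantity-accuracy trade-off discussed in the text.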
This conclusion, however, assumes that the option to withhold answers is emphasized by the questioner and clearly understood by the rememberer (Memon & Stevenage, 1996; Pansky, Koriat, & Goldsmith, 2005). In many cases, there may be implicit pressures to respond to directed or recognition queries that cause witnesses to lower their report criterion, even though ostensibly they are given the option to respond “don’t know” (see Section III.B).
B. CHILDREN’S EYEWITNESS TESTIMONY
The credibility of children’s memory, particularly with regard to legal testimony, has been studied intensively in recent years (Bruck & Ceci, 1999; Goodman, 2006). Because of the greater involvement of child witnesses in legal settings, it is important to know whether their recollections of an event can be trusted. Can children be counted on to give a complete and reliable account of past events (to tell the whole truth and nothing but the truth)? This question can be addressed in part in terms of strategic regulatory processes: Are children able to exploit the option of free report to enhance
the accuracy of what they report? Can we trust an 8-year-old child, for instance, to effectively censor what she reports, providing only those pieces of information that are likely to be correct? Will her performance be sensitive to specific incentives for accurate reporting? What will be the price in terms of memory quantity? Might differences in the ability and tendency to exert strategic control over memory reporting account for some of the inconsistency in developmental findings, noted earlier? Results from several studies suggest that children are particularly reluctant to say “don’t know” in response to memory questions (Cassel, Roebers, & Bjorklund, 1996; Mulder & Vrij, 1996; Roebers & Fernandez, 2002). Thus, children may be less able or less willing than adults to control their memory reporting on the basis of their subjective monitoring. One approach to correcting this problem is to instruct children in the “rules” of memory reporting. Mulder and Vrij (1996), for example, found that explicitly instructing children aged 4–10 that “I don’t know” is an acceptable answer significantly reduced the number of incorrect responses to misleading questions (i.e., questions about events that did not in fact occur). Moston (1987) also found that such instructions induced children aged 6–10 to make more “don’t know” responses, but in that study this had no effect on the overall proportion of correct responses.
On the other hand, several studies (Cassel et al., 1996; Koriat et al., 2001; Roebers & Fernandez, 2002) have found that children exhibit a greater tendency than adults to provide wrong information from memory even when they are reminded that they have the option to say “don’t know.” In our own study (Koriat et al., 2001), 8- to 12-year-olds were presented with a narrated slide show depicting an incident on the way to a family picnic, and their memory was later tested under free- or forced-report conditions using a recall or a multiple-choice recognition test format. The results yielded a pattern that was remarkably similar to that described earlier for adult participants. The children’s memory accuracy performance was better under free- than forced-report instructions, and the reverse was true for memory quantity performance. For example, in Experiment 1 of that study, memory accuracy increased from 68% under forced-report testing to 81% under free-report testing. In parallel, memory quantity decreased by 5 percentage points. This pattern of quantity-accuracy trade-off, similar to that found with adults (Koriat & Goldsmith, 1994, 1996b), was observed with both younger children (8- and 9-year-olds) and older children (11- and 12-year-olds). Also, for both age groups, memory accuracy was better under a strong accuracy incentive (88%) than under a moderate accuracy incentive (81%), but here too the improved accuracy was achieved at the expense of quantity performance. Thus, children, even 8-year-olds, are capable of exercising the option of free report efficiently to increase the accuracy of their
report, and they do so in accordance with the operative level of accuracy incentive. The absolute levels of achieved accuracy, however, differed for the two age groups, with the older children producing more accurate memory reports than the younger children under both the moderate and the strong accuracy incentives, and under both recall and recognition testing. The main implication of this work is that strategic regulation of memory reporting is a critical factor that can, under the right conditions, allow children to enhance their memory accuracy considerably. Recent work has been directed toward clarifying precisely what those conditions are (Roebers & Schneider, 2005).
C. MEMORY IMPAIRMENT IN OLD AGE
Memory decline in old age is both ubiquitous and multifaceted. Here too, the distinction between memory quantity and memory accuracy is crucial. Most research has focused on the decline in the amount of information recalled in old age. Other research, however, indicates an impairment in memory accuracy, with older adults exhibiting greater vulnerability to memory errors and distortions that can have potentially serious consequences such as taking the same medicine twice (Koriat, Ben-Zur, & Sheffer, 1988) or being susceptible to scams and con artists (Jacoby & Rhodes, 2006). There are numerous references in the literature to possible links between the memory impairments associated with aging and those associated with specific neuropsychological deficits, particularly those characteristic of patients suffering frontal lobe lesions (Moscovitch & Winocur, 1995). Neuropsychological evidence suggests that the frontal lobes are at least partially involved in metamemory judgments (Janowsky, Shimamura, & Squire, 1989). If old age is associated with impairments in frontal lobe functioning, then we may expect age-related declines in metamemory processes. However, the pertinent results have been mixed and inconclusive (Hertzog & Dunlosky, 2004). Several studies have focused specifically on old-age-related aspects of metacognitive and neurocognitive functioning that contribute to the strategic regulation of memory performance. Most prominently, Kelley and colleagues (Kelley & Sahakyan, 2003; Rhodes & Kelley, 2005) have utilized our framework and the QAP methodology in conjunction with a clever associative interference paradigm taken from Kato (1985) to compare the strategic regulatory processes of younger and older adults.
Kelley and Sahakyan (2003, Experiment 1) found that for control word pairs (not expected to elicit associative interference), although forced‐report performance (quantity or accuracy) was superior for younger than for older participants, the older participants utilized the option to withhold answers to narrow the gap between their level of free‐report accuracy performance and that of the
younger participants. In contrast, for “deceptive” word pairs (in which the retrieval cues evoke a highly accessible associate that competes with the target, thereby presenting a tough challenge to memory monitoring), the age difference in accuracy performance became, if anything, somewhat larger under free report than under forced report. This interactive pattern was accounted for in terms of monitoring effectiveness: First, although both older and younger participants were highly overconfident in the correctness of their responses to the deceptive word pairs, the degree of overconfidence was more pronounced for the older participants. Second, the older participants exhibited lower levels of monitoring resolution for both deceptive and control word pairs. Additional experiments suggested that the impaired monitoring of the older participants derived from impoverished encoding: When the encoding of the younger participants was disrupted by having them study the word list under divided attention, they exhibited a pattern of performance that was very similar to that of the older participants in terms of both memory accuracy and memory monitoring. Thus, Kelley and Sahakyan suggested that older adults’ poorer memory monitoring may derive primarily from their increased reliance on familiarity of candidate responses rather than on recollection of details of the study experience (Jacoby, 1999; Jacoby, Debner, & Hay, 2001), which in turn may derive, at least in part, from poor encoding. A similar conclusion was reached by Rhodes and Kelley (2005), who used the same approach to investigate age differences in memory performance, but now tying these to neuropsychological measures of executive functioning (see also Butler, McDaniel, Dornburg, Price, & Roediger, 2004).
In their study, path analyses supported a model in which aging impairs executive functioning, which in turn impairs retention (forced-report performance, a product of both encoding quality and retrieval), which in turn impairs free-report memory accuracy, both directly and by way of impaired monitoring. Research conducted in our own laboratory (Pansky, Koriat, Goldsmith, & Pearlman, 2002), examining age differences in memory for a short narrated slide show, also indicated an old-age decline in monitoring effectiveness and free-report memory accuracy, and this pattern too was mimicked by a separate group of young adults who watched the slide show under divided attention. In addition, however, we found an interesting age difference in control sensitivity that could not be explained in terms of impaired encoding: Compared to the younger adults in both encoding conditions, the older participants relied less heavily on their confidence judgments in deciding which answers to volunteer and which to withhold under free-report conditions. Moreover, across both age groups, control sensitivity was highly correlated with two measures of executive functioning (with the age factor partialed out): perseverative errors on the Wisconsin Card Sorting Task (r = -.67) and the
FAS word fluency test (r = .46). These results may perhaps be related to findings implying a breakdown in the relationship between monitoring and control in certain clinical contexts (see Section III.D). This work, together with that of Kelley and colleagues just discussed, points to the need for further investigation of the role of metacognitive monitoring and control processes, and executive functioning in mediating age differences in memory accuracy performance.
D. CLINICAL MEMORY IMPAIRMENT
Metacognitive processes underlying the strategic regulation of performance are also gaining increased attention in research on schizophrenia and on the effects of psychoactive medications. In a series of studies, Koren and colleagues (Koren et al., 2004, 2005; Koren, Seidman, Goldsmith, & Harvey, 2006) adapted our metacognitive framework and QAP methodology to examine the performance of first-episode schizophrenic patients on a metacognitive free-report version of the Wisconsin Card Sorting Task. In that adaptation, patients rated their confidence in each sort and decided whether they wanted that sort to count toward their performance payoff. Koren and colleagues found that several of the metacognitive measures from the adapted task correlated more strongly with clinical measures relevant to real-world functioning (“insight into illness” and “competence to consent to treatment”) than did traditional neuropsychological measures. Most prominent was control sensitivity, which was more highly correlated with the clinical measures of insight than any of the standard neuropsychological measures that were examined (Koren et al., 2004). With regard to the strategic regulation of memory reporting, several other results converge in suggesting impaired metacognitive processes in schizophrenia. First, Moritz and colleagues (Moritz & Woodward, 2006; Moritz, Woodward, & Chen, 2006) have observed that even when memory performance is equated, schizophrenic patients exhibit inferior monitoring resolution (“knowledge corruption”), characterized by high confidence in commission errors, compared to healthy controls and to other clinical populations. Second, Danion, Gokalsing, Robert, Massin-Krauss, and Bacon (2001) have used our QAP procedure to compare schizophrenic patients with healthy controls on semantic (general knowledge) memory tasks.
In addition to a general deficit in monitoring resolution and calibration (overconfidence), the schizophrenic patients exhibited relatively low control sensitivity (gamma averaging .83 for the clinical patients vs .94 for the healthy controls). Interestingly, similar effects have been found for lorazepam in studies using healthy participants (Massin-Krauss, Bacon, & Danion, 2002). Taken together, these various lines of research indicate two general
themes that deserve further attention in future research: (1) a monitoring deficit in which schizophrenic patients are highly confident in wrong responses, and (2) a control deficit, suggestive of an impaired relationship between subjective experience and behavior.
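The control sensitivity values reported above are gamma correlations. As a minimal sketch, assuming control sensitivity is indexed by the Goodman-Kruskal gamma between each answer's confidence rating and the decision to volunteer it (1) or withhold it (0), the computation might look as follows. The function name and the data are hypothetical illustrations, not values from the cited studies.

```python
from itertools import combinations


def goodman_kruskal_gamma(xs, ys):
    """Return (C - D) / (C + D) over all item pairs, ignoring tied pairs.

    C counts concordant pairs (both variables order the two items the same
    way) and D counts discordant pairs (the variables order them oppositely).
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None  # undefined when every pair is tied on at least one variable
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical rememberer: one low-confidence answer (0.50) is volunteered
# while a higher-confidence answer (0.60) is withheld.
confidence = [0.95, 0.90, 0.80, 0.60, 0.50, 0.30, 0.20]
volunteered = [1, 1, 1, 0, 1, 0, 0]

print(goodman_kruskal_gamma(confidence, volunteered))  # 10/12, about 0.83
```

A rememberer whose volunteering decisions followed confidence perfectly would obtain a gamma of 1.0. The same function could in principle be applied to confidence and correctness to index monitoring resolution, under the assumption that resolution is likewise measured with gamma.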
E. ENCODING SPECIFICITY AND MEMORY CUEING
One of the most basic themes of memory research concerns the critical role of retrieval cues and retrieval-encoding interactions in affecting remembering (Koriat et al., in press). A case in point is the encoding-specificity principle (Tulving & Thomson, 1973), which states that a cue presented during testing will be effective in aiding retrieval to the extent that it has been encoded together with the solicited memory target at study. A large amount of research has provided evidence for this principle (Tulving, 1983) and for the more general idea that retrieval efficiency depends on the extent to which the testing conditions reinstate the overall conditions of study (Schacter, 1990). Almost all previous research has evaluated the encoding-specificity principle using input-bound memory quantity performance as the criterion (Fisher & Craik, 1977; Morris, Bransford, & Franks, 1977). In contrast, little attention has been paid to the potential effects of encoding-context reinstatement on output-bound memory accuracy performance. Recent work, however, points to the role of metacognitive processes in mediating the effects of context reinstatement on both memory quantity and accuracy performance. In attempting to clarify the source of context-reinstatement effects, Higham (2002) applied his Type-2 SDT variant of the QAP procedure (see Section II.D) to examine performance in Thomson and Tulving’s classic paradigm (Thomson & Tulving, 1970). He replicated the classic finding of superior free-report quantity performance for weak reinstated retrieval cues compared to strong extra-list (unreinstated) cues. In examining the underlying source of this difference, however, he found (somewhat surprisingly) that it was mediated entirely by monitoring effectiveness. Indeed, although retrieval, as indexed by forced-report performance, was equivalent in the two conditions, monitoring effectiveness was much poorer for the strong extra-list cues.
For these cues, the participants often failed to recognize the targets that they produced, causing them to withhold these items in the free-report phase. A subsequent study, however, in which the cueing strength of the reinstated and extra-list cues was balanced (Higham & Tam, 2006, Experiment 3; see also Zeelenberg, 2005), indicated that the effects of context reinstatement on free-report quantity performance are mediated by both memory retrieval and memory monitoring.
In his studies, Higham did not examine the effects of context reinstatement on memory accuracy. Some insight into these effects and how they are mediated can be gained from an unpublished study by Rosenbluth-Mor (2001). In an adaptation of Tulving and Osler’s classic study (Tulving & Osler, 1968), she found that reinstating a weak-associate studied cue at retrieval increased free-report memory quantity performance compared to a baseline (no-retrieval-cue) condition, but had no effect on output-bound memory accuracy. In contrast, providing an extra-list retrieval cue with the same (weak) associative strength to target as the study cue impaired both memory quantity and memory accuracy performance compared to the baseline (no-retrieval-cue) condition. Although preliminary, this pattern suggests that in comparing the reinstated and extra-list cueing conditions, it is not the match between retrieval and study cues that enhances output-bound memory accuracy but rather the mismatch between these cues that impairs accuracy. With regard to the underlying QAP components, the pattern for forced-report performance mirrored the pattern for free-report quantity performance, suggesting that the cueing effects (both positive and negative) on report quantity were mediated by memory retrieval. At the same time, the pattern for monitoring effectiveness mirrored the pattern for free-report accuracy performance (no monitoring advantage from reinstated cues, but inferior monitoring from extra-list cues), suggesting that the negative effect on report accuracy was mediated by memory monitoring. More specifically, extra-list retrieval cues induced participants to generate a relatively large number of wrong candidate answers with intermediate levels of confidence, compared to the no-retrieval-cue condition, in which the distribution of confidence judgments was more polarized (either one produced the target or else nothing plausible came to mind; cf. Fig. 2).
The preceding work demonstrates how the examination of metacognitive monitoring and control processes can shed new light on the factors affecting both memory quantity and memory accuracy performance in standard, laboratory tasks. Cue reinstatement (encoding specificity) is just one of many classic manipulations and phenomena that might be examined in the context of our metacognitive framework.
F. PSYCHOMETRIC TESTING
Many of the standard psychometric tests of intelligence and scholastic aptitude [e.g., the Scholastic Aptitude Test (SAT) and the Graduate Record Examination (GRE) subject tests] use a multiple‐choice format in conjunction with formula‐scoring procedures (Thurstone, 1919) that are designed to discourage guessing and also to correct for it by levying a penalty for incorrect answers, but
not for omissions. In fact, the goal of formula scoring is to achieve an estimate of the test-taker’s actual knowledge or ability that is “cleansed” from the contribution of guessing (Cronbach, 1984; cf. our earlier discussion of SDT methods). Yet, the penalty for incorrect answers, combined with the option to refrain from answering, effectively puts the test-taker in the position of having to strategically regulate his or her reporting in light of a quantity-accuracy trade-off. Indeed, it is not always clear to test administrators that performance on such tests also taps metacognitive ability, that is, the ability to make effective decisions about whether to risk providing an answer to a question or instead to omit (Budescu & Bar-Hillel, 1993; Koriat & Goldsmith, 1998). Thus, for instance, one test-taker may tend to guess on the basis of even a small amount of partial knowledge, while another may prefer not to provide any answer about which she is unsure (Abu-Sayf, 1979; Gafni, 1990). One test-taker may be effective in distinguishing between answers that are more likely or less likely to be correct, whereas another test-taker may be less effective in discriminating between what she “knows” and what she does not know (Angoff, 1989; Budescu & Bar-Hillel, 1993). Clearly, then, formula scoring is not achieving its intended goal (Albanese, 1988; Angoff & Schrader, 1984; Budescu & Bar-Hillel, 1993; Cross & Frary, 1977; Frary, 1980; Higham, 2007; Slakter, 1968). Of course, as in the other domains just considered, the fundamental question is not how to get rid of metacognitive contributions to test performance but rather whether we can gain some useful information about the person’s abilities from the systematic measurement and analysis of these contributions. Certainly, metacognitive incompetence can have serious consequences.
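The decision problem that formula scoring poses for the test-taker can be sketched in a few lines. The sketch assumes the conventional correction-for-guessing rule, in which the score is the number right minus the number wrong divided by k − 1 for items with k response options, and omissions score zero. Whether a particular test uses exactly this penalty is an assumption here, and the function names are hypothetical.

```python
def formula_score(n_right, n_wrong, n_options):
    """Conventional correction for guessing: R - W / (k - 1); omissions add 0."""
    return n_right - n_wrong / (n_options - 1)


def expected_gain(p_correct, n_options):
    """Expected change in score from answering one item rather than omitting it."""
    return p_correct - (1 - p_correct) / (n_options - 1)


def should_answer(p_correct, n_options):
    """Score-maximizing control policy under this rule: answer only when the
    expected gain is positive, that is, when p_correct exceeds 1 / k."""
    return expected_gain(p_correct, n_options) > 0


# Hypothetical five-option test: 30 right, 8 wrong, the rest omitted.
print(formula_score(30, 8, 5))   # 30 - 8/4 = 28.0
print(should_answer(0.50, 5))    # True: confidence well above chance (0.20)
print(should_answer(0.10, 5))    # False: confidence below chance
```

Under this rule a purely random guess has an expected gain of zero, so a test-taker whose assessed probabilities are well calibrated gains by answering whenever confidence exceeds chance. A test-taker who omits at higher confidence levels, or whose confidence discriminates poorly between right and wrong answers, will obtain a lower score for the same underlying knowledge.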
Would we want to certify (or hire the services of) a doctor, lawyer, accountant, psychologist, or engineer who is deficient in discriminating between what she knows and what she does not know (Dunning, Heath, & Suls, 2004), or who, for example, prescribes treatments regardless of whether she is confident of her diagnosis? Would it not be appropriate, then, to include the ability to monitor one’s own knowledge and control one’s behavior accordingly among those aspects of the examinee’s aptitude or achievement that the test is intended to evaluate? Here too, the QAP and Type-2 SDT approaches may allow one to incorporate these components into the psychometric assessment procedure and measure them. Higham (2007), in fact, has applied his Type-2 SDT approach to the analysis and measurement of the strategic regulation of performance in SAT test taking under formula scoring, with interesting results. In parallel, Notea-Koren (2006) has applied our QAP procedure in the same general context (multiple-choice aptitude test taking) with similar goals and findings. Both studies indicate that the scores of test-takers under formula scoring are affected by the control policy that they adopt and by their level of monitoring effectiveness. In both studies, the test-takers’ actual control policies were
measured (estimated) and found to differ from an optimal control policy that would maximize their score given their specific level of cognitive performance and monitoring effectiveness (cf. Fig. 7). In addition, results from the Notea-Koren (2006) study show that a component measure of metacognitive ability, monitoring resolution, can contribute unique variance in predicting first-year university grades, beyond the predictive power of the free-report formula score (or the forced-report performance score) alone. These studies, together with those in the preceding sections, illustrate just a few of the potential domains to which our metacognitive framework for the control of report option, and the QAP assessment methodology, can be extended and applied. In Section IV, we present a further important direction in which the theoretical framework itself has been extended.
IV. Expanding the Framework: Control of Memory Grain Size
The theoretical and empirical work considered so far has focused on how people regulate their memory performance when given the option to withhold individual items of information or entire answers about which they are unsure. Control of report option, however, is just one means by which people can regulate their memory reporting. Indeed, in most real-life memory situations, people do not just have the choice of either volunteering a substantive answer or else responding “I don’t know.” They also have the option of controlling the “graininess” or level of precision or coarseness of the information that they provide (e.g., describing the assailant’s height as “around 6 feet” or “fairly tall” rather than “5 feet 11 inches”). To illustrate, consider a study reported by Neisser (1988), who tested students’ memory for events related to a seminar that he taught, using either an open-ended recall format or a forced-choice recognition format. He found the recall format to yield more accurate remembering than the recognition format and noted that this might come as a surprise to memory researchers who are accustomed to the general superiority of recognition testing over recall testing. As discussed earlier (Section III.A), such a finding can perhaps be explained by the effects of report option. Neisser, however, also pointed out a further consideration: Whereas in the recognition format, participants had to make relatively fine discriminations between correct and incorrect response alternatives, in the recall format they seemed to choose “a level of generality at which they were not mistaken” (1988; p. 553). Along similar lines, Fisher (1996), in assessing participants’ freely reported recollections of a filmed robbery, was surprised to find that both quantity performance (number of correct statements) and accuracy performance (output-bound proportion of correct statements) remained constant between two
retention intervals across a 40-day span. The anomaly was resolved by considering the grain size of the reported information: Statements made after 40 days contained information that was substantially more coarse (as rated by two independent judges) than the information contained in the earlier statements. Clearly, then, when rememberers control their own memory reporting, differences in the grain size of the reported information can pose a troubling methodological problem. Here, too, the traditional remedy has been to take control away from the participant, for instance, by using recognition testing or by using stimulus materials, such as word lists, that limit the scope of the problem. Like report option, however, control over grain size is more than just a mere methodological nuisance that needs to be circumvented or corrected for. In most real-life memory situations, it too constitutes an important means by which rememberers regulate the accuracy of their memory reporting and, as such, is an integral aspect of the process of remembering. The challenge is to find a way to systematically investigate this type of control as well. The approach we chose is similar to the one we used for report option and, in fact, assumes a close relationship between these two types of control.
A. ACCURACY-INFORMATIVENESS TRADE-OFF
Consider a situation in which a witness is asked to answer a set of questions that have to do with quantitative values such as the time of an accident, the speed of a car, the height of an assailant, and so forth.⁸ If the witness is forced to answer each question at a specified grain size (to the nearest minute, mile per hour, inch, and so forth), then the accuracy of those answers may be quite poor. However, even though the witness may not remember, say, that the accident occurred precisely at 6:13 pm, she may be able to report that it occurred between 6:00 and 6:30 pm, or perhaps, in the early evening. What, then, will happen if the witness herself is allowed to choose the grain size for her answers? Will she be able to exploit this option in an effective manner, increasing the (output-bound) accuracy of her memory report? On what basis will she choose an appropriate grain size for her answers? The considerations and mechanisms underlying the choice of grain size in memory reporting appear to be similar to, though somewhat more complex than, those underlying the exercise of report option.
⁸ It is methodologically convenient to operationalize grain size in terms of the range or interval width used in reporting quantitative information (Yaniv & Foster, 1995, 1997). We assume that other forms of control over grain size (e.g., vague linguistic qualifiers, “reddish” vs “red”) should operate according to similar principles, and in fact recent evidence indicates that they do (Weber & Brewer, in press).
Let us return to the earlier example of a witness who wants to fulfill her vow to “tell the whole truth and nothing but the truth.” How should she proceed? On the one hand, a very coarsely grained response (e.g., “between noon and midnight”) will always be the wiser choice if accuracy (i.e., the probability of including the true value, thereby telling nothing but the truth) is the sole consideration. However, such a response may not be very informative, falling short of the goal to tell the whole truth. On the other hand, whereas a very fine-grained answer (e.g., 5:23 pm) would be much more informative, it is also much more likely to be wrong. A similar conflict is often faced by students taking open-ended essay exams: Should one attempt to provide a very precise-informative answer, but risk being wrong, or try to “hedge one’s bet” by providing a coarser, less informative answer, and risk being penalized for vagueness? In both of these examples, control over grain size can be seen to involve an accuracy-informativeness trade-off similar to the accuracy-quantity trade-off observed with regard to the control of report option. This idea of an accuracy-informativeness trade-off was brought out nicely by Yaniv and Foster (1995, 1997) in the context of judgment and decision making. They showed that when people are asked to give quantitative estimates for the purpose of decision making, they tend to consider the recipient’s desire to obtain a useful response (cf. Grice, 1975), and often sacrifice accuracy for informativeness. Recipients of information generally require estimates that are both sufficiently informative for their current needs and appropriately accurate. For example, information that the inflation rate will be “between 0% and 80%” in the coming year will not be appreciated by the recipient, although it is likely to be correct.
In fact, Yaniv and Foster (1995) found that to some extent, recipients of information actually prefer a somewhat inaccurate but precise-informative estimate (e.g., that the inflation rate will be “5–6%” when it turns out to be 7%) to an overly coarse, uninformative estimate that is “technically” correct.
B. SATISFICING VERSUS UTILITY-MAXIMIZING MODELS OF GRAIN CONTROL
How does one find an appropriate compromise between accuracy and informativeness in choosing a grain size for his or her answers? One simple strategy that we considered is to provide the most finely grained (precise) answer that passes some preset report criterion (in terms of assessed probability correct). Thus, for example, our earlier witness might try to answer the question to the nearest minute, to the nearest 5 minutes, 10 minutes, 15 minutes, and so forth, until she is, say, at least 90% sure that the specified answer is correct. Goldsmith et al. (2002) called this the satisficing model (cf. Simon, 1956)
of the control of grain size: The rememberer strives to provide as much information as possible, as long as its assessed probability of being correct satisfies some reasonable minimum level. Note that this model is similar to the one presented earlier with regard to report option: As with report option, the assessed probability correct of each answer that is volunteered must pass a report criterion, and the setting of the criterion level should depend on the relative incentives for accuracy and informativeness in each particular situation. A more complex, alternative model was also examined. According to the relative expected-utility maximizing model, rememberers monitor in parallel the likely correctness (assessed probability correct) of candidate answers at various grain sizes, and evaluate the informativeness (subjective value or utility) of the answer at each grain size. Combining the outputs of these two operations, they then calculate the subjective expected value or utility of the answer at each grain size (e.g., assessed probability correct × subjective value or utility), compare these values, and choose the answer that maximizes the subjective expected value or utility. Such a relative comparison process, while aiming for a more optimal grain-choice solution than the satisficing model, would seemingly place a much heavier cognitive and metacognitive burden on the rememberer than does the satisficing model.
Before turning to the empirical evidence with respect to these models for the control of grain size, note that although they differ in their specifics, they share a common conception of the choice of grain size as being based on two metacognitive processes: (a) a monitoring process that assesses the probability that answers at different grain sizes are correct, and (b) a control process that uses the monitoring output, together with other information (e.g., the perceived informativeness of the answers, and/or the relative incentives for accuracy and informativeness) in order to decide on the appropriate grain size for a particular answer.
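The contrast between the two models can be made concrete with a small sketch. It is an illustration only, not the procedure used in the studies discussed here: it assumes that the rememberer's belief about the true value is normally distributed around her best guess (so that the assessed probability correct of an interval follows from its half-width), and the function names, the candidate grain sizes, and the utility function `1 / sqrt(width)` are all hypothetical choices made for the example.

```python
from math import erf, sqrt


def assessed_probability(half_width, belief_sd):
    """Probability that best guess +/- half_width contains the true value,
    ASSUMING a normal subjective belief with standard deviation belief_sd
    (an illustrative assumption, not part of the original models)."""
    return erf(half_width / (belief_sd * sqrt(2)))


def satisficing_choice(half_widths, belief_sd, report_criterion):
    """Satisficing model: return the finest grain whose assessed probability
    correct passes the report criterion, or None if no candidate passes."""
    for w in sorted(half_widths):  # finest grain first
        if assessed_probability(w, belief_sd) >= report_criterion:
            return w
    return None


def expected_utility_choice(half_widths, belief_sd, utility):
    """Expected-utility model: evaluate every grain and return the one that
    maximizes assessed probability correct x subjective utility."""
    return max(
        half_widths,
        key=lambda w: assessed_probability(w, belief_sd) * utility(w),
    )


# Hypothetical witness: candidate half-widths in minutes, belief sd of 10 min.
grains = [1, 5, 10, 15, 30, 60]
chosen_satisficing = satisficing_choice(grains, belief_sd=10, report_criterion=0.90)
chosen_max_utility = expected_utility_choice(
    grains, belief_sd=10, utility=lambda w: 1 / sqrt(w)  # hypothetical utility
)
```

Note that the satisficing rule stops at the first grain that passes the criterion, whereas the expected-utility rule must evaluate every candidate before choosing, which is one way of seeing the heavier processing burden attributed to it above.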
C. EMPIRICAL EVIDENCE
Goldsmith et al. (2002) conducted a systematic study of the control of grain size in reporting from semantic memory. The main goal of that study was to determine whether the general metacognitive framework of monitoring and control that had been developed earlier to address the control of report option would be useful in studying the control of grain size as well. In that study, participants answered a set of general knowledge questions, all of which related to quantitative-numeric information: time, date, age, distance, speed, and so forth. The questions were presented in two phases: In the first phase, participants gave their best answer to each item using two different
bounded intervals (grain sizes), the widths of which were specified by the experimenter. For example, “When did Boris Becker last win the Wimbledon men’s tennis finals? (A) Provide a 3-year interval; (B) Provide a 10-year interval.” The two grain sizes were tailored for each item such that the coarse-grained answer specified a relatively wide interval that would yield a mean proportion correct of about 75%, whereas the fine-grained answer specified a narrower interval (or in some cases a specific value, e.g., year) that would yield a mean proportion correct of about 30%. In the critical second phase, the participants went over their answers, and for each item, indicated which of the two answers (i.e., which of the two grain sizes) they would prefer to provide, assuming that they were “an expert witness testifying before a government committee.” In Experiment 1 of the study, participants chose to provide the fine-grained answer in about 40% of the cases, implying that the choice of grain level was not guided solely by the desire to be correct (in which case they would have always chosen the coarse-grained answer), nor solely by the desire to be informative (in which case they would have always chosen the more precise, fine-grained answer). Instead, the participants tended to choose the coarse-grained answer when the more precise answer was deemed too unreliable: Answers that the participants chose to provide at the fine-grained level had a relatively high (about 50%) chance of being correct, whereas the fine-grained answers that they would have provided, had they not chosen to provide the coarse-grained answer instead, had a relatively low (about 20%) chance of being correct. Moreover, by sacrificing informativeness in this strategic manner, the participants improved their overall accuracy substantially (to about 60%) compared to what they would have achieved by providing the fine-grained answers throughout (about 30%).
The maximum accuracy that could have been achieved by providing only coarse-grained answers was somewhat higher (about 75%). In Experiment 2 of the study, the collection of confidence judgments (assessed probability correct) for the answers at each grain size in the first phase of the design helped shed light on the monitoring and control processes underlying the choice of grain size. First, with regard to monitoring, the participants were fairly successful in discriminating between correct and incorrect answers at each grain size, with moderately high within-participant gamma correlations between confidence and correctness of each answer (averaging about .50) for both the fine-grained and the coarse-grained answers. Second, with regard to control, there was a strong relationship between confidence in the fine-grained answer and choice of grain size, with within-participant gamma correlations between confidence in the fine-grained answer and the choice to provide that answer (rather than the coarse-grained answer) averaging about .80.
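For readers unfamiliar with the gamma correlations reported above, the following sketch shows how a Goodman-Kruskal gamma can be computed for a single participant. The function name and the six data points are hypothetical and serve only to illustrate the index; they are not data from the study.

```python
from itertools import combinations


def goodman_kruskal_gamma(xs, ys):
    """Gamma = (C - D) / (C + D), where C and D are the numbers of concordant
    and discordant item pairs; pairs tied on either variable are ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None  # undefined when every pair is tied
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical participant: confidence in each answer and its correctness (1/0).
confidence = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]
correct = [1, 1, 0, 1, 0, 0]
gamma = goodman_kruskal_gamma(confidence, correct)  # 8 concordant, 1 discordant
```

The same function applies to the control analysis if `ys` codes the grain choice (1 = fine-grained answer provided, 0 = coarse-grained answer provided) instead of correctness.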
In order to gain more insight into the process underlying the regulation of grain size, we conducted some further analyses. According to the satisficing model, participants should provide the fine-grained answer if its assessed probability passes the report criterion; otherwise they should give the coarse-grained answer. Therefore, confidence in the fine-grained answer should be the primary predictor of grain choice, whereas confidence in the coarse-grained answer should be irrelevant. In contrast, according to the expected-utility maximizing model, both of these should contribute to the grain choice. In particular, because the expected subjective utility of providing the coarse-grained answer increases as confidence (assessed probability correct) in that answer increases, all else equal, there should be a positive relationship between confidence in the coarse-grained answer and the tendency to provide that answer (rather than provide the fine-grained answer). The results of several multiple (logistic) regression analyses clearly favored the satisficing model: When both confidence in the fine-grained answer and confidence in the coarse-grained answer were included as joint predictors of the choice of grain size, confidence in the fine-grained answer was the primary predictor, with a standardized regression coefficient over three times as large as the coefficient for coarse-grained answer confidence. Moreover, the sign of the coefficient for coarse-answer confidence was in the opposite direction from what would be predicted by the expected-utility maximizing model: Holding confidence in the fine-grained answer constant, confidence in the coarse-grained answer showed a weak but significant negative relationship to choice of the coarse-grained answer. Finally, in a third experiment, the setting of the report criterion was shown to be strategic.
In that experiment, the relative weight assigned to informativeness versus accuracy was manipulated by introducing explicit monetary incentives for correct answers at the two grain sizes in the second phase: A higher bonus was paid for correct fine‐grained answers than for correct coarse‐grained answers, and this ratio was 2:1 for half of the items (weak informativeness incentive) and 5:1 for the other half (strong informativeness incentive), counterbalanced across participants. As predicted, more fine‐grained answers were provided in the strong informativeness‐incentive condition than in the weak incentive condition, decreasing the accuracy of those answers, as well as the average confidence in those answers. The results involving confidence were again consistent with the satisficing model and inconsistent with the expected‐utility maximizing model. Using a procedure similar to the one described earlier as part of the QAP methodology, the report criterion set by each participant for providing fine‐grained answers was estimated, with the mean estimated criterion significantly more
liberal in the strong informativeness-incentive condition (.58) than in the weak-incentive condition (.74).
D. CONTROL OF GRAIN SIZE IN EPISODIC MEMORY REPORTING OVER TIME
As in the case of report option, a consideration of the control of grain size in memory reporting has begun to shed light on other memory phenomena and issues. One example is the potential role of control over grain size in modulating the changes that occur in memory over time. Goldsmith et al. (2005) examined the regulation of report grain size over different retention intervals. Starting from the well-known finding that people often remember the gist of an event though they have forgotten its details (e.g., Kintsch, Welsch, Schmalhofer, & Zimny, 1990; Koriat, Levy-Sadot, Edry, & de Marcas, 2003), they asked whether rememberers might exploit the differential forgetting rates of coarse and precise information in regulating the accuracy of the information that they report over time. Consider Neisser’s and Fisher’s anecdotal observations regarding the control of grain size, mentioned earlier (Fisher, 1996; Neisser, 1988). The general hypothesis implied by these observations is that in recalling episodic information from memory, rememberers may choose to provide more coarsely grained answers as the retention interval increases, thereby maintaining a reasonably high and stable level of report accuracy over time, but at the expense of providing less precise-detailed information. This hypothesis is consistent with findings indicating that detailed information suffers a faster forgetting rate than coarse information (Kintsch et al., 1990; Koriat et al., 2003; Reyna & Kiernan, 1994), and findings from recognition-memory research that memory responses may be strategically based on more coarse levels of representation when the detailed information becomes harder to access (Anderson, Budiu, & Reder, 2001; Brainerd, Wright, Reyna, & Payne, 2002; Koutstaal, 2006). In order to test this hypothesis, Goldsmith et al.
(2005, Experiment 1) had participants read a short transcript describing fictitious events surrounding a bar‐room argument, and tested their memory for these events either immediately, or after a one‐day or one‐week retention interval. The experimental paradigm was similar to that used in Goldsmith et al. (2002) except for the episodic nature of the memory material. Again, the test questions pertained to various items of quantitative information (heights, weights, ages of the characters, times of day, distances, etc.), and were initially answered at two fixed grain sizes, one precise (a specific value) and the other coarse (a bounded interval). In the second, grain‐choice phase, the participants
were instructed to choose for each item, the answer (precise or coarse) that would “help the investigator [who lost the original transcript] reproduce the facts of the case.” The main results of this experiment are reproduced in Fig. 8. With regard to Phase 1 performance (solid lines), the accuracy of the participants’ answers declined significantly over the one-week retention interval, particularly in the first 24 h (between immediate and one-day testing). Consistent with previous findings in the (recognition) literature, the rate of decline was somewhat more shallow for the coarse answers than for the fine-grained (precise) answers, with coarse-grain accuracy substantially higher than precise-grain accuracy across all three retention intervals. More interesting are the results from the second phase (dotted line), in which participants could choose which grain size to provide for each item. As predicted, the tendency to prefer the coarse-grained answer increased with retention interval (from 43% at immediate
[Figure 8: line plot of Accuracy (%) against Retention interval (Immed, Day, Week), with separate curves for Coarse answers (phase 1), Precise answers (phase 1), and Chosen-grain answers (phase 2). Indexes shown in the plot: Immed: Mon. = .78, Prc = .80, Coarse = 43%; Day: Mon. = .65, Prc = .76, Coarse = 61%; Week: Mon. = .57, Prc = .83, Coarse = 75%.]
Fig. 8. Forgetting curves showing actual memory accuracy performance (mean percent correct) as a function of retention interval for the participants in Goldsmith et al. (2005, Experiment 1) plotted separately for precise-grain answers (Test phase 1), coarse-grain answers (Test phase 1), and “chosen-grain” answers (Test phase 2) for which grain size was under the control of the participant. Three further indexes are presented for each retention interval: Mon. = monitoring effectiveness, in terms of mean gamma correlation between confidence in the precise answer and its actual correctness; Prc = mean grain-report criterion, estimated by examining the relationship between confidence in the precise answer and grain choice; Coarse = percentage of answers chosen at the coarse grain level in Phase 2 (as an index of reduction in informativeness). The error bars represent 95% confidence intervals.
testing to 75% after one week). This shift allowed rememberers to maintain a higher and more stable level of report accuracy than what they would have achieved had they been required to provide only precise answers. Of course, the increased accuracy was again “purchased” at the cost of reduced informativeness of the answers that were provided at the longer retention intervals. Insight into the metacognitive mechanisms was gained by examining the confidence data. As in the earlier study, there was a strong relationship between confidence in the fine-grained answer and the grain-control decision (mean gamma = .85, across the retention intervals). Moreover, a simple satisficing model was again supported by the data: Multiple regression analyses that included both confidence in the fine-grained answer and confidence in the coarse-grained answer as predictors of the grain decision yielded no added contribution of confidence in the coarse-grained answer, beyond what could be accounted for by confidence in the fine-grained answer alone. Interestingly, the estimated report criterion set by the participants was equivalent at the three retention intervals, averaging around .80. This suggests that the participants were aiming to achieve approximately the same level of accuracy (80% or higher), regardless of the retention interval. Yet, the level of accuracy actually achieved was much lower than 80% (particularly at delayed testing), and accuracy did not remain stable between immediate and one-day testing. This discrepancy appears to stem from the participants’ imperfect level of monitoring effectiveness, and from a decline in the level of that effectiveness over time which mirrors the pattern of decline in report accuracy: Overconfidence, in terms of the difference between mean assessed probability correct of the fine-grained answers and actual proportion correct, averaged .16 at immediate testing and .32 at one-day and one-week testing.
A similar pattern of decrement over time was found for monitoring resolution (see Fig. 8). At the same time, the stability of accuracy between one-day and one-week testing appears to derive from (a) the adoption of a constant grain-control policy (report criterion) across this interval, and (b) the stability of monitoring effectiveness (in terms of both calibration and resolution) across this interval. These results further demonstrate the critical contribution of metacognitive monitoring and control processes to memory performance. The shift in the preferred grain size with retention interval suggests an additional means by which rememberers can compensate for their failing memory: By regulating the coarseness of their answers, rememberers can maintain relatively high levels of memory accuracy despite increased forgetting. In a similar manner, perhaps rememberers can also regulate grain size to compensate for differences in other factors that affect memory, such as viewing conditions or being questioned about central versus peripheral details. Another important factor
to be examined is the regulation of grain size in old age: In light of the general finding of increased reliance on gist memory in old age (e.g., Earles, Kersten, Turner, & McMullen, 1999; Koutstaal & Schacter, 1997), control over grain size in memory reporting could be a potent tool used by older rememberers to maintain their report accuracy in the face of declining memory for details. The implications of control over grain size are, of course, especially pertinent to free-narrative and other types of open-ended memory testing procedures commonly used in naturalistic memory research. In fact, with regard to the effects of retention interval, it is remarkable that some of the studies that used such procedures (but not all of them) observed very high and stable levels of accuracy over retention intervals of up to several years (e.g., Ebbesen & Rienick, 1998; Flin, Boon, Knox, & Bull, 1992; Hudson & Fivush, 1991; Poole & White, 1991, 1993)! These levels of accuracy may have been achieved because the free-narrative format of memory report allowed participants both control over report option (withholding information that they are not sure about) and control over the grain size (choosing the level of precision or coarseness of the information that they reported).
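The calibration index used in the retention-interval analysis described earlier in this section (overconfidence, defined as mean assessed probability correct minus actual proportion correct) is simple to compute. The sketch below illustrates the definition with hypothetical values; the function name and the data are assumptions made for the example and are not taken from Goldsmith et al. (2005).

```python
from statistics import mean


def overconfidence(confidences, outcomes):
    """Mean assessed probability correct minus actual proportion correct.
    Positive values indicate overconfidence, negative values underconfidence."""
    return mean(confidences) - mean(outcomes)


# Hypothetical fine-grained answers: assessed probability and correctness (1/0).
confidences = [0.9, 0.8, 0.7, 0.6]
outcomes = [1, 1, 0, 0]
bias = overconfidence(confidences, outcomes)  # mean confidence .75, accuracy .50
```

Under a satisficing rule, a positive bias of this kind means that answers whose assessed probability just passes the report criterion will, on average, be correct less often than the criterion implies, which is consistent with the gap between the criterion of about .80 and the lower accuracy actually achieved.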
V. Toward an Integrated Model of Grain Size and Report Option
Indeed, in most real-life memory situations, rememberers have the freedom both to withhold particular items of information and to choose an appropriate grain size for the information that they do report. In the research described so far, we addressed each of these two types of control separately. Now, however, returning to our hypothetical courtroom witness, we ask, how would she manage the utilization of both types of control simultaneously? When would she choose to provide a coarse-grained answer and when would she choose to respond “don’t know”? In this section, we present work in progress that points toward some preliminary answers to these questions.
A. AN INTEGRATED SATISFICING MODEL
We begin with the observation that report option and grain size may be viewed as a continuum: Withholding an answer is informationally equivalent to providing an extremely coarse‐grained response that encompasses the entire range of possible values, so that it conveys no information at all about the solicited value. A simple model that builds on this idea is sketched in Fig. 9 (solid lines only). In this model, which is essentially a generalization of the satisficing model of control over grain size discussed earlier, the rememberer generates candidate answers to each question at various grain sizes, providing the most precise (informative) answer that passes a preset
[Figure 9: flowchart. Input query; Set Prc according to the informativeness incentive (Strong: Prc = 0.71*; Weak: Prc = 0.83*); Retrieve and monitor information; Sufficiently confident in PRECISE answer? [Pa ≥ Prc]: if yes, provide PRECISE answer; if no, Sufficiently confident in COARSE answer? [Pa ≥ Prc]: if yes, provide COARSE answer; if no, WITHHOLD the answer (“don’t know”). Dashed addition: Sufficiently INFORMATIVE? [INF ≥ INFc] (pragmatics, norms; Grice, 1975): if yes, provide COARSE answer; if no, WITHHOLD the answer.]
Fig. 9. Schematic outline of an integrated model of joint control over report option and grain size. See text for explanation. The Prc (report criterion) values designated by an asterisk are mean empirical Prc values obtained in unpublished new data. The dashed arrows designate a modification to our original model, which adds a minimum informativeness criterion (INFc) that must be passed in order that an answer is volunteered. INF, subjectively evaluated informativeness of the answer at a particular grain size.
confidence criterion. If even the most coarsely grained candidate fails to pass the criterion, the answer is withheld entirely. In an experiment designed to test this model, we used the same basic procedure as in our earlier semantic‐memory grain‐size study (Goldsmith et al., 2002), with the change that now in the second phase, the participants were allowed either to provide an answer at one of the two grain sizes, or to withhold the answer entirely. In analyzing the relationship between confidence in the answers and the choice of response (fine answer, coarse answer, or ‘‘don’t know’’), we found that participants used exactly the same report criterion for the control of grain size as they did for the report‐option decision: If confidence in the fine‐grained answer was less than .83 (mean estimated report criterion for control over grain size), participants preferred to provide the coarse‐grained answer. If, however, confidence in the coarse‐grained answer was also less than .83 (mean estimated report criterion for report option), the answer would be
withheld entirely. Interestingly, this pattern repeated itself in another condition in which the incentive for informativeness versus accuracy was increased by increasing the monetary bonus for correct answers at the fine grain size (as in Goldsmith et al., 2002, Experiment 3). A more liberal report criterion was adopted (.71), which again was identical for the decision whether to provide the fine‐grained or coarse‐grained answer, and for the decision whether to provide the coarse‐grained answer or instead to withhold the answer entirely.
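The decision sequence of the integrated satisficing model with two grain sizes can be written out as a short sketch. The function and constant names are hypothetical; the two criterion values are the mean estimated report criteria mentioned above (.83 and .71), and the confidence values in the example are invented for illustration.

```python
def integrated_satisficing(conf_precise, conf_coarse, report_criterion):
    """Provide the most precise answer whose assessed probability correct
    passes the single report criterion; withhold if neither answer passes."""
    if conf_precise >= report_criterion:
        return "precise"
    if conf_coarse >= report_criterion:
        return "coarse"
    return "don't know"


# Mean estimated report criteria reported in the text for the two conditions.
WEAK_INCENTIVE_PRC = 0.83
STRONG_INCENTIVE_PRC = 0.71

# Hypothetical item: confidence .75 in the precise and .95 in the coarse answer.
weak_response = integrated_satisficing(0.75, 0.95, WEAK_INCENTIVE_PRC)
strong_response = integrated_satisficing(0.75, 0.95, STRONG_INCENTIVE_PRC)
```

In this example the same item is answered coarsely under the weak informativeness incentive and precisely under the strong incentive, because only the criterion changes between the two conditions.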
B. TO COARSEN OR WITHHOLD?
Although these data provide very nice support for the integrated satisficing model depicted in Fig. 9 [but see Weber & Brewer (in press) who obtained somewhat different estimates for the grain-choice and report-option criteria], admittedly, both the model and the experimental procedure are oversimplified: Clearly, in real-life control of grain size, rememberers are not confined to just two possible grain sizes. Instead, in principle, they have unlimited control over the grain size of their answers (see footnote 9), and hence can choose to provide as coarse an answer as is needed to reach the desired level of confidence. Why, then, would rememberers ever choose to utilize the “don’t know” option under such conditions? For example, if participants could be 90% sure that Neil Armstrong walked on the moon sometime between the year 1950 and 2007, would they not prefer to provide that answer, rather than respond “don’t know”? To examine this question, we ran a further experiment using the episodic memory materials and questions from our earlier study of grain control over time (Goldsmith et al., 2005), but this time participants could write down an interval of any size as their preferred answer, or they could respond “don’t know.” Thus, they were allowed complete control over the grain size of their answers (see also Goldsmith et al., 2005, Experiment 2), as well as the option of withholding the answer entirely. Under such conditions, the participants chose to utilize the “don’t know” option for an average of 18% of their responses (13% on immediate testing, 17% after one day, and 24% after one week). That is, in a substantial number of cases, the participants chose to refrain from providing any information at all, even though conceivably, they
Footnote 9: We acknowledge that in general, the set of candidate grain sizes that are actually considered may conform to natural linguistic units (e.g., year, decade, century) and grain sizes that cross natural linguistic boundaries may not be considered seriously by the rememberer (e.g., “between 1961 and 1971” would generally be a less natural response compared to “between 1960 and 1970” or “sometime in the 1960s”). This is an additional complication that deserves examination in future research.
could have provided at least some information in a very coarse answer that was likely to be correct (e.g., “Benny drank between 0 and 15 beers”). This interpretation could be wrong, however. Perhaps the participants chose to withhold answers only in those cases in which they could not provide any “information” even by choosing a very coarse answer. This might happen, for example, if in order to reach a reasonably high level of confidence, they would have to coarsen their answer so much that it would no longer yield any reduction in uncertainty about the solicited value beyond what could be inferred on the basis of general knowledge of the world (e.g., script knowledge) and common sense alone. To examine this possibility, we had the participants go back to each of their “don’t know” answers and now provide an interval answer that they were 100% sure was correct. When these 100% confidence intervals (CIs) were compared to those given by a group of control participants who read the stimulus story with all of the target quantitative information blacked out (preventing any episodic memory of the actual quantities), the CIs of the experimental participants were about half as wide (see footnote 10) as those provided by the unexposed control participants, even though the experimental participants had responded “don’t know” to these questions initially. These results indicate that the experimental participants chose the “don’t know” option even though they could have provided some useful information to an outsider (e.g., a police investigator) who had knowledge only of the general episode and not of the actual quantities.
Moreover, the answer provided by the participants in the second phase was not only subjectively informative, it was also objectively informative: The 100% CIs obtained for the “don’t know” items of the experimental participants were significantly more likely to contain the correct target value than were the 100% CIs obtained from a second group of control participants, who answered the same questions at the same interval widths used by the experimental participants, on the basis of common sense and script knowledge alone.
C. THE NEED FOR AN INFORMATIVENESS CRITERION
Footnote 10: In order to approximately equate the contribution of different items to the overall interval-width differences, the interval widths were normalized by the midpoint of the answers (normalized interval width = actual interval width/interval midpoint).
How do the preceding results bear on the idea of an integrated model of grain size and report option? It appears that what was lacking in the original satisficing model of grain control, and would be needed in a new integrated model, is a minimum informativeness criterion that must be satisfied by any reported answer, in addition to the minimum confidence (accuracy) criterion
that we have been focusing on exclusively in our work so far. This addition to the basic model is depicted schematically by the dashed lines in Fig. 9. According to social and pragmatic norms of communication, people are expected not only to be reasonably accurate in what they report, but also to be reasonably informative (Grice, 1975). We assume, then, that very coarse-grained answers, such as “Benny drank between 0 and 15 beers,” are generally avoided because they violate these norms of communication. These norms, together with more specific contextual considerations, presumably affect the setting of the minimum informativeness criterion. If one’s knowledge or memory is so poor that one can only reach a high level of confidence by providing such an answer, the normatively acceptable option (and one that does not violate the minimum informativeness criterion) is to refrain from providing an answer, responding instead, “don’t know.” Note that by this analysis, rememberers should make use of the “don’t know” option specifically when they feel that they are unable to provide an answer that is both sufficiently accurate (likely to be correct) and sufficiently informative. One can imagine real-life situations, however, in which there are implicit or explicit demands to provide a substantive answer (i.e., “don’t know” is not permitted). What should rememberers do in such situations when they find themselves unable to simultaneously satisfy the confidence and informativeness criteria? As part of a doctoral research project, currently underway, Ackerman and Goldsmith (2006) put forward a distinction between two knowledge states: satisficing and unsatisficing knowledge. Whereas in the former state one has sufficient knowledge (or memory) to support an answer that simultaneously satisfies the confidence and informativeness criteria, in the latter state one does not.
They proposed that in a state of unsatisficing knowledge, withholding of answers (‘‘don’t know’’ response) is the preferred way of resolving the criterion conflict, but when this option is denied, rememberers have no choice but to violate one or both of the two criteria. In that case, they should tend to sacrifice the confidence criterion rather than the informativeness criterion. This is because the penalty (e.g., ridicule) for providing an overly uninformative answer is often immediate, whereas the accuracy or inaccuracy of one’s answer is generally only evident at a later time, if at all. The results of several experiments are consistent with these ideas, yielding the following general conclusions: (1) Participants strive to satisfy a minimum informativeness criterion as well as a minimum confidence‐accuracy criterion. (2) Criterion conflicts are more likely to occur when knowledge (memory) is low. (3) When criterion conflicts occur, participants will often violate the confidence criterion in order to meet the informativeness criterion. (4) Given joint control over grain size and report option, participants tend to circumvent the criterion conflict when it occurs, by withholding the answers entirely.
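The dual-criterion account summarized above can be sketched as a decision rule. This is an illustrative reading of the proposal, not the authors' own formalization: the function name, the numeric informativeness scale, the criterion values, and the example answers are hypothetical, and the fallback used when no candidate is sufficiently informative is an added assumption that the text does not address.

```python
def choose_response(candidates, conf_criterion, inf_criterion, may_withhold=True):
    """candidates: (label, confidence, informativeness) tuples, finest first.

    Satisficing knowledge: return the finest answer meeting both criteria.
    Unsatisficing knowledge: withhold if allowed; otherwise give up the
    confidence criterion and keep the informativeness criterion."""
    for label, confidence, informativeness in candidates:
        if confidence >= conf_criterion and informativeness >= inf_criterion:
            return label
    if may_withhold:
        return "don't know"
    informative = [c for c in candidates if c[2] >= inf_criterion]
    # ASSUMPTION: if nothing is informative enough, consider all candidates.
    pool = informative or candidates
    return max(pool, key=lambda c: c[1])[0]  # most confident remaining answer


# Hypothetical candidate answers with invented confidence and informativeness.
answers = [
    ("5 beers", 0.30, 1.0),
    ("4 to 6 beers", 0.55, 0.8),
    ("0 to 15 beers", 0.95, 0.1),
]
free_report = choose_response(answers, conf_criterion=0.83, inf_criterion=0.3)
forced_report = choose_response(
    answers, conf_criterion=0.83, inf_criterion=0.3, may_withhold=False
)
```

With these values no answer satisfies both criteria, so the rule withholds under free report and, when withholding is denied, volunteers an informative answer that falls short of the confidence criterion.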
VI. Conclusion
The work described in this chapter is predicated on the view that the strategic regulation of memory performance is an intrinsic aspect of everyday remembering. Therefore, to achieve a more complete understanding of remembering in real-life contexts, it is important to identify the various types of control that people exert over their memory reporting, and examine their underlying mechanisms and performance consequences. The desire to capture the full richness of real-world memory phenomena, however, is often at odds with the desire to bring the phenomena into the laboratory for controlled experimental investigation. In our work, we have tried to reach an expedient compromise that offers the benefits of experimental tools and rigor while still tapping some of the fundamental features of the strategic regulation of memory performance in real-world settings.
ACKNOWLEDGMENTS
The preparation of this chapter was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of German-Israeli Project Cooperation (DIP). We thank the BMBF for their support.
REFERENCES
Abu-Sayf, F. K. (1979). The scoring of multiple-choice tests: A closer look. Educational Technology, 19, 5–15.
Ackerman, R., & Goldsmith, M. (2006). Control over grain size in question answering with unsatisficing knowledge. Poster presented at the affect, motivation, and decision-making International Conference, Ein Boqeq, The Dead Sea, Israel.
Albanese, M. A. (1988). The projected impact of the correction for guessing on individual scores. Journal of Educational Measurement, 25, 149–157.
American Psychologist. (1991). 46 (1).
Anderson, J. R., Budiu, R., & Reder, L. M. (2001). A theory of sentence memory as part of a general theory of memory. Journal of Memory and Language, 45, 337–367.
Angoff, W. H. (1989). Does guessing really help? Journal of Educational Measurement, 26, 323–336.
Angoff, W. H., & Schrader, B. W. (1984). A study of hypotheses basic to the use of rights and formula scores. Journal of Educational Measurement, 21, 1–17.
Banaji, M. R., & Crowder, R. G. (1989). The bankruptcy of everyday memory. American Psychologist, 44, 1185–1193.
Barnes, A. E., Nelson, T. O., Dunlosky, J., Mazzoni, G., & Narens, L. (1999). An integrative system of metamemory components involved in retrieval. In D. Gopher and A. Koriat (Eds.) Cognitive regulation of performance: Interaction of theory and application. Attention and Performance XVII (pp. 287–313). Cambridge, MA: MIT Press.
Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. New York: Cambridge University Press.
54
Goldsmith and Koriat
Benjamin, A. S., & Bjork, R. A. (1996). Retrieval fluency as a metacognitive index. In L. M. Reder (Ed.) Implicit memory and metacognition (pp. 309–338). Hillsdale, NJ: Erlbaum. Benjamin, A. S., Bjork, R. A., & Schwartz, B. L. (1998). The mismeasure of memory: When retrieval fluency is misleading as a metamnemonic index. Journal of Experimental Psychology: General, 127, 55–68. Bjork, R. A. (1994). Memory and metamemory considerations in the training of human beings. In J. Metcalfe and A. P. Shimamura (Eds.) Metacognition: Knowing about knowing (pp. 185–205). Cambridge, MA: MIT Press. Brainerd, C. J., Wright, R., Reyna, V. F., & Payne, D. G. (2002). Dual‐retrieval processes in free and associative recall. Journal of Memory and Language, 46, 120–152. Brewer, W. F. (1992). Phenomenal experience in laboratory and autobiographical memory tasks. In M. A. Conway, D. C. Rubin, H. Spinnler, and W. A. Wagenaar (Eds.) Theoretical perspectives on autobiographical memory (pp. 31–51). Dordrecht: Kluwer. Brown, E. L., DeVenbacher, K. A., & Sturgill, W. (1977). Memory for faces and the circumstances of encounter. Journal of Applied Psychology, 62, 311–318. Brown, J. (Ed.). (1992). Recall and recognition. London: Wiley. Bruck, M., & Ceci, S. J. (1999). The suggestibility of children’s memory. Annual Review of Psychology, 50, 419–439. Budescu, D., & Bar‐Hillel, M. (1993). To guess or not to guess: A decision‐theoretic view of formula scoring. Journal of Educational Measurement, 38, 277–291. Burgess, P. W., & Shallice, T. (1996). Confabulation and the control of recollection. Memory, 4, 359–411. Butler, K. M., McDaniel, M. A., Dornburg, C. C., Price, A. A., & Roediger, H. L., 3rd (2004). Age diVerences in veridical and false recall are not inevitable: The role of frontal lobe function. Psychonomic Bulletin & Review, 11, 921–925. Cassel, W. S., Roebers, C. M., & Bjorklund, D. F. (1996). Developmental patterns of eyewitness responses to repeated and increasingly suggestive questions. 
Journal of Experimental Child Psychology, 61, 116–133. Chandler, C. C. (1994). Studying related pictures can reduce accuracy, but increase confidence, in a modified recognition text. Memory & Cognition, 22, 273–280. Conway, M. A. (1995). Flashbulb memories. Hove, UK: Erlbaum. Conway, M. A., Collins, A. F., Gathercole, S. E., & Anderson, S. J. (1996). Recollections of true and false autobiographical memories. Journal of Experimental Psychology: General, 125, 69–95. Cronbach, L. J. (1984). Essentials of psychological testing. New York: Harper & Row. Cross, L. H., & Frary, R. B. (1977). An empirical test of Lord’s theoretical results regarding formula scoring of multiple‐choice tests. Journal of Educational Measurement, 14, 313–321. Danion, J. M., Gokalsing, E., Robert, P., Massin‐Krauss, M., & Bacon, E. (2001). Defective relationship between subjective experience and behavior in schizophrenia. American Journal of Psychiatry, 158, 2064–2066. Donaldson, W. (1992). Measuring recognition memory. Journal of General Psychology, 121, 275–277. Dunning, D., Heath, C., & Suls, J. M. (2004). Flawed self‐assessment. Psychological Science, 5, 69–106. Earles, J. L., Kersten, A. W., Turner, J. M., & McMullen, J. (1999). Influences of age, performance, and item relatedness on verbatim and gist recall of verb‐noun pairs. Journal of General Psychology, 126, 97–110. Ebbesen, E. B., & Rienick, C. B. (1998). Retention interval and eyewitness memory for events and personal identifying attributes. Journal of Applied Psychology, 83, 745–762.
The Strategic Regulation of Memory Accuracy and Informativeness
55
Ebbinghaus, H. E. (1895). Memory: A contribution to experimental psychology. New York: Dover. (Republished 1964.) Erdelyi, M. H., & Becker, J. (1974). Hypermnesia for pictures: Incremental memory for pictures but not words in multiple recall trials. Cognitive Psychology, 6, 159–171. FischhoV, B., Slovic, P., & Lichtenstein, S. (1977). Knowing with certainty: The appropriateness of extreme confidence. Journal of Experimental Psychology: Human Perception and Performance, 3, 552–564. Fisher, R. P. (1996). Implications of output‐bound measures for laboratory and field research in memory. Behavioral and Brain Sciences, 19, 197. Fisher, R. P., & Craik, F. I. M. (1977). Interaction between encoding and retrieval operations in cued recall. Journal of Experimental Psychology: Human Learning and Memory, 3, 701–711. Fisher, R. P., Geiselman, R. E., & Raymond, D. S. (1987). Critical analysis of police interview techniques. Journal of Police Science and Administration, 15, 177–185. Flin, R., Boon, J., Knox, A., & Bull, R. (1992). The eVect of a five‐month delay on children’s and adults’ eyewitness memory. British Journal of Psychology, 83, 323–336. Frary, R. B. (1980). The eVect of misinformation, partial information, and guessing on expected multiple‐choice test item scores. Applied Psychological Measurement, 4, 79–90. Fruzzetti, A. E., Toland, K., Teller, S. A., & Loftus, E. F. (1992). Memory and eyewitness testimony. In M. M. Gruneberg and P. E. Morris (Eds.) Aspects of memory (Vol. 1, pp. 18–50). London, UK: Routledge. Gafni, N. (1990). DiVerential tendencies to guess as a function of gender and lingual‐cultural reference group (Report No. 115). Jerusalem, Israel: National Institute for Testing and Evaluation. Galvin, S. J., Podd, J. V., Drga, V., & Whitmore, J. (2003). Type 2 tasks in the theory of signal detectability: Discrimination between correct and incorrect decisions. Psychonomic Bulletin & Review, 10, 843–876. Gigerenzer, G., HoVrage, U., & Kleinbo¨lting, H. 
(1991). Probabilistic mental models: A Brunswikian theory of confidence. Psychological Review, 98, 506–528. Goldsmith, M., & Koriat, A. (1999). The strategic regulation of memory reporting: Mechanisms and performance consequences. In D. Gopher and A. Koriat (Eds.) Cognitive regulation of performance: Interaction of theory and application. Attention and Performance XVII (pp. 373–400). Cambridge, MA: MIT Press. Goldsmith, M., Koriat, A., & Pansky, A. (2005). Strategic regulation of grain size in memory reporting over time. Journal of Memory and Language, 52, 505–525. Goldsmith, M., Koriat, A., & Weinberg, Eliezer, A. (2002). Strategic regulation of grain size memory reporting. Journal of Experimental Psychology: General, 131, 73–95. Goodman, G. A. (2006). Children’s eyewitness memory: A modern history and contemporary commentary. Journal of Social Issues, 62, 811–832. Gorenstein, G. W., & Ellsworth, P. C. (1980). EVect of choosing an incorrect photograph on a later identification by an eyewitness. Journal of Applied Psychology, 65, 616–622. Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley. Grice, H. P. (1975). Logic and conversation. In P. Cole and J. L. Morgan (Eds.) Syntax and semantics: Vol. 3. Speech acts (pp. 41–58). New York: Academic Press. Grier, J. B. (1971). Nonparametric indexes for sensitivity and bias: Computing formulas. Psychological Bulletin, 75, 424–429. Healy, A. F., & Jones, C. (1973). Criterion shifts in recall. Psychological Bulletin, 79, 335–340. Hertzog, C., & Dunlosky, J. (2004). Aging, metacognition, and cognitive control. In B. H. Ross (Ed.) The psychology of learning and motivation: Advances in research and theory (pp. 215–247). San Diego, CA: Elsevier.
56
Goldsmith and Koriat
Higham, P. A. (2002). Strong cues are not necessarily weak: Thomson and Tulving (1970) and the encoding specificity principle revisited. Memory & Cognition, 30, 67–80. Higham, P. A. (2007). No special K! A signal detection framework for the strategic regulation of memory accuracy. Journal of Experimental Psychology: General, 136, 1–22. Higham, P. A., & Gerrard, C. (2005). Not all error are created equal: Metacogntion and changing answers on multiple‐choice tests. Canadian Journal of Experimental Psychology, 59, 28–34. Higham, P. A., & Tam, H. (2005). Generation failure: Estimating metacognition in cued recall. Journal of Memory and Language, 52, 595–617. Higham, P. A., & Tam, H. (2006). Release from generation failure: The role of study‐list structure. Memory & Cognition, 34, 148–157. Hilgard, E. R., & Loftus, E. F. (1979). EVective interrogation of the eyewitness. International Journal of Clinical and Experimental Hypnosis, 27, 342–357. Hudson, J. A., & Fivush, R. (1991). As time goes by: Sixth graders remember a kindergarten experience. Applied Cognitive Psychology, 5, 347–360. Jacoby, L. L. (1999). Deceiving the elderly: EVects of accessibility bias in cued‐recall performance. Cognitive Neuropsychology, 16, 417–436. Jacoby, L. L., & Rhodes, M. G. (2006). False remembering in the aged. Current Directions in Psychological Science, 15, 49–53. Jacoby, L. L., Debner, J. A., & Hay, J. F. (2001). Proactive interference, accessibility bias, and process dissociations: Valid subject reports of memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 686–700. Janowsky, J. S., Shimamura, A. P., & Squire, L. R. (1989). Memory and metamemory: Comparisons between frontal lobe lesions and amnesic patients. Psychobiology, 17, 3–11. Johnson, M. K. (1997). Identifying the origin of mental experience. In M. S. Myslobodsky (Ed.) The Mythomanias: The nature of deception and self deception (pp. 133–180). Mahwah, NJ: Erlbaum. Kato, T. (1985). 
Semantic‐memory sources of episodic retrieval failure. Memory & Cognition, 13, 442–452. Kelley, C. M., & Jacoby, L. L. (1996). Adult egocentrism: Subjective experience versus analytic bases for judgment. Journal of Memory and Language, 35, 157–175. Kelley, C. M., & Jacoby, L. L. (1998). Subjective reports and process dissociation: Fluency, knowing, and feeling. Acta Psychologica, 98, 127–140. Kelley, C. M., & Jacoby, L. L. (2000). Recollection and familiarity. In E. Tulving and F. I. M. Craik (Eds.) The Oxford handbook of memory (pp. 215–228). London: Oxford University Press. Kelley, C. M., & Lindsay, D. S. (1993). Remembering mistaken for knowing: Ease of retrieval as a basis for confidence in answers to general knowledge questions. Journal of Memory and Language, 32, 1–24. Kelley, C. M., & Sahakyan, L. (2003). Memory, monitoring, and control in the attainment of memory accuracy. Journal of Memory and Language, 48, 704–721. Kintsch, W., Welsch, D., Schmalhofer, F., & Zimny, S. (1990). Sentence memory: A theoretical analysis. Journal of Memory and Language, 29, 133–159. Klatzky, R. L., & Erdelyi, M. H. (1985). The response criterion problem in tests of hypnosis and memory. International Journal of Clinical and Experimental Hypnosis, 33, 246–257. Koren, D., Poyurovsky, M., Seidman, L. J., Goldsmith, M., Wenger, S., & Klein, E. (2005). The neuropsychological basis of competence to consent in first‐episode schizophrenia: A pilot metacognitive study. Biological Psychiatry, 57, 609–616. Koren, D., Seidman, L. J., Goldsmith, M., & Harvey, P. D. (2006). Real‐world cognitive—and metacognitive—dysfunction in schizophrenia: A new approach for measuring (and remediating) more ‘‘right stuV.’’ Schizophrenia Bulletin, 32, 310–326.
The Strategic Regulation of Memory Accuracy and Informativeness
57
Koren, D., Seidman, L. J., Poyurovsky, M., Goldsmith, M., Viksman, P., Zichel, S., et al. (2004). The neuropsychological basis of insight in first‐episode schizophrenia: A pilot metacognitive study. Schizophrenia Research, 70, 195–202. Koriat, A. (1995). Dissociating knowing and the feeling of knowing: Further evidence for the accessibility model. Journal of Experimental Psychology: General, 124, 311–333. Koriat, A. (2007). Metacognition and consciousness. In P. D. Zelazo, M. Moscovitch, and E. Thompson (Eds.) Cambridge handbook of consciousness (pp. 289–325). Cambridge, UK: Cambridge University Press. Koriat, A., & Goldsmith, M. (1997). Memory in naturalistic and laboratory contexts: Distinguishing the accuracy‐oriented and quantity‐oriented approaches to memory assessment. Journal of Experimental Psychology: General, 123, 297–316. Koriat, A., & Goldsmith, M. (1996a). Memory as something that can be counted versus memory as something that can be counted on. In D. Herrmann, C. McEvoy, C. Hertzog, P. Hertel, and M. Johnson (Eds.) Basic and applied memory research: Practical applications (Vol. 2, pp. 3–18). Hillsdale, NJ: Erlbaum. Koriat, A., & Goldsmith, M. (1996b). Memory metaphors and the real‐life/laboratory controversy: Correspondence versus storehouse conceptions of memory. Behavioral and Brain Sciences, 19, 167–188. Koriat, A., & Goldsmith, M. (1998). The role of metacognitive processes in the regulation of memory performance. In G. Mazzoni and T. O. Nelson (Eds.) Metacognition and cognitive neuropsychology: Monitoring and control processes (pp. 97–118). Hillsdale, NJ: Erlbaum. Koriat, A., Goldsmith, M., Schneider, W., & Nakash‐Dura, M. (2001). The credibility of children’s testimony: Can children control the accuracy of their memory reports? Journal of Experimental Child Psychology, 79, 405–437. Koriat, A., & Levy‐Sadot, R. (1999). Processes underlying metacognitive judgments: Information‐ based and experience‐based monitoring of one’s own knowledge. In S. 
Chaiken and Y. Trope (Eds.) Dual process theories in social psychology (pp. 483–502). New York: Guilford Press. Koriat, A., Ben‐zur, H., & SheVer, D. (1988). Telling the same story twice: Output monitoring and age. Journal of Memory and Language, 27, 23–39. Koriat, A., Goldsmith, M., & Halamish, V. (in press). Control processes in voluntary remembering. In H. L. Roediger, III (Ed.), Cognitive psychology of memory. Vol. 2 of Learning and memory: A comprehensive reference, 4 vols. (J. Byrne, Editor). Oxford, UK: Elsevier. Koriat, A., Goldsmith, M., & Pansky, A. (2000). Toward a psychology of memory accuracy. Annual Review of Psychology, 51, 481–537. Koriat, A., Levy‐Sadot, R., Edry, E., & de Marcas, S. (2003). What do we know about what we cannot remember? Accessing the semantic attributes of words that cannot be recalled Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 1095–1105. Koriat, A., Ma’ayan, H., & Nussinson, R. (2006). The intricate relationships between monitoring and control in metacognition: Lessons for the cause‐and‐eVect relation between subjective experience and behavior. Journal of Experimental Psychology: General, 135, 36–69. Koutstaal, W. (2006). Flexible remembering. Psychonomic Bulletin & Review, 13, 84–91. Koutstaal, W., & Schacter, D. L. (1997). Gist‐based false recognition of pictures in older and younger adults. Journal of Memory and Language, 37, 555–583. Lichtenstein, S., FischhoV, B., & Phillips, L. D. (1982). Calibration of probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic, and A. Tversky (Eds.) Judgment under uncertainty: Heuristics and biases (pp. 306–334). New York: Cambridge University Press. Lipton, J. P. (1977). On the psychology of eyewitness testimony. Journal of Applied Psychology, 62, 90–95. Lockhart, R. S., & Murdock, B. B. (1970). Memory and the theory of signal detection. Psychological Bulletin, 74, 100–109.
58
Goldsmith and Koriat
Massin‐Krauss, M., Bacon, E., & Danion, J.‐M. (2002). EVects of the benzodiazepine lorazepam on monitoring and control processes in semantic memory. Consciousness and Cognition, 11, 123–137. Memon, A., & Stevenage, V. S. (1996). Interviewing witnesses: What works and what doesn’t? Psycholoquy, 7, witness‐memory.1.memon. Metcalfe, J., & Shimamura, A. P. (1994). Metacognition: Knowing about knowing. Cambridge: MIT Press. Mitchell, K. J., & Johnson, M. K. (2000). Source monitoring: Attributing mental experiences. In E. Tulving and F. I. M. Craik (Eds.) The Oxford handbook of memory (pp. 179–195). London: Oxford University Press. Moritz, S., & Woodward, T. S. (2006). The contribution of metamemory deficits to schizophenia. Journal of Abnormal Psychology, 15, 15–25. Moritz, S., Woodward, T. S., & Chen, E. (2006). Investigation of metamemory dysfunctions in first‐episode schizophrenia. Schizophrenia Research, 81, 247–252. Morris, C. D., Bransford, J. D., & Franks, J. J. (1977). Levels of processing versus transfer appropriate processing. Journal of Verbal Learning and Verbal Behavior, 16, 519–533. Moscovitch, M., & Winocur, G. (1995). Frontal lobe, memory, and aging. In J. Grafman, J. K. Holyoak, and F. Boller (Eds.) Structure and functions of the human prefrontal cortex (Vol. 769, pp. 119–150). New York, NY: New York Academy of Science. Moston, S. (1987). The suggestibility of children in interview studies. First Language, 7, 67–78. Mulder, M. R., & Vrij, A. (1996). Explaining conversation rules to children: An intervention study to facilitate children’s accurate responses. Child Abuse and Neglect, 20, 623–631. Neisser, U. (1988). Time present and time past. In M. M. Gruneberg, P. Morris, and R. Sykes (Eds.) Practical aspects of memory: Current research and issues (Vol. 2, pp. 545–560). Chichester, England: Wiley. Neisser, U. (1996). Remembering as doing. Behavioral and Brain Sciences, 19, 203–204. Nelson, T. O. (1984). 
A comparison of current measures of the accuracy of feeling‐of‐knowing predictions. Psychological Bulletin, 95, 109–133. Nelson, T. O. (1996). Gamma is a measure of the accuracy of predicting performance on one item relative to another item, not of the absolute performance on an individual item. Applied Cognitive Psychology, 10, 257–260. Nelson, T. O., & Narens, L. (1994). Why investigate metacognition. In J. Metcalfe and A. P. Shimamura (Eds.) Metacognition: Knowing about knowing (pp. 1–25). Cambridge, MA: The MIT Press. Nilsson, L.‐G. (1987). Motivated memory: Dissociation between performance data and subjective reports. Psychological Research, 49, 183–188. Norman, D. A., & Wickelgren, W. A. (1969). Strength theory of decision rules and latency in retrieval from short‐term memory. Journal of Mathematical Psychology, 6, 192–208. Norman, K. A., & Schacter, D. L. (1996). Implicit memory, explicit memory, and false recollection: A neuroscience perspective. In L. M. Reder (Ed.) Implicit memory and metacognition (pp. 229–257). Hillsdale, NJ: Erlbaum. Notea‐Koren, E. (2006). Performance accuracy and quantity in psychometric testing: An examination and assessment of cognitive and metacognitive components. Unpublished doctoral dissertation, University of Haifa, Israel. Pansky, A., Koriat, A., & Goldsmith, M. (2005). Eyewitness recall and testimony. In N. Brewer and K. D. Williams (Eds.) Psychology and law: An empirical perspective (pp. 93–150). New York: Guilford Press. Pansky, A., Koriat, A., Goldsmith, M., & Pearlman, S. (2002). Memory accuracy and distortion in old age: Cognitive, metacognitive, and neurocognitive determinants. Poster presented at the
The Strategic Regulation of Memory Accuracy and Informativeness
59
30th Anniversary Conference of the National Institute for Psychobiology in Israel, Hebrew University, Jerusalem, Israel. Parks, T. E. (1966). Signal detectability theory of recognition‐memory performance. Psychological Review, 73, 44–58. Payne, B.K, Jacoby, L. L., & Lambert, A. J. (2004). Memory monitoring and the control of stereotype distortion. Journal of Experimental Social Psychology, 40, 52–64. Poole, D. A., & White, L. T. (1991). EVects of question repetition on the eyewitness testimony of children and adults. Developmental Psychology, 27, 975–986. Poole, D. A., & White, L. T. (1993). Two years later: EVect of question repetition and retention interval on the eyewitness testimony of children and adults. Developmental Psychology, 29, 844–853. Reder, L. M., & Ritter, F. E. (1992). What determines initial feeling of knowing? Familiarity with question terms, not with the answer. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 435–451. Reyna, V. F., & Kiernan, B. (1994). Development of gist versus verbatim memory in sentence recognition: EVects of lexical familiarity, semantic content, encoding instructions, and retention interval. Developmental Psychology, 30, 178–191. Rhodes, M. G., & Kelley, C. M. (2005). Executive processes, memory accuracy, and memory monitoring: An aging and individual diVerence analysis. Journal of Memory and Language, 52, 578–594. Roebers, C. M., & Fernandez, O. (2002). The eVects of accuracy motivation and children’s and adults’ event recall, suggestibility, and their answers to unanswerable questions. Journal of Cognition and Development, 3, 415–443. Roebers, C. M., & Schneider, W. (2005). The strategic regulation of children’s memory performance and suggestibility. Journal of Experimental Child Psychology, 91, 24–44. Roediger, H. L. (1980). Memory metaphors in cognitive psychology. Memory & Cognition, 8, 231–246. Roediger, H. L., & McDermott, K. B. (1995). 
Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory & Cognition, 21, 803–814. Rosenbluth‐Mor, M. (2001). Accuracy and quantity in memory reports: The eVects of context reinstatement. Unpublished master’s thesis, University of Haifa. Israel. Ross, M. (1997). Validating memories. In N. L. Stein, B. Ornstein, B. Tversky, and C. Brainerd (Eds.) Memory for everyday and emotional events (pp. 49–81). Mahwah, NJ: Erlbaum. Schacter, D. (1990). Memory. In M. I. Posner (Ed.) Foundation of cognitive science (pp. 687–725). Cambridge, MA: MIT Press. Schacter, D. L., Norman, D. A., & Koutstaal, W. (1998). The cognitive neuroscience of constructive memory. Annual Review of Psychology, 49, 289–318. Schacter, D. L., Verfaellie, M., & Pradere, D. (1996). The neuropsychology of memory illusions: False recall and recognition in amnesic patients. Journal of Memory and Language, 35, 319–334. Schwartz, B. L. (1994). Sources of information in metamemory: Judgments of learning and feeling of knowing. Psychonomic Bulletin & Review, 1, 357–375. Schwartz, B. L., & Metcalfe, J. (1992). Cue familiarity but not target retrievability enhances feeling‐of‐knowing judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1074–1083. Simon, H. A. (1956). Rational choice and the structure of environments. Psychological Review, 63, 129–138. Slakter, M. J. (1968). The penalty for not guessing. Journal of Educational Measurement, 5, 141–144.
60
Goldsmith and Koriat
Son, L. K., & Schwartz, B. L. (2002). The relation between metacognitive monitoring and control. In T. J. Perfect and B. S. Schwartz (Eds.) Applied metacognition (pp. 15–38). Cambridge, UK: Cambridge University Press. Swets, J. A., Tanner, W. P., Jr., & Birdsall, T. G. (1961). Decision processes in perception. Psychological Review, 68, 301–340. Thiede, K. W., Anderson, C. M., & Therriault, D. (2003). Accuracy of metacognitive monitoring aVects learning of texts. Journal of Educational Psychology, 95, 66–73. Thomson, D. M., & Tulving, E. (1970). Associative encoding and retrieval: Weak and strong cues. Journal of Experimental Psychology, 86, 255–262. Thurstone, L. L. (1919). A method for scoring tests. Psychological Bulletin, 16, 235–240. Tulving, E. (1983). Elements of episodic memory. Oxford: The Clarendon Press. Tulving, E., & Osler, S. (1968). EVectiveness of retrieval cues in memory for words. Journal of Experimental Psychology, 77, 593–601. Tulving, E., & Thomson, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80, 352–373. Tversky, B., & Marsh, E. J. (2000). Biased retellings of events yield biased memories. Cognitive Psychology, 40, 1–38. Van Zandt, T. (2000). ROC curves and confidence judgments in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 582–600. Weber, N., & Brewer, N. (in press). Eyewitness recall: Regulation of grain size and the role of confidence. Journal of Experimental Psychology: Applied. Winograd, E. (1994). Comments on the authenticity and utility of memories. In U. Neisser and R. Fivush (Eds.) The remembering self: Construction and accuracy in the self‐narrative (pp. 243–251). New York: Cambridge University Press. Winograd, E. (1996). Contexts and functions of retrieval. Behavioral and Brain Sciences, 19, 209–210. Wixted, J. T., & Stretch, V. (2000). The case against a criterion‐shift account of false memory. 
Psychological Review, 107, 369–376. Yaniv, I., & Foster, D. P. (1995). Graininess of judgment under uncertainty: An accuracy‐ informativeness trade‐oV. Journal of Experimental Psychology: General, 124, 424–432. Yaniv, I., & Foster, D. P. (1997). Precision and accuracy of judgmental estimation. Journal of Behavioral Decision Making, 10, 21–32. Yaniv, I., Yates, J. F., & Smith, J. E. K. (1991). Measures of discrimination skill in probabilistic judgment. Psychological Bulletin, 110, 611–617. Zaragoza, M. S., & Mitchell, K. J. (1996). Repeated exposure to suggestion and the creation of false memories. Psychological Science, 7, 294–300. Zeelenberg, R. (2005). Encoding specificity manipulations do aVect retrieval from memory. Acta Psychologica, 119, 107–121.
RESPONSE BIAS IN RECOGNITION MEMORY
Caren M. Rotello and Neil A. Macmillan

THE PSYCHOLOGY OF LEARNING AND MOTIVATION VOL. 48. DOI: 10.1016/S0079-7421(07)48002-1. Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00

I. Introduction

In this chapter we explore the ways in which subjects’ response bias can influence conclusions drawn by memory researchers from their data. We show that subjects can, in principle, minimize their memory errors and maximize their accuracy by choosing an unbiased criterion location, but that a host of factors influence their actual bias in generally sensible ways. Moreover, we demonstrate that researchers’ interpretation of subjects’ biases across conditions is dependent on their choice of bias measure, and that a failure to take account of response bias can lead to erroneous conclusions about memory performance.

Most recognition studies are designed to learn something about memory sensitivity, and their authors wish to avoid the possibly confounding effects of changes in bias toward one or another of the available responses. When this goal is paramount, the point of studying response bias is to effectively eliminate its influence, either within the experiment or in data analysis. A full account of recognition performance, however, must explain the presumed decision process that leads to a memory judgment, and we view an understanding of bias effects as a step in that direction.

Whether designed to study accuracy or bias, most recognition memory experiments adopt as dependent variables the parameters of a model, and our discussion is largely couched in terms of signal detection theory (SDT). Figure 1A illustrates the simplest model of this type, in which memory sensitivity is measured by $d'$, the mean separation of the Gaussian underlying distributions corresponding to lures and targets. Many investigators report data in terms of $d'$ and acknowledge the assumptions of SDT, but other dependent variables are equally theory‐laden. For example, proportion correct [$p(c)$], another very common summary statistic, is equivalent to the mean difference between two distributions that are rectangular in shape (Macmillan & Creelman, 1990, 2005). Receiver‐operating characteristic (ROC) data almost always support SDT (Gaussian) over threshold (rectangular) models, and it is for that reason that we focus on SDT here.

[Figure: panels A and B, each showing the Lures and Targets distributions, with the Criterion marked.]
Fig. 1. Distributions of memory strengths for targets and lures: (A) equal‐variance distributions, (B) unequal‐variance distributions.

The SDT construct that captures the response bias concept is the criterion: “old” decisions are made for observations above this level, “new” decisions for those below. We ask what determines the level at which the criterion is set and how that level changes in response to different experimental conditions or variations within a condition. In tasks requiring a choice among more than two responses (we consider rating tasks and the remember–know design), an SDT analysis assumes that there are multiple criteria, each of which may be fixed or variable, and we extend our survey to these paradigms.

Among the variables that are known to affect the estimated value of the criterion is the nature of the instructions; apparently, therefore, the criterion can be consciously manipulated. Nothing in SDT requires, however, that all criterion settings be available to introspection. The separation between response bias and accuracy, one of the primary accomplishments of SDT, does not map onto the distinction between conscious and unconscious processing.

II. Measuring Response Bias
Defining “response bias” to mean “value of the criterion,” as we do here, does not uniquely specify the variable to be examined. The criterion can be interpreted as a location or as a likelihood ratio, and location must be defined relative to some origin. In the simplest cases, all of these measures are equally useful, but in more complex situations each comes with theoretical strings attached.

A. MEASURING RESPONSE BIAS IN A SINGLE CONDITION
The simplest empirical question about response bias is whether it exists in a single experimental condition. The criterion location, relative to the crossover point between the (equal‐variance) lure and target distributions, is

$$c = -\tfrac{1}{2}\,[z(H) + z(F)] \tag{1}$$
where $H$ is the hit rate, $F$ is the false‐alarm rate, and $z(\cdot)$ is the inverse of the standard normal distribution function. The dependence of $c$ on both the hit and false‐alarm rates is one of its important characteristics. The value of $c$ is 0 for unbiased responses, in which the miss and false‐alarm rates are equal and therefore the overall error rate is minimized. Negative values of $c$ reflect liberal bias settings, whereas positive values indicate conservative locations.

Two variants of $c$ that have been popular in memory research locate the criterion relative to the mean of one of the underlying distributions. Relative to the lure mean the criterion is

$$c_L = -z(F) \tag{2}$$

and relative to the target mean it is

$$c_T = -z(H) \tag{3}$$
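Equations 1–3 can be computed directly from an observed hit rate and false‐alarm rate. The following is a minimal sketch in Python rather than code from the chapter: the function names `z` and `criterion_measures` and the example rates are hypothetical, and the standard library's `NormalDist.inv_cdf` stands in for the $z$ transform. It assumes both rates lie strictly between 0 and 1; rates of exactly 0 or 1 have no finite $z$ score and would need a correction that is not addressed here.

```python
from statistics import NormalDist


def z(p: float) -> float:
    """z-transform: inverse of the standard normal distribution function."""
    # NormalDist.inv_cdf raises StatisticsError unless 0 < p < 1.
    return NormalDist().inv_cdf(p)


def criterion_measures(hit_rate: float, false_alarm_rate: float) -> dict:
    """Criterion location under the equal-variance Gaussian model (Eqs. 1-3)."""
    return {
        # Eq. 1: location relative to the crossover of the two distributions.
        "c": -0.5 * (z(hit_rate) + z(false_alarm_rate)),
        # Eq. 2: location relative to the lure mean.
        "c_L": -z(false_alarm_rate),
        # Eq. 3: location relative to the target mean.
        "c_T": -z(hit_rate),
    }


# Hypothetical rates, chosen for illustration only.
measures = criterion_measures(hit_rate=0.84, false_alarm_rate=0.16)
for name, value in measures.items():
    print(f"{name} = {value:.3f}")
```

With these symmetric rates, `c` comes out at essentially 0, while `c_L` and `c_T` are close to +1 and −1, respectively.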
Each of these indices depends on only one of the two response proportions: $c_L$ sets the lure mean to 0 and is positive whenever $F < 0.5$, whereas $c_T$ sets the target distribution mean to 0 and is negative whenever $H > 0.5$. For example, if $d' = 2$ and the criterion is unbiased, then $c_L = -z(0.16) \approx 1$ and $c_T = -z(0.84) \approx -1$: the criterion is one unit above the lure mean and one unit below the target mean. Notice that $c = \tfrac{1}{2}[c_L + c_T]$; in this example, $c = \tfrac{1}{2}[1 + (-1)] = 0$, that is, the criterion is unbiased.

The final measure of bias is the likelihood ratio $\beta$, the ratio of the target and lure distribution ordinates at the criterion. For the equal‐variance model of Fig. 1, $\beta$ approaches 0 for very low strengths, equals 1 at the point where the distributions cross, and approaches $\infty$ for very high levels of strength. For many purposes, it is more useful to examine $\ln(\beta)$, which bears a simple relation to the criterion location:

$$\ln(\beta) = c\,d' \tag{4}$$
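Equation 4 follows from the equal‐variance Gaussian model in a few lines. The derivation below is a sketch under stated assumptions: both distributions have unit variance, the lure mean is at 0 and the target mean at $d'$, and the criterion sits at $x$ measured from the lure mean (so that $x = c_L$). The symbols $x$ and $\phi$ (the standard normal density) are introduced here only for illustration.

```latex
\begin{align*}
\beta &= \frac{\phi(x - d')}{\phi(x)}
       = \exp\!\left[-\tfrac{1}{2}(x - d')^{2} + \tfrac{1}{2}x^{2}\right] \\
\ln(\beta) &= x\,d' - \tfrac{1}{2}d'^{2}
            = d'\left(x - \tfrac{1}{2}d'\right) \\
           &= c\,d' \qquad \text{because the distributions cross at } x = \tfrac{1}{2}d',
              \text{ so } c = x - \tfrac{1}{2}d'.
\end{align*}
```

Under these assumptions $\ln(\beta)$ has the same sign as $c$ whenever $d' > 0$, and it is 0 exactly at the crossover point.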
Many researchers would prefer a measure of response bias that makes no reference to a model, and the yes rate $YR = \tfrac{1}{2}(H + F)$, the average of the hit rate $H$ and the false‐alarm rate $F$, appears at first glance to fill this bill. In fact, however, the yes rate is mathematically equivalent to (monotonic with) the criterion location in the rectangular model (Macmillan & Creelman, 1990) and commits the user to threshold assumptions. Similarly, the purportedly nonparametric bias measure $B''$ is known to be equivalent to likelihood ratio if the underlying distributions are logistic (Macmillan & Creelman, 1996). We believe that all statistics purporting to measure response bias are model bound.¹

As shorthand for summarizing the amount of bias in a single condition, any of these measures is satisfactory. It is only when multiple conditions must be compared, that is, in all interesting applications, that complications arise.
1 We also believe this to be true of sensitivity measures, with one exception: the area under the full ROC is nonparametric, because it equals proportion correct by an unbiased observer in two‐ alternative forced‐choice (Green, 1964). An area‐based bias measure could hold promise for similar status and such an index, denoted K, has recently been proposed by Kornbrot (2006). Like Green’s sensitivity measure, K requires a complete ROC, as can be obtained with a ratings design (discussed later in this chapter). For the nonparametric claim to be plausible, the ROC must be based on a large number of points.
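To make the area measure in the footnote concrete, here is a minimal sketch of how an empirical ROC and its area might be obtained from confidence ratings. The rating counts are invented for illustration, and the trapezoidal rule is an assumption of this sketch: with only a few rating categories it merely approximates the area under the full ROC, which is one reason a large number of points is needed.

```python
def roc_points(target_counts, lure_counts):
    """Cumulative (F, H) pairs from rating counts ordered 'sure old' -> 'sure new'."""
    n_targets, n_lures = sum(target_counts), sum(lure_counts)
    points = [(0.0, 0.0)]
    cum_t = cum_l = 0
    for t, l in zip(target_counts, lure_counts):
        cum_t += t
        cum_l += l
        points.append((cum_l / n_lures, cum_t / n_targets))
    return points  # the final point is (1.0, 1.0)


def trapezoid_area(points):
    """Area under the ROC obtained by joining successive points with straight lines."""
    area = 0.0
    for (f0, h0), (f1, h1) in zip(points, points[1:]):
        area += (f1 - f0) * (h0 + h1) / 2
    return area


# Hypothetical six-point confidence ratings for 100 targets and 100 lures.
targets = [40, 25, 15, 10, 6, 4]
lures = [5, 10, 15, 20, 25, 25]

area = trapezoid_area(roc_points(targets, lures))
chance = trapezoid_area(roc_points(lures, lures))  # identical distributions
print(f"area = {area:.3f}, chance = {chance:.3f}")
```

An area of 0.5 corresponds to no discrimination between targets and lures; values nearer 1 indicate better sensitivity.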
Response Bias in Recognition Memory
B. COMPARING BIAS AMONG CONDITIONS WITH EQUAL SENSITIVITY
A common question in memory research is whether an experimental variable affects sensitivity, bias, or both; to answer this question clearly requires an experiment with multiple conditions. If sensitivity is known to be constant, all the SDT-based measures described above are equivalent for the model of Fig. 1A. To determine whether sensitivity really is constant requires a model with a sensitivity parameter; for example, the model in Fig. 1A requires that d′ be constant. If sensitivity is equal in two conditions, then values of a response-bias statistic can be directly compared. All the SDT statistics described so far are monotonically related in this situation and any of them can be used. Even within the constraints of detection theory, however, the choice of a sensitivity statistic requires care. In particular, the equal-variance model of Fig. 1A can often be rejected in favor of an unequal-variance model like that shown in Fig. 1B (Glanzer, Kim, Hilford, & Adams, 1999; Ratcliff, Sheu, & Gronlund, 1992), and d′ does not measure sensitivity in that model. To see why, consider the formula for d′:

$$ d' = z(H) - z(F) \tag{5} $$
If z(F) = 0, then d′ depends only on z(H) and, therefore, only on the standard deviation of the target distribution; if z(H) = 0, then d′ depends only on the characteristics of the lure distribution. When the variances of the lures and targets are equal, all is well. If the variances are unequal, however, the computed values of d′ when z(H) = 0 and when z(F) = 0 differ; d′ depends on bias in that case. The sensitivity measure should depend on both standard deviations, and an excellent option is to use the root mean square standard deviation. Fixing the standard deviation of the targets at 1 (without any loss of generality) and that of the lures at s, so that s is the ratio of the lure to the target standard deviation, we obtain a new measure of sensitivity

$$ d_a = \left(\frac{2}{1 + s^2}\right)^{1/2} \left[z(H) - s\,z(F)\right] \tag{6} $$
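The contrast between Eqs. (5) and (6) can be checked numerically. The sketch below assumes Gaussian strength distributions and takes s to be the lure standard deviation in units of the target standard deviation, the reading under which Eq. (6) does not depend on the criterion. The values of s, the target mean, and the three criteria are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal used for the z-transform


def d_prime(hit_rate, fa_rate):
    """Eq. (5): d' = z(H) - z(F)."""
    return STD.inv_cdf(hit_rate) - STD.inv_cdf(fa_rate)


def d_a(hit_rate, fa_rate, s):
    """Eq. (6), with s the lure SD expressed in units of the target SD."""
    return sqrt(2 / (1 + s**2)) * (STD.inv_cdf(hit_rate) - s * STD.inv_cdf(fa_rate))


# Hypothetical unequal-variance model: lures ~ N(0, s), targets ~ N(mu, 1).
s, mu = 0.8, 1.5
lures, targets = NormalDist(0.0, s), NormalDist(mu, 1.0)

results = []
for k in (0.25, 0.75, 1.25):          # three criterion placements
    H = 1 - targets.cdf(k)            # P("old" | target)
    F = 1 - lures.cdf(k)              # P("old" | lure)
    results.append((k, d_prime(H, F), d_a(H, F, s)))

for k, dp, da in results:
    print(f"criterion={k:.2f}  d'={dp:.3f}  d_a={da:.3f}")
```

Under these assumptions d′ drifts upward as the criterion becomes more conservative, although nothing about memory has changed, whereas d_a stays fixed at the mean separation divided by the root mean square standard deviation.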
When s = 1, the variances are equal and Eq. (6) reduces to Eq. (5). Taking account of unequal variance can dramatically change the interpretation of data. For example, consider recent work on the revelation effect, the tendency for subjects to say “old” more often in a recognition test after the test item (whether target or lure) is revealed in fragmentary or distorted form immediately prior to a recognition judgment. Hicks and Marsh (1998) summarized the literature using d′ and concluded that revelation decreased memory sensitivity. However, Verde and Rotello (2003) collected ROC data, which can be used to estimate the ratio of target and lure standard deviations, and showed that da was equal across all conditions; the effect of revelation was entirely one of response bias. Using similar methods, Dougal and Rotello (in press) showed that emotion-arousing words affected response bias rather than memory accuracy. We discuss these results in more detail later.

C. COMPARING BIAS AMONG CONDITIONS WITH UNEQUAL SENSITIVITY
If “sensitivity” has been defined and two conditions are found to differ in it, then whether response bias has changed necessarily depends on the bias parameter chosen. Figure 2 provides a possible representation for an experiment in which the same set of lures is compared with targets of two different strengths. The criterion location for the weaker targets is shown, together with the predicted locations for the stronger targets if c, cL, cT, or β remains the same. The example illustrates an asymmetry between the assessment of sensitivity and bias. There is no ambiguity in asserting that condition 1 produces a higher sensitivity than condition 2, but “higher criterion” is ambiguous unless the origin against which it is calculated is specified. Likelihood ratio measures of bias, such as β, are ratio scaled and thus unambiguous. Of the three strength-based criterion measures, c, cL, and cT, which is to be preferred? An alleged advantage of c (that it depends on both H and F) is only compelling if subjects are able to accurately estimate their response rates for each stimulus class. Whether this is possible turns out to be a substantive question.

Fig. 2. Equivalence of various possible criterion locations as memory strength for targets is increased. (A) The solid line shows the criterion location; dashed lines indicate the mean of the lure strengths (distance cL from criterion), the mean of the target strengths (distance −cT from criterion), and the strength at which the false-alarm and miss rates are equal (distance c from criterion). (B) The target distribution has a higher mean. Four possible criterion locations are noted, each of which keeps the criterion constant under one possible definition of its location (β, c, cT, or cL).
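The four constancy assumptions pictured in Fig. 2 can be compared numerically. The sketch below assumes the equal-variance model with the lure mean at 0 and the criterion measured from that mean. The starting criterion and the two d′ values are hypothetical, and the function name is ours.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def constant_bias_criteria(k_weak, d_weak, d_strong):
    """Criterion for the strong condition, measured from the lure mean,
    under four definitions of "the same bias" (equal-variance model)."""
    c_lure = k_weak                    # cL: distance above the lure mean
    c_mid = k_weak - d_weak / 2        # c: distance above the midpoint
    c_target = k_weak - d_weak         # cT: distance above the target mean
    ln_beta = c_mid * d_weak           # Eq. (4)
    return {
        "same c_L": c_lure,
        "same beta": d_strong / 2 + ln_beta / d_strong,
        "same c": d_strong / 2 + c_mid,
        "same c_T": d_strong + c_target,
    }


# Hypothetical values: a slightly conservative criterion with weak targets.
criteria = constant_bias_criteria(k_weak=0.8, d_weak=1.0, d_strong=2.0)
for rule, k in criteria.items():
    hit, fa = PHI(2.0 - k), PHI(-k)
    print(f"{rule:10s} criterion={k:.2f}  H={hit:.3f}  F={fa:.3f}")
```

With these numbers the four rules place the criterion at four different strengths, so the same data would be described as "no change in bias" by one index and as a shift by the others.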
III. Explaining Response Bias
A definitive choice among the various measures could justify a claim that bias has or has not been changed by an experimental manipulation. Our satisfaction with such a conclusion is likely to be short-lived, though, because it immediately raises the questions of experimental strategy (what principles guided the subject’s decisions?) and of invariance (what decision statistic is the subject attempting to keep fixed?).

A. SEEKING INVARIANTS WITHIN THE DATA
As an example of an invariant, consider the well-known mirror effect. Recognition memory is better (has a higher d′) for low-frequency (LF) words than for those of high frequency (HF) (analogous results have been obtained for other stimulus characteristics), and this improved sensitivity arises from the combination of a higher hit rate and a lower false-alarm rate for the LF words. Has “response bias” changed? Figure 3 shows a (somewhat idealized) representation of a typical outcome in which it does not: The criterion falls at the midpoint of the HF lure and target distributions (panel A) and also at the midpoint of the LF distributions (panel B). However, this interpretation is inconsistent with the common assumption that a single strength dimension supports all decisions and that studying an item produces, on average, the same strength increment for all items. One solution to this paradox is to assume that subjects first determine whether the test item is an LF or an HF word, then evaluate its strength compared only to items of similar frequency. In this example, what is invariant is the criterion location as measured by c or β.
Fig. 3. The mirror effect. High-frequency (HF) words (panel A: HF lures and HF targets) yield a lower hit rate and a higher false-alarm rate than low-frequency (LF) words (panel B: LF lures and LF targets). A constant criterion, measured by c or β, is shown.
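The idealized pattern in Fig. 3 follows directly from holding c at 0 within each frequency class. The short sketch below assumes the equal-variance model; the two d′ values are hypothetical and chosen only so that LF words are the better-remembered class.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def rates_at_midpoint(d_prime):
    """H and F when the criterion sits midway between lure and target means (c = 0)."""
    return PHI(d_prime / 2), PHI(-d_prime / 2)


# Hypothetical sensitivities: LF words are remembered better than HF words.
hf_hit, hf_fa = rates_at_midpoint(1.0)
lf_hit, lf_fa = rates_at_midpoint(2.0)
print(f"HF: H={hf_hit:.3f} F={hf_fa:.3f}")
print(f"LF: H={lf_hit:.3f} F={lf_fa:.3f}")
```

With c fixed at 0 in both classes, the higher sensitivity for LF words shows up as both more hits and fewer false alarms, which is the mirror pattern.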
B. DIVINING SUBJECT STRATEGY
Can the criterion location selected by a subject be predicted? Treisman and Williams (1984) suggested three general principles. The reference principle determines an initial criterion based on several aspects of the task, such as the expected rates of targets and lures on the test, their strengths, and task demands. The other two principles work to optimize the location of the criterion on a trial-to-trial basis and thus create criterion variability. In the world (but not usually the lab), target events tend to extend over a period of time, so an “old” judgment on trial n should tend to make a target more likely to appear on trial n + 1; therefore, the tracking process shifts the criterion slightly (and temporarily) to create sequential response dependencies. To counteract the tracking process, a stabilization process evaluates the median strength of a set of recently tested items and shifts the criterion toward that value. Recognition memory judgments are not usually evaluated in terms of sequences of responses, so the potential roles of the tracking and stabilization processes are unknown. We shall see later, however, that allowing criterion location to vary can be important theoretically.

Treisman and Williams’ (1984) reference principle is supported by an extensive series of experiments and analyses by Maddox and Bohil (2004), Maddox (2002), and Bohil and Maddox (2001, 2003). Their perceptual classification experiments simultaneously manipulate payoffs, base rates, and discriminability, and their primary conclusion is captured by the acronym COBRA (COmpetition Between Reward and Accuracy). When there is a conflict between payoffs (reward) and accuracy (proportion correct), subjects appear to compromise.

The experiments that support these conclusions differ in many ways from recognition memory studies. The stimuli are perceptual, for example, rectangular bars of various heights. Two artificial, overlapping normal distributions A and B are created with different mean heights. Subjects are aware of the base rates of A and B, and are given explicit payoffs for correct answers. The most important limitation in extending Maddox and Bohil’s conclusions to memory designs may be the absence of feedback in the vast majority of memory experiments. When feedback is provided, subjects can easily estimate their hit and false-alarm rates and adjust their yes rate (which, in signal-detection models, determines their criterion) so as to balance the proportions of these outcomes and therefore minimize errors. Without feedback, they must rely on subjective memorability to meet their goals. One goal is presumably to follow instructions, but instructions in recognition memory studies are typically vague, merely urging accuracy (and speed without sacrificing accuracy). How might a subject set the criterion so as to follow such instructions? First, a likely tactic is probability matching.
Few experiments offer payoffs (but see Van Zandt, 2000, Experiment 1), so there is no “competition” that includes rewards. Instead, subjects may be guided by what Maddox and Bohil (2004) call COBRM (COmpetition Between Reward and probability Matching). In their perceptual experiments, COBRM describes behavior in early blocks and a shift to COBRA occurs with learning, but without rewards this transition may not occur. To probability match is to set the rate of responding “A” equal to the base rate of stimuli from distribution A. In most recognition experiments the base rates are equal, so by this principle subjects should try to use the two responses about equally often. When base rates have been manipulated (or subjects have been misinformed about them), the response rates change in the same direction (Rotello, Macmillan, Hicks, & Hautus, 2006; Strack & Forster, 1995; Van Zandt, 2000, Experiment 2).
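Probability matching, as just defined, can be contrasted with an accuracy-maximizing placement in a few lines of code. The sketch assumes the equal-variance model with the criterion measured from the lure mean, finds the matching criterion by bisection, and uses the standard ideal-observer result that proportion correct is maximized where the likelihood ratio equals the ratio of lure to target base rates. The function names, the d′ of 1, and the base rates are illustrative assumptions.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def yes_rate(k, d_prime, p_target):
    """Overall rate of "old" responses for criterion k (measured from the lure mean)."""
    return p_target * PHI(d_prime - k) + (1 - p_target) * PHI(-k)


def matching_criterion(d_prime, p_target, lo=-10.0, hi=10.0):
    """Bisection for the criterion whose yes rate equals the target base rate."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if yes_rate(mid, d_prime, p_target) > p_target:
            lo = mid      # too many "old" responses: move the criterion up
        else:
            hi = mid
    return (lo + hi) / 2


def accuracy_maximizing_criterion(d_prime, p_target):
    """Criterion at which the likelihood ratio equals (1 - p) / p."""
    return d_prime / 2 + log((1 - p_target) / p_target) / d_prime


placements = {}
for p in (0.5, 0.7):
    placements[p] = (matching_criterion(1.0, p), accuracy_maximizing_criterion(1.0, p))
    k_match, k_best = placements[p]
    print(f"base rate={p}: matching k={k_match:.3f}, accuracy-maximizing k={k_best:.3f}")
```

With equal base rates the two placements coincide at the midpoint. With 70% targets both move in the liberal direction, but under these assumptions the matching criterion moves less far than the accuracy-maximizing one.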
An important class of experiments that we discuss below involves a change in either the target or lure distribution. The COBRA and COBRM rules imply that when discriminability changes, the criterion will remain on the same side of the neutral point but move in proportion to the change in sensitivity. If the difference between conditions is not announced to the subjects, information sufficient to infer it is available if feedback is provided. In most memory experiments, there is no such information and the subjects must rely on what can be observed, namely the strengths of the test items. Hirshman (1995) proposed that criterion setting was largely determined by the item strengths observed during study. His list-strength experiments, like most recognition memory designs, did not vary the parameters considered by Maddox and Bohil. Hirshman’s conclusion that subjects base their criterion setting on the range of strengths that they can recall immediately prior to the recognition test, rather than estimating the mean target–lure difference or attempting to probability match, applies specifically to the list-strength paradigm. In addition, Hirshman counterbalanced order of test condition and did not report any block-to-block changes; thus his data and analyses do not speak to changes that occur between conditions. We incorporate the range of strengths as a possibly useful statistic, but do not rule out the other decision models considered by Hirshman.

This summary leads us to some heuristic expectations of criterion setting when feedback is not provided. Subjects attempt to probability match, setting the “old” rate equal either to the base rates contained in the instructions or, by default, to 0.5. If the base rate is in fact 0.5, this strategy also maximizes accuracy by setting c = 0. By this principle, a shift to a new test should change the criterion, but we speculate that this tendency is tempered by two other tactics.
First, subjects are motivated to keep the false-alarm rate from increasing. This intuition is consistent with COBRA if the psychological “reward” of a false alarm is imagined to be strongly negative. Second, subjects display inertia, changing the criterion only if necessary to control the false-alarm rate. These ideas are informal and we offer them only as a framework within which to understand experimental results. To illustrate their application, however, consider the effect of shifting between test lists of different discriminability. In one case, the task is made harder by using weaker targets (Fig. 4A). If the criterion (cL) is held fixed, the yes rate will decrease. To keep the yes rate constant, however, would require an increase in the false-alarm rate, so we expect no change in criterion. In the other case, the task is made easier by using stronger targets (Fig. 4B). Now a more conservative criterion would lower the yes rate, move toward probability matching, and lower the false-alarm rate, all desirable outcomes. Working against such a change, however, is the inertial principle, and no strong prediction can be made.

We now consider the factors influencing criterion setting, and distinguish several kinds of results. Criterion location has been shown to depend on group membership, on the nature of study or test lists, and on the nature of items within test lists. In the first two cases, we refer to criterion differences, reserving the phrase criterion shifts for the within-list case. Most of our heuristic principles apply equally to all these cases, but the inertial principle has the most force in mixed test lists, where adjusting the yes rate or false-alarm rate would have to be accomplished on a trial-by-trial basis. As a working hypothesis, we predict that such changes will not be made unless the classes of items being tested require very different processing. Even this vague expectation serves to tie together a large class of results.

Fig. 4. Distributions of target and lure memory strengths. Each panel marks two criterion locations: fixed cL and constant yes rate. (A) Subjects encounter weaker targets at some point in the test list. Maintaining a constant yes rate would require using a more liberal response criterion that would increase the false-alarm rate. (B) Subjects encounter stronger targets at some point in the test list. Maintaining a constant yes rate would require setting a more conservative criterion that would lower both the hit and false-alarm rate. Here, though, accuracy improves even if the subject does not change the criterion.
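The two scenarios of Fig. 4 can be worked through numerically. The sketch below assumes the equal-variance model, equal numbers of targets and lures, and a criterion measured from the lure mean. The baseline criterion and all d′ values are hypothetical, and the bisection helper is ours.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def rates(k, d_prime):
    """(H, F) for criterion k measured from the lure mean, equal-variance model."""
    return PHI(d_prime - k), PHI(-k)


def criterion_for_yes_rate(target_yes_rate, d_prime, lo=-10.0, hi=10.0):
    """Bisection for the criterion giving a desired yes rate, YR = (H + F) / 2."""
    for _ in range(100):
        mid = (lo + hi) / 2
        hit, fa = rates(mid, d_prime)
        if (hit + fa) / 2 > target_yes_rate:
            lo = mid      # yes rate too high: raise the criterion
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical baseline list: d' = 1.5 with the criterion 0.75 above the lure mean.
k0, d0 = 0.75, 1.5
baseline_yes = sum(rates(k0, d0)) / 2

summary = {}
for label, d_new in (("weaker", 0.75), ("stronger", 2.5)):
    fixed = rates(k0, d_new)                              # criterion left where it was
    k_yes = criterion_for_yes_rate(baseline_yes, d_new)   # criterion that restores YR
    matched = rates(k_yes, d_new)
    summary[label] = {"fixed": fixed, "matched": matched, "k_matched": k_yes}
    print(label, "fixed H,F =", [round(x, 3) for x in fixed],
          "| constant-yes-rate H,F =", [round(x, 3) for x in matched])
```

With weaker targets, restoring the baseline yes rate requires a lower criterion and therefore more false alarms; with stronger targets it requires a higher criterion and lowers both hits and false alarms, in line with the caption of Fig. 4.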
IV. Between-Group Criterion Differences
It is common to ask whether different types of subjects have better or poorer memory sensitivity; an equally reasonable question is whether such groups employ different decision criteria when responding to the same stimuli. The answer to the latter is clearly affirmative. For example, older adults have been shown to set more liberal response criteria than younger adults (Benjamin, 2001, Experiment 1) and amnesics set more liberal criteria than control subjects (Dorfman, Kihlstrom, Cork, & Misiaszek, 1995). Subjects tend to set more conservative criteria for easier tasks (Hirshman, 1995), which could easily explain these between-group differences: Amnesics and older adults have poorer memory, which would make the task more difficult for them and a liberal criterion more appropriate because it increases their hit rate. For other groups, the memory sensitivity effects are less obvious but criterion differences exist nonetheless: Patients with panic disorder use a more liberal criterion than controls (Windmann & Krüger, 1998), as do dementia patients (Woodard, Axelrod, Mordecai, & Shannon, 2004). Women who claim to have recovered previously forgotten memories of sexual abuse have a more liberal response criterion in the Deese–Roediger–McDermott (DRM) paradigm task than controls, whereas women who have always remembered such abuse have a more conservative criterion (Clancy, McNally, Schacter, & Pitman, 2000). Similarly, individuals who either reported memories of having been abducted by aliens or who thought they had been abducted despite the absence of any explicit memories of the experience had more liberal response criteria than controls (Clancy, McNally, Schacter, Lenzenweger, & Pitman, 2002).

V. Between-List Criterion Differences
Considering the more typical subject population of healthy college students, it is easy to find examples of criterion shifts that occur between different test lists. For example, subjects are sensitive to changes in task demands in roughly the manner expected from Maddox and Bohil’s modeling. They set a more liberal criterion when the ratio of studied items to lures on the test is greater, whether they are (Van Zandt, 2000) or are not informed of the ratio changes (Heit, Brockdorff, & Lamberts, 2003). The consequence of this liberal shift is a greater yes rate. Subjects also establish a more liberal criterion when they are misled that the proportion of studied items on the test is increased (Rotello et al., 2006; Strack & Forster, 1995). Few memory experiments use payoff schemes, but Van Zandt (2000, Experiment 2) showed that subjects are responsive to manipulations of the reward structure and set their criteria accordingly. Subjects also set their decision criteria in response to experimental instructions: A number of studies have demonstrated that subjects are more willing to say “old” if they are instructed to “guess old” when uncertain, for example (Postma, 1999; Rotello et al., 2006).

Criterion location is also affected by between-list manipulations of study strength or task difficulty. Benjamin (2005) asked some subjects to make recognition judgments on studied items and lures from the same semantic classes as the targets, whereas other subjects identified studied items amid lures from different (unstudied) semantic classes. Because the constant element across conditions was the studied items, the criterion location was measured by the hit rate (cT). Subjects who were given the easier task set a more conservative criterion, producing a lower hit rate, than subjects who took the harder test. Using the same materials in a within-subject design, Benjamin and Bawa (2004) showed that subjects shift their criterion between lists when their first test list is easy and the second is hard, but not if the first test is hard and the second is easy.

In a list-strength design in which subjects are tested after studying words that were all weak (studied once each or at a fast presentation rate), all strong (studied repeatedly or at a slow presentation rate), or a mixture of strong and weak items, subjects consistently set cL more conservatively on tests of strong rather than weak items (Balakrishnan & Ratcliff, 1996; Hirshman, 1995; Stretch & Wixted, 1998a). This observation is consistent with the subject’s presumed goal of maximizing accuracy by balancing false alarms to lures against missed targets. In addition, the false-alarm rate is greater for the weak items in the pure (weak-only) list than for weak items in the mixed lists, and is lower for the strong items on a pure-strong list than for strong items on a mixed list (Hirshman, 1995).
All of these results point to criterion settings that are responsive to the average strength of test lists, being more conservatively placed on easier tests (yielding a lower false-alarm rate) and more liberally placed on harder tests (yielding a higher false-alarm rate). Hirshman (1995, Experiment 4) also showed that the study conditions are sufficient to allow a criterion to be set: Subjects studied either a mixed list or a pure weak list, and were tested on the weak items only. Despite being tested on identical items, subjects who had studied the mixed list set a higher criterion (cL) that resulted in fewer false alarms. Hirshman’s explanation was that the subjects estimated the range of the underlying memory strengths of studied items by covertly recalling a few items prior to the recognition test. The presence of stronger items on the study list would lead the subject to expect a greater range of strengths at test (i.e., a higher target distribution), and thus a higher criterion would be needed to maximize accuracy. In contrast, Verde and Rotello (2007) demonstrated that, when the study conditions are held constant across conditions, the criterion is set by the strength of the first few items on the test list: Being tested initially on strong items resulted in a higher criterion (lower false-alarm rate) than being tested initially on weak items. Their result suggests that the subjects’ predictions about the strength of the test items, which help to locate the criterion, can be revised over the very short term.
VI. Within-Test Criterion Shifts

A. EVIDENCE AGAINST SHIFTS: STRENGTH MANIPULATIONS
The results we have described so far are, in some ways, fairly obvious. After all, a cognitive system that could not set the decision criterion in response to encoding or testing conditions would be quite inefficient. A more interesting question is how adaptive that response criterion is to condition changes over the shorter term: Can we adjust our criterion on the fly, to respond optimally to different kinds of test items presented in a single list?

Several striking failures to observe strength-based criterion shifts have been reported. Stretch and Wixted (1998b) repeated one set of study words five times each; these were colored red. A second set of study words, randomly mixed with the red words, was presented once each in blue. At test, accuracy could be maximized by setting a higher criterion for the stronger items than for the weaker items. Across five experiments that made the strength manipulation increasingly obvious to the subjects, no shifts in cL were observed: The false-alarm rate for lures presented in the color associated with the stronger items (red) equaled that for lures tested in the color (blue) associated with the weaker study items. Similarly, Benjamin (2001) gave subjects DRM lists to study; some of those lists were repeated three times. The false-alarm rate to the noncritical lures did not differ with list strength, consistent with a constant decision criterion being used throughout the test list. Verde and Rotello (2004b) used an associative recognition task in which some pairs were strengthened via repetition; both strong and weak pairs were tested in a random order, along with rearranged pairs (lures made from studied words). The false-alarm rate to weak and strong pairs was the same, revealing no evidence of a criterion shift in response to stimulus strength.

In the most dramatic demonstration of this basic result, Morrell, Gaitan, and Wixted (2002) made the strong and weak classes obvious: Strong and weak items differed in both content and format (names of professions vs pictures of birds). Still, subjects showed no evidence that they shifted their criterion to be more conservative for the stronger class of stimuli, thus missing an opportunity to improve their accuracy: The false-alarm rate to lures from the weaker class was the same as to lures from the stronger class.
One explanation for this reluctance to shift decision criterion on a trial-by-trial basis is, as suggested earlier, inertial: Cognitive effort is required to constantly adjust criteria (Morrell et al., 2002; Stretch & Wixted, 1998b). An alternative view is that changing the criterion takes some amount of time, so that subjects’ criterion shifts continually lag behind the actual changes in task difficulty. Support for the lag hypothesis comes from Brown and Steyvers (2005), who asked subjects to perform a speeded lexical decision task in which the nonwords were either relatively easy or relatively difficult to distinguish from actual words. These conditions are analogous to the strong and weak conditions, respectively, studied by Wixted and colleagues (without the memory component, of course), except that the easy and hard conditions appeared on the test list in alternating blocks (that were not identified for the subjects). Modeling work revealed that subjects did shift their criteria in response to the changing test conditions, but that the criterion shift lagged behind the task shift by an average of 14 test trials. A similar study using a recognition task was recently reported by Verde and Rotello (2007). They manipulated the strength of different classes of study items, like Stretch and Wixted (1998b), but controlled the test order so that all the strong items were presented before the weak items (Experiments 1–3 and 5), or all the weak items were tested before the strong ones (Experiment 4). The boundary between the easy and hard blocks was not distinguished for the subjects, and the blocks were fairly long (80 items each) so that the criterion would have sufficient time to catch up. The false-alarm rates were equal in the strong and weak blocks in four of the five experiments.
Thus, it does not appear that subjects attempted to shift their criterion cL in accordance with task difficulty: If they had, even at some lag after a shift between blocks, Verde and Rotello would have detected a change in the false-alarm rate. The only experiment in which a criterion shift was detected was one in which subjects were given feedback on the accuracy of the recognition decisions (Experiment 5), presumably because feedback gave the subjects important information about the trade-off between their false-alarm and miss rates.

In summary, in the absence of feedback to the subject, no evidence has been found for criterion shifts that occur in response to a selective strengthening of one class of studied items in memory relative to another. The false-alarm rate to the weaker class of stimuli equals that to the stronger class, regardless of whether those classes are randomly intermixed (Morrell et al., 2002; Stretch & Wixted, 1998b) or blocked at test (Verde & Rotello, 2007). One might conclude that criterion shifts do not occur within lists. After all, subjects in these experiments would have performed better (had higher accuracy levels) if they had shifted their criterion, and Wixted and his colleagues took pains to ensure that their subjects understood the memory improvements that come with greater strength. As we describe next, however, there is actually a great deal of evidence for criterion shifts that occur within a single test list.

B. EVIDENCE FOR SHIFTS: PROCESSING AND STIMULUS MANIPULATIONS
The observation that subjects do not shift their recognition criterion within list in response to changes in the strength of the test items has often been over-interpreted to imply that subjects do not make such shifts in response to any experimental or stimulus factor. For example, Diana, Reder, Arndt, and Park (2006) asserted that “if participants do not shift their criterion within list even under these [Stretch & Wixted’s] very encouraging circumstances, it is unlikely that they will do so under the types of conditions used by [other researchers]” (p. 2). Contrary to this plausible inference, there is considerable evidence for within-list criterion shifts.

1. Two Processing Effects: Processing Time and Revelation
In the response-signal paradigm, subjects are asked to make their recognition decisions within a narrow time window after a signal-to-respond; the timing of the signal is unpredictable, occurring at a variety of shorter (e.g., 100 and 300 ms) and longer (1000 and 2000 ms) lags after the onset of the test probe. Thus, there is a variable amount of time during which a stimulus may be processed before a decision is required. Virtually every response-signal experiment has shown that the false-alarm rate declines with processing time. Because information is continually extracted from the memory probes during processing, these false-alarm rate changes may reflect changes in the form of the distribution with time rather than changes in the location of the decision criterion (Lamberts, Brockdorff, & Heit, 2003). However, a clever experiment by Heit et al. (2003) clearly showed that subjects do shift their criterion across signal lags. They used a response-signal paradigm in which the ratio of lures to studied items either increased, decreased, or was constant with lag. Although memory sensitivity was equal across these three conditions, subjects’ response criterion (measured with c and compared across conditions for a particular response-signal lag) shifted in correspondence with the proportion of studied items tested at that lag. As the ratio of new to old fell, the criterion became more liberal; as the ratio rose, so too did the criterion. Contrary to the claims of Wixted and colleagues (Morrell et al., 2002; Stretch & Wixted, 1998b), subjects in the Heit et al. experiment were anything but reluctant to shift their criterion; they did so on a trial-by-trial basis while under severe time pressure.
Another example of a processing manipulation that results in a criterion shift is the revelation effect, which we discussed earlier. In revelation experiments, the trials on which revelation does and does not occur are randomly intermixed, suggesting that subjects might be changing their response criterion on a trial-by-trial basis. Indeed, Niewiadomski and Hockley (2001) and Verde and Rotello (2003, 2004a) demonstrated that the revelation effect is often due to a liberal shift in the response criterion that occurs within list.

2. Two Stimulus Effects: Emotional Valence and Subjective Memorability
Intuitively, emotional events in our lives are remembered better than neutral, nonemotional events. This intuition is an assumption in the emotion literature, and the data often support that view (Phelps, LaBar, & Spencer, 1997). Emotionally valenced stimuli, particularly negative stimuli, are often recalled more easily than neutral stimuli, but intrusion data are typically not reported. To evaluate memory sensitivity effects, we must understand the role of response bias. A possible interpretation of those data is simply that subjects are more willing to report negative events, both true and false, and data from recognition tasks support that view. In recognition, memory sensitivity is sometimes enhanced for negatively valenced stimuli, but that effect is not always observed (Maratos, Allen, & Rugg, 2000; Windmann & Krüger, 1998; Windmann & Kutas, 2001). In contrast, all recognition studies show clear effects of emotion on response bias: Subjects have a more liberal response criterion for negatively charged stimuli than for neutral or positively valenced items (Dougal & Rotello, in press; Windmann & Krüger, 1998; Windmann & Kutas, 2001; Windmann, Urbach, & Kutas, 2002). These experiments presented the different classes of stimuli (emotional or not) in a random order within a single test, again demonstrating that subjects can and do change their response criterion on a trial-by-trial basis.

Wixted (1992) asked subjects to study HF, LF, and rare words (those that occurred less than once in 7 million words). Compared with the HF words, the rare words elicited both a higher hit rate and a higher false-alarm rate; these data are consistent with a liberal criterion shift that is made on a trial-by-trial basis. Such a criterion shift is an inappropriate response to those rare words (it increases the false-alarm rate), but it might occur if subjects thought that rare words would be easier to remember than more common words. [Stretch and Wixted (1998b) also make this point.]
Relatedly, subjects set a more liberal criterion for saying “old” to common rather than bizarre test pictures (Worthen & Eller, 2002), perhaps reasoning that the bizarre stimuli would be easier to recognize and thus justifying a stricter criterion for those items. A more extreme example of the role of subjective memorability comes from research on stimuli that subjects are “sure” they would remember had they been studied: Those items rarely elicit false alarms, as if the criterion were shifted very high (Brown, Lewis, & Monk, 1977; Strack & Bless, 1994; but see Rotello, 1999).
VII. An Interim Summary
General evolutionary principles can be invoked to account for some of the effects summarized so far. For some limited types of stimuli, automatic criterion shifts might be expected, even on a trial‐by‐trial basis. For example, a low criterion that allows false alarms to negatively valenced or fear‐producing stimuli, but few misses, may have survival benefits (LeDoux, 1986). In contrast, there is little survival advantage to either missing or false‐alarming to positive or neutral stimuli. This line of theorizing is limited: It strains credulity to invoke an adaptive rationale for criterion shifts to revelation‐style tasks, or to changes in the nature of the lures. Instead, we have found support for some heuristic principles, proposed earlier, for establishing and adjusting response bias. (1) Subjects place their criteria so as to match their yes rate to the known or implied base rate of targets at test. (2) Subjects take into account the range of strengths they observe, making more “old” responses when the average familiarity of probes is high than when it is low. Together, these heuristics tend to maximize accuracy by balancing false alarms and misses. (3) When the experimental situation changes, criteria may shift in accordance with these same principles, provided that subjects are aware that a change has occurred and are sufficiently motivated to overcome the inertia principle. Awareness of changed conditions could come about through feedback from the experimenter, or when there is an obvious change in the nature of the task: Lures that come from a different semantic class (Benjamin & Bawa, 2004), altered properties of the test stimuli (such as emotionality), or new aspects of the processing task (revelation tasks, processing time). Although bias shifts have been observed in all these situations, they seem to be tempered by resistance to increasing the false‐alarm rate and by a more general inertia.
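The first heuristic can be made concrete with a small sketch. Assuming an equal‐variance Gaussian model (lures centered at 0, targets at d′, unit standard deviations), the code below searches for the criterion at which the overall “old” rate equals the proportion of targets at test. The function names, the bisection search, and the example values are illustrative assumptions and are not taken from any of the studies reviewed here.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit-variance Gaussian shared by both distributions


def yes_rate(criterion, d_prime, p_target):
    """Overall 'old' rate for a criterion on the strength axis.

    Assumes lures ~ N(0, 1) and targets ~ N(d_prime, 1).
    """
    hit = 1.0 - STD_NORMAL.cdf(criterion - d_prime)
    false_alarm = 1.0 - STD_NORMAL.cdf(criterion)
    return p_target * hit + (1.0 - p_target) * false_alarm


def matching_criterion(d_prime, p_target, lo=-10.0, hi=10.0, tol=1e-9):
    """Bisection search for the criterion whose yes rate equals p_target.

    The yes rate falls monotonically as the criterion rises.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if yes_rate(mid, d_prime, p_target) > p_target:
            lo = mid  # too many 'old' responses: raise the criterion
        else:
            hi = mid  # too few 'old' responses: lower the criterion
    return (lo + hi) / 2.0


# Hypothetical base rates of targets at test, with d' fixed at 1.0.
for p in (0.25, 0.50, 0.75):
    print(p, round(matching_criterion(1.0, p), 3))
```

Under these assumptions, a higher proportion of targets yields a lower (more liberal) criterion, and with equal numbers of targets and lures the matching criterion falls midway between the two distribution means.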
Another factor that might be expected to matter is memory strength. Our survey reveals, however, that although subjective memorability influences criterion, real memorability (item strength) does not (Verde & Rotello, 2007). Perhaps the strength effects in the literature are too weak to be noticed, but d′ differences alone are not good predictors of within‐list criterion changes: Singer and Wixted (2006) pointed out that they and others (Morrell et al., 2002) used large differences in memory strength yet observed no shift in criteria.
VIII. Distribution Shifts Masquerading as Criterion Shifts
To this point, we have interpreted positively correlated changes in hit and false‐alarm rates as criterion shifts, assuming that the distributions remain fixed (Fig. 5A). The reverse interpretation, however, is often equally possible: The strength of one or both of the underlying memory distributions may shift while the criterion remains fixed (Fig. 5B). If only the lure distribution shifts, then the false‐alarm rate will change in the absence of a hit rate change, and vice versa if only the target distribution shifts. Although these alternatives are substantively very different, it is not possible to distinguish between them in the absence of a set of test items that is truly constant across conditions (e.g., a set of completely unrelated lures). Instead, arguments for one or the other interpretation are typically based on parsimony. We have already described one empirical phenomenon that has been interpreted either as a criterion shift or as a distribution change: the mirror effect. Next, we describe two effects that have been interpreted as response‐bias differences but could also be taken as evidence for distribution shifts.
Fig. 5. The equivalence of criterion‐shift and distribution‐shift interpretations of data. (A) Criterion shift: Two conditions differ only in criterion location. (B) Distribution shift: Both target and lure distributions shift upward by the same amount; the criterion does not move. Performance in panels (A) and (B) is identical.
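The equivalence illustrated in Fig. 5 can be checked numerically. The sketch below assumes Gaussian strength distributions with unit standard deviation; the particular means, criterion, and shift size are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def rates(mu_lure, mu_target, criterion, sd=1.0):
    """Return (hit rate, false-alarm rate) for Gaussian strength distributions."""
    hit = 1.0 - NormalDist(mu_target, sd).cdf(criterion)
    false_alarm = 1.0 - NormalDist(mu_lure, sd).cdf(criterion)
    return hit, false_alarm


# Analogue of panel A: distributions fixed, criterion lowered by 0.5.
criterion_shift = rates(mu_lure=0.0, mu_target=1.5, criterion=0.75 - 0.5)

# Analogue of panel B: criterion fixed, both distributions raised by 0.5.
distribution_shift = rates(mu_lure=0.5, mu_target=2.0, criterion=0.75)

print(criterion_shift)
print(distribution_shift)
```

Both calls return the same pair of rates, because only the distance between the criterion and each distribution mean enters the calculation. Hit and false‐alarm rates alone therefore cannot reveal which of the two moved.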
A. STUDY‐TEST DELAY
Study‐test delays have been found to influence the false‐alarm rate, even when the delays are manipulated within a single test list, and these changes in the false‐alarm rate have been interpreted as criterion shifts (changes in cL). For example, Singer, Gagnon, and Richards (2002) asked their subjects to read two different sets of stories, separated by a delay. After subjects had read the second set of stories, they were given a recognition test on brief sentences from each story, paraphrases of studied sentences, and new but related sentences. The results were clear: Subjects had a higher false‐alarm rate when responding to sentences and lures from the stories studied first (i.e., those tested after a delay). Singer and Wixted (2006) adapted this methodology to a more standard memory paradigm in which two lists of words were studied, again separated by a delay. The words on each list were exemplars of particular semantic categories (e.g., birds, diseases). Items from both study lists were randomly intermixed on the same test list, along with new exemplars from studied categories. At shorter interstudy delays (40 min), no evidence for differences in the false‐alarm rates was found. With a two‐hour interstudy delay, however, the false‐alarm rate to the items from the delayed‐test condition was greater than in the immediate‐test condition. In both papers, Singer and his colleagues interpreted the false‐alarm rate difference as a within‐list criterion shift, but it is equally plausible that the familiarity of the targets and their corresponding lures declined with delay. Because the lures were directly tied to particular study stories or, by virtue of their semantic category, to the study list, the familiarity of the targets and lures would be expected to decline together.
B. CONTEXT EFFECTS
A similar explanation can account for changes in the “old” response rate in experiments that manipulate the match between study and test contexts. A number of recognition experiments have posed the question of whether testing items in the same context (background color, font, screen location, voice of speaker, etc.) as that in which they were studied increases memory sensitivity. Although some studies have found sensitivity benefits, many have found what appear to be bias effects instead: Subjects are more willing to say “old” to studied items in a studied context and to lures tested in a studied context (Dougal & Rotello, 1999; Feenan & Snodgrass, 1990; Goh, 2005; Murnane & Phelps, 1994, 1995; Murnane, Phelps, & Malmberg, 1999). In these experiments, the same‐context probes are mixed randomly with the different‐context probes at test. Thus, one might interpret the greater “old” rate to same‐context probes as a trial‐by‐trial shift in the response criterion. However, the presence of matching context features increases the familiarity of both targets and lures; in the absence of criterion shifts, this familiarity enhancement increases both hits and false alarms. Singer and Wixted (2006) also interpreted their study‐test delay results as a kind of context effect, noting greater encoding‐test match in the immediate testing than the delay condition.
C. IDENTIFYING THE CHANGE
As we suggested earlier, a relatively simple method for disentangling distribution shifts from criterion shifts is to create an “anchor” distribution that is common across all conditions. Criterion shifts may then be evaluated against this “fixed” distribution; this is the method adopted by Benjamin and Bawa (2004) and Verde and Rotello (2007), for example. Although this method appears straightforward, it is not always possible. Consider the effects of study‐test delay: It is not possible to have a common target distribution because, by definition, those targets were studied at different points in time. It is also not possible to have a common lure distribution, unless there is a single (unrelated) set of lures. Of course, with only one set of lures and a single test list, only a single false‐alarm rate is observable. An alternative solution, based on event‐related potential (ERP) signals of criterion shifts, is more speculative. The prefrontal cortex has been found to play an important role in decision making (Bechara, Damasio, & Damasio, 2000; Bechara, Damasio, Damasio, & Lee, 1999). Windmann and Kutas (2001) found that the ERP signal for hits and false alarms differed according to the emotional valence of the memory probe. The effects were seen very early in processing (300 ms) and over prefrontal sites. They argued that the frontal cortex may be responsible for relaxing the response criterion for negative stimuli. Windmann et al. (2002) also asked subjects to recognize words of different emotional valences while ERP signals were recorded; they then divided subjects into liberal and conservative groups based on their “old” response rates. The ERP signals supported Windmann and Kutas’s assertion that the prefrontal cortex plays a role in setting the decision criterion, and that it does so early (300–500 ms poststimulus) for these emotional stimuli: The largest differences in the ERP signals of conservative and liberal subjects were observed there.
If these results generalize, then it might be possible to distinguish criterion shifts from distributional shifts based on the early frontal ERP signals.
IX. Designs with Multiple Responses
It is a truism of signal‐detection theory that binary responses provide less information to the experimenter than the subject has available. Most of the experiments described so far require a simple choice between “old” and “new,” and the corresponding models contain a single criterion. In this section and the next, we expand our discussion to multiresponse paradigms.
A. THE REMEMBER–KNOW PARADIGM
Our modest first stop on this journey is the three‐response remember–know design (Tulving, 1985), in which ‘‘old’’ responses are divided by the subject into those that are explicitly ‘‘remembered’’ and those that are merely familiar, or ‘‘known.’’ As with any other design, evaluation of response bias in the remember–know paradigm requires a model. We employ the one‐dimensional SDT model sketched in Fig. 6 (Donaldson, 1996; Wixted & Stretch, 2004). Targets and lures generate distributions in SDT fashion; a high criterion separates remember from know responses and a lower criterion separates know from new responses. In this model, then, remembers are viewed simply as high‐confidence old judgments. The first thing to notice about this representation is its claim that subjects can adopt two criteria—two levels of response bias—simultaneously. Indeed, part of the appeal of the remember–know paradigm is that, unlike other putative methods of separating explicit and implicit memory processes, it does not require multiple testing conditions. The ability to make comparisons between two simultaneously maintained biases avoids the possible confounds that arise when measuring criterion shifts across trials or conditions. Evidence that subjects can indeed hold multiple criteria in place has been provided by rating experiments, which we discuss in the next section.
Fig. 6. The equal‐variance one‐dimensional model of remember–know judgments. Targets and lures differ in average strength. To decide whether a memory probe is remembered, subjects use a high criterion (r); the lower criterion (k) separates known items from those judged to be new. Strengths below k yield “new” responses, those between k and r yield “know” responses, and those above r yield “remember” responses.
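The model in Fig. 6 translates directly into response probabilities. The following sketch assumes Gaussian strength distributions with a common standard deviation; the function name and the parameter values (means and criteria) are hypothetical and serve only to show how the two criteria partition the strength axis.

```python
from statistics import NormalDist


def remember_know_rates(mu, k, r, sd=1.0):
    """Response probabilities for items whose strength is N(mu, sd).

    k is the old-new (know) criterion and r the remember criterion;
    the model requires r >= k.
    """
    if r < k:
        raise ValueError("remember criterion must not fall below know criterion")
    dist = NormalDist(mu, sd)
    return {
        "remember": 1.0 - dist.cdf(r),       # strength above r
        "know": dist.cdf(r) - dist.cdf(k),   # strength between k and r
        "new": dist.cdf(k),                  # strength below k
    }


# Hypothetical parameters: lures at 0, targets at 1, criteria at 0.5 and 1.5.
targets = remember_know_rates(mu=1.0, k=0.5, r=1.5)
lures = remember_know_rates(mu=0.0, k=0.5, r=1.5)
print(targets)
print(lures)
```

Because both criteria lie on a single axis, the overall “old” rate is simply the sum of the remember and know rates, which is the sense in which remember responses act as high‐confidence old judgments in this model.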
The one‐dimensional model allows us to ask some interesting questions about remembering and knowing. How do the criteria shift in response to changes in the task conditions? If subjects are asked to shift one of the criteria, say the old–new bound, how do the other criteria react, if at all? Subjects do shift their old–new bound under biasing instructions, and this change is accompanied by a shift in the remember–know bound (Rotello et al., 2006). Similarly, subjects set a more conservative criterion for saying they “remember” a particular stimulus when a narrow, restrictive definition of remembering is provided. This shift in the remember–know bound produces a corresponding numerical (but statistically nonsignificant) shift in the old–new criterion (Rotello, Macmillan, Reeder, & Wong, 2005). The most popular use of the remember–know design is to uncover interactions involving the remember and know response rates (Gardiner & Richardson‐Klavehn, 2000). For example, Gregg and Gardiner (1994) found that auditory and visual presentations led to about the same remember rate (0.10 for auditory, 0.11 for visual) but very different know rates (0.27, 0.52). Their two‐process conclusion was that presentation modality affected familiarity but not recollection. No mention is made of response bias in this interpretation, and it was implicitly understood that these are sensitivity effects. However, Dunn (2004) applied the one‐dimensional, equal‐variance model and found that (1) overall sensitivity was somewhat greater for visual presentations, and (2) the remember criterion measured as cT remained fixed (1.26 vs 1.22), yielding a constant remember rate; but the old–new criterion decreased (0.33 vs −0.33), producing the increase in knows. Figure 7A illustrates Dunn’s modeling, and supports the conclusion that Gregg and Gardiner’s results are primarily due to response bias.
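A rough version of this analysis can be carried out by z‐transforming the reported response rates. The sketch below is not Dunn’s (2004) fitting procedure, which estimates all parameters of the model jointly; it assumes only a unit‐variance target distribution and expresses each criterion as its distance above the target mean (our reading of cT). The function name is hypothetical.

```python
from statistics import NormalDist


def criterion_from_target_mean(response_rate):
    """Distance of a criterion above the target mean, in target SD units,
    given the proportion of targets whose strength exceeds it."""
    return -NormalDist().inv_cdf(response_rate)


# Remember and know rates to targets reported by Gregg and Gardiner (1994).
rates = {"auditory": (0.10, 0.27), "visual": (0.11, 0.52)}

estimates = {}
for modality, (remember, know) in rates.items():
    estimates[modality] = {
        # Only remember responses exceed the remember criterion.
        "remember": criterion_from_target_mean(remember),
        # Remember and know responses both exceed the old-new criterion.
        "old_new": criterion_from_target_mean(remember + know),
    }
    print(modality, estimates[modality])
```

With these rates, the remember criterion is nearly the same in the two modalities, whereas the old–new criterion lies above the target mean for auditory presentation and below it for visual presentation. The direct transforms thus point in the same direction as the model fits: a stable remember criterion and a more liberal old–new criterion for visual items.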
In a second data set examined by Dunn (2004), Gardiner and Java (1990) found that words were remembered more often than nonwords (0.28 vs 0.19) but known less often (0.16 vs 0.30); they concluded that words tended to be recognized more on the basis of recollection, nonwords more on the basis of familiarity. The one‐dimensional model (Fig. 7B) leads to a very different conclusion: Both sensitivity (0.95 vs 1.01) and the old–new criterion (measured as cL = 1.07 vs 1.04) were about the same for the two stimulus categories, but the remember criterion (cL = 1.55 vs 1.89) was much lower for words. In this case, the effect is almost entirely bias driven: Subjects are less willing to use the “remember” response for strange nonwords than for words they have encountered outside the laboratory. Dunn’s analyses bear only weakly on whether the one‐dimensional model is correct (although the fits of the model are excellent), but make clear the importance of using models to interpret remember–know data. In fact, other models that have been applied to these data, including a model that attempts to capture the conventional, direct interpretation (Murdock, 2006), also reveal a mixture of sensitivity and bias effects (Macmillan & Rotello, 2006). The existence of response bias in remember–know experiments can be neither demonstrated nor rejected without an explicit model.
Fig. 7. One‐dimensional model‐based interpretations of remember–know dissociation experiments. The estimated locations of target distributions, old–new, and remember–know decision criteria are shown for two experiments, plotted on a strength axis (X) running from −3 to 5. (A) Auditory versus visual stimulus presentation. (B) Word versus nonword stimuli. For both experiments, the model accounts for differences between conditions in terms of response bias.
B. CONFIDENCE RATINGS
Many recognition memory experiments use confidence ratings instead of, or as a supplement to, the old–new judgment. The response options may be described verbally (e.g., they may range from “sure new” to “sure old”) or be numerical values (often from 1 to 6). The SDT interpretation is that the subject establishes n − 1 criteria to divide the decision axis into n regions, assigning observations above the highest criterion to the response indicating the most confidence that the test item is old, observations between the top two criteria to the next response, and so forth. If this interpretation is correct, then ratings are an extremely efficient way to observe multiple criterion settings. Rating data can be used to construct ROC curves, which provide several advantages in interpreting response bias. First, as noted earlier, the promise of SDT to separate sensitivity and bias cannot be fulfilled with binary data, because the relative variance of the target and lure distributions cannot be determined. The rating design allows this parameter to be estimated. Second, the form of the ROC limits the possible representations, so that Gaussian models (with which most empirical ROCs are consistent) can be distinguished from threshold and hybrid models (which are sometimes supported). Third, when bias varies across conditions in a rating experiment, sensitivity and bias can be distinguished in a model‐free manner. As an example, consider the data of Dougal and Rotello (in press), shown in Fig. 8. Notice that the curves for the negative, neutral, and positive stimuli all fall at the same height in the space, reflecting equal memory sensitivity. However, the operating points for the negative items each fall to the upper‐right of the others, reflecting a more liberal response bias (higher H and F).
Throughout this chapter we have referred to criteria as points on the strength axis, but as noted earlier they may also be viewed as having different values of likelihood. Likelihood ratio is the optimal decision variable (Green & Swets, 1966), but in applications where sensitivity is fixed it is monotonic with, and thus indistinguishable from, strength [see Eq. (3)]. Varying sensitivity provides a tool for deciding between these measures, but the simple old–new design is not rich enough for a convincing decision.
Fig. 8. Receiver operating characteristic (ROC) data from Dougal & Rotello (in press, Experiment 1b), plotted as hit rate against false‐alarm rate. The filled circles denote the observed responses to neutral stimuli, the minus signs denote the responses to emotionally negative stimuli, and the plus signs indicate the responses to emotionally positive words. Reprinted with permission of the Psychonomic Society.
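The construction of an ROC from rating data can be sketched as follows. Each operating point is obtained by cumulating responses from the most confident “old” category downward, separately for targets and lures. The rating frequencies below are hypothetical and are not data from any study cited here; the least‐squares zROC slope is used only as a simple illustration (maximum‐likelihood fitting is the more usual choice in practice).

```python
from itertools import accumulate
from statistics import NormalDist


def roc_points(target_counts, lure_counts):
    """Cumulative (false-alarm rate, hit rate) pairs from rating frequencies.

    Counts are ordered from the most confident 'old' rating to the most
    confident 'new' rating; the final, trivial point (1, 1) is dropped.
    """
    n_targets, n_lures = sum(target_counts), sum(lure_counts)
    hits = [c / n_targets for c in accumulate(target_counts)][:-1]
    false_alarms = [c / n_lures for c in accumulate(lure_counts)][:-1]
    return list(zip(false_alarms, hits))


def zroc_slope(points):
    """Least-squares slope of z(H) regressed on z(F)."""
    z = NormalDist().inv_cdf
    zf = [z(f) for f, _ in points]
    zh = [z(h) for _, h in points]
    mean_f, mean_h = sum(zf) / len(zf), sum(zh) / len(zh)
    sxy = sum((x - mean_f) * (y - mean_h) for x, y in zip(zf, zh))
    sxx = sum((x - mean_f) ** 2 for x in zf)
    return sxy / sxx


# Hypothetical frequencies on a 6-point scale ('sure old' ... 'sure new').
targets = [60, 40, 30, 25, 25, 20]
lures = [10, 20, 30, 40, 45, 55]

points = roc_points(targets, lures)
slope = zroc_slope(points)
print(points)
print(round(slope, 2))
```

A 6‐point scale yields five operating points, one for each of the n − 1 criteria. Under a Gaussian model, a zROC slope below 1 indicates that the target distribution is more variable than the lure distribution.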
Rating experiments can in principle resolve this ambiguity. In a list‐strength rating design, for example, three models have been proposed for how criteria shift between lists differing in strength. According to the lockstep model (also called the “distance from criterion” view), the spacing between criteria is maintained: Shifting the location of one criterion shifts all of them by the same amount. The likelihood ratio rule assumes that the criteria are placed in such a way as to maintain their relative locations in likelihood terms. That is, if “sure old” was the label used when the relative heights of the distributions were at least 6:1, then that criterion location would remain at 6:1 under different strength conditions. This model requires that the criteria spread out on the strength axis as discriminability decreases. Finally, the range model assumes that subjects try to place their criteria so as to span the same range of the distributions, yielding the same proportions of high‐confidence misses and false alarms as strength changes: As discrimination decreases, the criteria must be placed more closely together. Hirshman (1995) endorsed this range model, but he did so based only on binary yes–no response rates. Although these models make strikingly different predictions about criterion placement in the face of changing memory sensitivity, the existing data do not clearly support one model over another. One type of analysis, using group‐level data, involves the calculation of the location of each criterion relative to the mean of the (constant) lure distribution. Stretch and Wixted (1998a) found that the criteria fanned out as sensitivity decreased, a result that is most consistent with the likelihood ratio model. Likelihood‐based decision axes have been assumed in modern models of memory (REM: Shiffrin & Steyvers, 1997; SLiM: McClelland & Chappell, 1998). In contrast, Balakrishnan and Ratcliff (1996) adopted a more complicated analysis, using individual data.
They compared the forms of the cumulative distribution functions (CDFs) estimated from rating data across conditions that differed in study repetitions. The key idea underlying their analysis is that, according to a likelihood decision rule, a particular strength in memory is not equally likely to be a target in different conditions. Thus, the probability of saying “old” will differ across conditions, even if the memory probe has identical strength. Because of this property, the CDFs will not be parallel to one another as study repetitions are manipulated; instead, the CDFs will cross at the memory strength or confidence level at which the likelihoods equal 1 in each condition. However, in three experiments, including one recognition memory experiment, Balakrishnan and Ratcliff observed CDFs that simply paralleled one another. Parallel CDFs are consistent with a lockstep model in which memory judgments are based simply on strength. The lockstep model is also most consistent with the constant false‐alarm rates seen in within‐list manipulations of memory strength (Stretch & Wixted, 1998b).
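The contrast between the lockstep and likelihood ratio rules can be illustrated with a short calculation. The sketch assumes an equal‐variance Gaussian model (lures at 0, targets at d′), for which the log likelihood ratio at strength x is d′x − d′²/2, so that a criterion held at likelihood ratio β lies at x = ln(β)/d′ + d′/2. The lockstep criteria are anchored, by assumption, midway between the two means. The specific ratios and offsets are hypothetical, and the range model is omitted.

```python
import math


def likelihood_ratio_criteria(d_prime, ratios):
    """Strength-axis locations at which the target:lure likelihood ratio
    equals each value in `ratios` (lures ~ N(0, 1), targets ~ N(d_prime, 1))."""
    return [math.log(b) / d_prime + d_prime / 2.0 for b in ratios]


def lockstep_criteria(d_prime, offsets):
    """Criteria that keep fixed distances (`offsets`) from a central
    criterion, placed here midway between the two means (an assumption)."""
    return [d_prime / 2.0 + o for o in offsets]


def spread(criteria):
    """Distance between the most extreme criteria."""
    return max(criteria) - min(criteria)


ratios = [1 / 6, 1.0, 6.0]    # 'sure new', neutral, and 'sure old' at 6:1
offsets = [-0.9, 0.0, 0.9]    # hypothetical fixed spacing for lockstep

for d_prime in (2.0, 1.0):
    lr = likelihood_ratio_criteria(d_prime, ratios)
    ls = lockstep_criteria(d_prime, offsets)
    print(d_prime, [round(x, 2) for x in lr], [round(x, 2) for x in ls])
```

Under these assumptions, halving d′ doubles the spacing of the likelihood ratio criteria on the strength axis, which is the fanning‐out pattern described above, whereas the lockstep criteria keep their spacing and simply move together.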
Van Zandt (2000) challenged both conclusions, arguing that the decision axis (and therefore criterion location) was based on neither memory strength nor likelihood ratio. If either of those assumptions were correct, then changing subjects’ response bias through payoffs or the ratio of targets and lures on the test would not influence the form of the underlying memory distributions. Instead, the ratings decision criteria would simply shift their locations along the strength or likelihood axis. Confirming an earlier study by Schulman and Greenberg (1970), Van Zandt found that the slope of the zROC, which measures the ratio of the standard deviation of the lure distribution to that of the target distribution, was systematically influenced by the biasing conditions: Steeper slopes were observed when more targets were tested and when rewards were higher for confident “old” judgments. Hirshman and Hostetter (2000) reported a similar, but smaller scale, result: Providing subjects with response scales that included an equal or unequal number of “old” and “new” ratings categories influenced the slope of the zROC. Both of these results are problematic for the simplest models in which either strength or likelihood serves as the decision axis in recognition memory tasks and decision criteria are stable across trials. Treisman and Williams’ (1984) analysis, discussed earlier, provides an answer to this challenge. Recall that their presumed stabilization process shifts the old–new criterion toward the median strength of recently observed stimuli. In a rating design, the magnitude of the criterion shift is influenced by the desired location of each criterion. For example, for the most confident “old” response category, upward (more conservative) criterion shifts are larger in size than downward shifts.
(In the absence of this assumption, that criterion would eventually collapse onto the lower‐confidence criteria, because there are relatively few targets of extremely high strength.) Less extreme confidence criteria have less variability due to the stabilization process. Together, these assumptions explain the zROC slope effects observed by Van Zandt (2000) and others, and they do so within the context of a strength‐based decision model. When more targets are tested, the “old” response criteria, and especially the most conservative criteria, are more likely to be shifted than the “new” response criteria. Thus, the conservative criteria have greater variability than the liberal criteria. Conversely, if more lures than targets are tested, the liberal criteria have greater variability. The net result is a systematic change in observed zROC slope, as shown in a series of simulations by Treisman and Faulkner (1984). This argument, if correct, rebuts Van Zandt’s plague‐on‐both‐your‐houses conclusion, but does not address the discrepancy between the Stretch and Wixted (1998a) and Balakrishnan and Ratcliff (1996) findings. Recent work by Benjamin and Wee (submitted for publication) supports the general importance of criterion variability in recognition memory performance, but does not address the specific argument about whether some criteria have more variability than others.
X. Conclusions and Recommendations
Response bias, according to our survey, is pervasive in recognition memory experiments. Investigators may take any of several attitudes toward bias, depending in part on whether they view it as an inconvenient truth or a cognitive skill of inherent interest. Among their aims are the measurement, control, optimization, and elimination of response bias. We have had the most to say about measurement; here we summarize our recommendations on this issue and add a few comments about the other, more ambitious, goals.
A. CHOOSE THE RIGHT SENSITIVITY MEASURE
Even if two experimental conditions differ only in response bias, some estimates of sensitivity (such as d′) are dependent on criterion if the underlying memory distributions are not of equal variance (see Fig. 1B). In this case, substantive conclusions about memory accuracy require a more appropriate measure such as da (or Az, the area under the ROC; see Verde, Macmillan, & Rotello, 2006). Failing to use a proper measure of sensitivity can lead researchers to attribute what are actually response‐bias effects to sensitivity differences (Dougal & Rotello, in press; Verde & Rotello, 2003). Once sensitivity is correctly evaluated and found to be fixed across conditions, it matters little whether response bias is summarized with c, β, cT, cL, or some other measure.
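The dependence of d′ on criterion under unequal variances, and the stability of da, can be demonstrated numerically. The sketch below uses the standard SDT definitions d′ = z(H) − z(F), da = [2/(1 + s²)]^(1/2) [z(H) − s·z(F)], where s is the zROC slope (lure SD divided by target SD), and Az = Φ(da/√2). The distribution parameters and criterion values are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal; also serves as the lure distribution


def d_prime(hit, fa):
    """Equal-variance sensitivity estimate from one (H, F) pair."""
    return Z.inv_cdf(hit) - Z.inv_cdf(fa)


def d_a(hit, fa, slope):
    """Unequal-variance sensitivity; slope = sigma_lure / sigma_target."""
    scale = math.sqrt(2.0 / (1.0 + slope ** 2))
    return scale * (Z.inv_cdf(hit) - slope * Z.inv_cdf(fa))


def a_z(da):
    """Area under the Gaussian ROC implied by d_a."""
    return Z.cdf(da / math.sqrt(2.0))


def estimates_at(criterion, mu_target=1.5, sd_target=1.25):
    """(d', d_a) obtained when a subject responds at `criterion`.

    Lures ~ N(0, 1); targets ~ N(mu_target, sd_target) (hypothetical values).
    """
    hit = 1.0 - NormalDist(mu_target, sd_target).cdf(criterion)
    fa = 1.0 - Z.cdf(criterion)
    return d_prime(hit, fa), d_a(hit, fa, slope=1.0 / sd_target)


for criterion in (0.25, 0.75, 1.25):
    dp, da = estimates_at(criterion)
    print(criterion, round(dp, 3), round(da, 3), round(a_z(da), 3))
```

In this example the memory distributions never change, yet d′ rises as the criterion becomes more conservative, while da and Az stay constant. An analysis based on d′ alone would therefore mistake a pure bias difference for a sensitivity difference.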
B. CHOOSE THE RIGHT RESPONSE‐BIAS MEASURE
If two experimental conditions differ in sensitivity, the choice of criterion measure does matter: Figure 2 demonstrates that conclusions about constancy of bias vary systematically with one’s choice of measure. The appropriate criterion location measure depends on the question being asked of the data: For example, if it is important to know whether the criterion is fixed relative to the lure distribution, then cL is the best choice. A decision between location and likelihood indexes depends on the specific model being employed.
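The way in which different bias measures can disagree is easy to show in the equal‐variance case. The sketch below assumes the common definitions c = −[z(H) + z(F)]/2, cL = −z(F) (criterion relative to the lure mean), cT = −z(H) (criterion relative to the target mean), and ln β = c·d′; the sign conventions are our assumptions, and the two conditions are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def bias_measures(hit, fa):
    """Equal-variance SDT sensitivity and bias indexes from one (H, F) pair."""
    zh, zf = Z.inv_cdf(hit), Z.inv_cdf(fa)
    d = zh - zf
    c = -(zh + zf) / 2.0
    return {
        "d_prime": d,
        "c": c,                    # relative to the midpoint of the two means
        "c_L": -zf,                # relative to the lure mean
        "c_T": -zh,                # relative to the target mean
        "beta": math.exp(c * d),   # likelihood ratio at the criterion
    }


def condition(mu_target, criterion):
    """(H, F) for lures ~ N(0, 1), targets ~ N(mu_target, 1)."""
    hit = 1.0 - NormalDist(mu_target, 1.0).cdf(criterion)
    fa = 1.0 - Z.cdf(criterion)
    return hit, fa


# Same criterion location on the strength axis, different memory strength.
weak = bias_measures(*condition(mu_target=1.0, criterion=0.75))
strong = bias_measures(*condition(mu_target=2.0, criterion=0.75))
print(weak)
print(strong)
```

Here the criterion has not moved relative to the lure distribution, so cL is identical in the two conditions, yet c changes sign and β falls below 1 in the strong condition. Whether bias is judged to be “constant” thus depends on which index is adopted.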
C. APPLY AN EXPLICIT MODEL TO THE DATA
Some model is always necessary. We have already noted, for example, that adopting an equal‐variance model when variances are unequal can lead to false conclusions. A more complex situation arises in the many memory domains in which dissociations between conditions are sought. Such dissociations may reflect distinct underlying processes, though they provide weak evidence for that conclusion (Dunn & Kirsner, 1988). But it is at least as likely that the dissociations reflect either sensitivity differences or criterion changes across conditions, or both, as Dunn (2004) demonstrated in the remember–know domain. In the absence of a model, an erroneous inference is probable.
D. CONSIDER THE USE OF FEEDBACK AND/OR A FORCED‐CHOICE DESIGN
Few recognition memory experiments provide feedback to subjects, whereas many perceptual experiments do include this feature. The difference may have developed because in many perception experiments the same stimuli are repeated across trials, and learning relevant features might be aided by feedback. In memory, contrariwise, every test item is (typically) different. Nonetheless, feedback does allow the subject to estimate the hit and false‐alarm rate, and that information can be used to adjust the criterion. In many cases, this adjustment can be expected to encourage neutral responding, making feedback a tool for controlling or optimizing bias rather than measuring it. In the two‐alternative forced‐choice (2AFC) design, two stimuli are presented on each test trial and the subject chooses the one that was studied. Forced‐choice response bias tends to be small; distinguishing sensitivity from response‐bias changes is easier when the latter are minimal. The relation between the forced‐choice and yes–no paradigms may not be the same in memory as in perception (Kroll, Yonelinas, Dobbins, & Frederick, 2002; Smith & Duncan, 2004), but this relation is of little consequence if the design is fixed throughout an experiment. Performance in 2AFC is better than in yes–no, so experimental parameters like length of the study and test lists must be altered to avoid ceiling effects. To the extent that feedback and 2AFC eliminate bias, they of course eliminate the opportunity for studying it. Real‐life recognition memory is, arguably, more often yes–no than forced‐choice, and this ecological validity supports the continued use of yes–no even at the cost of increased complexity.
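The size of the 2AFC advantage can be sketched under the textbook equal‐variance SDT assumptions: an unbiased yes–no observer tested on equal numbers of targets and lures achieves proportion correct Φ(d′/2), whereas an unbiased 2AFC observer achieves Φ(d′/√2). As noted above, this relation may not hold exactly in memory tasks, so the calculation is illustrative only; the function names and d′ values are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def pc_yes_no(d_prime):
    """Proportion correct for an unbiased yes-no observer
    (equal numbers of targets and lures, equal-variance model)."""
    return Z.cdf(d_prime / 2.0)


def pc_2afc(d_prime):
    """Proportion correct for an unbiased 2AFC observer
    under the same equal-variance model."""
    return Z.cdf(d_prime / math.sqrt(2.0))


for d in (0.5, 1.0, 2.0, 3.0):
    print(d, round(pc_yes_no(d), 3), round(pc_2afc(d), 3))
```

For any positive d′ the predicted 2AFC accuracy exceeds the yes–no accuracy, and the 2AFC values approach ceiling sooner as d′ grows, which is why list lengths or other parameters may need adjusting when the design is changed.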
E. USE RATINGS AND PLOT ROCS
Our strongest recommendation, to ask for confidence ratings and use the resulting data to construct ROC curves, is an aid to implementing the others. ROCs provide the safest way to estimate sensitivity without the risk of contamination by bias effects, and they constrain the possible representations (e.g., by revealing the equality or inequality of variances). In many cases, ROCs allow nonparametric conclusions, as we saw in the case of Dougal and Rotello’s (in press) data for emotion‐laden words displayed in Fig. 8. For a given false‐alarm rate, memory sensitivity is a monotonic function of the hit rate, so that differences in memory strength are reflected by ROC heights. Alterations in response bias are indicated by points that fall along equal‐sensitivity curves (that is, the ROCs): Points in the left region of a curve reflect more conservative biases and those to the right more liberal biases. Comparing two conditions in ROC space, then, can indicate at a glance whether they differ in underlying sensitivity (curves of different heights), in bias (placement of the points along a common curve), or both.
ACKNOWLEDGMENT
The authors are supported by a grant from the National Institutes of Health (MH60274).
REFERENCES Balakrishnan, J. D., & RatcliV, R. (1996). Testing models of decision making using confidence ratings in classification. Journal of Experimental Psychology: Human Perception and Performance, 22, 615–633. Bechara, A., Damasio, H., & Damasio, A. R. (2000). Emotion, decision making, and the orbitofrontal cortex. Cerebral Cortex, 10, 295–307. Bechara, A., Damasio, H., Damasio, A. R., & Lee, G. P. (1999). DiVerent contributions of the human amygdala and ventromedial prefrontal cortex to decision‐making. Journal of Neuroscience, 19, 5473–5481. Benjamin, A. S. (2001). On the dual eVects of repetition on false recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 941–947. Benjamin, A. S. (2005). Recognition memory and introspective remember/know judgments: Evidence for the influence of distractor plausibility on ‘remembering’ and a caution about purportedly nonparametric measures. Memory & Cognition, 33, 261–269. Benjamin, A. S., & Bawa, S. (2004). Distractor plausibility and criterion placement in recognition. Journal of Memory and Language, 51, 159–172. Benjamin, A. S., & Wee, S. (submitted for publication). Signal detection with criterial variability: Applications to recognition memory. Bohil, C. J., & Maddox, W. T. (2001). Category discriminability, base‐rate, and payoV eVects in perceptual categorization. Perception & Psychophysics, 63, 361–376.
Bohil, C. J., & Maddox, W. T. (2003). A test of the optimal classifier’s independence assumption in perceptual categorization. Perception & Psychophysics, 65, 478–493.
Brown, J., Lewis, V. J., & Monk, A. F. (1977). Memorability, word frequency and negative recognition. Quarterly Journal of Experimental Psychology, 29, 461–473.
Brown, S., & Steyvers, M. (2005). The dynamics of experimentally induced criterion shifts. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 587–599.
Clancy, S. A., McNally, R. J., Schacter, D. L., Lenzenweger, M. F., & Pitman, R. K. (2002). Memory distortion in people reporting abduction by aliens. Journal of Abnormal Psychology, 111, 455–461.
Clancy, S. A., McNally, R. J., Schacter, D. L., & Pitman, R. K. (2000). False recognition in women reporting recovered memories of sexual abuse. Psychological Science, 1, 26–31.
Diana, R. A., Reder, L. M., Arndt, J., & Park, H. (2006). Models of recognition: A review of arguments in favor of a dual-process account. Psychonomic Bulletin & Review, 13, 1–21.
Donaldson, W. (1996). The role of decision processes in remembering and knowing. Memory & Cognition, 24, 523–533.
Dorfman, J., Kihlstrom, J. F., Cork, R. C., & Misiaszek, J. (1995). Priming and recognition in ECT-induced amnesia. Psychonomic Bulletin & Review, 2, 244–248.
Dougal, S., & Rotello, C. M. (1999). Context effects in recognition. American Journal of Psychology, 112, 277–295.
Dougal, S., & Rotello, C. M. (in press). “Remembering” emotional words is based on response bias, not recollection. Psychonomic Bulletin & Review.
Dunn, J. C. (2004). Remember-know: A matter of confidence. Psychological Review, 111, 524–542.
Dunn, J. C., & Kirsner, K. (1988). Discovering functionally independent mental processes: The principle of reversed association. Psychological Review, 95, 91–101.
Feenan, K., & Snodgrass, J. G. (1990). The effect of context on discrimination and bias in recognition memory for pictures and words. Memory & Cognition, 18, 515–527.
Gardiner, J. M., & Java, R. I. (1990). Recollective experience in word and nonword recognition. Memory & Cognition, 18, 23–30.
Gardiner, J. M., & Richardson-Klavehn, A. (2000). Remembering and knowing. In E. Tulving and F. I. M. Craik (Eds.), The Oxford handbook of memory (pp. 229–244). Oxford: Oxford University Press.
Glanzer, M., Kim, K., Hilford, A., & Adams, J. K. (1999). Slope of the receiver-operating characteristic in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 500–513.
Goh, W. D. (2005). Talker variability and recognition memory: Instance-specific and voice-specific effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 40–53.
Green, D. M. (1964). General prediction relating yes-no and forced-choice results. Journal of the Acoustical Society of America, 36, 1042 (Abstract).
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Gregg, V. H., & Gardiner, J. M. (1994). Recognition memory and awareness: A large effect of study-test modalities on “know” responses following a highly perceptual orienting task. European Journal of Cognitive Psychology, 6, 131–147.
Heit, E., Brockdorff, N., & Lamberts, K. (2003). Adaptive changes of response criterion in recognition memory. Psychonomic Bulletin & Review, 10, 718–723.
Hicks, J. L., & Marsh, R. L. (1998). A decrement-to-familiarity interpretation of the revelation effect from forced-choice tests of recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1105–1120.
Hirshman, E. (1995). Decision processes in recognition memory: Criterion shifts and the list-strength paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 302–313.
Hirshman, E., & Hostetter, M. (2000). Using ROC curves to test models of recognition memory: The relationship between presentation duration and slope. Memory & Cognition, 28, 161–166.
Kornbrot, D. E. (2006). Signal detection theory, the approach of choice: Model-based and distribution-free measures and evaluation. Perception & Psychophysics, 68, 393–414.
Kroll, N. A., Yonelinas, A. P., Dobbins, I. G., & Frederick, C. M. (2002). Separating sensitivity from response bias: Implications of comparisons of yes-no and forced-choice tests for models and measures of recognition memory. Journal of Experimental Psychology: General, 131, 241–254.
Lamberts, K., Brockdorff, N., & Heit, E. (2003). Feature-sampling and random-walk models of individual-stimulus recognition. Journal of Experimental Psychology: General, 132, 351–378.
LeDoux, J. E. (1986). Sensory systems and emotion: A model of affective processing. Integrative Psychiatry, 4, 237–248.
Macmillan, N. A., & Creelman, C. D. (1990). Response bias: Characteristics of detection theory, threshold theory, and “nonparametric” measures. Psychological Bulletin, 107, 401–413.
Macmillan, N. A., & Creelman, C. D. (1996). Triangles in ROC space: History and theory of “nonparametric” measures of sensitivity and response bias. Psychonomic Bulletin & Review, 3, 164–170.
Macmillan, N. A., & Creelman, C. D. (2005). Detection theory: A user’s guide (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Macmillan, N. A., & Rotello, C. M. (2006). Deciding about decision models of remember and know judgments: A reply to Murdock (2006). Psychological Review, 113, 657–665.
Maddox, W. T. (2002). Toward a unified theory of decision criterion learning in perceptual categorization. Journal of the Experimental Analysis of Behavior, 78, 567–595.
Maddox, W. T., & Bohil, C. J. (2004). Probability matching, accuracy maximization, and a test of the optimal classifier’s independence assumption in perceptual categorization. Perception & Psychophysics, 66, 104–118.
Maratos, E. J., Allen, K., & Rugg, M. D. (2000). Recognition memory for emotionally negative and neutral words: An ERP study. Neuropsychologia, 38, 1452–1465.
McClelland, J. L., & Chappell, M. (1998). Familiarity breeds differentiation: A subjective-likelihood approach to the effects of experience in recognition memory. Psychological Review, 105, 734–760.
Morrell, H. E. R., Gaitan, S., & Wixted, J. T. (2002). On the nature of the decision axis in signal-detection-based models of recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 1095–1110.
Murdock, B. (2006). Decision-making models of remember/know judgments. Psychological Review, 113, 648–656.
Murnane, K., & Phelps, M. P. (1994). When does a different environmental context make a difference in recognition? A global activation model. Memory & Cognition, 22, 584–590.
Murnane, K., & Phelps, M. P. (1995). Effects of changes in relative cue strength on context-dependent recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 158–172.
Murnane, K., Phelps, M. P., & Malmberg, K. (1999). Context-dependent recognition memory: The ICE theory. Journal of Experimental Psychology: General, 128, 403–415.
Niewiadomski, M. W., & Hockley, W. E. (2001). Interrupting recognition memory: Tests of familiarity-based accounts of the revelation effect. Memory & Cognition, 29, 1130–1138.
Phelps, E. A., LaBar, K. S., & Spencer, D. D. (1997). Memory for emotional words following unilateral temporal lobectomy. Brain and Cognition, 35, 85–109.
Postma, A. (1999). The influence of decision criteria upon remembering and knowing in recognition memory. Acta Psychologica, 103, 65–76.
Ratcliff, R., Sheu, C.-F., & Gronlund, S. D. (1992). Testing global memory models using ROC curves. Psychological Review, 99, 518–535.
Rotello, C. M. (1999). Metacognition and memory for nonoccurrence. Memory, 7, 43–63.
Rotello, C. M., Macmillan, N. A., Hicks, J. L., & Hautus, M. (2006). Interpreting the effects of response bias on remember-know judgments using signal-detection and threshold models. Memory & Cognition, 34, 1598–1614.
Rotello, C. M., Macmillan, N. A., Reeder, J. A., & Wong, M. (2005). The remember response: Subject to bias, graded, and not a process-pure indicator of recollection. Psychonomic Bulletin & Review, 12, 865–873.
Schulman, A. I., & Greenberg, G. Z. (1970). Operating characteristics and a priori probability of the signal. Perception & Psychophysics, 8, 317–320.
Shiffrin, R. M., & Steyvers, M. (1997). A model for recognition memory: REM–retrieving effectively from memory. Psychonomic Bulletin & Review, 4, 145–166.
Singer, M., Gagnon, N., & Richards, E. (2002). Strategies of text retrieval: A criterion shift account. Canadian Journal of Experimental Psychology, 56, 41–57.
Singer, M., & Wixted, J. T. (2006). Effect of delay on recognition decisions: Evidence for a criterion shift. Memory & Cognition, 34, 125–137.
Smith, D. G., & Duncan, M. J. J. (2004). Testing theories of recognition memory by predicting performance across paradigms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 615–625.
Strack, F., & Bless, H. (1994). Memory for nonoccurrences: Metacognitive and presuppositional strategies. Journal of Memory and Language, 33, 203–217.
Strack, F., & Forster, J. (1995). Reporting recollective experiences: Direct access to memory systems? Psychological Science, 6, 352–358.
Stretch, V., & Wixted, J. T. (1998a). Decision rules for recognition memory confidence judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1397–1410.
Stretch, V., & Wixted, J. T. (1998b). On the difference between strength-based and frequency-based mirror effects in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1379–1396.
Treisman, M., & Faulkner, A. (1984). The effect of signal probability on the slope of the receiver operating characteristic given by the rating procedure. The British Journal of Mathematical and Statistical Psychology, 37, 199–215.
Treisman, M., & Williams, T. C. (1984). A theory of criterion setting with an application to sequential dependencies. Psychological Review, 91, 68–111.
Tulving, E. (1985). Memory and consciousness. Canadian Journal of Psychology, 26, 1–12.
Van Zandt, T. (2000). ROC curves and confidence judgments in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 582–600.
Verde, M. F., Macmillan, N. A., & Rotello, C. M. (2006). Measures of sensitivity based on a single hit rate and false-alarm rate: The accuracy, precision, and robustness of d′, Az, and A′. Perception & Psychophysics, 68, 643–654.
Verde, M. F., & Rotello, C. M. (2003). Does familiarity change in the revelation effect? Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 739–746.
Verde, M. F., & Rotello, C. M. (2004a). ROC curves show that the revelation effect is not a single phenomenon. Psychonomic Bulletin & Review, 11, 560–566.
Verde, M. F., & Rotello, C. M. (2004b). Strong memories obscure weak memories in associative recognition. Psychonomic Bulletin & Review, 11, 1062–1066.
Verde, M. F., & Rotello, C. M. (2007). Memory strength and the decision process in recognition memory. Memory & Cognition, 35, 254–262.
Windmann, S., & Krüger, T. (1998). Subconscious detection of threat as reflected by an enhanced response bias. Consciousness and Cognition: An International Journal, 7, 603–633.
Windmann, S., & Kutas, M. (2001). Electrophysiological correlates of emotion-induced recognition bias. Journal of Cognitive Neuroscience, 13, 577–592.
Windmann, S., Urbach, T. P., & Kutas, M. (2002). Cognitive and neural mechanisms of decision biases in recognition memory. Cerebral Cortex, 12, 808–817.
Wixted, J. T. (1992). Subjective memorability and the mirror effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 681–690.
Wixted, J. T., & Stretch, V. (2004). In defense of the signal detection interpretation of remember/know judgments. Psychonomic Bulletin & Review, 11, 616–641.
Woodard, J. L., Axelrod, B. N., Mordecai, K. L., & Shannon, K. D. (2004). Value of signal detection theory indexes for Wechsler Memory Scale-III recognition measures. Journal of Clinical and Experimental Neuropsychology, 26, 577–586.
Worthen, J. B., & Eller, L. S. (2002). Test of competing explanations of the bizarre response bias in recognition memory. The Journal of General Psychology, 129, 36–48.
WHAT CONSTITUTES A MODEL OF ITEM-BASED MEMORY DECISIONS?
Ian G. Dobbins and Sanghoon Han
I. Introduction
A. WHAT ARE ITEM-BASED MEMORY DECISIONS?
The focus of this chapter is on item-based memory decisions. These arise in a host of experimental and everyday tasks where one is presented with an item and required to act based on memories linked with that item. For example, one might bump into someone at a conference and have to decide quickly whether that person is sufficiently familiar to warrant a wave or greeting. As this chapter illustrates, the contextual specificity required of item-based memory decisions varies widely, with some requiring situating the item with respect to a specific experience (often referred to as source- or context-memory) and others simply requiring that the item be recognized as having been encountered recently (viz., recognition). Although source memory and recognition tasks have often been treated as qualitatively different, if we assume that the required contextual specificity of the memory decision is likely a continuous variable, then it is important not to artificially segregate these tasks.
Using the conference-greeting example above as an illustration, there may be a host of decision factors that determine whether one executes an action given a particular level of item-based memory evidence. For example, the size of the conference may be a valuable indicator as to whether the individual is likely to be known, since if one is attending a very small close-knit conference, then the odds of knowing most of those one encounters are greatly increased. Numerous other factors may also come into play, for example, personality factors such as introversion or extroversion, social factors such as the purpose of the conference, or more situational factors such as whether one is in a hurry. Thus, while these tasks are typically characterized as decisionally simple in the laboratory, even cursory consideration of an everyday demand reveals a host of factors other than the memory evidence itself that must be considered.
Despite this, the idea that item-based memory judgments require much in the way of decision mediation has only recently attracted interest. One historical reason for this late interest may have been frequent neuropsychological findings suggesting that, relative to free recall, item-based judgments were often (though not always) relatively spared in spite of considerable damage to the prefrontal cortex (PFC) (Janowsky, Shimamura, & Squire, 1989; Jetter, Poser, Freeman, & Markowitsch, 1986; Swick & Knight, 1996). Given that PFC is generally recognized as the seat of controlled decision making, the natural inference was that item-based memory judgments made few demands on decision processes, perhaps because they provided subjects with the core objects of interest (aka copy cues). In contrast to this initial view, it is now generally accepted that PFC regions make important contributions across a range of item-based memory judgments, provided these judgments recruit one or more decision-related processes often grouped under the general label of executive control processes (for relevant reviews, see Fletcher & Henson, 2001; Schacter, Norman, & Koutstaal, 1998; Shimamura, Janowsky, & Squire, 1991; Stuss & Alexander, 2005).
THE PSYCHOLOGY OF LEARNING AND MOTIVATION VOL. 48. DOI: 10.1016/S0079-7421(07)48003-3. Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00
An additional contributor to the lack of research specifically targeting item-based memory decision processes may have been the immense success of the basic one-dimensional signal detection theory (1D-SDT) characterization of recognition judgment shown in Fig. 1. This model assumes only the simplest of decision processes and serves as the key reference point for the rest of this chapter.
B. ONE DIMENSIONAL SIGNAL DETECTION THEORY: A SIMPLE DECISION MODEL
The 1D-SDT model assumes that item-based memory discriminations rely on continuous evidence that is normally distributed for test items drawn from two memory classes (i.e., targets and lures). In the case of simple recognition, it is assumed that the baseline strength of evidence of novel items is augmented through study, yielding a target distribution falling to the right of a lure distribution. Because these evidence distributions overlap, observers are forced to use a single evidence value to parse the continuum into the two required response options: “old” and “new” (Fig. 1A). This value is referred to as the decision criterion, and can be based directly on the level of memory
[Figure 1: three panels, each plotting probability against an evidence axis. Panel A: New and Old distributions over item evidence, divided into “new” and “old” response regions. Panel B: Source A and Source B distributions over source memory evidence, divided into “A” and “B” response regions. Panel C: New and Old distributions divided into “new,” “know,” and “remember” response regions, with criterion markers labeled C and RC.]
Fig. 1. Examples of the one-dimensional signal detection theory (1D-SDT) model of item-based memory discrimination. Panel A illustrates the model for old/new recognition with normal probability density distributions of familiarity values for old and new items. Panel B illustrates the model for source memory discrimination. Panel C illustrates the extension of the basic item recognition memory model to old/new discrimination requiring subjective “remember” and “know” reports.
evidence evoked by the item, or instead reflect a statistical estimate of relative likelihood given this evidence (Banks, 1970; Macmillan & Creelman, 1991; Parks, 1966). During recognition, if an item’s value falls above the criterion the item is categorized as “old,” whereas if it falls below, it is instead categorized as “new”; this is the full extent of the decision process required to make item recognition judgments. As the evidence becomes extreme in either direction, subjects will become increasingly accurate and confident in their reports. If the design instead asks subjects to express levels of confidence for each recognition decision (e.g., high, medium, or low), then it is assumed that additional criteria are used to parse the continuum into confidence regions. Accuracy under the 1D-SDT model is reflected as the distance between the distribution means, measured in units of the common standard deviation of the distributions. If the model is correct, one can calculate this distance, termed d′, wholly independent of the placement of the decision criterion, and this is the primary attraction of the model, namely, it provides
a way to factor out decision processes from the estimate of observer skill. However, the extent to which this is achieved of course depends on the validity and generality of the decision characterization.
Extending the model to source memory discrimination is relatively straightforward, and several researchers have suggested that source discriminations can likewise be conceptualized as a 1D-SDT decision process (Qin, Raye, Johnson, & Mitchell, 2001; Slotnick & Dodson, 2005; Slotnick, Klein, Dodson, & Shimamura, 2000). For example, if subjects are asked to discriminate between items encoded under two different orienting tasks, such as pleasantness versus concreteness ratings of words (Source A vs Source B), then it is assumed that the evidence dimension now reflects the amount of context memory supporting one or the other source origin (Fig. 1B). Here, the relative ordering of the two evidence distributions is wholly arbitrary. A key point to emphasize, regardless of whether one is talking about source or recognition decisions, is that the decision operation itself remains patently simple.
In the case of simple recognition, the 1D-SDT model has been very successful in capturing the empirical relationship between observers’ confidence and performance (termed the receiver operating characteristic, or ROC). Furthermore, it has been proposed as a simpler explanation than “dual-process” theories of recognition decisions that instead postulate two fundamentally different item-based retrieval processes: a context recollection process and an item-familiarity process. Under dual-process theories, recollection is thought to involve the recovery of context details successfully associated with some studied probes and results in the phenomenology of remembering a unique personal experience (e.g., remembering a thought one had when the item was previously encountered).
In contrast, item familiarity is usually conceptualized as an acontextual sense of recent item exposure, akin to the feeling one gets when encountering a familiar face in the complete absence of knowledge of why that face seems familiar (Jacoby, 1991; Mandler, 1980; Tulving, 1985; Yonelinas, 1994). One method of measuring context recollection and item familiarity, developed by Tulving (1985), is to ask the observers to subjectively reflect on why they are endorsing test items. If they recover first-person contextual information about the item experience, then they “remember” encountering the item. If instead the item strikes them as familiar in the absence of recollection, they report that they “know” the item was previously studied. A considerable body of research demonstrates dissociations of remember and know response rates, suggesting the presence of fundamentally different underlying processes or systems (Gardiner, 1988; Gardiner & Java, 1990; Gardiner & Parkin, 1990; Parkin & Walter, 1992; Rajaram, 1993; Yonelinas & Jacoby, 1995). Despite this research, investigators have recently made fairly
compelling claims that the simple 1D-SDT model of Fig. 1 can incorporate most if not all of the remember/know dissociations in the literature (Donaldson, 1996; Dunn, 2004; Hirshman & Master, 1997; Inoue & Bellezza, 1998). To do this, it is assumed that observers simply use a second, more stringent criterion in order to provide “remember” reports (Fig. 1C). Thus under the 1D-SDT model, remember and know responses simply reflect two different decision criteria placed along the evidence continuum, with “remembering” merely signaling that the observer is highly confident of the endorsement and “knowing” indicating that he or she is less so; no process distinction between remember and know responses is required.
Our strategy in the current chapter is to use the models in Fig. 1 as a known benchmark or “ideal” with which to compare findings in our laboratory and others across several domains. As we progress, we will emphasize two types of shortcomings of the 1D-SDT model. The rarer shortcoming is when the 1D-SDT model clearly makes incorrect predictions about experimental outcomes. For example, confidence and accuracy are predicted to be monotonically related under the model, yet there are several reports challenging this assumption in the literature. The second and more prevalent shortcoming we note is when the 1D-SDT model can accommodate a given finding after the fact, but the data would not have been anticipated or predicted given the core assumptions of the model. These latter types of findings do not strictly falsify the model; however, they serve to shed light on what we argue is a core deficiency. Namely, although the model is often characterized as a model of item-based memory judgment, it offers little in the way of explaining many trends or patterns in the way subjects actually render item-based memory decisions. In other words, the model’s heuristic value is arguably questionable.
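The single-axis account of old/new and remember/know responding described above can be made concrete with a small numerical sketch. This is only an illustration under stated assumptions, not the authors' analysis: it assumes the equal-variance Gaussian form of the 1D-SDT model, with lures centered at 0 and targets at d′ on a common evidence axis, and the function names `d_prime_and_criterion` and `remember_know_rates` are hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1), used for z-transforms.
_STD_NORMAL = NormalDist()


def d_prime_and_criterion(hit_rate, false_alarm_rate):
    """Equal-variance estimates from one hit rate and one false-alarm rate.

    Both rates must lie strictly between 0 and 1; inv_cdf raises otherwise.
    The criterion is expressed as a location on the evidence axis,
    measured from the mean of the lure distribution (assumed to be 0).
    """
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    d_prime = z_hit - z_fa   # distance between the two means, in SD units
    criterion = -z_fa        # evidence value above which "old" is reported
    return d_prime, criterion


def remember_know_rates(d_prime, old_criterion, remember_criterion):
    """Predicted response proportions when "remember" is a stricter criterion.

    Assumption: lures ~ N(0, 1) and targets ~ N(d_prime, 1) on one axis.
    Evidence below old_criterion yields "new"; evidence between the two
    criteria yields "know"; evidence above remember_criterion yields
    "remember".
    """
    if remember_criterion < old_criterion:
        raise ValueError("remember_criterion must not fall below old_criterion")
    rates = {}
    for label, mean in (("new_items", 0.0), ("old_items", d_prime)):
        dist = NormalDist(mean, 1.0)
        below_old = dist.cdf(old_criterion)
        below_remember = dist.cdf(remember_criterion)
        rates[label] = {
            "new": below_old,
            "know": below_remember - below_old,
            "remember": 1.0 - below_remember,
        }
    return rates


if __name__ == "__main__":
    d, c = d_prime_and_criterion(0.8, 0.2)
    print(f"d' = {d:.3f}, criterion = {c:.3f}")
    print(remember_know_rates(d, c, c + 1.0))
```

In this sketch no separate recollection process appears anywhere: shifting `remember_criterion` alone changes the remember and know proportions for both old and new items, which is the sense in which the single-process account treats the two reports as differing only in confidence.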
In this spirit we eventually suggest that the 1D-SDT model characterized in Fig. 1 is less an account of memory decision making than a convenient benchmark or standard whose shortcomings may help reveal the strategies and heuristics that subjects actually bring to bear during item-based memory attributions. To address this shortcoming, we call for an approach that is similar, if not identical, to the Heuristics and Biases approach to judgment and decision making (JDM) (Tversky & Kahneman, 1974). Under the Heuristics and Biases approach, the shortcomings of a mathematical ideal known as Rational Choice Theory form the grist for theorizing about the heuristics and decision strategies subjects employ when rendering decisions about noisy, incomplete, or complex data. In the case of item-based memory decisions, the mathematical ideal we use is the 1D-SDT model in Fig. 1.
Below we present four sections, each highlighting data that suggest the 1D-SDT model is a fairly limited characterization of item-based memory decisions. Section II.A examines the lability of the decision criterion during testing. Section II.B looks at the relationship between confidence and accuracy
during recognition decisions. Section II.C considers whether criterion placement is informed by the skill of the observer. Section II.D examines the recruitment of PFC during item-based context versus familiarity judgments. Finally, Section III concludes and considers some potential objections.
II. The Characteristics and Neural Substrates of Item-Based Memory Decisions
A. HOW LABILE IS THE RECOGNITION CRITERION DURING TESTING?
As many a postponed visit to the optometrist illustrates, visual acuity is coldly indifferent to one’s desperate wishes to avoid glasses or contact lenses. Similarly, under the 1D-SDT model, it is generally assumed that observers have little if any ability to influence the actual acuity of their memory evidence representations. Given this, the only way observers can flexibly adapt to changing environmental contingencies is to adjust the position of the decision criterion in order to take advantage of the relative costliness of errors of commission versus omission, or the relative rewards of detections versus rejections. In short, although changing one’s d′ is not an option, changing one’s criterion position (aka bias) can increase the likelihood of a preferred outcome. The basic assumptions of the 1D-SDT model give us no direct clues as to whether we should view recognition decision criteria as relatively entrenched or labile during testing. Yet, assuming the decision axis is one of relative likelihood leads quite naturally to the idea of a very flexible and adaptive criterion, because likelihood information can be easily used to determine where the criterion should be placed to achieve different observer goals such as maximizing rewards (Macmillan & Creelman, 1991). Furthermore, if instead the decision criterion is conceptualized as simply an evidence value held in working memory, there is no a priori reason to expect that it would be particularly difficult for observers to flexibly adjust it during testing. Despite this, recent studies have suggested that the item recognition criterion can be exceedingly resistant to change once testing begins. A particularly striking example of criterion rigidity was documented by Morrell, Gaitan, and Wixted (2002).
Using encoding manipulations that preferentially strengthened one semantic category of items relative to another (e.g., professions vs locations), the authors reasoned that an adaptive criterion strategy during testing would be to adjust the criterion upward for items drawn from the known strong category relative to those drawn from the category known to be weakly encoded. For example, if the observers realized that professions were studied five times whereas locations were only seen once, then an optimal strategy at test
would be to elevate the decision criterion for any item that was a profession, since one would anticipate fairly robust evidence had such items been studied. However, despite clear differences in the hit rates for items drawn from strong and weak categories, lures from the two categories demonstrated equivalent false alarm rates (see also Stretch & Wixted, 1998). These data suggest a relatively rigid decision criterion that subjects may be remarkably reluctant to shift on a trial-by-trial basis. The data could also suggest that subjects are relatively local in their assessment of memory evidence. That is, they do not employ global list regularities because these are ambiguous at the item level, which is the dominant focus of their interest. For example, even though one might know that a particular category of items was studied five times, one will nonetheless encounter novel items from that category eliciting minimal memory evidence. Given this lack of consistency between category membership and evidence at the local level, subjects may disregard or fail to use the information. This is perhaps even more likely given that observers typically realize that half of all items, regardless of category, are likely to be lures.
1. Memorability Heuristics
We have examined several manipulations designed to illustrate the tendency of observers to take a “local” focus in the evaluation of memory evidence during testing. Dobbins and Kroll (2005) investigated criterion lability based on the subjective memorability heuristic proposed by Brown (Brown, Lewis, & Monk, 1977). This decision heuristic assumes that subjects use the semantic characteristics of individual items to estimate how likely remembrances would be, had the item been studied. For example, upon encountering a picture of one’s mother in a test list otherwise composed of novel and studied strangers, the heuristic assumes that subjects adopt an extremely conservative stance because they rightfully believe that they would remember the thoughts and emotions that would have been previously experienced had the personally relevant photo been studied. Thus the absence of vivid remembrances serves as countermanding evidence to the high familiarity of the lures (see also Dodson & Schacter, 2001, 2002). To test this idea in a subtler fashion, Dobbins and Kroll (2005) presented undergraduate subjects with photos of their campus and its surrounding area, and photos of an unknown campus and its surrounding area. The key prediction was that when given sufficient time to employ the heuristic, subjects would adopt a more stringent decision criterion for photos of personally known locations (e.g., one’s favorite pizza parlor) versus those of unknown locations (e.g., a pizza parlor in an unknown town). Importantly, this should lead to potentially lower false alarm rates for known than unknown lures during testing, despite the fact
that known locations would necessarily have a much greater baseline familiarity or strength of evidence. This is in fact what occurred. When recognition testing was immediate and self-paced, subjects correctly identified known studied items more often than unknown studied items, yet the pattern for intrusion errors was largely reversed (Dobbins & Kroll, 2005). That is, the tendency to incorrectly endorse lures drawn from known locations (false alarms) was either equivalent to or significantly lower than the tendency to endorse lures drawn from unknown locations. This suggests that a higher memory standard was applied to known than unknown items, or that subjects used the lack of recollections for known items as evidence against study. Thus the data suggested trial-by-trial adaptation in the criterial basis of memory decisions. This interpretation received further support from a response speeding manipulation that required subjects to make memory judgments extremely rapidly. Because the proposed memorability heuristic is a controlled decision process, Dobbins and Kroll (2005) reasoned it would be abandoned if responding were sped. Consistent with this hypothesis, the false alarm relationship was reversed under speeded compared to self-paced responding. That is, when subjects were not given enough time to use the heuristic, responses appeared predominantly influenced by the baseline familiarity of the lures, and lures from known locations were endorsed significantly more often than those drawn from unknown locations (see also Dodson & Hege, 2005). The feasibility of this and similar criterion placement heuristics has also been demonstrated by Benjamin and colleagues using manipulations of lure plausibility (Benjamin & Bawa, 2004), suggesting that observers rapidly take a more conservative stance when they know that the lures are highly plausible variants with respect to the targets.
Because the 1D‐SDT model offers no way to predict or explain the selective use of decision heuristics, it cannot easily anticipate such findings. That is, there is nothing inherent in the 1D‐SDT model that would lead one to predict that the false alarm rates for known scenes would be lower than those for unknown scenes during self‐paced responding, and that this relationship would reverse when subjects were speeded to respond.
2. Biased Feedback Contingencies
A second way we have tested the flexibility of the recognition decision criterion is through a false feedback procedure (Han & Dobbins, under review). The use of feedback to influence verbal recognition memory criteria is rare and in fact we know of only two reports doing so (Estes & Maddox, 1995; Verde & Rotello, 2007). Verde and Rotello were able to induce a criterion change by providing subjects with correct feedback on
every trial and by stopping intermediately and giving subjects a summary of their performance for each quartile of the test. Using this procedure in conjunction with a large reduction in the strength of targets across the first and second half of the test, these researchers were able to induce a more liberal criterion placement in the latter half of the test. Estes and Maddox (1995) induced a criterion shift during digit and picture recognition memory by changing the base rates of old stimuli, yet this manipulation was ineffective during word recognition. What distinguishes the false‐feedback procedure described below is that (1) the criterion shift occurs in the absence of any change in the nature of the test stimuli, that is, neither density nor strength is manipulated; (2) the manipulation that drives the effect occurs on a small minority of the trials; and (3) the manipulation generally goes unnoticed by the observers because it is only performed when they are uncertain of their responses, namely, during errors. During the procedure, observers are selectively misinformed about performance only during errors of commission (false alarms) or omission (misses); the remaining trial types receive correct feedback. Thus observers in the “strict” group receive correct feedback for hits, correct rejections, and false alarms, but are incorrectly informed that misses were in fact correct responses. Conversely, observers in the “lax” group are incorrectly informed that false alarms were correct responses while all other response types receive correct feedback. By falsely informing subjects on a small portion of their errors, Han and Dobbins (under review) reasoned that the false feedback would induce incremental changes in the criterion position such that observers would tend to avoid the response type more often associated with negative outcomes.
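The feedback rule just described can be made concrete with a minimal sketch. The function name `feedback_message`, the group labels, and the returned strings are illustrative assumptions and are not taken from the original procedure; only the mapping from response type to feedback follows the description above.

```python
def feedback_message(group, is_old, said_old):
    """Return the feedback shown to the observer on one trial.

    group    -- "strict" or "lax" (hypothetical labels for the two groups)
    is_old   -- True if the test item was studied
    said_old -- True if the observer responded "old"
    """
    if is_old == said_old:
        # Hits and correct rejections always receive correct feedback.
        return "correct"
    is_miss = is_old and not said_old
    is_false_alarm = said_old and not is_old
    if group == "strict" and is_miss:
        # False feedback: misses are reported as correct responses.
        return "correct"
    if group == "lax" and is_false_alarm:
        # False feedback: false alarms are reported as correct responses.
        return "correct"
    # All remaining errors receive truthful feedback.
    return "incorrect"
```

Under this rule the strict group is only ever told it erred after a false alarm, and the lax group only after a miss, which is the asymmetry expected to push the criterion in opposite directions.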
Again, because the misinformation is given only during incorrect responses, subjects are generally unaware of the manipulation. Figure 2 shows the outcome of one experiment confirming this prediction. Subjects underwent 3 study/test cycles with 60 items during study and 120 during test. During the first test, half of the subjects received false feedback predicted to encourage a lax criterion “L,” whereas the other half received feedback designed to encourage a strict criterion “S.” This assignment was reversed across the groups in the remaining two tests. As can be seen in Fig. 2, a complete crossover of the estimate of criterion, Ca, was observed, indicating that the group differences in criterion were influenced by the nature of the feedback. Post hoc t‐tests demonstrated that whereas the “SLL” group was significantly more conservative than the “LSS” group during test 1 (t(31) = 2.07, p < .05), this pattern was reversed during the final two tests in which the feedback contingencies were reversed [t(31) = 2.91, p < .01 and t(31) = 3.53, p < .005, respectively].
[Figure 2: criterion (Ca; y-axis from −.4 to .5) plotted across test blocks (Test 1, Test 2, Test 3) for the “LSS” and “SLL” groups.]
Fig. 2. Demonstration of false feedback effect in criterion placement (Ca) from Han and Dobbins (under review). Groups received different biasing manipulations across the three study/test cycles. For example, the “LSS” group was induced to be lax on the first test and strict on the remaining two.
These false feedback induced criterion shifts appear quite durable, continuing for considerable periods even when feedback is removed or shifted to fully correct feedback. For example, in another experiment the group criterion manipulation was interspersed with a condition in which the observers received fully correct, neutral feedback (“N”). Here, the criterion was significantly more conservative in the “SNL” group than “LNS” during test 1 [t(27) = 3.05, p < .01], and it remained so in the second test “N,” in which all feedback was correct [t(27) = 3.36, p < .005]. Thus criterion differences induced in the first test remained (although they were numerically smaller) during a second test in which feedback was entirely neutral. Extensions of the false feedback manipulation also indicate that it is effective even when administered probabilistically, for example on 85, 70, and 50% of the incorrect trials (Han & Dobbins, in preparation). Overall, the false feedback technique suggests that observers are extremely sensitive to feedback contingencies given in a trial‐by‐trial fashion during testing and that they learn to avoid negative outcomes by repositioning the decision criterion. There are several key questions about the effect that we are currently exploring. For example, it is unclear whether the criterion shift should be viewed as an explicit strategy or implicit learning phenomenon. Regardless, the data clearly suggest the decision criterion position can be fairly labile during testing, given the right circumstances.
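One way to see how selectively withheld error feedback could move a criterion incrementally is a toy learner. Everything in the sketch below is a hypothetical assumption made for illustration: the step-wise update rule, the step size, the d′ of 1.0, the equal base rates, and the name `final_criterion`. It is not the authors' model, and the text above leaves open whether the real shift is strategic or implicit. The `p_false` argument loosely mirrors the probabilistic administration of false feedback.

```python
import random

def final_criterion(group, n_trials=120, d_prime=1.0, step=0.05,
                    p_false=1.0, seed=0):
    """Toy learner: nudge the criterion after every *reported* error.

    group   -- "strict" (misses reported as correct) or
               "lax" (false alarms reported as correct)
    p_false -- probability that a targeted error receives false feedback
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    criterion = d_prime / 2.0            # start at the unbiased location
    for _ in range(n_trials):
        is_old = rng.random() < 0.5      # equal numbers of targets and lures on average
        evidence = rng.gauss(d_prime if is_old else 0.0, 1.0)
        said_old = evidence > criterion
        if said_old == is_old:
            continue                     # correct response: no adjustment
        is_miss = is_old                 # an error on an old item is a miss
        targeted = ((group == "strict" and is_miss) or
                    (group == "lax" and not is_miss))
        if targeted and rng.random() < p_false:
            continue                     # error reported as "correct": no adjustment
        # Error reported truthfully: move away from the response just given.
        criterion += step if said_old else -step
    return criterion

strict_c = final_criterion("strict")
lax_c = final_criterion("lax")
```

With `p_false=1.0` the strict learner is only ever corrected after false alarms and so can only become more conservative, while the lax learner can only become more liberal.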
3. When Is the Criterion Position Initially Determined?
The insensitivity of the criterion to the category strength manipulation used in Morrell et al. (2002) suggests that under standard testing, subjects may not spontaneously consider global characteristics of the test list when positioning decision criteria, and in this case, it appears that they do not take into account differences in the average strength of certain categories of test items during decision making. Verde and Rotello (2007) have documented another factor during testing to which observers appear insensitive, namely, average differences in target strength between the first and second half of a test. In this study, words were encoded in either a strong or weak condition by altering study duration or repetition. The key manipulation was the construction of the test list with the first half of the list being constructed from strong items intermixed with lures, and the second half weak items intermixed with lures. There was no overt break in the list and from the perspective of the subjects it was a single seamless test. Quite expectedly, there was a prominent drop in hit rates when transitioning from the first to second halves of the test, with the hit rate falling on average 22% across experiments. More importantly, the false alarm rate remained unaffected across the test halves. Thus, subjects did not compensate for their poorer detection in the second half of the test by lowering the decision criterion, which would have necessarily increased the false alarm rate. This demonstrates that global changes in the average strength of the targets across large portions of the test list do not spontaneously trigger criterion adjustment. However, Verde and Rotello (2007) were able to induce a shift to more liberal responding in the second half by pairing the strength manipulation with trial‐by‐trial feedback throughout the entire test and giving performance summaries after every quarter of the test.
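The logic of this inference can be illustrated with a small equal-variance signal detection sketch. The d′ values (2.0 for strong, 1.0 for weak targets) and the criterion of 1.0 are arbitrary assumptions chosen for illustration and are not estimates from Verde and Rotello (2007); the helper name `hit_and_fa` is likewise hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit-variance evidence distributions (assumed)

def hit_and_fa(d_prime, criterion):
    """Hit and false alarm rates for a yes/no test under equal-variance SDT."""
    hit = 1.0 - STD_NORMAL.cdf(criterion - d_prime)  # targets centered on d'
    fa = 1.0 - STD_NORMAL.cdf(criterion)             # lures centered on 0
    return hit, fa

# First half: strong targets, criterion at 1.0.
strong_hit, strong_fa = hit_and_fa(2.0, 1.0)
# Second half, criterion left where it was: hits fall, false alarms do not move.
weak_hit, weak_fa = hit_and_fa(1.0, 1.0)
# Second half, criterion lowered to d'/2: hits partly recover, false alarms rise.
shift_hit, shift_fa = hit_and_fa(1.0, 0.5)
```

Because the false alarm rate depends only on the criterion and the lure distribution, an unchanged false alarm rate across halves is the signature of an unmoved criterion, whereas any compensatory lowering would show up as more false alarms.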
We have replicated and further investigated the criterion insensitivity documented by Verde and Rotello (2007) using a levels of processing manipulation at encoding, and exploring different manipulations between the halves of the test list (Raposo & Dobbins, in preparation). In a particularly striking example of criterion rigidity, subjects encoded 40 items under a deep task (pleasantness judgments) and 40 using an extremely shallow task (case judgments) in an intermixed fashion. In the immediately following self‐paced recognition test, the first half of the list contained the deep (strong) items intermixed with new items. In the second half of the test however, unlike the standard versions where weak and new items are intermixed, only new items were presented. Thus, halfway through a test in which new and highly memorable old items were intermixed, the construction changed such that the remaining 80 test items were all new. Despite this, the false alarm rate in the two halves of the test was equivalent (Fig. 3). Since it is not possible to reduce available memory evidence any further than by completely removing all
[Figure 3: proportion “yes” (y-axis from 0.0 to 1.0) by test half: Hit and FA for the 1st half (deep items + new items) and FA for the 2nd half (all new items).]
Fig. 3. Demonstration of criterion rigidity from Dobbins (in preparation). (Box equals one standard error of the mean, box plus whiskers correspond to two standard errors.)
studied items, this illustrates the extreme insensitivity of observers to this global list characteristic. On average, subjects responded “new” for 73 of the final 80 trials and even this was apparently insufficient to induce them to conclude that their initial criterion placement may have been overly strict. Although striking, this is consistent with the results of early studies also suggesting insensitivity of recognition criteria to target/lure density manipulations during testing. More specifically, in a series of studies, Wallace and colleagues demonstrated that target detection rates were often unaffected by whether lures were even present in the test lists (Wallace, 1982; Wallace, Sawyer, & Robertson, 1978). Again, this runs strongly counter to the tendency, under signal detection theory, to assume that subjects frequently attempt to capitalize on regularities during testing in order to maximize success rates or desired outcomes. Although the data above suggest that subjects can be highly insensitive to strength and density manipulations during testing, these manipulations are arguably not particularly salient. For example, in the Verde and Rotello (2007) designs, target strength is manipulated within the context of a seamless test transition. Given this, we were further interested in whether the simple act of beginning a test could induce the repositioning of the criterion, based on the idea that subjects assess criterion suitability during the initial trials of a test and then remain entrenched following this period. As a simple test of this
idea, we interspersed tasks between the test halves containing strong and then weak targets. Figure 4A shows that an intermediate perceptual task of 3 min did not result in a resetting or shifting of the criterion once testing resumed. During the experiment, the first half of the test contained deep and new items and the second half shallow and new. However, unlike the seamless design used by Verde and Rotello (2007), at the halfway point, the computer engaged subjects in a perceptual gender classification task using cropped faces (Female? “yes” or “no”). After 3 min, the verbal recognition test resumed. Despite the brief break and resuming the test anew, the false alarm rate nonetheless remained unchanged from the first half of the test. Thus the act of simply beginning a test anew was insufficient to cause the subjects to reevaluate or reposition the criterion. Given the data above, and similar evidence for criterion rigidity, previous researchers have speculated that in some manner, the criterion may be largely determined during the encoding act itself (Hirshman, 1995). Under this account, it is the nature of the encoding operations that may directly determine the position of the criterion adopted prior to testing. If correct, one should be able to influence the decision criterion prior to a given test simply by
[Figure 4: two panels plotting proportion “yes” (y-axis from 0.0 to 1.0) for Hit and FA by test half, 1st (deep items + new items) and 2nd (shallow items + new items). Panel A: intermediate perceptual task of 3 min. Panel B: intermediate shallow verbal encoding task (80 items not tested for memory).]
Fig. 4. Illustration of the effects of intermediate tasks on criterion placement across test halves from Dobbins (in preparation). (A) An intermediate perceptual task did not result in repositioning the criterion once recognition resumed. (B) An intermediate shallow verbal encoding task did influence the subsequent criterion on the second half even though these items were not tested.
engaging in deep or shallow encoding operations, even if the items from these operations are not actually tested. To test this idea we interspersed a shallow verbal encoding task (syllable counting) using a set of words that was never tested for memory, in between the test halves of the strength manipulation study. More specifically, following the first half of the test in which deep and new items were interspersed, subjects counted syllables for 80 items never again seen during the experiment and never tested for memory (a shallow task). Immediately following this incidental task, the recognition test resumed as in the prior examinations; that is, shallow items from the original study list were interspersed with new items. If it is the shallow nature of the encoding operations itself that determines the later memory criterion, then this manipulation should lower the criterion, despite the fact that these particular shallowly processed items are never actually tested. Subjects typically took less than 3 min to finish the incidental task, and thus the delay was similar to that used in the version using the intermediate, perceptual gender discrimination task shown in Fig. 4A. Nonetheless, the false alarm rate was now significantly elevated in the second half of the test compared to that in the first (Fig. 4B) half, indicating that subjects had shifted their criteria in a liberal direction. These data suggest that the quality of the immediately preceding encoding experiences directly influences the subsequent criterion position even if these items are never again encountered. Again, this is somewhat unexpected as several models of criterion placement assume that observers rapidly optimize the criterion location based on the initial experiences during testing (for discussion, see Hirshman, 1995). If this were the case however, the data from the second half of the tests in Fig. 
4A and B would have been equivalent because in both cases subjects were discriminating items encoded in a similar fashion. Even though the data above suggest the criterion may be determined prior to testing, one can consider what the pattern of hits and false alarms would look like if each subject did in fact rapidly approach an optimal placement during the first few trials of a test. To do this one must collapse across the subjects and treat each trial as the variable of interest. Figure 5 shows the outcome of a simple simulation assuming a rapid adjustment toward optimal placement during the first few trials of a hypothetical test. For the simulation, 90 artificial subjects were drawn from a population with a mean d′ of 2.0 and a standard deviation of 0.5. This simulates a set of 90 “subjects” with very high accuracy varying around the d′ value of 2. For each subject, a starting criterion location was determined by taking each subject’s optimal location (d′/2), and adding random noise to it (drawn from a unit normal distribution). Following this, a test item target and lure value was drawn for each subject from an old and new item distribution corresponding to his or her underlying sensitivity. For example, a subject with a d′ of 1.8 would have a
[Figure 5: simulated hit rates (Hit1 to Hit5; axis from about 0.72 to 0.86) and false alarm rates (FA1 to FA5; axis from about 0.14 to 0.30) across five trials.]
Fig. 5. Result of a simulation demonstrating that a rapidly optimized criterion across trials should lead to a characteristic improvement in performance across the trials when data are collapsed across subjects. FA = false alarm rates.
target item sampled from a normal distribution centered on this value and a lure item randomly chosen from a unit normal distribution centered on 0. This yielded 90 observations for old items and 90 for new items across observers with varying skill levels and noisily placed criteria. Then, collapsed across the subjects, the net hit and false alarm rates were calculated on the first trial. That is, the number of target and lure items that fell above the different criteria of the different subjects was tallied. Following this, the criterion noise of each subject was reduced by half, simulating a rapid adjustment toward the optimal value following the information gleaned from the first test trial. Target and lure items were again sampled, and the hit and false alarm tallies calculated collapsed across the “subjects.” This was repeated for five trials. As can be seen from the simulation data in Fig. 5, this resulted in a characteristic rise in the hit rates, and fall in the false alarm rates across the trials. The simulation confirms that even if there is considerable variability in accuracy and initial criterion position across a set of observers, a clear pattern in the data is likely to emerge if subjects are rapidly adjusting their criterion locations toward the optimal position in the first several test trials. One should expect aggregate performance to improve across the first several or more trials when the data are collapsed across the subjects, and the hits and false alarm occurrences are tallied (for more discussion, see Benjamin & Wee, under review). Figure 6 perhaps more intuitively illustrates the mechanism underlying the simulation outcomes. The gray box in the plot indicates a region within which the criterion randomly varies from trial to trial about an optimal
[Figure 6: old item evidence distribution with a band of criterion variation around the optimal location; the regions on either side are labeled “Movement-induced hit” and “Movement-induced miss.”]
Fig. 6. Graphic illustration of the effect of random variation in the criterion location from trial to trial. Such variation reduces performance because of the asymmetric shape of the evidence distributions.
value indicated by the black line for this observer. The bold‐lined distribution, which we focus on first, represents the sampling distribution of the old items and hence is to the right of the optimal location. As the criterion varies, it can fall lower than the optimal location and increase the likelihood of a hit, or it can fall higher than the optimal location and increase the likelihood of a miss. Were the distribution symmetric about this point, the effect of this movement would cancel out. That is, the misses induced by upward movements would be compensated for by hits induced through downward movements. However, the normal distributions typically assumed under signal detection theory are not symmetric about the optimal location. In Fig. 6, the areas bounded by the box and underneath the old item distribution represent the relative likelihood of each outcome (i.e., a movement‐induced hit or movement‐induced miss) across the trials of the test and the shaded trapezoids illustrate these areas. As can be seen, because of the shape of the distribution, this area is larger for the movement‐induced miss region than the movement‐induced hit region and thus variability in the criterion across the trials will lead to a net reduction in the observed hit rates. In other words, deviation from the optimal location is more likely to induce a miss than a hit. The same principle applies to the false alarm rates governed by the lure distribution shown in the thin dashed line that lies to the left of the optimal location. Here, trial‐to‐trial movement is more likely to induce false alarms
than to induce movement‐induced correct rejections because of the shape of the distribution. Overall then, if observers start with criteria that vary considerably from optimal across subjects, but each observer quickly converges on his or her optimal location across the first few trials, the hit and false alarm rates calculated across observers for each test trial should indicate an improvement in discrimination. As an additional test of this possibility, the data across several experiments of Raposo and Dobbins (in preparation) were examined to see if there was any evidence for criterion optimization occurring across the first few trials. The actual data for 111 subjects from multiple experiments drawn from Raposo and Dobbins are shown in Fig. 7. Subjects from the different experiments were all treated identically until the halfway point of the test and the figure shows the first five trials of the test, which contained deep targets and new items intermixed. There is no discernible increase in hit rates or decrease in false alarm rates across the first trials of the test. Of course, such a null finding could arise for other unspecified reasons and caution is warranted given the high performance levels; nonetheless, it is important to note that the numerical change is in fact in the wrong direction for the false alarm rates across the first three trials. In combination with the above findings however, Fig. 7 offers further evidence that subjects are not rapidly optimizing
[Figure 7: proportion of subjects responding “old” (y-axis from .0 to 1.0) on trials 1 to 5, plotted separately for hits and false alarms. Observations per trial, shown in parentheses in the figure: hits 57, 59, 49, 59, 56; false alarms 54, 52, 62, 52, 55.]
Fig. 7. Individual trials analysis from Raposo and Dobbins (in preparation). Hit and false alarm rates were tallied for the first five test trials with the data collapsed across 111 subjects. The data do not suggest that performance improves across the first few trials, which would occur if each subject started with a noisy criterion and rapidly moved toward the optimal location.
criterion across the first several trials and instead support the notion that the criterion may be determined prior to even the first test trial.
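The simulation described for Fig. 5 can be sketched in a few lines. This is a minimal reconstruction from the verbal description, not the authors' code: the function name `simulate_rapid_optimization`, the seed, and the reading of "reduced by half" as halving each subject's offset from d′/2 after every trial are assumptions.

```python
import random

def simulate_rapid_optimization(n_subjects=90, n_trials=5,
                                mean_d=2.0, sd_d=0.5, seed=0):
    """Aggregate hit and false alarm rates per trial across simulated subjects."""
    rng = random.Random(seed)
    # Each subject's sensitivity varies around the population d' value.
    d_primes = [rng.gauss(mean_d, sd_d) for _ in range(n_subjects)]
    # Starting criterion = optimal location (d'/2) plus unit normal noise.
    offsets = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
    hit_rates, fa_rates = [], []
    for _ in range(n_trials):
        hits = false_alarms = 0
        for i, d in enumerate(d_primes):
            criterion = d / 2.0 + offsets[i]
            target = rng.gauss(d, 1.0)    # old item evidence
            lure = rng.gauss(0.0, 1.0)    # new item evidence
            hits += target > criterion
            false_alarms += lure > criterion
            offsets[i] /= 2.0             # move halfway toward the optimum
        hit_rates.append(hits / n_subjects)
        fa_rates.append(false_alarms / n_subjects)
    return hit_rates, fa_rates

hit_rates, fa_rates = simulate_rapid_optimization()
for trial, (h, f) in enumerate(zip(hit_rates, fa_rates), start=1):
    print(f"trial {trial}: hit rate {h:.2f}, false alarm rate {f:.2f}")
```

With only 90 simulated subjects the trial-to-trial rates are noisy, so any single run may not rise or fall monotonically; raising `n_subjects` makes the expected rise in hits and fall in false alarms easier to see.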
4. Summary of Section II.A. How Labile Is the Recognition Criterion During Testing?
The data of Morrell et al. (2002) and Verde and Rotello (2007) suggest that the recognition decision criterion is fairly entrenched once testing begins. Yet as the feedback manipulation of Han and Dobbins (under review) and the distinctiveness manipulation of Dobbins and Kroll (2005) demonstrate, subjects are quite capable of modifying the criterion in a trial‐by‐trial fashion when given information pertinent to the memory judgment on each trial. Indeed, the feedback manipulation indicated subjects were extremely sensitive to the manipulation given that the only difference between groups was the false feedback provided on the minority of trials in which a particular type of error occurred. How can we accommodate this apparent anomaly: is the criterion entrenched or highly labile during testing? Clearly further research needs to be done, but two general principles can be gleaned from Section II.A. First, subjects appear to take a very “local” perspective when evaluating mnemonic evidence during testing. By this we mean that they do not spontaneously track or utilize more global characteristics of the test list in order to adjust their decision criterion. On the one hand, this is not surprising as they are generally under the impression that the ordering of the items is random and given this they should focus on each trial as a unique opportunity. From this perspective it is clear why the feedback and distinctiveness manipulations work, namely, because they provide temporally local information about the current decision for each item. This is not to say that through instruction subjects could not be induced to consciously track their response tendencies or to explicitly use list‐wide category information to adjust the criterion, but to do so would presumably add a working memory component not usually or naturally employed by observers.
The second principle arising from the data is that it very well may be the case that the recognition decision criterion is usually established prior to the first trial of the test list, as a function of the encoding operations undertaken immediately before testing. This explains why introducing an intermediate shallow verbal encoding task (Fig. 4B) in the middle of the test was effective in lowering the subsequent criterion, despite the fact that the items from this task were never actually tested. This is an exciting and likely controversial idea because it implies that some form of learning, perhaps even implicit, is occurring during encoding that informs the memory criterion prior to testing. What this form of learning may be awaits further research.
Regardless of the outcomes of future investigations, the studies examined above already underscore the inadequacy of the 1D‐SDT model of Fig. 1 as a characterization of decision making because none of the findings were anticipated or explained by the basic assumptions and mechanisms of the model. Again, our argument is not that the data falsify the model, but that inasmuch as the interesting findings catalogued above all relate to the way people make recognition decisions, it is clear that the 1D‐SDT model is not a comprehensive account of how recognition decisions are made.
B. IS THE RELATIONSHIP BETWEEN CONFIDENCE AND ACCURACY MONOTONIC?
One of the core assumptions of the 1D‐SDT model is that confidence and proportional accuracy increase as evidence falls increasingly distant from the optimal criterion location (unless d′ is 0). That is, as an item’s evidence value becomes increasingly extreme, subjects should become more confident and accurate in their selected response. Despite the centrality of this assumption, there are multiple findings that weigh against it, several of which we review here.
1. Confidence/Accuracy Inversion: Scene Manipulations
Tulving (1981) demonstrated that the perceptual similarity of test probes could be used to generate a situation where subjects were less confident and yet more accurate in one compared to another condition. In that study, subjects were shown halves of scenic pictures during study. Following study, these items were re‐presented along with one of three types of lures in a two‐alternative forced‐choice (2AFC) format (Fig. 8) (Tulving, 1981). Perceptually similar lures were those that were constructed of the other half of the studied test target (A‐A′; Fig. 8). Thus these lures were not only similar to an item held in memory, they, along with the target, clearly arose from two halves of the same scene. Mnemonically similar lures were drawn from the other halves of studied items; however, they were perceptually dissimilar to the current test target (A‐B′; Fig. 8). Thus from the subject’s perspective, they were viewing halves of two different pictures. Finally, there was a standard condition in which the lure neither perceptually resembled the current target nor closely resembled a studied item in memory (A‐X; Fig. 8). The key finding was that for the conditions having a familiar lure (A‐A′ and A‐B′), accuracy was greater during the condition in which the target and lure were more perceptually similar (A‐A′ vs A‐B′ condition), yet confidence showed the reverse relationship. Subjects were more confident of their responses when the target and lure were perceptually dissimilar (A‐B′). Tulving reasoned that the effect might reflect strategic differences in the way subjects approached
[Figure 8: example study list of scene halves and the three types of test pairs: A‐A′, A‐B′, and A‐X.]
Fig. 8. Illustration of the types of scene manipulations used in Tulving (1981) and Dobbins et al. (1998). Depicted scenes are not from the original studies.
A‐A′ and A‐B′ trials such that when the probes were clearly from the same scene, subjects examined the relative memory evidence more elaboratively or thoroughly before selection. While the need to do so would presumably lower confidence, the act of doing so would increase accuracy. This confidence/accuracy inversion has recent analogues (Chandler, 1994) and a version similar to the Tulving (1981) study was used by Dobbins, Kroll, and Liu (1998) who added a remember/know judgment during testing. The remember/know procedure was developed by Tulving (1985) to assess the phenomenology of recognition decisions. In cases where the probe elicits recollection of specific contextual details, such as thoughts or actions previously linked to the item, subjects are instructed to respond “remember.” If instead a test probe strikes them as familiar but no specifics are available, they are to respond that they “know” the item was studied. Often, researchers will substitute the word “familiar” for the “know” response option so as to not conflate the option with the everyday usage of “know” as an indication of high confidence. Using this technique, Dobbins et al. (1998) found an even more extreme confidence/accuracy dissociation than the original
TABLE I
MEAN ACCURACY AND CONFIDENCE ADAPTED FROM DOBBINS ET AL. (1998)

                        Proportions                    % correct      Confidence
Pair   Response     Total   Fam.   Rem.   p(Rem.)   Fam.   Rem.   Overall     Fam.        Rem.
A‐A′   Correct      .70     .37    .32    .37       .60    .88    2.68        1.91        3.67
       Incorrect    .30     .25    .05                            2.01        1.82        3.10 (38)
A‐B′   Correct      .66     .28    .38    .52       .58    .74    2.98        2.08        3.66
       Incorrect    .34     .20    .14                            2.59        1.99        3.55 (56)
A‐X    Correct      .81     .35    .46    .49       .68    .94    2.97        2.09        3.67
       Incorrect    .19     .16    .03                            2.03 (59)   1.87 (59)   3.27 (26)

Note: Parentheses indicate the number of participants contributing to the mean. These values will differ somewhat from those used in the statistical analysis because the repeated measures analyses exclude participants missing any relevant data. Fam. = familiar; Rem. = remember; A‐A′ = scenically and mnemonically similar distractor; A‐B′ = scenically dissimilar but mnemonically similar distractor; A‐X = novel distractor.
Tulving study. More specifically, the accuracy of “remembering” was higher when the probes were drawn from the same as opposed to different scenes (A‐A′ vs A‐B′); however, the confidence of “remember” reports was similar under both conditions. In contrast, when subjects reported basing decisions on item familiarity, accuracy under the A‐A′ and A‐B′ conditions was similar. Nonetheless, subjects were more confident during “familiar” reports in the A‐B′ compared to A‐A′ condition (Table I) (Dobbins et al., 1998).
2. 1D‐SDT Model of Two Alternative Forced Choice (2AFC) in the Confidence/Accuracy Inversion Paradigm
To more fully appreciate the potential importance of such findings, it is necessary to extend the one‐dimensional model of Fig. 1C to two‐alternative forced‐choice (2AFC) responding. Under the signal detection account, 2AFC decisions are based not on the individual evidence for each item, but on the difference of this evidence. If the target has more memory evidence it is selected (e.g., difference is positive), otherwise the lure is incorrectly endorsed (e.g., difference is negative). Thus the decision axis is one of difference values not individual item values because the judgment is relativistic. Figure 9 shows this model and its extension to the confidence/accuracy scene paradigm (Clark, 1997). The x‐axis now represents the difference in evidence for the two memory probes. If this difference is positive then the target yielded greater evidence and the response was correct. If the difference is negative, then the lure spuriously yielded more
[Figure 9: difference distributions for the A‐A′, A‐B′, and A‐X conditions on a difference‐of‐evidence axis running from −3 to 4. Negative values mark incorrect selections and positive values correct selections; a central “familiar” region is flanked by incorrect “remember” and correct “remember” regions.]
Fig. 9. Example of the 1D‐SDT model of two alternative forced choice (2AFC) that was obtained by fitting the confidence/accuracy inversion data of Dobbins et al. (1998). The model incorporates a “remember” criterion such that a difference in evidence between target and probe exceeding this value evokes a “remember” response, which is merely an expression of high confidence. Less extreme differences in evidence yield “familiar” responses (an original figure).
evidence and was incorrectly chosen. The A‐B′ difference distribution falls to the left of the A‐X distribution because the lures are more similar to items in memory, reducing the average difference in evidence. Both of these distributions have the same variance because in each trial type the memory evidence values are assumed to be independent for the two probes in the pair. Importantly, Clark (1997) noted that the memory evidence values of the A‐A′ items were quite possibly not independent. If A yields strong memory evidence, then a highly perceptually similar probe A′ should also yield strong, although spurious, memory evidence. Given this, the difference distribution must also incorporate a covariance parameter, unlike the other two cases, where the variance of the difference distribution is the sum of the variances of the independent target and lure evidence values [Var(Target) + Var(Lure)]. If instead evidence for the targets and lures is correlated across trials, then the difference distribution has a net variance equal to the sum of the originating variances minus two times their covariance [Var(Target) + Var(Lure) − 2Cov(Target, Lure)]. This difference distribution has the same mean value as the A‐B′ distribution, but has less variance because of the trial‐by‐trial similarity of the A‐A′ probes. Finally, because the remember/know distinction under 1D‐SDT is thought merely to reflect a confidence distinction (Donaldson, 1996; Hirshman & Master, 1997), we have also added a remember criterion. If the difference in evidence is sufficiently large, it is assumed that the subject becomes highly confident of the endorsement and thus reports “remembering.” Although two remember criterion lines are depicted in the figure, they represent a single criterion value because the sign of the difference value is not known
to the subject. The sign of the difference value merely indicates whether the subject was correct or incorrect on the trial. Thus the subject selects whichever item has the greater evidence value, and if the difference is sufficiently large and favors the lure, he or she will incorrectly claim to actually “remember” the lure. This happens very rarely in Fig. 9 because the difference in evidence rarely exceeds 1.6 units in favor of the lure item. The model depicted in Fig. 9 corresponds to a least squares fit to the data in Table I using four free parameters: two distance values, one variance value, and a remember criterion value, all of which are represented in the figure. The variance of the A‐X and A‐B′ distributions was fixed at 2, consistent with the basic 1D‐SDT model in Fig. 1 (i.e., it is the sum of the two independent variances of 1). Considering first the overall performance, ignoring the remember/familiar distinction, the 1D‐SDT model indeed efficiently accommodates the confidence/accuracy inversion phenomenon. The A‐X condition yields the highest accuracy and confidence because its distribution falls furthest to the right of the 0 difference location. However, although the means of the A‐A′ and A‐B′ distributions are equivalent, accuracy is higher for the A‐A′ items. This occurs because reducing the variance of this distribution raises the mode by more than the resulting loss of performance in the right tail. That is, although reducing the variance reduces the correct responses in the right tail, it also reduces the incorrect responses in the left tail, with both of these reductions combining to elevate the center of the distribution. Since the center of the distribution lies to the right of 0, almost all of the covariance effect promotes increased correct responding. Despite the greater A‐A′ accuracy compared to A‐B′, confidence is predicted to have the reverse relationship.
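The accuracy argument above, together with the confidence prediction discussed next, can be checked numerically. The sketch below is a minimal illustration and not the fitted model: only the variance of 2 for independent probes comes from the text, whereas the mean difference of 1.0 and the covariance of 0.25 are hypothetical values chosen for the example, and the function names are illustrative. It computes the variance of the difference distribution, 2AFC accuracy as the area above 0, and the mean difference on correct trials.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for pdf and cdf


def difference_variance(var_target, var_lure, cov=0.0):
    """Variance of (target evidence - lure evidence)."""
    return var_target + var_lure - 2.0 * cov


def accuracy_2afc(mean_diff, var_diff):
    """Probability that the difference is positive (target selected)."""
    return STD_NORMAL.cdf(mean_diff / sqrt(var_diff))


def mean_correct_difference(mean_diff, var_diff):
    """Mean of the difference distribution truncated below at 0."""
    sd = sqrt(var_diff)
    a = mean_diff / sd
    return mean_diff + sd * STD_NORMAL.pdf(a) / STD_NORMAL.cdf(a)


# Hypothetical values: equal means, independent vs correlated evidence.
MEAN_DIFF = 1.0
var_ab = difference_variance(1.0, 1.0)            # A-B': independent probes
var_aa = difference_variance(1.0, 1.0, cov=0.25)  # A-A': correlated probes

for label, var in (("A-B'", var_ab), ("A-A'", var_aa)):
    print(label,
          round(accuracy_2afc(MEAN_DIFF, var), 3),
          round(mean_correct_difference(MEAN_DIFF, var), 3))
```

With equal means, the lower variance A‐A′ case yields higher accuracy but a lower mean difference on correct trials, which is the inversion described in the text.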
This is because average confidence during correct reports should reflect the average distance of the correct evidence values from the 0 point. The relative depression in the right‐hand tail of the A‐A′ compared to the A‐B′ distribution means that, for correct responses, the average evidence value will be very slightly lower than that of the A‐B′ condition. Hence, the A‐A′ condition is predicted to yield lower confidence, which in fact happens (Table I). The problems for the 1D‐SDT model arise when “remember” and “familiar” response confidence are considered separately. For example, the region between the 0 point and the right remember criterion corresponds to correct familiar responses. It is clear from the figure that the correct familiar responses in the A‐A′ and A‐B′ conditions should yield similar confidence because the average evidence values should be equivalent. Nonetheless, as noted above, the empirical data showed that familiar response confidence was significantly higher in the A‐B′ compared to the A‐A′ condition (Table I). Additionally, the average value of correct familiar A‐X items would seem to fall well to the right of that of A‐B′ items given the heavy skew/asymmetry of this portion of the A‐X distribution. Nonetheless, the mean confidence for these two response
types was virtually identical. Finally, consider the correct remember responses, which fall to the right of the correct remember criterion. The figure clearly indicates that correct remembering for the A‐X condition should be much more confident than for the A‐A′ and A‐B′ conditions because the A‐X distribution falls well to the right of the latter two. Despite this, the mean confidence for all three remember report types was strikingly similar (Table I). Thus overall, there is no clear relationship between the predicted mean evidence values and the expressed confidence of the observers. This is not to say that the 1D‐SDT model might not be expanded somehow in the future to accommodate such data; it merely emphasizes that any extension is unlikely to be motivated by the core properties of the basic model shown in Figs. 1 and 9. One way to accommodate such data outside the 1D‐SDT framework is to assume that the confidence of responding based on diagnostic recollection is less affected by extra‐mnemonic context than when responding is based predominantly on item familiarity. Under this assumption, subjects weigh the perceptual similarity of the probes more heavily when responding is familiarity based, and thus the A‐A′ condition yields low confidence assessments compared to the A‐B′ and A‐X conditions. In contrast, when recollection is the predominant source of diagnostic information and unique to only one probe, the confidence of reporting is uniformly high and similar across the three conditions. In short, the perceptual similarity of the probes (an extra‐mnemonic context) weighs more heavily during familiarity‐based than recollection‐based responding. Of course, such an approach requires a strong psychological distinction between recollection and familiarity that also maps generally onto subjective “remember” and “familiar” reports.
3. Other Confidence and Accuracy Dissociations
Aside from manipulating perceptual or scene similarity during recognition, numerous other phenomena yield confidence/accuracy inversions (for a brief review, see Busey, Tunnicliff, Loftus, & Loftus, 2000). For example, Busey et al. (2000) found that dimly studied faces that were tested when illuminated brightly showed decreased accuracy but increased confidence compared to when they were tested in the original dim format. Given the data of Dobbins et al. (1998), it might be interesting to see whether this finding is conditioned by subjective remember/familiar reports such that only familiarity‐based responses demonstrate the illumination context sensitivity. Psychologically, Busey et al. (2000) explained this finding as reflecting a decision heuristic whereby observers gave too much significance to illumination at test when calculating their confidence. This effect may also reflect the use of a fluency attribution heuristic similar to those proposed to account for mere exposure and other effects (Higham & Vokey, 2000; Jacoby & Whitehouse, 1989).
Under this explanation, the fluency with which an item is perceived is used by the observer as evidence that the item has in fact been recently encountered, and thus manipulations that improve perceptual fluency can sometimes erroneously inflate memory attributions. To the extent that illumination improves processing fluency, subjects may inadvertently rely on this as an indication of prior encounter. Aside from manipulations of test context, there are other reasons to question the confidence/accuracy predictions of the basic 1D‐SDT model even in simple designs. For example, the Raposo and Dobbins (in preparation) criterion investigation discussed above yields relevant data. As noted earlier, this study contained a series of experiments replicating and extending the findings of Verde and Rotello (2007), who found that gross changes in target detection rates across test halves had no discernible effect on the false alarm rate, which in turn suggests a fixed recognition decision criterion maintained throughout testing. Figure 10 shows data from a replication of the effect using an even stronger manipulation of target “strength” across test halves (response rates shown in Fig. 4A). Is there a simple relationship between confidence and accuracy during this relatively straightforward design? Considering the mean confidence of the different report types in each test half suggests not. For example, in the deep test half the correct rejection rate for lures (1 minus the false alarm rate shown in Fig. 4A) was numerically higher than the hit rate, and this difference approached significance [0.88 vs 0.81; t(15) = 1.86, p = .082]. However, as shown in Fig. 10, the confidence relationship is prominently in the opposite direction, with much greater confidence in hits than correct rejections [2.38 vs 2.81; t(15) = 4.18, p < .001]. Shifting focus to the second test half, correct rejections greatly outnumber hits [0.87 vs 0.39; t(15) = 7.73, p < .001] and yet, as Fig. 10 shows, confidence does not differ between the response types [2.31 vs 2.45; t(15) = 1.53, p > .14], although it is numerically higher during the less accurate, infrequent hits. Thus subjects were numerically more confident in hits despite being abysmally deficient in the detection of old items compared to the rejection of new lures. The 1D‐SDT model of Fig. 1 again does not provide consistent clues as to which class of responses should be made more confidently. One potential criticism of the analysis above is that it ignores potential differences in the variance of the old and new item distributions. For example, the receiver operating characteristics (ROCs) of old/new recognition performance are often consistent with the assumption that the old item distribution is more variable than that of the new, and this difference in variability will have an impact on the average distance from criterion for the items of a given response type (Ratcliff, Sheu, & Gronlund, 1992). The insets of Fig. 10 illustrate the addition of the unequal variance assumption. The insets were derived by assuming that the old item variance was 1.5 times
[Figure 10: top panel plots mean confidence (2.0 to 3.0) for hits and correct rejections (CR) in the 1st (deep) and 2nd (shallow) test halves, with response proportions of .81 (hits) and .88 (CR) in the deep half and .39 (hits) and .87 (CR) in the shallow half. Bottom panels, A. Deep test half and B. Shallow test half, show the fitted distributions on an evidence axis from −3 to 5, with marked values of 1.17, −.23, and 2.65 (deep) and 1.19, −.23, and 2.04 (shallow).]
Fig. 10. Confidence data from Raposo and Dobbins (in preparation). Top panel illustrates the mean confidence for hits and correct rejections. Numbers next to each box‐plot indicate the response proportions. Bottom figures show an unequal variance 1D‐SDT model fitted to the data. Upside down triangles indicate the mean strength of evidence for hits and correct rejections obtained through simple simulation.
the new. Given this assumption and the observed response rates, the distribution distances and criterion positions were calculated. Finally, the mean evidence distances were estimated using truncated normal distributions both above and below the criterion (10,000 sample values for each estimated average). The unequal variance assumption only minimally improves the situation. More specifically, in the first test half (Fig. 10A) the average distance from criterion for the hits was 1.48 units (2.65 − 1.17). For correct rejections the distance was 1.40 units. Thus the hit items lay, on average, 0.08 units further from the criterion under the unequal variance assumption. Given this, one might predict a very slight confidence advantage for the hits, but the dramatic difference shown at the top of Fig. 10 seems highly unlikely.
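The calculation just described can be approximated without sampling. The sketch below replaces the simulation with the closed‐form mean of a truncated normal distribution, which is a substitution on our part rather than the original procedure. It assumes a new item distribution with mean 0 and variance 1 and an old item variance of 1.5, and it uses the rounded first‐half rates given above (hit rate 0.81, correct rejection rate 0.88). Function and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def fit_unequal_variance(hit_rate, cr_rate, old_var=1.5):
    """Return (criterion, old_mean), with new items ~ N(0, 1)."""
    criterion = Z.inv_cdf(cr_rate)  # P(new < criterion) = cr_rate
    old_sd = sqrt(old_var)
    # P(old > criterion) = hit_rate fixes the mean of the old distribution.
    old_mean = criterion + old_sd * Z.inv_cdf(hit_rate)
    return criterion, old_mean


def mean_distances(hit_rate, cr_rate, old_var=1.5):
    """Mean distance from criterion for hits and for correct rejections."""
    criterion, old_mean = fit_unequal_variance(hit_rate, cr_rate, old_var)
    old_sd = sqrt(old_var)
    a = (old_mean - criterion) / old_sd
    mean_hit = old_mean + old_sd * Z.pdf(a) / Z.cdf(a)  # E[old | old > criterion]
    mean_cr = -Z.pdf(criterion) / Z.cdf(criterion)      # E[new | new < criterion]
    return mean_hit - criterion, criterion - mean_cr


hit_distance, cr_distance = mean_distances(hit_rate=0.81, cr_rate=0.88)
print(round(hit_distance, 3), round(cr_distance, 3))
```

Under these assumptions the two distances come out close to the 1.48 and 1.40 units reported above; small discrepancies are expected because the input rates are rounded.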
In contrast, during the second half of the test, the simulation suggested that hits averaged 0.85 units above the criterion whereas correct rejections averaged 1.42 units below the criterion. Given this, one would expect much greater confidence for correct rejections, yet as shown in the top of the figure, actual confidence was in fact numerically higher for the hits, again opposite to the predicted direction. Thus, despite a much greater discrepancy in average distance from criterion for hits and correct rejections during the second test half, the difference in average confidence is (1) in the wrong direction, and (2) much smaller than that observed during the first half of the test. One would therefore have to conclude that a minuscule difference in distances in the first test half produced prominent differences in rated confidence, whereas a prominent difference in distances in the second test half yielded minuscule differences in confidence that were in the wrong direction. As with the scene manipulations described above, the pattern of confidence can potentially be explained by drawing a distinction between context recollection and item familiarity. If old items evoke a mix of feelings of familiarity and contextual recollection, and if the latter becomes prevalent with deeper levels of processing, then one expects greater confidence for hits than correct rejections because recollection is a source of highly confident responding (Yonelinas, Dobbins, Szymanski, Dhaliwal, & King, 1996). This explains why there is a prominent difference in hit and rejection confidence in the first test half, where the targets were processed deeply, and only minor differences in the second test half, where target encoding was impoverished and hence recollection was likely to be rare. It is also consistent with the failure to drive the confidence level of hits below that of correct rejections in the second test half, even when the response rate was dramatically lower for hits.
If recollection is the dominant contributor to confidence, removing it will reduce both confidence and performance, but is unlikely to reduce confidence below that of correct rejections, which are assumed to be based purely on familiarity. It is important to note that this characterization does not require one to assume threshold or discrete states for recollection, as one can additionally assume that recollection and familiarity are both continuous values. However, the key point is that the prediction arises because of hypothesized differences in the phenomenology and relative prevalence of recollection and familiarity, and these distinctions cannot be gleaned merely from inspecting the fitted 1D‐SDT model. A similar argument was made by Dobbins, Kroll, and Yonelinas (2004) in a study that examined the effects of dividing attention at study on subsequent recollection and familiarity estimates. This study used a Brown–Peterson manipulation to regulate the processing of words during study repetitions. Subjects were informed that their primary task was to encode digit strings similar to zip codes. During each trial they were given a digit string and then
TABLE II
REMEMBER AND FAMILIAR RESPONSE PROPORTIONS AS A FUNCTION OF NUMBER OF REHEARSALS AND ACCURACY ON DUAL‐TASK PERFORMANCE ADOPTED FROM DOBBINS ET AL. (2004)

                  Two rehearsals             Eight rehearsals
                  “Familiar”   “Remember”    “Familiar”   “Remember”    Rehearsal increase
CorrectBP         0.26         0.24          0.35         0.30          0.15
IncorrectBP(a)    0.26         0.34          0.25         0.49          0.14

False alarms      “Familiar” 0.18            “Remember” 0.03

(a) IncorrectBP data was computed only for subjects contributing 14 or more error trials in the low rehearsal condition (n = 22). Rehearsal Increase is the net gain in overall hit rate as a function of increased rehearsal (8 vs 2). Note: BP = Brown–Peterson Task.
required to engage in a “distractor” task that involved reading word pairs presented on the screen either twice (two rehearsals) or eight times (eight rehearsals). Following the last presentation of a trial pair, subjects were tested on their memory for the digit string. Finally, following the last Brown–Peterson encoding trial, subjects were given a surprise test for the words used during the “distractor” task using the remember/familiar distinction. Trial outcomes were binned not only by the remember/familiar memory outcomes, but also by whether digit string memory on the earlier Brown–Peterson trial was correct or incorrect. The critical test data are shown in Table II with both raw familiarity rates and rates adjusted assuming independence between familiarity and recollection [i.e., familiarity/(1 − recollection)]. The key difficulty for the 1D‐SDT model is in fitting the effect of rehearsal for items from correct Brown–Peterson trials (top row, Table II). Focusing on the correct Brown–Peterson trials, the raw data demonstrate a modest increase in familiar response rates with rehearsal (0.26 to 0.35) and a smaller increase in remember rates (0.24 to 0.30). The problem is that the 1D‐SDT model predicts the qualitatively opposite outcome, with the general prediction being that manipulations of item strength should almost always be more prominent in remember rates because the remember criterion isolates the tails of the distributions. To illustrate this, Dobbins et al. (2004) used the
1D‐SDT model to make quantitative predictions about the effects of increasing rehearsal from two to eight repetitions for correct Brown–Peterson items. To do this, the hypothetical locations of the old/new and remember/familiar criteria were determined based on the response rates to items seen twice. For example, if the hit rate for twice‐seen items was 50%, then the old/new criterion must fall at the mean of this distribution. Likewise, if 25% of the items also elicited “remembering,” then the position of the remember/familiar criterion can be calculated to lie 0.67 units above the mode. Finally, given these initial calculations, the predicted familiarity and remember rates for the items seen eight times could be calculated. This is shown graphically in Fig. 11. Figure 11A demonstrates that if one assumes the low and high rehearsal items have the same variance, then the model fails. More specifically, it predicts little change in the raw “familiar” response rates with increased rehearsal, whereas there should be a prominent increase in the “remember” rates. When the model was applied to each individual subject, the systematic discrepancy between the predicted and observed remember/familiar proportions was clear (Fig. 11B). The model systematically underpredicted the gains in familiarity responses and overpredicted the gains in remember responses. Finally, if one adopts the assumption that increasing rehearsal would increase the variance of the highly rehearsed items compared to the items seen only twice, then the errant predictions of the 1D‐SDT model become even more pronounced (Fig. 11C). Of course, demonstrating that the 1D‐SDT model is inadequate does not explain the pattern of findings. Dobbins et al. (2004) explained the pattern by suggesting that attentional resources are critical for elaborative encoding, which in turn predominantly improves later recollection.
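The criterion calculation behind the equal variance prediction can be sketched as follows. This is a minimal illustration rather than the procedure of Dobbins et al. (2004): it assumes unit variance distributions, uses the rounded correct Brown–Peterson rates from Table II, and assumes that the mean of the eight‐rehearsal distribution is set from the observed overall hit rate, a step the text does not spell out. Names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal


def predict_eight_rehearsals(hit2, remember2, hit8):
    """Predict (familiar, remember) rates at eight rehearsals.

    Evidence for twice-seen items is N(0, 1); both criteria are fixed from
    the two-rehearsal rates and only the distribution mean then shifts.
    """
    old_new_criterion = Z.inv_cdf(1.0 - hit2)        # P(x > c) = hit2
    remember_criterion = Z.inv_cdf(1.0 - remember2)  # P(x > r) = remember2
    # Assumption: shift the mean until the overall hit rate equals hit8.
    mean8 = old_new_criterion + Z.inv_cdf(hit8)
    remember8 = 1.0 - Z.cdf(remember_criterion - mean8)
    familiar8 = hit8 - remember8
    return familiar8, remember8


# Rounded rates for correct Brown-Peterson trials (Table II).
familiar_pred, remember_pred = predict_eight_rehearsals(
    hit2=0.26 + 0.24, remember2=0.24, hit8=0.35 + 0.30)
print(round(familiar_pred, 3), round(remember_pred, 3))
```

With these inputs the sketch predicts a remember rate near 0.37 and a familiar rate near 0.28 at eight rehearsals, against observed rates of 0.30 and 0.35: the rehearsal gain is overpredicted for “remember” responses and underpredicted for “familiar” responses.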
Thus during correct Brown–Peterson trials it was assumed that attention was heavily diverted from meaning‐based, elaborative processes and hence recollection should suffer generally, and the repetition benefit should be muted. In contrast, when subjects were incorrect on the Brown–Peterson task, Dobbins et al. (2004) assumed that attention had been diverted from the primary task to the rehearsed words, enabling elaborative encoding. From this it follows that rehearsal gains in recollection during correct Brown–Peterson trials should be heavily muted, whereas they should be prominent under incorrect Brown–Peterson performance. This is also demonstrated in Table II. There is an overall increase in recollection rates for incorrect versus correct Brown–Peterson trials, and gains in recollection with rehearsal are much more prominent following incorrect compared to correct Brown–Peterson trials. Furthermore, this interpretation is consistent with attentional capacity interpretations of the effects of aging on item memory (Craik & Jacoby, 1996). Again, however, the interpretation requires assuming a key distinction
[Figure 11: panels A and C show evidence distributions for items rehearsed two and eight times on an axis from −3 to 3, with the old/new and remember cutoffs marked; the proportions printed in panel A are .27, .26, .37, and .24, and in panel C .26, .18, .47, and .24. Panel B plots proportions (0.20 to 0.48) for Fobs, Fpred, Robs, and Rpred in the 8‐rehearsal condition.]
Fig. 11. Data from the rote rehearsal study of Dobbins et al. (2004). Top and bottom panels illustrate fitted 1D‐SDT models employing separate criteria for remember reports and overall hits. Proportions indicate the remember and know rate predictions of the model across items rehearsed twice or eight times. Panel A assumes equal variance and panel C assumes that increased rehearsal increases the variance of the evidence distribution. Panel B shows the data when fitted to each individual subject. Fpred = “predicted familiarity rate.” Fobs = “observed familiarity rate.” R plots are the analogous values for the remember rates. In all three panels, the model overpredicts the rehearsal effects on remember rates and underpredicts the effects on know/familiar rates. (Drawn from one figure and waiver applies to primary author, Psychonomics Society.)
between recollection and familiarity, with the former being more reliant on attentional resources at encoding.
4. Summary of Section II.B. The Relationship Between Confidence and Accuracy
Overall, the findings presented in Section II.B do not support the clear and simple relationship between confidence and accuracy that characterizes the 1D‐SDT model. In paradigms using simple context manipulations and subjective reports, there are findings that cannot be understood, and more importantly would not have been predicted, using the models depicted in Figs. 1 and 9. One potential criticism of such a conclusion is that the 1D‐SDT model does not rule out other factors affecting reported confidence; it merely assumes that there is a lawful relationship between evidence and confidence. From this perspective other factors may mask or perhaps even reverse the observed relationship, but this should not be viewed as a challenge to the 1D‐SDT model itself. Because this type of critique potentially applies to the entire chapter, we will address it here. Briefly, there are two factors that weigh against this counterargument. First, the confidence/accuracy relationship of the model seems strained even if very simple paradigms are considered. For example, as noted above, the predicted relationship is questionable even during simple single item recognition tasks (e.g., Fig. 10). It is hard to think of what hidden factors might be obscuring the predicted relationship in such a simple paradigm. Of course, one could hold that correct rejections are somehow by nature less confident than endorsements, perhaps by asserting that “no” responses are inherently less confident than “yes” responses. However, this merely highlights the inadequacy of the model because such a tendency cannot be deduced from the model itself. Furthermore, Dobbins and McCarthy (under review) have manipulated the framing of recognition cues such that “yes” responses are mapped to either old items (“Old Item?” queries) or new items (“New Item?” queries) and this, in and of itself, did not affect reported confidence.
That is, the confidence for items was unaffected by whether these were mapped onto “yes” or “no” response options (Dobbins & McCarthy, under review). The second problem with such a counter to the above model criticisms is that it potentially stifles testing of the model itself. Under the 1D‐SDT model, it should be quite easy to generate conditions where hits are significantly less confident than correct rejections. Indeed, all one should have to do is to move the criterion in a sufficiently conservative direction. However, we know of no findings demonstrating this pattern and this absence should be disturbing to model advocates. Allowing appeals to unknown and incidental factors as
masking this assumed relationship would seem to obscure the fact that the full range of the predicted relationship is hard to find in extant data. Put another way, the scarcity of evidence suggesting that one can reduce confidence for hits below that of correct rejections should in and of itself be taken as a serious problem, and any special pleading to unspecified factors as masking the “true” confidence difference makes the model difficult to critically test.
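The prediction at issue can be made concrete. The sketch below assumes the equal variance model with unit variances and, following the reasoning used earlier in this section, treats mean distance from the criterion as the index of predicted confidence. The d′ of 1.5 and the criterion values are hypothetical, and the function names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal


def mean_distance_hits(d_prime, criterion):
    """E[x - criterion | x > criterion] for old items ~ N(d_prime, 1)."""
    a = d_prime - criterion
    return a + Z.pdf(a) / Z.cdf(a)


def mean_distance_correct_rejections(criterion):
    """E[criterion - x | x < criterion] for new items ~ N(0, 1)."""
    return criterion + Z.pdf(criterion) / Z.cdf(criterion)


D_PRIME = 1.5  # hypothetical discriminability
for criterion in (0.75, 1.5, 2.25):  # midpoint, then increasingly conservative
    print(criterion,
          round(mean_distance_hits(D_PRIME, criterion), 3),
          round(mean_distance_correct_rejections(criterion), 3))
```

At the midpoint criterion the two distances are equal; as the criterion becomes more conservative, the mean distance for hits falls below that for correct rejections, which is the pattern the model predicts but which the text notes has not been reported.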
C. IS THE RECOGNITION CRITERION INFORMED BY INDIVIDUAL SKILL?
1. Optimal Decision Criterion in 1D‐SDT
As shown in Fig. 1A, the optimal placement of the decision criterion in the equal variance 1D‐SDT model is midway between the two distributions. It is at this point that a subject will maximize his or her correct response rate across the trials. Although the 1D‐SDT model does not demand it, some advocates of the model assume that subjects do in fact use optimal decision criteria (Glanzer & Adams, 1990). This additional assumption arises for at least two reasons. First, in the derivation of the signal detection model, the preferred treatment of the evidence axis was as an indication of relative likelihood and, more formally, a likelihood ratio statistic, instead of an individual evidence value for each stimulus (for a brief review, see Balakrishnan & Ratcliff, 1996). The statistical and theoretical benefits of this approach are beyond the scope of this chapter, but to the extent that one assumes that the observer has access to indicators of relative likelihood, it becomes clear why he or she should typically respond near optimally. If a given item yields a likelihood ratio estimate indicating it more likely arose from the target distribution, then it would be odd for the observer not to respond in that fashion. Indeed, under a relative likelihood evidence axis, subjects should only respond in a decidedly nonoptimal way if the calculation of the likelihood ratio itself is errant. Thus, for theorists who assume a likelihood ratio axis, it is perhaps natural to assume an optimal criterion placement. The second reason for assuming optimal or near optimal placement is linked with assumptions about the observer’s goals during recognition. For example, Dunn (2004) argued that the pattern of several experiments was consistent with “. . .that of an unbiased observer who is attempting to maximize his or her overall level of performance while weighing the costs of misses and false alarms equally” (p. 531).
Furthermore, the ability to place the criterion optimally is sometimes argued to naturally arise out of a lifetime of experience with recognition judgments preceding the experimental context. That is, the importance of maximizing memory successes outside the laboratory is assumed to drive the development of optimal criterion placement and this should naturally carry over to laboratory performance.
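The claim that the midpoint criterion maximizes the correct response rate can be verified directly. The sketch below assumes the equal variance model with equal numbers of old and new items and locates the best criterion by a simple grid search; the d′ of 2 and the grid spacing are arbitrary choices for illustration, and the function names are our own.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal


def proportion_correct(d_prime, criterion):
    """Mean of hit rate and correct rejection rate (equal base rates)."""
    hit_rate = 1.0 - Z.cdf(criterion - d_prime)  # old items ~ N(d_prime, 1)
    cr_rate = Z.cdf(criterion)                   # new items ~ N(0, 1)
    return 0.5 * (hit_rate + cr_rate)


def best_criterion(d_prime, step=0.01):
    """Grid search over criteria from -3 to 5 on the evidence axis."""
    n_steps = int(round(8.0 / step))
    grid = [-3.0 + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda c: proportion_correct(d_prime, c))


D_PRIME = 2.0  # hypothetical discriminability
print(best_criterion(D_PRIME), D_PRIME / 2)
```

Under these assumptions the search returns a criterion at d′/2, to within the grid spacing. With unequal base rates or unequal costs for misses and false alarms the maximizing point would shift, which is why the equal weighting is stated as an assumption.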
2. Hit/False Alarm Correlation and Individual Variation
Before examining the optimal criterion claim more carefully, it is important to point out that the model in Fig. 1 is a model of individual behavior, not a model of group or condition effects. More specifically, the evidence distributions represent the range and density of evidence values experienced by a single individual, and the decision criterion is that individual’s solution to the problem of continuous, overlapping evidence. Given this, the most appropriate level for testing the model is that of individuals. If subjects place the criterion optimally, what patterns should be present in the raw subject data? As demonstrated in Fig. 12, an optimal criterion placement across a group of individuals in any given study should result in a perfect negative correlation between detection (hits) and errors of commission (false alarms). This figure was constructed by sampling d′ values ranging from 1 to 4 in increments of 0.1, and using the optimal criterion location (d′/2) to determine the hits and false alarms. Every point falls along the negative diagonal, and this occurs because the criterion position (d′/2) must track the mean of the old
[Figure 12: four panels show evidence distributions for d′ = 1, 2, 3, and 4 on an axis from −3 to 5; a central panel plots hit rate against false alarm rate, each from 0 to 1.]
Fig. 12. Illustration of the predicted relation between hit and false alarm rates across observers if the criterion were placed optimally under the 1D‐SDT model. Maintenance of an optimal location results in a perfect negative correlation of hits and false alarms.
item distribution across subjects in order to remain optimal. In other words, in an optimal 1D‐SDT model, the d′ distance and the old/new criterion position are perfectly correlated. Thus across a set of individuals with sufficient variation in accuracy, the optimal 1D‐SDT model predicts a strong negative correlation between hits and false alarms. A second prediction arises from the fact that the optimal position is half of the distance between the distribution means (d′/2). This means that the variability in the criterion locations across subjects should be half of the variability in skill levels across subjects (d′) when measured in terms of standard deviations, or a quarter of the variability when measured in terms of variances. In contrast to these clear predictions, analysis of the pattern of individual variation in actual data does not support the 1D‐SDT model under the assumption of an optimal criterion. For example, examining the individual subject data from three separate studies, Dobbins, Khoe, Yonelinas, and Kroll (2000) found that all studies demonstrated a significant positive correlation between hits and false alarms across observers (mean r of .61). This finding has been well replicated, with Wixted and Stretch (2004) presenting significant positive correlations for four additional studies (mean r of .49). Thus the actual correlation between hit and false alarm rates of observers, within any given study, appears to be qualitatively reversed from what is predicted under the optimal 1D‐SDT model. This in turn suggests that individuals are decidedly nonoptimal in the placement of decision criteria. One potential criticism of such a conclusion is that the variability in accuracy across subjects within a given experiment may be too restricted because of the similar encoding conditions and testing procedures, and that this will affect the observed relationships (Dunn, 2004).
There are two potential counters to this argument, one theoretical and one empirical. First, restricting the range of discrimination ability would not, under the optimal 1D‐SDT model, reverse the sign of the predicted relationship. That is, limiting the range of d′ values in a given experiment might reduce one’s power to detect the negative hit/false alarm relationship if present, but there is no reason to suspect it would result in a systematically opposite outcome, namely a reliable tendency to observe robust positive correlations. For this to occur, some other factor that is independent of accuracy has to be influencing the criterion location. Second, the range in performance across individuals within basic recognition studies, such as those reported by Dobbins et al. (2000) and Wixted and Stretch, is generally quite pronounced. For example, in the three studies reported by Dobbins et al. the range of d′ values was 0.71 to 2.09, 0.28 to 2.58, and 0.16 to 1.63 (Dobbins et al., 2000). In each case this would appear to be sufficient to glean a negative relationship had one been present. Furthermore, when all the data were combined into a single
correlational analysis, ensuring a wide range of performance, the relationship between hits and false alarms remained significantly positive (r = .46). The second prediction of the optimal model discussed above was that the variability in the old/new decision criterion would be considerably less than that of the d′ values of the observers. That is, since the optimal point is d′/2 for each observer, the variability in this point would necessarily be half of the variability of the accuracy performance levels across subjects (or a quarter if using variances). The old/new decision criterion variability can be estimated by taking the z‐score of the false alarm rate for each subject and comparing the variance in this value with the variance of the d′ estimates. More specifically, the ratio of the variance of the false alarm rate z‐score to the variance of the d′ values across subjects should be 0.25 if they are using an optimal criterion. In contrast, using the 72 cases reported in Dobbins et al. (2000) this value was greater than 1 (0.31/0.30 = 1.03). Although it will be important to verify this pattern in other individual subject data sets, the data from the three studies reported by Dobbins et al. indicate that the variance ratio is four times larger than it should be if subjects were in fact optimal in their criterion placement according to the 1D‐SDT model. Again, it is clear that some factor or factors extrinsic to the model in Fig. 12 is or are playing a dominant role in influencing each observer’s criterion placement. There are many potential factors that could play a pivotal role in how observers place their criterion, such as stable subject traits like risk aversion. However, to our knowledge, these or similar subject traits have not been systematically investigated with respect to individual variation in criterion placement during recognition or other item‐based memory judgments.
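The variance‐ratio check described above is easy to express in code. The following is a minimal sketch, assuming an equal‐variance model in which d′ is estimated as z(hit rate) minus z(false alarm rate); the function names are hypothetical, and the only empirical figures used are the two variances (0.31 and 0.30) quoted in the text.

```python
from statistics import NormalDist, pvariance

STD_NORMAL = NormalDist()  # standard normal, used for the z-transform


def z_score(rate):
    """Inverse-normal transform of a proportion (must lie strictly between 0 and 1)."""
    return STD_NORMAL.inv_cdf(rate)


def d_prime(hit_rate, fa_rate):
    """Equal-variance accuracy estimate: z(hit rate) - z(false alarm rate)."""
    return z_score(hit_rate) - z_score(fa_rate)


def criterion_variance_ratio(hit_rates, fa_rates):
    """Variance of z(false alarm rate) divided by variance of d' across subjects.

    A value near 0.25 is what the optimal-criterion 1D-SDT model predicts.
    """
    z_fa = [z_score(f) for f in fa_rates]
    d_primes = [d_prime(h, f) for h, f in zip(hit_rates, fa_rates)]
    return pvariance(z_fa) / pvariance(d_primes)


# Ratio formed from the two variances quoted in the text (0.31 and 0.30).
reported_ratio = 0.31 / 0.30
print("reported ratio:", round(reported_ratio, 2))
print("relative to the optimal value of 0.25:", round(reported_ratio / 0.25, 1))
```

Applied to per‐subject hit and false alarm rates, criterion_variance_ratio returns the quantity that the optimal model fixes at 0.25. Rates of exactly 0 or 1 have no finite z‐score, so they would need a correction before being passed in; which correction to use is left open here.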
From our perspective, it would be odd if subjects exhibited stable tendencies toward risky or conservative choice in other noisy cognitive judgments, but failed to do so during item‐based memory judgments. Do the findings above mean that the criterion is completely uninformed by the skill level of the observers? Aside from possible variability in personality characteristics, this depends on whether there are multiple kinds of evidence contributing to accuracy, and whether these differ in their relative influence on criterion placement. The experiments summarized by Dobbins et al. (2000) not only contained simple recognition scores, but all three used the now familiar remember/know response technique. Based on the data, these researchers suggested that the remember rate, and hence recollection, was either unrelated or negatively related to the placement of the criterion (see also Cary & Reder, 2003; Reder et al., 2000). Psychologically, the idea behind this is that if subjects come to expect recollection in the detection of targets, they may decide to be less willing to endorse an item simply because it seems familiar. Similarly, if a given subject restricts endorsements to recollective
TABLE III
CORRELATIONS ADAPTED FROM DOBBINS ET AL. (2000)

                Hit & False alarm    Rem‐Ratio & False alarm    Rem‐Ratio & Hit
Experiment 1    .56c                 −.63d                      .01
Experiment 2    .83d                 −.57c                      .35
Experiment 3    .45a                 −.51b                      .05
Total           .46d                 −.44d                      .15

a p < .05. b p < .01. c p < .005. d p < .001.
evidence, then as a consequence he or she will be unlikely to endorse an item simply based on its familiarity. A simple way to operationalize this idea is to calculate the ratio of correct remember rates to overall hit rates. Subjects relying more exclusively on recollection would then have values approaching 1. That is, if all of a subject’s hits are based on recollection, this number will be 1; conversely, if a subject frequently endorses items based on familiarity, this number will approach 0. Table III demonstrates that this remember rate ratio value is in fact systematically related to the false alarm rate of observers in the studies examined by Dobbins et al. In all three experiments, and in the overall data, subjects whose hit rates are dominated by remember reports are in fact significantly less likely to commit false alarms. Importantly, the remember/hit rate ratio is not significantly correlated with the hit rate itself. Indeed, if it were significantly correlated with the hit rate, then it would have demonstrated a positive relationship with the false alarm rate because, as established above, there is a clear positive correlation between overall hits and false alarms. Instead, it is the degree to which the observers’ detection rates are constrained to reports of remembering that inversely predicts the tendency to commit false alarms. Such a tendency would appear to be difficult to predict based on the simple mechanics of the basic 1D‐SDT model of Fig. 1C.
3. Summary of Section II.C: Is the Recognition Criterion Informed by Individual Skill?
The data in Section II.C.3 demonstrate in a straightforward manner that if one adheres to a 1D‐SDT framework, then one must also concede that observers deviate significantly from optimal when placing the old/new
decision criterion. That is, one must conclude that criterion placement is determined in large part by a factor or factors other than the individual skill of each observer. If this were not the case, that is, if subjects placed the criterion optimally, then the hit and false alarm rates would be strongly negatively correlated, not positively correlated as they are in the raw data. Additionally, the variability of the old/new criterion would be noticeably smaller than that of the d′ accuracy estimate, and not similar as indicated above. Interestingly, when the remember/know procedure is used, it does appear that the tendency to restrict hits to remember reports is inversely related to the commission of false alarms. As noted above and in Fig. 1C, remembering under the 1D‐SDT framework is held to reflect the position of a secondary criterion that is simply higher on the decision axis than the old/new decision criterion (Donaldson, 1996; Hirshman & Henzler, 1998; Hirshman & Master, 1997). Looking at the model in Fig. 1C, it seems fair to argue that nothing inherent in the spatial representation itself would lead one to predict a positive correlation between hits and false alarms across observers, and yet a negative correlation between the remember/hit rate ratio and false alarms across observers. If one instead makes a distinction between the diagnostic properties of recollection and familiarity, and assumes that a greater relative reliance on the former will also reduce the tendency to incorrectly endorse new items, then the patterns in the data are predicted. Observers who base responding primarily on remembering will, conversely, tend not to commit false alarms to new items. This is similar to the notion that observers can naturally adjust the specificity of memory information they choose to base responding on during memory reports (Goldsmith & Koriat, 1999).
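The remember/hit rate ratio referred to above is a one‐line computation. The sketch below is illustrative only: the function name and the example rates are hypothetical and are not values from Dobbins et al. (2000).

```python
def remember_ratio(remember_hit_rate, overall_hit_rate):
    """Proportion of a subject's hits that were accompanied by a remember report.

    Values near 1 indicate hits based almost entirely on recollection;
    values near 0 indicate hits endorsed mostly on the basis of familiarity.
    """
    if overall_hit_rate <= 0:
        raise ValueError("overall hit rate must be positive")
    if remember_hit_rate > overall_hit_rate:
        raise ValueError("remember hits cannot exceed overall hits")
    return remember_hit_rate / overall_hit_rate


# Hypothetical subjects, for illustration only.
recollection_based = remember_ratio(0.60, 0.60)  # every hit was a remember report
familiarity_based = remember_ratio(0.20, 0.80)   # most hits were know reports
print(recollection_based, familiarity_based)
```

Correlating this ratio with each subject's false alarm rate gives the Rem‐Ratio & False alarm measure of the kind summarized in Table III.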
Importantly, this does not mean one must abandon the detection theoretic framework altogether, merely that the models will necessarily be more complicated than the 1D‐SDT model. Furthermore, the notion that observers will differ with respect to the extent they restrict responding to certain types of information also fits well with literatures outside simple recognition. For example, the source monitoring framework of Johnson and colleagues (Johnson, Hashtroudi, & Lindsay, 1993; Lindsay & Johnson, 1991) is founded on the notion that observers flexibly weigh characteristics and features of multidimensional memory representations using simple preferences and heuristics that are likely to differ across participants and situations. Finally, one might also assume that observers enter memory experiments with a wide range of personality characteristics that may tonically influence their preference for risky versus conservative response strategies, necessarily contributing to the variance in the relation between hits and false alarms. Given these ideas, the fact that performance deviates
notably from what is predicted under an optimal 1D‐SDT model is perhaps wholly unsurprising. The key point, however, is that this in turn indicates that much about item‐based decision tendencies of individuals appears to fall outside the purview of the 1D‐SDT framework.
D. CONTEXT VERSUS ITEM MEMORY JUDGMENTS AND PFC
1. 1D‐SDT Framework and Source Memory
To this point we have focused exclusively on item recognition. However, the 1D‐SDT model of Fig. 1 has also been argued to be an accurate characterization of source memory discrimination problems (Qin et al., 2001; Slotnick & Dodson, 2005; Slotnick et al., 2000). Under this approach, the evidence distributions reflect the contextual memory details that are specific to the two source origins to be discriminated. As in item recognition, observers are assumed to use a decision criterion in order to map the continuous evidence into two response categories (e.g., ‘‘Source A’’ or ‘‘Source B’’). Because the same simple one‐dimensional representation is used for source and recognition problems, the 1D‐SDT approach is obviously limited in terms of making substantial predictions about fundamental decision differences between these two tasks. Traditionally, however, source monitoring and constructive memory approaches have assumed that relative to item recognition, source memory places a heavier premium on controlled decision processes (Johnson et al., 1993; Schacter, Norman, & Koutstaal, 2000).
2. Functional Neuroimaging of Source Memory Versus Item Memory
Functional imaging studies routinely demonstrate greater activity in lateral PFC regions during source memory tasks compared to item memory judgments (Dobbins, Foley, Schacter, & Wagner, 2002; Nolde, Johnson, & D’Esposito, 1998; Rugg, Fletcher, Chua, & Dolan, 1999) even when the materials are fully matched across tasks, and the source judgment represents the easier of the two memory decisions (Dobbins, Rice, Wagner, & Schacter, 2003). This suggests that PFC‐dependent processes that are not simply a reflection of task difficulty contribute to source memory judgments (for reviews, see Dobbins & Davachi, 2006; Fletcher & Henson, 2001). Furthermore, the nature of PFC involvement changes drastically with subtle changes in the sought‐after source information, and this can be demonstrated using a matched‐probe paradigm in which retrieval queries vary across memory probes that are identical in construction. For example, during encoding, Dobbins and Wagner (2005) had subjects rate studied pictures for either pleasantness or animacy (Fig. 13). These
[Figure 13 panels. Study phase (scanned): semantic rating task (Pleasant/Unpleasant? or Living/Nonliving?). Test phase (scanned): 3 AFC recognition cued by ‘‘Bigger before?’’ (perceptual source trial), ‘‘Pleas. task before?’’ (conceptual source trial), or ‘‘New item?’’ (novelty detection trial).]
Fig. 13. The design of Dobbins and Wagner (2005). During study, subjects completed semantic rating tasks on pictures that varied in size. During subsequent testing, triplets were constructed from two old items and one new item. Old items were crossed for prior study size and prior task. Three retrieval cues were used. Perceptual source queries required the subject to retrieve prior size information, whereas conceptual source queries required them to recover prior task information. Novelty detection trials simply required the subjects to identify the new item.
judgment tasks were also crossed with the size of the picture being rated such that they could be performed on either a physically large or small picture. At test, subjects received intermediate‐sized picture triplets with one new item and two studied items. Critically, the prior study size (either bigger or smaller before) and the type of judgment task previously performed (pleasantness or animacy) were fully crossed such that an old item that was bigger or smaller before was equally likely to have been previously judged using either rating task. The key manipulation was the form of the retrieval cue used on each test trial. During Perceptual Source Memory trials, the cue required selection of the item that was perceptually bigger when previously encountered (‘‘Bigger Before?’’) regardless of rating task. In contrast, during Conceptual Source trials, subjects were prompted to select the item associated with the pleasantness rating judgment (‘‘Pleas. Task Before?’’) regardless of its prior physical
[Figure 14 plot. BA 45: peak percent signal (axis from −0.1 to 0.4) for P. Src., C. Src., and Novelty trials, shown separately for the left hemisphere and the right hemisphere.]
Fig. 14. Data from Dobbins and Wagner (2005). Hemispheric interaction of Brodmann’s area 45 (posterior inferior frontal gyrus) during retrieval as a function of retrieval questions. The peak response at 6 s post‐stimulus onset was calculated for anatomically based ROIs in each hemisphere across the three retrieval tasks (P. Src. = Perceptual Source; C. Src. = Conceptual Source; Novelty = Novelty Detection). ANOVA demonstrated a qualitatively different response pattern across the two hemispheres. The box plot demonstrates the interaction.
size. Finally, Novelty Detection trials simply required the subject to select the new item (‘‘New Item?’’) of the triplet (Dobbins & Wagner, 2005). The pattern of engagement of multiple PFC regions depended on the form of the retrieval cue and hence presumably on the manner in which subjects were strategically monitoring memory and processing the retrieval probes. For example, the data drawn from the posterior portion of the left and right inferior frontal gyri (Brodmann’s area 45) are shown in Fig. 14. In the left hemisphere, this region has been implicated during controlled or deliberative semantic judgments and may be critical for sustaining top down attention to restricted conceptual or linguistic characteristics of probes during decision making (Badre & Wagner, 2002; Buckner, 2003; Thompson‐Schill, D’Esposito, Aguirre, & Farah, 1997). In contrast, right BA 45 has been
implicated in judgments that require sustained attention to the perceptual characteristics of probes (Wagner et al., 1998). The findings of Dobbins and Wagner (2005) support these distinctions and highlight the strategic nature of PFC responses during memory retrieval attempts. As a case in point, there was a robust cerebral hemisphere by retrieval task interaction in the posterior inferior frontal gyri (IFG). In left posterior IFG, activation was elevated for both Conceptual and Perceptual Source trials, compared to simple Novelty Detection trials. To the extent that subjects maintain focus on the conceptual or linguistic features of the probes during contextually specific retrieval attempts, this pattern makes sense. In contrast, right posterior IFG showed a qualitatively different pattern. Here, activity was elevated for Perceptual Source trials and Novelty Detection trials, compared to Conceptual Source trials, which is consistent with the hypothesis that this region participates in biasing attention toward the perceptual characteristics of the probes when such information is critical. These characteristics are the least relevant when one is deciding whether a pleasantness judgment was previously performed on the given probes because such a judgment is a function of the conceptual properties of the probe items, not the simple physical features. Again, because all trials used similarly constructed probes, this pattern underscores the flexible nature of the prefrontal recruitment during item‐based memory attribution. In addition to widespread PFC recruitment during source memory attribution, the temporal profiles of regional recruitment will also likely yield clues about the functional roles of different PFC regions during memory attribution.
Dobbins and Han (2006a) hypothesized that activation of PFC regions might be dissociable as a function of whether the underlying processes were dependent on the form of the retrieval questions (context or item memory query) or the evidence evoked by the retrieval probes. For example, several cognitive frameworks suggest that context memory retrieval requires subjects to formulate a retrieval plan or retrieval description (Burgess & Shallice, 1996; Schacter et al., 1998) that coarsely outlines the prior context elements and how these should be mapped onto responses should retrieval be successful. Although these plans are assumed to be fairly elaborate for source memory demands, they should be minimal with respect to detecting novel items, as observers need to make little if any reference to the contextual specifics of their experiences in order to detect new items. If this hypothesis is correct, then source memory queries should activate working memory areas involved in maintenance of conceptual information, and should do so even if the source memory probes are not yet present in the environment. In order to test this idea, Dobbins and Han temporally jittered the onsets of source or item memory questions relative to their subsequent probe items using a matched‐probe design similar to that above (Fig. 15, top left). This technique allowed us to isolate the response
[Figure 15 panels. Delayed/jittered cueing manipulation: a cue event (0.5 s; ‘‘Pleas. task before?’’ or ‘‘New item?’’) is followed, after a jittered cue/probe SOA of 5.5, 7.5, or 9.5 s, by a probe event (3 s). Overlay panels are labeled ‘‘During CUE event’’ and ‘‘During PROBE event’’, with regions A (BA9, DLPFC) and B (BA6, precentral) marked. Box plot: mean percent signal (axis from −0.1 to 0.4) for Item and Ctxt. trials during the cue event and the probe event.]
Fig. 15. Data from Dobbins and Han (2006). Demonstration of qualitatively different response profiles of proximal BA9 (posterior middle frontal gyrus) and BA6 (superior pre‐central gyrus) lateral PFC regions. Box plots show mean response 4–8 s post onset for the conditions of interest. Box is one standard error of the between‐subjects mean; box plus whiskers equals two standard errors. The axial overlay demonstrates the proximity of the two ROIs used in the ANOVA. Gray box in the plot and region B in the overlay correspond to BA6, whereas white box and region A correspond to BA9. During the cue event the more posterior BA6 region was differentially sensitive to the context versus item distinction but the more anterior BA9 region was not [left panel of box plot; F(1,14) = 36.61, p < .001]. In contrast, during the probe event the BA9 region was sensitive to the context versus item distinction but the BA6 region was not [right panel of box plot; F(1,14) = 14.54, p < .005]. Overall, this resulted in a significant three‐way interaction across Region (BA9 vs BA6), Trial Phase (Cue vs Probe), and Retrieval Task (Context or Item Memory), F(1,14) = 49.69, p < .001.
to the questions from the response to the probes, and to determine if the greater recruitment of left PFC during source compared to familiarity‐based responding was triggered as soon as the question was posed, when the probes were presented, or by some combination of the two (Dobbins & Han, 2006a).
Consistent with previous findings, there was generally greater recruitment of left PFC regions during context (e.g., Pleas. Task Before?) compared to item familiarity (e.g., New Item?) trials (Fig. 15; top right light and dark regions). However, the PFC regions could be further distinguished based on whether this elevated response began with the appearance of the retrieval question (dark regions Fig. 15), or instead also required the presence of the probe items themselves (light regions Fig. 15). A more concrete example of these timing differences is shown in the bottom panel of Fig. 15, which compares the response of a region previously linked with verbal/conceptual working memory (BA 6 in the pre‐central gyrus; location ‘‘B’’ on axial overlay) to the response of a proximal region previously linked to source monitoring (BA 9 in the dorsal middle frontal gyrus; location ‘‘A’’ on axial overlay). Focusing on the BA 6 region (dark boxes), we see that this region demonstrates an elevated response early in the trial as soon as the question indicates an upcoming source memory demand (see Cue Event; Item vs Context Response). However, once the probes are presented, the response of the region appears to saturate and its activation level is similar for the two trial types. Dobbins and Han (2006) interpreted this pattern as reflecting the early recruitment of verbal working memory during the planning of a context retrieval attempt. However, since verbal working memory is also required to maintain the probe identities in mind during responding, this region becomes similarly responsive once the probes are presented during either trial type. In contrast, the BA 9 region (open boxes) demonstrated a minimal response that did not differentiate context or item memory questions early in the trial (see Cue Event, Item vs Context Response).
However, once the probes appeared, the region demonstrated a response that was selective to context memory (see Probe Event, Item vs Context Response). This pattern is supportive of the hypothesis that dorsolateral PFC regions are critical for classifying or ‘‘manipulating’’ memory representations with respect to decision goals (Crone, Wendelken, Donohue, van Leijenhorst, & Bunge, 2006; Dobbins & Han, 2006b). Such representations cannot be elicited until the probes are present.
3. Summary of Section II.D: PFC and Context Memory
Even the brief review of the functional imaging data presented here demonstrates widespread recruitment of PFC during item‐based memory judgment. Importantly, this recruitment varies in highly contextually specific ways. For example, the data from Dobbins and Wagner (2005) presented in Fig. 14 suggest that subjects direct or sustain attention to different feature attributes of the probes depending on their retrieval goals. Other regions also expressed highly specific response patterns but were not included here for brevity.
Returning our attention to the 1D‐SDT characterization of source memory problems, it becomes clear that there is nothing within this framework that would suggest source‐memory problems should recruit PFC to a much greater extent than item‐memory problems, and nothing to suggest that within source memory problems the specific framing of the source memory questions should greatly affect regional PFC recruitment. These activity patterns, however, would seem to support the idea that source memory more heavily depends on strategic decision‐making processes. Indeed, similar conclusions have arisen from behavioral data. For example, Marsh and Hicks (1998) demonstrated that when single probes are presented, and only two prior sources are possible, the framing of the source retrieval question greatly influences the source memory attribution performance. These researchers examined the ability of subjects to distinguish between words that were previously simply read versus previously generated from anagrams. Critically, when the retrieval question required positive responses for read items (Did you see?), accuracy was significantly lower than when the question required positive responses for items that were previously generated (Did you generate?) (Marsh & Hicks, 1998). Thus the ability of the observers to discriminate the same two classes of items critically depended on a very subtle change in the framing of the retrieval query. Given these types of data, it becomes increasingly clear that subjects employ a wide range of decision heuristics and strategies during source memory attribution, and that to a large extent, many of these may be absent during familiarity‐based item judgments. However, as shown in Fig. 1, under the 1D‐SDT framework, there is nothing fundamentally different about the spatial representations used for the two types of decision problems; each involves a simple comparison of scalar evidence values to a decision criterion value.
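The structural identity of the two decision problems under 1D‐SDT can be made concrete with a few lines of code. This is a deliberately minimal sketch; the function names, the example evidence values, and the choice of which response lies above the criterion are all assumptions made for illustration.

```python
def classify(evidence, criterion, above_label, below_label):
    """1D-SDT decision rule: compare one scalar evidence value to one criterion."""
    return above_label if evidence > criterion else below_label


def item_recognition(evidence, criterion):
    """Old/new judgment under 1D-SDT."""
    return classify(evidence, criterion, "old", "new")


def source_judgment(evidence, criterion):
    """Source A/Source B judgment under 1D-SDT.

    Mapping 'Source A' to evidence above the criterion is an arbitrary
    choice made here for illustration.
    """
    return classify(evidence, criterion, "Source A", "Source B")


print(item_recognition(1.2, 0.8))  # evidence above the criterion
print(source_judgment(0.3, 0.8))   # evidence below the criterion
```

Apart from the response labels, the two functions are the same rule, so nothing in this representation distinguishes a source decision from an item decision.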
This again underscores the very limited sense in which these simple spatial models can be described as models about memory decision making, particularly if one holds that PFC is a key substrate of decision‐making processes.
III. Conclusion: What Constitutes a Model of Item‐Based Memory Decisions?
Rather than rehash the findings of the previous sections, here we simply reassert that in most if not all of the presented cases, the pattern of data could not have been anticipated based solely on the core (or typically accompanying) assumptions of the 1D‐SDT models of Fig. 1. Whether this lack of predictive power and heuristic value presents a serious problem depends on the extent that one asserts the statistical ‘‘decision’’ models depicted are important in understanding the ways that memory decisions are made. If one views the models as simply convenient ways to summarize data whose
explanation lies outside the scope of the one‐dimensional signal detection framework itself, then the findings above pose no grave problem and are the grist for useful theorizing. In this spirit, previous researchers have argued that these or similarly simple models are useful inasmuch as they are well‐specified benchmarks or ideals. From this perspective, it is the deviation from the patterns predicted by these simple models, or the fact that the data are wholly unexpected given the core assumptions of the models, that serves as impetus for theorizing about how observers actually make memory attributions (for similar arguments, see Dobbins, 2001; Hirshman & Master, 1997). In principle, although with clearly more success, this is the approach taken in the Heuristics and Biases framework of judgment and decision making (Tversky & Kahneman, 1974). Within this framework, decisions that rest on noisy or incomplete information are assumed to be heavily influenced by heuristics that are sometimes decidedly ‘‘nonoptimal.’’ Importantly, however, nonoptimal behavior simply represents deviation from a simple, well‐specified theory of predicted performance known as Rational Choice Theory. As with the 1D‐SDT model under an optimal criterion placement assumption, Rational Choice Theory simply provides a benchmark against which performance can be compared under the assumption that the observer is always trying to maximize a preferred outcome. It is the deviation from this benchmark that arguably provides important insights about the type of decision operations or traits expressed by the observer. Although debate continues with respect to the Rational Choice versus Heuristics and Biases frameworks, there is little doubt of the general utility of the latter for generating theory and interesting data.
However, in contrast to its role as merely a useful benchmark or simple summary of performance, we suspect that many theorists view the 1D‐SDT model as a viable candidate for explaining memory decisions, in and of itself. The ongoing debate over the adequacy of the two‐criterion variant of the model for explaining remember/know paradigms nicely illustrates the range of positions taken with respect to this issue. At one end of the spectrum are theorists who are purposely noncommittal with respect to underlying process models but assert that the differences in the conscious correlates of the reports are the phenomenon of interest (Gardiner & Richardson‐Klavehn, 2000); whether they can be mathematically reduced to a single statistical dimension is arguably irrelevant. At the other end are theorists who suggest that the single‐process 1D‐SDT model offers a full account of remember/know data that does not require a distinction between recollection and familiarity processes or the phenomenology they supposedly elicit (Donaldson, 1996; Dunn, 2004; Inoue & Bellezza, 1998). Indeed, one need only look at the titles of these latter research reports to see that these authors contend that the model fully accounts for all of the germane decision processes relevant to remember–know memory judgments. In contrast to this position, we feel that
the full range of the data discussed above clearly illustrates that a host of important decision phenomena are simply not anticipated by the assumptions of the 1D‐SDT model. No doubt many of the strict advocates of the model would argue that the failure to predict several interesting data patterns does not amount to a falsification of the model, and such arguments are often accompanied by post hoc fitting of the model to observed data to show that it can mimic the observed patterns. We feel this argument is problematic and counterproductive for at least two reasons. First, there are numerous demonstrations that the model not only fails to anticipate many interesting data, it also makes clearly errant predictions and is unable to fit many existing findings. The confidence/accuracy (Dobbins et al., 1998) and rote rehearsal (Dobbins et al., 2004) findings reviewed here constitute two examples, but there are many other important demonstrations in the literature (Gallo, Weiss, & Schacter, 2004). The second and perhaps more fundamental counter to the above argument is that it represents a fairly strained conceptualization of a ‘‘decision model.’’ To the extent the model clearly misfits some data sets and offers no a priori predictions with respect to a host of other decision‐related findings, it seems reasonable to question it as a model of memory decision making in any general sense. Instead, we suggest that the range of findings reported above speaks to the need for continued efforts to catalogue and explore the different control processes, decision heuristics, memory representations, and perhaps even personality factors that govern the way subjects make item‐based memory attributions when faced with noisy and complex memory evidence.
ACKNOWLEDGMENTS
This work was supported in part by NIH grant 1R01‐MH073982 to I.G.D. We thank Julie Grimley and Emily Gotschlich for assistance in data collection.
REFERENCES
Badre, D., & Wagner, A. D. (2002). Semantic retrieval, mnemonic control, and prefrontal cortex. Behavioral & Cognitive Neuroscience Reviews, 1(3), 206–218.
Balakrishnan, J. D., & Ratcliff, R. (1996). Testing models of decision making using confidence ratings in classification. Journal of Experimental Psychology: Human Perception and Performance, 22(3), 615–633.
Banks, W. P. (1970). Signal detection theory and human memory. Psychological Bulletin, 74(2), 81–99.
Benjamin, A., & Bawa, S. (2004). Distractor plausibility and criterion placement in recognition. Journal of Memory & Language, 51(2), 159–172.
What Constitutes a Decision Model?
Benjamin, A. S., & Wee, S. (under review). Signal detection with criterial variability: Applications to recognition memory.
Brown, J., Lewis, V., & Monk, A. (1977). Memorability, word frequency and negative recognition. Quarterly Journal of Experimental Psychology, 29(3), 461–473.
Buckner, R. L. (2003). Functional‐anatomic correlates of control processes in memory. Journal of Neuroscience, 23(10), 3999–4004.
Burgess, P. W., & Shallice, T. (1996). Confabulation and the control of recollection. Memory, 4(4), 359–411.
Busey, T. A., Tunnicliff, J., Loftus, G. R., & Loftus, E. F. (2000). Accounts of the confidence‐accuracy relation in recognition memory. Psychonomic Bulletin & Review, 7(1), 26–48.
Cary, M., & Reder, L. M. (2003). A dual‐process account of the list‐length and strength‐based mirror effects in recognition. Journal of Memory and Language, 49(2), 231–248.
Chandler, C. (1994). Studying related pictures can reduce accuracy, but increase confidence, in a modified recognition test. Memory & Cognition, 22(3), 273–280.
Clark, S. E. (1997). A familiarity‐based account of confidence‐accuracy inversions in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(1), 232–238.
Craik, F. I. M., & Jacoby, L. L. (1996). Aging and memory: Implications for skilled performance. In Wendy A. Rogers, Arthur D. Fisk, and Neff Walker (Eds.), Aging and skilled performance: Advances in theory and applications (pp. 113–137). Hillsdale, NJ: Lawrence Erlbaum.
Crone, E. A., Wendelken, C., Donohue, S., van Leijenhorst, L., & Bunge, S. A. (2006). Neurocognitive development of the ability to manipulate information in working memory. Proceedings of the National Academy of Sciences of the United States of America, 103(24), 9315–9320.
Dobbins, I. G. (2001). The systematic discrepancy between a’ for overall recognition and remembering: A dual‐process account. Psychonomic Bulletin & Review, 8(3), 587–599.
Dobbins, I. G., & Davachi, L. (2006). Functional neuroimaging of episodic memory. In R. Cabeza and A. Kingstone (Eds.), Handbook of functional neuroimaging of cognition (2nd edn.). Cambridge: MIT Press.
Dobbins, I. G., & Han, S. (2006a). Cue‐ versus probe‐dependent prefrontal cortex activity during contextual remembering. Journal of Cognitive Neuroscience, 18(9), 1439–1452.
Dobbins, I. G., & Han, S. (2006b). Isolating rule‐ versus evidence‐based prefrontal activity during episodic and lexical discrimination: A functional magnetic resonance imaging investigation of detection theory distinctions. Cerebral Cortex, 16(11), 1614–1622.
Dobbins, I. G., & Kroll, N. E. (2005). Distinctiveness and the recognition mirror effect: Evidence for an item‐based criterion placement heuristic. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(6), 1186–1198.
Dobbins, I. G., & McCarthy, D. (under review). Cue framing effects in source remembering: A memory misattribution model.
Dobbins, I. G., & Wagner, A. D. (2005). Domain‐general and domain‐sensitive prefrontal mechanisms for recollecting events and detecting novelty. Cerebral Cortex, 15(11), 1768–1778.
Dobbins, I. G., Kroll, N. E., & Liu, Q. (1998). Confidence‐accuracy inversions in scene recognition: A remember‐know analysis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24(5), 1306–1315.
Dobbins, I. G., Khoe, W., Yonelinas, A. P., & Kroll, N. E. A. (2000). Predicting individual false alarm rates and signal detection theory: A role for remembering. Memory & Cognition, 28(8), 1347–1356.
Dobbins and Han
Dobbins, I. G., Foley, H., Schacter, D. L., & Wagner, A. D. (2002). Executive control during episodic retrieval: Multiple prefrontal processes subserve source memory. Neuron, 35(5), 989–996.
Dobbins, I. G., Rice, H. J., Wagner, A. D., & Schacter, D. L. (2003). Memory orientation and success: Separable neurocognitive components underlying episodic recognition. Neuropsychologia, 41(3), 318–333.
Dobbins, I. G., Kroll, N. E. A., & Yonelinas, A. P. (2004). Dissociating familiarity from recollection using rote rehearsal. Memory & Cognition, 32(6), 932–944.
Dodson, C. S., & Hege, A. C. (2005). Speeded retrieval abolishes the false‐memory suppression effect: Evidence for the distinctiveness heuristic. Psychonomic Bulletin & Review, 12(4), 726–731.
Dodson, C. S., & Schacter, D. L. (2001). ‘‘If I had said it I would have remembered it’’: Reducing false memories with a distinctiveness heuristic. Psychonomic Bulletin & Review, 8(1), 155–161.
Dodson, C. S., & Schacter, D. L. (2002). When false recognition meets metacognition: The distinctiveness heuristic. Journal of Memory & Language, 46(4), 782–803.
Donaldson, W. (1996). The role of decision processes in remembering and knowing. Memory & Cognition, 24(4), 523–533.
Dunn, J. C. (2004). Remember‐know: A matter of confidence. Psychological Review, 111(2), 524–542.
Estes, W. K., & Maddox, W. T. (1995). Interactions of stimulus attributes, base rates, and feedback in recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21(5), 1075–1095.
Fletcher, P. C., & Henson, R. N. (2001). Frontal lobes and human memory: Insights from functional neuroimaging. Brain, 124(Pt. 5), 849–881.
Gallo, D. A., Weiss, J. A., & Schacter, D. L. (2004). Reducing false recognition with criterial recollection tests: Distinctiveness heuristic versus criterion shifts. Journal of Memory & Language, 51(3), 473–493.
Gardiner, J. M. (1988). Functional aspects of recollective experience. Memory & Cognition, 16(4), 309–313.
Gardiner, J. M., & Java, R. I. (1990). Recollective experience in word and nonword recognition. Memory & Cognition, 18(1), 23–30.
Gardiner, J. M., & Parkin, A. J. (1990). Attention and recollective experience in recognition memory. Memory & Cognition, 18(6), 579–583.
Gardiner, J. M., & Richardson‐Klavehn, A. (2000). Remembering and knowing. In E. Tulving and F. I. M. Craik (Eds.), The Oxford handbook of memory (pp. 229–244). London: Oxford University Press.
Glanzer, M., & Adams, J. K. (1990). The mirror effect in recognition memory: Data and theory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(1), 5–16.
Goldsmith, M., & Koriat, A. (1999). The strategic regulation of memory reporting: Mechanisms and performance consequences. In Daniel Gopher and Asher Koriat (Eds.), Attention and performance XVII: Cognitive regulation of performance: Interaction of theory and application (pp. 373–400). Cambridge, MA, USA: The MIT Press.
Han, S., & Dobbins, I. G. (in preparation). Implicit learning of explicit memory criteria: A false‐feedback study.
Han, S., & Dobbins, I. G. (under review). Examining recognition criterion lability using a biased feedback technique.
Higham, P. A., & Vokey, J. R. (2000). Judgment heuristics and recognition memory: Prime identification and target‐processing fluency. Memory & Cognition, 28(4), 575–584.
Hirshman, E. (1995). Decision processes in recognition memory: Criterion shifts and the list‐strength paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21(2), 302–313.
Hirshman, E., & Henzler, A. (1998). The role of decision processes in conscious recollection. Psychological Science, 9(1), 61–65.
Hirshman, E., & Master, S. (1997). Modeling the conscious correlates of recognition memory: Reflections on the remember‐know paradigm. Memory & Cognition, 25(3), 345–351.
Inoue, C., & Bellezza, F. S. (1998). The detection model of recognition using know and remember judgments. Memory & Cognition, 26(2), 299–308.
Jacoby, L. L. (1991). A process dissociation framework: Separating automatic from intentional uses of memory. Journal of Memory & Language, 30(5), 513–541.
Jacoby, L. L., & Whitehouse, K. (1989). An illusion of memory: False recognition influenced by unconscious perception. Journal of Experimental Psychology: General, 118(2), 126–135.
Janowsky, J. S., Shimamura, A. P., & Squire, L. R. (1989). Source memory impairment in patients with frontal lobe lesions. Neuropsychologia, 27(8), 1043–1056.
Jetter, W., Poser, U., Freeman, R. B., & Markowitsch, H. J. (1986). A verbal long term memory deficit in frontal lobe damaged patients. Cortex, 22(2), 229–242.
Johnson, M. K., Hashtroudi, S., & Lindsay, D. S. (1993). Source monitoring. Psychological Bulletin, 114(1), 3–28.
Lindsay, D., & Johnson, M. K. (1991). Recognition memory and source monitoring. Bulletin of the Psychonomic Society, 29(3), 203–205.
Macmillan, N. A., & Creelman, C. (1991). Detection theory: A user’s guide. New York, NY: Cambridge University Press.
Mandler, G. (1980). Recognizing: The judgment of previous occurrence. Psychological Review, 87(3), 252–271.
Marsh, R. L., & Hicks, J. L. (1998). Test formats change source‐monitoring decision processes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24(5), 1137–1151.
Morrell, H. E., Gaitan, S., & Wixted, J. T. (2002). On the nature of the decision axis in signal‐detection‐based models of recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28(6), 1095–1110.
Nolde, S. F., Johnson, M. K., & D’Esposito, M. (1998). Left prefrontal activation during episodic remembering: An event‐related fMRI study. Neuroreport: An International Journal for the Rapid Communication of Research in Neuroscience, 9(15), 3509–3514.
Parkin, A. J., & Walter, B. M. (1992). Recollective experience, normal aging, and frontal dysfunction. Psychology and Aging, 7(2), 290–298.
Parks, T. E. (1966). Signal‐detectability theory of recognition‐memory performance. Psychological Review, 73(1), 44–58.
Qin, J., Raye, C. L., Johnson, M. K., & Mitchell, K. J. (2001). Source ROCs are (typically) curvilinear: Comment on Yonelinas (1999). Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(4), 1110–1115.
Rajaram, S. (1993). Remembering and knowing: Two means of access to the personal past. Memory & Cognition, 21(1), 89–102.
Raposo, A. L. N., & Dobbins, I. G. (in preparation). The setting of memory decision criteria during testing: Evidence for a criterion learning mechanism.
Ratcliff, R., Sheu, C.‐F., & Gronlund, S. D. (1992). Testing global memory models using ROC curves. Psychological Review, 99(3), 518–535.
Reder, L. M., Nhouyvanisvong, A., Schunn, C. D., Ayers, M. S., Angstadt, P., & Hiraki, K. (2000). A mechanistic account of the mirror effect for word frequency: A computational model of remember‐know judgments in a continuous recognition paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(2), 294–320.
Rugg, M. D., Fletcher, P. C., Chua, P. M., & Dolan, R. J. (1999). The role of the prefrontal cortex in recognition memory and memory for source: An fMRI study. Neuroimage, 10(5), 520–529.
Schacter, D. L., Norman, K. A., & Koutstaal, W. (1998). The cognitive neuroscience of constructive memory. Annual Review of Psychology, 49, 289–318.
Schacter, D. L., Norman, K. A., & Koutstaal, W. (2000). The cognitive neuroscience of constructive memory. In David F. Bjorklund (Ed.), False‐memory creation in children and adults: Theory, research, and implications (pp. 129–168). Mahwah, NJ, USA: Lawrence Erlbaum Associates.
Shimamura, A. P., Janowsky, J. S., & Squire, L. R. (1991). What is the role of frontal lobe damage in memory disorders? In Harvey S. Levin, Howard M. Eisenberg, and Arthur L. Benton (Eds.), Frontal lobe function and dysfunction (pp. 173–195). New York, NY: Oxford University Press.
Slotnick, S. D., & Dodson, C. S. (2005). Support for a continuous (single‐process) model of recognition memory and source memory. Memory & Cognition, 33(1), 151–170.
Slotnick, S. D., Klein, S. A., Dodson, C. S., & Shimamura, A. P. (2000). An analysis of signal detection and threshold models of source memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(6), 1499–1517.
Stretch, V., & Wixted, J. T. (1998). On the difference between strength‐based and frequency‐based mirror effects in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24(6), 1379–1396.
Stuss, D. T., & Alexander, M. P. (2005). Does damage to the frontal lobes produce impairment in memory? Current Directions in Psychological Science, 14(2), 84–88.
Swick, D., & Knight, R. T. (1996). Is prefrontal cortex involved in cued recall? A neuropsychological test of PET findings. Neuropsychologia, 34(10), 1019–1028.
Thompson‐Schill, S. L., D’Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proceedings of the National Academy of Sciences of the United States of America, 94(26), 14792–14797.
Tulving, E. (1981). Similarity relations in recognition. Journal of Verbal Learning & Verbal Behavior, 20(5), 479–496.
Tulving, E. (1985). Memory and consciousness. Canadian Psychology, 26(1), 1–12.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131.
Verde, M. F., & Rotello, C. M. (2007). Memory strength and the decision process in recognition memory. Memory & Cognition, 35(2), 254–262.
Wagner, A. D., Poldrack, R. A., Eldridge, L. L., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. E. (1998). Material‐specific lateralization of prefrontal activation during episodic encoding and retrieval. Neuroreport: An International Journal for the Rapid Communication of Research in Neuroscience, 9(16), 3711–3717.
Wallace, W. P. (1982). Distractor‐free recognition tests of memory. American Journal of Psychology, 95(3), 421–440.
Wallace, W. P., Sawyer, T. J., & Robertson, L. C. (1978). Distractors in recall, distractor‐free recognition, and the word‐frequency effect. American Journal of Psychology, 91(2), 295–304.
Wixted, J. T., & Stretch, V. (2004). In defense of the signal detection interpretation of remember/know judgments. Psychonomic Bulletin & Review, 11(4), 616–641.
Yonelinas, A. P. (1994). Receiver‐operating characteristics in recognition memory: Evidence for a dual‐process model. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(6), 1341–1354.
Yonelinas, A. P., Dobbins, I., Szymanski, M. D., Dhaliwal, H. S., & King, L. (1996). Signal‐detection, threshold, and dual‐process models of recognition memory: ROCs and conscious recollection. Consciousness and Cognition: An International Journal, 5(4), 418–441.
Yonelinas, A. P., & Jacoby, L. L. (1995). The relation between remembering and knowing as bases for recognition: Effects of size congruency. Journal of Memory & Language, 34(5), 622–643.
PROSPECTIVE MEMORY AND METAMEMORY: THE SKILLED USE OF BASIC ATTENTIONAL AND MEMORY PROCESSES Gilles O. Einstein and Mark A. McDaniel
I. Introduction
In the course of a day, we form many intentions that cannot be carried out immediately and instead must be performed at opportune times in the future. The processes involved in encoding, storing, and remembering these delayed intentions fall within the topic of prospective memory (Ellis, 1996). From managing our work activities (e.g., remembering to pack our needed papers in the morning, remembering to make announcements in class) to coordinating our social relations (remembering to pick up the children at various locations in the evening, remembering a lunch engagement), to regulating our health‐related needs (e.g., remembering to exercise, remembering to take medication), our lives are often overflowing with prospective memory demands. Satisfying these intentions in a timely manner is not only critical for smooth and efficient functioning but severe deficits in prospective memory, such as problems in remembering to take medication, can threaten independent living. This forward‐looking ability to plan for future events and to carry them out later is arguably central to human survival and accomplishments (cf. Tulving, 2004). In light of the importance of good prospective memory for effective functioning as well as its prevalence in everyday life, it is surprising that until recently there has been little empirical or theoretical interest in this type of memory.
THE PSYCHOLOGY OF LEARNING AND MOTIVATION VOL. 48 DOI: 10.1016/S0079-7421(07)48004-5
Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00
II. What Is Different About Prospective Memory?
Since the time of Ebbinghaus (1885/1964), the focus of memory research has been on understanding retrospective memory or memory for past events. In studying cued recall, for example, experimenters present subjects with pairs of items to study and then after a delay present the first member of each pair and request retrieval of the second member. A characteristic of retrospective research is that experimenters put subjects in a retrieval mode (Tulving, 1983) and direct them to query memory. In the case of cued recall, the experimenter presents a cue along with an explicit request to search memory for the associated item. Thus, most theories that have been developed to explain recall and recognition assume that this request to remember is essential for stimulating retrieval (for exceptions, see Conway, 2005; Kvavilashvili & Mandler, 2004; Moscovitch, 1994; and see McDaniel, Guynn, Einstein, & Breneiser, 2004, for further development of this point). The idea is that the explicit request to remember initiates processes (such as gathering of cues or the activation of retrieval structures, e.g., see Raaijmakers & Shiffrin, 1980) that eventually lead to retrieval. By contrast, a critical characteristic of prospective memory is that no one puts you in a retrieval mode at the appropriate moment for retrieval. Instead, remembering is self‐initiated in the sense that the person must remember to perform the intended action at the proper time in the absence of the experimenter stimulating a controlled search of memory (Craik, 1986; Einstein & McDaniel, 1990, 1996). For example, consider the task of giving a colleague a message. As in the cued recall task, we must associate the action with the friend. An important difference, however, is that at retrieval, when you encounter the cue (i.e., the colleague) there is no external agent to put you in a retrieval mode.
How then, in the absence of a prompt to search memory when we encounter the colleague, do we switch from processing our colleague as a friend to seeing the friend as a cue for an unfulfilled intention? In this chapter, we first consider the possibility that there exists a special memory system or mechanism for accomplishing prospective memory. We then present some core findings regarding prospective memory that disfavor this view. Compatible with the theme of this volume, we then consider several ways that humans might use basic memory systems to accomplish prospective memory. As we do so, we present evidence supporting the plausibility of these multiple mechanisms for prospective memory retrieval. We show how people flexibly rely on these mechanisms in different situations. We conclude by applying this theoretical view as a foundation for framing questions about people’s meta‐awareness of the variety of prospective memory demands and the most efficacious method for accomplishing retrieval under these demands.
III. Is There a Specialized Prospective Memory System?
How might the human cognitive system have developed to support the critically important function of prospective memory? One possible view is that a prospective memory system evolved as a distinct memory system to handle this important human function or faculty. This view would follow from an orientation in the memory literature that a number of memory systems have evolved to accomplish the increasingly complex and specialized cognitive activities of humans (cf. Klein, Cosmides, Tooby, & Chance, 2002; Schacter & Tulving, 1994). In terms of prospective memory, the idea would be that there are special mechanisms that have evolved to support the remembering of delayed intentions. As an ideal, imagine a prospective memory module that stores the intention, creates a marker or sets a timer associated with the intention, and alerts the cognitive system when the marker or target time is encountered. Given that one must remember to perform prospective memory tasks while engaged in other ongoing daily activities, it would be adaptive for the module not to compete with working memory resources needed for meeting ongoing cognitive demands. As a concrete example, in your office you might be working on a paper for a conference, and at a particular time have to go to a doctor’s appointment. The prospective memory module sketched here would enable you to remember the appointment and concurrently maintain the quality of your work on the paper. Though there is no compelling scientific evidence for this ideal (and for readers who have missed that doctor’s appointment, not much personal experience to support it either), there is a current approximation in the literature to a specialized mechanism for prospective memory.
Specifically, an idea derived from an influential model, adaptive control of thought (ACT; Anderson, 1983), is that goal representations (intentions for future completion) have privileged status in memory such that the representation is sustained in a state of higher activation over a delay interval (relative to other representations; Goschke & Kuhl, 1993, 1996). Note that the key assertion from the perspective we are sketching here is that a higher level of activation of goal‐related stimuli is an inherent property conferred by a system that has evolved to serve prospective memory function. Higher levels of activation translate into prospective remembering by making the intention easily brought into awareness at the appropriate moment. Although it is tempting to embrace the idea that there is a special prospective memory mechanism, several results suggest that prospective memory depends on more general memory and cognitive processes. If it were just a matter of a specialized prospective memory module being sensitive to the marker (or a goal residing in a privileged state of activation), the key element
to remembering should be determined by the properties of the module for processing the marker. One problem for this expectation is the finding that the identical marker can lead to significantly different prospective memory performance depending on its associative relation with the intended action. For instance, in one study, subjects were involved in the ongoing task of rating various characteristics of words (concreteness, pleasantness, and so on). At the outset of the experiment, subjects were also instructed that if they ever saw a target word (e.g., ‘‘spaghetti’’) in the word‐rating task, then they should write down an intended action word. In one condition the intended action was a word highly associated with the target word (sauce); in the other condition the intended action was not highly associated with that target word (needle). Despite the marker being identical in both experimental conditions (and therefore equally detectable to a putative prospective memory module), prospective memory performance was significantly better in the condition with the highly associated intended action (McDaniel et al., 2004, Experiment 2). Note that this effect was not due to differential retrospective memory of the target word–intention word pair because at the conclusion of the experiment a memory test for the pair indicated nearly perfect recall in both conditions. This pattern thus suggests that fundamental associative memory mechanisms can be involved in prospective remembering, and we flesh out this idea later. A second potential problem with the notion of a specialized prospective memory module is the finding that having a prospective memory intention can interfere with the efficiency of performing ongoing activity (Marsh, Hicks, Cook, Hansen, & Pallos, 2003; Smith, 2003).
It is interesting to note, however, that consistent with the idea of a specialized resource‐free prospective memory module (or an inherently activated special goal representation), having an intention sometimes does not compromise performance of ongoing activities (Einstein & McDaniel, 2005; Einstein et al., 2005; Marsh et al., 2003; McDaniel & Einstein, 2007a). The foregoing pattern, in which prospective remembering sometimes seems to require resources and sometimes does not, discourages a unitary, modular, and specialized approach to prospective memory.
IV. Using Basic Memory and Attentional Processes in the Service of Prospective Memory
Our antipathy for invoking a dedicated prospective memory mechanism is additionally fueled by a wealth of findings showing many commonalities and shared functional relationships between prospective and retrospective
memory (see Marsh, Cook, & Hicks, 2006, for a thorough review and analysis). The multiprocess view we present here embraces these commonalities between prospective and retrospective memory. Instead of having developed a unique memory system designed to accomplish prospective remembering, we suggest that the cognitive system strategically exploits fundamental memory processes to support this important human memory function. Thus, we believe that humans have developed higher‐order strategic processes that make use of fundamental memory processes such as recognition memory and associative memory (likely shared with other species; Eichenbaum, Yonelinas, & Ranganath, in press) in the service of prospective memory. From our perspective, one advantage of such a system is that the higher‐order strategic processes allow the flexible use of these basic memory processes to meet different contextual demands that are naturally found in prospective memory tasks. Moreover, given the critical importance of prospective memory (as outlined at the outset), the interplay of strategic processes with fundamental memory systems allows redundant but not identical paths for remembering a prospective memory intention. Just as redundant mechanical systems in airplanes reduce catastrophic failures, redundancy, or multiple routes for remembering, helps ensure that people will remember to carry out their intended actions. In the following sections, we present several mechanisms by which people can recruit basic memory processes to accomplish prospective memory retrieval. These range from controlled processes that involve actively maintaining a process of making recognition checks over the delay to relatively automatic or spontaneous processes that reflexively respond to the presence of target stimuli.
As noted above, and as in retrospective memory where it is now clear that people rely on more than one process for retrospective retrieval (e.g., recollective and familiarity processes in recognition; Eichenbaum et al., in press; Jacoby, 1991), we believe that there are multiple routes to prospective remembering.
A. USE OF CONTROLLED (OR DIRECTED) RECOGNITION PROCESSES
One solution to the prospective memory problem is to put oneself endogenously in a retrieval mode and to maintain that mode over the retention interval. Essentially, then, this view assumes that successful retrieval results from people monitoring the environment for the target event. The most precisely specified view of this type is Smith’s preparatory attentional processes and memory (PAM) theory (Smith, 2003; Smith & Bayen, 2004, 2006). According to this theory, after forming an intention, people engage in preparatory attentional processes that perform simple recognition checks
(the memory component) on selected environmental events in order to determine if they are instances of the target event. These processes might also include some rehearsal or maintenance of the intention (Smith & Bayen, 2006). Consider for example a task in which participants are rating individually presented items for pleasantness but in addition are asked to perform the prospective memory task of pressing a designated key whenever the target item ‘‘dog’’ occurs. According to the PAM theory, the preparatory processes initiate a recognition judgment for each item. When the target event is recognized, subjects then retrieve and carry out the intended action. Prospective memory failure occurs either when participants fail to maintain their preparatory attentional processes on the intention and therefore fail to perform a recognition check or when the recognition process fails (see Smith & Bayen, 2004, 2006, for a multinomial model that measures these attention and memory components). Thus, within this theory, the retrieval mode problem is solved by subjects actively maintaining a retrieval mode by self‐initiating a recognition check for each item. A crucial assumption of this view is that preparatory attentional processes are resource demanding and thus should interfere with performing ongoing activities. Consistent with this assumption, and as mentioned earlier, there is good evidence (Marsh et al., 2003; Smith, 2003) that adding a prospective memory task to the ongoing task of performing a lexical decision task substantially slows down responding to nontarget items (by about 300 ms in Smith’s Experiment 1). It is important to note that this slowing is computed using only nontarget items, which is exactly what would be expected if subjects were performing recognition checks on all items in order to determine if the target has been presented. This theory also assumes that the preparatory attentional processes are critical for successful prospective memory retrieval. 
Some support for this expectation comes from Smith’s findings that higher prospective memory performance was associated with greater slowing on the lexical decision task (Smith, 2003). Additional support comes from Smith and Bayen’s results showing that older adults, who are thought to have compromised working memory resources (and thus more difficulty engaging preparatory attentional processes), have poorer prospective memory than younger adults (Smith & Bayen, 2006). Although Smith and Bayen’s (2004) theory makes the strong claim that monitoring is the only path by which prospective memory retrieval can occur, we now consider two other theoretical views that could account for prospective memory retrieval. Both of these views assume that active maintenance of attentional processes (and thus a retrieval mode) is not always necessary for prospective memory retrieval and instead that retrieval can also be initiated by fundamental memory processes that spontaneously respond to the occurrence of target events.
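The two-component logic of the PAM account described in this section can be illustrated with a small simulation. The Python sketch below is a deliberately simplified toy and is not Smith and Bayen's multinomial model: the function names, the parameters `p_prep` (probability that preparatory attention is engaged on a trial) and `p_recog` (probability that a recognition check succeeds), and the base response time are all assumptions introduced for illustration, and the 300 ms check cost merely echoes the approximate slowing mentioned above.

```python
import random


def expected_hit_rate(p_prep, p_recog):
    # In this toy model a target is detected only if a recognition
    # check is initiated AND the check itself succeeds.
    return p_prep * p_recog


def simulate_pam(n_trials, target_rate, p_prep, p_recog,
                 base_rt=600.0, check_cost=300.0, seed=0):
    """Simulate an ongoing task with an embedded prospective memory task.

    All parameter values are hypothetical. Returns the prospective
    memory hit rate and the mean response time on nontarget trials.
    """
    rng = random.Random(seed)
    hits = 0
    targets = 0
    nontarget_rts = []
    for _ in range(n_trials):
        is_target = rng.random() < target_rate
        checking = rng.random() < p_prep  # preparatory attention engaged?
        if is_target:
            targets += 1
            if checking and rng.random() < p_recog:
                hits += 1
        else:
            # The cost is measured on nontarget items only: each
            # recognition check slows the ongoing response.
            nontarget_rts.append(base_rt + (check_cost if checking else 0.0))
    hit_rate = hits / targets if targets else float("nan")
    mean_rt = sum(nontarget_rts) / len(nontarget_rts)
    return hit_rate, mean_rt


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        print(p, simulate_pam(5000, 0.1, p_prep=p, p_recog=0.9))
```

Under these assumptions, raising `p_prep` increases both the hit rate and the mean nontarget response time, which is the qualitative pattern the theory predicts: better prospective memory accompanied by greater slowing on the ongoing task.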
B. SPONTANEOUS RECOGNITION PROCESSES
For at least the last three decades, many memory researchers have assumed a basic recognition process that is spontaneous—that is not based on controlled or directed attempts to make a judgment about the previous occurrence of an event (e.g., Atkinson & Juola, 1974; Mandler, 1980). More specifically, the assumption is that recognition can be supported by a spontaneous or automatic familiarity process (Jacoby, 1991; Jacoby & Dallas, 1981; Mandler, 1980). A compelling everyday example of this process is what might occur when a person gets on a city bus. In this context, the person is not in a retrieval mode—she is not trying to recognize people. Nevertheless, while walking down the aisle, she may experience a feeling of familiarity (recognition) on encountering a particular face (e.g., her neighborhood butcher). We (Einstein & McDaniel, 1996; McDaniel, 1995; Guynn & McDaniel, in press) have proposed that recognition of a target event as a prospective memory cue might be based on somewhat similar spontaneous recognition processes. One specific idea is that when an event that has been previously encoded as the appropriate moment for executing an intended action is encountered, that event may be processed more fluently than it ordinarily would. Consequently, we may spontaneously experience a sense of familiarity (McDaniel) or significance (McDaniel et al., 2004) for that event. This feeling of familiarity causes the target to be noticed, after which strategic or controlled processes may be called into play in an attempt to determine the reason for the feeling of familiarity. These controlled processes would serve to identify the event as a cue to execute an intended action and to attempt to retrieve the intended action. Findings from both behavioral and cognitive neuroscience prospective memory studies are consistent with the idea that a basic familiarity process might be exploited to support prospective remembering. 
In one experiment, prospective memory targets were either repeatedly preexposed or not preexposed prior to the prospective memory instructions and the start of the ongoing activity in which the prospective memory task was embedded (Guynn & McDaniel, in press). In addition, the demands of the ongoing task (rating various dimensions of words) were varied such that sometimes an additional secondary task had to be performed (monitoring an audio input for two consecutive odd digits) along with the ongoing task. Preexposing the targets improved prospective memory, and most importantly, eliminated the negative influence of divided attention on prospective memory that obtained when targets were not preexposed. The straightforward interpretation from the familiarity‐based view is that preexposing the targets stimulated the automatic familiarity process, obviating the need for attentional resources to identify the targets as a cue to execute the intended action.
Einstein and McDaniel
In an ERP study, subjects were shown pairs of words, and they had to make a semantic relatedness judgment for each pair (West & Krompinger, 2005). Prior to each block of semantic judgment trials, subjects were given a prospective memory target item. For the prospective memory task, when the target occurred the subjects had to indicate the position on the screen in which the word appeared (upper or lower). The important ERP pattern for present purposes was a positivity over the frontal region of the scalp between 300 and 500 ms after the stimulus was presented (the FN400) that was significantly more pronounced for prospective memory hits than for nontarget ongoing trials. Because this ERP modulation (the FN400) has been regarded as an index of familiarity in recognition (Curran, 2000; Curran & Dien, 2003), the implication is that familiarity processes may be involved in prospective memory.
C. SPONTANEOUS, REFLEXIVE ASSOCIATIVE MEMORY PROCESSES
Over 100 years ago, Ebbinghaus (1885) observed that one of three basic types of memory is the spontaneous appearance of a mental state ‘‘without any act of will’’ that is recognized as previously experienced. Current research has supported this original observation with increasing empirical and theoretical work focusing on so‐called involuntary memory (Ball & Little, 2006; Berntsen, 1996; Berntsen & Hall, 2004; Conway, 2005; Kvavilashvili & Mandler, 2004; Moscovitch, 1994). The key feature of involuntary memory is that a deliberate search for a specific memory does not precede the sudden appearance of the memory into consciousness (Ball & Little, see p. 1167). We contend that remembering an intended action at an appropriate moment can sometimes be an involuntary memory in which the memory of the intended action suddenly appears in consciousness with no deliberate search for the intention. (This process may bear some similarity to remindings that have been shown to occur in problem solving; e.g., see Ross & Kennedy, 1990.) More specifically, based on current empirical and theoretical work on involuntary memory suggesting that environmental cues are prominent in triggering a memory into consciousness (Ball & Little, 2006; Kvavilashvili & Mandler, 2004; Moscovitch, 1994), we suggest that involuntary memory is operative in prospective memory tasks in which an environmental event signals the appropriateness of the intended action. (We have labeled these event‐based prospective memory tasks; Einstein & McDaniel, 1990.) An example in daily experience is intending to give a message to a colleague; encountering the colleague is an environmental event that signals an appropriate moment to deliver the message. In the laboratory, a word previously specified as appropriate for executing the intention is encountered in some ongoing experimental task. Just as hearing a word in a class can produce
involuntary memory of singing a song that contained that word (Ball & Little), we believe that encountering the prospective memory target word in the experimental task can produce a spontaneous (involuntary) remembering of the intended action. Several lines of evidence support our claim. First, our subjects often report that the intended action ‘‘just popped into mind’’ on encountering the target event (Einstein & McDaniel, 1990). Second, in line with Moscovitch’s (1994) model of an automatic associative memory system underlying involuntary memory, as described above, prospective memory performance is facilitated when the target event and intended action are highly associated (McDaniel et al., 2004; Moscovitch, 1994). Third, in these cases prospective memory is not attenuated by divided attention (McDaniel et al., Experiment 3; McDaniel, Howard, & Butler, 2006). (See McDaniel & Einstein, 2007a,b, for detailed presentation and discussion of the supporting evidence.) To foreshadow, in line with our theme that successful prospective remembering can reflect strategic use of this basic involuntary memory process, at least two theoretical features of involuntary memory could be exploited to facilitate prospective memory. One is that involuntary memory is increasingly likely with increased associativity between the environmental cue and the memory representation (cf. Moscovitch, 1994). Accordingly, to facilitate prospective remembering, encoding strategies might be recruited that form a robust associative link between the anticipated environmental event and the intended action (Cohen & Gollwitzer, in press). Based on the approach being developed herein, the idea is that such an encoding strategy would increase the probability that involuntary memory mechanisms will support prospective remembering, thereby allowing reduced reliance on resource‐demanding processes. 
A second hypothesized feature of involuntary memory is that (involuntary) retrieval is more likely ‘‘when the individual is in a relaxed mood and performing a routine task that does not require a high level of focused attention’’ (Ball & Little, 2006, p. 1170). On this idea, strategic management of ongoing tasks could increase the likelihood that involuntary mechanisms will support prospective memory, and we later describe research consistent with this theme. A somewhat different theoretical view is that thresholds for instantiating memories into consciousness may vary (cf. Conway & Pleydell‐Pearce, 2000; Norman, Newman, & Detre, in press). Our idea here is that thresholds might be thought of as a gradient wherein the threshold can change according to individual priorities and demands. For instance, a person who does not want performance of a particular ongoing activity to be degraded by distraction might set a high threshold—the consequence being that a sensory cue related to an intention would be less likely to lead to conscious activation of that intention. Following this suggestion, another provocative direction for future work
in prospective memory would be exploration of the degree to which thresholds could be varied by a person’s goals or by a person’s willingness to tolerate interruption (or competition) from involuntary retrievals, and thereby modulate involuntary retrieval of prospective memory intentions.
V. The Multiprocess Theory: Contextual Factors Determining the Utility of Each Process
The compelling evidence for all three of these mechanisms fits nicely with our multiprocess theory of prospective memory (McDaniel & Einstein, 2000, 2007a; see also Einstein & McDaniel, 2005). This view makes a few basic assumptions about the processes underlying prospective remembering. First, unlike the PAM theory, the multiprocess theory assumes that several processes (i.e., the ones just described and also possibly others) can support prospective memory retrieval. Second, in line with contextualistic views of memory (Jenkins, 1979), the theory assumes that the process(es) that people rely on in a given situation and the usefulness of that process(es) depend on a variety of factors. These include the nature of the prospective memory task (e.g., whether there is likely to be a good target cue at retrieval), the associative relation between the cue and the action, the importance of the task, the length of the retention interval, the nature of the ongoing activities (e.g., whether they are likely to direct attention to the target cue, the demands of these activities), and the momentary as well as more enduring characteristics of the individual (e.g., fatigue, working memory capacity, personality variables). This assumption seems well supported by the data (we review some of these data later; see also McDaniel & Einstein, 2000, 2007a). Third, we believe there is a bias to rely on spontaneous retrieval processes. Given the substantial costs of monitoring on the speed of performing ongoing activities, it would seem adaptive to generally hold this predisposition. Such a bias also fits with a body of research suggesting that our ability to maintain conscious control over behavior is limited (Bargh & Chartrand, 1999; Baumeister, Bratslavsky, Muraven, & Tice, 1998). As noted earlier, given the importance of prospective memory for carrying out daily activities, relying on multiple processes would add some redundancy and help ensure that we remember to carry out intentions.
Also, it would allow us to flexibly solve the prospective memory problem effectively and efficiently. And indeed there is a good deal of evidence suggesting that multiple processes are involved in prospective memory. For example, dividing attention seems to interfere with prospective remembering in some situations but not in others (McDaniel et al., 2004). Similarly, age‐related decrements in prospective memory are sometimes observed (Maylor, 1996;
Smith & Bayen, 2006) and sometimes not (Cherry & LeCompte, 1999; Einstein & McDaniel, 1990; see Henry, MacLeod, Phillips, & Crawford, 2004, for a review and meta‐analysis). As another example, performing a prospective memory task sometimes interferes with the speed of performing the ongoing task (Einstein et al., 2005; Marsh et al., 2003; Smith, 2003) and sometimes does not (Einstein et al.; Marsh et al.). It would be difficult to explain these results with a unitary view of prospective memory that assumes either that retrieval always requires controlled recognition processes or that retrieval always occurs spontaneously. The multiprocess view can account for these effects by assuming that different processes come to the fore and are effective in different contexts. Specifically, the multiprocess view assumes that divided‐attention effects, age‐related declines, and task interference should be greater in conditions that require monitoring (and controlled recognition processes) and minimal or nonexistent in situations with good cues that are likely to produce spontaneous retrieval.
VI. Metamemory and Prospective Memory
Given the evidence strongly suggesting that there are various ways to accomplish prospective memory, a central question is how strategically people recruit these processes for prospective remembering. Unfortunately, there is very little research on this issue. From a theoretical perspective, there may be an inherent challenge in calibrating anticipated performance levels. In retrospective memory, one can periodically attempt to retrieve target information and by so doing gain leverage on the memorability of that information. In prospective memory, by contrast, it is difficult to see how one could accurately test the likelihood that one would remember to remember in general and in the anticipated future conditions in particular. One strategy would be to test one's memory for the intended action. That is, one might acquire a judgment of the likelihood of remembering what it is that one has to do. Note, however, that this is not the prospective memory aspect, that is, remembering that one has to do something at some subsequent appropriate moment. Indeed, assessing memory for the retrospective component may be a relatively inaccurate index of one's ability to remember the prospective component. As an informal example, shortly after becoming the chair of his department, one of the authors forgot to attend his department meeting, which he had scheduled! His metajudgment, based on the importance of the intended action, was that it would be impossible to forget. And indeed he never forgot the retrospective memory component. The point here of course is that the
prospective memory prediction was based on the retrospective memory component. From this perspective, meta‐awareness of prospective memory tasks may be particularly problematic. Another factor that could undermine development of relatively accurate awareness of prospective memory may be an assumption that people hold about prospective memory reflecting a unitary ability or memory faculty. Alternatively, given the importance and frequency of prospective memory in our daily lives and the clear and often strong feedback when forgetting occurs (as experienced when one of us forgot to attend the department meeting), it may be that we are especially likely to develop accurate and effective metamemory for prospective memory situations. This idea is consistent with the argument that prospective memory should develop at a relatively early age because of its importance in human existence (cf. Kvavilashvili, Kyle, & Messer, in press). Indeed the argument could be made that most of our attempts to assess our ability to remember (outside of educational situations) involve prospective memory, and not retrospective memory, tasks. If there were good meta‐awareness of prospective memory, we would expect to see effective modulation in how people approach prospective memory as a function of contextual parameters. To date, the evidence bearing on this expectation is generally supportive, and we turn now to that evidence.
A. SENSITIVITY TO THE EXTENT TO WHICH THE ONGOING TASK ENCOURAGES FOCAL PROCESSING OF THE TARGET CUE
A key assumption of the multiprocess view is that the anticipated availability of good cues can stimulate spontaneous retrieval through the discrepancy and reflexive associative processes described earlier. Although there are likely many features that characterize good cues (e.g., distinctiveness, McDaniel & Einstein, 1993, and the extent to which the cues are associated with the actions, McDaniel et al., 2004), a critical consideration is the extent to which the ongoing task encourages focal processing of the target cue. This idea borrows heavily from the transfer‐appropriate processing explanation of retrieval effects in retrospective memory (Morris, Bransford, & Franks, 1977; Roediger, 1996). As it relates to prospective memory, the idea is that spontaneous retrieval will be more likely when the processing of the target at retrieval overlaps greatly with the processing of that item at encoding. Thus, if the cue is thought of in a particular way during encoding or intention formation, spontaneous retrieval will be greater if the ongoing task at retrieval encourages similar processing of the target. For example, if the target event is the homonym ‘‘bat’’ and it is encoded as a baseball bat when the intention is formed, then prospective memory retrieval will be more likely if the ongoing task context at retrieval encourages an interpretation of the target
in terms of ‘‘baseball’’ rather than in terms of ‘‘flying mammals’’ (McDaniel, Robinson‐Riegler, & Einstein, 1998). Another important characteristic of focal processing is that the ongoing task focuses attention on the target cue. In retrospective memory contexts, cues are presented at retrieval and the assumption is that subjects fully process them. This is not necessarily the case in prospective memory situations as a cue may be present but not consciously processed. For example, we may have the intention to give a message to a friend but our attention may be elsewhere when we pass her in the hallway. Thus, we believe that focal processing of the target event occurs when the ongoing task at retrieval directs attention to the target event and when it encourages processing of the features that were thought about during intention formation (see Einstein & McDaniel, 2005, for examples of focal and nonfocal conditions). Are people generally sensitive to the focal nature of prospective memory target cues? To gain leverage on this question, one of our students, Sarah Moynan, constructed a meta‐prospective memory questionnaire and recently administered it to 20 college students. The students were presented with hypothetical prospective memory situations: focal and nonfocal event‐based tasks and time‐based tasks. As an example, the focal and nonfocal event‐based scenarios were, respectively:
You need to make an appointment with an office across campus. You need to talk to someone in person rather than leaving a message. You walk past the office on the way to your next class.
You need to make an appointment with an office across campus. You need to talk to someone in person rather than leaving a message.
For each situation the students were asked to use a Likert scale to predict the likelihood they would complete the prospective memory task. Also, the students were asked what strategies, if any, they would use to aid in remembering to perform the prospective memory task. The preliminary responses indicate that people are generally sensitive to the facilitation afforded by focal cues (as reported objectively in laboratory experiments; Einstein et al., 2005) and the more demanding nature of time‐based prospective memory tasks. Sixty percent of the students indicated that they would be ‘‘highly likely’’ to complete the focal event‐based task. By contrast, just 20% indicated that they would be ‘‘highly likely’’ to complete the nonfocal event‐based task. With regard to the time‐based tasks, only one‐third of the students indicated that they would be ‘‘highly likely’’ to complete the task. Along with the appropriate modulation of the predicted success across the focal–nonfocal and event–time‐based dimensions, the students reported some, but less pronounced, variation in whether they would recruit a strategy
to help remember the various prospective memory tasks. For the focal event‐based task, half of the students indicated they would engage a strategy. For the nonfocal event‐based and time‐based tasks, 65% indicated they would use a strategy. Interestingly, however, the mix of strategies mentioned (across students) was fairly consistent, and they were not heavily reliant on monitoring. Generally, across all of the prospective memory tasks, the most frequently mentioned strategy was generating external cues (writing the intention on a day planner or a sticky note, setting a cell phone or alarm). Monitoring (rehearse the intention repeatedly throughout the day) was mentioned as a strategy 19% of the time. Other cognitive strategies such as imagery and associative linkages to certain standard events were mentioned infrequently. For laboratory prospective memory tasks, behavioral indices of monitoring provide a somewhat different picture with regard to recruitment of monitoring processes when nonfocal cues versus focal cues are anticipated. Although the full spectrum of ongoing task‐target cue relations has not been examined, available results generally suggest that people are more likely to recruit monitoring strategies with nonfocal cues. Consider, for instance, the research of Marsh et al. (2003). Their subjects performed a lexical decision ongoing task while performing no prospective memory task, a prospective memory task with a focal cue (press a key when you see the target word ‘‘dog’’), or a prospective memory task with a nonfocal cue (press a key when you see a target word representing an instance of the ‘‘animal’’ category). Our assumption here is that the animal intention was nonfocal because performing a lexical decision task does not require activation of taxonomic category information.
If subjects were aware that focal cues are more likely to lead to spontaneous retrieval, then performing a prospective memory task with a focal cue should have produced less monitoring and thus less slowing on lexical decision times than with a nonfocal cue (relative to the control condition that did not perform a prospective memory task). This is exactly what happened as there was no significant slowing in the word condition but substantial and significant slowing in the category condition. Einstein et al. (2005, Experiment 1) set out to directly test whether subjects are sensitive to the focal nature of prospective memory target cues. For their ongoing task, subjects were presented with a word and a category and had to decide as quickly as possible whether the word represented a member of that category. The prospective memory task for half of the subjects was to press a designated key when the target word ‘‘tortoise’’ occurred (focal condition) and for the others when the syllable ‘‘tor’’ occurred (nonfocal condition). The assumption was that the syllable condition was nonfocal because the ongoing task encourages processing of the items as lexical units and does not ordinarily lead to conscious awareness of the syllables. Consistent with the Marsh et al. (2003) findings, there was significant slowing (relative to a control condition) in the nonfocal condition but not the focal condition. Thus, these
two laboratory studies indicate that people are sensitive to at least some dimensions related to the focal nature of a target stimulus, and are more likely to rely on spontaneous retrieval processes (i.e., less likely to monitor) when they anticipate encountering a focal target. Using the same conditions as in Einstein et al. (2005), McDaniel, Einstein, and Rendell (in press, Experiment 2) found that older adults exhibited a similar pattern of significant slowing with a nonfocal task but not with a focal task. In fact, the costs of performing a prospective memory task were especially pronounced for the older adults in the nonfocal condition and yet they attained prospective memory levels comparable to those of the younger subjects. This suggests an impressive awareness by older adults that they did not have to monitor with a focal task in order to achieve high prospective memory (comparable to that of younger adults) but that they did have to engage precious working memory resources in the service of monitoring with a nonfocal prospective memory task in order to achieve prospective memory levels comparable to those of younger adults. The general reluctance by both younger and older adults to engage capacity‐consuming monitoring processes when good (focal) cues are available appears to be qualified by individual differences. Although the available research shows no significant task interference with a single target event, there is typically nominal slowing in these studies (Einstein & McDaniel, 2005). This slowing may reflect chance variation. Another possibility is that it reflects a mix of strategies on the part of subjects with some subjects monitoring and others not. To test this possibility, we (Einstein et al., 2005, Experiment 4) presented a large sample (104 subjects) with an ongoing task (deciding whether a word meaningfully fit into a sentence) and a focal prospective memory task (pressing a key whenever a particular word appeared in a sentence).
All subjects performed the ongoing task alone and also while performing the prospective memory task. With the high level of power in this experiment, the results revealed significant monitoring. More importantly, slightly over half of the subjects slowed down when performing the prospective memory task (an indication that they were monitoring) whereas the others did not (an indication that they were relying on spontaneous retrieval processes). Interestingly and consistent with the multiprocess view that focal cues stimulate spontaneous retrieval, prospective memory performance was high and not significantly different for these two groups of subjects. Thus, these results suggest important individual differences in how people approach prospective memory situations. The monitoring with a focal cue could reflect a lack of awareness of the value of such cues and/or a cautiousness and a willingness to monitor in order to help ensure (provide redundancy for) prospective remembering. Further research is needed to determine the personality and cognitive variables that mediate people’s willingness to adopt a strategic monitoring approach versus a reliance on spontaneous retrieval processes.
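The task‐interference measure used in these studies is simple arithmetic: the mean speed of performing the ongoing task with the prospective memory task embedded, minus the mean speed of performing the ongoing task alone. The following minimal sketch illustrates that computation and a split of subjects into those who slowed and those who did not. The function names, the zero‐millisecond criterion, and all reaction times are hypothetical illustrations; they are not the analysis or the data of any study cited here.

```python
from statistics import mean


def task_interference(rt_alone_ms, rt_with_pm_ms):
    """Slowing in ms: mean ongoing-task RT with the prospective memory
    task minus mean ongoing-task RT when the task is performed alone."""
    return mean(rt_with_pm_ms) - mean(rt_alone_ms)


def split_by_slowing(costs_ms, criterion_ms=0.0):
    """Partition subjects into those whose cost exceeds the criterion
    (taken here as a sign of monitoring) and those whose cost does not.
    The criterion is an assumption made for illustration."""
    slowed = [s for s, c in costs_ms.items() if c > criterion_ms]
    not_slowed = [s for s, c in costs_ms.items() if c <= criterion_ms]
    return slowed, not_slowed


# Hypothetical reaction times in ms (illustrative values only).
# Each entry: (ongoing task alone, ongoing task with prospective memory task)
subjects = {
    "s1": ([700, 720, 710], [760, 780, 770]),
    "s2": ([650, 660, 640], [650, 655, 645]),
    "s3": ([800, 790, 810], [780, 800, 790]),
}

costs = {s: task_interference(alone, with_pm)
         for s, (alone, with_pm) in subjects.items()}
slowed, not_slowed = split_by_slowing(costs)

print(costs)
print("slowed:", slowed, "did not slow:", not_slowed)
```

With a within‐subject design such as the one described above, each subject contributes both reaction‐time sets, so the cost can be computed per subject before any grouping. A real analysis would also need an inferential test, because a nominal positive cost for a single subject may reflect chance variation.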
B. IMPORTANCE OF THE PROSPECTIVE MEMORY TASK
Intuitively it seems that we are less likely to forget tasks that we consider more important. Indeed, Winograd (1988) has suggested that people often gauge how important we consider a task by whether or not we remember it. The research is generally consistent with these impressions as importance instructions tend to produce higher prospective memory (e.g., Kvavilashvili, 1987; Meacham & Singer, 1977). Interestingly, however, importance tends to have greater effects on prospective memory tasks that require strategic monitoring for successful retrieval relative to those for which spontaneous retrieval is likely (Kliegel, Martin, McDaniel, & Einstein, 2001, 2004). In Kliegel et al. (2004), for example, importance instructions improved prospective memory on a nonfocal task but not on a focal task. But, what is it that people do to improve prospective memory in situations that they consider important? Are they more likely to monitor and if so, do they do so indiscriminately or are they sensitive to the availability of good cues? To begin to address these questions, Einstein et al. (2005) varied the importance of the prospective memory instructions while presenting subjects with either a focal or nonfocal prospective memory task and measured prospective memory performance as well as task interference (i.e., the effects of performing a prospective memory task on the speed of performing the ongoing task). In the high‐importance condition, subjects were told that performing the prospective memory task was their main goal and that they should try to ‘‘find absolutely every occurrence of the target item’’ (p. 330). In the moderate‐importance condition, they were told that the experimenters had ‘‘a secondary interest’’ (p. 330) in the prospective memory task and that their main concern was the accurate and speedy performance of the ongoing task. Consistent with the Kliegel et al. (2004) results, importance instructions improved prospective memory only in the nonfocal condition.
Even so, increased slowing on the ongoing task with high‐importance instructions suggested increased monitoring by subjects in both the focal and nonfocal conditions. Thus, subjects seemed generally aware that they could improve prospective memory in important situations by increasing their monitoring for target events. As noted in the previous section, they also appeared to realize that spontaneous retrieval is likely with focal cues, and thus were willing to rely exclusively on spontaneous retrieval processes with moderate‐importance instructions. Given this awareness, the monitoring with focal cues under high‐importance instructions may reflect some understanding that spontaneous retrieval processes are probabilistic in nature and thus a willingness to engage a backup system for remembering. Other evidence suggests that people may be even more selective in their strategy use than perhaps indicated by the above findings. Specifically, people appear to systematically modulate the resources they recruit for the
prospective memory task (i.e., for monitoring) as a function of the perceived importance of the prospective memory task for a nonfocal cue but not a focal cue. In an extension of the Einstein et al. (2005) paradigm, at the conclusion of the experiment, Jen Breneiser (as part of her dissertation work) required participants to rate the importance of the prospective memory task relative to the other tasks in the experiment (on a scale of 1, ‘‘least important,’’ to 7, ‘‘most important’’). Preliminary results showed that the amount of slowing in the nonfocal condition was significantly correlated with the rated importance of the prospective memory task. Assuming that slowing indexed degree of monitoring and that monitoring is required with nonfocal cues, it would be expected that rated importance would also be associated with prospective memory performance in the nonfocal cue condition. This correlation was also highly significant. By contrast, in the focal cue condition, there was no association between any slowing and rated importance or between rated importance and prospective memory performance. These patterns suggest that people allocate resource‐demanding monitoring in a manner that is appropriate both for the nature of prospective memory cues and for the priority associated with the prospective memory task. Further support for this conclusion is evident in an experiment conducted by Sarah Moynan and Elaine Tamez (students at Washington University). To begin the experiment, in one condition participants were instructed that the prospective memory task (nonfocal task) was of high importance. Then midway through the experiment, participants were told that the priorities had changed and that the prospective memory task was now of low importance. As would be expected, response latencies on the ongoing task were slower when the prospective memory task was of high importance than when it was of low importance.
Especially telling were qualitative differences in self‐reported strategies across the high‐ and low‐importance segments. For the high‐importance segment, approximately 75% of the participants said that for each trial they first checked to see if the nonfocal information was a target item and then attended to the ongoing task information. In contrast for the low‐importance segment, participants reported attending to the ongoing task initially and then perhaps checking to see if the nonfocal cue was present.
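The correlational logic described in this section (relating each participant's rated importance of the prospective memory task to that participant's slowing on the ongoing task, separately for each cue condition) can be illustrated with a short sketch. The function below computes a Pearson correlation from first principles. The ratings and slowing values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the studies described here, and the sketch is not the authors' analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical values (illustrative only): importance ratings on the
# 1-7 scale and slowing in ms for five participants in one condition.
rated_importance = [1, 2, 3, 4, 5]
slowing_ms = [10, 20, 30, 40, 50]

r = pearson_r(rated_importance, slowing_ms)
print(round(r, 3))
```

In this made‐up example slowing rises in exact proportion to rated importance, so the coefficient is at its maximum. Real data would be noisier, and the coefficient would need a significance test before it could support a claim like the one made for the nonfocal condition.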
C. DELAYING EXECUTION OF RETRIEVED INTENTIONS
In some cases after retrieving an intention, we have to delay the action until the conditions are appropriate for performing it. For instance, one might retrieve the intention to take medication while in the living room (downstairs), but then have to delay taking the medication for the time it takes to walk to the room
where the medication is (for ease of exposition, we will label this a delayed‐execute prospective memory situation). Delayed‐execute tasks are also prominent in demanding work settings (e.g., aviation operations, Dismukes, Berman, & Loukopoulos, 2007; Loukopoulos, Dismukes, & Barshi, 2003). Several interesting questions can be posed concerning delayed‐execute tasks: (1) Does the brief delay between retrieval of the intention and its execution produce prospective memory forgetting? (2) Does variation in the length of those delays produce variation in forgetting? (3) Are people generally aware of the consequences of brief delays (following intention retrieval) on performance? To answer the first two questions, in a series of experiments, we (Einstein, McDaniel, Williford, Pagan, & Dismukes, 2003) instructed participants to defer execution of a retrieved intention until completion of an ongoing activity (which occurred 5, 15, or 40 s later). A very salient prospective memory cue was used (the appearance of a red screen on the monitor) to ensure that participants would retrieve the intention, and it is reasonable to assume that performance would be nearly perfect if immediate execution were allowed (see Einstein, McDaniel, Manzi, Cochran, & Baker, 2000). Performance was maintained at a high rate under ordinary ongoing task demands (92.5%), with forgetting increasing (76% accuracy) when ongoing task demands were augmented with a secondary task. Importantly, these performance levels did not fluctuate as a function of delay length. For each condition, performance after the 40‐s delay was at the level observed at the 5‐s delay (similar findings were reported by Einstein et al., 2000; McDaniel, Einstein, Stout, & Morgan, 2003). This latter finding is intriguing because it stands in contrast to the classic forgetting functions reported in the Brown–Peterson paradigm (Brown, 1958; Peterson & Peterson, 1959) using delays of this magnitude.
One interpretation of the prospective memory finding is that as in the Brown–Peterson paradigm, the retrieved intention is maintained in an active state through working memory processes. The potential problem of course is that information does not last in focal awareness more than 2 s without refreshing the trace (Muter, 1980; Schweickert & BoruV, 1986). Consistent with the theme of this chapter (and the volume), we suggest that participants adopt strategies to countermand the temporal limits of working memory. Specifically, during the retention interval people may periodically activate the intention by sneaking in rehearsals or by retrieving the intention from long‐term memory. This interpretation accounts for the high delayed‐execute performance in the standard ongoing task condition, the drop‐oV in the highly demanding condition (which would interfere with attempts at periodic activation), and the absence of prospective memory forgetting over a 40 s span.
To explore people’s meta‐prospective memory for delayed‐execute situations, we gave 34 college students the following scenario (see McDaniel & Einstein, 2007a): You are working on an easy essay question and get the thought to add an argument to a previous question. Before adding that argument, however, you first want to finish answering the question. Based on what you know about how your own memory works, rate how likely you will be to remember to add the argument to the earlier question when it takes you 5, 15, or 40 seconds to finish the current question.
The students were asked to give ratings for both “normal exam conditions” and “very hurried exam conditions.” In agreement with the performances found in our experiments, the students estimated that prospective memory would be lower under hurried than under normal conditions. The students also predicted that prospective memory would be relatively high at the 5‐s delay, decline at the 15‐s delay, and decline even more at the 40‐s delay. Because the students predicted forgetting over the delay, one might be tempted to conclude that people are not aware of the dynamics of prospective memory when execution must be delayed over brief intervals. Another interpretation, however, is that it is these very predictions that led participants in our study (Einstein et al., 2003) to adopt strategies to maintain activation of the intention over delays spanning half a minute or so. This interpretation is in line with the view that, at least in this situation, people strategically augment basic memory properties (e.g., temporal limitations of working memory) to maximize prospective memory performance. Indeed, the particular strategies recruited may reflect a rich awareness of the demands of the task in conjunction with the capabilities of the individual. In the delayed‐execute task, we have found that older adults show dramatic forgetting in performance even at the short 5‐s intervals; approximately 50% of the time older adults forgot to perform the intended action that they retrieved just 5 s previously (Einstein et al., 2000; McDaniel et al., 2003; Rendell, Ozgis, & Wallis, 2004). Such rapid forgetting is consistent with the storage of the intention in a rapidly decaying working memory when rehearsal is not used to refresh the intention. Perhaps older adults are aware that they have reduced cognitive resources and will thus have difficulty marshalling the resources needed to maintain the intention in the face of distraction.
In partial support of this idea, even when older adults are instructed to rehearse the intention over the brief delay, they display significantly more forgetting than the younger adults (who are not instructed to rehearse; McDaniel et al., 2003). Accordingly, older adults seem to opt for the less cognitively demanding strategy of reframing the prospective memory task.
That is, we have indirect evidence that on encountering the prospective memory cue (e.g., a red screen), older adults might recode the task as “perform the intended action when I’m finished with the current questions” and then rely on this new cue (the end of the ongoing activity) to stimulate retrieval of the intended action (McDaniel et al., 2003).
D. SENSITIVITY TO COSTS OF MONITORING
As we discussed earlier, in the laboratory people tend to engage resource‐demanding monitoring processes for prospective memory tasks for which spontaneous processes cannot be depended on to support retrieval of the intention (at the appropriate moment). In particular, they are more likely to monitor with time‐based prospective memory tasks and event‐based tasks with nonfocal cues. However, monitoring presumably exacts costs on the ongoing task (Smith, 2003). To the extent that people are sensitive to these costs and wish to avoid degradation of ongoing activity, fatigue from monitoring (cf. Bargh & Chartrand, 1999), or both, we would expect people to deploy monitoring strategically. Consider, for example, the situation in which, finding no orange juice for breakfast, you intend to buy juice that day. It would be possible to monitor (or actively maintain the intention) throughout the day for the opportunity to buy juice. Yet doing so could disrupt ongoing performance or, at a minimum, be distracting throughout the day. An alternative strategy that could minimize monitoring costs would be to suspend monitoring until a context in which the intention can be executed (here, the drive home from work). With this strategy, monitoring is activated only in a circumscribed context in which the intended action might be executed, thereby both optimizing prospective memory performance and allowing performance on the ongoing task to be maintained during much of the retention interval. Cook, Marsh, and Hicks (2005) developed a laboratory paradigm to investigate whether people adopt monitoring behaviors that are strategically modulated according to the contexts in which the appropriate moment for intention execution is expected. In their paradigm, subjects were informed that the experiment would consist of three distinct phases.
For instance, a first phase would require pleasantness ratings of words, a second phase would involve a demographic questionnaire, and a third phase would involve counting the number of syllables in words. In a time‐based prospective memory task, when subjects were correctly informed that the target time would occur in a particular phase (e.g., phase 3), clock monitoring was generally limited to that phase. By contrast, when subjects were not given information about the phase in which the target time would occur, clock monitoring was distributed across all phases of the experiment. Further, prospective memory
performance was better when subjects were able to anticipate a context in which the target time would occur than when subjects were not given information about the context (or were given misleading information). Similar results were obtained with an event‐based prospective memory task in which the cue was nonfocal (respond when any word from the animal category appeared; Marsh, Hicks, & Cook, 2006). Here, ongoing task performance was significantly disrupted in early phases when subjects were not aware of the phase in which target words would appear (phase 3), whereas there was no disruption to the ongoing task (in early phases) for subjects aware of the phase in which the targets would appear. These studies support the idea that when people have expectations about the context in which a target time or event will occur, they will adopt a monitoring strategy to minimize expenditure of resources outside of that context. Doing so avoids possible disruption of the ongoing tasks throughout much of the prospective memory retention interval, presumably prevents fatigue of cognitive resources, and helps focus resources appropriately to optimize prospective memory performance (Cook et al., 2005).
VII. Summary and Future Directions
In this chapter, we argued against the view that a specialized prospective memory mechanism evolved to accomplish the increasingly important prospective memory demands of humans. Instead, we proposed that several fundamental memory mechanisms that are not unique to prospective memory enable us to carry out remembering of delayed intentions. This view assumes the existence of sufficient working memory and attentional resources that allow for higher‐order strategic processing for such things as planning, actively maintaining preparatory attentional processes or a retrieval mode, and selecting a retrieved intention and then inhibiting ongoing activity and carrying out the action.
The use of several mechanisms is adaptive in the sense that it enables us to respond efficiently to the demands of different prospective memory situations and allows for backup routes for remembering. Following the ideas of Marsh, Hicks, and Cook (2005; see also Meeks, Hicks, & Marsh, in press), depending on personal characteristics as well as the perception of the demands of the ongoing and prospective memory tasks, we believe that people develop a policy for allocating their attentional resources. This policy is thought to be dynamic as it responds to changing demands of the ongoing and prospective memory activities as well as momentary fluctuations in attention. Moreover, Marsh et al. (2005) have shown that subjects can quickly shift attentional priorities to the ongoing or prospective memory task when cued to do so by the experimenter. A key and as
of yet unresolved issue is whether we successfully self‐allocate attentional priorities on the basis of relevant dimensions of the prospective memory demands. In this chapter, we have presented the multiprocess view as a starting point for addressing this issue. This theory proposes that several processes, ranging from active monitoring of the environment to relying on spontaneous retrieval, can accomplish prospective memory retrieval. It also assumes that there are a variety of conditions that affect the extent to which spontaneous retrieval is likely. Although the theory assumes a predisposition for relying on spontaneous retrieval processes, when spontaneous retrieval is unlikely (and it is important to remember and we can sacrifice performance on the ongoing task), we are more likely to actively monitor for the target event/time or to develop external aids that can provide cuing. An interesting question therefore is the extent to which we have meta‐awareness of the critical dimensions of prospective memory situations that determine whether spontaneous retrieval is likely. Although there has been little research that bears on this topic, and even less that purposely addresses it, the preliminary research is generally encouraging. One approach to testing this question has been to present people with different prospective memory situations and to measure their efficiency of performing the ongoing activities both with and without a concurrent prospective memory task. This research for the most part shows good levels of meta‐awareness in that subjects seem sensitive to the focal nature of cues, are more likely to monitor when the prospective memory task is important and especially so with nonfocal tasks, seem aware that retrieved intentions that cannot be performed immediately are fragile, and seem unwilling to engage capacity‐consuming monitoring processes unless they are in the anticipated retrieval context.
Even though this initial research suggests good awareness of conditions for which spontaneous retrieval is less likely and consequently more in need of monitoring, it is important to keep in mind that researchers have not yet explored people’s responsiveness over a broad range of prospective memory conditions. People may mistakenly interpret critical features of cues, ongoing tasks, and relations between the two. For example, because a supermarket may be physically large, people may erroneously think that they will be easily cued to stop to shop for groceries and not realize that, because the market is set back a bit from the road, it is unlikely to be focally processed. Thus, further work is needed to examine the underlying dimensions that people actually use in making decisions about relying on spontaneous retrieval versus more strategic monitoring processes. Also, a current limitation of using this technique alone (assessing efficiency of ongoing task performance) is that it simply exposes the degree of costs of a prospective memory strategy on the efficiency of performing the ongoing task but it does little to illuminate
the nature of the strategy. Consequently, this method may be even more revealing when combined with other techniques for assessing the particular strategies that are used. Researchers may also want to use metamemory questionnaires in which they ask subjects to gauge the difficulty of prospective memory situations that vary in focal processing (and other dimensions). Somewhat along these lines, Smith, Della Sala, Logie, and Maylor (2000) developed the Prospective and Retrospective Memory Questionnaire (PRMQ) in which they asked people of various ages to rate their memory failures in prospective and retrospective memory situations that varied in terms of the length of the delay and whether remembering was self‐cued or environmentally cued. Interestingly, subjects indicated more memory failures with self‐cued prospective memory tasks than environmentally cued ones (possibly indicating an awareness that cues can produce spontaneous retrieval) but this held up only with shorter delays. The failure of their subjects to be aware of the benefits of cues with longer delays may have been due to item effects. Other uses of the PRMQ suggest some awareness of general prospective memory ability as scores from the prospective memory scale of this questionnaire correlated with actual performance on a laboratory test of prospective memory (Kliegel & Jager, 2006). Another technique that has been commonly used in the retrospective memory literature (Koriat, 1995) is to ask subjects to predict their memory performance on tasks that differ on various dimensions and then to test the accuracy of their predictions. In one of the few uses of this technique in the prospective memory arena, Meeks et al. (in press) found that there was a correlation between predicted performance and actual performance and also that subjects underestimated their actual prospective memory performance.
When asked to indicate how they would do in remembering to press a designated key when they encountered the syllable “tor” or an instance of an “animal,” both groups of subjects indicated that they would remember about 50% of the time. Yet, subjects did better in the animal condition than in the syllable condition. Although the ratings seem to indicate a lack of awareness of the difficulty of the syllable prospective memory task, it is interesting to note that subjects were significantly slower at performing the ongoing task in the syllable condition, thereby indicating an awareness of the difficulty of the syllable condition. This conflict between predictions and the behavioral data suggests that subjects may have difficulty anticipating their strategic needs in unfamiliar experimental contexts (cf. Ceci & Bronfenbrenner, 1985). Another technique would be to ask people to create cues for different situations to examine whether they tend to use focal ones. Still another potentially useful technique is that of Reese and Cherry (2002), who occasionally probed subjects to report their current thoughts during a prospective memory experiment. Interestingly, with a single focal target cue, they found
that both younger and older adults rarely reported thinking about the prospective memory task, which would be consistent with the general theme that subjects tend to rely on spontaneous retrieval processes with a focal cue. In line with contextualistic accounts of memory (Jenkins, 1979; see also Kvavilashvili & Ellis, 1996), it is likely that the strategies that people rely on for prospective remembering depend not only on their meta‐knowledge concerning the nature of the prospective memory task but also on their understanding of other factors such as characteristics of the ongoing activity, and their cognitive abilities and personal tendencies (see Fig. 1 for a listing of some possible factors). The involvement of personal factors seems necessary to explain the tendency of some subjects to monitor with good focal cues (even when monitoring did not improve prospective memory; Einstein et al., 2005) and why older adults tended not to monitor when they needed to delay a retrieved intention (McDaniel et al., 2003). It is important to realize that these factors may affect people’s encoding or planning strategies (e.g., the development of effective encoding strategies or the use of external aids) as well as their retrieval strategies. Indeed, there is likely to be a reciprocal relationship between encoding and retrieval strategies. For example, people’s encoding strategies seem to affect their levels of monitoring. Subjects who indicated using planners to remember their daily activities reported fewer thoughts about the intended actions over the retention interval (Marsh, Hicks, & Landau, 1998). Similarly, Cohen and Gollwitzer (in press) reported that subjects who use implementation intentions (which involve specifically linking an action to a cue and are thought to lead to spontaneous retrieval; see Gollwitzer, 1999) are less apt to monitor over the retention interval. We believe that a full understanding of the strategies that people use to accomplish prospective memory will require consideration of all of these factors. The increasingly rich empirical and theoretical work in prospective memory (McDaniel & Einstein, 2007a) has given us a solid foundation for understanding the cognitive processes that are used for prospective memory retrieval and their effectiveness in different contexts. We see this work as a good starting point for future research examining how people strategically approach prospective memory tasks and the skill with which they can select an effective and efficient strategy in different situations.
Fig. 1. Possible factors that could affect the strategies that people bring to bear on a prospective memory task.
Nature of the Prospective Memory Task
  Factors associated with the target event (e.g., presence of cues, focal cues, distinctive cues, associative relation between the target event and action)
  Assessment of the importance of the task
  Previous experiences with similar tasks
  Whether it is a habitual/routine prospective memory task
Nature of the Ongoing Activities
  Length of the delay interval
  Demands during the delay and at retrieval
Characteristics of the Individual
  Knowledge of effective encoding strategies for stimulating spontaneous retrieval (e.g., implementation intentions)
  Knowledge that it is difficult to sustain an active monitoring strategy
  Fatigue and knowledge that fatigue may affect the ability to maintain monitoring
  Working memory capacity and thus the ability to monitor while performing ongoing activities
  Personality dimensions such as compulsiveness and locus of control
As noted earlier, owing to the self‐initiated character of prospective memory and the inherent difficulty of monitoring the effectiveness of this self‐initiated component (i.e., assessing one’s memory tends to test for the accessibility of the retrospective memory component), there are interesting theoretical issues about our metamemory skill in prospective memory situations. Assuming also that we are not completely aware of all of the conditions that influence prospective memory forgetting and how to counteract these conditions, research of this type has important practical implications.
REFERENCES
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Atkinson, R. C., & Juola, J. F. (1974). Search and decision processes in recognition memory. In D. H. Krantz, R. C. Atkinson, R. D. Luce, and P. Suppes (Eds.), Contemporary developments in mathematical psychology (pp. 242–293). Oxford, UK: W. H. Freeman.
Ball, C. T., & Little, J. C. (2006). A comparison of involuntary autobiographical memory retrievals. Applied Cognitive Psychology, 20, 1167–1179.
Bargh, J. A., & Chartrand, T. L. (1999). The unbearable automaticity of being. American Psychologist, 54, 462–479.
Baumeister, R. F., Bratslavsky, E., Muraven, M., & Tice, D. M. (1998). Ego depletion: Is the active self a limited resource? Journal of Personality and Social Psychology, 74, 1252–1265.
Berntsen, D. (1996). Involuntary autobiographical memories. Applied Cognitive Psychology, 10, 435–454.
Berntsen, D., & Hall, N. M. (2004). The episodic nature of involuntary autobiographical memories. Memory & Cognition, 32, 789–803.
Brown, J. A. (1958). Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental Psychology, 10, 12–21.
Ceci, S. J., & Bronfenbrenner, U. (1985). “Don’t forget to take the cupcakes out of the oven”: Prospective memory, strategic time‐monitoring, and context. Child Development, 56, 152–164.
Cherry, K. E., & LeCompte, D. C. (1999). Age and individual differences influence prospective memory. Psychology and Aging, 14, 60–76.
Cohen, A. L., & Gollwitzer, P. M. (in press). The cost of remembering to remember: Cognitive load and implementation intentions influence ongoing task performance. In M. Kliegel, M. A. McDaniel, and G. O. Einstein (Eds.), Prospective memory: Cognitive, neuroscience, developmental, and applied perspectives. Mahwah, NJ: Erlbaum.
Conway, M. A. (2005). Memory and self. Journal of Memory and Language, 53, 594–628.
Conway, M. A., & Pleydell‐Pearce, C. W. (2000). The construction of autobiographical memories in the self‐memory system. Psychological Review, 107, 261–288.
Cook, G. I., Marsh, R. L., & Hicks, J. L. (2005). Associating a time‐based prospective memory task with an expected context can improve or impair intention completion. Applied Cognitive Psychology, 19, 345–360.
Craik, F. I. M. (1986). A functional account of age differences in memory. In F. Klix and H. Hagendorf (Eds.), Human memory and cognitive capabilities: Mechanisms and performances (pp. 409–422). Amsterdam: Elsevier.
Curran, T. (2000). Brain potentials of recollection and familiarity. Memory & Cognition, 28, 923–938.
Curran, T., & Dien, J. (2003). Differentiating amodal familiarity from modality‐specific memory processes: An ERP study. Psychophysiology, 40, 979–988.
Dismukes, R. K., Berman, B. A., & Loukopoulos, L. D. (2007). The limits of expertise: Rethinking pilot error and the causes of airline accidents. Hampshire, UK: Ashgate.
Ebbinghaus, H. (1964). Memory: A contribution to experimental psychology. New York: Dover. (Original work published 1885; translated 1913.)
Eichenbaum, H., Yonelinas, A. P., & Ranganath, C. (in press). The medial temporal lobe and recognition memory. Annual Review of Neuroscience.
Einstein, G. O., & McDaniel, M. A. (1990). Normal aging and prospective memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 717–726.
Einstein, G. O., & McDaniel, M. A. (1996). Retrieval processes in prospective memory: Theoretical approaches and some new empirical findings. In M. Brandimonte, G. Einstein, and M. McDaniel (Eds.), Prospective memory: Theory and applications (pp. 115–142). Hillsdale, NJ: Erlbaum.
Einstein, G. O., & McDaniel, M. A. (2005). Prospective memory: Multiple retrieval processes. Current Directions in Psychological Science, 14, 286–290.
Einstein, G. O., McDaniel, M. A., Manzi, M., Cochran, B., & Baker, M. (2000). Prospective memory and aging: Forgetting over short delays. Psychology and Aging, 15, 671–683.
Einstein, G. O., McDaniel, M. A., Williford, C. L., Pagan, J. L., & Dismukes, R. K. (2003). Forgetting of intentions in demanding situations is rapid. Journal of Experimental Psychology: Applied, 9, 147–162.
Einstein, G. O., McDaniel, M. A., Thomas, R., Mayfield, S., Shank, H., Morrisette, N., et al. (2005). Multiple processes in prospective memory retrieval: Factors determining monitoring versus spontaneous retrieval. Journal of Experimental Psychology: General, 134, 327–342.
Ellis, J. (1996). Prospective memory or the realization of delayed intentions: A conceptual framework for research. In M. Brandimonte, G. Einstein, and M. McDaniel (Eds.), Prospective memory: Theory and applications (pp. 1–51). Hillsdale, NJ: Erlbaum.
Gollwitzer, P. M. (1999). Implementation intentions: Strong effects of simple plans. American Psychologist, 54, 493–503.
Goschke, T., & Kuhl, J. (1993). Representation of intentions: Persisting activation in memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 1211–1226.
Goschke, T., & Kuhl, J. (1996). Remembering what to do: Explicit and implicit memory for intentions. In M. Brandimonte, G. Einstein, and M. McDaniel (Eds.), Prospective memory: Theory and applications (pp. 53–91). Hillsdale, NJ: Erlbaum.
Guynn, M. J., & McDaniel, M. A. (in press). Target pre‐exposure eliminates the effect of distraction in event‐based prospective memory. Psychonomic Bulletin & Review.
Henry, J. D., MacLeod, M. S., Phillips, H., & Crawford, J. R. (2004). A meta‐analytic review of prospective memory and aging. Psychology and Aging, 19, 27–39.
Jacoby, L. L. (1991). A process dissociation framework: Separating automatic from intentional uses of memory. Journal of Memory and Language, 30, 513–541.
Jacoby, L. L., & Dallas, M. (1981). On the relationship between autobiographical memory and perceptual learning. Journal of Experimental Psychology: General, 110, 306–340.
Jenkins, J. J. (1979). Four points to remember: A tetrahedral model of memory. In L. S. Cermak and F. I. M. Craik (Eds.), Levels of processing in human memory (pp. 429–446). Hillsdale, NJ: Erlbaum.
Klein, S. B., Cosmides, L., Tooby, J., & Chance, S. (2002). Decisions and the evolution of memory: Multiple systems, multiple functions. Psychological Review, 109, 306–329.
Kliegel, M., & Jager, T. (2006). Can the prospective and retrospective memory questionnaire (PRMQ) predict actual prospective memory performance? Current Psychology: Developmental, Learning, Personality, Social, 25, 182–191.
Kliegel, M., Martin, M., McDaniel, M. A., & Einstein, G. O. (2001). Varying the importance of a prospective memory task: Differential effects across time‐ and event‐based prospective memory. Memory, 9, 1–11.
Kliegel, M., Martin, M., McDaniel, M. A., & Einstein, G. O. (2004). Importance effects in event‐based prospective memory tasks. Memory, 12, 553–561.
Koriat, A. (1995). Dissociating knowing and the feeling of knowing: Further evidence for the accessibility model. Journal of Experimental Psychology: General, 124, 311–333.
Kvavilashvili, L. (1987). Remembering intention as a distinct form of memory. British Journal of Psychology, 78, 507–518.
Kvavilashvili, L., & Ellis, J. (1996). Varieties of intention: Some distinctions and classifications. In M. Brandimonte, G. Einstein, and M. McDaniel (Eds.), Prospective memory: Theory and applications (pp. 23–51). Mahwah, NJ: Erlbaum.
Kvavilashvili, L., Kyle, F., & Messer, D. J. (in press). The development of prospective memory in children: Methodological issues, empirical findings, and future directions. In M. Kliegel, M. A. McDaniel, and G. O. Einstein (Eds.), Prospective memory: Cognitive, neuroscience, developmental, and applied perspectives. Mahwah, NJ: Erlbaum.
Kvavilashvili, L., & Mandler, G. (2004). Out of one’s mind: A study of involuntary semantic memories. Cognitive Psychology, 48, 47–94.
Loukopoulos, L. D., Dismukes, R. K., & Barshi, I. (2003). Concurrent task demands in the cockpit: Challenges and vulnerabilities in routine flight operations. In Proceedings of the 12th International Symposium on Aviation Psychology (pp. 737–742). Dayton, OH: The Wright State University.
Mandler, G. (1980). Recognizing: The judgment of prior occurrence. Psychological Review, 87, 252–271.
Marsh, R. L., Cook, G. I., & Hicks, J. L. (2006). An analysis of prospective memory. In B. H. Ross (Ed.), The psychology of learning and motivation (Vol. 46, pp. 115–156). San Diego, CA: Academic Press.
Marsh, R. L., Hicks, J. L., & Cook, G. I. (2005). On the relationship between effort toward an ongoing task and cue detection in event‐based prospective memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 861–870.
Marsh, R. L., Hicks, J. L., & Cook, G. I. (2006). Task interference from prospective memories covaries with contextual associations of fulfilling them. Memory & Cognition, 34, 1037–1045.
Marsh, R. L., Hicks, J. L., Cook, G. I., Hansen, J. S., & Pallos, A. L. (2003). Interference to ongoing activities covaries with the characteristics of an event‐based intention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 861–870.
Marsh, R. L., Hicks, J. L., & Landau, J. D. (1998). An investigation of everyday prospective memory. Memory & Cognition, 26, 633–643.
Maylor, E. A. (1996). Age‐related impairment in an event‐based prospective memory task. Psychology and Aging, 11, 74–78.
McDaniel, M. A. (1995). Prospective memory: Progress and processes. In D. L. Medin (Ed.), The psychology of learning and motivation (Vol. 33, pp. 191–222). San Diego, CA: Academic Press.
McDaniel, M. A., & Einstein, G. O. (1993). The importance of cue familiarity and cue distinctiveness in prospective memory. Memory, 1, 23–41.
McDaniel, M. A., & Einstein, G. O. (2000). Strategic and automatic processes in prospective memory retrieval: A multiprocess framework. Applied Cognitive Psychology, 14, S127–S144.
McDaniel, M. A., & Einstein, G. O. (2007a). Prospective memory: An overview and synthesis of an emerging field. Thousand Oaks, CA: Sage.
McDaniel, M. A., & Einstein, G. O. (2007b). Spontaneous retrieval in prospective memory. In J. Nairne (Ed.), The foundations of remembering: Essays in honor of Henry L. Roediger III (pp. 225–240). Hove, UK: Psychology Press.
McDaniel, M. A., Einstein, G. O., & Rendell, P. G. (in press). The puzzle of inconsistent age‐related declines in prospective memory: A multiprocess explanation. In M. Kliegel, M. A. McDaniel, and G. O. Einstein (Eds.), Prospective memory: Cognitive, neuroscience, developmental, and applied perspectives. Mahwah, NJ: Erlbaum.
McDaniel, M. A., Einstein, G. O., Stout, A. C., & Morgan, Z. (2003). Aging and maintaining intentions over delays: Do it or lose it. Psychology and Aging, 18, 807–822.
McDaniel, M. A., Guynn, M. J., Einstein, G. O., & Breneiser, J. E. (2004). Cue focused and automatic‐associative processes in prospective memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 605–614.
McDaniel, M. A., Howard, D. C., & Butler, K. M. (2006). Implementation intentions facilitate prospective memory: Immunity to high attention demands. Unpublished manuscript, St. Louis, MO: Washington University.
McDaniel, M. A., Robinson‐Riegler, B., & Einstein, G. O. (1998). Prospective remembering: Perceptually driven or conceptually driven processes? Memory & Cognition, 26, 121–134.
Meacham, J. A., & Singer, J. (1977). Incentive effects in prospective remembering. Journal of Psychology: Interdisciplinary and Applied, 97, 191–197.
Meeks, J. T., Hicks, J. L., & Marsh, R. L. (in press). Metacognitive awareness of event‐based prospective memory. Consciousness and Cognition.
Morris, C. D., Bransford, J. D., & Franks, J. J. (1977). Levels of processing versus transfer appropriate processing. Journal of Verbal Learning & Verbal Behavior, 16, 519–533.
Moscovitch, M. (1994). Memory and working with memory: Evaluation of a component process model and comparisons with other models. In D. L. Schacter and E. Tulving (Eds.), Memory systems (pp. 269–310). Cambridge, MA: MIT Press.
Muter, P. (1980). Very rapid forgetting. Memory & Cognition, 8, 174–179.
Norman, K. A., Newman, E., & Detre, G. (in press). A neural network model of retrieval‐induced forgetting. Psychological Review. Manuscript submitted for publication.
Peterson, L. R., & Peterson, M. J. (1959). Short‐term retention of individual verbal items. Journal of Experimental Psychology, 58, 193–198.
Raaijmakers, J. G. W., & Shiffrin, R. M. (1980). SAM: A theory of probabilistic search of associative memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 14, pp. 207–262). New York: Academic Press.
Reese, C. M., & Cherry, K. E. (2002). The effects of age, ability, and memory monitoring on prospective memory task performance. Aging, Neuropsychology, and Cognition, 9, 98–113.
Rendell, P. G., Ozgis, S., & Wallis, A. (2004). Age‐related effects in prospective remembering: The role of delaying execution of retrieved intentions. Paper presented at the 10th cognitive aging conference, Atlanta, Georgia.
Roediger, H. L. (1996). Commentary: Prospective memory and episodic memory. In M. Brandimonte, G. O. Einstein, and M. A. McDaniel (Eds.), Prospective memory: Theory and applications (pp. 149–155). Mahwah, NJ: Erlbaum.
Ross, B. H., & Kennedy, P. T. (1990). Generalizing from the use of earlier examples in problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 42–55.
Schacter, D. L., & Tulving, E. (1994). What are the memory systems of 1994? In D. L. Schacter and E. Tulving (Eds.), Memory systems 1994 (pp. 1–38). Cambridge, MA: MIT Press.
Schweickert, R., & Boruff, B. (1986). Short‐term memory capacity: Magic number or magic spell? Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 419–425.
Smith, G., Della Sala, S., Logie, R. H., & Maylor, E. A. (2000). Prospective and retrospective memory in normal ageing and dementia: A questionnaire study. Memory, 8, 311–321.
Smith, R. E. (2003). The cost of remembering to remember in event‐based prospective memory: Investigating the capacity demands of delayed intention performance. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 347–361.
Smith, R. E., & Bayen, U. J. (2004). A multinomial model of event‐based prospective memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 756–777.
Smith, R. E., & Bayen, U. J. (2006). The source of adult age differences in event‐based prospective memory: A multinomial modeling approach. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 623–635.
Tulving, E. (1983). Elements of episodic memory. New York, NY: Oxford University Press.
Tulving, E. (2004). Memory, consciousness, and time. Keynote address presented at the 16th Annual Convention of the American Psychological Society, Chicago, IL.
West, R., & Krompinger, J. (2005). Neural correlates of prospective and retrospective memory. Neuropsychologia, 43, 418–433.
Winograd, E. (1988). Some observations on prospective remembering. In M. M. Gruneberg, P. E. Morris, and R. N. Sykes (Eds.), Practical aspects of memory: Current research and issues (Vol. 1, pp. 348–353). Chichester, UK: Wiley.
MEMORY IS MORE THAN JUST REMEMBERING: STRATEGIC CONTROL OF ENCODING, ACCESSING MEMORY, AND MAKING DECISIONS
Aaron S. Benjamin
I.
Introduction
THE PSYCHOLOGY OF LEARNING AND MOTIVATION VOL. 48 DOI: 10.1016/S0079-7421(07)48005-7
Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00

The goal of this chapter and this volume in general is to provide a beginning sketch of the view that the successes and failures of memory are a reflection of skill in interacting with memory effectively rather than an expression of inherent qualities or liabilities of memory itself. In this chapter, I review examples of how particular capacities of memory can be conceptualized as interactions between a quite simple memory system and a set of higher-level control processes that are diverse and varied. In such a theoretical perspective, memory capacities reflect the myriad ways in which learners strategically engage encoding processes and successfully accommodate memory queries to the task at hand, as well as how the products of memory are flexibly aligned, recombined, and operated upon in the service of behavior and action. Much of what we think of as “memory” is thus actually the efficient action of higher-level decision making on the inputs to and the outputs from memory stores themselves. This perspective contrasts with current views of memory, which appeal to an ever-increasing number of distinct memory systems (Schacter & Tulving, 1994) or separable memory processes (Roediger, Weldon, & Challis, 1989). I do not confront those perspectives directly here; nor do I deny levels of explanation (such as neurobiological ones) in which those perspectives may be particularly apt. I present the memory-as-skilled-cognition perspective as an alternate theoretical basis from which to make sense out of interesting human memory behavior, noting that, in most experiments, “. . . the most common approach is to treat subject-controlled processing as a nuisance factor . . .” (Koriat & Goldsmith, 1996, p. 509) and that “investigators go to . . . great lengths to design experiments that eliminate or hold . . . self-directed processes constant . . .” (Nelson & Narens, 1994, p. 8). The overarching message of this chapter is that those attitudes underlie unnecessary and artificial experimental constraints, and have led to an inappropriate partition between research on memory and research on metamemory and related decision processes. This is not to claim that the current perspective is historically unprecedented or particularly revolutionary. The tremendous emphasis on control processes in the late 1960s and 1970s generated a wealth of research that fits neatly with the present claims. In fact, the groundbreaking model of memory proposed by Atkinson and Shiffrin (1968, 1971) presumed that long-term storage of memories was permanent and impervious to forgetting and interference; in that model, variance in recall performance was attributed entirely to control processes that governed the entry of information into long-term storage and the generation of retrieval cues and strategies sufficient for later access. This chapter espouses the same general principle as Atkinson and Shiffrin (1971): that “memory . . . is best described in terms of the flow of information into and out of short-term storage and the subject’s control of that flow . . .” (p. 83).
Whereas their work principally addressed free recall in shorter-term memory tasks, the current chapter applies more recent research in recognition, recall, metamemory, and decision making to understanding memory control in longer-term, more ecologically valid, and more diverse memory tasks. Similar arguments have been made by Koriat and Goldsmith (1996; see also Barnes, Nelson, Dunlosky, Mazzoni, & Narens, 1999) with respect to memory retrieval and Nelson and Narens (1990) with a somewhat greater emphasis on encoding. To understand the goals of this chapter, the reader must temporarily appreciate, if not sympathize with, two concurrent goals. The first goal is to expand the purview of memory research by considering the cognitive contexts in which memory behavior is situated. It is possible to take an ecological (Neisser, 1976, 1982) or an embodiment (Glenberg, 1997) perspective on this issue, but those points of view force the theorizer to consider behavior at a more aggregate and complex level than I plan to here, and it loads the task with the additional difficulty of intuiting “real-world” memory demands. Instead, I take as a starting point the simple fact that memory use exists in the larger cognitive context of servicing intellectual and behavioral goals, and that part of using memory effectively involves knowing not just
how to increase access to useful information from the past, but also how to decrease the costs of doing so (see also Anderson & Milson, 1989). The second and concurrent goal of the chapter is to achieve the first goal with as minimal a set of assumptions as are necessary about the nature of memory itself. This goal represents an explicit attempt to reduce the proliferation of memory systems and memory processes. By decreasing the degrees of freedom available to theorists of memory qua memory, I hope to increase consideration of how extramnemonic processes might yield the wide variety of memory behavior that is reviewed here.

II.
Interacting with Memory
The approach of this chapter will be to characterize ways in which nonmemorial processes interact with the inputs and outputs of memory in order to produce memory behavior that is realistically but only approximately suited to the demands of a student facing the end of the semester. During upcoming examinations, she will be queried on all manner of material from different courses, most of which she has not yet mastered. A rough characterization of this situation and the routes of access to memory are sketched in Fig. 1. Memory storage is depicted in the center of the diagram; there is one route in
Fig. 1. A characterization of memory and memory control. [Diagram labels: Environment; Strategic changes in encoding (What gets encoded? How does it get encoded?); Memory; Matching; Retrieval; How to create recognition query?; How to restrict search space?; When to cease search?; Criterion placement; Criterion adjustment; Output order; Grain size of output; Action.]
and two routes out of this store. Information can be attended to and committed to memory, or not, and additional decisions can be made about how to commit it to memory (by what means and under what scheduling regimen, for example). To access information in memory, two processes are available. The matching process takes as its input a putative match for a memory trace and rapidly provides a measure of how well that trace resonates with a large portion of memory. The retrieval process takes as its input a partial memory trace and returns in response a noisy pattern-completed full trace. This chapter takes these two processes as a starting point and does not attempt to defend their necessity or sufficiency from first principles. I shall return to describe these processes in greater detail at a later point in this chapter, but here I call attention to the fact that the outputs of these processes are the impetus to behavioral action: for example, answering a question in class, continuing or discontinuing study of test-relevant material, or deciding to stop studying and spend more time with friends. In Fig. 1, memory is encapsulated by the box in the center of the diagram. It consists of a store and the two access processes which act upon it. This chapter will deal almost exclusively in cognitive processes outside of that box, and how those extramnemonic processes yield behavior that we typically think of as within the province of memory.

Control of memory is a particularly important skill in light of the current and future potential for offloading aspects of memory onto systems with digital memory. Software like Nokia’s Lifeblog and InSense (Blum, Pentland, & Troster, 2006) allows users to encode arbitrary and copious amounts of data from their everyday lives with the goal of reducing the burden on their pitiable brain-based memory system.
The implied division of labor between carbon and silicon appears to play to the strengths of all parties: The human mind can do what it uniquely does (control memory), and the hard drives can do what they do best (retain information). However, users must still confront the problems of recovering information from their voluminous artificial memory, and what to do with it if and when they recover it. Throughout this chapter, it will be useful to consider as a benchmark exactly what advantages, and disadvantages, the prospect of “life logging” affords to users.

III.
Strategic Decisions About Encoding
During study, learners often attempt to tackle a greater amount of information than can be easily mastered in the limited time available. They must thus make decisions about how to limit their intake of material, and how to
engage in effective learning during the limited time period. Even subjects in the highly artificial context of traditional laboratory memory experiments are attempting to balance multiple goals: They may want to do well because they believe that their performance reflects well upon their intelligence, but they may also want to exert a minimal amount of effort and time before returning to their other responsibilities that have greater consequence in their lives. Clearly, personal motivation (Wolters, Yu, & Pintrich, 1996) and expectations of evaluators (Nolen & Haladyna, 1990) influence this trade-off for the average student, and the trade-off function may differ considerably across subjects. For some subjects under some conditions, the most effective encoding strategy might be not to encode at all. Examples are reviewed in this section that reveal how learners effectively use encoding strategies to enhance memory performance by catering their encoding to the demands of the material and the task. Other examples are provided that reveal failures to do so effectively. The important lesson of these results is that they reveal ways in which one subject may show superior memory over another because of better encoding decisions, rather than better memory. Making smart decisions about encoding necessitates two skills: accurate monitoring of learning (Benjamin & Bjork, 1996) and reasonable knowledge about how various encoding schemes translate into long-term retention, both of which will be considered in this section. Some early reports claimed little or no relationship between learners’ abilities to monitor their own learning and actual memory performance (Begg, Martin, & Needham, 1992; Cull & Zechmeister, 1994; Kelly, Scholnick, Travers, & Johnson, 1976).
However, studies that systematically investigated individual differences (Maki & Berry, 1984) or provided learners an opportunity to guide their own encoding through self-paced study regimens (Thiede, 1999) or to restudy self-selected subsets of items (Nelson, Dunlosky, Graf, & Narens, 1994) revealed clear evidence for superior memory performance in groups of learners with superior monitoring and metacognitive skills.

A.
WHAT GETS ENCODED?
Learners can reduce the burden on memory by devising a plan for how to allocate their time among study items. Two candidate theories of how learners do so appear to have merit. The discrepancy-reduction theory (Dunlosky & Hertzog, 1998) suggests that time is allocated across items in accordance with each item’s proximity to a desired level of learning, which is presumed to typically be equivalent across items (Le Ny, Denhiere, & Le Taillanter, 1972; Nelson & Narens, 1990). A review (Son & Metcalfe, 2000) provided good support for this theory: Under most conditions, subjects allocated more study time to items that were either normatively more difficult
or that they rated as idiosyncratically more difficult (see also Mazzoni, Cornoldi, & Marchitelli, 1990; Zacks, 1969). Even children allocate more study time to items previously unrecalled or unrecognized (and thus presumably more poorly learned) than successfully remembered items (Masur, McIntyre, & Flavell, 1973; Rogoff, Newcombe, & Kagan, 1974). There are limiting conditions on this generality, however. When a memory task places low performance demands on the subject by requiring mastery of only a small proportion of the total material, subjects allocate more study time to the easier, rather than the harder items (Thiede & Dunlosky, 1999). Similarly, if learning takes place under conditions of considerable time pressure (i.e., short study times), subjects primarily allocate that limited time to easier items (Son & Metcalfe, 2000). These results are consistent with the claim that subjects devote their study time to materials that are just beyond their current level of mastery, or in their region of proximal learning (Metcalfe, 2002; Metcalfe & Kornell, 2003). This second theory of study allocation appears to qualify in important ways the simple predictions of the discrepancy-reduction approach. Other evidence for strategic influences on encoding is available in paradigms in which task instructions discount the value of remembering certain items over others. For example, in directed forgetting tasks (MacLeod, 1998), subjects are provided instructions about which items are necessary to remember and which can be forgotten. The relevant finding for present purposes is that subjects show poorer memory for the to-be-forgotten material (Bjork, LaBerge, & Legrand, 1968; Davis & Okada, 1971). Although some have postulated that memory inhibition plays a role in such effects (Bjork & Bjork, 1996), differences in encoding strategies and subsequent rehearsal appear to account for the data more coherently (Benjamin, 2006; Sahakyan & Kelley, 2002).
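As a rough illustration of how the two accounts of study-time allocation discussed above can diverge, the following Python sketch contrasts them on a toy set of items. Everything in it is an assumption made for illustration: the function names, the 0-to-1 `mastery` scale, the common `norm`, and the fixed `time_per_item` are hypothetical and are not drawn from the cited studies.

```python
def discrepancy_reduction(mastery, norm, budget):
    """Split the time budget in proportion to each item's distance from the norm.

    Toy reading of discrepancy reduction: the further an item is from the
    desired level of learning, the more study time it receives.
    """
    gaps = [max(norm - m, 0.0) for m in mastery]
    total_gap = sum(gaps)
    if total_gap == 0:
        return [0.0] * len(mastery)  # everything already at the norm
    return [budget * gap / total_gap for gap in gaps]


def proximal_learning(mastery, norm, budget, time_per_item):
    """Study unmastered items from easiest to hardest until time runs out.

    Toy reading of the region-of-proximal-learning idea: items just beyond
    current mastery are studied first, so hard items are reached only if
    the budget allows.
    """
    allocation = [0.0] * len(mastery)
    unmastered = [i for i, m in enumerate(mastery) if m < norm]
    # Highest current mastery first, i.e. closest to the norm.
    unmastered.sort(key=lambda i: mastery[i], reverse=True)
    remaining = budget
    for i in unmastered:
        if remaining <= 0:
            break
        spend = min(time_per_item, remaining)
        allocation[i] = spend
        remaining -= spend
    return allocation


# Hypothetical current learning of an easy, a medium, and a hard item.
mastery = [0.9, 0.6, 0.2]
print(discrepancy_reduction(mastery, norm=1.0, budget=10))
print(proximal_learning(mastery, norm=1.0, budget=10, time_per_item=6))
```

With this tight budget the second rule never reaches the hardest item, which is the qualitative pattern described above for study under time pressure; raising `budget` lets both rules devote time to the hard item.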
Other related findings support the general claim that learners selectively encode material of greater interest or value. Higher incentives for retention lead to superior memory than do lower incentives in both shorter-term (Weiner & Walker, 1966) and longer-term (Heyer & O’Kelly, 1949) memory tasks. Other results show that subjects achieve this effect in part either by shirking concomitant goals, such as secondary-task performance (Wickens & Simpson, 1968), or by deliberately avoiding the encoding of potentially interfering and irrelevant information. Castel, Benjamin, Craik, and Watkins (2002) reported a task in which subjects were awarded a memory score based not on the total number of items recalled, but rather the total “point” value of the recalled words. During study, words were assigned an arbitrary point value ranging from 1 to 12. The results revealed that subjects were clearly able to selectively retain the highly valued items, and that older adults (for whom declining memory ability places an even greater value on the ability to selectively encode important and ignore unimportant details) were even more effective than younger subjects in doing so. This success was due in part to a strategy of willfully ignoring or failing to encode items of low value, a strategy they acquired over the course of repeated testing (see also the chapter by Castel, this volume).

B.
HOW DOES INFORMATION GET ENCODED?
Once learners make a decision that some piece of information is worthy of learning, they must make additional choices about how to do so. Two ways in which learners appear to control the means of encoding are by actively varying the processing they engage in and by controlling the scheduling of study events.

1.
Controlling Processing at Study
One of the major achievements of research on human memory from the last 50 years is an impressive catalogue of encoding variables and manipulations that affect memory. An informed and intuitive student should be able to take advantage of such a wealth of knowledge by using effective encoding strategies. However, the evidence on learners’ abilities to successfully control encoding at study is mixed, and the recent literature is somewhat sparse. One domain in which to look for such evidence is in the effects of test expectancy on memory performance. If knowledge of the nature of the upcoming memory test elicits superior performance relative to a group that lacks such knowledge, then learners must be catering their study strategy somewhat effectively to the demands of the test. Neely and Balota (1981; see also Balota & Neely, 1980) have shown that subjects who expect a test of recall (as opposed to a test of recognition) exhibit superior performance on either recall or recognition. Subjects expecting recall even demonstrate better memory for the order of studied items (Leonard & Whitten, 1983), suggesting that they simply work harder to learn the material when they expect the demands of the test to be greater, as they are on a test of recall. This pattern illustrates satisficing behavior (Simon, 1957): learners distribute resources to achieve at least (but no more than) some predetermined standard for performance; they do not attempt to maximize performance under all conditions. The preceding results suggest that subjects optimize encoding by selectively attending to important materials and ignoring irrelevant stimuli, and by deliberately engaging in encoding suited to the difficulty demands of the material and the upcoming test. However, other data reveal failures to do so effectively.
For example, there are conditions in which a simple orienting instruction to perform “deep” processing on to-be-learned names can lead to better memory than self-guided learning, at least in older adults
(Troyer, Hafliger, Cadieux, & Craik, 2006). Clearly, the choices learners make in committing information to memory are suboptimal if retention can be improved by a simple orienting task that could have been implemented with little cost. A similar conclusion can be drawn from studies of the generation effect, in which subjects who self-generate portions of a to-be-remembered stimulus show better memory than subjects who merely read that stimulus passively. Instructions to engage in active imagery during encoding eliminate the generation effect by increasing performance for read items up to the level of generated items (Begg, Vinski, Frankovich, & Holgate, 1991; see also McDaniel, Waddill, & Einstein, 1988), as do instructions about the nature of the memory test and how best to prepare for it (deWinstanley & Bjork, 1997). These data reveal that the typical disadvantage of reading can be offset by employing more active processing during reading (see also Bjork, deWinstanley, & Storm, in press), something learners do not, apparently, spontaneously do. The apparent inability of learners to strategically use generation as a means to ensure effective encoding must be qualified by experiments examining the relationship between manipulations of encoding and judgments of learning (JOLs). If subjects provide higher JOLs for encoding conditions that actually elicit superior performance, it suggests that learners appreciate the advantages afforded by the superior encoding condition, and it would stand to reason that they would implement it for materials that they desired strongly to learn. Yet, despite the fact that they do not spontaneously engage in those activities during reading that elevate performance, they do give higher JOLs for generated than read items (Begg et al., 1991; Mazzoni & Nelson, 1995). It thus appears as though learners appreciate the advantages of generative processing but are either unable or unwilling to utilize such processing in the absence of instruction.
Other data weigh in favor of a motivational over a cognitive interpretation of the disadvantage of reading: The generation advantage is considerably greater under incidental than intentional learning conditions (Watkins & Sechler, 1988), revealing that subjects are able to engage in processing that eliminates at least a portion of the difference between generating and reading when they know that their memory will be tested. A similar pattern may be evident with respect to the effect of depth-of-processing variables on retention (Craik & Lockhart, 1972). Subjects correctly predict higher levels of recall for more deeply processed materials (although they do underestimate the magnitude of the effect considerably; Shaw & Craik, 1989), even though they do not appear to always use it to their advantage, as suggested earlier. Spontaneously implemented deeper processing does appear to account for the superior retention of words in intentional over incidental
learning conditions (Hyde & Jenkins, 1969), however. Overall, data from manipulations of processing depth and generation appear to reveal differences in what subjects report (as assessed by experiments in which their predictions about encoding strategies are queried) and what they are able or willing to implement. Organization of material at the time of encoding can also have a profound effect on memory, as exemplified by the seminal concepts of chunking (Miller, 1956; Simon, 1974) and clustering (Bousfield, 1953) in memory. Allen (1968) provided a particularly good example of the functional value of organizational processes at encoding by comparing recall in a group of subjects that were instructed only to rehearse the current item being studied with a group given standard instructions to learn a list of words. The latter group had a large advantage on the later test, supporting the view that learners were using the intervals between items to selectively rehearse and relate materials from across the list, not just the current item. It is clear from these results and others that learners strategically employ organizational and mnemonic schemes in order to increase the effectiveness of memory encoding. However, there are important limits to this generality in behavior, both intellectual and ecological. On the intellectual side, metamnemonic skill limits knowledge of the effectiveness of strategies, and thereby influences choices. On the ecological side, subjects may be unwilling to engage processes that will not yield memories that are accessible under realistic time demands (Benjamin & Bjork, 2000; Lea, 1975). They also are in the position of balancing the demands of our experiments with the ongoing, much more relevant demands of their lives as college students.
Overall, it appears as though students are often willing to satisfice on memory tasks, and thus make choices that achieve a desired level of performance without expending more effort or resources than are necessary.

2.
Self‐Scheduling of Study Events
Learners also typically have control over the scheduling of events when many things need to be learned. A student may decide to study the material relevant for a final examination in one session or distribute study for multiple examinations throughout that time. They may decide to study immediately prior to an examination or long before. Spacing apart multiple presentations of to-be-learned material is one of the most effective ways of enhancing retention (Crowder, 1976). From a metacognitive perspective, it is doubly efficacious, as it incurs little cost to implement: Performance can be enhanced while keeping total study time constant. What does the literature reveal about learners using self-spacing as a means to enhance their memory? Again, the evidence is somewhat mixed.
Experiments in which subjects are exposed to both spaced study and massed study tend to report a preference for massing (Baddeley & Longman, 1978) and a sense of superior acquisition during massing (Simon & Bjork, 2001). These data might seem to suggest a false belief in the superiority of massing over spacing, but several caveats are in order. First, preferences about study strategy reflect more than its estimated efficaciousness; they reflect the desirability of implementation as well. Learners may be somewhat unwilling to go to the trouble of coscheduling multiple tasks. Second, subjects may choose to satisfice, rather than maximize, especially because performance on examinations and other real-world assessments is typically judged on an absolute scale. If learners feel that they can master the material sufficiently to meet their goals without imposing on themselves the burden of scheduling, they may desire to do so, even if they could perform even better under other conditions. Third, providing ratings and making explicit judgments may not adequately predict study behavior. Several experiments have explored subjects’ choices about massing and spacing in tasks in which they had to make on-line item-by-item choices about scheduling. Son (2004) showed that subjects were considerably more likely to space than mass to-be-learned materials when given the option. That result qualifies any claim that subjects do not appreciate the beneficial effects of spacing. Benjamin and Bird (2006) further showed that subjects were more likely to space normatively difficult items and mass the easy ones, indicating that subjects reserved the more effective study procedure for the more difficult materials. This result parallels the finding mentioned in the previous section that subjects typically spend more study time on more difficult materials.
However, under conditions in which the study time for each individual item was very short, this preference either disappeared (Benjamin & Bird, 2006) or reversed (Son, 2004). These latter data are consistent with the idea that subjects specifically choose strategies that apply to constraints of the learning situation. All of these data reveal that learners choose effective scheduling strategies for learning words, at least under some conditions. A related issue is what people understand about the relationship between the order of learning events and retention. Dunlosky and Matvey (2001) showed that subjects correctly predict enhanced recall for the first few items (or primacy items) in a list of unrelated words. Castel (2006) has shown that subjects predict reasonably accurate primacy and recency (enhanced memory for the last few items in a list) effects when their predictions are solicited prior to the presentation of the to-be-remembered item but not when queried after its presentation, suggesting that subjects have reasonably accurate knowledge of list-order effects but are overwhelmed by idiosyncratic item differences when making predictions in the presence of the items themselves (cf. Koriat, 1997). While learners may
be in possession of rudimentary knowledge about these order effects, they do not appear to be aware of the more transient nature of recency than primacy effects (Craik, 1970): during a test of immediate recall, subjects predict enhanced retention for both primacy and recency items on a delayed test (Benjamin, Bjork, & Schwartz, 1998b). Only the advantage for the primacy items remains after a delay, however, so those predictions reveal a lack of appreciation for the more complex relationship between item order and long-term retention. In addition, subjects who are allowed to select items for study based on their own monitoring of learning perform better on tests of memory than do subjects who are provided randomly ordered items for study (Atkinson, 1972). This result implies that subjects are able to use their self-assessments of learning to generate effective study orderings (although it is worth noting that subjects in the self-generated study group performed more poorly than subjects in a group whose study order was determined by an adaptive algorithm).

C.
LEARNING ABOUT ENCODING
The fact that learners make utilitarian decisions about what to encode and how to encode it based on their goals and assessments of the difficulty of the material, as well as on their limited knowledge of encoding strategies, raises the question of the origin of such strategic behavior. In this section, I outline several examples of how experience with relevant tasks fosters increasingly strategic behavior, which often leads to superior performance. In that sense, these are examples of “learning how to learn”: performance improvements come about as a function of increased metamnemonic skill.

1.
Stimulus Characteristics
An intriguing first clue in the search for improvement in encoding strategies is the fact that the relationship between monitoring accuracy (the degree to which learners can successfully predict which items they will remember and which they will not) and memory performance increases over the course of multiple study-test trials when subjects are allowed to self-select items for additional study (Thiede, 1999). Presumably, the mediating factor here is an increasing ability to discriminate between items that are needy of additional study and those that are not. Another example concerns the effects of word frequency on recognition. Subjects incorrectly predict superior recognition of common words, but correctly postdict superior recognition of uncommon words (Benjamin, 2003; Guttentag & Carroll, 1998), a point that will be reviewed in more detail in a later section. Benjamin (2003) further showed that, after engaging in such
postdictions, subjects correctly predict superior recognition of uncommon words when given another opportunity. The act of making explicit judgments during the test appeared to rectify their misconceptions about recognition. It did not, notably, simply act to change their opinions about what types of words are more memorable: Subjects who predicted recall rather than recognition on the second trial correctly predicted an advantage for common words.

2.
Encoding Strategies
Evidence reviewed earlier suggested that learners have somewhat incomplete knowledge of the relative advantages of effective encoding orientations, such as generating and deep processing. However, it was already noted that the disadvantage of reading relative to generating material can be offset by instructional manipulations (deWinstanley & Bjork, 1997), suggesting that these gaps of knowledge can be remedied. Here the question is: Is exposure to relevant conditions and consequent test performance sufficient to produce such changes? Let us examine the cases of generation and depth of processing that were reviewed earlier, and consider how experience with such encoding strategies affects knowledge of their relative efficaciousness. deWinstanley and Bjork (2004) conducted an experiment in which subjects read text passages and were tested on specific words from that passage that had been printed in a distinctive color. During study, those words were either read or generated from a fragment cue. On a fill-in-the-blank cued-recall test, previously generated items were remembered more often than previously read words. Subjects then repeated the procedure, but the results were quite different for the second test: Performance on the read items was elevated, leading to the absence of a generation effect. Experience with the different encoding procedures and observation of their respective outcomes led subjects to change how they read items in such a way as to eliminate the traditional disadvantage relative to generation. With respect to depth of processing and elaborative encoding schemes, several studies have shown how experience can ameliorate metacognitive failures. Matvey, Dunlosky, Shaw, Parks, and Hertzog (2002) showed an improvement in the degree to which mean JOLs approximated actual performance under deep encoding conditions. Subjects with experience showed a decreased (but nonetheless substantial) underappreciation of the value of deep encoding.
Dunlosky and Hertzog (2000) similarly showed that exposure to the differential outcomes of repetition and imagery encoding increased the difference between mean JOLs provided to items processed under those conditions. Brigham and Pressley (1988) showed a similar result in vocabulary learning when comparing the use of a keyword mnemonic with the less
effective technique of generating a semantic context: although subjects predicted no difference between these procedures ahead of time (Pressley, Levin, & Ghatala, 1984), experience with the mnemonic outcomes of the two encoding types increased understanding of the difference and even influenced later self‐selection of strategies for encoding.
All of the results in this section illustrate ways in which experience informs either judgments about the effectiveness of encoding strategies or choices about desirable encoding strategies. However, it is interesting that none of these studies provided evidence that the correlation between predictions and performance increased with experience. It may be that experience with strategies increases the degree to which predictions approximate performance (absolute accuracy) but not the ability to discriminate between items that will and won’t be remembered. One possibility is that, although subjects learn to recognize the effectiveness of one strategy over another, they underestimate the effects of that difference and overestimate the degree to which individual characteristics of the words drive performance (Koriat, 1997). This explains why correlations increased across trials when the manipulated variable was a characteristic intrinsic to the word (Benjamin, 2003), but not when it was a characteristic inherent to the processing performed on that word (Brigham & Pressley, 1988; Dunlosky & Hertzog, 2000; Matvey et al., 2002).
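The distinction drawn above between absolute accuracy and item-level discrimination can be made concrete with a small sketch. The studies cited do not specify here which statistics were used; the code below assumes two common choices, a signed difference between mean prediction and mean recall for absolute accuracy and a Goodman-Kruskal gamma correlation for discrimination. The function names and the toy data are hypothetical and serve only as illustration.

```python
from itertools import combinations


def absolute_bias(predictions, recalled):
    """Mean prediction minus mean recall (0 means predictions match performance on average)."""
    return sum(predictions) / len(predictions) - sum(recalled) / len(recalled)


def goodman_kruskal_gamma(predictions, recalled):
    """Gamma over all item pairs: (concordant - discordant) / (concordant + discordant).

    Pairs tied on either variable are ignored. Returns NaN if no untied pairs exist.
    """
    concordant = discordant = 0
    for (p1, r1), (p2, r2) in combinations(zip(predictions, recalled), 2):
        direction = (p1 - p2) * (r1 - r2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    if concordant + discordant == 0:
        return float("nan")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical data: four items, 1 = recalled, 0 = not recalled.
recalled = [1, 0, 0, 1]
trial1_predictions = [0.9, 0.7, 0.9, 0.7]  # overconfident on average
trial2_predictions = [0.6, 0.4, 0.6, 0.4]  # mean now matches performance

for label, preds in (("trial 1", trial1_predictions), ("trial 2", trial2_predictions)):
    print(label, round(absolute_bias(preds, recalled), 3), goodman_kruskal_gamma(preds, recalled))
```

In this toy case the mean overestimate disappears on the second trial while gamma stays at zero, which is the pattern the paragraph describes: predictions can come to approximate performance on average without any gain in the ability to tell which particular items will be remembered.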
3. Strategies for Association and Categorization
Another example of learning how to encode effectively comes from Finley and Benjamin (2007). In their experiment, subjects were exposed to paired‐associate terms and given either a cued‐ or a free‐recall test in which they were asked only to recall the second member of each pair. Two findings are relevant from that study, and are shown in Fig. 2. First, performance on the free‐recall test improved over trials, providing clear evidence of learning how to deal with the demands of the test. Second, a test‐expectancy manipulation (in which the final trial was either the same as or different from the previous four) revealed a large disadvantage for switching tests, regardless of which test it was. Unlike the previous examples from the test‐expectancy paradigm, in which it was shown that expecting a recall test increased performance regardless of the criterion test, these results indicate that subjects were effectively tailoring their encoding to the specific test that they expected, and that violating those expectancies left their memory representations ill‐suited to the new test. Specifically, subjects who learned to expect free recall engaged in more target–target association building and learned to ignore the cue words. Subjects who learned to expect cued recall associated each target word with its matched cue word.
Aaron S. Benjamin
[Figure 2 plot omitted. Y-axis: Performance (0.0 to 0.7). X-axis: Study-test phase (1, 2, 3, 4, Final). Series: Cued recall, Free recall, Unexpected cued recall, Unexpected free recall.]
Fig. 2. Performance on cued and free recall as a function of experience and test expectancy (on the final trial).
As described earlier, imposing an organization on material at encoding provides for more successful access and greater recall later when the same organization scheme is used at retrieval. Rabinowitz, Freeman, and Cohen (1992) showed that experience can promote the use of such an organizational strategy. In their experiment, subjects initially studied lists of items that varied in the difficulty with which they could be categorized. They then studied a second list of moderately categorizable stimuli. The amount of clustering seen in the recall of the second list and the overall performance on that list both increased with the degree to which the initial lists were categorizable. This result shows that conditions that promote use of an organizational strategy (in this case, the transparency of the category structure) fostered conditions in which subjects were likely to learn about the effectiveness of that strategy and then apply it in conditions in which they might not have otherwise.
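To make the notion of "amount of clustering" in a recall protocol concrete, the sketch below computes one simple index, the proportion of adjacent recalled items that share a category (often called the ratio of repetition). This is an assumption chosen for illustration; the text does not say which clustering measure Rabinowitz et al. used, and the word lists are hypothetical.

```python
def ratio_of_repetition(recall_order, category_of):
    """Proportion of adjacent pairs in the recall protocol that share a category."""
    if len(recall_order) < 2:
        return 0.0
    repetitions = sum(
        1
        for first, second in zip(recall_order, recall_order[1:])
        if category_of[first] == category_of[second]
    )
    return repetitions / (len(recall_order) - 1)


# Hypothetical categorized list and two hypothetical recall protocols.
category_of = {
    "apple": "fruit", "pear": "fruit", "plum": "fruit",
    "hammer": "tool", "saw": "tool", "drill": "tool",
}
clustered_recall = ["apple", "pear", "plum", "hammer", "saw", "drill"]
scattered_recall = ["apple", "hammer", "pear", "saw", "plum", "drill"]

print(ratio_of_repetition(clustered_recall, category_of))  # 4 of 5 adjacent pairs share a category
print(ratio_of_repetition(scattered_recall, category_of))  # no adjacent pair shares a category
```

A subject who has learned to apply an organizational strategy would be expected to produce protocols closer to the first pattern than the second.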
D. CONTROL OF ENCODING AS A MEANS OF CONTROL OVER MEMORY
This section has outlined ways in which learners can modulate encoding and encoding strategies in order to control, and effectively reduce, the demands on their memory. Rather than attempting to memorize everything, learners allocate resources commensurate with demands and difficulty. This is one major way in which learners can improve memory performance by improving memory skill. In the next several sections, however, we examine strategies
that learners can implement at the time of memory access and how they can be used to improve memory performance.
IV. Strategic Decisions About Memory Access
Once information is in memory, test‐takers must make decisions about how to access it. The choices they make should depend on the circumstances (such as the type of test), the urgency of the need for that information, or the respective costs and payoffs for errors or successful access, and they should also take into account the type and extent of previous learning or knowledge for the queried material. This section reviews evidence of how learners engage in such strategic processes in order to improve memory performance (see also Barnes et al., 1999). First, I consider the postulated means of memory access and how they differentially contribute to decisions depending on the time available to make those decisions. This is the short section of this chapter that considers processes inside the “memory box” in Fig. 1. The remainder of this section outlines ways in which the outputs from memory processes are flexibly used to guide performance in tasks that vary in their conditions and demands.
Access to memory is particularly troublesome for our friends, the lifeloggers. Although digital memory possesses a clear advantage for retaining information veridically, it does not have any capacity for self‐organization. Consequently, access is limited both by the user’s ability to retrieve, from their actual memories, relevant keywords or details to get a foot in the door and by the quality of the engineering underlying the organization of digital memory. As journalist Clive Thompson noted about the lifelogger Gordon Bell’s memory:
And it’s true, the information is all there, but he hasn’t quite figured out how to organize it and sort it perfectly. . . . So sometimes it was amazing, but a lot of times he would start to try to find something and then spend 20 minutes trying to find it! (Gladstone, January 5, 2007)
This example underscores again how important good control of memory is, even in a system with near flawless memory stores. Good organization at encoding and efficient retrieval plans ensure timely access to relevant information. Comprehensive encoding of a day’s events increases the need for efficient organization, and lifeloggers are already at an organizational disadvantage: with external memory, organization is a problem of engineering, rather than knowledge. The human memory system is remarkable in its capacity to self‐organize and reorganize; this is why knowledge and
expertise arise spontaneously out of the acquisition of facts and routines in humans but not in computers.
A. MEANS OF ACCESSING MEMORY TRACES
This short section serves as a prelude to the larger discussion of how learners strategically regulate memory access and flexibly use the outputs of memory processes to serve their needs. Given the stated goal of this intellectual enterprise as reducing memory and its processes to be as simple and few as possible, why does a memory system need the two routes of access (matching and retrieval) shown in Fig. 1? The brief answer to this question is that they serve the goal of accounting for two very general and opposing characteristics of human memory, characteristics that are revealed by almost every memory act we engage in (see also Malmberg, this volume).
The first characteristic is that of generalization gradients and consequent confusability: Errors in memory tasks are more likely than not to reveal phonological (Watson, Balota, & Sergent‐Marshall, 2001), orthographic (Underwood & Zimmerman, 1973), or semantic similarity (Roediger & McDermott, 1995) to sought‐after information (see also Matzen & Benjamin, 2007). This aspect of memory, which reveals something about the nature of its flexibility and not just its fallibility, is well accounted for by models that postulate distributed representations and memory access to a large set of memory traces in parallel (Eich, 1982; Hintzman, 1986; Murdock, 1993; Shiffrin & Steyvers, 1997), a process often termed matching.
Such a mechanism cannot be the sole means of accessing memory, however. Many tests of memory and most uses of memory require more complex output than a degree of match between a probe stimulus and the contents of memory; they necessitate qualitative output, such as the pronunciation of a word, the name of an acquaintance, or the combination to a locker. The matching process provides no means of producing such output from memory. The second major reason that memory must be more than matching is that items that match memory well do not always elicit a higher rate of false alarms on a test of recognition.
That is, there are conditions under which subjects accurately reject highly plausible items, their high match notwithstanding. This result suggests an additional retrieval mechanism that counteracts the matching mechanism, a claim that is supported by qualitative dissociations in the effects of manipulations of learning on false‐alarm rates (FARs). Here, I review a few examples that illustrate how the imposition of time pressure on responses can influence the relative contributions of matching and retrieval. The first two examples are from recognition tasks, and the third is from a metamemory task. This variety of tasks has been chosen deliberately in order to demonstrate the range of phenomena to which the current framework is intended to apply.
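The matching process described in this section can be illustrated with a minimal sketch in the spirit of the global-matching models cited above. It is not an implementation of any one of them: the feature coding (+1, -1, 0), the cubing of similarity before summation (a choice borrowed from Hintzman-style models), and all vectors below are assumptions made for illustration.

```python
def similarity(probe, trace):
    """Normalized dot product over features that are nonzero in the probe or the trace."""
    relevant = sum(1 for p, t in zip(probe, trace) if p != 0 or t != 0)
    if relevant == 0:
        return 0.0
    return sum(p * t for p, t in zip(probe, trace)) / relevant


def match_strength(probe, traces):
    """Sum of cubed similarities between the probe and every stored trace, computed in parallel.

    The result is a single number: a degree of match, with no retrievable content.
    """
    return sum(similarity(probe, trace) ** 3 for trace in traces)


# Hypothetical stored traces (features coded +1 / -1).
traces = [
    [1, 1, 1, 1, -1, -1],
    [1, 1, 1, -1, -1, -1],
    [-1, 1, -1, 1, 1, -1],
]
studied_item = [1, 1, 1, 1, -1, -1]    # identical to the first trace
related_lure = [1, 1, 1, 1, -1, 1]     # differs from the first trace in one feature
unrelated_lure = [-1, -1, 1, -1, 1, 1]

for label, probe in (("studied", studied_item), ("related", related_lure), ("unrelated", unrelated_lure)):
    print(label, round(match_strength(probe, traces), 3))
```

The ordering of the three values (studied above related above unrelated) is the generalization gradient: an unstudied but similar probe still yields a substantial match. The function also returns only a scalar, which is why matching alone cannot supply qualitative output such as a name or a locker combination.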
[Figure 3 plots omitted. Panels are arranged in two rows: “Tests with NO time pressure” (top) and “Tests WITH time pressure” (bottom). Recoverable legends and axes: Massed study vs. Spaced study, p("yes") for Old items and Contraindicated items; Studied once vs. Studied thrice, p("yes") for Old items and Unstudied related items; Cue duration (Short vs. Long), Mean judgment.]
Fig. 3. Three examples of how time pressure changes the contributions of matching and retrieval to memory decisions.
Figure 3 shows the relevant data from the three experiments reviewed briefly here. The far left panel shows FARs (right bars) to words studied in a list that subjects were instructed to exclude (cf. Jacoby, 1991) at test (Benjamin & Craik, 2001). The abscissa represents a manipulation of learning in that experiment (spacing) and the two conditions represented vertically correspond to no time pressure (top panel) and time pressure (bottom panel). Note the critical pattern: The rate of errors decreased with additional learning (i.e., greater spacing) when the test imposed no time pressure, but increased with additional learning when the test was speeded. The middle panel shows a similar result in a very different paradigm (Benjamin, 2001). In this case, FARs are shown to words that were associatively related to lists of words that were studied either once or three times. In the top panel, in which no time pressure was imposed on the recognition judgments, errors decreased with additional list learning; in the bottom panel, in which there was time pressure, errors increased with additional learning. The final example is from a task in which subjects were required to make judgments about their ability to recall the second member of a cue‐associate pair when presented with the first term. Subjects were provided with the cue (first) term and asked to make their judgments under time pressure or under no time pressure. The data show how judgments varied as a function of how well the cue was previously learned (it had been previously presented for
either 1 or 5 s). Under unspeeded conditions, that manipulation, which has no effect on performance in the task of recalling the other word, also has no effect on judgments. However, under speeded conditions, subjects provided higher estimates of being able to recall the target term when the cue term was previously more well learned (Benjamin, 2005b).
Each of these examples demonstrates how matching and retrieval differentially contribute to decisions made under time pressure or under no time pressure. Conditions of time pressure allow the matching process to proceed unimpeded but uncountered by the retrieval process; consequently, what is seen is the pure matching generalization gradient mentioned earlier: Subjects false alarm more to stimuli that were more well learned (Benjamin & Craik, 2001) or related to more well‐learned material (Benjamin, 2001), and incorrectly incorporate the match of a cue to memory in predicting memory for a related target (Benjamin, 2005b; see also Reder, 1987).
These examples are intended to illustrate two important facts. First is the necessity of (at least) two processes in contributing to memory behavior generally, and the apparent partial independence of these processes. The second fact leads us into the next section: Learners can strategically use these two processes, differentially or in concert, to elicit information that can contribute to memory decisions, depending on the time pressure to make those decisions. However, I provided no evidence here that this effect is purely strategic; it could be the case that subjects under both conditions do the same thing, but that test conditions limit the type of information that is available when the decision is required to be made. The remainder of this section outlines more specific evidence for strategic processing during and after memory access.
B. DECISIONS ABOUT HOW TO ACCESS MEMORY
The previous section outlined ways in which memory can be accessed: either by matching a probe, which is fast, or by retrieving associated information, which is slow. In this section, I consider ways in which those processes are used selectively or strategically in order to fulfill the demands of a variety of memory tasks. A reasonable first place to start looking for evidence of strategic control of memory retrieval is the effect of incentives. Although incentives have clear effects when they are provided at the time of encoding, the effects at retrieval are less clear. Several reports have concluded that incentives at retrieval do not affect performance and are thus not under strategic control (Weiner, 1966; Wickens & Simpson, 1968). However, as we shall see later, tasks that afford the rememberer an opportunity to remember more if they work harder (by searching memory for a longer time, for example) do show effects of incentives at retrieval.
1. To Retrieve or Not to Retrieve?
The first decision that must be made by someone preparing to answer a question or evaluate how they know someone or something is whether to go to the trouble of querying memory. Subjects can accurately report their absence of knowledge very quickly for questions with unfamiliar terms (Glucksberg & McCloskey, 1981). Similarly, unfamiliar, distant locations elicit rapid judgments of never having been visited (Kolers & Palef, 1976). These data and others like them are well accounted for by a model that presumes that the queried terms in a memory probe like a question are matched to memory rapidly to determine whether any information is present in memory that would allow possible retrieval of the answer (Atkinson & Juola, 1973); when that match is low, a rapid response indicating the absence of information (either “don’t know” to a question or “no” to a recognition query) is made. Similar evidence is apparent in tasks in which people make explicit judgments about their ability to recognize currently unrecallable information (Hart, 1965; see also Koriat & Lieblich, 1974). Such feeling‐of‐knowing judgments increase spuriously when the terms in the memory query are made familiar (Metcalfe, Schwartz, & Joaquim, 1993; Schwartz & Metcalfe, 1992). The same pattern emerges, as noted earlier, when subjects are forced to make very rapid decisions about their ability to retrieve information (Benjamin, 2006; Reder, 1987; Reder & Ritter, 1992). These data indicate that the fast matching process is used as a mechanism to determine the usefulness of more fruitful, but more costly, retrieval access.
2. Retrieval Versus Plausible Inference
Choosing not to access memory does not necessarily mean that someone is willing to profess ignorance. I choose not to search my memory when evaluating whether a Ford Model T automobile had an on‐board navigation system, but nonetheless might respond confidently that it did not if I have some knowledge of the development of automobile technology. Under many circumstances, the truth of information can either be directly retrieved from memory, with some probability of success and some associated cost, or more simply evaluated for its plausibility, an inferential strategy that presumably has somewhat lower cost and perhaps also a lower probability of success (Reder, 1982). Such inferences are faster than retrieval, and, unlike memory‐based recognition judgments, become quicker rather than slower with an increase in the number of relevant facts stored in memory (Reder & Ross, 1983; cf. Anderson, 1974). Conditions in which memory for the relevant material is not particularly strong, such as after a substantial delay, tend to elicit plausibility judgment
strategy use instead of memory retrieval (Reder & Wible, 1984), a policy that would appear to reflect a correct assessment of the decreasing probabilities of success of a retrieval attempt with time. Similarly, Reder (1987) showed that the rate with which people choose to make inferences could be increased by decreasing the proportion of questions that matched previously presented statements or by explicitly instructing subjects to evaluate plausibility (see also Gauld & Stephenson, 1967). It has been argued that choice of strategy is informed in part by a rapid assessment of the degree to which the query is familiar (Reder & Ritter, 1992); this claim is the same as the one described earlier, in which decisions about the value of a retrieval attempt are based in part on the outcome of a rapid match of the memory query to the relevant contents of memory.
3. Retrieval Plans, Search Order, and Output Order
Some memory tasks are difficult not because the material is poorly learned, but because accessing it in an effective way is difficult. Many of us know all 50 of the United States, but attempting to list them typically proves more difficult than one might expect. Rememberers who are successful have an effective retrieval plan, either by taking advantage of structure within the material (Bower, 1970), by taking encoding context into consideration (Thomson & Tulving, 1970; Tulving & Thomson, 1973), or by having a more general strategy (Anderson, 1972; Kintsch, 1974; Shiffrin, 1970). Evidence for the implementation of such a retrieval plan is provided by the fact that recall protocols become more homogeneous (that is, they display more consistent patterns) over multiple trials of free recall (Bousfield & Puff, 1964).
Rememberers may be in a position in which the order with which they recall material is almost as important as remembering it at all. Remembering the steps of a mathematical proof, but not their correct order, might be useless if the student doesn’t have the knowledge needed to order those steps appropriately. In addition, some orders with which we query our memory lead to more effective recall than others. Whitten and Leonard (1981) showed, for example, that retrieving names of teachers from early schooling years was more accurate when starting with later years and moving to earlier ones, rather than vice versa. The advantage of that order may lie in the fact that successful retrieval, which is more likely at the later, more recent years, provides additional prompts for retrieval of more difficult information.
Subjects also clearly provide some subjective organizations to study materials that guide their own later recall. Analyses of output order have shown that preexisting semantic relationships guide the order with which items are recalled (Howard & Kahana, 2002; Rundus, 1971; Tulving, 1972).
Blocking of categorized words at study increases recall (D’Agostino, 1969; Cofer, Bruce, & Reicher, 1966), presumably by increasing the probability of detecting relationships or by helping to formulate a retrieval plan (Slamecka, 1968). Related items are more likely to occur near one another in the output protocol (Bousfield, 1953), and display shorter interresponse times than unrelated items (Patterson, Meltzer, & Mandler, 1971). Consecutively recalled items are also more likely to have occurred near one another in the study list, and also display shorter interresponse times (Kahana, 1996). It thus appears clear that subjects use both semantics and input order as a means of guiding their search through memory, both of which influence recall output order. Subjects also appear to have a strategy for search and output with respect to serial position. Although JOLs do not reveal an understanding of the superiority of longer‐term retention for primacy than recency items (Benjamin et al., 1998b), subjects are likely to recall recency items early in their output protocols, especially with experience (Beaman & Morton, 2000; Deese & Kaufman, 1957). They also tend to output primacy items fairly early, but not as early as recency items. Subjects may even selectively recall first those items that they have had trouble recalling previously (Battig, Allen, & Jensen, 1965), indicating an appreciation for the transience of accessibility to some memory traces. The advantage for this strategy is made apparent by experiments that manipulate whether recall is forced in a forward or backward order (Cowan et al., 1992), experiments that elicit only a partial report of the set of items (Brown, 1954; Healy, Fendrich, Cunningham, & Till, 1987), and experiments in which the starting position within the input set is forced (Cowan, Saults, Elliot, & Moreno, 2002). 
In each of these research domains, it is evident that the magnitude of primacy and recency effects in list recall is strongly affected by output order, and that early output of items confers on them a considerable advantage. Thus, by recalling currently accessible but poorly learned items first, subjects put themselves in a position of having greater total recall output.
4. Constructing Probes for Memory Access
Whether memory is to be accessed via matching or retrieval, the rememberer must make an important decision about how to query memory. The problem was noted in early work by Tulving and Pearlstone (1966; see also Dong & Kintsch, 1968), who demonstrated that category cues increased recall of members from categorized word lists when compared with pure free recall. This result reveals that success in memory access is driven in part by the successful generation of memory probes, and the general problem is similar to one faced in information retrieval systems in general: How can I
successfully limit the response to my query to the most relevant and useful information, but no more? If I intend to search the Internet for information about my friend Adam Jones, do I use the same strategy as when searching for his wife Ophelia Dionysios Cottonwood? Most likely, I recognize the fact that the former search will provide much irrelevant information that I will have to sift through, postaccess, in order to find what I need. I might thus further restrict my search by entering other information that should increase the relevance of the set of returned items, perhaps by using his home town, or employer. Cues must thus be sufficient for access while still maintaining a reasonably high signal‐to‐noise ratio in the elicited information. We do something similar with accessing our memory.
This phenomenon can be seen in data from an experiment by Diaz and Benjamin (2007), in which the exclusion paradigm mentioned earlier (where subjects are instructed to endorse only a subset of the previously studied items) was applied to the traditional short‐term memory scanning task of Sternberg (1966). Subjects studied multiple short lists of words, some of which were printed in red and some in blue. After each list and a short distraction interval, subjects were presented with one of the two colors and then an item. The task for the subject was to endorse the item if and only if it had been presented in the queried color. Figure 4 shows the relevant data, and reveals an important effect: Although response times (RTs) increased with the number of items studied in the queried color, they did not vary with the number of items studied in the unqueried color. This datum indicates that subjects were effectively restricting the subset of items that were probed in memory to include only relevant ones.
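The logic of the Diaz and Benjamin (2007) result can be sketched with a toy prediction model. It assumes, purely for illustration, a Sternberg-style scan in which decision time grows linearly with the number of items actually probed; the function name, the intercept, and the per-item slope are hypothetical values and are not fitted to the data in Fig. 4.

```python
def predicted_rt(n_queried, n_unqueried, restrict_to_queried_color=True,
                 base_ms=900.0, per_item_ms=60.0):
    """Predicted decision time under a linear scan of the probed set.

    If the probe is restricted by color, only items in the queried color are scanned;
    otherwise every studied item is scanned. Parameter values are illustrative only.
    """
    scanned = n_queried if restrict_to_queried_color else n_queried + n_unqueried
    return base_ms + per_item_ms * scanned


# Restricted probe: time depends on the queried set size only.
for n_unqueried in (1, 2, 3):
    print("restricted", n_unqueried, predicted_rt(2, n_unqueried))

# Unrestricted probe: time would also rise with the unqueried set size.
for n_unqueried in (1, 2, 3):
    print("unrestricted", n_unqueried, predicted_rt(2, n_unqueried, restrict_to_queried_color=False))
```

Only the restricted version produces the observed signature, in which RTs rise with the queried set size and stay flat across the unqueried set size.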
Similar results have been found in tasks in which people are queried about facts about numerous fictitious people: RTs vary with the number of facts studied about a particular person, but not with the number of facts studied in the very same session about other people (McCloskey & Bigler, 1980).
Given the evidence that we can construct probes that increase the relevance of the output from memory, we turn now to the question of whether probes can be tailored to the previous conditions of encoding. It is a well‐known principle that retrieval is maximally effective when the processes instantiated at encoding and retrieval are similar (Tulving & Thomson, 1973). Do people intentionally reinstate processes at test to maximize memory performance? Recent evidence bearing on this question comes from experiments in which subjects are tested on their memory for the distractors that were presented during a previous recognition test. Memory for those distractors reflects the type of qualitative processing that was performed on them during the previous test. For example, if subjects made recognition judgments by evaluating the phonology of test items, then their memory for distractors from that test will be poorer than if they had evaluated the semantic aspects of the test items. This result was found by Jacoby, Shimizu, Daniels, and Rhodes (2005) in a task in which depth of processing was manipulated during the original encoding phase. Subjects who engaged in deep processing during encoding showed superior memory on a second, later test of distractors that were present and evaluated on the earlier recognition test that immediately followed the encoding phase. This result indicates that subjects shaped the memory query to match the type of information they expected to have gleaned from the processing induction during encoding, although older subjects appear to not be able to do so successfully (Jacoby, Shimizu, Velanova, & Rhodes, 2005).
[Figure 4 plot omitted. Y-axis: Median decision time (1000 to 1400). X-axis: Number of items in set (1, 2, 3). Series: Queried color, Unqueried color.]
Fig. 4. Mean RTs for exclusion decisions as a function of the set size of items in the unqueried and queried colors.
5. Continuing or Discontinuing Search of Memory
Once a query has been submitted to memory and information is being extracted, the rememberer must decide if and when to cease search. Search may continue by changing the query slightly, or by using retrieved information to bootstrap oneself to more relevant or a greater number of relevant results. Or it may cease if the rememberer feels that they either have sufficient information to proceed with whatever larger task they are engaged in or that they have little hope of successfully mining any more useful information from memory. We have already discussed one way in which this can take place: If a match to memory reveals that the terms in the probe match memory poorly, a decision may be made not to search memory at all. But what about cases in which active search has begun and changes to the memory probe must be implemented in order to further that progress? To return to the example of recalling the states of the United States, you might try an alphabetic strategy,
a geographic strategy, or even a political strategy. Chances are that you will use one until it becomes unproductive, and then switch to another. These spontaneous and idiosyncratic changes in memory queries have been revealed in a number of tasks. For example, Walker and Kintsch (1985) noted that recall of members of real‐world categories (e.g., kitchen utensils) showed evidence for subjects querying their memory with personal and distinctive cues (e.g., breakfast this morning), particularly after the first few obvious members were generated and reported. Others have reported a variety of search strategies in similar tasks (Whitten & Leonard, 1981; Williams & Santos‐Williams, 1980). The SAM model of free recall (Raaijmakers & Shiffrin, 1981; see also Gronlund & Shiffrin, 1986) dynamically implements changes to memory queries by including in the probe the most recently recalled relevant item. Each of these results shows that subjects (or SAM) dynamically modify a memory query in order to bootstrap their way to maximal recall. Earlier models by Restle (1964) and Polson (1972) similarly suggested that subjects modify their retrieval strategy when there is a failure of recall.
Evidence that recall success is largely determined by the quality of the memory probe and not by forgetting comes from the list‐length paradigm, in which longer lists of words lead to less accurate recall of individual members of those lists than shorter lists. Shiffrin (1970) showed that overall recall performance depended entirely on the length of the to‐be‐recalled list and not at all on the length of a list of words interpolated between the critical list and the recall test. This result speaks strongly to the claim that creating an effective memory probe is more important in promoting performance than is minimizing forgetting.
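The idea of a probe that is updated with the most recently recalled item, as attributed to SAM above, can be sketched as follows. This is a deliberately simplified illustration and not the SAM model itself: it omits the recovery stage and the learning that SAM applies during retrieval, and the strength values, the `residual` default, and the stopping rule based on a fixed number of failures are all assumptions.

```python
import random


def recall_with_updating_probe(context_strength, item_strength,
                               max_failures=20, residual=0.1, seed=0):
    """Sample items with probability proportional to their strength to the current probe.

    The probe always contains the list context; after a successful recall it also
    contains the item just recalled. Sampling an already-recalled item counts as a
    failure and resets the probe to context alone. All strengths must be positive.
    """
    rng = random.Random(seed)
    items = list(context_strength)
    recalled = []
    last_recalled = None
    failures = 0
    while failures < max_failures and len(recalled) < len(items):
        weights = []
        for item in items:
            weight = context_strength[item]
            if last_recalled is not None:
                # Multiply in the association between the last recalled item and this item.
                weight *= item_strength.get((last_recalled, item), residual)
            weights.append(weight)
        sampled = rng.choices(items, weights=weights, k=1)[0]
        if sampled in recalled:
            failures += 1
            last_recalled = None  # fall back to a context-only probe
        else:
            recalled.append(sampled)
            last_recalled = sampled
    return recalled


# Hypothetical strengths: "cat" and "dog" are strongly associated with each other.
context_strength = {"cat": 1.0, "dog": 1.0, "saw": 1.0}
item_strength = {("cat", "dog"): 2.0, ("dog", "cat"): 2.0}
print(recall_with_updating_probe(context_strength, item_strength, seed=1))
```

Because each successful recall changes the probe, strongly associated items tend to be sampled in succession, which is one way such a scheme lets the rememberer bootstrap toward further recall.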
So far, I have reviewed data that support the claim that subjects dynamically modify their retrieval probes during recall, and that the success of the venture depends largely on the quality of that probe. A rememberer may also have to make decisions about when to terminate search of memory. An experiment by Young (2004) investigated how people make decisions about when to cease memory search in a task in which subjects were asked to recall as many exemplars as possible from two different categories in a limited time period. She found that subjects spent more time searching a category if it was of normatively higher potency (i.e., from which more items were typically retrieved) and also that subjects ceased search and switched to a second category sooner when that second category was of relatively higher potency. In addition, higher feeling‐of‐knowing judgments predict longer search times in memory (Costermans, Lories, & Ansay, 1992; Nelson, Gerler, & Narens, 1984), revealing that rememberers wisely search for a longer time when they believe that that search has a higher probability of success. A failure of this
particular control process has even been evaluated as a basis for poorer memory performance in the elderly (Lachman, Lachman, & Thronesbery, 1979). When subjects cannot access information that they desire to retrieve, high FOKs may even drive people to continue search outside of their memory stores (by querying other people or by searching through their lifelogs) when they have confidence in their ability to recognize that information on contact. Rememberers also search for a longer time when the incentives for success are higher. Loftus and Wickens (1970) showed superior memory for paired‐associates associated with higher incentives, even when those incentives were only provided at test. Latencies to provide either correct responses or errors were longer when the incentives were higher, revealing that subjects were willing to search their memories longer, and that they gained something by doing so. Barnes et al. (1999) extended this result to semantic knowledge by asking subjects general knowledge questions and separately varying the incentives for correct retrieval and the costs of search (by penalizing subjects for the time taken to provide a response). Their results clearly confirmed that subjects are willing to search longer when the rewards are greater, and that they cut their search time short when the costs are greater. These behaviors, in which subjects appear to make smart decisions about which category is more likely to support higher levels of overall success or appear to incorporate knowledge about the respective costs and benefits of retrieval into their decisions to retrieve, map neatly onto the basic assumptions of the rational analysis of memory (Anderson & Milson, 1989; Anderson & Schooler, 1991). That framework proposes that search of memory continues only if the estimated utility of retrieving additional information exceeds the cost of searching for it.
This framework provides a good example of how the strategies that govern retrieval and retrieval success are determined by a complex interplay of goals, motivational factors, and metacognitive assessments.
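The stopping rule just described can be illustrated with a small sketch. It is a toy under stated assumptions, not the model of Anderson and Milson (1989): the function name `search_steps`, the geometric decline in the chance of success, and every parameter value are hypothetical choices made only for illustration.

```python
def search_steps(gain, cost_per_step, p_start=0.5, decay=0.8, max_steps=1000):
    """Count how many memory probes are made before search is abandoned.

    Assumption (illustrative only): the chance that the next probe succeeds
    starts at p_start and shrinks by a factor of `decay` after each probe.
    Search continues only while the expected gain of one more probe
    exceeds the cost of making it.
    """
    steps = 0
    p_success = p_start
    while steps < max_steps and p_success * gain > cost_per_step:
        steps += 1
        p_success *= decay
    return steps


baseline = search_steps(gain=10.0, cost_per_step=1.0)
low_incentive = search_steps(gain=5.0, cost_per_step=1.0)
high_cost = search_steps(gain=10.0, cost_per_step=2.0)
print(baseline, low_incentive, high_cost)
```

With these made-up parameters, a larger reward lengthens search and a larger per-probe cost shortens it, which is the qualitative pattern reported by Barnes et al. (1999).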
C. LEARNING ABOUT MEMORY ACCESS
There are several lessons one could learn about accessing memory that could improve performance. First, subjects may learn effective ways of accessing material, for example, by reducing output interference. Second, they might acquire more efficient retrieval plans with experience. Finally, they might adjust relevant parameters for decision making, such as response criteria, to more accurately match the demands, payoffs, or base‐rate probabilities that they can only assess accurately with experience.
1. Understanding the Value of Self‐Testing
A recurring theme in this chapter has been how memory access can be profitably used because of its diagnosticity about levels of learning or states of knowledge. The outcomes of a matching decision can be used to decide, for example, whether to search memory (Reder, 1987) and for how long (Costermans et al., 1992), or to estimate the likelihood of successful recognition (Hart, 1967). The outcome of a retrieval event can be used to predict future memory performance (Benjamin et al., 1998b) or to foster further study or more effective encoding (Battig et al., 1965). Can subjects learn about the value of self‐testing and use that knowledge to increase memory performance? Clearly, subjects do use retrieval to some degree as a means of keeping information active or retarding forgetting (Rundus, 1971). The evidence for this claim comes from tasks in which rehearsal is prohibited by task demands (Glanzer & Meinzer, 1967) and from experiments that elicit overt rehearsals (Ward & Tan, 2004). But how well do they learn to use self‐testing or retrieval strategies with experience? A relevant study was reported by Dunlosky, Kubat‐Silman, and Hertzog (2003) with elderly subjects (who are less likely to spontaneously use self‐testing strategies than younger, college‐aged subjects; Murphy, Schmitt, Caruso, & Sanders, 1987). In their experiment, subjects studied and were tested on their memory for two lists of paired associates. In the approximately two‐week period between those study‐test events, subjects were taught strategies for successful encoding (such as imagery), and some were additionally taught to use self‐testing as a means of assessing their own states of knowledge. Compared to a control group that had no instruction between the two tests, subjects who learned encoding strategies showed improved performance across the two study‐test events. More importantly, however, the group that learned self‐testing in addition to those strategies outperformed both other groups.
Thus, subjects can improve memory performance by improving the quality of their monitoring of their own learning via self‐testing.
2. Formulating a Better Retrieval Plan
Can subjects improve their performance on tests by developing better retrieval plans? One example of successful strategy adaptation can be seen in the results of Conover and Brown (1977), who had subjects engage in multiple study‐recall trials. They showed that subjects were increasingly likely to output the recency items first with experience. This is a wise strategy, as those items are not typically well learned and will be forgotten if they are not output early. Consequently, the magnitude of the recency effect increased over lists (see also Maskarinec & Brown, 1974).
Our earlier discussion of retrieval plans emphasized the criticality of generating effective retrieval cues in promoting successful retrieval. We reviewed evidence that subjects use personal (Walker & Kintsch, 1985), semantic (Rundus, 1971), encoding‐matched (Tulving & Thomson, 1973), and list order (Kahana, 1996) cues in service of fostering retrieval, and briefly described one model that has a retrieval algorithm that updates its cue regularly. Other models propose that the act of learning to retrieve is the process of working through possible cues until effective ones are found (Halff, 1977). Theories propose a “win‐stay/lose‐shift” strategy, by which cues are only varied when they fail to elicit a sought‐after memory (Polson, Restle, & Polson, 1965; Restle, 1962), and that successful retrieval makes certain retrieval cues more accessible and thus more likely to be used (Izawa, 1971). All of these suggestions have in common the constantly developing nature of a retrieval plan that makes retrieval more likely to be successful. The basic assumptions of this perspective are borne out by several related findings. First, over multiple recall tests from the same study list, increased clustering of categories is evident (Mulligan, 2001). Second, the phenomenon of hypermnesia occurs in part because the increasing efficiency of retrieval strategies over multiple tests limits the degree of forgetting with time, thus allowing reminiscence (remembering items that were not remembered earlier) to outweigh forgetting and produce net gains in recall (Hunt & McDaniel, 1993; McDaniel, Moore, & Whiteman, 1998; Mulligan, 2001).
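The win‐stay/lose‐shift idea can be made concrete with a minimal sketch. It is an illustration only and not an implementation of any of the cited models: the function name, the dictionary standing in for memory, and the example cues and items are all hypothetical.

```python
def recall_win_stay_lose_shift(memory, cues):
    """Recall items by keeping a cue while it works and shifting when it fails.

    memory: dict mapping each cue to the items it can retrieve, in order of
            accessibility (a hypothetical stand-in for associative strength).
    cues:   cues to try, in the order the rememberer would adopt them.
    """
    recalled = []
    for cue in cues:
        while True:
            # Probe memory with the current cue for an item not yet reported.
            new_items = [item for item in memory.get(cue, []) if item not in recalled]
            if not new_items:
                break  # "lose": the cue failed, so shift to the next cue
            recalled.append(new_items[0])  # "win": stay with the same cue
    return recalled


memory = {
    "kitchen utensils": ["fork", "spoon", "knife"],
    "breakfast this morning": ["spoon", "toaster", "mug"],
}
output = recall_win_stay_lose_shift(
    memory, ["kitchen utensils", "breakfast this morning"]
)
print(output)
```

In this toy run the obvious category cue is exhausted first, and only then does the more personal cue contribute additional items, loosely mirroring the pattern described by Walker and Kintsch (1985).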
D. STRATEGIC MEMORY ACCESS AS A COGNITIVE SKILL
The examples outlined in this section have illustrated ways in which decisions about how to access memory can influence the success of remembering. These strategies are ones that are suited to the peculiarities of the human memory system and to the variety of demands faced by rememberers. Balancing those demands and executing control processes that are appropriate to those demands are a type of memory skill. How do lifeloggers cope with the demands of accessing the huge database of memory that they store on a daily basis? There appear to be both advantages and disadvantages of such copious capacity when it comes to retrieval. The advantages include the ready availability of useful retrieval probes: Pictures taken over the course of a day or copies of e‐mails, documents, and other files can serve as cues to remember intentions and goals that have been forgotten (see Einstein & McDaniel, 2005; this volume). Alan Smeaton, a professor of Computing at Dublin City University, reported that a rapid review of the day’s events provides extra useful retrieval cues:
If, at the end of each workday, they [lifeloggers] spent a minute scrolling through the thousands of pictures the SenseCam had taken (a high‐speed replay of their day), it had the effect of stimulating their short‐term memory. “You’d actually remember things you’d already forgotten,” Smeaton says. “You’d see somebody you met in a corridor and had a two‐minute conversation with that you’d completely forgotten about. And you’d go, ‘Oh, I forgot to send an email to that guy!’” (Thompson, 2006, p. 72)
This claim is borne out by evidence with both normal daily use (Sellen et al., 2007) and with amnesic patients (Hodges et al., 2006; Kapur, Glisky, & Wilson, 2002). By contrast, it is not evident that having a stack of photographs from the day’s events is a means to solving problems with memory efficiently. Selective encoding yields a library of memories that are highly relevant for future use and the further refinement of knowledge, whereas a stack of postcards is agnostic as to the differential importance of the day’s events. This qualitative difference in encoding is a serious disadvantage at this point in the process: Accessing a favorite book is considerably easier from the 100 or so volumes in my home library than from the millions in the university library, and I may be unable to generate the appropriate retrieval plan or memory probe to search effectively through a space that hasn’t been organized on the way in.
V. Postaccess Decision Processes
In some ways, the job for the rememberer has only just begun after the relevant information has been secured from the depths of memory. In many uses of memory, the task of remembering is trivially easy, but the decision about how and whether to respond is fraught with additional complexities. I might have a chance encounter with an indisputably familiar person in the library, but my interactions with that person are likely to be different if I know them to be an acquaintance from work than if I remember them from a grainy photograph on the wall of the post office.
A. SUPPRESSION OF OUTPUT
One interesting theoretical problem arises from the general view of memory that I have espoused here. If access is driven primarily by the quality of the memory probe, then repeated applications of that probe should elicit the same mnemonic products. Given the evidence that retrieval is a potentiator of memory strength (Bjork, 1975) and that retrieval of a subset of items decreases accessibility to the others (Anderson, Bjork, & Bjork, 1994), it would seem that recall would elicit a great number of repeated items.
This prediction is echoed in models that employ sampling with replacement as a basis for retrieval in recall (Shiffrin, 1970; Slamecka, 1969), an assumption made in part because interresponse times in free recall increase with the number of words previously recalled and decrease with the number of words left to recall (Murdock & Okada, 1970). However, this prediction is not correct: Repetitions in recall protocols are quite rare (Murdock & Okada, 1970). The answer seems to be a strategic, volitional suppression of repetitions in recall, as suggested by a model proposed by Rundus (1973). This claim is supported by two findings: First, subjects seem to have fairly accurate memory for what they have previously recalled (Gardiner & Klee, 1976; Robinson & Kulp, 1970). Second, instructions to subjects to be “uninhibited” in their recall (that is, to report everything that comes to mind as it comes to mind) lead to a much higher rate of repetitions than do standard recall instructions (Bousfield & Rosner, 1970). Similarly, a suppression of response feedback (by having subjects wear noise‐canceling headphones during verbal recall or write their answers on carbon‐copy paper on which they could not view their writing) led the number of repetitions in recall protocols to triple (Gardiner, Passmore, Herriot, & Klee, 1977). In addition, older adults, who appear to be more prone to retelling familiar stories to captive audiences, have more repetitions in their recall output and show poorer memory for what they have previously recalled (Koriat, Ben‐Zur, & Sheffer, 1988). These data all support the claim that the relative absence of repetitions in recall protocols is a consequence of a deliberate, strategic suppression of those responses. There is another occasion on which rememberers may want to suppress output.
Using memory in conversation and other “real‐world” circumstances places demands on the rememberer both to provide as much information as possible and to be accurate (Koriat & Goldsmith, 1996). Responding with irrelevant, redundant, or even misleading details can be worse than not responding at all, so decisions must be made about when to report the results of memory access and when to withhold reporting. This situation is analogous to the problem faced in evaluation of memory in eyewitnesses (Fisher, Geiselman, & Raymond, 1987; Hilgard & Loftus, 1979), for whom errors of commission carry quite different consequences than errors of omission (Fisher, Geiselman, & Amador, 1989). Indeed, incentives to be accurate or to produce a high quantity of recall output appear to affect recall protocols in a straightforward manner: Accuracy instructions reduce total output but increase the accuracy of output (Koriat & Goldsmith, 1994), even in children (Koriat, Goldsmith, Schneider, & Nakash‐Dura, 2001). Incentives to increase the quantity of output do not increase the number of items correctly recalled relative to a control condition (Barnes et al., 1999; Weiner, 1966), nor do encouragements to be lenient in
output (Erdelyi, 1970; Roediger & Payne, 1985). This combination of results suggests that this may be yet another domain in which having good control of memory is more important than having good memory: In free‐report situations, rememberers exert control over the accuracy of their output by deliberately failing to report retrieved information that they assess as having a low probability of being correct (Koriat & Goldsmith, 1996).
B. OUTPUT GRAIN
Even when someone has made a decision to respond overtly, additional choices must be made about how to respond. Just as someone might withhold information in order to be accurate, they might also choose a level of detail with which to report the contents of their memory search appropriate to an accuracy demand imposed by themselves or the situation. When I’m asked where I live by a neighbor in a local park, I indicate the approximate intersection, allowing them to localize my house to within a couple hundred feet. If a colleague at an international conference asks me the same question, I might respond with the city, the state, or even the country, allowing them to localize my house only within an area of 9 million km². In this case, the pragmatics of the situation dictate the trade‐off between the accuracy and the informativeness of my response (Grice, 1975; Yaniv & Foster, 1997). In other cases, explicit demands on estimation accuracy determine the coarseness of output (Einhorn & Hogarth, 1985; Erev, Wallsten, & Neal, 1991; Wallsten, Budescu, Rapoport, Zwick, & Forsyth, 1986). People exercise control over the precision or coarseness of their output in order to decrease errors of commission (Neisser, 1988), to reduce the effects of forgetting (Goldsmith, Koriat, & Pansky, 2005), and, most generally, to place themselves at an optimal point on the informativeness‐accuracy trade‐off function (Goldsmith, Koriat, & Weinberg‐Eliezer, 2002). Goldsmith et al. (2002) provided additional support that the choice mechanism underlying grain size choice in their tasks (in which subjects provided multiple answers to questions at different grain sizes and then chose one of them as a more desirable response) was a basic preference for more fine‐grained answers that could be vetoed when the assessed probability of that answer being correct was below some threshold.
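A preference for fine‐grained answers that can be vetoed by a probability threshold can be sketched in a few lines. This is a hedged illustration of the general idea, not the procedure of Goldsmith et al. (2002): the function name `choose_grain`, the fallback to the coarsest answer, and the example answers and probabilities are assumptions introduced here.

```python
def choose_grain(candidates, threshold):
    """Pick the most fine-grained answer whose assessed probability of being
    correct meets the threshold; otherwise fall back to the coarsest answer.

    candidates: list of (answer, assessed_probability), ordered fine to coarse.
    threshold:  minimum assessed probability the rememberer will accept.
    """
    for answer, probability in candidates:
        if probability >= threshold:
            return answer
    # Assumption: if every grain is vetoed, report the coarsest one anyway.
    return candidates[-1][0]


# Hypothetical answers to "When did it happen?", from fine to coarse.
answers = [
    ("3:15 pm", 0.40),
    ("between 3 and 4 pm", 0.75),
    ("in the afternoon", 0.95),
]
print(choose_grain(answers, threshold=0.70))
print(choose_grain(answers, threshold=0.90))
```

Raising the threshold, as an accuracy incentive might, pushes the chosen answer toward coarser grains.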
This mechanism is thus essentially the same as the one that is presumed to govern the choice of responding or withholding a response, as discussed in the previous section.
C. CRITERION PLACEMENT AND ADJUSTMENT IN RECOGNITION
To this point, this section has reviewed how qualitative evidence retrieved from memory is selectively modified and reported in the service of meeting the accuracy demands of a given situation. An analogous situation exists following access to memory by means of the matching mechanism: Continuous
evidence, in the form of a mnemonic response to a matching query, must be translated into a binary, or n‐ary, decision of one sort or another. Usually, that decision is of whether a queried stimulus has been previously studied (Mandler, 1980), but it may also refer to whether it was studied in a particular context (Jacoby, 1991), or how recently (Hintzman, 2003; Peterson, Johnson, & Coatney, 1969) or how frequently (Hintzman, 2001; Rowe & Rose, 1977) it was studied. Similarly, quantitative information from a memory match appears to subserve metamnemonic judgments, such as JOLs (Benjamin, Bjork, & Hirshman, 1998a; Benjamin & Diaz, in press). The mechanism by which such judgments are made is the comparison of a test value to a criterion. That criterion may be in terms of absolute amounts of evidence or in terms of the relative evidence for one decision alternative over another (Green & Swets, 1966), and may be thought to be stationary (Peterson, Birdsall, & Fox, 1954) or labile and variable over conditions (Benjamin & Wee, 2007). The theory of signal detection (TSD; Green & Swets, 1966) has guided conceptualization of the recognition task as analogous to other tasks involving detection or discrimination (Egan, 1958; Parks, 1966). The theory makes explicit assumptions about the nature of the probability distributions governing the evidence yielded by different types of stimuli, and from those assumptions provides derivations of how criteria can be placed optimally (and thus also provides figures of merit with which to evaluate the quality of criterion placement). It is quite difficult to evaluate the optimality of criteria in memory tasks, most of which solicit either yes/no or more finely grained judgments. Each of these procedures has known problems.
The parameters that are derived from yes/no tasks are known to be inadequate because they fail to explicitly account for greater variance in the probability distribution in the strengths of studied than unstudied items (Macmillan & Creelman, 2005), as are “nonparametric” variants of those values (Benjamin, 2005a). In addition, the ratings task is known to introduce unwanted variability to parameter estimates (Benjamin & Wee, 2007; Markowitz & Swets, 1967). An additional difficulty is purely conceptual: Although Green and Swets (1966) discussed criteria in terms of likelihood ratios (LR), that is, the relative evidence for one alternative as compared to another alternative, many studies discuss criteria in terms of evidence values, that is, their value on an arbitrary scale (these questions are considered in more depth in the chapters in this volume by Rotello and Macmillan, and Dobbins and Han). Unfortunately, LR and evidence are not monotonically related when the distributions are of unequal variance. Equivalent LRs often imply quite different evidence values, and equivalent evidence values imply different LRs (Stretch & Wixted, 1998a). LR criteria appear to vary more or less optimally with manipulations of prior odds in perceptual (Swets, Tanner,
& Birdsall, 1961) and numerical judgment tasks (Healy & Kubovy, 1977) but not in recognition (Healy & Jones, 1975; Healy & Kubovy, 1977). However, this result leaves open the possibility that recognition lends itself more naturally to setting and adjusting criteria on evidence, rather than LR, scales. And, indeed, studies that evaluate criterion placement in terms of location on an evidence axis report robust and reasonable responses to experimental manipulations. Hirshman (1995) compared criterion placement for words studied for a short period of time in homogeneous lists and in mixed lists in which half of the items were studied for a longer duration, and showed that subjects set a higher criterion for the recognition of those items from the mixed list, indicating that the overall memorability of the list influences the placement of criteria. Similarly, Benjamin and Bawa (2004) showed that subjects employed more stringent criteria on tests that included distractors that were more difficult to discriminate from the previously studied items. Each of these results indicates a difference in criterion placement that is consistent with the theory of Green and Swets (1966), although it is worth noting that differences are typically smaller and that criteria are often somewhat more conservative than predicted by TSD (Healy & Jones, 1975; Healy & Kubovy, 1977). These data show that subjects can modulate their criterion in response to manipulations at encoding and at test, but there remain questions as to whether subjects can modulate criteria based on stimulus factors on an item‐by‐item basis.
1. Stimulus Memorability and the Mirror Effect
Of particular relevance is the mirror effect, which describes the finding that manipulations that enhance memory often operate both by increasing the hit rate (HR) of items from a particular category, and also by decreasing the false‐alarm rate (FAR) to items from that category (Glanzer, Adams, Iverson, & Kim, 1993). The signature case of this effect involves normative word frequency (McCormack & Swenson, 1972), and is thought to reflect the fact that uncommon words elicit superior encoding by virtue of their distinctiveness (Malmberg, Steyvers, Stephens, & Shiffrin, 2002) and that subjects set a higher criterion commensurate with that encoding advantage (Benjamin, 2003). That explanation makes two serious assumptions about the role of strategic processes in producing the mirror effect: (a) that subjects set higher criteria for material that they deem to be more memorable, and (b) that they recognize low‐frequency words as being more memorable. I shall treat these two assumptions in turn. It has been shown that subjects confidently and accurately reject distractors on a test that are idiosyncratically memorable, like the names of relatives or towns that they have lived in (Brown, Lewis, & Monk, 1977), or stimuli
that are episodically distinctive (Strack & Bless, 1994). Ghetti (2003) showed that this effect increases with age in children, suggesting that experience is necessary to support the translation of high memorability into stringent criteria. The mirror effect also obtains, albeit only in some circumstances, when differences in memorability are rendered experimentally (e.g., by variations in study time or number of study repetitions). Such “strength‐based” mirror effects typically obtain when memorability is manipulated between subjects or between lists, but not always when it is manipulated within list (Stretch & Wixted, 1998b). For example, Morrell, Gaitan, and Wixted (2002) presented subjects with lists of professions and locations, one of which was presented in red and one in blue. In addition, one category was repeated multiple times and the other was not. Memory for the studied members of the categories differed as expected, but no difference in FAR to new items from those categories was obtained, suggesting that subjects did not adjust their criteria on an item‐by‐item basis. However, other within‐list manipulations of strength did elicit different FAR for different categories (Benjamin, 2001; Dobbins & Kroll, 2005; Singer & Wixted, 2006). It appears as though subjects are only willing to shift criteria on a within‐list basis when the category membership is inherently related to the difference in memorability (like word frequency) or when the relationship is made particularly apparent. The second major component of the theory relating criterion shifts to the mirror effect is the appreciation of the superior recognizability of uncommon words. Early results showing that subjects mistakenly predict higher recognition ability for uncommon over common words appeared to contradict this claim (Greene & Thapar, 1994; Wixted, 1992).
However, when subjects are asked to make judgments during the recognition test itself, the point at which mirror effects actually obtain, they correctly judge uncommon words to be of greater memorability (Benjamin, 2003; Guttentag & Carroll, 1998). Although criterion shifts are not the only theoretical means by which mirror effects can obtain (Criss, 2006), they do appear to be a particularly parsimonious means of explaining the ubiquity of mirror effects and understanding the variety of occasions on which they do not obtain.
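The criterion‐shift account of the mirror effect can be illustrated numerically. The sketch below assumes an equal‐variance Gaussian signal detection model purely for simplicity (the text above notes that studied items typically show greater variance), and the sensitivity and criterion values are hypothetical; the function name is likewise introduced only for this example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def hit_and_false_alarm_rates(d_prime, criterion):
    """Equal-variance signal detection: lures ~ N(0, 1), targets ~ N(d_prime, 1).

    Returns (hit rate, false-alarm rate) when a "studied" response is given
    whenever the evidence exceeds the criterion.
    """
    hit_rate = 1.0 - STANDARD_NORMAL.cdf(criterion - d_prime)
    false_alarm_rate = 1.0 - STANDARD_NORMAL.cdf(criterion)
    return hit_rate, false_alarm_rate


# Hypothetical values: common words are encoded less well (smaller d_prime).
common_hr, common_far = hit_and_false_alarm_rates(d_prime=1.0, criterion=0.5)

# Uncommon words, same criterion: hit rates differ but false alarms do not.
fixed_hr, fixed_far = hit_and_false_alarm_rates(d_prime=2.0, criterion=0.5)

# Uncommon words, higher criterion for the class judged more memorable:
# hits go up and false alarms go down, which is the mirror pattern.
shifted_hr, shifted_far = hit_and_false_alarm_rates(d_prime=2.0, criterion=1.0)

print(round(common_hr, 3), round(common_far, 3))
print(round(fixed_hr, 3), round(fixed_far, 3))
print(round(shifted_hr, 3), round(shifted_far, 3))
```

Under these assumptions, a difference in encoding alone changes only the hit rate; the lower false‐alarm rate for the more memorable class appears only once the criterion for that class is raised.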
D. LEARNING ABOUT HOW TO MAKE MEMORY DECISIONS
Although it is possible that task experience can change either the suppression or the grain of output, I know of no data investigating those topics. Within the context of memory judgments, however, there is evidence about how criteria can change with experience on a task.
Criterion shifts. Bear in mind that it is a theoretical possibility that subjects set their criterion based purely on the range of memory strengths assessed following learning (Hirshman, 1995), and thus that criteria do not vary across the test. We have already reviewed evidence that this perspective is incomplete, as criteria do vary with test characteristics (Benjamin & Bawa, 2004; Brown, Steyvers, & Hemmer, 2007). To some degree, criteria must be malleable and responsive to test characteristics, simply because they often exhibit probability matching (Parks, 1966), and the probabilities of targets and lures can only be stably estimated over a large number of trials. However, there is evidence that subjects can be remarkably insensitive to such test characteristics: Compared to standard recognition testing conditions, HRs remain the same on a test with no distractors (Wallace, 1980) and FARs remain the same on a test with no targets (Dobbins, this volume). Experiments that vary characteristics within a list or across multiple lists (for example, by varying the composition of targets and lures) reveal that subjects are quite insensitive, but not wholly so, to characteristics that should influence criteria. Verde and Rotello (in press) showed that FARs did not differ across list halves when one half contained only well‐learned items (and distractors) and the other half contained more poorly learned items (and distractors), except when subjects were provided with performance feedback. Such data have caused some authors to conclude that criterion adjustment does not occur under normal recognition testing conditions. Although there are numerous results that are consistent with this claim (Morrell et al., 2002; Stretch & Wixted, 1998b), this conclusion fails to provide a ready explanation of why changes in either prior probabilities (Heit, Brockdorff, & Lamberts, 2003) or payoffs (Van Zandt, 2000) sometimes do influence criterion changes across multiple lists.
Benjamin and Bawa (2004) showed that shifts occur when the tests get harder, but not when the tests get easier, suggesting that the mechanism underlying criterion adjustment with experience is not simply one of optimization. Theories of criterion‐setting have been proposed in other decision tasks (Treisman & Williams, 1984), but have not been systematically considered in the case of recognition memory (Benjamin & Wee, 2007). The fact that distractor manipulations (Benjamin & Bawa, 2004) but not target manipulations (Verde & Rotello, in press) influence criterion adjustment suggests a useful conceptualization might be as a Neyman–Pearson decision process, by which subjects attempt to maintain a constant rate of false alarms. The chapters by Rotello and Macmillan and by Dobbins and Han in this volume also review this evidence; it is clear that a full theoretical conceptualization about how subjects shift criteria with experience would be
premature, but it is also quite clear that there are circumstances in which criteria do shift in response to task demands.
E. POSTACCESS DECISION PROCESSES AS A MEANS OF CONTROL OVER MEMORY
This section has outlined three ways in which performance on tests of memory can vary as a function of decision processes that take place after memory is accessed: by deciding when to suppress output of a response, by choosing the appropriate level of detail to provide, and by imposing criteria for memory judgments that are appropriate to the task and the situation at hand.
VI. Conclusions
This chapter has reviewed ways in which subjects strategically use encoding, access, and decision processes to influence performance on tests of memory. As a theoretical exercise, I have taken the perspective that memory itself is unmalleable, and perhaps even nonvariable across subjects, and I have investigated the range of memory behavior that could nonetheless differ across circumstances and across people simply as a function of “memory skill,” that is, the degree to which people use their strategies effectively to achieve their intellectual goals, such as doing well in an examination. Given this strong claim that memory behavior has more to do with extramnemonic skill than storage capacity, let us revisit the life of the lifeloggers, the advantages they enjoy from flawless and constant encoding, and the disadvantages they may face from farming out the “scut work” of storage. As discussed earlier, lifeloggers replace strategic mental encoding with comprehensive external encoding. This strategy has four principal advantages. First, information is less likely to be “forgotten” from a hard drive. Second, time and resources are freed up for other activities. Third, decisions do not need to be made about what and how to encode because storage capacity is effectively unlimited. And fourth, if circumstances and demand for information change, events and materials that were previously deemed low‐priority (and thus perhaps poorly encoded in the memory of nonlifeloggers) will still be accessible to lifeloggers. The problem of how to select material for encoding is solved by the lifeloggers by outsourcing it. The rest of us use the strategic and selective allocation of encoding resources in order to efficiently reach our learning goals. This is an effective way of reducing the load on a taxed
memory system: keeping nonvital information out. But what about the information we desire to remember? And are we truly less productive and creative because we force our brain to do the yeoman’s work of encoding? The theme of this chapter is that productivity and creativity derive from mechanisms of information aggregation that are the sine qua non of human memory systems but are as yet unrealized in artificial systems. The knowledge structures that then arise influence the strategic decisions we make about our memories, and it is in this way that we bootstrap ourselves to greater understanding of complex domains and to the new thoughts and ideas that underlie advancement in those domains. It thus seems somewhat disingenuous to conclude that the time and effort we spend making encoding decisions robs us of an opportunity to use our minds productively. There is, also, at least one advantage of lifelogging with respect to memory access: having cues available in the form of photographs or documents decreases the need for rememberers to generate their own cues and modify them as needs dictate. Perhaps these advantages obviate the need for strategic memory use on the access side? Perhaps not. Consider the case of S., a patient studied by the neuropsychologist Aleksandr Luria in his now‐classic case study The Mind of a Mnemonist (Luria, 1968/1987). S. exhibited such an extraordinary and durable memory that no means of testing revealed limits to his capacity. In fact, S. exhibited problems related to his inability to forget; as noted by Luria on viewing the way S. read and attempted to understand a short story:
There were numerous details in the text, each of which gave rise to new images that led him far afield; further details produced still more details, until his mind was a virtual chaos. (p. 67)
S.’s inability to discard irrelevant and tangential details kept him from focusing on the central structure of the text and decreased his ability to meet the demands of reading, namely understanding the gist of a series of events. How did S. eventually learn to forget? He became a proto-lifelogger:

“Why, he reasoned, couldn’t he use some external means to help him forget—write down what he no longer wished to remember. . . ‘People jot things down so they’ll remember them,’ he said. ‘This seemed ridiculous to me, so I decided to tackle the problem my own way.’ As he saw it, once he had written a thing down, he would have no need to remember it; but if he were without means of writing it down, he’d commit it to memory.” (pp. 69–70)

S. recognized and used to his advantage the simple fact that external encoding diminishes memory encoding. But, whereas this technique proved advantageous for S., it creates a considerable intellectual cost for a normal memory
user. But perhaps memory encoding should be considered superfluous, given the greater reliability of external encoding? Here we must consider one advantage that S. had that lifeloggers do not: ready access to a reasonably well-organized structure of knowledge. The access advantage for lifeloggers (effectively, the ability to turn recall into cued recall) has a tremendous downside: The cues themselves are agnostic as to their importance or relevance for a particular task. So, a lifelogger might indeed remember to send his colleague an email when the nonlifelogger would forget, but they may both have considerable difficulty generating the content of that email. Lifeloggers’ external “memories” are bloated with unimportant and irrelevant details, much like S.’s memory but quite unlike that of the average memory user, and sorting through that morass is harder because the knowledge structures that guide recall from memory are unavailable or primitive in external memory systems. It is certainly my hope that knowledge of human cognition can inform engineering sufficiently to someday provide search and retrieval algorithms that rival access to human memory, but it is also evident that that day is not today.

To end our discussion of lifeloggers, it is worth reflecting again on the nature of expertise. All of the contributors to this volume are experts in research on human memory. Is that expertise no more than the stack of professional articles and books that comprise what each of these experts has read in their lifetime? Could an outsider with access to those stacks write this book? The ability to synthesize, to use memory traces to generate new knowledge, underlies both expertise and creativity. A SenseCam cannot perform that synthesis and, even if it had some mechanism for doing so, its deliberately nonselective encoding mechanisms may render that task impossible.
Just as the central premise of this chapter has been that higher-order cognition guides the action and use of memory, memory itself underlies higher-order cognition. It is difficult to imagine how artificial memory devices could supplant the balance of goals, motivations, and abilities that human memory provides, and it is perhaps valuable to consider how such devices can be used to augment human memory capacity, rather than replace it.

We often think explicitly about our memory only when it betrays us, perhaps by failing to provide us with needed information that we know was recently available, or maybe by tricking us into believing things that aren’t true. However spectacular these failures might be on occasion, it is a fact that almost every meaningful behavior we engage in relies on placing new information in memory, accessing information from memory, or both; and that every cognitive act we engage in relies on the effective use of memory strategies in both enhancing and limiting storage. Most of the time these
processes operate so effectively that we hardly give them a second thought, or give memory its proper due. This chapter has emphasized how interacting with memory is a, if not the, vital component in the effective action of memory.

REFERENCES

Allen, M. (1968). Rehearsal strategies and response cueing as determinants of organization in free recall. Journal of Verbal Learning and Verbal Behavior, 7, 58–63.
Anderson, J. R. (1972). FRAN: A simulation model of free recall. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 5, pp. 315–378). New York: Academic Press.
Anderson, J. R. (1974). Retrieval of propositional information from long-term memory. Cognitive Psychology, 6, 451–474.
Anderson, J. R., & Milson, R. (1989). Human memory: An adaptive perspective. Psychological Review, 96, 703–719.
Anderson, J. R., & Schooler, L. J. (1991). Reflections of the environment in memory. Psychological Science, 2, 396–408.
Anderson, M. C., Bjork, R. A., & Bjork, E. L. (1994). Remembering can cause forgetting: Retrieval dynamics in long-term memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1063–1087.
Atkinson, R. C. (1972). Optimizing the learning of a second-language vocabulary. Journal of Experimental Psychology, 96, 124–129.
Atkinson, R. C., & Juola, J. F. (1973). Factors influencing speed and accuracy of word recognition. In S. Kornblum (Ed.), Attention and performance IV (pp. 583–612). New York: Academic Press.
Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence and J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). New York: Academic Press.
Atkinson, R. C., & Shiffrin, R. M. (1971). The control of short-term memory. Scientific American, 224, 82–90.
Baddeley, A. D., & Longman, D. J. A. (1978). The influence of length and frequency of training session on the rate of learning to type. Ergonomics, 21, 627–635.
Balota, D. A., & Neely, J. H. (1980). Test-expectancy and word-frequency effects in recall and recognition. Journal of Experimental Psychology: Human Learning and Memory, 6, 576–587.
Barnes, A. E., Nelson, T. O., Dunlosky, J., Mazzoni, G., & Narens, L. (1999). An integrative system of metamemory components involved in retrieval. In D. Gopher and A. Koriat (Eds.), Attention and performance XVII: Cognitive regulation of performance: Interaction of theory and application (pp. 287–313). Cambridge, MA: The MIT Press.
Battig, W. F., Allen, M., & Jensen, A. R. (1965). Priority of free recall of newly learned items. Journal of Verbal Learning and Verbal Behavior, 4, 175–179.
Beaman, C. P., & Morton, J. (2000). The separate but related origins of the recency effect and the modality effect in free recall. Cognition, 77, B59–B65.
Begg, I., Vinski, E., Frankovich, L., & Holgate, B. (1991). Generating makes words memorable, but so does effective reading. Memory & Cognition, 19, 487–497.
Begg, I. M., Martin, L. A., & Needham, D. R. (1992). Memory monitoring: How useful is self-knowledge about memory? European Journal of Cognitive Psychology, 4, 195–218.
Benjamin, A. S. (2001). On the dual effects of repetition on false recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 941–947.
Benjamin, A. S. (2003). Predicting and postdicting the effects of word frequency on memory. Memory & Cognition, 31, 297–305.
Benjamin, A. S. (2005a). Recognition memory and introspective remember/know judgments: Evidence for the influence of distractor plausibility on “remembering” and a caution about purportedly nonparametric measures. Memory & Cognition, 33, 261–269.
Benjamin, A. S. (2005b). Response speeding mediates the contribution of cue familiarity and target retrievability to metamnemonic judgments. Psychonomic Bulletin & Review, 12, 874–879.
Benjamin, A. S. (2006). The effects of list-method directed forgetting on recognition memory. Psychonomic Bulletin & Review, 13, 831–836.
Benjamin, A. S., & Bawa, S. (2004). Distractor plausibility and criterion placement in recognition. Journal of Memory and Language, 51, 159–172.
Benjamin, A. S., & Bird, R. D. (2006). Metacognitive control of the spacing of study repetitions. Journal of Memory and Language, 55, 126–137.
Benjamin, A. S., & Bjork, R. A. (1996). Retrieval fluency as a metacognitive index. In L. Reder (Ed.), Implicit memory and metacognition (pp. 309–338). Mahwah, NJ: Erlbaum.
Benjamin, A. S., & Bjork, R. A. (2000). On the relationship between recognition speed and accuracy for words rehearsed via rote versus elaborative rehearsal. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 638–648.
Benjamin, A. S., Bjork, R. A., & Hirshman, E. (1998a). Predicting the future and reconstructing the past: A Bayesian characterization of the utility of subjective fluency. Acta Psychologica, 98, 267–290.
Benjamin, A. S., Bjork, R. A., & Schwartz, B. L. (1998b). The mismeasure of memory: When retrieval fluency is misleading as a metamnemonic index. Journal of Experimental Psychology: General, 127, 55–68.
Benjamin, A. S., & Craik, F. I. M. (2001). Parallel effects of aging and time pressure on memory for source: Evidence from the spacing effect. Memory & Cognition, 29, 691–697.
Benjamin, A. S., & Diaz, M. (in press). Measurement of relative metamnemonic accuracy. In J. Dunlosky and R. A. Bjork (Eds.), Handbook of memory and metamemory. Psychology Press.
Benjamin, A. S., & Wee, S. (2007). Signal detection with criterial variability: Applications to recognition memory. Manuscript under review.
Bjork, R. A. (1975). Retrieval as a memory modifier. In R. Solso (Ed.), Information processing and cognition: The Loyola symposium (pp. 123–144). Hillsdale, NJ: Erlbaum.
Bjork, E. L., & Bjork, R. A. (1996). Continuing influences of to-be-forgotten information. Consciousness and Cognition, 5, 176–196.
Bjork, E. L., DeWinstanley, P. A., & Storm, B. C. (in press). Learning how to learn: Can experiencing the outcome of differential encoding strategies enhance subsequent learning? Psychonomic Bulletin & Review.
Bjork, R. A., LaBerge, D., & Legrand, R. (1968). The modification of short-term memory through instructions to forget. Psychonomic Science, 10, 55–56.
Blum, M., Pentland, A., & Troster, G. (2006). InSense: Interest-based life logging. IEEE Multimedia, 13, 40–48.
Bousfield, W. A. (1953). The occurrence of clustering in the recall of randomly arranged associates. Journal of General Psychology, 49, 229–240.
Bousfield, W. A., & Puff, C. R. (1964). Clustering as a function of response dominance. Journal of Experimental Psychology, 67, 76–79.
Bousfield, W. A., & Rosner, S. R. (1970). Free vs. uninhibited recall. Psychonomic Science, 20, 75–76.
Bower, G. H. (1970). Imagery as a relational organizer in associative learning. Journal of Verbal Learning and Verbal Behavior, 9, 529–533.
Brigham, M. C., & Pressley, M. (1988). Cognitive monitoring and strategy choice in younger and older adults. Psychology and Aging, 3, 249–257.
Brown, J. (1954). The nature of set-to-learn and of intra-material interference in immediate memory. Quarterly Journal of Experimental Psychology, 6, 141–148.
Brown, J., Lewis, V. J., & Monk, A. F. (1977). Memorability, word frequency and negative recognition. Quarterly Journal of Experimental Psychology, 29, 461–473.
Brown, S. D., Steyvers, M., & Hemmer, P. (2007). Modeling experimentally induced strategy shifts. Psychological Science, 20, 40–46.
Castel, A. D. (2006). Metacognition and learning about primacy and recency effects in free recall: The utilization of intrinsic and extrinsic cues when making judgments of learning. Manuscript under review.
Castel, A. D. (this volume). The adaptive and strategic use of memory by older adults: Evaluative processing and value-directed remembering. In A. S. Benjamin and B. H. Ross (Eds.), The psychology of learning and motivation: Skill and strategy in memory use (Vol. 48, pp. 225–270). London: Academic Press.
Castel, A. D., Benjamin, A. S., Craik, F. I. M., & Watkins, M. J. (2002). The effects of aging on selectivity and control in short-term recall. Memory & Cognition, 30, 1078–1085.
Cofer, C. N., Bruce, D. R., & Reicher, G. M. (1966). Clustering in free recall as a function of certain methodological variations. Journal of Experimental Psychology, 71, 858–866.
Conover, J. N., & Brown, S. C. (1977). Item strength and input location in free-recall learning. Journal of Experimental Psychology: Human Learning and Memory, 3, 109–118.
Costermans, J., Lories, G., & Ansay, C. (1992). Confidence level and feeling of knowing in question answering: The weight of inferential processes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 142–150.
Cowan, N., Day, L., Saults, J. S., Keller, T. A., Johnson, T., & Flores, L. (1992). The role of verbal output time in the effects of word length on immediate memory. Journal of Memory and Language, 31, 1–17.
Cowan, N., Saults, J. S., Elliott, E. M., & Moreno, M. (2002). Deconfounding serial recall. Journal of Memory and Language, 46, 153–177.
Craik, F. I. (1970). The fate of primary memory items in free recall. Journal of Verbal Learning and Verbal Behavior, 9, 143–148.
Craik, F. I., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671–684.
Criss, A. H. (2006). The consequences of differentiation in episodic memory: Similarity and the strength based mirror effect. Journal of Memory and Language, 55, 461–478.
Crowder, R. G. (1976). Principles of learning and memory. Oxford, England: Lawrence Erlbaum.
Cull, W. L., & Zechmeister, E. B. (1994). The learning ability paradox in adult metamemory research: Where are the metamemory differences between good and poor learners? Memory & Cognition, 22, 249–257.
D’Agostino, P. R. (1969). The blocked-random effect in recall and recognition. Journal of Verbal Learning and Verbal Behavior, 8, 815–820.
Davis, J. C., & Okada, R. (1971). Recognition and recall of positively forgotten items. Journal of Experimental Psychology, 89, 181–186.
Deese, J., & Kaufman, R. A. (1957). Serial effects in recall of unorganized and sequentially organized verbal material. Journal of Experimental Psychology, 54, 180–187.
deWinstanley, P. A., & Bjork, E. L. (1997). Processing instructions and the generation effect: A test of the multifactor transfer-appropriate processing theory. Memory, 5, 401–421.
deWinstanley, P. A., & Bjork, E. L. (2004). Processing strategies and the generation effect: Implications for making a better reader. Memory & Cognition, 32, 945–955.
Diaz, M., & Benjamin, A. S. (2007). The effects of proactive interference (PI) and release from PI on judgements of learning. Manuscript under review.
Dobbins, I. G., & Han, S. (this volume). What constitutes a model of item-based memory decisions? In A. S. Benjamin and B. H. Ross (Eds.), The psychology of learning and motivation: Skill and strategy in memory use (Vol. 48, pp. 95–144). London: Academic Press.
Dobbins, I. G., & Kroll, N. E. A. (2005). Distinctiveness and the recognition mirror effect: Evidence for an item-based criterion placement heuristic. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 1186–1198.
Dong, T., & Kintsch, W. (1968). Subjective retrieval cues in free recall. Journal of Verbal Learning and Verbal Behavior, 7, 813–816.
Dunlosky, J., & Hertzog, C. (1998). Training programs to improve learning in later adulthood: Helping older adults educate themselves. In D. J. Hacker, J. Dunlosky, and A. C. Graesser (Eds.), Metacognition in educational theory and practice (pp. 249–275). Mahwah, NJ: Lawrence Erlbaum Associates.
Dunlosky, J., & Hertzog, C. (2000). Updating knowledge about encoding strategies: A componential analysis of learning about strategy effectiveness from task experience. Psychology and Aging, 15, 462–474.
Dunlosky, J., & Matvey, G. (2001). Empirical analysis of the intrinsic-extrinsic distinction of judgments of learning (JOLs): Effects of relatedness and serial position on JOLs. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 1180–1191.
Dunlosky, J., Kubat-Silman, A. K., & Hertzog, C. (2003). Training monitoring skills improves older adults’ self-paced associative learning. Psychology and Aging, 18, 340–345.
Egan, J. P. (1958). Recognition memory and the operating characteristic. USAF Operational Applications Laboratory Technical Note. No. 58–51 32.
Eich, J. M. (1982). A composite holographic associative recall model. Psychological Review, 89, 627–661.
Einhorn, H. J., & Hogarth, R. M. (1985). Ambiguity and uncertainty in probabilistic inference. Psychological Review, 92, 433–461.
Einstein, G. O., & McDaniel, M. A. (2005). Prospective memory: Multiple retrieval processes. Current Directions in Psychological Science, 14, 286–290.
Einstein, G. O., & McDaniel, M. A. (this volume). Prospective memory and metamemory: The skilled use of basic attentional and memory processes. In A. S. Benjamin and B. H. Ross (Eds.), The psychology of learning and motivation: Skill and strategy in memory use (Vol. 48, pp. 145–174). London: Academic Press.
Erdelyi, M. (1970). Recovery of unavailable perceptual input. Cognitive Psychology, 1, 99–113.
Erev, I., Wallsten, T. S., & Neal, M. M. (1991). Vagueness, ambiguity, and the cost of mutual understanding. Psychological Science, 2, 321–324.
Finley, J. R., & Benjamin, A. S. (2007). Adaptive changes in encoding with experience: Evidence from the test-expectancy paradigm. Manuscript under review.
Fisher, R. P., Geiselman, R. E., & Raymond, D. S. (1987). Critical analysis of police interviewing techniques. Journal of Police Science and Administration, 15, 177–185.
Fisher, R. P., Geiselman, R. E., & Amador, M. (1989). Field test of the cognitive interview: Enhancing the recollection of the actual victims and witnesses of crime. Journal of Applied Psychology, 74, 722–727.
Gardiner, J. M., & Klee, H. (1976). Memory for remembered events: An assessment of output monitoring in free recall. Journal of Verbal Learning and Verbal Behavior, 15, 227–233.
Gardiner, J. M., Passmore, C., Herriot, P., & Klee, H. (1977). Memory for remembered events: Effects of response mode and response-produced feedback. Journal of Verbal Learning and Verbal Behavior, 16, 45–54.
Gauld, A., & Stephenson, G. M. (1967). Some experiments relating to Bartlett’s theory of remembering. British Journal of Psychology, 58, 39–49.
Ghetti, S. (2003). Memory for nonoccurrences: The role of metacognition. Journal of Memory and Language, 48, 722–739.
Gladstone, B. (2007). The persistence of memory [Electronic version]. On the Media. National Public Radio.
Glanzer, M., & Meinzer, A. (1967). The effects of intralist activity on free recall. Journal of Verbal Learning and Verbal Behavior, 6, 928–935.
Glanzer, M., Adams, J. K., Iverson, G. J., & Kim, K. (1993). The regularities of recognition memory. Psychological Review, 100, 546–567.
Glenberg, A. M. (1997). What memory is for. Behavioral and Brain Sciences, 20, 1–55.
Glucksberg, S., & McCloskey, M. (1981). Decisions about ignorance: Knowing that you don’t know. Journal of Experimental Psychology: Human Learning and Memory, 7, 311–325.
Goldsmith, M., Koriat, A., & Pansky, A. (2005). Strategic regulation of grain size in memory reporting over time. Journal of Memory and Language, 52, 505–525.
Goldsmith, M., Koriat, A., & Weinberg-Eliezer, A. (2002). Strategic regulation of grain size memory reporting. Journal of Experimental Psychology: General, 131, 73–95.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. Oxford, England: Wiley.
Greene, R. L., & Thapar, A. (1994). Mirror effect in frequency discrimination. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 946–952.
Grice, H. P. (1975). Logic and conversation. In P. Cole and N. L. Morgan (Eds.), Syntax and semantics, Vol. 3: Speech Acts (pp. 41–58). New York: Academic Press.
Gronlund, S. D., & Shiffrin, R. M. (1986). Retrieval strategies in recall of natural categories and categorized-lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 550–561.
Guttentag, R., & Carroll, D. (1998). Memorability judgements for high- and low-frequency words. Memory & Cognition, 26, 951–958.
Halff, H. M. (1977). The role of opportunities for recall in learning to retrieve. American Journal of Psychology, 90, 383–406.
Hart, J. T. (1965). Memory and the feeling-of-knowing experience. Journal of Educational Psychology, 56, 208–216.
Hart, J. T. (1967). Memory and the memory-monitoring process. Journal of Verbal Learning and Verbal Behavior, 6, 685–691.
Healy, A. F., Fendrich, D. W., Cunningham, T. F., & Till, R. E. (1987). Effects of cuing on short-term retention of order information. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 413–425.
Healy, A. F., & Jones, C. (1975). Can subjects maintain a constant criterion in a memory task? Memory & Cognition, 3, 233–238.
Healy, A. F., & Kubovy, M. (1977). A comparison of recognition memory to numerical decision: How prior probabilities affect cutoff location. Memory & Cognition, 5, 3–9.
Heit, E., Brockdorff, N., & Lamberts, K. (2003). Adaptive changes of response criterion in recognition memory. Psychonomic Bulletin & Review, 10, 718–723.
Heyer, A. W., Jr., & O’Kelly, L. I. (1949). Studies in motivation and retention: II. Retention of nonsense syllables learned under different degrees of motivation. Journal of Psychology: Interdisciplinary and Applied, 27, 143–152.
Hilgard, E. R., & Loftus, E. F. (1979). Effective interrogation of the eyewitness. International Journal of Clinical and Experimental Hypnosis, 27, 342–357.
Hintzman, D. L. (1986). “Schema abstraction” in a multiple-trace memory model. Psychological Review, 93, 411–428.
Hintzman, D. L. (2001). Similarity, global matching, and judgments of frequency. Memory & Cognition, 29, 547–556.
Hintzman, D. L. (2003). Judgments of recency and their relation to recognition memory. Memory & Cognition, 31, 26–34.
Hirshman, E. (1995). Decision processes in recognition memory: Criterion shifts and the list-strength paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 302–313.
Hodges, S., Williams, L., Berry, E., Izadi, S., Srinivasan, J., Butler, A., et al. (2006). SenseCam: A retrospective memory aid. In Dourish and A. Friday (Eds.), Ubicomp 2006, LNCS 4206 (pp. 177–193). Berlin: Springer-Verlag.
Howard, M. W., & Kahana, M. J. (2002). When does semantic similarity help episodic retrieval? Journal of Memory and Language, 46, 85–98.
Hunt, R. R., & McDaniel, M. A. (1993). The enigma of organization and distinctiveness. Journal of Memory and Language, 32, 421–445.
Hyde, T. S., & Jenkins, J. J. (1969). Differential effects of incidental tasks on the organization of recall of a list of highly associated words. Journal of Experimental Psychology, 82, 472–481.
Izawa, C. (1971). The test trial potentiating model. Journal of Mathematical Psychology, 8, 200–224.
Jacoby, L. L. (1991). A process dissociation framework: Separating automatic from intentional uses of memory. Journal of Memory and Language, 30, 513–541.
Jacoby, L. L., Shimizu, Y., Daniels, K. A., & Rhodes, M. G. (2005). Modes of cognitive control in recognition and source memory: Depth of retrieval. Psychonomic Bulletin & Review, 12, 852–857.
Jacoby, L. L., Shimizu, Y., Velanova, K., & Rhodes, M. G. (2005). Age differences in depth of retrieval: Memory for foils. Journal of Memory and Language, 52, 493–504.
Kahana, M. J. (1996). Associative retrieval processes in free recall. Memory & Cognition, 24, 103–109.
Kapur, N., Glisky, E. L., & Wilson, B. A. (2002). External memory aids and computers in memory rehabilitation. In A. D. Baddeley, M. Kopelman, and B. A. Wilson (Eds.), The handbook of memory disorders (2nd ed., pp. 757–783). Chichester: John Wiley.
Kelly, M., Scholnick, E. K., Travers, S. H., & Johnson, J. W. (1976). Relations among memory, memory appraisal, and memory strategies. Child Development, 47, 648–659.
Kintsch, W. (Ed.). (1974). The representation of meaning in memory. Oxford, England: Lawrence Erlbaum.
Kolers, P. A., & Palef, S. R. (1976). Knowing not. Memory & Cognition, 4, 553–558.
Koriat, A. (1997). Monitoring one’s own knowledge during study: A cue-utilization approach to judgments of learning. Journal of Experimental Psychology: General, 126, 349–370.
Koriat, A., Ben-Zur, H., & Sheffer, D. (1988). Telling the same story twice: Output monitoring and age. Journal of Memory and Language, 27, 23–39.
Koriat, A., & Goldsmith, M. (1994). Memory in naturalistic and laboratory contexts: Distinguishing the accuracy-oriented and quantity-oriented approaches to memory assessment. Journal of Experimental Psychology: General, 123, 297–315.
Koriat, A., & Goldsmith, M. (1996). Monitoring and control processes in the strategic regulation of memory accuracy. Psychological Review, 103, 490–517.
Koriat, A., & Lieblich, I. (1974). What does a person in a “TOT” state know that a person in a “don’t know” state doesn’t know? Memory & Cognition, 2, 647–655.
Koriat, A., Goldsmith, M., Schneider, W., & Nakash-Dura, M. (2001). The credibility of children’s testimony: Can children control the accuracy of their memory reports? Journal of Experimental Child Psychology, 79, 405–437.
Lachman, J. L., Lachman, R., & Thronesbery, C. (1979). Metamemory through the adult life span. Developmental Psychology, 15, 543–551.
Lea, G. (1975). Chronometric analysis of the method of loci. Journal of Experimental Psychology: Human Perception and Performance, 1, 95–104.
Le Ny, J. F., Denhiere, G., & Le Taillanter, D. (1972). Regulation of study-time and interstimulus similarity in self-paced learning conditions. Acta Psychologica, 36, 280–289.
Leonard, J. M., & Whitten, W. B. (1983). Information stored when expecting recall or recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 440–455.
Loftus, G. R., & Wickens, T. D. (1970). Effect of incentive on storage and retrieval processes. Journal of Experimental Psychology, 85, 141–147.
Luria, A. R. (1968/1987). The mind of a mnemonist: A little book about a vast memory. (L. Solotaroff, Trans.). Cambridge, MA: Harvard University Press. (Original work published 1968).
MacLeod, C. M. (1998). Directed forgetting. In J. M. Golding and C. M. MacLeod (Eds.), Intentional forgetting: Interdisciplinary approaches (pp. 1–57). Mahwah, NJ: Lawrence Erlbaum Associates.
Macmillan, N. A., & Creelman, C. D. (2005). Detection theory: A user’s guide (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Maki, R. H., & Berry, S. L. (1984). Metacomprehension of text material. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 663–679.
Malmberg, K. J., Steyvers, M., Stephens, J. D., & Shiffrin, R. M. (2002). Feature frequency effects in recognition memory. Memory & Cognition, 30, 607–613.
Mandler, G. (1980). Recognizing: The judgment of previous occurrence. Psychological Review, 87, 252–271.
Markowitz, J., & Swets, J. (1967). Factors affecting the slope of empirical ROC curves: Comparison of binary and rating responses. Perception & Psychophysics, 2, 91–97.
Maskarinec, A. S., & Brown, S. C. (1974). Positive and negative recency effects in free recall learning. Journal of Verbal Learning and Verbal Behavior, 13, 328–334.
Masur, E. F., McIntyre, C. W., & Flavell, J. H. (1973). Developmental changes in apportionment of study time among items in a multitrial free recall task. Journal of Experimental Child Psychology, 15, 237–246.
Matvey, G., Dunlosky, J., Shaw, R. J., Parks, C., & Hertzog, C. (2002). Age-related equivalence and deficit in knowledge updating of cue effectiveness. Psychology and Aging, 17, 589–597.
Matzen, L. E., & Benjamin, A. S. (2007). Remembering words not presented in sentences: How study context changes patterns of false memories. Manuscript under review.
Mazzoni, G., Cornoldi, C., & Marchitelli, G. (1990). Do memorability ratings affect study-time allocation? Memory & Cognition, 18, 196–204.
Mazzoni, G., & Nelson, T. O. (1995). Judgments of learning are affected by the kind of encoding in ways that cannot be attributed to the level of recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1263–1274.
McCloskey, M., & Bigler, K. (1980). Focused memory search in fact retrieval. Memory & Cognition, 8, 253–264.
McCormack, P. D., & Swenson, A. L. (1972). Recognition memory for common and rare words. Journal of Experimental Psychology, 95, 72–77.
McDaniel, M. A., Moore, B. A., & Whiteman, H. L. (1998). Dynamic changes in hypermnesia across early and late tests: A relational/item-specific account. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 173–185.
McDaniel, M. A., Waddill, P. J., & Einstein, G. O. (1988). A contextual account of the generation effect: A three-factor theory. Journal of Memory and Language, 27, 521–536.
Metcalfe, J. (2002). Is study time allocated selectively to a region of proximal learning? Journal of Experimental Psychology: General, 131, 349–363.
Metcalfe, J., & Kornell, N. (2003). The dynamics of learning and allocation of study time to a region of proximal learning. Journal of Experimental Psychology: General, 132, 530–542.
Metcalfe, J., Schwartz, B. L., & Joaquim, S. G. (1993). The cue-familiarity heuristic in metacognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 851–864.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–97.
Morrell, H. E. R., Gaitan, S., & Wixted, J. T. (2002). On the nature of the decision axis in signal-detection-based models of recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 1095–1110.
Mulligan, N. W. (2001). Word frequency and memory: Effects on absolute versus relative order memory and on item memory versus order memory. Memory & Cognition, 29, 977–985.
Murdock, B. B. (1993). TODAM2: A model for the storage and retrieval of item, associative, and serial-order information. Psychological Review, 100, 183–203.
Murdock, B. B., & Okada, R. (1970). Interresponse times in single-trial free recall. Journal of Experimental Psychology, 86, 263–267.
Murphy, M. D., Schmitt, F. A., Caruso, M. J., & Sanders, R. E. (1987). Metamemory in older adults: The role of monitoring in serial recall. Psychology and Aging, 2, 331–339.
Neely, J. H., & Balota, D. A. (1981). Test-expectancy and semantic-organization effects in recall and recognition. Memory & Cognition, 9, 283–300.
Neisser, U. (1976). Cognition and reality: Principles and implications of cognitive psychology. New York, NY: W H Freeman/Times Books/Henry Holt & Co.
Neisser, U. (1982). Snapshots or benchmarks? In U. Neisser and I. E. Hyman (Eds.), Memory observed: Remembering in natural contexts (pp. 68–74). San Francisco: Worth Publishers.
Neisser, U. (1988). What is ordinary memory the memory of? In U. Neisser and E. Winograd (Eds.), Remembering reconsidered: Ecological and traditional approaches to the study of memory (pp. 356–373). New York, NY: Cambridge University Press.
Nelson, T. O., Dunlosky, J., Graf, A., & Narens, L. (1994). Utilization of metacognitive judgments in the allocation of study during multitrial learning. Psychological Science, 5, 207–213.
Nelson, T. O., Gerler, D., & Narens, L. (1984). Accuracy of feeling-of-knowing judgments for predicting perceptual identification and relearning. Journal of Experimental Psychology: General, 113, 282–300.
Nelson, T. O., & Narens, L. (1990). Metamemory: A theoretical framework and some new findings. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 26, pp. 125–173). New York: Academic Press.
Nelson, T. O., & Narens, L. (1994). Why investigate metacognition? In J. Metcalfe and A. P. Shimamura (Eds.), Metacognition: Knowing about knowing. Cambridge, MA: The MIT Press.
Nolen, S. B., & Haladyna, T. M. (1990). Personal and environmental influences on students’ beliefs about effective study strategies. Contemporary Educational Psychology, 15, 116–130.
Parks, T. E. (1966). Signal-detectability theory of recognition-memory performance. Psychological Review, 73, 44–58.
Patterson, K. E., Meltzer, R. H., & Mandler, G. (1971). Inter-response times in categorized free recall. Journal of Verbal Learning and Verbal Behavior, 10, 417–426.
Peterson, L. R., Johnson, S. T., & Coatney, R. (1969). The effect of repeated occurrences on judgments of recency. Journal of Verbal Learning and Verbal Behavior, 8, 591–596.
220
Aaron S. Benjamin
Peterson, W. W., Birdsall, T. G., & Fox, W. C. (1954). The theory of signal detectability. IEEE Transactions on Information Theory, 4, 171–212. Polson, M. C., Restle, F., & Polson, P. G. (1965). Association and discrimination in paired‐ associates learning. Journal of Experimental Psychology, 69, 47–55. Polson, P. G. (1972). Presolution performance functions for Markov models. Psychometrika, 37, 453–459. Pressley, M., Levin, J. R., & Ghatala, E. S. (1984). Memory strategy monitoring in adults and children. Journal of Verbal Learning and Verbal Behavior, 23, 270–288. Raaijmakers, J. G., & ShiVrin, R. M. (1981). Search of associative memory. Psychological Review, 88, 93–134. Rabinowitz, M., Freeman, K., & Cohen, S. (1992). Use and maintenance of strategies: The influence of accessibility to knowledge. Journal of Educational Psychology, 84, 211–218. Reder, L. M. (1982). Plausibility judgments versus fact retrieval: Alternative strategies for sentence verification. Psychological Review, 89, 250–280. Reder, L. M. (1987). Strategy selection in question answering. Cognitive Psychology, 19, 90–138. Reder, L. M., & Ritter, F. E. (1992). What determines initial feeling of knowing? Familiarity with question terms, not with the answer. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 435–451. Reder, L. M., & Ross, B. H. (1983). Integrated knowledge in diVerent tasks: The role of retrieval strategy on fan eVects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 55–72. Reder, L. M., & Wible, C. (1984). Strategy use in question‐answering: Memory strength and task constraints on fan eVects. Memory & Cognition, 12, 411–419. Restle, F. (1962). The selection of strategies in cue learning. Psychological Review, 69, 329–343. Restle, F. (1964). A cognitive interpretation of intensity eVects in stimulus generalization. Psychological Review, 71, 514–516. Robinson, J. A., & Kulp, R. A. (1970). Knowledge of prior recall. 
Journal of Verbal Learning and Verbal Behavior, 9, 84–86. Roediger, H. L., & McDermott, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 803–814. Roediger, H. L., & Payne, D. G. (1985). Recall criterion does not aVect recall level or hypermnesia: A puzzle for generate/recognize theories. Memory & Cognition, 13, 1–7. Roediger, H. L., III, Weldon, M. S., & Challis, B. H. (1989). Explaining dissociations between implicit and explicit measures of retention: A processing account. In H. L. Roediger, III and F. I. M. Craik (Eds.), Varieties of memory and consciousness: Essays in honour of Endel Tulving. Hillsdale, NJ: Lawrence Erlbaum Associates. RogoV, B., Newcombe, N., & Kagan, J. (1974). Planfulness and recognition memory. Child Development, 45, 972–977. Rotello, C. M., & Macmillan, N. A. (this volume). Response bias in recognition memory. In A. S. Benjamin and B. H. Ross (Eds.), The psychology of learning and motivation: Skill and strategy in memory use (Vol. 48, pp. 61–94). London: Academic Press. Rowe, E. J., & Rose, R. J. (1977). EVects of orienting task, spacing of repetitions, and list context on judgments of frequency. Memory & Cognition, 5, 505–512. Rundus, D. (1971). Analysis of rehearsal processes in free recall. Journal of Experimental Psychology, 89, 63–77. Rundus, D. (1973). Negative eVects of using list items as recall cues. Journal of Verbal Learning and Verbal Behavior, 12, 43–50.
Memory Is More than Just Remembering
221
Sahakyan, L., & Kelley, C. M. (2002). A contextual change account of the directed forgetting eVect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 1064–1072. Schacter, D. L., & Tulving, E. (Eds.). (1994). Memory systems. Cambridge, MA: The MIT Press. Schwartz, B. L., & Metcalfe, J. (1992). Cue familiarity but not target retrievability enhances feeling‐of‐knowing judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1074–1083. Sellen, A., Fogg, A., Aitken, M., Hodges, S., Rother, C., & Wood, K. (2007). Do life‐logging technologies support memory for the past? An experimental study using SenseCam Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’07). New York: ACM Press. Shaw, R. J., & Craik, F. I. M. (1989). Age differences in predictions and performance on a cued recall task. Psychology and Aging, 4, 131–135. ShiVrin, R. M. (1970). Forgetting: Trace erosion or retrieval failure? Science, 168, 1601–1603. ShiVrin, R. M., & Steyvers, M. (1997). A model for recognition memory: REM—retrieving eVectively from memory. Psychonomic Bulletin & Review, 4, 145–166. Simon, D. A., & Bjork, R. A. (2001). Metacognition in motor learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 907–912. Simon, H. A. (1957). Models of man; social and rational. Oxford, England: Wiley. Simon, H. A. (1974). How big is a chunk? Science, 183, 482–488. American Association for the Advancement of Science. Singer, M., & Wixted, J. T. (2006). EVect of delay on recognition decisions: Evidence for a criterion shift. Memory & Cognition, 34, 125–137. Slamecka, N. J. (1968). An examination of trace storage in free recall. Journal of Experimental Psychology, 76, 504–513. Slamecka, N. J. (1969). A temporal interpretation of some recall phenomena. Psychological Review, 76, 492–503. Son, L. K. (2004). Spacing one’s study: Evidence for a metacognitive control strategy. 
Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 601–604. Son, L. K., & Metcalfe, J. (2000). Metacognitive and control strategies in study‐time allocation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 204–221. Sternberg, S. (1966). High‐speed scanning in human memory. Science, 153, 652–654. Strack, F., & Bless, H. (1994). Memory for nonoccurrences: Metacognitive and presuppositional strategies. Journal of Memory and Language, 33, 203–217. Stretch, V., & Wixted, J. T. (1998a). Decision rules for recognition memory confidence judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1397–1410. Stretch, V., & Wixted, J. T. (1998b). On the diVerence between strength‐based and frequency‐ based mirror eVects in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1379–1396. Swets, J. A., Tanner, W. P., Jr., & Birdsall, T. G. (1961). Decision processes in perception. Psychological Review, 68, 301–340. Thiede, K. W. (1999). The importance of monitoring and self‐regulation during multitrial learning. Psychonomic Bulletin & Review, 6, 662–667. Thiede, K. W., & Dunlosky, J. (1999). Toward a general model of self‐regulated study: An analysis of selection of items for study and self‐paced study time. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 1024–1037. Thompson, C. (2006). A head for detail [Electronic version]. Fast Company, 110, 72.
222
Aaron S. Benjamin
Treisman, M., & Williams, T. C. (1984). A theory of criterion setting with an application to sequential dependencies. Psychological Review, 91, 68–111. Troyer, A. K., Hafliger, A., Cadieux, M. J., & Craik, F. I. M. (2006). Name and face learning in older adults: EVects of level of processing, self‐generation, and intention to learn. Journals of Gerontology: Series B: Psychological Sciences and Social Sciences, 61B, P67–P74. Tulving, E. (1972). Episodic and semantic memory. In E. Tulving and W. Donaldson (Eds.), Organization of memory (pp. 381–403). Oxford, England: Academic Press. Tulving, E., & Pearlstone, Z. (1966). Availability versus accessibility of information in memory for words. Journal of Verbal Learning and Verbal Behavior, 5, 381–391. Tulving, E., & Thomson, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80, 352–373. Underwood, B. J., & Zimmerman, J. (1973). The syllable as a source of error in multisyllable word recognition. Journal of Verbal Learning and Verbal Behavior, 12, 701–706. Van Zandt, T. (2000). ROC curves and confidence judgments in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 582–600. Verde, M. F., & Rotello, C. M. (in press). Memory strength and the decision process in recognition memory. Memory & Cognition. Walker, W. H., & Kintsch, W. (1985). Automatic and strategic aspects of knowledge retrieval. Cognitive Science, 9, 261–283. Wallace, W. P. (1980). On the use of distractors for testing recognition memory. Psychological Bulletin, 88, 696–704. Wallsten, T. S., Budescu, D. V., Rapoport, A., Zwick, R., & Forsyth, B. (1986). Measuring the vague meanings of probability terms. Journal of Experimental Psychology: General, 115, 348–365. Ward, G., & Tan, L. (2004). The eVect of the length of to‐be‐remembered lists and intervening lists on free recall: A reexamination using overt rehearsal. 
Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1196–1210. Watkins, M. J., & Sechler, E. S. (1988). Generation eVect with an incidental memorization procedure. Journal of Memory and Language, 27, 537–544. Watson, J. M., Balota, D. A., & Sergent‐Marshall, S. D. (2001). Semantic, phonological, and hybrid veridical and false memories in healthy older adults and in individuals with dementia of the Alzheimer type. Neuropsychology, 15, 254–268. Weiner, B. (1966). EVects of motivations on the availability and retrieval of memory traces. Psychological Bulletin, 65, 24–37. Weiner, B., & Walker, E. L. (1966). Motivational factors in short‐term retention. Journal of Experimental Psychology, 71, 190–193. Whitten, W. B., & Leonard, J. M. (1981). Directed search through autobiographical memory. Memory & Cognition, 9, 566–579. Wickens, D. D., & Simpson, C. K. (1968). Trace cue position, motivation, and short‐term. Journal of Experimental Psychology, 76, 282–285. Williams, M. D., & Santos‐Williams, S. (1980). Method for exploring retrieval processes using verbal protocols. In R. S. Nickerson (Ed.), Attention and performance VIII (pp. 671–689). Hillsdale, NJ: Erlbaum. Wixted, J. T. (1992). Subjective memorability and the mirror eVect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 681–690. Wolters, C. A., Yu, S. L., & Pintrich, P. R. (1996). The relation between goal orientation and students’ motivational beliefs and self‐regulated learning. Learning and Individual DiVerences, 8, 211–238.
Memory Is More than Just Remembering
223
Yaniv, I., & Foster, D. P. (1997). Graininess of judgment under uncertainty: An accuracy‐ formativeness trade‐oV. Journal of Experimental Psychology: General, 124, 424–432. Young, C. J. (2004). Contributions of metaknowledge to retrieval of natural categories in semantic memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 909–916. Zacks, R. T. (1969). Invariance of total learning time under diVerent conditions of practice. Journal of Experimental Psychology, 82, 441–447.
This page intentionally left blank
THE ADAPTIVE AND STRATEGIC USE OF MEMORY BY OLDER ADULTS: EVALUATIVE PROCESSING AND VALUE‐DIRECTED REMEMBERING
Alan D. Castel
THE PSYCHOLOGY OF LEARNING AND MOTIVATION VOL. 48 DOI: 10.1016/S0079-7421(07)48006-9
Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00
I. Overview
Why do we remember some events and not others, and how does this change in old age? Although there are a variety of ways to address this question, the present perspective emphasizes how value can have a profound effect on how we use our memory to remember certain information. The ability to select and prioritize what information is important to remember, relative to less salient or peripheral information, is an essential skill for the efficient use of memory. For example, university students seek to memorize information they think is important for a later test, while grandparents may focus on being able to remember information about children and grandchildren, as well as important life events. In both cases, value is used to direct resources toward information that is deemed to be important to remember. The role that value plays in memory performance is critical to developing a comprehensive understanding of how memory is used across the adult life span. The present summary focuses on how older adults use evaluative processing (a critical process that will be defined and discussed throughout this chapter) to guide encoding and retrieval operations, and how older adults then use value to make decisions about what information is important to remember. In light of the many memory impairments that typically
accompany aging, older adults can at times be strategic and adaptive, and the evidence for this will be reviewed in terms of how value can guide remembering. Based on this, a conceptual framework is outlined that illustrates how value, goals and prior knowledge, and emotion can lead to qualitative changes in how memory is used by older adults. These changes can lead to memory impairments and errors, as well as the relatively efficient strategic use of memory in old age.
In most laboratory‐based memory experiments, participants study long lists of words, word pairs, pictures, or other types of information, with each item or event being equally important to remember for a later memory test. However, it is clear that when we encounter information in the natural environment, not all items or types of information compete equally for attention and memory resources. For example, when we encounter large amounts of information, some units are typically more important than others (e.g., some parts of this chapter are more important to remember than others in order to understand the main points). In order to develop a better understanding of how value influences memory performance, it is critical to take a perspective that acknowledges that memory efficiency is susceptible to the value that is placed on each item, a point that is especially important for older adults. In some cases, older adults may be highly skilled in terms of determining what information is important to remember, relative to younger adults who might have difficulty differentiating what is important to remember and what is of lesser value. Due to lifelong experience with memory operations, older adults may have expertise in terms of being aware of the dynamic parameters and limitations of memory or, in general, knowing that memory is a tool that needs to be carefully regulated and monitored.
Of course, the memory impairments and challenges that older adults typically face may be directly related to this expertise, leading to the engagement of adaptive prioritizing that allows for the focus on high value information. It is this ability that is of particular interest in the present summary.
II. A Selective Review of the Research on Memory and Lifespan Development
A. LIFE SPAN THEORIES OF COGNITIVE AGING
Decades of research in cognitive aging have shown a systematic decline in various forms of attention and memory performance, and that this can be attributed to mechanisms that appear to be impaired as a result of healthy, “normal,” nonpathological aging (Balota, Dolan, & Duchek, 2000; Craik & Salthouse, 2000; Zacks & Hasher, 2006). Although memory performance can begin to decline as early as the age of 20 years, the most apparent changes are
observed in later adulthood, typically around the age of 65 years (Park & Schwarz, 2000; Park et al., 1996, 2002). Declines in episodic memory and other types of memory that are thought to be resource‐demanding are most apparent for adults older than 65 years, relative to forms of memory which can be classified as semantic, implicit or procedural, and recognition (see Zacks & Hasher, 2006, for a review). This has led to several prominent theories of information processing and cognitive aging that focus on reductions in available processing and attentional resources (Craik & Byrd, 1982), general slowing and reductions in speed of processing (Salthouse, 1996), reduced inhibitory control in working memory (Hasher & Zacks, 1988), impairments in memory for associative information (Naveh‐Benjamin, 2000), and the reliance on familiarity due to deficient recollective processing (Jacoby & Hay, 1998; see also Yonelinas, 2002). Although these theories have received much support, the present chapter focuses more on the qualitative changes that influence the use of memory in old age. Specifically, this chapter outlines the manner in which value can influence how older adults process information, and how value‐directed remembering can lead to deficiencies and biases in memory, as well as adaptive performance in terms of the use of memory in old age.
There are several prominent theoretical life span perspectives that are germane to the present discussion of aging and the strategic use of memory. The two most relevant life span perspectives regarding cognitive function in the present context are the selection, optimization and compensation framework (SOC; Baltes & Baltes, 1990; see also Riediger, Li, & Lindenberger, 2006, for the adaptive nature of SOC) and socioemotional selectivity theory (SST; Carstensen, 1992; Carstensen, Isaacowitz, & Charles, 1999).
Both of these perspectives are related to the adaptive use of memory resources in old age, resulting in some impairments but, more importantly, a shift in the goals and motivation of older adults. The conceptual framework of selection, optimization, and compensation posits that successful aging is related to a focused and goal‐directed investment of limited resources into areas that yield optimal returns. Thus, older adults can selectively choose certain options in order to maximize performance based on goals, compensating for impairments by optimizing performance in specific goal‐related domains. This type of selectivity can be focused on achieving certain goals or can also be “loss‐based” (Freund & Baltes, 2002), as older adults adjust their goals in response to feedback or losses in order to eventually attain desired and realistic goals. In a similar vein, Heckhausen (1999) and Heckhausen and Schulz (1995) suggest that individuals have to take on the regulation of aging‐related resource losses in order to function efficiently, which can lead to an improvement in efficient cognitive function. Other compensation‐based arguments have been made in the context of neural compensation for the under‐recruitment of brain areas
needed for successful encoding and retrieval (Bäckman et al., 1999; Logan, Sanders, Snyder, Morris, & Buckner, 2002; Park, 2002), suggesting that perhaps older adults can engage in successful memory processing via the recruitment of additional brain regions. Incorporating neural impairments with behavioral changes is a critical issue, as is considering individual differences in cognitive aging.
SST provides a motivational‐based explanation for why age differences are observed in some situations, but not under other conditions, and it specifically addresses the apparent positivity of older adults’ reconstructions of past events (Carstensen, 1992; Carstensen et al., 1999). SST suggests that when time is perceived as limited (a perception strongly associated with old age), more emotionally meaningful goals are likely to be pursued, relative to goals that are aimed at gaining new information. For example, older adults are more likely to remember advertisements that emphasize an emotional component relative to knowledge‐related information (Fung & Carstensen, 2003). Thus, SST suggests that older adults have different goals and values (e.g., Blanchard‐Fields & Camp, 1990), such that older adults can regulate emotions in complicated decision‐making situations (Blanchard‐Fields, 2007), and that this type of processing may be given priority by older adults.
In terms of relating life span development theory to experimental evidence, a useful approach (although not typically incorporated in the area of cognitive aging) has been outlined in Jenkins’s tetrahedral model of memory experiments (Jenkins, 1979). This model also emphasizes the sensitivity of memory to context, such that memory performance in a given situation is determined by interactions between four categories of variables: participant characteristics and goals, the cognitive strategy that is necessary for good performance, the nature of the to‐be‐remembered materials, and the manner in which one assesses performance.
Extending Jenkins’s ideas to cognitive aging and social cognition, Hess (2005) has suggested that aging is associated with increasing selectivity regarding task engagement in light of perceived or actual declines in cognitive functioning. In general, life span theories focus on a shift in social‐cognitive goals from knowledge acquisition in young adulthood to a more emotional and knowledge dissemination focus in old age (e.g., Labouvie‐Vief, 1990), and this perspective has implications for how to interpret findings from a variety of memory studies.
Although life span theories typically focus on how older adults direct resources in light of goals and declining cognitive function, younger adults also face certain challenges. In particular, younger adults may have greater access to cognitive resources but may not know how to direct those resources to maximize performance. For example, one challenge students typically face is selecting what information is important to remember for an upcoming examination. Students often ask instructors, “do we need to know this for the test,” displaying the need to place value on information before committing it to (or making the effort to commit it to) memory (see also Benjamin, this volume). Older adults are often faced with this challenge in a much different domain (outside the classroom, and with a somewhat more limited perspective of time; see Carstensen, 2006), and success in terms of being selective at encoding may be the critical process that leads to the efficient use of memory in old age. By determining what information is of high priority, and ignoring peripheral information, one can limit the information that competes for cognitive resources. This ability, a form of “cognitive control,” may be compromised in old age. Such declines may manifest themselves in terms of having difficulty inhibiting irrelevant information in working memory (e.g., Hasher & Zacks, 1988), as well as being slower and showing more interference in attentional tasks, both at encoding and at the response level (Balota & Faust, 2001; Castel, Balota, Hutchison, Logan, & Yap, 2007). However, at more global levels of choosing how to allocate attention, older adults may be able to exhibit control by strategically attending to high value information at the expense of lower value information. This has been demonstrated to some degree by Castel and colleagues (2002, 2007), who report that older adults can selectively remember high value information at the expense of lower value information.
B. MOTIVATED COGNITION AND GOALS OF OLDER ADULTS
Several studies have examined how motivated cognition and goal‐directed memory influence memory performance in old age. Much of this work has examined how emotional information is processed by younger and older adults in the context of veridical and false memories, and the strategic and adaptive use of memory in decision making. The general pattern of findings has been consistent with a positivity effect. Specifically, older adults are more likely to remember positive emotional information relative to negative emotional information, compared to younger adults (Mather & Carstensen, 2005). For example, Mather and Knight (2005) examined how older adults use cognitive control to direct resources toward positive information while studying and recalling pictures that varied in terms of their emotional valence. Older adults typically recall more positive memories relative to negative memories, and those older adults that scored highly on tests of cognitive control were more likely to show this positivity bias, relative to older adults that scored poorly on tests of cognitive control. For younger adults, cognitive control was not related to the positivity effect. This suggests that older adults need to use cognitive resources to engage in the processes that lead to the positivity effect, implying the need to assign value to the various types of emotional stimuli when prioritizing this information.
Consistent with this finding, Fung and Carstensen (2003) found that older adults tend to favor and remember advertisements that are consistent with emotional goals, suggesting that emotional regulation can influence and motivate older adults’ use of memory. Since younger and older adults have different goals and motivations regarding memory (and emphasize different forms of value when processing information), this can also lead to certain biases and inaccuracies based on these goals. In the context of decision making and remembering choice features, Mather, Knight, and McCaffrey (2005) examined how analogous or alignable features might be used and (falsely) remembered by younger and older adults. For example, when making decisions about choosing an apartment to rent, or about different health plans, people are often given various features to choose from. Mather et al. (2005) showed that older adults were more susceptible to misremembering or falsely recalling information about features that contrasted with previous options, suggesting that older adults might organize information in a manner that supports accurate memory, but can also lead to false memory. Furthermore, these memory errors were related to performance on tasks that assessed strategic control and frontal lobe function.
In the context of decision making, Wood, Busemeyer, Koling, Cox, and Davis (2005) found that younger and older adults used different strategies when performing the Iowa Gambling Task, which involves the integration of emotion (as assessed by wins and losses) and cognition (use of memory and learning) in a risky‐choice decision task. Although younger and older adults eventually reached similar levels of performance on this task, the two age groups used very different strategies that emphasized the strengths and biases of the two groups.
Specifically, older adults adaptively used memory for recent emotional events, such as gains and losses (based on valence), resulting in choices that maximized payoff, whereas the younger adults used more specific memory and learning for many prior trials to maximize performance on the task. This suggests that older adults can engage in adaptive decision making in the sense that they will rely on their strengths and assign high value to the emotional component of the task. In contrast, younger adults will perform optimally by using a much different strategy, again suggesting that the two populations function differently with regard to how attention, emotion, and memory are used in decision‐making situations. It is likely that older adults perform better on more naturalistic memory and decision‐making tests because they involve more realistic reliance on memory and reasoning (e.g., Rahhal, May, & Hasher, 2002; Rendell & Craik, 2002; Tentori, Osherson, Hasher, & May, 2001). Thus, it may be possible to reduce impairments in the ability to remember information by using materials that lend themselves well to typical memory and decision‐making challenges that
face both younger and older adults outside the laboratory. Such findings are critical because they illustrate how younger and older adults behave differently in situations in which tasks involve more integrated and naturalistic/logical decision making in the context of using memory. The mystery still lies in how these groups weight various components of a task, and the present chapter seeks to emphasize the role of evaluative processing by older adults, leading to adaptive and efficient performance (cf. Schacter, 1999). The manner in which value is subjectively and internally assigned to information, such as choice features, positive and negative emotional valence, or components of a decision‐making process, is a critical process. Older adults may in fact be more aware of the need to use value to guide encoding and retrieval, relative to younger adults, and it is this observation that might reinforce the need to prioritize how information will be processed in order to lead to efficient memory performance.
III. Strategic Control and Value as Memory Modifiers for Older Adults
A. VALUE AS A MEMORY MODIFIER FOR OLDER ADULTS
The debate regarding what makes people value certain things has been a central issue in many disciplines, particularly in psychology and economics. Adam Smith’s classic example of something with high value was water (Smith, 1776/1994). As suggested by Smith, the practical theory of value (also known as the intrinsic theory of value) stated that an object’s value was rooted in how useful it is to the individual. Although this concept of value seems straightforward, one perplexing question which has been asked is that if indeed this theory were true, why do diamonds (which had, at the time, little practical use) command a much higher price than water (which is utterly crucial to existence and survival)? This problem was known as “the diamond–water paradox,” as it seemed to make very little sense in most contexts, and in the case of memory, a similar dilemma might exist. Remembering essential and functional information may be placed at a premium, possibly at the expense of other important and valuable information, and being able to make this distinction (i.e., selection and focusing) is critical for the efficient use of memory. In the context of what information is of high value when encoding new information and remembering past events, this depends on the point of view of the rememberer and the functional significance of the information (e.g., Nairne, 2005), with differences likely to be evident between younger and older adults. In a similar view, both Hess (2005) and Jenkins (1979) emphasize that memory research needs to take into account the goals of the rememberer and
context in which information is studied in order to provide a clear account of how people attempt to remember information. Thus, value or relevance to the individual is a perspective that should be considered in light of how memory is used in old age. This perspective has also been taken in decision making and in the context of behavioral economics. The behavioral economics behind focusing on high value information is a concept that is critical to the arguments made here regarding memory and aging and is thought to be governed by strategic evaluative processing at encoding. In general, economists use the term “strategic” such that something will yield high returns from limited resources; it may be that older adults also function this way when it comes to memory. Thus, examining memory and aging in context requires knowing how value, and the assignment of value (via evaluative processing), place a premium on the goals and motivation of the rememberer (e.g., Hess, Rosenberg, & Waters, 2001), possibly leading to efficient memory use in old age.
Since the concept of value has been studied in a variety of contexts (e.g., economics, bioethics, and psychology), before expanding on the role of value in memory, it is important to provide a clear and functional description of the terms used in the present argument. Specifically, this chapter uses some critical but common terminology that requires some careful description given that these terms are often used in a variety of contexts. These terms include strategic control, value, evaluative processing, selectivity, and grain size. In the present context and arguments, strategic control refers to the ability to focus and direct resources on high value information, giving high priority to information that is deemed to be important, in either a subjective or an objective sense.
It is strategic in the sense that ideally this allows for selective optimization (somewhat similar to SOC), much like an investor might allocate funds strategically in order to maximize returns without taking costly risks. Strategic control can also (ideally) be adaptive, such that strategic allocation of resources can be altered, adjusted or biased based on upcoming or anticipated goals and in light of current capabilities. Although this can be partially related to and mediated by a more general construct of ‘‘cognitive control,’’ it is different from how the term ‘‘strategy’’ has often been used to describe specific encoding strategies such as imagery, peg‐word systems, elaborative processing, and other self‐initiated strategies that appear to be reduced or impaired in old age (cf., Hertzog & Dunlosky, 2005). Value in this context does not refer to ethical or moral value, but rather the importance or weight that is assigned to information, such as in economics in which value is often determined by availability and demand of a commodity in relation to its price per unit. Value can be both objectively and subjectively defined, and can depend on the situation (the need to remember certain high value information) and experience (prior knowledge and expertise can dictate what is high value) of the individual.
Value‐Directed Remembering and Aging
Assigning value or utility to characteristics or options in the context of decision making has been a central component to theories regarding choice behavior (Tversky, 1969, 1972), but in the context of memory research very little emphasis has been placed on value of the to‐be‐remembered item, and how older adults might use value to guide encoding and retrieval operations. Evaluative processing is the mechanism in which value is assigned to information by the rememberer, and this can be influenced by a variety of factors. Most typically it is based on how important the information is for the current goal of the individual, whether this information is consistent or inconsistent with prior knowledge, as well as motivation and anticipated future use of this information (e.g., Hess et al., 2001). The term selectivity refers to focusing on certain items or events that are perceived to be of high value, possibly at the expense of lower value information. As suggested by Riediger and Freund (2006), a more general form of ‘‘motivational’’ selectivity may involve two forms: (1) Focusing on high value/important information, while also (2) restricting the access of lower value or more peripheral information. Although the present chapter centers on the ‘‘focusing’’ mechanism involved in selectivity, older adults may have greater difficulty with the restricting component (as suggested by Hasher & Zacks, 1988, in terms of working memory), leading to the encoding (but not necessarily the later use) of peripheral perceptual information (e.g., Koutstaal, 2003). Finally, older adults may be able to maximize memory performance using appropriate ‘‘grain size’’ at encoding and retrieval (Goldsmith & Koriat, this volume; Goldsmith, Koriat, & Weinberg‐Eliezer, 2002; Koriat & Goldsmith, 1996).
Grain size is defined as the level of detail (‘‘precision’’) or generality (‘‘coarseness’’) at which to encode and later report remembered information (Goldsmith et al., 2002), and older adults might more often rely on more general/coarse, gist‐based retrieval in a variety of settings. Older adults clearly have different goals relative to college students, in terms of memory performance in the context of life span development, and possibly also on laboratory‐based memory tests, so it seems somewhat problematic to compare older adults to younger adults in these types of situations. Older adults have not typically been accustomed to memorizing large amounts of information, to using esoteric methods to commit arbitrary information to memory, or to the constant tests that college students typically encounter. Younger adults who are college students could in fact be classified as ‘‘expert memorizers,’’ in a much different sense than older adults are experts in certain domains, given the emphasis that is often placed on memorizing information and terminology when studying for examinations, perhaps at the cost of being selective. Older adults use evaluative processing to selectively remember only certain types of information, often at the cost of being able to remember large amounts of information, or specific arbitrary details.
Given the somewhat different abilities and perspectives on the use of memory by younger and older adults, it does not seem surprising that age differences exist in a variety of laboratory‐based memory tests. The real surprise comes from situations in which older adults’ memory performance is similar to that of younger adults, given their somewhat different approach to memory tasks. Thus, it seems that older adults who participate in memory experiments differ in both a quantitative sense (less capacity to remember long lists of items) and qualitative sense (different approach regarding what information is important to remember), and both of these factors lead to differences relative to younger adults. For this reason, it seems necessary to take caution when comparing younger and older adults on tests of memory, given that differences in performance could potentially result from qualitative and/or quantitative reasons, a point that plagues any type of cross‐sectional design in cognitive aging research that directly compares younger college students to older adults. Although there is no immediate solution to this issue, by experimentally manipulating the value of to‐be‐remembered information, or considering how different age groups use evaluative processing in this context, cognitive aging research can assess important differences and similarities in how younger and older adults can efficiently use memory.
B. SELECTIVITY, VALUE, AND THE USE OF MEMORY BY OLDER ADULTS
One useful way to examine the impact of value on memory performance in an experimental setting is to have to‐be‐remembered items in a list assigned a range of different values. This is in contrast with most of the typical memory experiments in which each item, picture, or word pair is of equal importance to remember for a later memory test. By assigning different values to to‐be‐remembered items, one can determine how participants use value to guide encoding and retrieval processes. Furthermore, as participants become more aware of how easy or difficult it is to remember information, one can observe how strategic control is exerted such that participants begin to focus on high value information. In the ‘‘selectivity’’ paradigm (Castel, Benjamin, Craik, & Watkins, 2002; Castel, Farb, & Craik, 2007; Watkins & Bloom, 1999), participants are presented with a list of 12 words, and each word is paired with a different numeric value ranging from 1 to 12 (e.g., table 5, uncle 9, apple 2, pilot 6. . ., see left panel in Fig. 1). In some variants of this procedure, the value is presented immediately after the word to ensure that participants do not simply ignore low value words. Participants are told that they should try to remember as many words as they can for a later recall test, such that they maximize their score. The score is the sum of the value of the recalled words, and the experimenter informs participants about their score once they have recalled the words. Based on this feedback, participants are encouraged
[Fig. 1, panel A: a sample study list presented one word at a time (table 5, uncle 9, apple 2, pilot 6, berry 11, cabin 1, skate 7, cheek 12, fence 3, straw 8, petal 10, drain 4). Panel B: probability of recall (vertical axis, 0 to .9) plotted against point value of word (horizontal axis, 1 to 12) for young and old adults.]
Fig. 1. The procedure (A) and results (B) from the selectivity paradigm, which allows for an examination of how participants use value to guide encoding and retrieval processes. (A) The participants are presented with a list of 12 words (one at a time), with each word having a unique value ranging from 1 to 12, and the values are randomized across the serial positions. Participants recall the words with the goal to maximize their score. Participants then repeat this with a new list, are given feedback about their score, and are given many successive trials with new lists and feedback about their score. (B) The results in terms of the probability of recall for younger and older adults as a function of point value (adapted from Castel et al., 2002; Castel, Farb et al., 2007). There are no age differences for high value information (12‐, 11‐, and 10‐point words), whereas age differences exist in memory performance for other lower values.
to remember the high value words in order to maximize their performance, though recalling any word will lead to a higher score. The results from a selectivity experiment are displayed in the right panel of Fig. 1 in which the probability of recall is plotted as a function of point value. Younger adults perform quite well and on average recall more words than older adults, but in some instances do not appear as selective, as they recall both high and low value words (Castel, Benjamin et al., 2002). What is interesting about this is that after some experience with the task (participants are given numerous unique lists, one after another), participants become aware that they cannot remember all of the words (as the words are presented fairly rapidly at encoding). Thus, some participants begin to focus or select the highest value words to remember in order to efficiently boost their score. Older adults were in fact quite efficient in terms of selectively remembering the high value words (i.e., the 12‐, 11‐, and 10‐point value words), in light of knowing that they will likely only be able to remember three or four words. Although a general impairment in memory in old age would predict that younger adults would recall more information at each point value, and an
inhibitory impairment would predict that older adults would recall more low value words, the results suggest that older adults can direct attentional resources to high value information. Thus, for the high value information (12‐, 11‐, and 10‐point words), there are no age differences in memory performance, while age differences do exist for lower value information. It is important to note that participants were told their score after the recall of each list (and were then given another new list, in some cases doing this up to 48 times!), so it is likely that after the first few lists participants began to be more selective in order to maximize their score. An efficiency index was also calculated, which compared the participant’s score relative to an ideal score based on the number of words recalled. For example, if you recalled three words (the 8‐, 10‐, and 12‐point words), an ideal score would be 10 + 11 + 12 = 33 (i.e., recalling the top three words), and if your actual score was 8 + 10 + 12 = 30, then your efficiency index would be your actual score divided by the ideal score, actual/ideal = 30/33 = 0.91 (see Castel, Benjamin et al., 2002, for more details about the selectivity index). When the selectivity index was calculated for both younger and older adults, under certain conditions (such as immediate free recall), older adults displayed a higher selectivity index as they would consistently just recall the top three or four words, whereas younger adults would recall high and some additional low value words. Although this index may be somewhat biased, given that younger adults recall more words in general of varying value, it does provide a useful measure in which older adults displayed greater efficiency in terms of selectivity. To illustrate how older adults learned to become more efficient in the selectivity task, performance (in terms of the selectivity index and proportion of words recalled) was plotted as a function of list, and displayed in Fig. 2.
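The worked example above (actual score divided by the ideal score for the same number of recalled words) can be sketched in a few lines of Python. This is only an illustration of the simplified description given in the text; the function name `selectivity_index` and the choice to return `None` when nothing is recalled are assumptions made here, and the index as published may be defined differently (see Castel, Benjamin et al., 2002).

```python
def selectivity_index(recalled_values, list_values):
    """Actual score divided by the ideal score for the same number of recalled words.

    recalled_values: point values of the words a participant recalled.
    list_values: point values of every word on the studied list.
    """
    if not recalled_values:
        # Assumption: the index is treated as undefined when no words are recalled.
        return None
    actual = sum(recalled_values)
    # The ideal score is the sum of the n highest values on the list,
    # where n is the number of words actually recalled.
    top_n = sorted(list_values, reverse=True)[:len(recalled_values)]
    ideal = sum(top_n)
    return actual / ideal


# The example from the text: a 12-word list valued 1 to 12,
# with the 8-, 10-, and 12-point words recalled.
example_index = selectivity_index([8, 10, 12], range(1, 13))
print(round(example_index, 2))  # 0.91
```

Because the ideal score is tied to the number of words recalled, a participant who recalls only the three highest value words obtains an index of 1.0, whereas one who recalls those same words plus several low value words obtains a lower index despite a higher total score. This is the sense in which the text notes that the index may be somewhat biased against participants who recall more words overall.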
Recall performance improved slightly after the first few lists and then remained stable. Although older adults did not display high selectivity for the first few lists, after several lists older adults became more selective in terms of focusing on encoding higher value items, leading to higher efficiency scores as reflected by the selectivity index. Although not shown in the figure, younger adults showed a similar trend with higher overall recall. Thus, it may be necessary for older adults to learn about how to be efficient, and this requires some experience with the task (and may be related to other changes in strategy and control by older adults, e.g., Spieler, Mayr, & LaGrone, 2006; Touron, 2006). It may be that older adults might engage in an efficient form of event‐based prospective memory with practice (e.g., McDaniel, Einstein, Stout, & Morgan, 2003), in terms of remembering to remember higher value information, as this is reinforced with many trials and feedback about score in the selectivity paradigm. The findings from the selectivity paradigm suggest that older adults can direct attention to high value information. However, it was not clear how
[Fig. 2 plot: selectivity index and proportion recall (vertical axis, 0 to .7) as a function of list (horizontal axis, 1 to 8).]
Fig. 2. The selectivity index (derived from participants’ score relative to an ideal score) and proportion of words recalled as a function of list order for older adults (adapted from data in Castel et al., 2002; Castel, Farb et al., 2007).
value was represented in memory by older adults. In order to examine how older adults represent value that then leads to selectively guiding encoding and retrieval processes based on value, Castel, Farb et al. (2007) specifically tested how older adults recalled the value of the presented item. It may be the case that older adults simply determine which words are of high value (but do not remember the precise value), and this heuristic then guides encoding of high value information. Older adults might quickly encode a high value word (e.g., cheek 12) as important, and not remember the precise value after encoding, but rather the general level of importance (i.e., a coarse grain size). Thus, Castel et al. tested how well younger and older adults could recall specific value information, to see if younger adults could remember more specific value information, while older adults focus on remembering range information regarding value, given comparable abilities to remember high value words. The results were that both age groups were equally good at recalling point values when recalling the range of high value words, but younger adults outperformed older adults when recalling specific values. These findings suggest that although both groups retain value information, older adults rely more on gist‐based encoding and retrieval operations with regard to high value, while younger adults are able to remember specific
value information. This may represent a heuristic at encoding, such that older adults convert information from precise numerical information to a more general value level that is then bound to the item, and this facilitates memory for high value information. In order to examine the degree of control that younger and older adults can use when value guides remembering, Castel, Farb et al. (2007) employed a selectivity procedure in which words were paired with either negative or positive point values. Thus, participants should focus on high value positive information, but restrict encoding and retrieval of lower value information, and especially information of negative value. The incentive to focus on positive value in this case was reinforced, as participants were instructed that recall of negative value information would actually lead to a reduction in their score. Results showed that both younger and older adults recalled only the positive value information, again with no age differences for the highest value information. Interestingly, much like younger adults, older adults did not recall any of the negative value information. However, on a later surprise recognition test for all items, older adults were in fact more likely to recognize the negative value words. These data suggest that older adults did, in fact, process these words perhaps due to poorer inhibitory control, and likely took longer to code them as negative value information (perhaps due to slowing; Salthouse, 1996). The observation that older adults do in fact encode negative, low value or irrelevant information is consistent with impairments in inhibitory control in the directed forgetting paradigm.
In this task, participants are given instructions to remember or forget certain items after initial encoding, with older adults recalling more of the ‘‘forget’’ items under certain conditions, suggesting that older adults cannot inhibit the encoding and later recall of these items (e.g., Zacks, Radvansky, & Hasher, 1996). However, the directed forgetting paradigm does not allow for the examination of how value can influence control over encoding, a critical issue in the present context, given that older adults often rely on value to guide encoding operations. Thus, the idea that older adults can prioritize what information to commit to memory may have important implications for training efficient use of memory. Rather than focusing on encouraging older adults to use techniques that typically help younger adults, such as improving recollection or enhancing the capacity of working memory, it may be helpful to enhance and emphasize the various mechanisms associated with selectivity, such as focusing on high value information, and restricting the access of lower value information, especially in situations in which both types of information compete for attention. The notion that value can influence memory performance has also been examined in applied areas of psychology, such as eyewitness identification. In a study by Leippe, Wells, and Ostrom (1978), students witnessed a staged
theft in which a purse that contained either an expensive object (i.e., high value, such as a diamond ring) or an inexpensive object (i.e., low value, such as a pair of gloves) was stolen, and subjects were either told of the value before or after the theft. When witnesses had prior knowledge of the object’s value, accurate identification of the thief was more likely when the theft involved a high value item, relative to the low value item. However, when knowledge of the crime’s seriousness was gained after the theft, then knowledge of the value of the item had little effect on the ability to later identify the thief. Thus, the perceived seriousness of a crime or knowledge of the value of the stolen item can influence encoding operations, possibly via the recruitment of attentional resources in relation to the degree of value of the items or seriousness of the crime. This approach of ‘‘value‐directed remembering’’ by both younger and older adults, as shown in the selectivity paradigm and other instances, can thus serve several purposes in terms of the theoretical and applied aspects of strategic control of memory in old age. First, following the work of others (e.g., Hess, 2005; Mather & Carstensen, 2005; Baltes’s SOC theory; Zacks & Hasher, 2006), it informs clinicians and researchers regarding situations in which memory performance can be optimized for certain materials and certain contexts. Second, it outlines how future research can determine boundary conditions in which value influences memory performance in older adults. Finally, and perhaps most importantly for memory improvement in old age, it suggests ways in which older adults can become (or already are) expert users or students of their own memory in order to be most efficient when using memory. This can be accomplished by knowing the limitations of memory, and selectively allocating resources to important information.
C. EVALUATIVE PROCESSING AS SKILLED COGNITION IN OLD AGE?
Evaluative processing can have benefits, but in many laboratory‐based memory tasks it can lead to negative consequences (see Hess, 2005, p. 389). Namely, older adults will often report more thoughts or feelings about to‐be‐remembered items which can then lead to poor source memory. For example, when encoding and processing unrelated word pairs for a later memory test, older adults often note (aloud) how the words are unrelated and ‘‘don’t go together,’’ whereas younger adults engage in elaborative processing to link the unrelated words, such as creating a far‐fetched story or using elaborate forms of imagery. Thus, older adults have difficulty remembering these associations because at encoding they fail to make a link between the two words (see Naveh‐Benjamin, 2000). If anything, older adults often report that the words are unrelated, and then appear to cease processing the pairs as this incongruent relationship between the arbitrary words is of low value to them to later remember (despite the instructions of the experimenter to remember all of the word pairs for a later test).
Although other work attempts to measure the elaborative processing, or tries to equate elaborative processing for younger and older adults (see Hertzog & Dunlosky, 2003; Naveh‐Benjamin, Keshet, & Levi, 2007), it seems that there is a qualitative difference in the approach that is used by younger and older adults that is not captured via these measures. In a similar vein, when information is related to, or somewhat inconsistent with, older adults’ prior knowledge, evaluative processing can be especially helpful. Castel (2005) examined how younger and older adults can link grocery price information with grocery items, with the grocery prices either reflecting market value (e.g., milk $3.79), or unrealistically high or low prices (e.g., butter, $17.89). Participants studied various item‐price pairs, and were later presented with the grocery items and had to recall the prices paired with each item. Younger adults were much better than older adults at recalling the unrealistic prices, but there was no age difference for the realistic prices (see Fig. 3). While studying items, older adults would verbally report that the over‐ and underpriced items were unrealistic. For the market‐value items, older adults engaged in more specific evaluative processing, likely supplemented by prior knowledge at encoding in which they compared prices with what they usually paid for these items. In fact, older adults would often
[Fig. 3 plot: price recall performance (out of 20; vertical axis, 0 to 10) for young and old adults in two price type conditions, regular prices and unusual prices.]
Fig. 3. The number of correctly recalled (exact) prices by younger and older adults for market‐value‐priced and overpriced grocery items (from Castel, 2005).
report that market‐value prices were either slightly more or less than what they might pay, whereas younger adults were good at simply memorizing the prices for all of the items. Again, this suggests a qualitative difference in the approach that was used by the two groups, with older adults using evaluative processing and reliance on prior knowledge to encode information in order to supplement memory performance.
D. VALUE, MOTIVATION, AND EMOTIONAL PRIORITY FOR OLDER ADULTS
Although older adults are often motivated in terms of trying to remember various types of information, it is important to describe the boundary conditions regarding which motivational factors can influence memory performance in old age, and how. Often older adults who participate in memory experiments are analytical of the materials that they need to remember (e.g., unrelated word pairs), and state this at some point during the experiment, possibly reflecting the difficulty that they have encoding and retrieving the information. However, this might represent an important difference between younger and older adults; whereas younger participants simply memorize the information in question, older adults will be critical of why they need to remember somewhat arbitrary information, perhaps reflecting the use of selectivity. Just as in the infant and child development research that employs materials and stimuli that the participants find of value and can commit to memory, it is critical that the materials that are used to test older adults’ memory also share these essential features. Although memory research in general has been criticized for lack of naturalistic materials or generalizability to the real world (Neisser, 1982), it is not simply enough to use materials that older adults have experience with; value needs to be incorporated into tasks, either in terms of the subjective or objective measure of how value influences how older adults use their memory. One example of how motivation might contribute to memory performance is in terms of the emotional context in which information is encoded.
Although there is some debate as to how emotion enhances binding that occurs in working memory (Mikels, Larkin, Reuter‐Lorenz, & Carstensen, 2005), and whether emotion helps or hurts memory for central and peripheral information for older adults (Kensinger, Piguet, Krendl, & Corkin, 2005; see Mather, 2007, for a review), the emotional context of incoming information plays a critical role in how older adults process and prioritize this information. In general, compared to younger adults, older adults report being more efficient at regulating their emotions and focus more on emotion regulation (Diehl, Coyle, & Labouvie‐Vief, 1996). This increased focus on regulating emotions seems to influence their everyday information processing, as they show a positivity bias
in their attention toward information, favoring positive over negative information (e.g., Mather & Carstensen, 2003; Mather & Knight, 2005; Mather et al., 2005; for a review see Mather & Carstensen, 2005). Older adults also exhibit this bias for emotional items in working memory (Mikels et al., 2005). These age‐related attentional biases may amplify the effects of arousal on memory binding for certain positive stimuli and diminish them for negative stimuli (e.g., Mather, 2006). However, another way to interpret this bias is that positive information is assigned higher value in old age, leading to older adults focusing and prioritizing based on this assigned value. The assessment of what receives high value can differ as a function of the person’s experience and expertise (are older adults in fact experts in terms of emotional regulation?) as well as during certain situations in which specific types of information are important or salient.
IV. Model, Review and New View of Value, Memory, and Aging
A. A MODEL OF EVALUATIVE PROCESSING AND VALUE‐DIRECTED REMEMBERING
[Fig. 4 diagram. Labels: new information; goals, strategic control; prior knowledge; evaluative processing; emotion; initial value assignment; attentional control; value‐directed remembering; gist and specific representations; retrieval control (goals); familiarity and recollection; errors; retrieved information, at increasing levels of precision of grain size at retrieval (coarse to fine).]
Given the need to understand how value can influence memory performance in old age, a conceptual framework (which is illustrated in terms of a model of information flow from encoding to retrieval) is presented in Fig. 4. This framework is intended to model the influence of value‐directed remembering across the life span (i.e., for both younger and older adults), and highlights
Fig. 4. A conceptual framework that models the flow of information in memory based on value‐directed remembering, with an emphasis on evaluative processing at encoding, and different levels of grain size at retrieval.
and summarizes the arguments made in this chapter. This framework emphasizes the role of evaluative processing at encoding, leading to selectivity as a filter once value has been assigned to new information. The utility of this framework is that it seeks to account for why in old age value‐directed remembering becomes increasingly important and attempts to explain how value‐directed remembering can account for some of the important findings in the domain of memory and aging. This model incorporates processes involved with value assessment relative to goals and prior knowledge at encoding, with attentional control needed to then bind this information (content plus its assigned value) for later processing. Thus, the theory that governs value‐directed remembering can be described as ‘‘strategic and selective control theory,’’ in that older adults can compensate for memory impairments by being strategic in what they choose or select as information to remember, which varies based on the ability to control or direct attention to important information. The role of value assignment is a critical and under‐defined process that occurs at encoding, and is thought to involve both objective and subjective factors, depending on the situation (the need to remember certain high value information) and experience (prior knowledge and expertise can dictate what is high value). Value is assigned via evaluative processing of the incoming information, and value can be represented in various forms. Evaluative processing relies on both attentional and emotional control, leading to the assignment of value based on a number of factors that are specific to the individual and context (e.g., Hess, 2005; Jenkins, 1979). 
It should be noted that value‐directed remembering can be governed by both objective value (e.g., high value words in the selectivity paradigm, as dictated by the experimenter), and subjective value (e.g., the rememberer assigns a value that can differ based on the individual and context, such as grocery prices or emotional words). Further research is required to examine how these two forms of value can direct memory and be modified to maximize memory performance, and how memory training for older adults can focus on objective value in order to enhance memory for detailed information. Depending on the value assigned to incoming information, the information is then represented in either a specific verbatim form, or in most cases a more general but semantically rich gist form. Information that is represented in a specific form can then support recollection at retrieval, resulting in a more precise grain size and highly accurate remembering. However, older adults typically draw on information from gist‐based representations, which can be then supplemented with prior knowledge and inferences (see Reder, Wible, & Martin, 1986), sometimes leading to specific and predictable memory errors that are consistent with prior knowledge. Gist‐based processing typically leads to familiarity at retrieval, translating to somewhat less
precise but still useful retrieval of information. The degree of precision, or grain size, of retrieved information is then illustrated in terms of how accurate one can be when recalling details (e.g., remembering there were 43,567 steps on a hike to Machu Picchu, vs. recalling it was about 40,000 steps; or that your flight leaves at 12:06 PM, or just around noon), but can also lead to certain types of memory errors. The model and theory of strategic and selective control of value that guides this explanation of value‐directed remembering centers on how older adults can use evaluative processing to direct memory resources toward important information. According to this theory, impairments in a variety of tasks and situations may be ameliorated if older adults can strategically adapt to encoding large amounts of information by judging how important the information is for future use. Older adults can thus compensate for impairments in capacity by limiting the amount of information that they attempt to remember (see Benjamin, this volume, for the use of electronic alternatives for storing and accessing information). This also suggests that older adults might use (or need to use) broader grain size or familiarity as an adaptive form of retrieval in light of impairments, but also the judicious use of value in terms of deciding what information requires precise encoding and recollective processing at retrieval. The processing of emotional information has also been highlighted in the model and is based on the notion that evaluative processing of emotional information leads to an assignment of high value, or priority binding (e.g., MacKay, Hadley, & Schwartz, 2005), for this type of salient information. SST suggests that the processing of emotional material is often consistent with older adults’ goals, and for this reason it is assigned high value in the model (see also Mikels et al., for a similar account that involves working memory resources).
As stated by Mather and Carstensen (2005), attentional control can lead to biases in the way negative and positive emotional material is processed, with positive emotional material being better remembered by older adults. In the model, positive information is bound to a high value variable as a result of attentional control, or in other cases high value assessment, leading to direct access of this information via recollection. Negative emotional material, although still assigned a high value during evaluative processing, is then differentiated from positive emotional information and stored in a specific manner, but not given the same priority and recollective nature as positive emotional information (and in some cases negative information could be ignored if it is of sufficiently low value). This is especially the case when both negative and positive emotional information are encountered at a similar time, leading to relative comparisons and a highly observable positivity effect at retrieval, based on recollection. This then explains the
finding of a greater bias or positivity effect for mixed lists relative to lists that contain only positive or negative information (e.g., Grühn, Smith, & Baltes, 2005). However, although emotional materials lend themselves well to the model in terms of evaluative processing and priority, in the following section other domains of impaired memory performance will be covered, in order to illustrate how the evaluative processing (as described in the model) can account for some important findings in the literature, especially in terms of the strategic control that is used by older adults.
B. GENERAL SLOWING, METACOGNITION, AND EVALUATIVE PROCESSING
One dominant explanation of cognitive aging suggests that older adults experience a general slowing of cognitive processes (Salthouse, 1996), an account that can be applied to changes in neuronal function as well as to slower reaction times for older adults. Such slowing can explain a number of the impairments observed in working memory function, leading to deficits in long‐term episodic memory. Typically, age differences in memory tasks are most pronounced when tasks require speeded response (e.g., Stine, Wingfield, & Poon, 1986). However, in many situations older adults can compensate for slowing by taking longer to encode and retrieve information. Thus, some strategic control over retrieval processing might contribute to older adults taking more time at retrieval (although often studies attempt to account for this by giving older adults more time than younger adults). It may even be the case that older adults should be encouraged to take more time in order to engage efficient encoding and retrieval operations, as opposed to simply relying on familiarity and gist, and perhaps value can dictate how older adults should allocate study time and resources. Strategic and selective control theory would posit that older adults can maximize memory performance in situations in which they are free to use controlled processing, and slower and self‐initiated engagement of memory. For example, Benjamin and Craik (2001) found that younger adults’ source memory under speeded response conditions resembled that of older adults. Jacoby (1999) has found that repetition at encoding can lead to greater use of familiarity by older adults in the process dissociation procedure. Together, these studies suggest that speeded conditions and repeated presentations can lead to the unopposed use of familiarity.
When value is assigned or attached to items, it may be that older adults can compensate and benefit by studying fewer items but remembering these items well, and can self‐select how to allocate study time to high value information. Along these lines, Dunlosky and Connor (1997) found that both younger and older adults allocated extra
study time to items that they judged as difficult to remember. In addition, judgments on one trial and study times on the next trial were negatively correlated, suggesting that both younger and older adults utilized monitoring to efficiently allocate study time to material that required extra time (although the magnitude of these correlations was less for older than for younger adults). The global use of metacognitive skills is crucial for older adults, and in some cases aging does not impair the monitoring of encoding, even though aging adversely affects associative learning (Hertzog, Kidder, Powell‐Moman, & Dunlosky, 2002). Importantly, older adults can learn to effectively monitor associative learning via training with retrieval and self‐testing (Dunlosky, Kubat‐Silman, & Hertzog, 2003), as also shown in the selectivity paradigm (Figs. 1 and 2). What is of interest is to extend this metacognitive work with items of different values to determine whether older adults can sufficiently allocate study time to high value information and spend minimal time encoding low value information. Thus, in light of slowing and reduced capacity, older adults’ memory performance may be made more efficient with a shift in self‐paced study time from low to high value information (allowing this information to be encoded with greater confidence and accuracy), while younger adults may not (need to) be as selective under these circumstances.
C. ASSOCIATIVE MEMORY IMPAIRMENTS AND VALUE
The ability to link units of information to form more complex representations of the past is a critical function of memory. One explanation for older adults’ poorer episodic memory performance is based on impaired binding (Chalfonte & Johnson, 1996), leading to an associative deficit (Naveh‐Benjamin, 2000). This is supported by many observations that older adults have greater difficulty remembering source information (Johnson, Hashtroudi, & Lindsay, 1993), the context in which information was previously presented (Spencer & Raz, 1995), the link between two units of information, such as names and faces (Naveh‐Benjamin, Guez, Kilb, & Reedy, 2004), and, in the laboratory, unrelated word pairs (Castel & Craik, 2003; Naveh‐Benjamin, 2000). However, age differences are negligible in terms of memory for related word pairs. Thus, an associative deficit hypothesis (Naveh‐Benjamin, 2000) can partially explain many of the memory errors that older adults experience when trying to remember arbitrarily related associative information. However, this associative impairment does not fully explain findings in which older adults can in fact use and remember source information in certain circumstances. Typically, older adults display impairments in remembering source information (Schacter, Kaszniak, Kihlstrom, & Valdiserri, 1991). However, in a source memory experiment by Rahhal et al. (2002) it
was shown that although older adults had difficulty remembering the voice (male or female) in which a statement was heard, when older adults were told prior to study that the speaker’s voice indicated whether the statement was true or false they displayed good memory for the ‘‘truthfulness’’ of these statements. Thus, it may be that when an associative memory task involves binding arbitrary bits or units of information (e.g., a voice with a statement, unrelated word pairs, or numbers with words), older adults display associative memory impairments (e.g., Castel, 2007). However, when the memory task involves more meaningful and naturalistic associative information, which involves older adults placing a greater value on the source of information, age differences are reduced or eliminated. Naturalistic and emotional content can also influence how source memory information is processed and retained by younger and older adults. May, Rahhal, Berry, and Leighton (2005) showed that when emotional information was conveyed by the source, older adults could remember source information about where food was located (left or right side of a room), if this information was coupled with high value meaning (food on the left was poisonous while food on the right was fresh). Although May et al. interpreted this in terms of emotional content, value plays a critical role in how older adults process source information, and when source information conveys critical information, older adults seem able to remember this form of source. Finally, value formation can also be dictated by experience, particularly when older adults engage in evaluative processing, such as when remembering the prices of grocery items.
As stated previously, Castel (2005) has shown that older adults can remember price information (an item and its exact price) only when the price reflects market value, but not when the prices are greatly exaggerated, suggesting that older adults require and benefit from ‘‘schematic support’’ when encoding new associations (see Craik & Bosman, 1992). This may be similar to older adults being better at remembering related word pairs, but in the context of more naturalistic materials. However, it is evident that value can influence the ability to remember associations, and older adults might focus on associative information only when this conveys critical information that is consistent with goals.
D. RECOLLECTION, FAMILIARITY, AND VALUE
One reason for older adults’ associative memory impairments may be the failure to utilize more detailed recollective processing at retrieval, and instead rely on familiarity, which can lead to false memory errors (Jacoby & Rhodes, 2006). Jacoby and colleagues have used the process dissociation procedure to investigate the contributions of recollection and familiarity, and consistently
find that older adults rely more on familiarity, presumably due to deficits in recollection (see Yonelinas, 2002, for a review), although this is amenable to training in some circumstances (Jennings & Jacoby, 2003). The proposed model and strategic and selective control theory also illustrate situations in which older adults use familiarity at retrieval. However, as is often the case for older adults, unopposed reliance on familiarity can lead to a variety of errors and biases on memory tests, and the model suggests this is related to coarse grain size of information specificity at retrieval. Although recollection is thought to be a slower, more controlled and detailed retrieval process, and impaired in old age (Light, Prull, La Voie, & Healy, 2000; Yonelinas, 2002), there are situations in which older and younger adults exhibit a similar reliance on familiarity (e.g., Benjamin & Craik, 2001). For older adults, a reliance on familiarity may be a necessary and adaptive shift based on an awareness of deficits in recollection. Age‐related deficits in recollection are observed for verbal materials that differ slightly from study to test (e.g., knee‐bend and knee‐bone). Older adults’ use of familiarity does not allow for the differentiation between similar verbal materials at test. However, older adults can improve with specific training and feedback in this type of procedure (Rhodes, Jacoby, Daniels, & Rogers, 2007), and especially when other more naturalistic, nonverbal materials are used, familiarity seems to be used by both younger and older adults. For example, Bastin and Van der Linden (2006) used unfamiliar faces and found that both younger and older adults used familiarity to a similar extent on a recognition test.
Rhodes, Castel, and Jacoby (2006) also reported that familiarity was a strong contributor for memory for previously presented face pairs, as both younger and older adults displayed high error rates for rearranged pairs, which consisted of two previously presented faces that were not paired together at encoding. When materials are somewhat conducive to the use of familiarity, both younger and older adults engage in this type of familiarity‐based processing, and it may be the case that older adults also use familiarity for word pairs for similar reasons. Although familiarity can lead to a number of memory errors and considerable frustration (e.g., a face is familiar, but where do I know this person from?), it may be that older adults can benefit from this initial familiarity to eventually engage later details of retrieval, although on most laboratory‐based memory tests this can result in errors. Multhaup (1995) did find an exception to this by providing older adults with more detailed response options that allowed them to be more aware of familiarity‐based errors, and Roediger and Geraci (2007) similarly found a decrease in misinformation errors when older adults were given practice with the retrieval of source information, thus avoiding the use of familiarity.
E. FALSE MEMORY AND FLEXIBLE REMEMBERING
There is also evidence to suggest that older adults rely on more gist‐based memory, which refers to a highly abstracted and semantically rich representation of the past, relative to more specific verbatim memory, which is memory for the exact sensory inputs of a given situation in the past (e.g., Reder et al., 1986). Fuzzy‐trace theory (Brainerd & Reyna, 2001) suggests that with age the ability to retain verbatim information deteriorates more quickly than the ability to retain gist information (e.g., Schacter, Koutstaal, Johnson, Gross, & Angell, 1997; Titcomb & Reyna, 1995; Tun, Wingfield, Rosen, & Blanchard, 1998). For example, in the Deese/Roediger/McDermott (DRM) paradigm (Deese, 1959; Roediger & McDermott, 1995), older adults are more likely to falsely remember the critical semantic associate (a highly related member of the semantic class which makes up the study list but was not actually presented at study) than younger adults on both recognition (Balota et al., 1999; Koutstaal & Schacter, 1997; Norman & Schacter, 1997) and recall (Kensinger & Schacter, 1999; Norman & Schacter, 1997; Tun et al., 1998) tasks. This suggests age‐related reliance on gist memory or age‐related declines in verbatim memory (Brainerd & Reyna, 2001), or age‐related differences in semantic activation and monitoring at retrieval (Balota et al., 1999). There are circumstances in which older adults can reduce false memory errors. Tun et al. (1998) observed that age‐related differences decreased when all participants were encouraged, through task demands, to rely on a gist representation of the study list. McCabe and Smith (2002) have also shown that older adults can reduce false alarms to the critical lure if warned prior to the encoding session about the nature of the task and materials. This suggests that although older adults may typically rely on gist‐based representations, under certain conditions they are able to access and use more specific information.
Such findings are consistent with the data previously described showing that older adults can remember gist‐based information about grocery prices, as well as more specific information about market‐value prices (e.g., Castel, 2005), by relying on schemas and evaluative processing. The findings from the DRM paradigm might also indicate that older adults can focus on integrating related units of information to a more general grain size, leading to false memory errors, but also somewhat useful gist‐based memories of the past. For example, Adams (1991) has shown that for text recall, memory became more reconstructive (using prior knowledge to supplement recall) in old age, and included more elaborations and metaphoric prepositions, perhaps indicating that older adults place more value on this manner of recall when communicating information. Interestingly, younger experts are also prone to false recall of information within their domain of expertise (Castel, McCabe, Roediger, & Heitman, 2007)
suggesting that this effect may be driven in part by prior knowledge supplementing (and interfering with) the processes involved in accurate memory performance. Thus, if both younger experts and older adults are prone to these types of errors, one could make the claim that older adults use memory in an expert‐like fashion, which results in efficient memory performance, but with the side effect of gist‐based and domain‐specific memory errors. Although gist‐based processing can lead to memory errors, especially for older adults, it allows for the transfer of learning to new situations and to complex forms of thought such as using analogies and drawing inferences based on the classification of events and objects (e.g., Caplan & Schooler, 2001; Reder et al., 1986). Although older adults seem to rely on gist‐based processing, the ability to switch between gist recall and recollection of details is a critical function, and this has been referred to as ‘‘flexible remembering’’ (Koutstaal, 2006). Koutstaal (2006) has provided further evidence that older adults utilize gist‐based representations, and that the ability to switch between these two forms of remembering is present in younger adults and, to a lesser extent, in older adults. This suggests that gist‐based processing may be a default mode of encoding and retrieval by older adults, even though older adults can and do encode details (Koutstaal, 2003), as evidenced by priming studies (e.g., Light et al., 2000). Given that Adams and colleagues (Adams, 1991; Adams, Smith, Nyquist, & Perlmutter, 1997) have shown that older adults recall the gist of stories, as well as more interpretative information, whereas younger adults are better at recalling specific details of the story, it appears that older adults use memory in different ways, especially in terms of the abstraction and retrieval of gist.
For example, older adults might quickly decompose specific information to a more general, manageable gist‐based form, such as remembering that a new television costs ‘‘about $4000,’’ rather than the more specific and accurate price of $3989. What remains to be understood is whether gist‐based processing is an adaptive form of remembering for older adults (cf. Schacter, 1999), and whether older adults can utilize more detailed processing for high value information. The evidence from emotional processing, and from other circumstances that involve evaluative processing, seems to suggest that under some conditions older adults can recruit these processes, but only in situations that dictate the need and opportunity to avoid familiarity‐ or gist‐based memory errors.
F. PROPER NAMES AS LOW VALUE INFORMATION?
Memory for proper names is one of the most noticeable memory impairments that typically accompanies aging (see James, 2006; Rendell, Castel, & Craik, 2005). Although proper name retrieval is one of the chief complaints of older
adults, and likely represents one of the critical memory impairments that older adults face, the consequences of not being able to recall other pertinent information about a target individual might be more severe. For example, not being able to recall a person’s name, but remembering other details (e.g., their profession, that they have two children, they drive an expensive sports car, they are someone you can confide in, etc.) is often more valuable and critical for future interactions with this person (despite the embarrassment of forgetting their name); thus, proper name memory impairments might be somewhat adaptive in light of then being able to focus on the encoding and recall of other more pertinent associated details. If proper names contained higher value information, older adults might be more prone to pay attention to this type of information, as it would be given a certain degree of priority. In terms of emotion and binding, MacKay and colleagues suggest a priority‐binding theory, in which arousing stimuli trigger emotional reactions that prioritize the process of binding that item to its context (Hadley & MacKay, 2006; MacKay & Ahmetzanov, 2005; MacKay et al., 2005). According to the priority‐binding theory (MacKay et al., 2005), when a word is seen during a list‐learning task, a binding node is primed to form connections between the episodic context and the word meaning. When the word is arousing, relative to other words that are presented in rapid fashion, activation of other currently primed binding nodes is delayed until binding for the higher priority emotional item is complete (Hadley & MacKay, 2006). In the context of proper name encoding and retrieval, it is likely that other more salient information is given priority when someone is introduced (a time when most proper name encoding occurs in everyday interaction), but the name is given lower priority relative to other important or emotional information about the person.
It is important to note that even when proper name information contains semantic information (e.g., Mr. Barber is a Baker), both younger and older adults show impairments for the proper name relative to the occupation information (Rendell et al., 2005). Craik (2002) suggests that as to‐be‐remembered information becomes more specific, age‐related differences in memory performance become more apparent, with proper names representing highly specific and arbitrary types of information. Older adults might only be able to encode and retrieve proper names in situations in which proper names carry significant meaning or importance for future use, or convey some emotional component, which is very rarely the case. However, this type of impairment might seem efficient (although very frustrating) if one needs to retain other higher value information at the expense of recalling a proper name.
G. MEMORY, VALUE, AND GRAIN SIZE AT RETRIEVAL
Turning to retrieval, and the final stage in the proposed model that focuses on grain size, the findings from the false memory/DRM literature suggest that older adults may utilize gist‐based encoding and retrieval operations under certain situations, but it remains unclear why older adults use gist processing as opposed to relying on verbatim information. Although a general reduction in available processing resources may partially explain the reliance on gist, older adults may be able to maximize memory performance using appropriate ‘‘grain‐size’’ analysis (Koriat & Goldsmith, 1996; Goldsmith et al., 2002; Goldsmith and Koriat, this volume) in conjunction with environmental and schematic support. Control over grain size can be defined as the operation in which one chooses the level of detail (‘‘precision’’) or generality (‘‘coarseness’’) at which to encode and later report remembered information (Goldsmith et al., 2002). For example, if one witnesses a crime and attempts to remember certain characteristics of the assailant, one could encode (and/or later retrieve) precise information such as ‘‘the man was precisely 5 ft, 10 in. tall,’’ or more general information, such as the man was ‘‘about 6 ft tall,’’ or ‘‘about my height.’’ Goldsmith et al. (2002), and Goldsmith and Koriat (this volume) highlight an important distinction between memory accuracy and memory quantity, such that people can withhold information that they might feel unsure about or provide relatively coarse information that is unlikely to be wrong, or fits appropriately with the situation. According to this notion, the rememberer has the ability to strategically control and regulate the grain size of their answers to accommodate the competing goals of accuracy and informativeness, suggesting that grain size is mediated by both cognitive and metacognitive processes. 
It may also be the case that expertise in a particular domain gives the rememberer more control over grain size, leading to better memory accuracy, as was the case with older adults being able to remember the exact price of market‐value grocery items, but only the range of other unrealistic prices (Castel, 2005). Although retrieval precision is not typically emphasized when evaluating memory changes in older adults (but see Burke & Light, 1981), it is important to understand how strategic control and monitoring at retrieval and grain size is related to evaluative processing at encoding, and if older adults can avoid memory errors by using appropriate grain size. For older adults, monitoring at retrieval does not necessarily lead to more effective and accurate retrieval (e.g., Rhodes & Kelley, 2005), and this can also lead to differential trade‐offs in terms of quantity and accuracy. For example, Kelley and Sahakyan (2003) found that older adults were substantially less accurate than young adults in free report cued recall,
which may be a more precise form of remembering; however, both older and younger adults made gains in memory accuracy from forced report to free report, but older adults did so at the expense of greater losses in quantity correct. However, like most memory paradigms, this did not involve a form of differential value at encoding (although Kelley and Sahakyan did manipulate incentives for accuracy), which could possibly lead to strategic opportunities for older adults to use memory in an optimal manner. The findings from studies such as those from the false memory/DRM literature suggest that it may be the case that older adults choose or are forced to employ a broader ‘‘grain‐size’’ analysis during both encoding and retrieval operations. This may lead to what appear to be impairments in terms of memory for specific items and associative information. For example, older adults could recall which grocery items were paired with prices that were incongruent with expectations (a broad grain size), but had greater difficulty remembering the precise price of these over‐ or underpriced items (a more fine‐grained analysis). Thus, as indicated in the model, goals can also influence retrieval specificity. Prior knowledge and expertise can fine‐tune the level of grain size, such that when prices were market value, older adults could rely on a more specific level of grain size to retrieve the exact price. Why older adults select (or can only use) certain levels of grain size is an important issue to examine in the future, as is looking at age‐related differences in the ability to adaptively and volitionally alter the level of grain‐size analysis that is appropriate for the task at hand.
It may be that older adults typically use (or can only use) a coarse/broad grain size in certain tasks (such as binding and later recognizing previously studied unrelated word pairs), whereas younger adults can adaptively modify grain size, leading to age‐related differences in the ability to remember associative information. The critical mechanism that has been emphasized in the model is the strategic and adaptive control of encoding operations, which involve the use of value assignment to incoming information in order to direct and bias encoding operations. Based on goals, motivation and prior knowledge, and consistent with the SOC framework, older adults can capitalize on remembering high value information. This information can be represented in most cases in a fairly general form, which is then supplemented with reconstructive processing and prior knowledge, leading to a certain grain size at retrieval. However, in some instances older adults can remember specific information especially if it is deemed to be high value or can be incorporated with goals and prior knowledge and has emotional significance, allowing for the use of recollective processing at retrieval.
V. Implications of Value on Memory and Aging
A. BRAIN MECHANISMS, VALUE, MEMORY, AND AGING
How the brain is involved in assigning value and prioritizing information is a critical question, especially in terms of aging and changes in cognitive function. The brain systems involved in memory for emotional information (and the positivity bias in old age) are thought to reflect preserved amygdala function in old age (Mather & Carstensen, 2005). However, the precise mechanism involved in assigning value to information via evaluative processing is largely unknown. Speculatively, it involves strategic control via frontal lobe function, and communication with hippocampal regions that are involved in binding (i.e., content to value binding). Given the strong emphasis on how goals influence value assignment, this might be somewhat similar to Moscovitch and Winocur’s (1992) model of ‘‘working‐with‐memory’’ in which the interaction between frontal and medial temporal lobe functions leads to efficient memory. Older adults display impairments due to poor communication between frontal and medial temporal regions; however, this might be the critical mechanism involved in value‐directed remembering. In the present context, the assignment of value might involve frontal functions, and would include having to ‘‘work with’’ and adapt to impairments that exist in terms of binding in working memory, and communication with medial temporal structures. This can be incorporated with current models that emphasize how adaptive coding integrates the role of prefrontal cortex in working memory (e.g., Duncan, 2001), as well as how cognitive control and the processing of context can be critical for efficient memory performance in old age (e.g., Braver et al., 2001). High value information might gain priority access and communication, at the expense of lower value information.
However, the working‐with‐memory model does not include predictions regarding how value influences memory, a critical point for understanding the brain mechanisms involved in evaluative processing and directed or motivated remembering. A recent neuroimaging study has shown that value (in the form of monetary incentive presented prior to learning) can lead to differential encoding of high and low value information, via dopamine release in the hippocampus (Adcock, Thangavel, Whitfield‐Gabrieli, Knutson, & Gabrieli, 2006). Adcock et al. (2006) used a procedure much like the selectivity paradigm, in which cues signaled a high ($5.00) or low (10 cents) value monetary reward for memorizing an upcoming scene. Subjects were tested a day later and were significantly more likely to remember scenes that followed cues for high value rather than low‐value reward. In addition, the monetary incentive delay task independently localized regions responsive to reward anticipation. In the
encoding task, high‐reward cues preceding remembered but not forgotten scenes resulted in substantial release of dopamine in the hippocampus, consistent with the notion that reward motivation promotes memory formation via dopamine release in the hippocampus prior to learning. These findings provide a mechanism for value to influence memory performance for younger adults and suggest that value‐directed remembering can result in neurochemical variation, leading to better memory based on value activation and reward. Although this specific work has not been extended to older adults (but see Bäckman, Nyberg, Lindenberger, Li, & Farde, 2006, for a review of aging, dopamine, and memory), it may be that older adults greatly benefit from this dopamine release when value is added to items, resulting in good memory for high value information. Logan et al. (2002) found that older adults often show activation at encoding of multiple frontal regions in a nonselective and atypical manner (compared to younger adults), resulting in poor memory performance. Thus, critical regions were under‐recruited, and when under‐recruitment was reversed by requiring older adults to use semantic elaboration, memory performance improved, suggesting that older adults can recruit regions when given appropriate direction (see also Cabeza, Anderson, Locantore, & McIntosh, 2002). In terms of value, it may be that value‐directed remembering results in the selective recruitment of critical frontal areas for high value information, leading to efficient memory for high value items but not lower value items. Whereas orienting tasks can improve the recruitment of critical brain regions for older adults, little is known about how value can influence the recruitment and use of these brain regions.
Although older adults might not be able to remember as much information in typical settings due to nonselective recruitment, this form of brain region “selectivity” may be improved via value‐directed remembering, such that older adults can remember high value information as a result of focusing on important information. Taken together, and in relation to value, memory, and aging, these two neuroimaging studies (Adcock et al., 2006; Logan et al., 2002) may suggest that the best way for older adults to remember high value information is to present the value prior to learning (i.e., state that the next bit of information is important to remember), or to emphasize value at the time of study in an incidental manner, allowing for the recruitment of appropriate brain regions. The actual assignment of value might be governed by frontal function, while binding might involve more subcortical structures such as the hippocampus and, in the case of emotion, the amygdala. For older adults, the subjective assignment of value is somewhat strategic in nature and involves frontal function, while the binding process in old age is more effortful and requires controlled processing, but communication between these areas can result in adaptive control of memory.
B. EXPERTISE AS ADAPTIVE CONTROL AND SKILLED COGNITION IN OLD AGE REVISITED
In light of the previous review and model, it is important to assess how expertise can influence memory performance in old age, possibly via assigning value to information, as well as in a compensatory manner. Expertise in a certain domain can often improve memory performance for domain‐specific material, and the idea that this may also reduce age‐related memory decline has been widely studied. Previous research on aging, expertise, and memory has shown that expertise can facilitate memory performance for domain‐relevant information (see Krampe & Charness, 2006, for a review). This has been demonstrated in domains such as memory for chess (Charness, 1981), cooking information (Miller, 2003), aviation information (Morrow, Leirer, Altieri, & Fitzsimmons, 1994), spatial layouts (Hess & Slaughter, 1990), and music (Meinz & Salthouse, 1998), although in many cases expertise simply leads to similar benefits in performance for both younger and older adults (see Arbuckle, Cooney, Milne, & Melchior, 1994). However, given that younger and older adults likely have different goals, it may not be useful to determine when both age groups reach a similar level of performance (e.g., older adults might be satisfied with any expertise‐related improvements, even if they do not reach the same levels as younger adults). It may be that expertise leads to the assignment of higher value to certain types of domain‐specific information, such that a football score (and the winning team) is better remembered by a football fan because it is of high value, relative to a stock market quote, which might have less relevance to a football fan but greater value to a broker. The critical endeavor is to better understand how (and why) older adults engage in heuristics to facilitate performance, in light of impairments, and how older adults function like experts in terms of assigning value to information that they deem important, in order to compensate for global memory impairments.
Previously, it was shown that older adults need to adapt to general slowing, and one mechanism involves compensation for slowing via strategic regulation and allocation of cognitive processes. Although not directly in the domain of long‐term memory and value, Salthouse (1984) provided convincing evidence for compensatory behavior in a study of expertise and transcription typing. In particular, he showed that older adult transcription typists compensated for declines in perceptual processing speed by looking further ahead in the to‐be‐typed text than younger adults. Thus, even though older typists were slower, consistent with the generalized slowing hypothesis, they engaged in a strategy that allowed for some compensation. Similarly, Bosman (1993) examined how younger and older adults performed in a task that involved making rapid responses to multiple sequentially presented
letters that were presented in pairs. She found that older adults strategically engaged in slower responses on the first trial, but would then benefit from this controlled slowing by making a more rapid response on the second trial of a pair, resulting in somewhat efficient performance. These findings can be explained by Baltes’s selective optimization with compensation (SOC) model (Baltes, Staudinger, & Lindenberger, 1999), in that older adults will focus on optimizing performance in an area by using selective compensation. These examples suggest that older adults can exert some strategic control but that this is governed by expertise and heuristics, and this may be related to the value that is placed on speed and accurate performance. Some older adults may be highly experienced, skilled, or even experts in terms of working with changes in memory performance and using adaptive techniques to combat age‐related changes in memory. Thus, older adults who are aware of declines in memory ability may adapt by using strategies that allow them to focus on important information, and this might be considered a form of expertise in terms of the SOC framework. In the present context, older adults might be especially good at selectively assigning low value to many kinds of information that they feel they cannot remember, and then focusing on high value information. Thus, one important form of expertise in terms of dealing with memory changes in old age is the refined ability to successfully allocate value (and thus attention) to high value information, and not to focus on irrelevant information. The variability in this form of expertise (i.e., the ability to engage in evaluative processing) might contribute to the observation that some older adults are more selective in terms of what information they can remember.
It could even provide anecdotal evidence for why some dementia patients are capable of seemingly selective memory errors, while remembering certain types of information at inappropriate times, which may have been high value information at another time in life.
C. INDIVIDUAL DIFFERENCES AMONG OLDER ADULTS: THE CONTROL AND USE OF VALUE ASSIGNMENT
Older adults may be able to use value to guide encoding and retrieval, but it is likely that there are differences in the extent to which all older adults can efficiently use value in a strategic manner. As a group, older adults differ on an array of variables (see Nyberg & Bäckman, 2006), including areas such as working memory capacity and inhibitory control (Hasher & Zacks, 1988), and these variables may relate to how well an individual can use value to guide encoding and retrieval operations. Thus, assignment of value may be the critical role that “cognitive control” plays in cognitive aging. Some older adults may be better at recruiting appropriate brain networks for compensation (e.g., Cabeza et al., 2002) and this might be accomplished via value‐directed remembering.
Also, older adults need to use prospective memory in many cases in terms of knowing what information will be of high value at a later time, depending on whether value assignment and later retrieval is strategic and self‐initiated, or more automatic (e.g., Einstein, McDaniel, Richardson, Guynn, & Cunfer, 1995). Evaluative processing at encoding can ideally lead to older adults being selective about what information they encode for future use, but it is clear that this process does not lead to all older adults simply ignoring low value or task irrelevant information. Impairments may thus exist in terms of cognitive control at initial encoding stages, while a higher level control system (strategic control in light of value) then leads to older adults focusing on higher value information, despite perceptual or lower value information still being registered. It may be that within a general older adult population, individual differences exist in terms of the degree to which strategic control can override if/how low value or task irrelevant information is encoded. This ability might be related to frontal lobe function, working memory capacity, and the reflexive refresh function in short‐term memory suggested by Johnson, Reeder, Raye, and Mitchell (2002; Johnson et al., 2005; see also Mather & Knight, 2005, for how measures of cognitive control relate to goal‐directed memory). Although there is a great deal of variability in older adult samples, perhaps one common theme is that older adults try to remember information that they feel is personally relevant or of high value. In their review of aging and long‐term memory, Zacks and Hasher (2006) suggest that older adults may actually set their own agendas in terms of what information they find important or personally useful, and thus may be more discriminating than younger adults. This might involve using more shortcuts or heuristics, relative to younger adults, and can lead to certain kinds of memory errors.
However, some of the reviewed studies suggest that older adults can engage in more detailed analytic encoding by using evaluative processing, in the context of how valuable the information is for the older adult. Older adults are often more inclined to remember options that they have chosen relative to other options in the context of decision making (Mather & Johnson, 2000), and this might be viewed as adaptive in order to lead to more positive emotion later in life. It also represents how older adults might assign high value to chosen options, given that other nonselected options are likely no longer relevant (thus, now of lower value). This approach of investigating how preferences and personal choice influence how memory is used is an important facet of decision making, as well as how much satisfaction is derived from making decisions consistent with value (e.g., Higgins, 2005). Constructing preferences from memory, as outlined in the “Preferences as Memory” framework (Weber & Johnson, 2006), emphasizes how preferences and personal choice (i.e., subjective value) can influence and bias decision
making. Given how older adults can direct memory via value‐directed procedures, this might be related to preferences stored in long‐term memory. Although “Preferences as Memory” has not been examined in the context of older adults and memory performance, this is certainly an important avenue for future research in cognitive aging, one that can easily incorporate how preferences and value can lead to biases and efficient memory use and decision‐making performance by older adults.
D. VALUE‐DIRECTED REMEMBERING AND IMPLICATIONS FOR TRAINING
Given the variability and decline in memory performance in old age, there is considerable interest in developing training regimens that improve memory performance in old age, and most training studies require large amounts of practice in order to yield significant benefits. Memory performance can also be controlled and improved through the judicious use of mnemonic strategies. For example, imagery and verbal association strategies have been shown to enhance memory in both the laboratory (Verhaeghen, Marcoen, & Goossens, 1992) and real‐world situations (West, 1996). Although older adults do not engage in spontaneous strategy use as frequently as younger adults do, their use of a strategy can greatly enhance performance if it is suggested to them (see West, 1996, for a review). The use of strategies requires strong motivation and effort, and although older adults can often see the benefits of using such strategies in the short term, the maintenance of strategy use is short‐lived in the real world if motivation and effort are not rewarded (Dunlosky & Hertzog, 1998; West, 1996). In terms of processing contextual information that is important for goal‐related behavior, Paxton, Barch, Storandt, and Braver (2006) found that age‐related differences in context processing can be ameliorated by directed strategy training. However, research on training and expertise has suggested that age‐related cognitive sparing is often quite narrow (Kramer & Willis, 2003), being observed only on tasks and skills similar to those on which individuals have been trained (i.e., very little transfer to other domains or areas of learning). Ironically, it appears that significant training is often required to begin to use and benefit from memory training (one needs to have practice with training), and this added complication often leads to frustration and lack of reinforcement for older adults. Thus, training needs to be consistent with an individual’s need to improve memory in specific ways.
Older adults’ perceived control of memory ability (e.g., Lachman, 2006) may be critical for the use of effective training procedures, and value‐directed remembering may lead to enhancement in the degree of perceived control that older adults experience when working with memory.
Although there are many situations in which older adults can benefit from and exploit other factors that can optimize memory performance, such as exercise and fitness training (Kramer & Willis, 2003) and being aware that memory is better at the optimal time of day (typically in the morning, see May, Hasher, & Stoltzfus, 1993), one missing element is the manner in which training is conceptualized in relation to older adults’ goals. Thus, it might not be appropriate to try to coerce older adults to use somewhat unfamiliar and esoteric memory strategies such as forming bizarre images or using elaboration and somewhat “nonsense” mnemonics, especially for low value information. Developing and implementing appropriate strategies is essential, in the same way that it is not appropriate to give a little league baseball player a 38‐ounce baseball bat used by a professional to hit a baseball (i.e., the bat is a powerful tool, but in this case too heavy and not appropriate for a younger player). Although most training is directed toward getting older adults to engage in memory strategies favored by younger adults, a more appropriate goal might be to develop training regimens that build on older adults’ strengths, namely the ability to engage in value‐directed remembering. For example, Rhodes et al. (2007) found that older adults can improve memory and the accuracy of confidence judgments when given feedback in terms of a value‐based score. Given the distinction between objective and subjective value, it may be important for older adults to focus on objective value in order to enhance memory for detailed information, since subjective value assignment is already under the control of the individual. In general, training regimens might focus on teaching both younger and older adults how to prioritize what is committed to memory, via value‐directed remembering.
It appears that younger adults often have difficulty knowing what is important to remember for future tests (but have very little problem memorizing large amounts of information), while older adults under certain circumstances seem to be able to prioritize according to reductions in the ability to remember vast amounts of new information. Thus, both age groups can benefit from learning and implementing principles related to selection, prioritizing, and value‐directed remembering.
VI. Summary and Conclusions
Neisser (1982) wrote that cognitive psychology needs to address the “important” questions related to memory, leading to a cognitive revolution that generated great debate as to what were the important questions regarding memory, and the methodology that would be needed to answer these questions. Twenty‐five years after Neisser made this claim, research in cognitive aging might be faced with a similar challenge. Research in cognitive
aging can tell us a great deal about memory impairments, but we also need to be aware of how older adults strategically use memory in efficient ways in light of impairments. Older adults may feel the need to focus on the important things to remember, given a long life span of encountering information, and the knowledge that life, as well as memory resources, is limited (Carstensen, 2006). Thus, Neisser’s research perspective is also somewhat related to how older adults begin to view their memory, by identifying what is important and focusing on these aspects. Although value can take various forms, and be assigned both objectively and subjectively, it may be useful to draw on the diamond‐water paradox that was presented earlier in the context of Adam Smith’s objective or intrinsic theory of value. It may be the case that while younger adults can focus on detailed memory of many events (perceived as high value but not always functional information, akin to a diamond), older adults focus on more functional, gist‐based, and perhaps more practical and positive information (what they feel is necessary for a sustainable existence and enjoyable survival, much like water). A functional approach to how older adults direct resources to certain kinds and types of information, in relation to value and strategic and selective control theory, will likely be a fruitful manner to study age‐related change across the life span in order to understand the impairments, biases, and benefits that accompany memory performance in old age. Although this chapter presents a somewhat (overly?) optimistic outlook regarding how older adults can efficiently use memory in old age, these arguments are made in response to the obvious memory impairments that older adults face. Decades of research have shown numerous types of specific deficits and disproportionate impairments in a variety of memory tasks, both naturalistic and laboratory‐based, most of which are not ameliorated by training strategies.
This might be one of the most universal findings in cognitive aging, but this story needs to be interpreted in a framework that emphasizes life span development. Although it does not come as a surprise that at the age of 60 years most of us cannot run as fast as we could at the age of 16 years, changes in certain kinds or speed of memory performance need to be interpreted in the context of how they can influence, not simply impair, the use of memory in old age. The present chapter emphasizes that what is critical is the adaptive nature of human memory, and how memory can function in light of the value placed on the information. Given that the adult life span has increased significantly over the past few decades, memory must adapt to cope with living longer. Carstensen and Charles (2002) argue that even good news (living longer) is taken in a somewhat negative tone (poorer memory and cognitive function), a perspective that is often taken by younger but not older adults. What is important to study is how older adults adaptively cope with a longer life span,
and how value plays a critical role in maximizing memory performance, and well‐being in general. The present arguments suggest that as we get older we start to use our memory in different ways, focusing on what we deem important in light of knowing that we cannot remember everything (or in many situations, most things). This does not just start at the age of 65 years, as many of us need to prioritize and effortfully direct attention to PIN numbers, passwords, and learning new names, even at the “young” age of 30 years, often relying on (electronic) devices to help remember critical information. Anecdotally, older adults (as well as younger adults, to a certain degree) will remark that although they have difficulty remembering information, “if it is important, then I will remember it.” Thus, perhaps one benefit of old age and wisdom is learning what is important in life and then directing resources to achieve these goals. Memory impairments clearly exist in old age, but older adults can exert some degree of strategic control via evaluative processing to direct cognitive resources to high value or high priority information. William James (1890) commented on this, arguing that “Selection is the very keel on which our mental ship is built. And in the case of memory its utility is obvious. If we remembered everything, we should on most occasions be as ill off as if we remembered nothing” (p. 680). Although this quote puts the case for selectivity rather strongly (and James might have been noticing his own age‐related change in memory at the time?!), it does emphasize the need to be selective when trying to remember new information, especially in old age. This process may lead to efficient memory performance in light of reductions in processing speed or capacity, and may lead older adults to be more discriminating about what kinds of information are committed to memory.
In some cases this may lead to gist‐based processing, or the reliance on familiarity in the absence of more detailed recollection. Older adults’ bias to focus on processing high value information coupled with prior knowledge to supplement memory, or remembering important positive information at the expense of other details, can lead to the efficient and effective use of memory during the adult life span.
ACKNOWLEDGMENTS
I thank both Lynn Hasher and Gus Craik for engaging discussions and debates regarding memory and cognitive aging, which forced me to take a new approach to how I think about studying memory and aging, and provided some of the motivation for this perspective and line of research. I also acknowledge the influence of Paul Baltes, who influenced how I thought about aging before I began to study it. I specifically thank David McCabe and Matthew Rhodes for extremely useful comments, as well as the following people who have provided
encouragement, and critical views in response to the arguments made in the present work: Aaron Benjamin, Laura Carstensen, Boris Castel, Judy Gold, Jessica Logan, Don MacKay, and Rose Zacks. Finally, I am continually grateful for the insight of many older adults, who seem to learn and know valuable information about their own memory.
REFERENCES
Adams, C. (1991). Qualitative age difference in memory for text: A lifespan developmental perspective. Psychology and Aging, 6, 323–336.
Adams, C., Smith, M. C., Nyquist, L., & Perlmutter, M. (1997). Adult age‐group differences in recall for the literal and interpretive meanings of narrative text. Journal of Gerontology: Psychological Sciences, 57B, P28–P40.
Adcock, R. A., Thangavel, A., Whitfield‐Gabrieli, S., Knutson, B., & Gabrieli, J. D. E. (2006). Reward‐motivated learning: Mesolimbic activation precedes memory formation. Neuron, 50, 507–517.
Arbuckle, T. Y., Cooney, R., Milne, J., & Melchior, A. (1994). Memory for spatial layouts in relation to age and schema typicality. Psychology and Aging, 9, 467–480.
Bäckman, L., Andersson, J. L., Nyberg, L., Winblad, B., Nordberg, A., & Almkvist, O. (1999). Brain regions associated with episodic retrieval in normal aging and Alzheimer’s disease. Neurology, 52, 1861–1870.
Bäckman, L., Nyberg, L., Lindenberger, U., Li, S.‐C., & Farde, L. (2006). The correlative triad among aging, dopamine, and cognition: Current status and future prospects. Neuroscience and Biobehavioral Reviews, 30, 791–807.
Balota, D. A., & Faust, M. E. (2001). Attention in dementia of the Alzheimer’s type. In F. Bolla and S. Cappa (Eds.), Handbook of neuropsychology: Vol. 6. Aging and dementia (2nd ed., pp. 51–80). New York, NY: Elsevier Science.
Balota, D. A., Cortese, M. J., Duchek, J. M., Adams, D. R., Roediger, H. L., III, McDermott, K. B., et al. (1999). Veridical and false memories in healthy older adults and in dementia of the Alzheimer’s type. Cognitive Neuropsychology, 16, 361–384.
Balota, D. A., Dolan, P. O., & Duchek, J. M. (2000). Memory changes in healthy young and older adults. In E. Tulving and F. I. M. Craik (Eds.), The Oxford handbook of memory (pp. 395–410). Oxford University Press.
Baltes, P. B., & Baltes, M. M. (1990). Psychological perspectives on successful aging: The model of selective optimization with compensation. In P. B. Baltes and M. M. Baltes (Eds.), Successful aging: Perspectives from the behavioral sciences (pp. 1–34). New York, NY: Cambridge University Press.
Baltes, P. B., Staudinger, U. M., & Lindenberger, U. (1999). Lifespan psychology: Theory and application to intellectual functioning. Annual Review of Psychology, 50, 471–507.
Bastin, C., & Van der Linden, M. (2006). The effects of aging on the recognition of different types of associations. Experimental Aging Research, 32, 61–77.
Benjamin, A. S. (2007). Memory is more than just remembering: Strategic control of encoding, accessing memory, and making decisions. In A. S. Benjamin and B. H. Ross (Eds.), The psychology of learning and motivation (Vol. 48, pp. 175–223). San Diego, CA: Academic Press.
Benjamin, A. S., & Craik, F. I. M. (2001). Parallel effects of aging and time pressure on memory for source: Evidence from the spacing effect. Memory & Cognition, 29, 691–697.
Blanchard‐Fields, F. (2007). Everyday problem solving and emotion: An adult developmental perspective. Current Directions in Psychological Science, 16, 26–31.
Blanchard‐Fields, F., & Camp, C. J. (1990). Affect, individual differences, and real world problem solving across the adult life span. In T. Hess (Ed.), Aging and cognition: Knowledge organization and utilization (pp. 461–497). Oxford, England: North Holland.
Bosman, A. A. (1993). Age and skill differences in typing related and unrelated reaction time tasks. Aging and Cognition, 1, 310–322.
Brainerd, C. J., & Reyna, V. F. (2001). Fuzzy‐trace theory: Dual‐processes in reasoning, memory, and cognitive neuroscience. Advances in Child Development and Behavior, 28, 49–100.
Braver, T. S., Barch, D. M., Keys, B. A., Carter, C. S., Kaye, J. A., Janowsky, J. S., et al. (2001). Context processing in older adults: Evidence for a theory relating cognitive control to neurobiology in healthy aging. Journal of Experimental Psychology: General, 130, 746–763.
Burke, D. M., & Light, L. L. (1981). Memory and aging: The role of the retrieval processes. Psychological Bulletin, 90, 513–546.
Cabeza, R. E., Anderson, N. D., Locantore, J. K., & McIntosh, A. R. (2002). Aging gracefully: Compensatory brain activity in high‐performing older adults. Neuroimage, 17, 1394–1402.
Caplan, L. J., & Schooler, C. (2001). Age effects on analogy‐based memory for text. Experimental Aging Research, 2, 151–165.
Carstensen, L. L. (1992). Social and emotional patterns in adulthood: Support for socioemotional selectivity theory. Psychology and Aging, 7, 331–338.
Carstensen, L. L. (2006). The influence of a sense of time on human development. Science, 312, 1913–1915.
Carstensen, L. L., & Charles, S. T. (2002). Human aging: Why is even good news taken as bad? In L. Aspinwall and U. Staudinger (Eds.), A psychology of human strengths: Perspectives on an emerging field. Washington, DC: American Psychological Association.
Carstensen, L. L., Isaacowitz, D. M., & Charles, S. T. (1999). Taking time seriously: A theory of social selectivity. American Psychologist, 54, 165–181.
Castel, A. D. (2005). Memory for grocery prices in younger and older adults: The role of schematic support. Psychology and Aging, 20, 718–721.
Castel, A. D. (2007). Aging and memory for numerical information: The role of specificity and expertise in associative memory. Journal of Gerontology: Psychological Sciences, 62, 194–196.
Castel, A. D., Benjamin, A. S., Craik, F. I. M., & Watkins, M. J. (2002). The effects of aging on selectivity and control in short‐term recall. Memory & Cognition, 30, 1078–1085.
Castel, A. D., & Craik, F. I. M. (2003). The effects of aging and divided attention on memory for item and associative information. Psychology and Aging, 18, 873–885.
Castel, A. D., Balota, D. A., Hutchison, K. A., Logan, J. M., & Yap, M. J. (2007). Spatial attention and response control in healthy younger and older adults and individuals with Alzheimer’s disease: Evidence for disproportionate selection breakdowns in the Simon task. Neuropsychology, 21, 170–182.
Castel, A. D., Farb, N., & Craik, F. I. M. (2007). Memory for general and specific value information in younger and older adults: Measuring the limits of strategic control. Memory & Cognition (in press).
Castel, A. D., McCabe, D. P., Roediger, H. L., III, & Heitman, J. L. (2007). The dark side of expertise: Domain specific memory errors. Psychological Science, 18(1), 3–5.
Chalfonte, B. L., & Johnson, M. K. (1996). Feature memory and binding in young and older adults. Memory & Cognition, 24, 403–416.
Charness, N. (1981). Visual short‐term memory and aging in chess players. Journal of Gerontology, 36, 615–619.
Craik, F. I. M. (2002). Human memory and aging. In L. Bäckman and C. von Hofsten (Eds.), Psychology at the turn of the millennium (pp. 261–280). Hove, UK: Psychology Press.
Craik, F. I. M., & Bosman, B. A. (1992). Age‐related changes in memory and learning. In H. Bouma and J. A. M. Graafmans (Eds.), Gerontechnology (pp. 79–92). Amsterdam: IOS Press.
Craik, F. I. M., & Byrd, M. (1982). Aging and cognitive deficits: The role of attentional resources. In F. I. M. Craik and S. E. Trehub (Eds.), Aging and cognitive processes. New York, NY: Plenum.
Craik, F. I. M., & Salthouse, T. A. (2000). The handbook of aging and cognition. Hillsdale, NJ: Lawrence Erlbaum.
Deese, J. (1959). On the prediction of occurrence of particular verbal intrusions in immediate recall. Journal of Experimental Psychology, 58, 305–312.
Diehl, M., Coyle, N., & Labouvie‐Vief, G. (1996). Age and sex differences in strategies of coping and defense across the life span. Psychology and Aging, 11, 127–139.
Duncan, J. (2001). An adaptive coding model of neural function in prefrontal cortex. Nature Reviews Neuroscience, 2, 820–829.
Dunlosky, J., & Connor, L. (1997). Age‐related differences in the allocation of study time account for age‐related differences in memory performance. Memory & Cognition, 25, 691–700.
Dunlosky, J., & Hertzog, C. (1998). Training program to improve learning in later adulthood: Helping older adults educate themselves. In D. J. Hacker, J. Dunlosky, and A. C. Graesser (Eds.), Metacognition in educational theory and practice (pp. 249–275). Mahwah, NJ: Lawrence Erlbaum.
Dunlosky, J., Kubat‐Silman, A. K., & Hertzog, C. (2003). Training monitoring skills improves older adults’ self‐paced associative learning. Psychology and Aging, 18, 340–345.
Einstein, G. O., McDaniel, M. A., Richardson, S. L., Guynn, M. J., & Cunfer, A. R. (1995). Aging and prospective memory: Examining the influence of self‐initiated retrieval processes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 996–1007.
Freund, A. M., & Baltes, P. B. (2002). Life‐management strategies of selection, optimization, and compensation: Measurement by self‐report and construct validity. Journal of Personality and Social Psychology, 82, 642–662.
Fung, H. H., & Carstensen, L. L. (2003). Sending memorable messages to the old: Age differences in preferences and memory for advertisements. Journal of Personality and Social Psychology, 85, 163–178.
Goldsmith, M., & Koriat, A. (2007). The strategic regulation of memory accuracy and informativeness. In A. S. Benjamin and B. H. Ross (Eds.), The psychology of learning and motivation (Vol. 48, pp. 1–60). San Diego, CA: Academic Press.
Goldsmith, M., Koriat, A., & Weinberg‐Eliezer, A. (2002). Strategic regulation of grain size in memory reporting. Journal of Experimental Psychology: General, 131, 73–95.
Grühn, D., Smith, J., & Baltes, P. B. (2005). No aging bias favoring memory for positive material: Evidence from a heterogeneity‐homogeneity list paradigm using emotionally toned words. Psychology and Aging, 20, 579–588.
Hadley, C. B., & MacKay, D. G. (2006). Does emotion help or hinder immediate memory? Arousal versus priority‐binding mechanisms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 79–88.
Hasher, L., & Zacks, R. T. (1988). Working memory, comprehension and aging: A review and a new view. In G. Bower (Ed.), The psychology of learning and motivation (Vol. 22, pp. 193–225). New York, NY: Academic Press.
Heckhausen, J. (1999). Developmental regulation in adulthood: Age normative and sociostructural constraints as adaptive challenges. New York, NY: Cambridge University Press.
Alan D. Castel
Heckhausen, J., & Schulz, R. (1995). A life span theory of control. Psychological Review, 102, 284–304.
Hertzog, C., & Dunlosky, J. (2005). Aging, metacognition, and cognitive control. In B. H. Ross (Ed.), The psychology of learning and motivation (Vol. 45, pp. 215–251). San Diego, CA: Academic Press.
Hertzog, C., Kidder, D. P., Powell‐Moman, A., & Dunlosky, J. (2002). Aging and monitoring associative learning: Is monitoring accuracy spared or impaired? Psychology and Aging, 17, 209–225.
Hess, T. M. (2005). Memory and aging in context. Psychological Bulletin, 131, 383–406.
Hess, T. M., & Slaughter, S. J. (1990). Schematic knowledge influences on memory scene information in young and older adults. Developmental Psychology, 26, 855–865.
Hess, T. M., Rosenberg, D. C., & Waters, S. J. (2001). Motivation and representational processes in adulthood: The effects of social accountability and information relevance. Psychology and Aging, 16, 629–642.
Higgins, E. T. (2005). Value from regulatory fit. Current Directions in Psychological Science, 14, 209–213.
Jacoby, L. L. (1999). Ironic effects of repetition: Measuring age‐related difference in memory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 25, 3–22.
Jacoby, L. L., & Hay, J. F. (1998). Age‐related deficits in memory: Theory and application. In M. A. Conway, S. E. Gathercole, and C. Cornoldi (Eds.), Theories of memory II (pp. 111–134). Hove, UK: Psychology Press.
Jacoby, L. L., & Rhodes, M. G. (2006). False remembering in the aged. Current Directions in Psychological Science, 15, 49–53.
James, L. (2006). Specific effects of aging on proper name retrieval: Now you see them, now you don’t. Journal of Gerontology, Psychological Science, 61, P180–P183.
James, W. (1890). The principles of psychology: Volume 1. New York, NY: Henry Holt.
Jenkins, J. J. (1979). Four points to remember: A tetrahedral model of memory experiments. In L. S. Cermak and F. I. M. Craik (Eds.), Levels of processing in human memory (pp. 429–446). Hillsdale, NJ: Lawrence Erlbaum.
Jennings, J. M., & Jacoby, L. L. (2003). Improving memory in older adults: Training recollection. Neuropsychological Rehabilitation, 13, 417–440.
Johnson, M. K., Hashtroudi, S., & Lindsay, D. S. (1993). Source monitoring. Psychological Bulletin, 114, 3–28.
Johnson, M. K., Raye, C. L., Mitchell, K. J., Greene, E. J., Cunningham, W. A., & Sanislow, C. A. (2005). Using fMRI to investigate a component process of reflection: Prefrontal correlates of refreshing a just‐activated representation. Cognitive Affective & Behavioral Neuroscience, 5, 339–361.
Johnson, M. K., Reeder, J. A., Raye, C. L., & Mitchell, K. J. (2002). Second thoughts versus second looks: An age‐related deficit in reflectively refreshing just‐activated information. Psychological Science, 13, 64–67.
Kelley, C. M., & Sahakyan, L. (2003). Memory, monitoring, and control in the attainment of memory accuracy. Journal of Memory and Language, 48, 704–721.
Kensinger, E. A., & Schacter, D. L. (1999). When true memories suppress false memories: Effects of aging. Cognitive Neuropsychology, 16, 399–415.
Kensinger, E. A., Piguet, O., Krendl, A. C., & Corkin, S. (2005). Memory for contextual details: Effects of emotion and aging. Psychology and Aging, 20, 241–250.
Koriat, A., & Goldsmith, M. (1996). Monitoring and control processes in the strategic regulation of memory accuracy. Psychological Review, 103, 490–517.
Koutstaal, W. (2003). Older adults encode–but do not always use–perceptual details: Intentional versus unintentional effects of detail on memory judgments. Psychological Science, 14, 189–193.
Value‐Directed Remembering and Aging
Koutstaal, W. (2006). Flexible remembering. Psychonomic Bulletin & Review, 13, 84–91.
Koutstaal, W., & Schacter, D. L. (1997). Gist‐based false recognition of pictures in older and younger adults. Journal of Memory & Language, 37, 555–583.
Kramer, A. F., & Willis, S. L. (2003). Cognitive plasticity and aging. In B. H. Ross (Ed.), The psychology of learning and motivation (pp. 267–302). San Diego, CA: Academic Press.
Krampe, R. T., & Charness, N. (2006). Aging and expertise. In K. A. Ericsson, N. Charness, P. Feltovich, and R. Hoffman (Eds.), Cambridge handbook of expertise and expert performance (pp. 723–742). Cambridge, UK: Cambridge University Press.
Labouvie‐Vief, G. (1990). Wisdom as integrated thought: Historical and developmental perspectives. In R. J. Sternberg (Ed.), Wisdom: Its nature, origin, and development (pp. 52–86). New York, NY: Cambridge University Press.
Lachman, M. E. (2006). Perceived control over aging‐related declines: Adaptive beliefs and behaviors. Current Directions in Psychological Science, 15, 282–286.
Leippe, M. R., Wells, G. L., & Ostrom, T. M. (1978). Crime seriousness as a determinant of accuracy in eyewitness identification. Journal of Applied Psychology, 63, 345–351.
Light, L. L., Prull, M. W., La Voie, D. J., & Healy, M. R. (2000). Dual‐process theories of memory in old age. In T. J. Perfect and E. A. Maylor (Eds.), Models of cognitive aging (pp. 238–300). Oxford: Oxford University Press.
Logan, J. M., Sanders, A. L., Snyder, A. Z., Morris, J. C., & Buckner, R. L. (2002). Under‐recruitment and non‐selective recruitment: Dissociable neural mechanisms associated with aging. Neuron, 33, 827–840.
MacKay, D. G., & Ahmetzanov, M. V. (2005). Emotion, memory, and attention in the taboo Stroop paradigm: An experimental analogue of flashbulb memories. Psychological Science, 16, 25–32.
MacKay, D. G., Hadley, C. B., & Schwartz, J. H. (2005). Relations between emotion, illusory word perception, and orthographic repetition blindness: Tests of binding theory. Quarterly Journal of Experimental Psychology, 58, 1514–1533.
Mather, M. (2006). A review of decision making processes: Weighing the risks and benefits of aging. In L. L. Carstensen and C. R. Hartel (Eds.), When I’m 64 (pp. 145–173). Washington, DC: National Academies Press.
Mather, M. (2007). Emotional arousal and memory binding: An object‐based framework. Perspectives on Psychological Science, 2, 33–52.
Mather, M., & Carstensen, L. L. (2003). Aging and attentional biases for emotional faces. Psychological Science, 14, 409–415.
Mather, M., & Carstensen, L. L. (2005). Aging and motivated cognition: The positivity effect in attention and memory. Trends in Cognitive Science, 9, 497–502.
Mather, M., & Johnson, M. K. (2000). Choice‐supportive source monitoring: Do our decisions seem better to us as we age? Psychology and Aging, 15, 596–606.
Mather, M., & Knight, M. (2005). Goal‐directed memory: The role of cognitive control in older adults’ emotional memory. Psychology and Aging, 20, 554–570.
Mather, M., Knight, M., & McCaffrey, M. (2005). The allure of the alignable: Younger and older adults’ false memories of choice features. Journal of Experimental Psychology: General, 134, 38–51.
May, C. P., Hasher, L., & Stoltzfus, E. R. (1993). Optimal time of day and the magnitude of age difference in memory. Psychological Science, 4, 326–330.
May, C. P., Rahhal, T., Berry, E. M., & Leighton, E. A. (2005). Aging, source memory, and emotion. Psychology and Aging, 20, 571–578.
McCabe, D. P., & Smith, A. D. (2002). The effect of warnings on false memories in young and older adults. Memory & Cognition, 30, 1065–1077.
McDaniel, M. A., Einstein, G. O., Stout, A. C., & Morgan, Z. (2003). Aging and maintaining intentions over delays: Do it or lose it. Psychology and Aging, 8, 823–835.
Meinz, E. J., & Salthouse, T. A. (1998). The effects of age and experience on memory for visually presented music. Journal of Gerontology: Psychological Science, 53B, P60–P69.
Mikels, J. A., Larkin, G. R., Reuter‐Lorenz, P. A., & Carstensen, L. L. (2005). Divergent trajectories in the aging mind: Changes in working memory for affective versus visual information with age. Psychology and Aging, 20, 542–553.
Miller, L. M. S. (2003). The effects of age and domain knowledge on text processing. Journals of Gerontology: Psychological Sciences, 58, P217–P222.
Morrow, D. G., Leirer, V. O., Altieri, P. A., & Fitzsimmons, C. (1994). When expertise reduces age differences in performance. Psychology and Aging, 9, 134–148.
Moscovitch, M., & Winocur, G. (1992). The neuropsychology of memory and aging. In F. I. M. Craik and T. A. Salthouse (Eds.), The handbook of aging and cognition (pp. 315–372). Hillsdale, NJ: Lawrence Erlbaum.
Multhaup, K. S. (1995). Aging, source, and decision criteria: When false fame errors do and do not occur. Psychology and Aging, 10, 492–497.
Nairne, J. S. (2005). The functionalist agenda in memory research. In A. F. Healy (Ed.), Experimental cognitive psychology and its applications: Festschrift in honor of Lyle Bourne, Walter Kintsch, and Thomas Landauer. Washington, DC: American Psychological Association.
Naveh‐Benjamin, M. (2000). Adult age differences in memory performance: Tests of an associative deficit hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1170–1187.
Naveh‐Benjamin, M., Guez, J., Kilb, A., & Reedy, S. (2004). The associative deficit of older adults: Further support using face‐name associations. Psychology and Aging, 19, 541–546.
Naveh‐Benjamin, M., Keshet, T., & Levi, D. (2007). The associative memory deficit of older adults: The role of efficient strategy utilization. Psychology and Aging, 22, 202–208.
Neisser, U. (1982). Memory: What are the important questions? In U. Neisser (Ed.), Memory observed (pp. 3–19). San Francisco, CA: Freeman.
Norman, K. A., & Schacter, D. L. (1997). False recognition in younger and older adults: exploring the characteristics of illusory memories. Memory & Cognition, 25, 838–848.
Nyberg, L., & Bäckman, L. (2006). Influences of biological and self‐initiated factors on brain and cognition in adulthood and aging. In P. B. Baltes, P. A. Reuter‐Lorenz, and F. Rösler (Eds.), Lifespan development and the brain: The perspective of biocultural co‐constructivism (pp. 239–254). Cambridge: Cambridge University Press.
Park, D. C. (2002). Judging meaning improves function in the aging brain. Trends in Cognitive Sciences, 6, 227–229.
Park, D. C., Smith, A. D., Lautenschlager, G., Earles, J. L., Frieske, D., Zwahr, M., et al. (1996). Mediators of long‐term memory performance across the life span. Psychology and Aging, 11, 621–637.
Park, D. C., Lautenschlager, G., Hedden, T., Davidson, N. S., Smith, A. D., & Smith, P. K. (2002). Models of visuospatial and verbal memory across the adult life span. Psychology and Aging, 17, 299–320.
Park, D. C., & Schwarz, N. (Eds.). (2000). Cognitive aging: A primer. Philadelphia, PA: Psychology Press.
Paxton, J. L., Barch, D. M., Storandt, M., & Braver, T. S. (2006). Effects of environmental support and strategy training on older adults’ use of context. Psychology and Aging, 21, 499–509.
Rahhal, T. A., May, C. P., & Hasher, L. (2002). Truth and character: Sources that older adults can remember. Psychological Science, 13, 101–105.
Reder, L. M., Wible, C., & Martin, J. (1986). Differential memory changes with age: Exact retrieval versus plausible inference. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 72–81.
Rendell, P. G., & Craik, F. I. M. (2002). Virtual week and actual week: Age related differences in prospective memory. Applied Cognitive Psychology, 14, 43–62.
Rendell, P. G., Castel, A. D., & Craik, F. I. M. (2005). Memory for proper names in old age: A disproportionate impairment? Quarterly Journal of Experimental Psychology, 58A, 54–71.
Rhodes, M. G., & Kelley, C. M. (2005). Executive processes, memory accuracy, and memory monitoring: An aging and individual differences analysis. Journal of Memory and Language, 52, 578–594.
Rhodes, M. G., Castel, A. D., & Jacoby, L. L. (2006). Memory for face pairs: An associative memory impairment? Poster presented at the 47th annual meeting of the Psychonomic Society, Houston, Texas.
Rhodes, M. G., Jacoby, L. L., Daniels, K. A., & Rogers, C. S. (2007). Training memory‐confidence calibration in the elderly. Manuscript submitted for publication.
Riediger, M., & Freund, A. M. (2006). Focusing and restricting: Two aspects of motivational selectivity in adulthood. Psychology and Aging, 21, 173–185.
Riediger, M., Li, S.‐C., & Lindenberger, U. (2006). Selection, optimization, and compensation as developmental mechanisms of adaptive resource allocation: Review and preview. In J. E. Birren and K. W. Schaie (Eds.), Handbook of the psychology of aging (6th ed., pp. 289–313). Amsterdam: Elsevier.
Roediger, H. L., & Geraci, L. (2007). Aging and the misinformation effect: A neuropsychological analysis. Journal of Experimental Psychology: Learning, Memory and Cognition, 34, 1569–1577.
Roediger, H. L., & McDermott, K. B. (1995). Creating false memories: remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, & Cognition, 20, 1379–1390.
Salthouse, T. A. (1984). Effects of age and skill in typing. Journal of Experimental Psychology: General, 113, 345–361.
Salthouse, T. A. (1996). The processing‐speed theory of adult age difference in cognition. Psychological Review, 103, 403–428.
Schacter, D. L. (1999). The seven sins of memory: Insights from psychology and cognitive neuroscience. American Psychologist, 54, 182–203.
Schacter, D. L., Kaszniak, A. W., Kihlstrom, J. F., & Valdiserri, M. (1991). The relation between source memory and aging. Psychology and Aging, 6, 559–568.
Schacter, D. L., Koutstaal, W., Johnson, M. K., Gross, M. S., & Angell, K. A. (1997). False recollection induced by photographs: a comparison of older and younger adults. Psychology and Aging, 12, 203–215.
Smith, A. (1776/1994). The wealth of nations. New York, NY: Random House (Original work published in 1776).
Spencer, W. D., & Raz, N. (1995). Differential age effects on memory for content and context: A meta‐analysis. Psychology and Aging, 10, 527–539.
Spieler, D. H., Mayr, U., & LaGrone, S. (2006). Outsourcing cognitive control to the environment: Adult age differences in the use of task cues. Psychonomic Bulletin & Review, 13, 787–793.
Stine, E. L., Wingfield, A., & Poon, L. W. (1986). How much and how fast: Rapid processing of spoken language in later adulthood. Psychology and Aging, 1, 303–311.
Tentori, K., Osherson, D., Hasher, L., & May, C. (2001). Wisdom and aging: Irrational preferences in college students but not older adults. Cognition, 81, B87–B96.
Titcomb, A. L., & Reyna, V. F. (1995). Memory interference and misinformation effects. In F. N. Dempster and C. J. Brainerd (Eds.), Interference and inhibition in cognition (pp. 263–294). San Diego, CA: Academic Press.
Touron, D. R. (2006). Are item‐level strategy shifts abrupt and collective? Age differences in cognitive skill acquisition. Psychonomic Bulletin & Review, 13, 781–786.
Tun, P. A., Wingfield, A., Rosen, M. J., & Blanchard, L. (1998). Response latencies for false memories: gist‐based processes in normal aging. Psychology and Aging, 13, 230–241.
Tversky, A. (1969). Intransitivity of preferences. Psychological Review, 76, 31–48.
Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79, 281–299.
Verhaeghen, P., Marcoen, A., & Goossens, L. (1992). Improving memory performance in the aged through mnemonic training: A meta‐analytic study. Psychology and Aging, 7, 242–251.
Watkins, M. J., & Bloom, L. C. (1999). Selectivity in memory: An exploration of willful control over the remembering process. Unpublished manuscript.
Weber, E. U., & Johnson, E. J. (2006). Constructing preferences from memories. In S. Lichtenstein and P. Slovic (Eds.), The construction of preferences (pp. 397–410). New York, NY: Cambridge University Press.
West, R. L. (1996). Compensatory strategies for age‐associated memory impairment. In A. D. Baddeley, B. A. Wilson, and F. N. Watts (Eds.), Handbook of memory disorders (pp. 481–500). London: Wiley.
Wood, S., Busemeyer, J., Koling, A., Cox, C. R., & Davis, H. (2005). Older adults as adaptive decision makers: Evidence from the Iowa Gambling Task. Psychology and Aging, 20, 220–225.
Yonelinas, A. P. (2002). The nature of recollection and familiarity: A review of 30 years of research. Journal of Memory and Language, 46, 441–517.
Zacks, R. T., & Hasher, L. (2006). Aging and long‐term memory: Deficits are not inevitable. In E. Bialystok and F. I. M. Craik (Eds.), Lifespan cognition: Mechanisms of change (pp. 162–177). New York, NY: Oxford University Press.
Zacks, R. T., Radvansky, G., & Hasher, L. (1996). Studies of directed forgetting in older adults. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 143–156.
EXPERIENCE IS A DOUBLE‐EDGED SWORD: A COMPUTATIONAL MODEL OF THE ENCODING/RETRIEVAL TRADE‐OFF WITH FAMILIARITY
Lynne M. Reder, Christopher Paynter, Rachel A. Diana, Jiquan Ngiam, and Daniel Dickison
I. Introduction
Although many aspects of memory are not well understood, there are other aspects on which there is little debate. For example, one of the most basic laws of memory is that practice benefits retention. Indeed, the conventional wisdom that “practice makes perfect” is applicable whether the practice involves learning a skill (e.g., how to drive a car) or learning a fact (e.g., the name of the first American president). One need not be a memory researcher to appreciate that the more experience one has with something, the easier it is to process. On the other hand, it is less appreciated that this same experience comes with costs. That is, familiarity with an item sometimes benefits and sometimes hurts performance, depending on the nature of the task. One area in which this familiarity trade‐off is increasingly evident is the domain of memory retrieval.
Two decades ago, in this same Psychology of Learning and Motivation series, Reder (1988) wrote a chapter about the “strategic control of retrieval strategies,” arguing against the (then) conventional wisdom that we always try to search our memory for an answer before attempting to reason the answer by using other strategies. That chapter highlighted the various factors that can make one strategy more useful than another, and also proposed that people unconsciously adapt their strategy use to optimize their performance (see also Cary & Reder, 2002; Koriat, 2000; Reder, Weber, Shang, & Vanyukov, 2003; Sun, 2000). A decade later, Schunn and Reder (1998) also wrote a chapter for this series, proposing that there are individual differences in the ability to rapidly adapt strategies to optimize performance. Both chapters dealt with the notion that people do not behave in a monolithic fashion, but rather alter their strategies adaptively based on the contingencies of the environment, their own cognitive capacities, and the contents of their memory. It is now generally understood and accepted that people use different strategies in different situations (Anderson & Betz, 2001; Reder, 1987; Shrager & Siegler, 1998) and that people vary in how quickly they adapt to how well a strategy is working (Schunn, Lovett, & Reder, 2001).
In this chapter, we want to examine the variables that affect performance from the bottom up, rather than the top down. That is, we will examine what aspects of the cognitive architecture make the same information an advantage or a liability depending on the task. Our focus is on the trade‐offs that are inherent with experience and why these trade‐offs occur from a mechanistic standpoint.
The first section of this chapter reviews the evidence that experience can be a liability when retrieving information and also explains the conditions under which experience does not hurt performance at retrieval. In the second part of the chapter, we focus on how experience generally facilitates encoding, although we point out trade‐offs here as well, such that familiarity can sometimes be a liability at encoding. As a part of these explanations, we describe a model that we have developed that can explain retrieval deficits with experience. The SAC model, which stands for source of activation confusion, has had success predicting many results, including some that were not intuitive. However, some additions to the model seem warranted in order to make it more complete and allow it to account for an even wider range of the data. We introduce a revised, more psychologically accurate model¹ that can explain how experience positively affects encoding.
¹ We will still call it SAC; like most computational models, it undergoes additions and modifications to its assumptions. It is conventionally more parsimonious to keep the same name rather than to introduce a new name every time a change is made to a model. If the changes were fundamental to the axiomatic assumptions of the model, then it would make sense to reject it and start over with something totally different. That is not the case here.
THE PSYCHOLOGY OF LEARNING AND MOTIVATION VOL. 48 DOI: 10.1016/S0079-7421(07)48007-0
Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00
II. When and Why Experience Adversely Affects Memory Retrieval
If a person on the street were asked, “Do you think it is easier to answer a question about something if you know a lot about it?” the answer would almost certainly be, “Of course.” Yet if the question were phrased, “If you were searching for a particular key, would it be more difficult if there were many keys on the key ring or if there were only a few keys?” the answer would clearly be that discriminating a single key from many keys would be more difficult. This common intuition about physical search is just as applicable to memory search: it is more difficult to find a specific fact if there are many contenders available. Below we review some of the evidence for the assertion that knowing more about a concept can hurt subsequent retrieval of any particular fact about the concept. We explain why that occurs from a mechanistic standpoint and why it does not always adversely affect performance.
A. THE FAN EFFECT
Anderson and Bower (1973) demonstrated that when more statements had been previously studied that shared concepts with a given test probe, subjects were slower and less accurate to recognize that the test probe had been seen before. For instance, subjects were slower and less accurate to verify a studied sentence such as “The hippie touched the debutante” if more sentences had also been studied that shared the same terms (e.g., hippie, touch, or debutante). They dubbed this phenomenon the “fan effect” because they assumed a representation in which concepts were represented as nodes and associations connected the concepts such that the more concepts that “fanned” out of a node, the less activation could spread to any other associated node. Speed and accuracy are related to the amount of activation that reaches another node to make it available. These types of effects have been demonstrated in many paradigms with many types of stimuli (Anderson & Paulson, 1978; Lewis & Anderson, 1976; Reder, Donavos, & Erickson, 2002; Zbrodoff, 1995), although there are some who have questioned the generality of these effects (Radvansky, 1999; Smith, Adams, & Schorr, 1978). The fan effect shows that having more information about a topic does not necessarily decrease memory retrieval time for probes of that topic and might increase it. Nevertheless, one might question whether fan effects observed in the laboratory are relevant to attempts to retrieve information in the real world.
1. The Paradox of the Expert
Smith et al. (1978) noted that a logical conclusion of the claim that fan effects are ubiquitous is that experts should be too slow to answer any questions posed to them and should always be lost in thought. Although anecdotal evidence seems to suggest that experts often cannot give a “straight answer,” the authors’ point is well taken, as it certainly does not seem that experts are unable to give responses. Smith et al. demonstrated that when the facts used in a fan experiment belonged to a theme such as a ship christening (e.g., Marty broke the bottle), knowing more facts about an item (Marty) that were all consistent with the theme did not produce a fan effect. They suggested that thematically related information is organized into schemas that are represented in a qualitatively different way than a semantic network such as the one proposed by Anderson and Bower (1973). Moreover, they suggested that the fan effect would occur only when the materials were unrelated and unintegrated (and, presumably, unnatural). This seems to suggest that increasing experience may not decrease memory performance in most cases.
2. Strategy Variability and Strategy Selection
An alternative explanation that we ultimately put forward is that whether the fan effect hurts an expert (or anyone else) depends on the nature of the task requirements. Specifically, in some situations (e.g., memory tasks), people are obliged to use a “direct retrieval” strategy that is adversely affected by fan. In other situations, question answering can occur without using direct retrieval. A few decades ago, the conventional wisdom concerning strategy use in question answering was that people first used a direct retrieval strategy wherein they searched for the answer to a question and only used an inference strategy if that initial direct retrieval attempt failed (Anderson, 1976; Kintsch, 1974; Norman, Rumelhart, & the LNR research group, 1975). Reder (1979, 1982) discovered that this conventional wisdom was erroneous. That is, people do not necessarily search for the answer to a question (direct retrieval) before adopting an inference strategy (plausible reasoning) to answer a question, even when they are expressly told to search for a specific fact. Conceivably, the subjects in the Smith et al. (1978) paradigm were frequently opting to use a type of plausible reasoning or consistency strategy to answer the questions in their experiment, and the foils being used in their experiment did not preclude this behavior.² The hypothesis that Reder and Anderson (1980) tested was that, depending on the type of foil, different strategies for question answering would be selected.
² Smith et al. tested Reder’s explanation (provided in a personal communication) by inserting a novel lexical item into the test probes, for example, “Marty broke the champagne bottle,” and did not find that the fan effect reappeared. Reder discounted Smith et al.’s finding because the low‐frequency novel lexical item provided an additional means of rejecting the probe as unstudied. Reder felt that it was important that the experiment control the familiarity of foils, which motivated the study by Reder and Anderson (1980).
In that study, subjects produced fan effects, but only in certain trial blocks, depending on the nature of the foils in that block. In blocks in which the foils were not thematically related to study items, subjects could use a consistency or plausibility strategy (Reder, 1982, 1987; Reder, Wible, & Martin, 1986), and Reder and Anderson (1980) obtained the same null fan effect observed by Smith et al. (1978). However, in blocks in which a consistency strategy would not work because foils were thematically related, the fan effect reemerged, suggesting that a direct retrieval strategy was used. The notion that subjects can adapt their strategy choice from one block to another has subsequently been demonstrated many times (Cary & Reder, 2002; Lemaire & Reder, 1999; Lovett & Schunn, 1999; Reder, 1982, 1987; Reder & Ross, 1983; Reder et al., 1986; Schunn & Reder, 1998). Reder and Ross (1983) went on to show that the flat or null fan effect that emerged when subjects could get away with a consistency strategy actually resulted from a mixture of two processes: on some trials, subjects actually searched for the specific fact using the effortful retrieval process, while on other trials a subject would adopt the faster consistency judgment strategy (the fact retrieved is consistent with the probe statement). In the former case, the more related facts studied, the slower the verification; however, Reder and Ross also demonstrated that when subjects used the consistency strategy, the more relevant facts studied, the faster subjects were to verify the statement. They added a third type of test block in which subjects were specifically told to make their decision based on consistency. In the blocks that forced specific search because the foils were thematically related, the fan effect was found. In recognition blocks in which the foils were not thematically related and subjects could get away with using plausibility, the fan effect was flat or null. Importantly, in those blocks in which subjects were specifically instructed to base their judgments on the consistency of the probe to the studied statements regardless of whether that specific statement had been studied, verification was faster when more relevant facts had been studied. In other words, Reder and Ross (1983) found a negative fan effect when the appropriate strategy was plausibility or consistency rather than retrieving a specific statement from memory. The paradox of the expert was solved.
3. Fan Effects with Real‐World Knowledge
Although the paradox of the expert was “solved” in that experts did not really search for an exact fact in memory, one could still wonder whether these manipulations only had effects on material learned in the laboratory. That is, the original demonstrations of the fan effect involved contrived laboratory statements that no undergraduate would ever believe were true, motivating the research by Smith et al. (1978) discussed above. Conceivably, real semantic facts stored in memory would not be affected by this fan manipulation. That question motivated several laboratory investigations of whether real‐world knowledge could be affected by laboratory fan manipulations (Lewis & Anderson, 1976; Peterson & Potts, 1982). In those experiments, subjects learned fantasy facts (Lewis & Anderson) or esoteric (unknown) but true facts (Peterson & Potts) about famous individuals (e.g., George Washington, Napoleon Bonaparte) and later had to verify which newly learned statements had been studied about the famous character. The number of novel facts learned about a famous person was randomly determined for each subject. The time to verify a specific new fact increased monotonically with the number of studied facts, replicating the typical fan effect. The more interesting result was the effect that the fan manipulation had on the time to verify previously known facts about a famous person. These real‐world facts were also adversely affected by the number of new facts that had been learned about an individual. In other words, both episodic and semantic (real‐world knowledge) memory were shown to be vulnerable to the fan effect.
4. A Mechanistic Account of Retrieval Effects
The original fan effects of Anderson and Bower (1973) were modeled with mathematical equations that produced excellent fits to the data. The response times were derived from the estimated time to activate the memory structure due to activation spread from the content words (source nodes) in the test probe to the connected representation in memory. The amount of activation spread³ depended on the number of competitors sharing the activation of each of the probes. Reder and Ross (1983) suggested that consistency judgments were based on the amount of activation that accrues at a given theme (e.g., lawyer) due to its relationship with a particular character (e.g., Marty). This activation accrual is affected by the number of themes associated with the character. The more themes associated with a person, the slower the response times for consistency judgments; however, the more facts associated with a given thematic node, the faster to make a consistency judgment. Reder and Ross (1983) presented a verbal description that is consistent with recent modeling implementations. Specifically, they suggested that the theme node and the link between it and the character node would become stronger with each additional thematic fact studied.
³ When first proposed, the description involved time for activation to spread. In revisions of the theory, the assumptions changed to the amount of activation available to spread. Latency is an inverse function of activation.
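The verbal account above can be made concrete with a small sketch. The Python fragment below is only an illustration under simplifying assumptions; it is not the set of equations fit by Anderson and Bower (1973), nor the SAC or ACT‐R implementation. It assumes that each probe concept divides a fixed amount of source activation equally among the facts that fan out of it, and that latency is an intercept plus a term inversely proportional to the summed activation reaching the target. The function names and all parameter values (SOURCE_ACTIVATION, INTERCEPT_MS, SCALE_MS) are hypothetical.

```python
# Illustrative sketch only; names and parameter values are hypothetical
# and are not taken from any published model fit.
SOURCE_ACTIVATION = 1.0  # activation each probe concept can send (assumed)
INTERCEPT_MS = 800.0     # time for processes unrelated to retrieval (assumed)
SCALE_MS = 600.0         # converts 1/activation into milliseconds (assumed)


def activation_at_target(fans):
    """Sum the activation reaching a studied fact from its probe concepts.

    `fans` lists, for each content word in the probe, how many studied
    facts share that concept. Each concept splits its activation equally
    among those facts, so a larger fan sends less to any one of them.
    """
    if any(f < 1 for f in fans):
        raise ValueError("each concept must appear in at least one fact")
    return sum(SOURCE_ACTIVATION / f for f in fans)


def predicted_latency_ms(fans):
    """Latency as an inverse function of activation (assumed form)."""
    return INTERCEPT_MS + SCALE_MS / activation_at_target(fans)


if __name__ == "__main__":
    # Three-concept probes whose concepts each appear in 1, 2, or 3 facts.
    for fans in ([1, 1, 1], [2, 2, 2], [3, 3, 3]):
        print(fans, round(predicted_latency_ms(fans)))
```

With these made‐up values, predicted latency rises as fan grows, which is the qualitative pattern described above. The consistency‐judgment case discussed by Reder and Ross (1983), in which activation accrues at a theme node and more thematic facts speed the response, would need a different target node and is not modeled here.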
Experience Is a Double‐Edged Sword
Neither of these mathematical models was implemented as a computational model. However, Anderson in recent decades has developed a sophisticated cognitive architecture, ACT‐R (Anderson & Lebiere, 1998), that can easily account for these types of fan effects (Anderson & Reder, 1999). Reder developed a related, but simpler model of memory called SAC that does not address skill learning, but that has been used to account for a wide variety of memory phenomena (some not easily accommodated by ACT‐R). These include feeling of knowing effects (Reder & Schunn, 1996; Schunn, Reder, Nhouyvanisvong, Richards, & Stroffolino, 1997), word frequency mirror effects (Reder et al., 2000), perceptual match effects (Diana, Peterson, & Reder, 2004; Reder et al., 2002), paired associate learning and cued recall (Reder, Park, & Kieffaber, 2007a), and aging effects on memory (Buchler & Reder, 2007). The ACT‐R mechanism for spread of activation was included in SAC assumptions, so the explanation for the fan effect is the same. Although many of the assumptions of SAC were imported from ACT‐R, other assumptions of SAC are not part of the ACT‐R architecture. For example, SAC allows phenomenological judgments to be made based on activation values of nodes (chunks) while ACT‐R does not allow activation levels to be “read”4 in this way. It is worth emphasizing that the fan effect, which plays an important role in both SAC and ACT‐R, is concerned only with retrieval, not encoding. At this time, ACT‐R does not make any assumptions about differential probability of encoding. In the second half of this chapter, we will describe modifications to SAC that posit differential probability of encoding information. These modifications allow the model to account for various effects demonstrating both the advantages and disadvantages of familiarity in memory.
B. THE SAC MEMORY MODEL: THE ROLE OF EXPERIENCE IN RECOGNITION MEMORY
The SAC model was initially developed to account for a series of feeling of knowing experiments (Reder & Ritter, 1992; Reder & Schunn, 1996; Schunn et al., 1997).5 However, SAC also makes very strong predictions concerning the role of experience on memory performance, and these basic assumptions
4 It seems likely that ACT‐R could be modified to make the same predictions as SAC. In our view, some of the SAC assumptions provide a better account of certain phenomena; however, it is probably not practical for ACT‐R to import those assumptions now. Since all theories are only approximations to the truth, hopefully the better assumptions of theories will be adopted by other theories and ultimately become one and the same.
5 The motivation for those experiments was to test the assumption that people could quickly evaluate whether to search for an answer or use a reasoning strategy (Reder, 1987; Reder et al., 1986).
Lynne M. Reder et al.
and necessary predictions seemed inconsistent with findings in the literature. Specifically, others had claimed manipulating word frequency in a recognition memory task produced a dissociation such that recollection judgments are affected by word frequency but familiarity judgments are not (Gardiner & Java, 1990). It is a central assumption of SAC that high‐ and low‐frequency words should differ in their inherent familiarity because they differ in how often they have been previously experienced. This apparent contradiction of a basic axiom of the model motivated further exploration of this claimed dissociation. Further research made it clear that the conventional wisdom was incorrect. Before recounting those experiments, a description of the assumptions of SAC is in order. These are the original assumptions of the simpler version of the model. The recent elaborations to SAC that incorporate assumptions about working memory (WM) and how experience affects encoding will be introduced later in the chapter. SAC is an experience/history sensitive model that represents information as a set of interconnected concepts (we refer to them as nodes). Concept nodes are linked to semantically related nodes as well as nodes representing the constituent features of the concept (e.g., phonemic and lexical features, semantic features).6 There also exist episode nodes that are linked to the concept nodes and which provide information about having seen a concept in a particular context. Any idiosyncratic features of the experience will be individually bound to the episode node, which is connected through memory linkages to both conceptual and perceptual aspects of the experience. There is also a node for the general experimental context in the model that has features of the experiment bound to it and which is also linked to the episode nodes. An illustration of these representational assumptions is shown in Fig. 1.
A central assumption is that all aspects of a memory experience follow the same principles, regardless of whether the information is conceptual or perceptual. In other words, all nodes in the network strengthen and decay according to the same rules. Although this model uses a localist, rather than a distributed representation such as the PDP framework of McClelland and Rumelhart (1985), each concept is associated with a wide variety of features, a subset of which can activate the episode node. It is the detailed specification of how representations change with experience and how activation values are interpreted in particular situations that allows SAC to make specific, quantifiable predictions for many types of tasks.
6 The representation is necessarily schematic and not all features of the experience are represented such as the language that the word is presented in; however, we believe that the perceptual and lexical features are often part of the representation, depending on the attention given to various aspects of the experience. For simplicity, we do not represent features that are probably part of the mental representation and do not affect our account of the phenomena.
Fig. 1. A schematic representation of the structure of the SAC model.
1. Node Strength
The strength of a concept (node in our theory) represents the history of exposure to that concept, with more exposure producing greater strengthening. Strength can also be thought of as the baseline or resting level of activation of a node. Increases and decreases in this baseline strength change according to a power function:

$$B = c \sum_i t_i^{-d} \qquad (1)$$

where $B$ is the base‐level activation, $c$ and $d$ are constants, and $t_i$ is the time since the $i$th presentation. This function captures both power law decay of memories with time and power law learning of memories with practice. Very strong regularities have been found wherever these issues have been studied (Anderson & Schooler, 1991). The central feature of power law decay is that initially memories decay quickly and then much more slowly at increasing delays. Similarly, the central feature of power law learning is that first exposures to an item contribute more than subsequent exposures. That is, the incremental contribution of each new exposure decreases with increasing numbers of exposures.
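As an illustration of Eq. (1), the following Python sketch computes base‐level activation from a list of times since each exposure. The function name and the parameter values (c = 1, d = 0.5) are illustrative assumptions made here; they are not the fitted values from any SAC simulation.

```python
def base_level_activation(times_since_exposure, c=1.0, d=0.5):
    """Eq. (1): B = c * sum_i t_i ** (-d).

    times_since_exposure: positive times elapsed since each presentation.
    c, d: illustrative constants (assumed values, not fitted SAC parameters).
    """
    return c * sum(t ** -d for t in times_since_exposure)


# Power law decay: a single exposure probed at increasing delays.
decay_curve = [base_level_activation([t]) for t in (1.0, 4.0, 16.0, 64.0)]

# Power law learning: n exposures spaced one time unit apart, probed one
# unit after the last exposure, so the times since exposure are 1, 2, ..., n.
learning_curve = [base_level_activation(range(1, n + 1)) for n in (1, 2, 3, 4)]

# Gain in base-level activation contributed by each additional exposure.
gains = [b - a for a, b in zip(learning_curve, learning_curve[1:])]
```

Under these assumed values, the drop in `decay_curve` between the first two delays is larger than the drop between the last two, and successive entries of `gains` shrink, which are the two regularities described above.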
2. Link Strength
Links connect nodes that have been associated together by being thought of or experienced at the same time. The strength of these links will vary as a function of how many times the concepts had been associated together and the time delay between exposures. Specifically, we assume a power function given by:

$$S_{s,r} = \sum_i t_i^{-d_L} \qquad (2)$$

where $S_{s,r}$ is the strength of the link from the node $s$ to node $r$, $t_i$ is the time since the $i$th coexposure, and $d_L$ is the decay constant for links.
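As a small worked example of Eq. (2), suppose two nodes were coexperienced twice, 1 and 4 time units ago, and take a link decay constant of 0.5 (an illustrative value assumed here, not a fitted parameter):

```latex
S_{s,r} = 1^{-0.5} + 4^{-0.5} = 1 + 0.5 = 1.5
```

In this example the more recent coexposure contributes twice as much to the link strength as the older one.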
3. Spread of Activation
The current activation level of a node can increase by receiving environmental stimulation directly or by receiving activation that has “spilled over” from another node in the network to which it is linked. The increase in activation of some node r, which is receiving activation from other nodes, is computed by summing the activation it is receiving from all (source) nodes. However, the amount of activation each source node sends depends on (a) that source node’s strength and (b) how much competition the connection from the source to node r has from other links associated with that source. The change in activation of some node r is computed by summing the spread of activation from all source nodes s connected to node r according to the equation:

$$\Delta A_r = \sum_s \frac{A_s \, S_{s,r}}{\sum_i S_{s,i}} \qquad (3)$$

where $\Delta A_r$ is the change in activation of the receiving node $r$, $A_s$ is the activation of each source node $s$, $S_{s,r}$ is the strength of the link between nodes $s$ and $r$, and $\sum_i S_{s,i}$ is the sum of the strengths of all links emanating from node $s$. The effect of the ratio $S_{s,r} / \sum_i S_{s,i}$ is to limit the total spread from a node $s$ to all connected nodes such that it is equal to the node’s current activation $A_s$. This feature gives the model the ability to simulate the fan effects (Anderson, 1974; Reder & Ross, 1983) we have discussed. For example, if a node had three connections emanating from it with link strengths of 1, 2, and 3, then the activation spread along those links would be, respectively, 1/6, 1/3 (i.e., 2/6), and 1/2 (i.e., 3/6) of the node’s current activation level.
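The spreading rule in Eq. (3) can be sketched in a few lines of Python. This is a minimal illustration, not the published SAC implementation: the function names, the dictionary representation of links, and the node labels are all assumptions made for the example.

```python
def spread_from_source(source_activation, link_strengths):
    """Split one source node's activation across its outgoing links (Eq. 3).

    link_strengths: dict mapping receiver name -> link strength S_{s,r}.
    Returns a dict mapping receiver name -> activation sent along that link.
    """
    total = sum(link_strengths.values())
    return {receiver: source_activation * strength / total
            for receiver, strength in link_strengths.items()}


def activation_change(receiver, sources):
    """Sum the activation arriving at `receiver` from every source node.

    sources: iterable of (A_s, link_strengths) pairs, one per source node.
    """
    return sum(spread_from_source(a_s, links).get(receiver, 0.0)
               for a_s, links in sources)


# The worked example from the text: link strengths of 1, 2, and 3.
shares = spread_from_source(1.0, {"r1": 1.0, "r2": 2.0, "r3": 3.0})
```

Because each share is normalized by the summed strength of all links leaving the source, the shares always add up to the source's current activation, which is what produces the fan effect: every additional link leaves less activation for each of the others.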
4. Current Activation of a Node
The base or resting level of activation of a node should be distinguished from the current activation value of a node. The current level of a node will be higher than its baseline whenever it receives stimulation from the environment, that is, when the concept is mentioned or perceived, or when the concept receives activation from other nodes. While baseline strength decays according to a power function (i.e., first quickly and then slowly), current activation decays rapidly and exponentially toward its base level. Let $A$ represent the current level of activation and $B$ represent the base level of activation. Then, the decrease in current activation will be:

$$\Delta A = -\rho (A - B) \qquad (4)$$

such that, after each unit of time, the current activation will decrease for every node by the proportion $\rho$ multiplied by that node’s current distance from its base‐level activation.
C. THE SAC MODEL OF WORD RECOGNITION AND THE WORD FREQUENCY MIRROR EFFECT
Researchers have found that differential experience with words has profound effects both in ease of reading (making lexical decisions, naming times) and in memory for the words. One of the conundrums of memory research is the problem of the word frequency mirror effect in recognition memory (Glanzer & Adams, 1985; Glanzer & Bowles, 1976; Gorman, 1961; Greene & Thapar, 1994; Hintzman, Caulton, & Curran, 1994; Hockley, 1994). Normative word frequency attempts to measure the extent of previous everyday experience with each word (although the estimates are usually derived from books). The word frequency mirror effect is given its name because the pattern of hit rates is a mirror image of the pattern of false alarm rates: Low‐frequency words produce more hits and fewer false alarms than high‐frequency words. In other words, people are more likely both to recognize a previously seen low‐frequency word compared with a high‐frequency word and to correctly reject a low‐frequency foil compared to a high‐frequency foil. This effect has been seen as counterintuitive because it provides a case in which familiarity with a concept produces poorer memory performance. The SAC architecture posits a dual‐process account of recognition, and the word frequency mirror effect follows naturally from the original SAC assumptions (Reder et al., 2000; Reder, Angstadt, Cary, Erickson, & Ayers, 2002).
The SAC representation of words studied in an experiment is shown in Fig. 1. By dual‐process, we mean that when a subject is asked whether a test probe had been studied as part of a list of words presented earlier, the subject has two routes through which he/she may recognize the probe word. Recognition can occur because (a) the subject recollects having studied the word on the list, which means retrieving specific episodic details of the appropriate previous encounter, or (b) the test probe seems so familiar that the inference is drawn that the familiarity must be the result of a recent previous exposure. The dual‐process theory of recognition is becoming increasingly accepted among memory researchers (Jacoby, 1991; Joordens & Hockley, 2000; Mandler, 1980; Reder et al., 2000; Yonelinas, 1994), but what sets the SAC dual‐process theory apart from the others is that it is computationally implemented (see Diana, Reder, Arndt, & Park, 2006 for a review).7 The Remember/Know paradigm is often used as an assessment of recollection and familiarity‐based processes (Tulving, 1985). In this paradigm, participants are asked to make a Remember response when they recognize an item and can recall some detail about the context in which they studied the item. Know responses are made when the participant feels the item is familiar, but is unable to recall any details about the context in which he/she studied the item. Remember responses index the recollection process and Know responses index the familiarity process. We have used the terms know and familiar interchangeably for the same judgment. Figure 2 illustrates how the role of normative word frequency affects recognition memory, especially Remember versus Know judgments. Using the assumptions described above, SAC can predict the percentage of recollection‐based and familiarity‐based responses that will be produced under the various conditions of a recognition task.
These predicted response percentages are based on the current activation values of memory traces within the model. The percentage of recollection and familiarity responses can be combined to predict old/new responses. When real words are used in an experiment, SAC assumes that the concept nodes already exist in memory and their base‐level activation is determined by their history of previous exposure (frequency and recency of exposure). In order to approximate a given word’s base‐level activation value, we use its word frequency value in standard norms (Kucera & Francis, 1967).8
7 An important part of the debate between single‐ and dual‐process models is the value and diagnosticity of the phenomenological judgments of recollection. In our view, the cumulative evidence is too compelling to reject the dual‐process account (see Diana et al., 2006 for a further discussion of this point).
8 We raised that word frequency value to the power 0.7 for base level activation and 0.4 for the amount of preexperimental fan. We have used those values in all experiments in which we modeled effects of normative word frequency.
Fig. 2. SAC’s representation of high‐ and low‐frequency words studied in an experiment. [Diagram labels: experimental context; episode nodes; low‐frequency concept node; high‐frequency concept node; specific context; links to other episode nodes.]
At study, we assume that the to‐be‐remembered word is activated and linked to the context in which it occurred. This context can include those characteristics of the environment that the subject experiences during the experiment, such as the lighting, equipment in the room, and the participant’s mood during the task. Features that are general to the entire experiment are bound together as a general experimental context node. A specific context node also may be created during a study trial to capture a novel element of context that differs from the general experimental context. This might include the presentation of a word in a unique font, a sound occurring outside the room, or the participant’s response to the stimulus. These three types of information: the concept node, specific context node, and experimental context node, are bound together by an episode node, which represents the experience of studying the word in the experiment. When a probe word is presented at test, its concept node is activated along with the experimental context node. The contextual features of the test probe will also be activated. If the word is presented in the same specific context that was linked to the episode node during study, the specific context will be a relevant source of activation that can spread to the episode node. The activation from the concept and context nodes may intersect at the same episode node (depending on whether the probe is a target item or a foil and whether the specific context is similar). Recollection responses are based on the activation of the episode node, where activation accrues due to spread from associated concept nodes, specific context nodes, and experimental context nodes. Familiarity responses are based on the activation of the concept node and sometimes spuriously from the specific context node. Activation spreads from each node in the structure that is activated by the environment (including concept nodes, specific context nodes, and experimental context nodes) according to the number and relative strength of the links connected to the node. The more links there are emanating from a node, the less activation spreads along any one of the node’s individual links. See Eq. (3) above or consult Reder et al. (2000) for more details. The probability of a Remember response depends on the current activation of the episode node and the subject’s individual threshold for giving a “Remember” response. We assume the same parameters for strengthening, decay, spread of activation, and so on, but we assume that each individual has his or her own threshold for giving a Remember and a Know response. The probability of a Know response is the probability of not responding Remember multiplied by the probability of the concept or specific context node’s activation being above threshold.9 It is important to note that the Remember and Know judgments are not assumed to be independent. The proportion of Remember responses affects Know responses, but not the converse because participants are instructed to respond Remember if any recollected information is available, even when the item is familiar. We assume that when the node binding the episodic details to the conceptual information is not sufficiently strong to pass threshold, the subject will rely on the less accurate process of familiarity. The familiarity‐based (Know) response is based on the activation of the concept node. Given that the entire history of experience influences the node’s strength or activation value, this judgment is less accurate for episodic tasks that require context‐specific judgments of familiarity.
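The dependence between Remember and Know responses described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text says only that each probability depends on an activation and a threshold, so the mapping used here (a cumulative normal with standard deviation `sd`) and all names and parameter values are assumptions of the sketch. For simplicity it also ignores the contribution of the specific context node to Know responses.

```python
import math


def p_above_threshold(activation, threshold, sd=1.0):
    """Probability that a noisy activation exceeds a threshold.

    Assumed mapping: cumulative normal of (activation - threshold) / sd.
    """
    z = (activation - threshold) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def remember_know(episode_act, concept_act, t_remember, t_know, sd=1.0):
    """Return (P(Remember), P(Know), P(old)) for one test probe.

    Remember depends on the episode node; Know is given only when
    Remember is not, and depends on the concept node.
    """
    p_remember = p_above_threshold(episode_act, t_remember, sd)
    p_know = (1.0 - p_remember) * p_above_threshold(concept_act, t_know, sd)
    return p_remember, p_know, p_remember + p_know


# Hypothetical probe whose activations sit exactly at both thresholds.
p_r, p_k, p_old = remember_know(1.0, 1.0, t_remember=1.0, t_know=1.0)
```

Written this way, raising the episode node's activation raises the Remember probability and lowers the Know probability, while changing the concept node's activation leaves the Remember probability untouched, which mirrors the one‐way dependence described in the text.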
SAC got its name, Source of Activation Confusion, because of the assumption that people are unable to distinguish between activation due to recent exposure and activation due to a buildup of prior exposures. This principle is central to the SAC explanation of the word frequency mirror effect. The strength of the word concept node is affected by whether the word has been recently seen and how often it has been seen previously. High‐frequency words have higher concept node strength due to prior exposure, and thus high‐frequency lures would be more likely to produce familiarity‐based false alarms than low‐frequency lures.
9 The effect of the activation from the specific context node on the probability of making a Know response is important when the specific context can be varied between study and test (see Diana et al., 2004 for more details).
As described earlier, another principle of SAC is that activation spreads along links between nodes according to the number and relative strength of the links. Therefore, less activation spreads along any one link from a node that has a greater number of links. A high‐frequency word has more preexperimental contextual associations than a low‐frequency word and thus can be expected to have more links emanating from its word concept node.9 This makes it less likely that a sufficient amount of activation will spread from a high‐frequency word concept node to its episode node than that sufficient activation will spread from a low‐frequency word concept node to its episode node. Recollection‐based responses are made when the activation of an episode node surpasses threshold. Therefore, SAC predicts more hits to low‐frequency words than high‐frequency words, but also predicts that this difference should be seen in the Remember responses (Fig. 2). According to SAC, the familiarity of a word is affected by whether or not the word has recently been seen and how frequently it has been seen overall such that both normative word frequency and recent exposure affect a word’s familiarity. Because familiarity can arise from multiple causes, an accurate recognition judgment is based on the retrieval of the study event node (i.e., a true recollection), while responses based on the word node (i.e., familiarity‐based responses) are error prone. There are more false alarms for high‐frequency words than low‐frequency words because high‐frequency words are more familiar (have a higher base‐level activation), and hence are more likely to seem old when a response is made based on the word node. The SAC model of the word frequency mirror effect was formally implemented in Reder et al. (2000). It was shown to successfully fit the empirical data. However, the predictions and data obtained by Reder et al. were inconsistent with the findings obtained by Gardiner and Java (1990).
Similar to the Reder et al. (2000) finding, Gardiner and Java (1990) found that for the hit portion of the mirror effect, there were more Remember responses to low‐frequency targets than high‐frequency targets. This led the authors to conclude that retrieval is responsible for the mirror effect. SAC also predicts that there will be more Know responses to high‐frequency than low‐frequency words, but Gardiner and Java found no evidence of this. In order to confirm their finding of a difference in Know responses, Reder et al. (2000) analyzed the results of five previous papers testing the word frequency mirror effect with Remember/Know judgments. They found that high‐frequency words produced a significantly higher proportion of Know responses compared with low‐frequency words, confirming the SAC prediction. Figure 3 shows the model fits to the empirical data for Remember and Know judgments as a function of the experimental and preexperimental frequency of the stimuli.
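The two opposing consequences of word frequency in this account can be illustrated with a rough Python sketch. It uses the exponents reported in the chapter for normative frequency (0.7 for base‐level activation and 0.4 for preexperimental fan), but everything else is an assumption made for illustration: the function names, the hypothetical frequency counts, the treatment of fan as a summed link strength competing with a single unit‐strength episode link, and the decision to hold source activation fixed so that the fan is isolated. It is not the fitted model of Reder et al. (2000).

```python
def concept_base_level(word_frequency, exponent=0.7):
    """Familiarity ingredient: base level grows with normative frequency."""
    return word_frequency ** exponent


def share_reaching_episode(word_frequency, episode_link=1.0, exponent=0.4):
    """Recollection ingredient: fraction of the concept node's activation
    that spreads to its episode node, following the ratio in Eq. (3).

    Assumption: preexperimental fan acts as summed link strength that
    competes with one episode link of strength `episode_link`.
    """
    preexperimental_fan = word_frequency ** exponent
    return episode_link / (episode_link + preexperimental_fan)


# Hypothetical normative frequencies (occurrences per million words).
LOW_FREQ, HIGH_FREQ = 2.0, 200.0

familiarity = {"low": concept_base_level(LOW_FREQ),
               "high": concept_base_level(HIGH_FREQ)}
episode_share = {"low": share_reaching_episode(LOW_FREQ),
                 "high": share_reaching_episode(HIGH_FREQ)}
```

In this sketch the high‐frequency word has the larger base level (more familiarity‐based false alarms) and the smaller share of activation reaching its episode node (fewer recollection‐based hits), which is the qualitative pattern of the mirror effect.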
Fig. 3. The proportion of Remember and Know responses for words as a function of word frequency. [Two panels: Low frequency and High frequency; x‐axis: Presentation # (1 to 10); y‐axis: Proportion response.] Triangles represent Remember responses, circles represent Know responses. Closed symbols with solid lines represent the actual data. Open symbols represent the model predictions. The error bars represent 95% confidence intervals. From Reder et al., 2000, p. 310. Copyright 2000 by the American Psychological Association. Reprinted with permission of the author.
D. CONVERGING EVIDENCE FOR SAC EXPLANATION USING OTHER TYPES OF STIMULI
Recently we have tested our explanation of the effect of prior experience on retrieval in studies that manipulated exposure to perceptual (as opposed to conceptual) information. This involved presenting words in unusual fonts during study and then measuring word recognition as a function of whether the font at test matched the encoding font and as a function of the number of other words studied in that unusual font (Diana et al., 2004; Reder, Donavos et al., 2002). We represent the unusual font as an idiosyncratic contextual cue associated with the episode node for the studied word. If the word is tested in the same font used during encoding, then there is an extra source of activation that can spread to the episode node, and there should be a greater chance for a recollection (Fig. 4). However, if the font was used with many other words, then the fan of the font node will diminish the amount of activation that will get to any one of the associated episode nodes. As predicted, there were more hits and more Remember responses when the font matched and, most importantly, the advantage of the font matching was modulated by the fan of the font, such that the greater the font fan, the smaller the advantage of matching font.
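The way font fan dilutes the benefit of a matching font follows directly from Eq. (3). As a simplified worked example, assume that a font node with current activation $A_{\text{font}}$ is linked only to the $n$ episode nodes of the words studied in that font, and that all of those links have the same strength $S$ (both assumptions are made here for illustration, as are the fan sizes):

```latex
\Delta A_{\text{episode}} = A_{\text{font}} \cdot \frac{S}{nS} = \frac{A_{\text{font}}}{n},
\qquad
n = 2:\ \Delta A_{\text{episode}} = 0.5\,A_{\text{font}},
\qquad
n = 12:\ \Delta A_{\text{episode}} \approx 0.083\,A_{\text{font}}
```

Under these assumptions, reinstating a font shared by 12 words delivers one sixth of the activation to the target episode node that a font shared by 2 words delivers.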
Fig. 4. SAC’s representation of high fan and low fan fonts reinstated at test.
Further evidence for this explanation comes from a study by Park, Arndt, and Reder (2006). In order to test our hypothesis that these effects were driven by the fan of the contextual cue reinstated at test, subjects were asked to study a series of words presented individually on a screen in one of a number of unusual fonts while simultaneously hearing the word pronounced through a pair of headphones in one of a set of unfamiliar voices. A given word was presented in either a high fan font (seen with many words) or a low fan font (seen with only a few words). If the font was high fan, the voice would be low fan and vice versa. Assignment of voices and fonts to fan condition and to words was randomly determined for each subject. At test, when a probe was presented it was only presented in one modality, either font or voice (for both new and studied words). The context provided always matched the encoding features. As predicted, recognition was more accurate when the feature that was reinstated was low fan. Not only do these findings provide additional evidence that the fan effects found for word frequency apply to perceptual information, but they also imply that these effects occur at retrieval rather than encoding. Subjects studied all words for the same amount of time, regardless of fan condition, and it was the fan of the reinstated feature that mattered at test.
Note that this explanation for more Remember responses with a low‐frequency font is analogous to the explanation for more Remember hits for low‐frequency words. Also as predicted, there were more false alarms to foils that were tested in high‐frequency fonts than low‐frequency fonts. In other words, we obtained a mirror effect for font frequency, just as one sees for word frequency. Since the assignment of fonts to be either high or low frequency (seen with one or many words) was randomly determined for each subject, the font frequency mirror effect does not suffer the interpretation problems of a quasi‐experimental design that typically plague studies of the word frequency mirror effect. Indeed, Maddox and Estes (1997) proposed that word frequency, per se, was not the real cause of the mirror effect. They manipulated exposure to artificial words (pseudowords) and found a concordant pattern of hits and false alarms such that high‐frequency pseudowords produced more hits and false alarms. However, we suspected that their frequency manipulation was too weak, and that they were replicating a finding that rare words produce fewer hits (Schulman, 1976). Reder, Angstadt et al. (2002) exposed subjects to these pseudowords for an entire semester. Early in the training, they replicated the results of Maddox and Estes. However, by the end of training, they produced the standard mirror effect, including more Remember responses for low‐frequency pseudowords. More recently, Nelson and Shiffrin (2006) have replicated our result of a mirror effect for differentially experienced stimuli, in this case Chinese characters. In summary, given that differential exposure to fonts, pseudowords, or Chinese characters all produce the mirror effect and that the assignment of stimuli to frequency category was randomly determined for each subject, this effect must be due to the previous exposure to the stimuli and not something inherent in the stimuli, per se. This finding supports the claim that familiarity alone can be the source of a reduction in memory performance.
1. Converging Evidence Using Synthetic Amnesia
Although word frequency manipulations in tests of recognition memory almost always produce a mirror effect, there are situations where this regularity does not occur, such as in studies with amnesiacs or participants under the influence of midazolam. It is often proposed that patients with Alzheimer’s disease and other forms of anterograde amnesia have damage to the recollection capability in memory, but that their familiarity capabilities remain largely intact (Balota & Ferraro, 1996). Hirshman, Fisher, Henthorn, Arndt, and Passannante (2002) induced temporary anterograde amnesia using the drug midazolam and showed that when participants were under the influence of midazolam, the hit rate portion of the mirror effect did not occur. A concordant pattern emerged such that there were more hits and false alarms to high‐frequency words than low‐frequency words. However, participants in the control condition, who received an injection of saline, did show the typical word frequency mirror effect. It is thought that midazolam affects people’s ability to recollect information from study, but that it does not impair familiarity processes (Hirshman et al., 2002). Dual‐process models like SAC can explain these data: the hit rate portion of the mirror effect is due to a recollection process which is disturbed by the drug (or organic amnesia), but the false‐alarm portion results from a familiarity process that is not affected by the drug. According to SAC, high‐frequency words have a higher base‐level familiarity that results in more hits (and false alarms) when retrieval of contextual associations cannot be used.
2. Source Memory Studies Provide Further Support
Evidence from the source memory literature further supports the SAC account. Low‐frequency words are more likely to be associated with correct source judgments than high‐frequency words (Guttentag & Carroll, 1997; Rugg, Cox, Doyle, & Wells, 1995). Source judgments ask participants to report a contextual detail from the study phase that was varied systematically when they recognize a test word. This type of task is thought to use recollection‐based processing (Quamme, Frederick, Kroll, Yonelinas, & Dobbins, 2002). The research found that low‐frequency words were more likely to be correctly judged old and to be assigned to the correct study context than were high‐frequency words. This indicates that participants could more easily recollect the specific context for low‐frequency items and thus were more able to use recollection processing for low‐frequency words. This is consistent with a dual‐process account claiming that the increased hit rate for low‐frequency words is based on better recollection. These findings provide supporting evidence for our model that the hit portion of the mirror effect is driven by recollection‐based responses while the false‐alarm portion is driven by familiarity‐based responses.
3. The Costs of Lifelong Experience on Retrieval
An interesting implication of the theory we have presented to explain the word frequency mirror effect and other phenomena is that the base‐level activation and contextual fan of words should continue to increase over a person’s lifetime because the words continue to be experienced. We propose that some of the memory deficits associated with advancing age can be explained with these same assumptions (Buchler & Reder, 2007). Although there has been a great deal of research done on the biological and physiological bases of age‐related memory problems, there has been surprisingly little attention devoted to the potential effects of experience itself. SAC predicts that familiarity processes should be relatively unaffected in that familiarity is enhanced with continued experience (base‐level activation goes up). However, the fan out of each word also accumulates with age making the recollection process more difficult. Many studies support our position that age‐related deficits are found in the recollection‐based component rather than the familiarity component (Balota, Burgess, Cortese, & Adams, 2002; Burke & Light, 1981; Castel & Craik, 2003; Chalfonte & Johnson, 1996; Kliegl & Lindenberger, 1993; Light, Healy, Patterson, & Chung, 2005; Naveh‐Benjamin, 2000; Simons, Dodson, Bell, & Schacter, 2004; Spencer & Raz, 1995). Buchler and Reder (2007) used a two‐parameter model of aging to successfully account for a number of previous results that compared young and old memory performance. The older adults were assumed to differ from the younger on only two parameters, one representing the extra increase in baseline activation and another representing the increased fan. The fit to the published data was quite good (generally with an r² of .98 or better using only these two parameters, and sometimes only one, to fit the data). Despite the excellent fits to five different published data sets, we recognize that other factors besides these two parameters affect differences in performance between young and older adults. We will discuss those in the second part of this chapter. For one thing, there is evidence that older adults use different cognitive strategies, presumably to try to compensate for whatever detrimental effects do arise from aging. Reder et al. (1986) explored whether the tendency to use “direct retrieval” as opposed to a plausibility strategy differed with age. Some subjects of both age groups (young versus old) were explicitly asked to judge whether a sentence was consistent with what they had read before while the two other groups were explicitly asked to determine whether a specific sentence had been read earlier (direct retrieval). Although older subjects were slower to respond in all cases, they were actually better than their younger counterparts at the plausibility task in terms of accuracy. However, as predicted, they were much worse when direct retrieval was required.
E. SUMMARY OF HOW EXPERIENCE HURTS RETRIEVAL
In this section, we have reviewed a number of experiments showing that knowing more about a concept hurts one’s ability to retrieve specific information associated with that concept. We have used the explanation of the “fan effect” to account for various aspects of the word frequency mirror effect, and we have reviewed the larger literature on the fan effect, which shows that accuracy and latency are adversely affected by knowing more about a concept. We showed that this effect is not limited to experimental material generated in the laboratory, but applies to prior knowledge about famous individuals. We also showed that our computational model could account for effects of fan on perceptual information such as font during encoding and showed that it is the fan of the contextual features reinstated at test that matters, rather than the fan of the features used during encoding.

We also explained how it is that people avoid the “paradox of the expert” by using strategies other than “direct retrieval.” Not only can individuals be led to prefer direct retrieval or plausibility by manipulating their prior history of success or the cues in the question (Reder, 1987, 1988; Reder & Ritter, 1992; Reder & Schunn, 1996), but people’s appreciation of their general ability to use retrieval, as a function of age, also influences their tendency to use one question-answering strategy or another (Reder et al., 1986).

Despite all the evidence showing how detrimental prior experience can be to the retrieval process, there is also evidence that prior experience can be a benefit during encoding. The rest of this chapter is devoted to presenting the evidence for this point of view and the additions to SAC that explain these effects.

III. When and Why Experience Facilitates Memory Encoding
It is generally accepted that novel stimuli attract attention (Johnston, Hawley, Plewe, Elliott, & DeWitt, 1990; Sokolov, 1963), even for infants (Fagan, 1970). That observation has been used by some theorists to explain the word frequency mirror effect (Glanzer & Adams, 1990). Rao and Proctor (1984) demonstrated that when encoding is self-paced, participants study low-frequency words longer than high-frequency words. Conceivably, the longer study times for low-frequency words arise because people prefer novel stimuli and therefore allocate more attention to them, which would lead to better recollection for low-frequency words. In the previous section, we offered a different explanation for the word frequency mirror effect; we think that the longer study time for low-frequency words results from the fact that less familiar stimuli are actually more difficult to encode and, as a result, require more attention in order to be processed.

The arguments put forward in the first half of this chapter concerned the adverse effects of experience when attempting to retrieve associations to frequently experienced concepts. Now we want to examine the other side of the coin and argue that frequently experienced concepts are actually easier to encode. This encoding advantage occurs despite the novelty bias in attention, which we speculate may occur in part as a compensation for the encoding disadvantage. In this section, we will review some of the evidence that has led us to this conclusion and describe our modifications to SAC in order to account for the encoding advantage. We also provide model fits to a number of the phenomena that we intend to explain with the revised model.

Some aspects of an apparent encoding advantage, such as faster naming times and faster reading times for high-frequency words, are consistent with the SAC assumption that high-frequency words have a higher base level of activation and are therefore more accessible. What was missing from SAC was the assumption that there is a finite pool of WM resources and that the ability to encode a stimulus depends on both the familiarity of the stimulus and the amount of WM resources available. Before providing the details of the change in the SAC architecture, we will review some of the findings that motivated the modifications to the model.

In an unpublished paper, Spehn and Reder (2000) (available on the web at http://www.memory.psy.cmu.edu/unpublished/SpehnLMR.pdf) found that subjects were better at learning to pair novel first names with famous last names such as Einstein or Travolta than with unfamiliar last names such as Kounkel. When tested on their memory for just the last names of the studied first–last name pairs, famous last names were recognized best, rare names intermediate, and common names such as Smith were worst. In contrast, when the recognition test required judging whether the first name was studied with the last name, common last names did exceptionally well. In our view, this result is analogous to the finding that although high-frequency words are not well recognized, they do better in word-pair recognition than low-frequency word pairs (Clark, 1992).
Like high-frequency words, common names have greater fan (many first names already associated with them), so it is harder to retrieve the pairing if only given the last name as the test probe. That is one reason why common last names were recognized worst when tested in isolation. The other reason is that basing the recognition judgment on familiarity (when retrieval of the first name failed) will be error prone, just as it is for high-frequency words. On the other hand, if the task is name-pair recognition, the first name is provided at test as well. In that case, there are two sources of activation to send to the episode node that binds the names together. With two sources of activation, the effects of fan should be reduced, enabling the encoding advantage of common names to be observed. In other words, we believe that it is easier to link an arbitrary first name in memory to a common name than to a rare name like Kounkel or Nhouyvanisvong because rare names are quite unfamiliar and take up considerable resources just to be encoded.¹⁰

¹⁰ Ngiam also modeled some of these data successfully. For reasons of space and time (he did not have time to model all of the results), we are not reporting those efforts.

Another study, conducted by Diana and Reder (2006), supports the role of familiarity at encoding. Subjects were presented with high- or low-frequency words that were superimposed on pictures of common objects and instructed to try to remember both the pictures and the words. Assignment of words to pictures was randomized for each subject. For example, a picture of a basketball might have a high-frequency word (e.g., tree) or a low-frequency word (e.g., aspirin) superimposed on it. At test, pictures were presented without any words and subjects were asked to recognize the studied pictures. Recognition memory for the pictures was better when the superimposed word at study was high frequency rather than low frequency. Not only was recognition accuracy better when the picture was studied with a high-frequency word, but the proportion of “Remember” judgments was greater when the encoding word was of high frequency. This latter point is important because the binding operation that we believe requires WM is manifest in Remember responses. “Familiar” (or “Know”) responses do not depend on this binding process because they reflect only the activation of the concept node.

Although picture memory was better when high-frequency words were superimposed, recognition memory for the words themselves (tested separately from the pictures) showed the typical pattern whereby low-frequency words were recognized better than high-frequency words. In our view, recognition is better for low-frequency words despite their encoding disadvantage because the retrieval advantage masks the encoding disadvantage unless there are increased WM demands at encoding. Another study by Diana and Reder (2006) found that when two words are presented for study simultaneously, both high- and low-frequency words are more easily recollected later if the word each was paired with was of high frequency. That is, pairing a word with a low-frequency word at study makes recollection more difficult.

An alternative explanation for the picture encoding advantage with high-frequency words is that there is a tacit trade-off in attention between the word and the picture, such that low-frequency words grab more of the attention than high-frequency words and the total amount of attention is limited. That is, more novel words attract more attention, leaving less for the pictures. High-frequency words are less unusual and therefore more attention is allocated to the picture, increasing its chances of being recognized later. This alternative account cannot explain the findings with lists of pure high-frequency or pure low-frequency words in paired-associate recognition or recall. In those cases, high-frequency words are at an advantage (Clark, 1992; Deese, 1960). We will describe these patterns in more detail when we fit SAC to the empirical results.

Participants may tacitly appreciate this trade-off between encoding and retrieval for word frequency. When asked before the experiment begins “how difficult will each item be to recognize,” they predict that high-frequency words will be easier to recognize. However, when asked the same question during the test phase, participants make the correct judgment, noticing that low-frequency words are easier to recognize (Benjamin, 2003). This suggests that participants may experience high- and low-frequency words differently during encoding. It also supports the idea that low-frequency words are more likely to produce a recollection-based response, which would lead participants to feel that such words are particularly memorable.

Recognition memory tests also show list composition effects whereby the low-frequency word advantage is augmented in mixed lists of predominantly high-frequency words (Dewhurst, Hitch, & Barry, 1998; Malmberg & Murnane, 2002). Also, rare words (e.g., “iatrogenic”) do not show the normal hit rate advantage in standard recognition memory experiments that low-frequency words enjoy (Schulman, 1976). This may be because the rare words are so difficult to parse or comprehend that it becomes difficult to form any associative link to them whatsoever. Thus, the postulation of a low-frequency encoding disadvantage can explain a range of phenomena in the literature on memory for words.

High-frequency words also show an advantage in associative recognition tasks. Associative recognition requires the formation of associations between items. In these tasks, participants study pairs of words and at test are asked to discriminate between words that were presented as pairs at study (that should be judged as old) and those that are recombinations of studied items from different pairs (that should be judged as new).
Unlike item recognition, associative recognition shows a mirror effect for high-frequency words: previously seen high-frequency word pairs produce more hits while high-frequency recombined pairs produce fewer false alarms than low-frequency pairs (Clark, 1992). These findings from associative recognition and recall provide evidence that the formation of associative links between items in memory, such as between arbitrary word pairs presented in associative tasks or from word to word in serial recall tasks, may be easier for high-frequency words than low-frequency words.

A. AUGMENTATION OF SAC: HOW WM AND PRIOR EXPERIENCE INTERACT TO AFFECT EASE OF ENCODING
We have previously implemented SAC models that vary the probability of encoding an event to explain aging effects (Reder et al., 2007a) and to simulate the effects of midazolam (Reder et al., 2007b). We accomplished these effects by merely positing different probabilities of forming a link. Although those modifications worked well, they were ad hoc. The addition of a WM component to the SAC architecture enables the probability of encoding to vary in a more principled fashion (i.e., without merely fitting a parameter that varies the success of the binding).

We assume that there is a finite amount of WM resources that can be used to encode stimuli, build associations, perform tasks, and so on, and that this pool returns to its full capacity over time. Resources are drawn from this pool of WM to activate a stimulus so that it can be encoded in a way that enables the construction of a link between two elements. For example, this could be the binding of a word to an experimental context or forming an association between two words. Importantly, how much activation must be drawn from the pool of WM resources depends on the resting level of activation of the concept, such that the weaker the base-level activation of the concept, the more activation is required to build a new association. As such, familiar concepts (e.g., words with higher normative frequency) make fewer demands on the WM pool when attempting to bind an item to context or to another concept. This implies that the more elements that need to be encoded and processed, the greater the demand on this pool of WM (Anderson, Reder, & Lebiere, 1996).

The amount of WM expended in encoding one concept is:

$$\mathrm{WM}_{\mathrm{encode}} = \tau - B \tag{5}$$

where $\tau$ is the threshold and $B$ is the node’s base-level activation [Eq. (1)]. The WM pool replenishes at a linear rate, $r$, such that the pool at time $t$ is given by:

$$\mathrm{WM}_{t} = \min\left(\mathrm{WM}_{\max},\ \mathrm{WM}_{t-1} + r\right) \tag{6}$$

Thus, the WM extensions to SAC involve two new parameters: the maximum WM pool quantity, $\mathrm{WM}_{\max}$, and the WM recovery rate, $r$. We also assume that if there is sufficient WM to get a concept over threshold, the amount of activation that is sent from a source node is unaffected by the base-level activation, although it remains proportional to the relative link strength. Familiarity judgments are now a function of the amount of WM resources required to get the word up to threshold (much like “perceptual fluency,” see Whittlesea, Jacoby, & Girard, 1990), such that the fewer WM resources needed to reach threshold, the more perceptually fluent and the more familiar the concept appears.¹¹

¹¹ These different assumptions do not change the behavioral predictions of the model for the datasets already fit. Familiarity judgment calculations are isomorphic. The spread of activation values is almost the same as well.
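The bookkeeping implied by Eqs. (5) and (6) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors’ SAC implementation: the class name `WMPool`, the method names, and every numeric value are assumptions chosen for the example. It further assumes that the pool starts full, that a concept already at or above threshold costs nothing, and that a failed encoding attempt leaves the pool untouched.

```python
from dataclasses import dataclass, field


@dataclass
class WMPool:
    """Finite pool of working-memory resources (arbitrary activation units)."""
    wm_max: float                      # WM_max: maximum pool quantity
    recovery: float                    # r: linear recovery per time step
    level: float = field(init=False)   # current pool contents

    def __post_init__(self) -> None:
        self.level = self.wm_max       # assumption: the pool starts full

    def tick(self, steps: int = 1) -> None:
        """Eq. (6): WM_t = min(WM_max, WM_{t-1} + r), once per time step."""
        for _ in range(steps):
            self.level = min(self.wm_max, self.level + self.recovery)

    def encoding_cost(self, threshold: float, base_level: float) -> float:
        """Eq. (5): WM_encode = tau - B (floored at zero, an assumption)."""
        return max(0.0, threshold - base_level)

    def try_encode(self, threshold: float, base_level: float) -> bool:
        """Spend tau - B if the pool can cover it; otherwise fail at no cost."""
        cost = self.encoding_cost(threshold, base_level)
        if cost > self.level:
            return False
        self.level -= cost
        return True


# Hypothetical values: a familiar concept (B = 5) and an unfamiliar one (B = 0).
pool = WMPool(wm_max=10.0, recovery=2.0)
familiar_ok = pool.try_encode(threshold=8.0, base_level=5.0)    # cost 3
unfamiliar_ok = pool.try_encode(threshold=8.0, base_level=0.0)  # cost 8
pool.tick()                                                     # one recovery step
```

With these made-up numbers the familiar concept is encoded, the unfamiliar concept then fails because only 7 units remain in the pool, and one recovery step brings the pool back to 9.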
These assumptions mean that a person is less likely to be able to bind a concept to a context if (a) the concept is unfamiliar, (b) there are many other stimuli to encode at the same time, (c) the stimulus is perceptually degraded, or (d) the WM pool is small, either because it has not finished being replenished or because the person has a smaller pool to begin with. We assume that the amount of WM varies among individuals (Daily, Lovett, & Reder, 2001; Lovett, Daily, & Reder, 2000; Lovett, Reder, & Lebiere, 1997), as well as for a particular individual as a function of fatigue, and so on.¹²

It is important to note that these assumptions concerning encoding also apply at test, when the probe(s) need to be encoded. When there are more stimuli as part of the test probe that need to be encoded (word pair vs a single item) or when the stimuli are less familiar (low-frequency words, words presented in unusual fonts), more WM resources are depleted in the effort to get each concept of the test probe up to threshold. If there are sufficient resources to get a concept up to threshold, then activation can spread to its associated nodes.

1. A Limit on Concept Strengthening
We have also added the assumption that a node is not strengthened when its current activation is above a specific level. This assumption could be viewed as a proxy for habituation, such that when the same information is experienced over and over it no longer attracts as much attention and does not gain strength indefinitely; however, we are not claiming that the links are not formed or strengthened when the item is repeated at threshold; therefore, it should not be taken as a complete analogue to habituation.

2. Partial Match and Spurious Recollection
In order to model false alarms that are reported as “recollections,” a spurious recollection mechanism has been introduced to SAC. Previously, SAC only accounted for false alarms as familiarity-based “Know” false alarms and did not allow any “Remember” false alarms by spuriously activating the wrong episode node. That simplifying assumption seems odd in hindsight because the original SAC model of feeling of knowing (Reder & Schunn, 1996; Schunn et al., 1997) accounted for spurious feelings of knowing that were generated from partial matching. Specifically, we modeled that a spurious feeling of knowing would occur if sufficient activation accumulated at the problem node even if an element of the problem (such as the operator) did not match. We now appreciate that the same assumptions should have remained in SAC when we modeled recognition. We now allow for an analogous mechanism in recognition to occur by letting the model attempt to retrieve the episode node with the highest activation regardless of whether or not that episode node corresponds to the concept in the probe. If a spurious episode node is retrieved, the participant may still be able to recall the original concept that the episode had been linked to and reject it on that basis (recall to reject). For more information on spurious recollection, see Cook, Reder, Buchler, Hashemi, and Dickison (in preparation).

¹² Reder’s previous work on individual differences in working memory capacity used the ACT-R framework. In ACT-R, working memory differences are assumed to only affect retrieval, not encoding. There are currently no assumptions about differential probability of encoding or binding in ACT-R.

B. ILLUSTRATIONS OF MODEL FITS WITH THE NEW ENCODING ASSUMPTIONS
Earlier in this chapter we described a study by Diana and Reder (2006) in which words were superimposed on pictures and subjects were responsible for remembering both aspects of the stimulus. In this model, pictures and words are represented by concept nodes, with an attempt to link each concept node to an episode node at study. The concept nodes for the pictures were given base activation levels approximated from medium-frequency words. During each study trial, consisting of a superimposed word and a picture, two links needed to be formed from the picture concept node and the word concept node to their respective episode nodes. This link is only formed when sufficient resources exist in the WM pool. Therefore, when a low-frequency word is presented with a picture, fewer resources remain to allow encoding of the picture than when a high-frequency word is presented with a picture. Model fits to this experiment, comparing the SAC predictions to the actual data, were quite good, with Pearson’s r² = .95. These fits are shown in Fig. 5.

As described earlier, Diana and Reder (2006) found that low-frequency words were better recognized if they were encoded with a high-frequency word while high-frequency words were recognized worse if encoded with a low-frequency word, even though participants were instructed to remember each word separately and were not tested on their memory for which words were paired together. The results and model fit are shown in Fig. 6. Here too the fit was quite good, r² = .96. As in other models, words are represented by concept nodes, with each concept node linked to its own episode node. Because participants were instructed to remember each word separately, we did not include a link between the episode nodes for words studied at the same time.
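To see how these assumptions could produce the picture-memory result, the following sketch simulates a study list in which each trial pairs a word with a picture. It is a toy illustration under stated assumptions, not the fitted SAC model: the function name and all parameter values are hypothetical, the word is assumed to be processed before the picture on each trial, the pool is assumed to recover by a fixed amount between trials, and a failed binding is assumed to cost nothing.

```python
def pictures_bound(word_base, n_trials=12, threshold=10, picture_base=6,
                   wm_max=10, recovery=5):
    """Count pictures bound to an episode node over a study list.

    Every value is in arbitrary, hypothetical activation units.
    Binding a concept costs (threshold - base level), as in Eq. (5).
    """
    pool = wm_max                      # assumption: the pool starts full
    bound = 0
    for _ in range(n_trials):
        # Assumption: the superimposed word is processed first.
        word_cost = threshold - word_base
        if word_cost <= pool:
            pool -= word_cost          # word bound to its episode node
        # The picture gets whatever resources remain on this trial.
        picture_cost = threshold - picture_base
        if picture_cost <= pool:
            pool -= picture_cost
            bound += 1
        # Linear recovery between trials, capped at the maximum (Eq. 6).
        pool = min(wm_max, pool + recovery)
    return bound


high_freq = pictures_bound(word_base=8)   # high-frequency words: cost 2 each
low_freq = pictures_bound(word_base=4)    # low-frequency words: cost 6 each
```

With these made-up parameters, more pictures are bound when the accompanying words are high frequency than when they are low frequency, which is the qualitative pattern reported by Diana and Reder (2006). The specific counts carry no empirical meaning.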
[Figure 5: bar chart of response proportions (0 to 1.0) comparing empirical data with SAC model fits for Remember and Familiar responses to low-frequency items, high-frequency items, and lures.]
Fig. 5. SAC model fits for Remember-Familiar responses during the word test in the picture–word interference experiment.
In each study trial consisting of two words (to be encoded separately), recollection requires that the word concept node be bound to the experimental context node by creating an episode node that links them. The formation of each episode node requires resources to be drawn from the pool of WM resources. Because high-frequency words have a greater base-level activation, fewer WM resources are required to create an episode node linking the concept and context nodes, while low-frequency words require relatively more WM resources in order to form an episode node. In the event of a link formation failure, the concept node will not be linked to the episode node at all. This reflects a failure in binding, and the item cannot be retrieved using recollection. In the case of a link formation failure, no resources are subtracted from the WM pool.

[Figure 6: two bar-chart panels, “Remember responses” and “Familiar responses,” each comparing empirical data with SAC model fits (proportions from 0 to .7) for high- and low-frequency words on pure and mixed lists.]
Fig. 6. SAC model fits for Remember-Familiar responses during the word–word interference experiment.

Word frequency manipulations produce different effects depending on the composition of the study lists. When items are encoded on lists of either purely high-frequency or purely low-frequency words, high-frequency items produce better performance on cued recall and associative recognition tests (Clark & Burchett, 1994). On lists with both high- and low-frequency words, the high-frequency advantage in cued recall does not occur. Simple recall also shows a high-frequency advantage only for pure lists (MacLeod & Kampe, 1996; Watkins, LeCompte, & Kim, 2000). Even in recognition, the ubiquitous low-frequency advantage is affected by list composition. There is some evidence that high-frequency words show an advantage when items are presented on pure lists (Dewhurst et al., 1998). Also, when the proportion of high-frequency words on a list is increased, the low-frequency advantage increases (Malmberg & Murnane, 2002).

If low-frequency words in fact use more WM capacity during encoding, the presence of more low-frequency words on a list may reduce the processing resources that are available to encode all words on this list. This is because the low-frequency words may recruit WM capacity from high- or low-frequency words presented on subsequent trials. That is, encoding of a previous low-frequency word may still be occurring during later study trials. In this case, we would expect better encoding of low-frequency words on a randomized list that contained fewer low-frequency words and better encoding of high-frequency words on a randomized list that contained only high-frequency words.

To test whether our explanation of this pattern could actually be simulated, we developed a SAC simulation of learning a study list that varied in the proportion of low- and high-frequency words and tested its ability to retrieve the episode node. Figure 7 shows the results of that simulation. Note that this pattern is consistent with the findings of Malmberg and Murnane: As the proportion of low-frequency words on the list increases, there is a reduction in the proportion of low-frequency Remember hits, while high-frequency word Remember hits were largely unaffected by this manipulation.

[Figure 7: line plot of Remember hit rate (0 to 1) against the percentage of low-frequency words in the study list (0 to 100), with separate series for HF words and LF words.]
Fig. 7. SAC simulation of Remember hit rate to high- and low-frequency words across different proportions of low-frequency words included in the study list.

C. THE CONSEQUENCES OF MINIMAL “LIFELONG” EXPERIENCE ON ENCODING
In the previous section, we discussed how and why experience hurts the elderly when it comes to using prior knowledge in a fact retrieval situation. The other side of this coin is the demonstration that young children are less able to encode information because of their limited experience with the stimuli. Whitehouse, Maybery, and Durkin (2006) found that the picture superiority effect (over words) in free-recall tests increases from middle childhood to adolescence. Given that word reading does not decline with age, and pictures should be more important for younger children, the explanation cannot be due to simple identification of the stimuli. Indeed, word recall did not improve from grades 2/3 to 10/11, while picture recall improved substantially. The interpretation of Whitehouse et al. is that the picture superiority effect is “contingent on the encoding of pictorial information through two different routes.” While many would have predicted that the picture superiority effect would decrease from elementary school to secondary school, Whitehouse et al. speculate that the converse finding results from the development of inner speech with age, and that inner speech allows for the dual-code advantage postulated by Paivio (1971).

Our interpretation is similar but is based on lower familiarity of concepts for young children. Concepts that have a lower level of activation are more difficult to bind to an episode, making recall more difficult. The picture task uses more WM resources because the picture has to be translated into a word to get the second code. Within SAC, the recovery rate of the WM pool takes time and is affected by the amount of depletion. We would argue that in grades 2/3, fewer of the pictures benefit from the secondary code, but as each of the concepts gets stronger, the number of concepts that can be bound to the episode node increases. In other words, it is the tacit secondary task of converting pictures to words that creates the dual codes but also taxes WM, meaning that more concepts fail to be bound.

D. EXTREMELY LOW-FREQUENCY STIMULI: EXPERIENCE ENABLES UNITIZATION (CHUNKING)
Although low-frequency words typically show an advantage in tests of recognition memory over high-frequency words, this effect is reversed when rare words (e.g., “iatrogenic”) are used (Schulman, 1976). We believe this is because the rare words are so unusual that they are not chunks. Stimuli that are not chunks have a weak node binding the components together, and WM resources are used to bind together the constituents of the rare stimulus rather than binding it to the experimental context.

Another study from our lab (Reder et al., 2006a) provided additional support for the notion that unfamiliar stimuli are difficult to encode and therefore bind to context, despite their unusual status. Subjects studied words, photographs, and abstract pictures for a subsequent recognition test on the same day. Each subject participated in two sessions with two separate lists of stimuli. In one session they received an injection of the drug midazolam, a benzodiazepine that creates temporary anterograde amnesia, before studying the list of items that they would then have to recognize. In the other session they studied different items from the same stimulus classes, but after an injection of saline. Neither the participant, nor the nurse, nor the experimenter knew on which day a particular subject was given saline or midazolam (i.e., testing conditions were double blind). The striking result was that midazolam affected recognition memory for words most and affected memory for abstract pictures least (Fig. 8). Our explanation for this result is that (a) midazolam only affects the ability to create new bindings (Park, Quinlan, Thornton, & Reder, 2004; Reder et al., 2006b) and (b) only a unitized chunk can be bound to an experimental context. The abstract pictures could not be bound to the experimental context even in the saline condition, and therefore the effect of the drug was minimized for that stimulus class.

[Figure 8: recognition memory (d′, 0 to 3) by stimulus type (abstracts, photographs, words), with separate series for midazolam and saline.]
Fig. 8. Recognition memory measured in d′ as a function of stimulus type (words, photographs, and abstract pictures) and drug condition (saline vs midazolam). From Reder et al., 2006a, p. 565. Copyright 2006 by the Association for Psychological Science. Reprinted with permission of the author.

Another finding by Dobbins and Kroll (2005) can be interpreted as supporting our hypothesis. They found that recognition memory was superior for scenes and faces that were known, but that the advantage for those stimulus types was eliminated when subjects were forced to respond quickly or when testing was delayed for one week. Our interpretation is that binding concepts to experimental context is much more likely for known faces and scenes; however, if responding must be rapid, judgments are based on familiarity and so there is no advantage to having formed an episode node. With a one-week delay the episode node and link will have decayed substantially, making reliance on familiarity the dominant process.
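Figure 8 expresses recognition accuracy as d′. For readers who want to compute this measure, the sketch below uses the standard equal-variance signal detection definition, d′ = z(hit rate) − z(false-alarm rate). Whether Reder et al. (2006a) applied any correction for extreme rates is not stated here, so none is assumed; the function name and the example rates are illustrative only and are not data from the study.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance signal detection sensitivity: z(H) - z(FA).

    Both rates must lie strictly between 0 and 1; NormalDist.inv_cdf
    raises StatisticsError otherwise.
    """
    z = NormalDist().inv_cdf           # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates, not data from the study.
no_discrimination = d_prime(0.50, 0.50)
good_discrimination = d_prime(0.84, 0.16)
```

Equal hit and false-alarm rates give a d′ of zero, and d′ grows as hits rise and false alarms fall, which is why it is a convenient single index for comparing stimulus classes and drug conditions.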
The notion that unitization requires prior experience is not a new idea. Hayes‐Roth (1977) and Servan‐Schreiber (1991) have hypothesized something similar; however, no one has thus far suggested that the strength of a chunk predicts the probability of encoding it and binding it to other chunks. Our explanation is that an item with no prior representation must be encoded in terms of the component features that are strongly activated. With repeated exposure, the node that binds the constituents together becomes a chunk in its own right, forming a new, higher‐level chunk involving the grouping of these features. At that point the higher‐level chunk is suYciently strong (i.e., has strong enough base‐level activation) to be bound with other co‐occuring stimuli or bound to the experimental context to make an episodic event. The abstract pictures had not been experienced before and recognition could only be based on the familiarity of the elements that were primed from exposure.13 Further support for the notion that chunks are constructed as their constituent elements become more familiar comes from studies with chess masters (Chase & Simon, 1973; Simon & Gilmartin, 1973) who have acquired thousands of hours of experience with various chess patterns. Although chess masters are much better than novices at reproducing a chessboard configuration when it was displayed tachistoscopically (very briefly), they are not better than a novice if the configuration of chess pieces on the board is random (de Groot, 1965). In addition, the latency between chess pieces that were put down on the board to reproduce the flashed display mirrored the chunks that one would expect. That is, subjects had shorter pauses when putting down pieces within a chunk (e.g., a Sicilian defense), but longer pauses when switching to recall of another chunk. IV.
General Discussion
Sometimes psychologists will say with a wry smile, ‘‘Psychology is the science penetrating the obvious.’’ Whether or not that adage is valid, it seems obvious (with hindsight) that experience should facilitate encoding. However, it has also been demonstrated that novel stimuli attract far more attention, and it has often been claimed that the disadvantage of high‐frequency words in recognition results from poorer encoding. In this chapter, we have argued that high‐frequency words are encoded more easily than low‐ frequency words, but that their deficit in recognition occurs despite their encoding advantage. 13 There is also the possibility of recollection from a subset of the features, that is binding some of the features to context. The danger with that strategy is that the features that are strong enough to bind to context could also be shared with foil pictures.
The important contribution of this chapter is not articulating what some might consider the obvious (at least in hindsight), but rather articulating a mechanistic account of when and why familiarity helps encoding. That is, the familiarity advantage at encoding matters more when there is a demand on WM resources. We also offered an explanation of how and why familiarity enables the binding of context to concepts. Finally, we reviewed the evidence that knowing more about a concept means that retrieving any one fact about it is slower or less accurate. This seems obvious when reframed as “it is harder to find a specific strand of hay in a haystack than on a clean floor.” If the details of the retrieved information are unimportant, then the effect of fan goes away or even reverses.
This chapter went beyond verbal explanations to account for classes of phenomena. We offered a computationally implemented model that accounts for both the costs of experience at retrieval and the benefits of experience at encoding within the same framework. We went beyond demonstrations of qualitative fits to the empirical data and provided excellent quantitative fits that involved estimating few new free parameters (i.e., most parameter values have remained the same across all SAC models). We did not attempt to fit all of the reported data that provide converging evidence for our point of view, but we are confident that these phenomena could also be modeled within our framework. We have also fit some phenomena that we did not describe, such as the differential effects of word frequency as a function of presentation rate.
A. EXPLAINING RELATED PHENOMENA WITH OUR MODEL
All of the phenomena that we have modeled have involved either simple numerical problems or words, word pairs, and perceptual contexts (e.g., font or voice). These domains have the property that individual differences in semantic memory are not too relevant to performance (unless one gets into free‐recall tasks) and we do not need to model language parsing. In order to model phenomena that involve the semantics of the stimuli we would need to speculate on the semantic content of people’s memories, a complex task that we do not feel equipped to undertake. Nonetheless, a number of ideas described here apply to other phenomena that have not been modeled in SAC but seem consistent with the architectural principles. For example, we reviewed the findings that new information about famous people can produce fan effects (interference) with real‐world knowledge about them when the task requires retrieval of specific facts rather than consistency judgments about these people. The explanation that chess masters acquire higher‐level chunks by building them up from constituent (smaller) chunks with experience is also predicted by the model. A further prediction of our model is that if chess masters were presented
with chess configurations at the same time as orally presented words, then recognition for the chess patterns presented would show a “mirror effect” such that the very common patterns would have fewer hits and more false alarms than the somewhat less common chess configurations; however, we would also predict that the words presented with the common chess patterns would produce more “Remember” responses than those words studied with the less common (lower frequency) chess patterns, analogous to what we have seen with words and pictures (except that here the chess patterns take the role played by the words in our predictions).
Our explanation of why high‐frequency words are easier to encode involves the assumption that they have a higher resting level of activation, which we have also used to explain the misattribution of activation that creates spurious familiarity judgments. This assumption follows from the architectural principles of strengthening chunks with repeated exposures and also explains a number of other phenomena associated with words of different frequency. For example, word naming tasks, used primarily in the study of semantic memory, show a high‐frequency advantage such that high‐frequency words produce faster responses than low‐frequency words (Frost & Katz, 1989). Also consistent with our framework, when a secondary task is added to the word naming task, effects of secondary task difficulty are larger for low‐frequency words than for high‐frequency words (Becker, 1976; Goldinger, Azuma, Abramson, & Jain, 1997). When longer delays between word presentation and response are used, the high‐frequency advantage disappears (Becker, 1976; Connine, Mullenix, Shernoff, & Yelen, 1990). Seidenberg (1985) argued that higher frequency words are more visually familiar and this visual familiarity allows lexical access without generation of phonology.
With regard to the current question of the effects of frequency at encoding, this idea could be simplified to the view that access to memory representations of high‐frequency words is faster than access to representations of low‐frequency words.
V. Summary and Conclusions
In this chapter, we have proposed that experience can facilitate cognition, but that it also carries costs. We have provided both empirical evidence to support these claims and a computational mechanism to show how these processes interact with other aspects of the mind. Our cognitive architecture also has neurophysiological support for its assumptions. For example, there is evidence that repetition priming produces a reduction in the BOLD response (see Henson, 2003 for a review), consistent with the idea that a node with a stronger base‐level activation (from recent boosts in activation) requires less processing to get to threshold. Likewise, there is evidence that
high‐frequency words produce a reduced signal both in fMRI (de Zubicaray, McMahon, Eastburn, Finnigan, & Humphreys, 2005) and EEG (Hauk & Pulvermuller, 2004) compared with low‐frequency words, which is also consistent with our assumptions. In addition, there is neuroimaging evidence that increased fan creates a greater BOLD response, which supports the view that it is more difficult to retrieve something for which there are more associations (D’Arcy, Ryner, Richter, Service, & Connolly, 2004).
The first half of this chapter reviewed the evidence for the important role of experience at retrieval. We argued that greater experience makes retrieval of specific facts more difficult, but that it facilitates judgments based on inference (familiarity based, consistency based, and so on). As we age, we have more wisdom, and more knowledge and more experience, so it is natural that we rely more on this experience and make more inferential judgments.
The second half of this chapter extended our implemented mechanistic account of implicit and explicit memory effects that can account for the mirror effect of word frequency among many other phenomena. In the augmentation of SAC, we provided insights as to how familiarity can provide an advantage in cognitive processing by facilitating encoding. The value of a computational model such as SAC is that it can be integrated to explain many phenomena with the same set of assumptions. As Herb Simon said, “If the goal of psychology is to prove a theory wrong, we can all go home now because all theories are wrong” (personal communication, 2000). Yet Herb Simon was one of the strongest advocates for developing computational models and frameworks or architectures. The goal is to move toward closer and closer approximations to the truth by building models that can account for more and more phenomena.
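As a minimal illustration of the two opposing influences summarized in this chapter, the sketch below computes a base‐level strength that grows with repeated exposure and the share of activation that a single association receives as fan increases. It is not the SAC implementation: the log of summed power‐law‐decayed traces follows the ACT‐R base‐level learning form (Anderson & Lebiere, 1998), and the decay value, the exposure times, the link strengths, and the function names are all hypothetical choices made for illustration.

```python
import math


def base_level(exposure_ages, decay=0.5):
    """Log of summed power-law-decayed traces (ACT-R-style form, assumed here).

    exposure_ages: time elapsed since each prior exposure (arbitrary units, > 0).
    decay: assumed decay exponent; 0.5 is a conventional illustrative value.
    """
    return math.log(sum(age ** -decay for age in exposure_ages))


def activation_received(source_activation, link_strengths, target):
    """Share of a source node's activation that reaches one associate.

    Activation is divided among all links in proportion to link strength,
    so each associate receives less as the number of links (fan) grows.
    """
    total_strength = sum(link_strengths.values())
    return source_activation * link_strengths[target] / total_strength


# Hypothetical exposure histories: a rarely and a frequently encountered item.
rare = base_level([100.0, 400.0])
common = base_level([10.0, 50.0, 100.0, 200.0, 400.0, 800.0])

# Hypothetical concepts with one association versus four equally strong ones.
low_fan = activation_received(1.0, {"fact_a": 1.0}, "fact_a")
high_fan = activation_received(
    1.0, {"fact_a": 1.0, "fact_b": 1.0, "fact_c": 1.0, "fact_d": 1.0}, "fact_a"
)
```

With these assumed values, the frequently encountered item has the higher base‐level strength, while each association of the high‐fan concept receives one quarter of the activation that the single association of the low‐fan concept receives.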
ACKNOWLEDGMENTS
Preparation of this chapter and the work described herein were supported by National Institute of Mental Health grants 5R01MH052808 and 5T32MH019983. The latter training grant on “Combined Computational and Behavioral Approaches to the Study of Cognition” supported R.A.D. and C.P. The authors wish to thank Joyce Oates, Paul Kieffaber, and Alexandra Tsarenko for their comments on an earlier version of the chapter.
REFERENCES
Anderson, J. R. (1974). Retrieval of propositional information from long‐term memory. Cognitive Psychology, 6, 451–474.
Anderson, J. R. (1976). Language, memory, and thought. Hillsdale, NJ: Erlbaum.
Anderson, J. R., & Betz, J. (2001). A hybrid model of categorization. Psychonomic Bulletin & Review, 8(4), 629–647.
Anderson, J. R., & Bower, G. H. (1973). Human associative memory. Washington: Winston and Sons.
Anderson, J. R., & Lebiere, C. (1998). Atomic components of thought. Mahwah, NJ: Erlbaum.
Anderson, J. R., & Paulson, R. (1978). Interference in memory for pictorial information. Cognitive Psychology, 10, 178–202.
Anderson, J. R., & Reder, L. M. (1999). The fan effect: New results and new theories. Journal of Experimental Psychology: General, 128, 186–197.
Anderson, J. R., Reder, L. M., & Lebiere, C. (1996). Working memory: Activation limitations on retrieval. Cognitive Psychology, 30, 221–256.
Anderson, J. R., & Schooler, L. J. (1991). Reflections of the environment in memory. Psychological Science, 2, 396–408.
Balota, D. A., Burgess, G. C., Cortese, M. J., & Adams, D. R. (2002). The word‐frequency mirror effect in young, old, and early‐stage Alzheimer’s disease: Evidence for two processes in episodic recognition performance. Journal of Memory and Language, 46, 199–226.
Balota, D. A., & Ferraro, F. R. (1996). Lexical, sublexical, and implicit memory processes in healthy young and healthy older adults and in individuals with dementia of the Alzheimer type. Neuropsychology, 10(1), 82–95.
Becker, C. A. (1976). Allocation of attention during visual word recognition. Journal of Experimental Psychology: Human Perception & Performance, 2(4), 556–566.
Benjamin, A. S. (2003). Predicting and postdicting the effects of word frequency on memory. Memory & Cognition, 31(2), 297–305.
Buchler, N. E. G., & Reder, L. M. (2007). Modeling age‐related memory deficits: A two‐parameter solution. Psychology & Aging, 22(1), 104–121.
Burke, D. M., & Light, L. L. (1981). Memory and aging: The role of retrieval processes. Psychological Bulletin, 90, 513–546.
Cary, M., & Reder, L. M. (2002). Metacognition in strategy selection: Giving consciousness too much credit. In P. Chambres, M. Izaute, and P. J. Marescaux (Eds.), Metacognition: Process, function, and use (pp. 63–78). New York, NY: Kluwer.
Castel, A. D., & Craik, F. I. M. (2003). The effects of aging and divided attention on memory for item and associative information. Psychology and Aging, 18, 873–885.
Chalfonte, B. L., & Johnson, M. K. (1996). Feature memory and binding in young and older adults. Memory & Cognition, 24, 403–416.
Chase, W. G., & Simon, H. A. (1973). The mind’s eye in chess. In W. G. Chase (Ed.), Visual information processing (pp. 215–281). New York, NY: Academic Press.
Clark, S. E. (1992). Word frequency effects in associative and item recognition. Memory & Cognition, 20(3), 231–243.
Clark, S. E., & Burchett, R. E. R. (1994). Word frequency and list composition effects in associative recognition and recall. Memory & Cognition, 22(1), 55–62.
Connine, C. M., Mullenix, J., Shernoff, E., & Yelen, J. (1990). Word familiarity and frequency in visual and auditory word recognition. Journal of Experimental Psychology: Learning, Memory, & Cognition, 16(6), 1084–1096.
Cook, S., Reder, L. M., Buchler, N., Hashemi, S., & Dickison, D. A mechanistic account of spurious recollection. Manuscript in preparation.
Daily, L. Z., Lovett, M. C., & Reder, L. M. (2001). Modeling individual differences in working memory performance: A source activation account. Cognitive Science, 25, 315–353.
D’Arcy, R. C. N., Ryner, L., Richter, W., Service, E., & Connolly, J. F. (2004). The fan effect in fMRI: Left hemisphere specialization in verbal working memory. Neuroreport: For Rapid Communication of Neuroscience Research, 15(12), 1851–1855.
Deese, J. (1960). Frequency of usage and number of words in free recall: The role of association. Psychological Reports, 7, 337–344.
de Groot, A. D. (1965). Thought and choice in chess. The Hague, The Netherlands: Mouton.
de Zubicaray, G. I., McMahon, K. L., Eastburn, M. M., Finnigan, S., & Humphreys, M. (2005). fMRI evidence of word frequency and strength effects during episodic memory encoding. Cognitive Brain Research, 22, 439–450.
Dewhurst, S. A., Hitch, G. J., & Barry, C. (1998). Separate effects of word frequency and age of acquisition in recognition and recall. Journal of Experimental Psychology: Learning, Memory, & Cognition, 24(2), 284–298.
Diana, R. A., Peterson, M. J., & Reder, L. M. (2004). The role of spurious feature familiarity in recognition memory. Psychonomic Bulletin & Review, 11(1), 150–156.
Diana, R. A., & Reder, L. M. (2006). The low‐frequency encoding disadvantage: Word frequency affects processing demands. Journal of Experimental Psychology: Learning, Memory, & Cognition, 32(4), 805–815.
Diana, R. A., Reder, L. M., Arndt, J., & Park, H. (2006). Models of recognition: A review of arguments in favor of a dual‐process account. Psychonomic Bulletin & Review, 13, 1–21.
Dobbins, I. G., & Kroll, N. E. A. (2005). Distinctiveness and the recognition mirror effect: Evidence for an item‐based criterion placement heuristic. Journal of Experimental Psychology: Learning, Memory, & Cognition, 31(6), 1186–1198.
Fagan, J. F., III (1970). Memory in the infant. Journal of Experimental Child Psychology, 9, 217–226.
Frost, R., & Katz, L. (1989). Orthographic depth and the interaction of visual and auditory processing in word recognition. Memory & Cognition, 17(3), 302–310.
Gardiner, J. M., & Java, R. I. (1990). Recollective experience in word and non‐word recognition. Memory & Cognition, 18, 23–30.
Glanzer, M., & Adams, J. K. (1985). The mirror effect in recognition memory. Memory & Cognition, 13(1), 8–20.
Glanzer, M., & Adams, J. K. (1990). The mirror effect in recognition memory: Data and theory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 16, 5–16.
Glanzer, M., & Bowles, N. (1976). Analysis of the word‐frequency effect in recognition memory. Journal of Experimental Psychology: Human Learning and Memory, 2, 21–31.
Goldinger, S. D., Azuma, T., Abramson, M., & Jain, P. (1997). Open wide and say “Blah!” Attentional dynamics of delayed naming. Journal of Memory and Language, 37, 190–216.
Gorman, A. M. (1961). Recognition memory for nouns as a function of abstractness and frequency. Journal of Experimental Psychology, 61(1), 23–29.
Greene, R. L., & Thapar, A. (1994). Mirror effect in frequency discrimination. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(4), 946–952.
Guttentag, R. E., & Carroll, D. (1997). Recency judgments as a function of word frequency: A framing effect and frequency misattributions. Psychonomic Bulletin & Review, 4(3), 411–415.
Hauk, O., & Pulvermuller, F. (2004). Effects of word length and frequency on the human event‐related potential. Clinical Neurophysiology, 115(5), 1090–1103.
Hayes‐Roth, B. (1977). Evolution of cognitive structures and processes. Psychological Review, 84(3), 260–278.
Henson, R. N. A. (2003). Neuroimaging studies of priming. Progress in Neurobiology, 70, 53–81.
Hintzman, D. L., Caulton, D. A., & Curran, T. (1994). Retrieval constraints and the mirror effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(2), 275–289.
Hirshman, E., Fisher, J., Henthorn, T., Arndt, J., & Passannante, A. (2002). Midazolam amnesia and dual‐process models of the word‐frequency mirror effect. Journal of Memory and Language, 47(4), 499–516.
Hockley, W. E. (1994). Reflections of the mirror effect for item and associative recognition. Memory & Cognition, 22, 713–722.
Jacoby, L. L. (1991). A process‐dissociation framework: Separating automatic from intentional uses of memory. Journal of Memory and Language, 30, 513–541.
Johnston, W. A., Hawley, K. J., Plewe, S. H., Elliott, J. M., & DeWitt, M. J. (1990). Attention capture by novel stimuli. Journal of Experimental Psychology: General, 119(4), 397–411.
Joordens, S., & Hockley, W. E. (2000). Recollection and familiarity through the looking glass: When old does not mirror new. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(6), 1534–1555.
Kintsch, W. (1974). The representation of meaning in memory. Hillsdale, NJ: Lawrence Erlbaum Associates.
Kliegl, R., & Lindenberger, U. (1993). Modeling intrusions and correct recall in episodic memory: Adult age differences in encoding of list context. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 617–637.
Koriat, A. (2000). The feeling of knowing: Some metatheoretical implications for consciousness and control. Consciousness and Cognition, 9, 149–171.
Kucera, H., & Francis, N. W. (1967). Computational analysis of present‐day American English. Providence, RI: Brown University Press.
Lemaire, P., & Reder, L. M. (1999). What affects strategy selection in arithmetic? An examination of parity and five effects on product verification. Memory & Cognition, 27(2), 364–382.
Lewis, C. H., & Anderson, J. R. (1976). Interference with real world knowledge. Cognitive Psychology, 8, 311–335.
Light, L. L., Healy, M. R., Patterson, M. M., & Chung, C. (2005). Dual‐process models of memory in young and older adults: Evidence from associative recognition. In L. Taconnat, D. Clarys, S. Vanneste, and M. Isingrini (Eds.), Manifestations cognitives du vieillissement psychologique: Actes des VIIemes journees du vieillessement cognitif (pp. 95–116). Paris, France: Publibook.
Lovett, M. C., Daily, L. Z., & Reder, L. M. (2000). A source activation theory of working memory: Cross‐task prediction of performance in ACT‐R. Journal of Cognitive Systems Research, 1, 99–118.
Lovett, M. C., Reder, L. M., & Lebiere, C. (1997). Modeling individual differences in a digit working memory task. Proceedings of the Nineteenth Annual Cognitive Science Conference, 460–465. Mahwah, NJ: Erlbaum.
Lovett, M. C., & Schunn, C. D. (1999). Task representations, strategy variability, and base‐rate neglect. Journal of Experimental Psychology: General, 128(2), 107–130.
MacLeod, C. M., & Kampe, K. E. (1996). Word frequency effects on recall, recognition, and word fragment completion tests. Journal of Experimental Psychology: Learning, Memory, & Cognition, 22(1), 132–142.
Maddox, W. T., & Estes, W. K. (1997). Direct and indirect stimulus‐frequency effects in recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 539–559.
Malmberg, K. J., & Murnane, K. (2002). List composition and the word‐frequency effect for recognition memory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 28(4), 616–630.
Mandler, G. (1980). Recognizing: The judgment of previous occurrence. Psychological Review, 87(3), 252–271.
McClelland, J. L., & Rumelhart, D. E. (1985). Distributed memory and the representation of general and specific information. Journal of Experimental Psychology: General, 114(2), 159–188.
Naveh‐Benjamin, M. (2000). Adult age differences in memory performance: Tests of an associative deficit hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1170–1187.
Nelson, A. B., & Shiffrin, R. M. (2006). Modeling the effect of differential experience on perception and memory [abstract]. Proceedings of the Psychonomic Society, USA Vol. 11, p. 17.
Norman, D. A., Rumelhart, D. E., & the LNR Research Group. (1975). Explorations in cognition. San Francisco, CA: Freeman.
Paivio, A. (1971). Imagery and verbal processes. New York, NY: Holt, Rinehart, Winston.
Park, H., Arndt, J. D., & Reder, L. M. (2006). A contextual interference account of distinctiveness effects in recognition. Memory & Cognition, 34(4), 743–751.
Park, H., Quinlan, J. J., Thornton, E. R., & Reder, L. M. (2004). The effect of midazolam on visual search: Implications for understanding amnesia. Proceedings of the National Academy of Sciences, 101(51), 17879–17883.
Peterson, S. B., & Potts, G. R. (1982). Global and specific components of information integration. Journal of Verbal Learning & Verbal Behavior, 21(4), 403–420.
Quamme, J. R., Frederick, C., Kroll, N. E., Yonelinas, A. P., & Dobbins, I. G. (2002). Recognition memory for source and occurrence: The importance of recollection. Memory & Cognition, 30(6), 893–907.
Radvansky, G. A. (1999). The fan effect: A tale of two theories. Journal of Experimental Psychology: General, 128(2), 198–206.
Rao, K. V., & Proctor, R. W. (1984). Study‐phase processing and the word frequency effect in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10(3), 386–394.
Reder, L. M. (1979). The role of elaborations in memory for prose. Cognitive Psychology, 11, 221–234.
Reder, L. M. (1982). Plausibility judgments vs. fact retrieval: Alternative strategies for sentence verification. Psychological Review, 89, 250–280.
Reder, L. M. (1987). Strategy selection in question answering. Cognitive Psychology, 19(1), 90–138.
Reder, L. M. (1988). Strategic control of retrieval strategies. In G. Bower (Ed.), The psychology of learning and motivation (Vol. 22, pp. 227–259). New York, NY: Academic Press.
Reder, L. M., & Anderson, J. R. (1980). A partial resolution of the paradox of interference: The role of integrating knowledge. Cognitive Psychology, 12, 447–472.
Reder, L. M., Angstadt, P., Cary, M., Erickson, M. A., & Ayers, M. S. (2002). A reexamination of stimulus‐frequency effects in recognition: Two mirrors for low‐ and high‐frequency pseudowords. Journal of Experimental Psychology: Learning, Memory and Cognition, 28, 138–152.
Reder, L. M., Donavos, D. K., & Erickson, M. A. (2002). Perceptual match effects in direct tests of memory: The role of contextual fan. Memory & Cognition, 30(2), 312–323.
Reder, L. M., Park, H., & Kieffaber, P. D. (2007a). Memory systems do not divide on consciousness. Under review.
Reder, L. M., Oates, J. M., Dickison, D., Anderson, J. R., Gyulai, F., Quinlan, J. J., et al. (2007b). Retrograde facilitation under midazolam: The role of general and specific interference. Psychonomic Bulletin & Review, 14(2), 261–269.
Reder, L. M., Nhouyvanisvong, A., Schunn, C. D., Ayers, M. S., Angstadt, P., & Hiraki, K. (2000). A mechanistic account of the mirror effect for word frequency: A computational model of remember/know judgments in a continuous recognition paradigm. Journal of Experimental Psychology: Learning, Memory, & Cognition, 26(2), 294–320.
Reder, L. M., Oates, J. M., Thornton, E. R., Quinlan, J. J., Kaufer, A., & Sauer, J. (2006a). Drug‐induced amnesia hurts recognition, but only for memories that can be unitized. Psychological Science, 17(7), 562–567.
Reder, L. M., Proctor, I., Anderson, J. R., Gyulai, F., Quinlan, J. J., & Oates, J. M. (2006b). Midazolam does not inhibit association formation, just its storage and strengthening. Psychopharmacology, 188(4), 462–471.
Reder, L. M., & Ritter, F. (1992). What determines initial feeling of knowing? Familiarity with question terms, not with the answer. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 435–451.
Reder, L. M., & Ross, B. H. (1983). Integrated knowledge in different tasks: The role of retrieval strategy on fan effects. Journal of Experimental Psychology: Learning, Memory and Cognition, 9, 55–72.
Reder, L. M., & Schunn, C. D. (1996). Metacognition does not imply awareness: Strategy choice is governed by implicit learning and memory. In L. M. Reder (Ed.), Implicit memory and metacognition (pp. 45–77). Mahwah, NJ: L. Erlbaum.
Reder, L. M., Weber, K., Shang, Y., & Vanyukov, P. (2003). The adaptive character of the attentional system: Statistical sensitivity in a target localization task. Journal of Experimental Psychology: Human Perception and Performance, 29(3), 631–649.
Reder, L. M., Wible, C., & Martin, J. (1986). Differential memory changes with age: Exact retrieval versus plausible inference. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12(1), 72–81.
Rugg, M. D., Cox, C., Doyle, M. C., & Wells, T. (1995). Event‐related potentials and the recollection of low and high frequency words. Neuropsychologia, 33(4), 471–484.
Schulman, A. I. (1976). Memory for rare words previously rated for familiarity. Journal of Experimental Psychology: Human Learning & Memory, 2(3), 301–307.
Schunn, C. D., Lovett, M. C., & Reder, L. M. (2001). Awareness and working memory in strategy adaptivity. Memory & Cognition, 29(2), 254–266.
Schunn, C. D., & Reder, L. M. (1998). Strategy adaptivity and individual differences. In D. L. Medin (Ed.), The psychology of learning and motivation (pp. 115–154). Academic Press, York: PA.
Schunn, C. D., Reder, L. M., Nhouyvanisvong, A., Richards, D. R., & Stroffolino, P. J. (1997). To calculate or not calculate: A source activation confusion (SAC) model of problem‐familiarity’s role in strategy selection. Journal of Experimental Psychology: Learning, Memory, & Cognition, 23, 1–27.
Seidenberg, M. S. (1985). The time course of phonological code activation in two writing systems. Cognition, 19, 1–30.
Servan‐Schreiber, E. (1991). The competitive chunking theory: Models of perception, learning, and memory. Dissertation Abstracts International, 52(6‐B), 3279.
Shrager, J., & Siegler, R. S. (1998). SCADS: A model of children’s strategy choices and strategy discoveries. Psychological Science, 9(5), 405–410.
Simon, H. A., & Gilmartin, K. (1973). A simulation of memory for chess positions. Cognitive Psychology, 5, 29–46.
Simons, J. S., Dodson, C. S., Bell, D., & Schacter, D. L. (2004). Specific‐ and partial‐source memory: Effects of aging. Psychology and Aging, 19, 689–694.
Smith, E. E., Adams, N., & Schorr, D. (1978). Fact retrieval and the paradox of interference. Cognitive Psychology, 10, 438–464.
Sokolov, E. N. (1963). Higher nervous functions: The orienting reflex. Annual Review of Physiology, 25, 545–580.
Spehn, M. K., & Reder, L. M. (2000). The Role of Familiarity and Associative Competition in Building Novel Structures. Unpublished manuscript.
Spencer, W. D., & Raz, N. (1995). Differential effects of aging on memory for content and context: A meta‐analysis. Psychology and Aging, 10, 527–539.
Sun, R. (2000). Explicit and implicit processes of metacognition. (Tech. Rep. No. TR‐00–001) Columbia, MO: University of Missouri, CECS Department.
Tulving, E. (1985). Memory and consciousness. Canadian Psychology, 26, 1–12.
Watkins, M. J., LeCompte, D. C., & Kim, K. (2000). Role of study strategy in recall of mixed lists of common and rare words. Journal of Experimental Psychology: Learning, Memory, & Cognition, 26(1), 239–245.
Whitehouse, A. J. O., Maybery, M. T., & Durkin, K. (2006). The development of the picture‐superiority effect. British Journal of Developmental Psychology, 24(4), 767–773.
Whittlesea, B. W., Jacoby, L. L., & Girard, K. (1990). Illusions of immediate memory: Evidence of an attributional basis for feelings of familiarity and perceptual quality. Journal of Memory and Language, 29(6), 716–732.
Yonelinas, A. P. (1994). Receiver‐operating characteristics in recognition memory: Evidence for a dual‐process model. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(6), 1341–1354.
Zbrodoff, N. J. (1995). Why is 9 + 7 harder than 2 + 3? Strength and interference as explanations of the problem‐size effect. Memory & Cognition, 23, 689–700.
TOWARD AN UNDERSTANDING OF INDIVIDUAL DIFFERENCES IN EPISODIC MEMORY: MODELING THE DYNAMICS OF RECOGNITION MEMORY
Kenneth J. Malmberg
I. A Possible Relationship Between the Speed‐Accuracy Trade‐Off and Individual Differences in Associative Recognition
Viewing the use of memory as skilled cognition emphasizes the interaction between the nature of the task, the nature of memory, and prior knowledge. The major implication is that some people can be better rememberers than others based on their skill as rememberers or other strategic factors; memory performance can change based on differences in one’s goals and in one’s meta‐level knowledge of how to achieve one’s mnemonic goals, even when memory is intact and the nature of the memory task is static. While individual differences are usually treated as random factors and conveniently washed away in the averaging process, differences in the strategic use of memory almost certainly characterize different populations and discriminate among individuals of a given population. In some cases, the systematic differences in goals and prior knowledge are likely to be the major source of variance in memory performance. Understanding individual differences in memory is one of the most important goals for future memory research (cf. Lewandowsky & Heit, 2006). Over the past 125 or so years, memory research has focused on explaining the
THE PSYCHOLOGY OF LEARNING AND MOTIVATION VOL. 48 DOI: 10.1016/S0079-7421(07)48008-2
Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00
central tendencies in our observations, largely ignoring the variability in our observations. This approach to memory research served Ebbinghaus (who was his own only subject) and others well, but it puts severe limitations on the scope and usefulness of our understanding of memory. A long‐term goal should be to account for different sources of variance in our observations, including individual differences owing to structural and strategic factors. The short‐term goal of this chapter is to identify different potential sources of variability in recognition memory performance and to show how they differentially impact performance.
Are individual differences due to random error or systematic factors that influence the manner in which associative recognition is performed? While most would accept the proposition that both random and systematic factors influence performance, little is known about the extent to which each is a contributor or how to discriminate between these sources of variability in memory performance. There are also likely to be different types of systematic influences on variability, and it is particularly important to consider that individuals might use different strategies to perform the same task in the same nominal situations. This could occur because there are differences in one’s goals, in the state of memory, and/or because subjects are more or less able to select an adaptive strategy for the task or situation.
If some individual differences are due to systematic factors, are they due to differences in the structural aspects of memory or the strategies or goals adopted? Associative recognition, for instance, requires the discrimination of pairs that were studied together (i.e., targets or intact pairs) from pairs that were not studied together (foils or rearranged pairs), and younger adults perform associative recognition better on average than older adults (Light, Patterson, Chung, & Healy, 2004).
Even within a given population, say a younger‐adult or older‐adult population, some individuals are better at associative recognition than others. Should the differences in performance found in younger versus older populations be attributed to the same factors that characterize the differences in performance within either of these populations? The answer to this question might affect how we view the memory deficits of those within a given population, and thus discriminating the impact of different sources of variability should be a clinically important goal of future memory research. (The performances of other memory tasks are also likely to be systematically affected by individual differences, but the focus of this chapter will be associative recognition.)
One way to assess the impact of structural changes in memory versus the effectiveness of different memory strategies is to implement changes in memory and different strategies in a formal model and observe its behavior. Indeed, I suppose that it would be difficult to discriminate among different sources of variability outside the framework of a formal model. For instance,
Toward an Understanding of Individual Differences in Episodic Memory
the patterns of performance observed in the models can be compared to the performance of different individuals or groups. To the extent that different models (or sets of parameters) are required to describe the performance of the individuals, one might be able to attribute the differences in performance to different strategies that were adopted versus differences in the structural aspects of memory. In subsequent sections, I will attempt to use a formal model to tease apart these sources of variance or individual differences. Because associative recognition can be performed in multiple ways and different populations show deficits in performance, it is an ideal task to explore as a means for understanding the various sources of individual differences in episodic memory performance. Such an endeavor, of course, requires a framework that acknowledges that memory performance can be affected by strategic factors. While most traditional models of discrimination can describe the effects of a variety of factors that influence memory, the flexibility in how a given task is performed is limited by their scope. Since these models often do not make assumptions about how memories are encoded, represented, or retrieved, their flexibility is limited to adopting different decision criteria or thresholds in response to the statistical properties of the environment (Green & Swets, 1966; Macmillan & Creelman, 1990; Rotello & Macmillan, this volume; Rotello, Macmillan, & Reeder, 2004). Some have argued that such impoverished models can explain recognition performance in all its varieties (Dunn, 2004; Rotello et al., 2004). Growing evidence suggests, however, that recognition does not necessarily involve only a simple discrimination between events that occurred and events that did not occur (Van Zandt & Maldonado-Molina, 2004; Malmberg & Xu, 2007).
Rather, the manner in which recognition tasks are performed also depends on the similarity between the targets and the foils (Malmberg, Holden, & Shiffrin, 2004; Malmberg, Zeelenberg, & Shiffrin, 2004; Malmberg & Xu, 2007), task demands (Jacoby, 1991; Malmberg & Xu, 2006), available retrieval cues (Criss & Shiffrin, 2005), the composition of the study lists (Hockley & Niewiadomski, 2007), and the expertise one has in processing different stimuli (Xu & Malmberg, 2007). Other findings suggest that the basis for yes-no recognition is different from that used to make a discrimination based on confidence ratings (Van Zandt & Maldonado-Molina, 2004; also see Baranski & Petrusic, 1998; Malmberg & Xu, 2007), that yes-no recognition is based on different information than judgments of frequency (Cleary, Curran, & Greene, 2001; Hintzman, Curran, & Oppy, 1992; Malmberg, Holden et al., 2004), and that confidence ratings and judgments of frequency are based on different information (Hintzman, 1986). Accounting for the aforementioned dissociations is difficult without some degree of model flexibility (cf. Malmberg, Holden et al., 2004), and several
Kenneth J. Malmberg
newer models have been devised as a result (Criss & Shiffrin, 2005; Kelley & Wixted, 2001; Malmberg, Holden et al., 2004; Reder et al., 2000). One important advance is that these newer models are more comprehensive in their description of memory, and additional strategies emerge from them as the representations and processes involved in recognition memory are fleshed out. Nevertheless, a potential problem arises when additional assumptions are made in order to account for the performance of different tasks and situations. The danger of proposing different models for different tasks and situations is that the theory might become rather post hoc in its organization of the data. To mitigate this danger, we have proposed that there are different ways to perform recognition tasks, and one selects (or at least should select) a recognition strategy that will achieve a subjectively determined level of accuracy in the shortest amount of time (Malmberg & Xu, 2007). Such a strategy is said to be efficient with respect to the subject's goals, and the construct of efficiency organizes the different models under a single theoretical framework. The efficiency hypothesis assumes that subjects have meta-level and/or perhaps implicit knowledge of how different tasks can be performed and how the processes involved in performing these tasks are constrained by time pressure and other situational factors. The ability of a subject to efficiently perform a recognition task depends on the ability to adapt to the current situation by adopting an appropriate strategy. This will depend on the quality of the subject's meta-level knowledge of his or her memory, the task, and the situation. In some cases, subjects may perform at a suboptimal level of accuracy because speed is deemed to be more important, or vice versa (Van Zandt & Maldonado-Molina, 2004; Malmberg & Xu, 2007).
In these situations, performance is suboptimal with respect to the level of accuracy that can potentially be achieved, but performance would nevertheless be considered efficient with respect to the subjective goal. The trade-off between speed and accuracy is well documented in the literature of cognitive psychology. However, the extent to which this relationship is used to affect how learning and memory tasks are performed is only now beginning to be understood. Nelson and Narens (1990; also Nelson & Leonesio, 1990) proposed that subjects adopt a standard to which they seek to learn new material, and they allocate various amounts of time to learning the different elements of the material in order to achieve their standard. For instance, the amount of self-paced study time allocated to a given item is correlated with a metacognitive ease-of-learning judgment, but subjects tend to underestimate the amount of learning needed to achieve a desired level of performance, a finding referred to as the "labor-in-vain" effect. What is less understood is the extent to which the speed-accuracy trade-off is taken into account when performing various memory tasks. Recent
associative recognition memory experiments suggest that subjects select a strategy that they determine allows them to achieve a desired level of accuracy in the shortest time possible. For instance, Malmberg and Xu (2007) varied the composition of the lists used to test memory, such that in some cases, pairs formed from unstudied items were also tested (i.e., XY pairs). Since the average similarity between the targets and foils is lower, the discrimination of XY pairs and rearranged pairs from intact pairs is relatively easy compared to the discrimination of only rearranged from intact pairs. In these experiments, the overall accuracy was similar across conditions (Malmberg & Xu, 2007). That is, the probability of endorsing intact pairs (hit rate) was similar, and of course the probability of incorrectly endorsing foils (false-alarm rate) was lower for XY pairs than for rearranged pairs. Critically, the false-alarm rate for rearranged pairs was greater when XY pairs were tested than when XY pairs were not tested. The hit rates were statistically similar across testing conditions, even though the overall time it took to perform the task was much less on average in the condition when XY pairs were tested. Thus, the ability to discriminate intact from rearranged pairs depended on testing conditions in a way that could not be attributed to factors such as bias, interference, attentional load, stress, or delay. Such findings suggested to us that perhaps metacognitive judgments are made prior to and/or during the retrieval of information from memory in order to maximize a subjective speed-accuracy trade-off (Malmberg & Xu, 2007). Understanding the impact on recognition performance of implementing different strategies can help us to understand the effect of different sources of variability in recognition performance and to differentiate differences in strategy selection from structural changes in memory.
It is quite possible that the appropriate selection of a strategy that meets one's goals vis-à-vis the speed-accuracy trade-off is a major source of individual differences. In order to achieve the level of specificity necessary to differentiate strategic versus structural sources, it is necessary to understand how these factors interact with the dynamics of recognition memory. Most memory models have focused on accounting for the accuracy of recognition and not the latency of recognition, and as we will see the latency with which one performs a task places constraints on the models. Other models are used to investigate speed-accuracy trade-offs without reference to any explicit assumptions about the nature of memory (Dosher, 1984; Gronlund & Ratcliff, 1989). More rarely, models have focused on the latency of recognition memory under conditions where accuracy was nearly perfect (Atkinson & Juola, 1974). Models that have addressed both the speed and accuracy of recognition memory have been subsequently disconfirmed (Diller, Nobel, & Shiffrin, 2001; Xu & Malmberg, 2007).
In this chapter, I will describe several new models of the accuracy and latency of associative recognition performance. Although I will discuss some important findings that constrain model development, quantitative model selection is not the primary goal. Nor is the immediate goal to identify individuals who have different tendencies to emphasize speed versus accuracy. Rather, the present purpose is to develop a better understanding of the potential sources of individual differences by formally describing the speed-accuracy trade-off within a recognition memory framework and how the models' behavior is influenced by various structural versus strategic factors. The models that I will describe are illustrative of many that have been verbally described in the past or implemented without reference to the nature of memory (Yonelinas, 1997), which is usually described as how information is encoded, represented, and retrieved. There are, however, a number of issues that need to be addressed in order to implement them, and their complexity does not readily lend itself to understanding based on intuition or casual reasoning. When working with relatively complex models, it is rarely the case that the initial set of assumptions is the one that is ultimately adopted, and usually only the final outcome of a modeling endeavor is reported. However, there is much to be gained by observing the modeling process, as it is perhaps more important to understand what implementations do not work than to identify an implementation that is sufficient. For instance, we might expect based on intuition that a particular shift in criteria will produce a speed-accuracy trade-off, only to find out after it is implemented that our intuitions were incorrect.
Thus, the purpose of this chapter is to identify issues that are important for understanding the dynamics of associative recognition and how structural versus strategic influences affect performance, and hence how individual differences might map onto these sources of variability. Before doing so, I will briefly describe traditional recognition memory procedures and several classical models of associative recognition in the context of findings that constrain them.
II. Traditional Testing Procedures
In single-item recognition experiments, subjects discriminate items that were studied from items that were not studied. Usually, items are randomly selected from a large corpus and assigned to either target or foil conditions for each subject. Thus, the targets and foils are only randomly similar to each other. In comparison, pairs of items are studied in most associative recognition experiments (A-B, C-D, E-F, and so on). Traditionally, the pairs consist of words, but there is a growing and increasingly important literature on the associative recognition of nonverbal and novel stimuli (Criss & Shiffrin, 2005; Cleary et al., 2001; Greene, 1996; Hockley, 1994; Xu & Malmberg, 2007). At test, associative recognition requires the subject to discriminate between intact (e.g., A-B, C-D) and rearranged pairs (e.g., A-D, C-B). Thus, the discrimination of intact from rearranged pairs involves discriminating between pairs that are similar to each other.
There are two traditional procedures for assessing the relationship between accuracy and latency. The signal-to-respond procedure forces subjects to report their recognition judgment within a narrow window subsequent to the presentation of the test stimulus (Diller et al., 2001; Dosher, 1984; Gronlund & Ratcliff, 1989; Light et al., 2004). The free-response procedure allows subjects to respond when they choose. The advantage of the signal-to-respond procedure is that it allows the researcher to assess the relationship of speed and accuracy at different points during retrieval. The disadvantage of the signal-to-respond procedure is that the severe constraints placed on the report might influence how subjects make recognition judgments.
III. Classical Models of Associative Recognition
Models are often characterized by the information used to make a recognition decision. Some models assume that recognition is based on a continuous random variable often conceptualized as familiarity. The compound-cue model (Gronlund & Ratcliff, 1989; also see Gillund & Shiffrin, 1984; Hintzman, 1986; Murdock, 1982) attempts to account for the time course of recognition accuracy derived from the signal-to-respond procedure by positing a simple random walk or diffusion process (Ratcliff, 1978). Accordingly, associative recognition involves a comparison of different types of cues with the contents of memory: concurrent cues and compound cues. Concurrent cues represent the individual items of a test stimulus, and they provide a measure of the familiarities of the individual test items when used as memory probes. A compound cue is composed of the two test items, and when used as a probe it provides a measure of how familiar the test pair is. Item familiarity tends to provide positive evidence for both intact and rearranged pairs, whereas associative evidence tends to provide positive evidence when an intact pair is tested but not when a rearranged pair is tested. According to the continuous-time version of the compound-cue model (i.e., a diffusion model), positive and negative discrete evidence (i.e., familiarity) is accumulated subsequent to the memory probe, starting at point Z, at a mean rate of u with a variance of s². Positive endorsements are made when the accumulated familiarity associated with the cue is greater than a
subjective "old" criterion and negative endorsements are made when it is less than a subjective "new" criterion. One can think of random walk models as dynamic versions of signal detection models (Green & Swets, 1966; Ratcliff, 1978), and they account for the latencies of a wide variety of binary decisions (Ratcliff & Smith, 2004). For instance, the random walk model provides a natural explanation for the speed-accuracy trade-off. Speeded decisions are made by decreasing the difference between the "yes" and "no" criteria, which leads to faster but more error-prone performance. A compound-cue random walk model can also predict a nonmonotonic relationship between false-alarm rates and the delay of the signal-to-respond judgment. Responses made at relatively short delays are influenced by the item familiarities to a greater degree than the associative familiarity, but responses made at relatively long delays are influenced by the associative familiarity to a greater degree than the item familiarities. Thus, false-alarm rates initially increase with response delay and then decrease (Dosher, 1984; Gronlund & Ratcliff, 1989). In addition to the nonmonotonic signal-to-respond false-alarm rate function, the effect of repetitions on false-alarm rates from the free-response procedure is a critical finding to be explained. Most random walk models do not describe how memories are represented or retrieved, and therefore they make very few if any useful predictions concerning the accuracy of recognition memory unless they are implemented in a model of memory. Because they assume that the evidence on which a decision is based is a continuous random variable, it makes the most sense to implement a compound-cue model of recognition in a global-matching model of familiarity (Shiffrin & Steyvers, 1997).
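The speed-accuracy trade-off that a random walk produces can be illustrated with a minimal simulation. The sketch below is not the compound-cue model itself: it assumes a single Gaussian evidence increment per step, a starting point of zero, and symmetric criteria, and the names `drift`, `noise_sd`, and `criterion` are hypothetical stand-ins for the mean rate, the variability, and the distance of the "old" and "new" criteria from the starting point.

```python
import random

def random_walk_trial(drift, criterion, rng, noise_sd=1.0, max_steps=10_000):
    """Accumulate noisy evidence from 0 until it crosses +criterion ("old")
    or -criterion ("new"). Returns (response, number_of_steps)."""
    evidence = 0.0
    for step in range(1, max_steps + 1):
        evidence += rng.gauss(drift, noise_sd)
        if evidence >= criterion:
            return "old", step
        if evidence <= -criterion:
            return "new", step
    # Hypothetical fallback: force a response from the sign of the evidence.
    return ("old" if evidence >= 0 else "new"), max_steps

def summarize(criterion, drift=0.2, n_trials=5000, seed=1):
    """Simulate target trials (positive drift); return the proportion of
    correct "old" responses and the mean number of steps to respond."""
    rng = random.Random(seed)
    results = [random_walk_trial(drift, criterion, rng) for _ in range(n_trials)]
    accuracy = sum(resp == "old" for resp, _ in results) / n_trials
    mean_steps = sum(steps for _, steps in results) / n_trials
    return accuracy, mean_steps

wide = summarize(criterion=4.0)    # cautious: criteria far apart
narrow = summarize(criterion=1.0)  # speeded: criteria close together
print("wide criteria:   accuracy=%.3f, mean steps=%.1f" % wide)
print("narrow criteria: accuracy=%.3f, mean steps=%.1f" % narrow)
```

Under these assumptions, moving the criteria closer together shortens the walk and lowers accuracy, which is the qualitative pattern described above.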
Simple versions of these models predict that increasing the number of times that pairs are studied should increase false-alarm rates (Shiffrin & Steyvers, 1997; Xu & Malmberg, 2007; see Criss & McClelland, 2006, for an analysis of the effect of list composition). In contrast, there is often little or no effect of repetitions on false-alarm rates, and sometimes false-alarm rates decrease (Cleary, Curran, & Greene, 2001; Kelley & Wixted, 2001; Xu & Malmberg, 2007). A related class of models assumes that associative recognition involves a computation of associative familiarity that is independent of the familiarity of the items comprising the test pair, and hence they are often referred to as independent-cue models (Criss & Shiffrin, 2005; Kelley & Wixted, 2001; Murdock, 1997). Independent-cue models can predict little or no effect of target presentations on false-alarm rates because the associative cue is only randomly similar to the contents of memory. That is, the familiarities of rearranged pairs are indistinguishable from the familiarities of the XY pairs, which consist of items that were not studied (see above). Independent-cue models that assume that associative information is the basis of all responses
(Murdock, 1982) are disconfirmed by findings that show the false-alarm rates for rearranged pairs are greater than those for XY pairs. Other independent-cue models assume that associative recognition is based on a combination of item and associative information (Criss & Shiffrin, 2005; Kelley & Wixted, 2001). These models can predict that the false-alarm rates are greater for rearranged pairs than for XY pairs, but they cannot explain a null effect of repetitions on false-alarm rates. Moreover, they predict that the effect of repetitions on the latencies of hits and correct rejections of rearranged pairs should be similar (Criss & Shiffrin, 2005; Kelley & Wixted, 2001), but numerous experiments in my laboratory have found that hits tend to be faster than correct rejections, especially when pairs are relatively well encoded. Thus, compound- and independent-cue models are challenged to explain the accuracy and the latency of associative recognition. While the compound-cue models come up short in some important respects, they nevertheless have had a large impact on theory. The assumption that more than one type of information contributes to recognition performance increases the complexity of the models and introduces the possibility of different retrieval strategies that are characterized by the type of information that is used. Dual-process models (Yonelinas, 1997; Xu & Malmberg, 2007) are related to independent-cue models insofar as both assume that associative recognition involves more than one source of information. It is often emphasized that the primary difference between these models is that the dual-process model assumes that associative information has a discrete form whereas the independent-cue models assume that associative information has a continuous form. Many models assume that the retrieval processes involved in producing the different forms of information are themselves distinct.
However, these assumptions do not necessarily have to be the case (Rotello et al., 2004), and these points tend to obscure the more important functional roles that associative information plays in memory performance. Perhaps a more functionally important distinction between independent-cue and dual-process models is the independence assumption. Independent-cue models assume that associative information provides neutral evidence concerning the status of rearranged pairs, and hence there is no relationship between the familiarity of the items comprising rearranged pairs and the ability to reject rearranged pairs based on associative familiarity. Dual-process models, on the other hand, assume that associative information is in the form of episodic details recalled from memory. The episodic details provide negative information concerning the status of rearranged pairs. Because there is a positive relationship between the familiarity of the items and the ability to reject otherwise familiar rearranged pairs, the more familiar the rearranged
pair, the more likely it will be rejected based on recollection. Thus, dual-process models can account for the relationship between pair repetitions and false-alarm rates if it is assumed that some responses are based on familiarity and some responses are based on the episodic details retrieved from memory. Dual-process models can also account for the nonmonotonic relationship between false-alarm rates and response delay that is observed using the signal-to-respond procedure (Gronlund & Ratcliff, 1989). Accordingly, familiarity begins to accumulate shortly after the presentation of the stimulus. For both intact and rearranged pairs, this tends to be positive evidence. Most models assume that the ability to recall episodic details requires an additional processing stage, and hence takes longer than the production of familiarity. Once the details are recalled, however, they may be used to reject rearranged pairs, producing lower false-alarm rates at longer delays. Individual differences might arise based on the extent to which the outcome of the familiarity process versus the recall process is emphasized. Thus, dual-process models provide a means for describing many facets of associative recognition performance, and effects of strategic factors naturally arise from their framework. There are, however, different ways to implement a dual-process model, and not all of the implementations can account for both the speed and the accuracy of recognition performance. For instance, a retrieving effectively from memory (REM) dual-process model that was devised by Malmberg and Xu accounts for a variety of recognition accuracy findings (Malmberg & Xu, 2007; Xu & Malmberg, 2007; also see Malmberg, Holden et al., 2004).
When applied to the free-response procedure, it assumes that subjects initially evaluate the familiarity of the test pair, and if it does not exceed a subjective criterion the pair is called "new." However, familiarity is not assumed to be a sufficient basis for "old" status because the subject is most likely aware that both intact and rearranged pairs yield reasonably high levels of familiarity. In order to call a pair "old," we assume that the subject first attempts to recall episodic details, which may be used to accept an intact pair or correctly reject a rearranged pair. If the attempt to recall episodic details fails, a guess is made. While this dual-process model does a good job accounting for the accuracy of associative recognition, it has problems accounting for the latency of associative recognition. Notice that before responding "old," an attempt to recall is made, but "new" responses can be made without an attempt to recall. On this assumption, the latencies of hits should be longer than the latencies of correct rejections. Figure 1 shows the data from Experiment 2 of Malmberg and Xu (2007). These data are representative of a large number of similar experiments conducted in my laboratory, where we have shown that the dual-process model provides a reasonable account of the accuracy of associative recognition. However, the latencies of the hits are shorter than the
Fig. 1. Data from Malmberg and Xu (2007) illustrating the relationship between the speed and accuracy of associative recognition in a free-response paradigm. One panel plots p(old) for HR and FAR as a function of target presentations; the other plots latency (s) for HR and CR as a function of target presentations. HR = hit rate, FAR = false-alarm rate, CR = correct rejection.
latencies of correct rejections for rearranged pairs, and hence these findings indicate that the decision model needs to be modified in order to account for both the accuracy and the latency of free‐response associative recognition.
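The latency prediction of this decision rule can be made concrete with a small sketch. It is a simplification under stated assumptions, not the REM implementation: latency is counted in processing stages rather than seconds, and the probabilities `p_familiar`, `p_recall`, and `p_guess_old` are hypothetical values chosen only for illustration.

```python
import random

def dual_process_trial(pair_type, rng, p_familiar, p_recall, p_guess_old):
    """One free-response trial. Returns (response, stages), where stages
    counts processing stages (1 = familiarity only, 2 = familiarity + recall)."""
    # Stage 1: familiarity check; an unfamiliar pair is called "new" at once.
    if rng.random() >= p_familiar:
        return "new", 1
    # Stage 2: a recall attempt is required before any "old" response.
    if rng.random() < p_recall:
        # Recalled details accept an intact pair or reject a rearranged pair.
        return ("old" if pair_type == "intact" else "new"), 2
    # Recall failed: guess, with a bias toward "old".
    return ("old" if rng.random() < p_guess_old else "new"), 2

def mean_stages(n_trials=2000, seed=0, p_familiar=0.8, p_recall=0.5,
                p_guess_old=0.9):
    """Mean number of stages for hits and for correct rejections."""
    rng = random.Random(seed)
    hit_stages, cr_stages = [], []
    for _ in range(n_trials):
        resp, stages = dual_process_trial("intact", rng, p_familiar,
                                          p_recall, p_guess_old)
        if resp == "old":
            hit_stages.append(stages)
        resp, stages = dual_process_trial("rearranged", rng, p_familiar,
                                          p_recall, p_guess_old)
        if resp == "new":
            cr_stages.append(stages)
    return sum(hit_stages) / len(hit_stages), sum(cr_stages) / len(cr_stages)

mean_hit, mean_cr = mean_stages()
print("mean stages, hits:", mean_hit)
print("mean stages, correct rejections:", mean_cr)
```

Because every "old" response passes through the recall stage while some "new" responses do not, hits in this sketch can never be faster on average than correct rejections, which is the opposite of the ordering in the data.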
IV. Modeling the Accuracy and Latency of Associative Recognition
We have assumed that recognition can be performed in different ways, and that the strategy adopted depends on the subject's goals, the task, and the subject's ability to implement a strategy that allows for a desired level of accuracy to be achieved in the shortest time possible. Differences in the goals and abilities of subjects are likely to be major sources of individual differences in memory performance. Therefore, in order to understand the sources of individual differences, it is critical to understand the dynamics of associative recognition memory. In this section, I will describe several new models of the accuracy and latency of associative recognition that lead to a better understanding of the time course of associative recognition.
A. ENCODING OF ASSOCIATIVE INFORMATION
As in other REM models (Shiffrin & Steyvers, 1997), this dynamic version of the Xu and Malmberg model of associative recognition assumes that words are represented by lexical/semantic traces consisting of w geometrically distributed features, where the probability of each feature value j is $P(j) = g(1-g)^{j-1}$. These traces are assumed to be complete and accurate representations of the general knowledge about the words that has been built up over a lifetime. When a list of word pairs is studied, an episodic trace is stored for each pair. Episodic encoding is a concatenation of incomplete and error-prone copies of the lexical/semantic traces that represent the words. For each of the t attempts made to store a lexical/semantic feature in the episodic trace, there is a probability u* that a nonzero feature value will be stored. If a nonzero feature value is stored, it is copied correctly with probability c. If encoding of a feature does not occur, a zero is stored. In all the simulations, the encoding parameters are set to w = 10, g = .4, t = 8, u* = .04, and c = .7. The episodic association between the two words is the concatenation of two episodic vectors. These episodic traces also contain context features (cf. Criss & Shiffrin, 2005), but for the sake of simplicity I omit this consideration for the present. Repetition of studied items has been modeled in REM in different ways (Malmberg & Shiffrin, 2005; Malmberg, Holden et al., 2004; Shiffrin & Steyvers, 1997; Xu & Malmberg, 2007). For present purposes, I will assume that feature storage accumulates in a single trace for a given pair of words because the more complex assumptions will not differentially affect the qualitative predictions of the models that we will consider.
B. FAMILIARITY-BASED RETRIEVAL
The familiarity-based process operates in a fashion similar to other REM models (Malmberg, Holden et al., 2004; Shiffrin & Steyvers, 1998; Xu & Malmberg, 2007). The critical difference is that the retrieval cue is compared to only a subset of the features that comprise the traces in the activated memory set, and that retrieval occurs over time in successive cycles on this subset of activated memory. For each unit of retrieval time, T = 1, ..., m, a set of features, $F_T$, is sampled from an activated memory set with a probability of s. In the simplest simulations, context does not play an important role in associative recognition, and hence the activated set consists of the traces stored during study. Modeling how the features (or evidence) are sampled from memory is usually ignored in random walk models. However, the issue is both theoretically interesting and pragmatically important, and therefore I will consider two approaches. In the simplest model, sampling occurs with replacement.
That is, each feature in the activated memory set is sampled with probability s for each T, and the probability of sampling a given feature for $F_T$ is independent of sampling that feature for $F_{T+1}$. For each T, global matching is performed over $F_T$, producing a familiarity value $\Phi_T$. The logarithm of $\Phi_T$ is computed in order to produce either positive or negative evidence that the test pair was studied. This is the evidence that drives the random walk, and it accumulates over T such that when T = m the accumulated log odds, $\phi(m)$, is

\[ \phi(m) = \sum_{T=1}^{m} \log \Phi_T = \sum_{T=1}^{m} \log\left( \frac{1}{N} \sum_{j=1}^{N} \lambda_j \right), \]

where

\[ \lambda_j = (1-c)^{n_{jq}} \prod_{i=1}^{\infty} \left[ \frac{c + (1-c)\,g(1-g)^{i-1}}{g(1-g)^{i-1}} \right]^{n_{ijm}}. \]

Thus, $\phi(m)$ represents the evidence accumulated to time T = m in response to a probe with a compound cue consisting of the lexical/semantic traces representing the words comprising the test stimulus. An intuitively unappealing characteristic of the sampling-with-replacement model is that it assumes that there is potentially an infinite amount of information to be gained from any probe of memory. Given that the amount of information stored about an event is assumed to be finite, how can the information retrieved from memory be potentially infinite? The answer to this question is unclear, even though sampling with replacement is the standard assumption in the literature. This assumption is, however, what guarantees that a probe will result in the evidence crossing one of the decision criteria. As an alternative to the sampling-with-replacement model, the sampling-without-replacement model assumes that once a feature has been sampled as the result of a probe of memory, it remains a part of $F_T$. Thus, features accumulate in $F_T$ and the matching process is computed over a more and more complete version of the activated memory set, as $F_T$ will become identical to the activated memory set at sufficiently long values of T:

\[ \phi(m) = \log \Phi_T = \log\left( \frac{1}{N} \sum_{j=1}^{N} \lambda_j \right), \]

where

\[ \lambda_j = (1-c)^{n_{jq}} \prod_{i=1}^{\infty} \left[ \frac{c + (1-c)\,g(1-g)^{i-1}}{g(1-g)^{i-1}} \right]^{n_{ijm}}. \]
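The two sampling schemes can be compared in a short simulation. The sketch below is only an illustration under stated assumptions. It uses the encoding parameters given in Section A. It takes $n_{jq}$ to be the number of sampled nonzero features of trace j that mismatch the cue and $n_{ijm}$ the number that match the cue with value i, as in Shiffrin and Steyvers (1997). It also assumes that a stored feature value is not overwritten by later storage attempts and that features are sampled independently for each trace. The helper names and the sampling probability `s = 0.3` are hypothetical.

```python
import math
import random

# Encoding parameters from the text: w, g, t, u*, c.
W, G, T_ATTEMPTS, U_STAR, C = 10, 0.4, 8, 0.04, 0.7

def geometric(rng, g=G):
    """Draw a feature value j >= 1 with P(j) = g * (1 - g) ** (j - 1)."""
    j = 1
    while rng.random() >= g:
        j += 1
    return j

def lexical_trace(rng, w=W):
    """Complete lexical/semantic trace of one word."""
    return [geometric(rng) for _ in range(w)]

def encode(pair, rng):
    """Incomplete, error-prone episodic copy of a concatenated word pair."""
    trace = []
    for value in pair:
        stored = 0  # zero means nothing was stored
        for _ in range(T_ATTEMPTS):
            if rng.random() < U_STAR:
                stored = value if rng.random() < C else geometric(rng)
                break  # assumption: a stored value is not overwritten
        trace.append(stored)
    return trace

def likelihood_ratio(cue, trace, positions):
    """lambda_j computed over the sampled feature positions only."""
    lam = 1.0
    for k in positions:
        if trace[k] == 0:
            continue  # unstored features carry no evidence
        if trace[k] == cue[k]:
            base = G * (1 - G) ** (cue[k] - 1)
            lam *= (C + (1 - C) * base) / base
        else:
            lam *= 1 - C
    return lam

def accumulate(cue, traces, m, s, rng, replacement=True):
    """Return phi(m) after m retrieval cycles under either sampling scheme."""
    sampled = [set() for _ in traces]  # sampled positions, one set per trace
    phi = 0.0
    for _ in range(m):
        lams = []
        for j, trace in enumerate(traces):
            new = {k for k in range(len(cue)) if rng.random() < s}
            # With replacement: a fresh sample. Without: the sample grows.
            sampled[j] = new if replacement else sampled[j] | new
            lams.append(likelihood_ratio(cue, trace, sampled[j]))
        log_phi_t = math.log(sum(lams) / len(lams))
        # With replacement the log odds are summed; without, re-evaluated.
        phi = phi + log_phi_t if replacement else log_phi_t
    return phi

rng = random.Random(0)
words = [lexical_trace(rng) for _ in range(8)]
pairs = [words[i] + words[i + 1] for i in range(0, 8, 2)]
traces = [encode(p, rng) for p in pairs]
intact = pairs[0]
print("with replacement:   ", accumulate(intact, traces, 5, 0.3, rng))
print("without replacement:", accumulate(intact, traces, 5, 0.3, rng, False))
```

With s = 1 every feature is sampled on every cycle, so the without-replacement evidence stops changing after the first cycle whereas the with-replacement evidence keeps growing in proportion to m; this is the unbounded accumulation discussed above.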
In Section C , I will compare the performance of these sampling models. C.
DECISION
We have already discussed how the nature of the decision process has important implications for the dynamics of recognition memory. For instance, the assumption that a recall process must be invoked before an “old” response can be made suggested that “old” responses should be slower on average than “new” responses. For the signal-to-respond procedure, I assume that a response is based on the evidence accumulated at any given T. If (m) ≥ 0, the response is “old”; otherwise it is “new.” That is, the subject is not free to determine the amount of evidence required to make a response.

The nature of the free-response decision process depends on the sampling model. First, consider the sampling-with-replacement model. For the free-response procedure, I assume that there is an old criterion, KO, and a new criterion, KN. If (m) > KO, then the response is “old,” and if (m) < KN, the response is “new.” If KN < (m) < KO, then a new sample of features is drawn from the activated memory set, matched against the retrieval cue, and added to (m). (Another way of viewing the signal-to-respond model is to assume that KN = KO = 0.) Because (m) is a summed accumulation of evidence based on a series of independent samples from the activated memory set, this model guarantees that at some T the evidence will exceed one of the decision criteria, unless of course the log odds of the match between the compound cue and the activated memory set is 0, which is highly unlikely.¹

A different assumption is that (m) is based on sampling-without-replacement at each T. That is, the sample drawn from the activated memory system becomes more complete as T increases, and the evidence is evaluated anew after each sample. Because T is bounded by the amount of information stored during study, it is therefore possible (even likely) under sampling without replacement that neither decision criterion will ever be exceeded.

¹ One might be tempted to assume that information accumulates based on a match of the cue against the complete activated memory set for each T, because this would allow for an optimal basis for a decision, but it would obviate the need for an accumulation of evidence, since there would be no need to sample more than once.
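As an illustration only, the free-response rule for the sampling-with-replacement model can be sketched in Python. This is a minimal sketch and not the simulation code behind the figures: `draw_evidence` is a hypothetical stand-in for matching one fresh feature sample against the retrieval cue, and `max_steps` is a safety cap added for the sketch, not a model parameter.

```python
import random


def free_response_with_replacement(draw_evidence, k_old, k_new, max_steps=10_000):
    """Accumulate evidence until it crosses the old or the new criterion.

    draw_evidence: hypothetical callable returning the evidence contributed
        by one independent sample (assumption; the REM match is not computed).
    Returns (response, number_of_samples_drawn).
    """
    evidence = 0.0
    for t in range(1, max_steps + 1):
        evidence += draw_evidence()      # each new sample is added to the running sum
        if evidence > k_old:
            return "old", t
        if evidence < k_new:
            return "new", t
    return None, max_steps               # cap reached; exists only to bound the sketch


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical Gaussian increments with a small positive mean.
    print(free_response_with_replacement(lambda: rng.gauss(0.2, 1.0), 1.0, -1.0))
```

Setting both criteria to 0 makes the loop stop after the first nonzero sample. The signal-to-respond procedure instead fixes the response time externally and only checks the sign of the evidence at that moment.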
Toward an Understanding of Individual Differences in Episodic Memory
Thus, a sampling-without-replacement model is inherently more complex than a sampling-with-replacement model because there must be a stopping rule that is applied when neither decision criterion has been met and there is little or no evidence left to sample, or the subject is unwilling to search any longer. For instance, I assume that for free-response recognition KMax is the maximum amount of time the subject is willing to search memory (cf. Raaijmakers & Shiffrin, 1980). If KN < (m) < KO and T = KMax, then a guess is made. This stopping rule is similar to the guessing assumption made in earlier models of recognition accuracy (Malmberg, Holden et al., 2004; Malmberg & Xu, 2007; Xu & Malmberg, 2007). Accordingly, I usually assume that KMax = 10 and that guessing is biased toward “old” responses, with the probability of guessing “old” set to .90 in the present simulations. The guessing bias might seem unreasonably high, but its value is consistent with prior applications of the accuracy model, where it was determined by attempts to fit data quantitatively and justified on the assumption that even when recall fails, items seem relatively familiar (i.e., there was no evidence to indicate that the item or pair was new; Malmberg, Holden et al., 2004; Malmberg & Xu, 2007; Xu & Malmberg, 2007).
D. FAMILIARITY-BASED PERFORMANCE
Before moving to a discussion of the recall process, it is worth exploring the assumptions that concern the accumulation of evidence based on the global-matching process. Other REM models have focused only on the accuracy of recognition memory. Those models used a single decision criterion with a default value of 0 for the log odds, which can be considered Bayesian optimal because the odds are equivalent to the probability that the test stimulus is old divided by the probability that the test stimulus is new, given the matching and mismatching features (Shiffrin & Steyvers, 1997). In the dynamic model, however, only a sample from the activated memory set is compared to the retrieval cue at time T, and thus the evidence accumulated at any given T is not likely to provide an optimal basis for a recognition decision. The sampling-without-replacement model comes closer to optimality in the sense that the entire set of activated features could be involved in the global-matching process at a sufficiently long T.

Figure 2 shows the behavior of the sampling-with-replacement model (left panel) and the sampling-without-replacement model (right panel) derived from the signal-to-respond assumptions. It shows how hit rates and false-alarm rates are affected over time by the number of times that targets are presented. Note that the models make identical predictions when the signal to respond comes at the earliest delay because replacement occurs or does not
Kenneth J. Malmberg
[Figure 2. Two panels (left: sampling with replacement; right: sampling without replacement). x-axis: Delay; y-axis: p(old). Curves: hit rates (HR) and false-alarm rates (FAR) for 1, 2, 3, 6, and 12 presentations.]
Note: Hit rates and false-alarm rates were generated at various units of time following the presentation of the test pair. Delay 1 is the earliest response deadline and Delay 9 is the longest response deadline. The left graph shows the performance of the sampling-with-replacement model and the right graph shows the performance of the sampling-without-replacement model. The parameter values were: w = 10, t = 8, c = .7, u* = .04, g = .4, KO = KN = 0, s = .25.
Fig. 2. Hit rates and false-alarm rates as a function of response deadline and pair presentations for the familiarity-based signal-to-respond model under the assumptions of sampling-with-replacement versus sampling-without-replacement.
occur only after the first sample is evaluated. Hit rates and false-alarm rates then tend to increase with repetitions and delay in both models. Like other implementations of the compound-cue model in a global-matching framework, performance based on familiarity alone cannot account for the nonmonotonic relationship between response delay and false-alarm rates (Gronlund & Ratcliff, 1989). Nevertheless, this analysis tells us something useful about how to model the relationship between speed and accuracy efficiently: The sampling-without-replacement model produces a higher level of accuracy than the sampling-with-replacement model at any T > 1. Therefore, sampling without replacement is inherently more efficient than sampling with replacement, and if we want a model of familiarity within a Bayesian system of memory, this is an important factor to consider.

One structural factor that influences the rate at which information is accumulated is the sample size, s. It plays a role in the present model that is similar to the role drift rate plays in a random walk model. Variability in the drift rate has been shown to be critical in order to achieve realistic levels of accuracy (Ratcliff, 1978). In the present models, variability in the drift rate is the result of the variability in the familiarity of the pairs. That is, the variability occurs in T for each sample, which is determined by how well pairs are encoded and s, and therefore it is likely to differ between individuals and populations. Figure 3 shows how the sample size affects free-response recognition in the model of familiarity, where s = .10, .25, and .40 and pairs were studied 1, 2, 3, 6, or 12 times.

Figure 3 also illustrates the point concerning the optimality of recognition memory when the decision rule is based on partial information. The top panels of Fig. 3 show how the sample size affects the accuracy and latency of the sampling-with-replacement model, and the bottom panels show the same for the sampling-without-replacement model. In both models, the latencies decrease as the sample size increases, and the latencies of the correct rejections are much longer than the latencies of the hits regardless of the value of s. Moreover, increases in the sample size improve accuracy as well as decrease latencies. Because optimal decisions are based on all the information in memory, but the samples are incomplete representations of what is stored, there is less noise in the accumulation of evidence as the sample size increases, especially in the sampling-without-replacement model. The sampling-without-replacement model achieves greater accuracy in far less time than the sampling-with-replacement model, despite the fact that guessing is a necessary component of the sampling-without-replacement model. Thus, as a structural component of memory, a component whose operations are outside the control of the subject, sampling without replacement is to be preferred in a free-response model as well as in a signal-to-respond model.
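The role of guessing in the sampling-without-replacement model can be made concrete with a toy trial. The sketch below is illustrative only and rests on stated assumptions: `feature_evidence` is a hypothetical list of per-feature evidence values rather than the REM likelihood calculation, and the sample size s is read as the probability that each not-yet-sampled feature is drawn on a given step. The defaults `k_max=10` and `p_guess_old=0.90` follow the values quoted in the text.

```python
import random


def free_response_without_replacement(feature_evidence, s, k_old, k_new,
                                      k_max=10, p_guess_old=0.90, rng=None):
    """Toy free-response trial with sampling without replacement.

    feature_evidence: hypothetical per-feature evidence values (assumption).
    s: chance that each unsampled feature is drawn on a step (assumed reading).
    Returns (response, latency, guessed).
    """
    rng = rng or random.Random()
    remaining = set(range(len(feature_evidence)))
    sampled = set()
    for t in range(1, k_max + 1):
        drawn = {i for i in sorted(remaining) if rng.random() < s}
        remaining -= drawn               # a drawn feature is never drawn again
        sampled |= drawn
        # Evidence is evaluated anew over everything sampled so far.
        evidence = sum(feature_evidence[i] for i in sampled)
        if evidence > k_old:
            return "old", t, False
        if evidence < k_new:
            return "new", t, False
    # Neither criterion was met by T = k_max: make a biased guess.
    guess = "old" if rng.random() < p_guess_old else "new"
    return guess, k_max, True
```

Because the pool of features is finite, the accumulated evidence stops changing once every feature has been drawn, which is why the guess at `k_max` is needed here but not in the with-replacement model.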
In the context of this chapter on memory as skilled cognition, it is also important to explore how a strategically initiated speed-accuracy trade-off might be achieved based on global matching. This will also help us to understand the behavior of the more complex dual-process model. Under most conditions, the sampling rate is not likely to be influenced by the goals or strategies of the subject, at least after encoding is complete. Thus, let us assume that differences in sampling rate reflect differences in a structural aspect of memory. On the other hand, the decision criteria are usually assumed to be under the control of the subject (Egan, 1958; Green & Swets, 1966). In simple versions of a random walk model, for instance, increasing the difference between KO and KN should improve accuracy and increase latencies, but Figs. 4 and 5 show that this is not necessarily the case for the present model. Let us consider that changes in the location of the decision criteria reflect strategies with different emphases on speed versus accuracy.
[Figure 3. Four panels. Columns: Accuracy (y-axis: p(old)) and Latency (y-axis: latency); rows: sampling with replacement (top) and sampling without replacement (bottom). x-axis: Pair presentations. Curves: targets and foils for s = .10, .25, and .40.]
Note: This figure shows the relationship between associative recognition performance and the rate at which evidence accumulates for free-response recognition when performance is based on familiarity only. Hit rates and false-alarm rates as a function of the number of target presentations are plotted in the left panel, and response latencies are plotted in the right panel. The top row shows performance based on a sampling-with-replacement model and the bottom row shows performance with a sampling-without-replacement model. Values of s are .10, .25, and .40. The parameter values are: w = 10, t = 8, c = .7, g = .4, KO = 1.0, KN = −1.0.
Fig. 3. A comparison of the effect of different sampling rates on the speed and accuracy of familiarity-based free-response recognition under the assumptions of sampling-with-replacement versus sampling-without-replacement.
Let us first consider the effect of implementing a symmetrical change in the decision criteria. Three sets of criterion locations were chosen for the simulation whose results are shown in Fig. 4: KO = 1.0, 2.0, and 3.0, where KN = −KO. The right panels of Fig. 4 show that latencies increase with
[Figure 4. Four panels. Columns: Accuracy (y-axis: p(old)) and Latency (y-axis: latency); rows: sampling with replacement (top) and sampling without replacement (bottom). x-axis: Pair presentations. Curves: targets and foils for KO = 1, 2, and 3, with KN = −KO.]
Note: This figure shows the relationship between associative recognition performance and the rate at which evidence accumulates for free-response recognition when performance is based on familiarity only. Hit rates and false-alarm rates as a function of the number of target presentations are plotted in the left panel, and response latencies are plotted in the right panel. The parameter values were: w = 10, t = 8, c = .7, g = .4, s = .25.
Fig. 4. A comparison of the effect of a symmetrical shift in the decision criteria on the speed and accuracy of familiarity-based free-response recognition under the assumptions of sampling-with-replacement versus sampling-without-replacement.
increases in the spread of decision criteria for both sampling models, and that accuracy actually decreases. Increasing the spread has a positive effect on both hit rates and false-alarm rates. However, the increase is smaller for hit rates than for false-alarm rates. This is because the rearranged pairs are
[Figure 5. Four panels. Columns: Accuracy (y-axis: p(old)) and Latency (y-axis: latency); rows: sampling with replacement (top) and sampling without replacement (bottom). x-axis: Pair presentations. Curves: targets and foils for KO = 1, 2, and 3, with KN = −1.]
Note: This figure shows the relationship between associative recognition performance and the spread between the old and new criteria. The spread is implemented in an asymmetrical manner. Hit rates and false-alarm rates as a function of the number of target presentations are plotted in the left panel, and response latencies are plotted in the right panel. The parameter values were: w = 10, t = 8, c = .7, g = .4, KO = 1.0, KN = −1.0, s = .25.
Fig. 5. A comparison of the effect of an asymmetrical shift in the decision criteria on the speed and accuracy of familiarity-based free-response recognition under the assumptions of sampling-with-replacement versus sampling-without-replacement.
relatively familiar due to their similarity to the studied pairs, and therefore lowering KN makes it less likely that enough negative evidence will accumulate to allow for a negative response. Hence, a symmetrical change in the decision criteria does not produce a speed-accuracy trade-off.
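A comparison of this kind amounts to tabulating, for each criterion setting, the probability of an “old” response and the mean latency for targets and for foils. The following is a minimal bookkeeping sketch, assuming a hypothetical `run_trial(is_target, k_old, k_new)` function that returns a `(response, latency)` pair. It makes no claim about what a particular memory model will produce, and it averages latency over all responses rather than over correct responses only.

```python
def symmetric_criteria(spreads):
    """Build (KO, KN) pairs with KN = -KO, e.g. for spreads 1.0, 2.0, 3.0."""
    return [(k, -k) for k in spreads]


def summarize_condition(run_trial, k_old, k_new, n_trials=1000):
    """Tabulate p(old) and mean latency for targets and foils.

    run_trial: hypothetical callable (is_target, k_old, k_new) -> (response, latency).
    Returns {"targets": (p_old, mean_latency), "foils": (p_old, mean_latency)}.
    """
    summary = {}
    for label, is_target in (("targets", True), ("foils", False)):
        results = [run_trial(is_target, k_old, k_new) for _ in range(n_trials)]
        p_old = sum(1 for response, _ in results if response == "old") / n_trials
        mean_latency = sum(latency for _, latency in results) / n_trials
        summary[label] = (p_old, mean_latency)
    return summary
```

For targets, p(old) is the hit rate; for foils, it is the false-alarm rate. A speed-accuracy trade-off would show up as the gap between the two widening while the mean latencies grow.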
On further analysis, a symmetrical change in the decision criteria does not make much sense as a means of slowing performance in order to increase accuracy, given the nature of the associative recognition task. Both the targets and the foils are relatively familiar, and it is in the interest of the subject to maximize the likelihood of a correct rejection. To achieve a goal of greater accuracy, the subject could instead asymmetrically alter the decision criteria by increasing KO relative to KN. The result of a simulation of this model is shown in Fig. 5, where KO = 1.0, 2.0, and 3.0 and KN = −1.0. Here, both the latencies and accuracy increase with increases in the spread of the decision criteria. Once again, the spread of criteria has little effect on hit rates, but false-alarm rates decrease as the spread increases. Thus, an asymmetrical change in the location of the criteria with respect to (m) = 0 is one way to achieve a speed-accuracy trade-off for associative recognition based on familiarity.

So far, the models that we have considered cannot alone account for the accuracy and latency of associative recognition. However, we have made some critical observations that will allow us to construct a model that can better account for associative recognition performance. Based on these analyses, one way to achieve a speed-accuracy trade-off for free-response associative recognition in a global-matching model is to assume that there is an asymmetrical spread in the decision criteria. In addition, there is little qualitative difference between the sampling-with-replacement model and the sampling-without-replacement model. Therefore, because the sampling-without-replacement model is the more efficient, less noisy model and because it seems unprincipled to assume that an infinite amount of evidence can be obtained from a finite amount of information stored in memory, I will assume that sampling occurs without replacement from now on.
E. RECOLLECTION
Recall in REM involves the same mechanisms of sampling and recovering traces as in the search of associative memory (SAM) model (Malmberg & Shiffrin, 2005; Raaijmakers & Shiffrin, 1980; Shiffrin & Steyvers, 1998). The sampling process follows a Luce choice rule, whereby the probability of sampling trace i from the set of sampled features, FT, given retrieval cue Q, is positively related to the similarity of Q and trace i, and negatively related to the similarity of Q and the other traces in memory:

$$P(i \mid Q) = \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j}$$
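The choice rule above is a simple normalization, which the following Python sketch illustrates. It takes the λ values as given, nonnegative inputs (how they are computed from the cue and the traces is outside the sketch), and the function names are hypothetical.

```python
import random


def luce_sampling_probabilities(lambdas):
    """Return P(i | Q) = lambda_i / sum_j lambda_j for each trace i."""
    total = sum(lambdas)
    if total <= 0:
        raise ValueError("at least one lambda must be positive")
    return [lam / total for lam in lambdas]


def sample_trace(lambdas, rng=None):
    """Draw one trace index with probability given by the choice rule."""
    rng = rng or random.Random()
    weights = luce_sampling_probabilities(lambdas)
    return rng.choices(range(len(lambdas)), weights=weights, k=1)[0]
```

For example, with λ values of 1.0 and 3.0 the two traces are sampled with probabilities .25 and .75, so a trace is favored to the extent that its λ is large relative to the others.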
Note that the sampling involved in recollection involves retrieving traces from an activated set of memory traces in addition to first sampling a set of features from memory, whereas the sampling model considered earlier only involves the latter.

Recovery is assumed to be a special case of a threshold process. Recovery of trace i is successful if and only if there are at least KR nonzero features in trace i, sampled from the activated memory set. If the number of nonzero sampled features does not exceed KR, then recollection fails. If KR is exceeded, then the individual features are used to make a decision by comparing them to the test stimulus. Note that it is possible for KR to be exceeded by varying amounts of evidence recalled from memory, and hence it is possible that this information might provide more than an all-or-none source of information. While this assumption will not play an important role in the present modeling, it might help to address some of the issues involving the nature of information recalled from memory and how it influences confidence judgments (Rotello et al., 2004).

The decision based on recalled information assesses the matching and mismatching features of the trace sampled from FT and the test stimulus. I assume that the target trace must be sampled from memory in order to recall that an intact pair was studied, and I assume that one of the two traces that correspond to rearranged pairs must be sampled in order to recall that a rearranged pair was not studied. Therefore, the recovery process only leads to veridical responses. That is, if an intact pair is tested, recollection can only lead to a positive response if it is successful (and vice versa for rearranged pairs). In a Bayesian system, the comparison of the sampled features to the test stimulus would ideally be informed by the extent and accuracy of encoding. However, this might not always be known. Moreover, recovery most likely involves an interaction between the features sampled from memory and general knowledge, which is in the form of lexical/semantic traces in REM.
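The recovery threshold can be illustrated with a short Python sketch. It assumes a trace is represented as a list of integer feature values in which 0 marks a feature that is absent from the sample, and it implements the "at least KR" reading of the rule. The function name and the returned count are hypothetical conveniences.

```python
def recovery_outcome(sampled_trace, k_r):
    """Apply the recovery threshold to the sampled features of one trace.

    sampled_trace: list of ints, with 0 meaning no feature value (assumed encoding).
    k_r: minimum number of nonzero features needed for recovery to succeed.
    Returns (succeeded, nonzero_count).
    """
    nonzero_count = sum(1 for feature in sampled_trace if feature != 0)
    return nonzero_count >= k_r, nonzero_count
```

Returning the count alongside the pass/fail result reflects the remark that the threshold can be exceeded by varying amounts, so the recovered information need not be all-or-none.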
The additional complexity involved in implementing these models probably would not affect the behavior of the model vis-à-vis our goal of understanding the dynamics of recognition memory, however. In the present simulations, I have therefore allowed for one mismatching feature between the sampled features of a target trace and an intact stimulus in order to take into account the errors in encoding. For rearranged pairs, I have required that there be at least two mismatching features in order for recollection to succeed in rejecting the otherwise familiar foils. These assumptions provide a more complete account of recovery than has been provided previously (Xu & Malmberg, 2007). Nevertheless, the recovery model is still overly simplistic, and therefore the selection of different parameter values is necessarily somewhat arbitrary. For now, however, these assumptions capture much of what needs to be explained in a formal manner.

Perhaps the most important aspect of this model is the assumption that only one mismatching feature is allowed in order to judge that a test stimulus was studied based on recollection, whereas at least two mismatching features must be observed in order to judge that a test stimulus was not studied. This makes recalling-to-reject less demanding than recalling-to-accept given that there are w features possible to sample from FT. In essence, this means that, given the same levels of encoding, responses to targets are more likely to be based on familiarity than responses to foils, and responses to foils are more likely to be based on recollection than responses to targets (all else being equal). Hence, these assumptions will contribute to the model's predictions concerning the relationship between the latencies of hits and correct rejections.

The recollection process is assumed to be slower than the familiarity-based process (cf. Dosher, 1984; Gronlund & Ratcliff, 1989). For simplicity, it is arbitrarily assumed that the result of the recall attempt begins to be available at T = 4. Presumably, this accounts for the amount of time it takes to construct a cue, probe, sample, recover, and evaluate the contents of a trace. Some of these operations must also be completed prior to completing the first global-matching comparison. Therefore, I assume that no responses are made until T = 2 in the free-response model. Once recollection succeeds, no further attempts to recall information are made. This is true even in the case when the recollection succeeds prior to T = 4. When recollection fails, a new attempt is made on the subsequent probe of memory. At times, a trace will be sampled, but its contents do not exceed KR. Because of the sampling-without-replacement model, these contents will still be available in FT when the trace is sampled on a subsequent attempt to recall. In this sense, the sampling-without-replacement model is simpler than the sampling-with-replacement model that I have rejected.
F. DUAL-PROCESS DECISION STRATEGIES
First, let us consider the signal-to-respond model. Assume that the subject responds at the moment the signal is made (regardless of T). If the signal occurs before recall is complete, then a response is made based on familiarity in the fashion described earlier. If the signal occurs after recall is complete and recall is successful, then the response is “old” if the test stimulus is a target (i.e., the recovered features tend to match the intact pair) and “new” if the test stimulus is a foil (i.e., the recovered features tend to mismatch the rearranged pair). If recall fails, then a decision is based on familiarity in the usual fashion: If (m) ≥ 0, the response is “old”; otherwise it is “new.”

Figure 6 shows the performance of the dual-process signal-to-respond model. It is instructive to compare this performance with the performance of the familiarity-based sample-without-recovery model in Fig. 2. Both models predict an increase in hit rates with increases in delay and presentations. Unlike the familiarity-based model, however, the dual-process model
[Figure 6. One panel. x-axis: Delay; y-axis: p(old). Curves: hit rates (HR) and false-alarm rates (FAR) for 1, 2, 3, 6, and 12 presentations.]
Note: Hit rates and false-alarm rates were generated at various units of time following the presentation of the test pair. Delay 1 is the earliest response deadline and Delay 9 is the longest response deadline. This model assumes that at the shortest response deadlines (Delays 1–3) responses are based solely on the amount of familiarity accumulated to that point. At the longer response deadlines, hit rates and false-alarm rates are based on a mixture of responses based on the amounts of familiarity and recollected details to that point in time. The parameter values were: w = 10, t = 8, c = .7, u* = .04, g = .4, KO = KN = 0, KR = 5, s = .25.
Fig. 6. Hit rates and false-alarm rates as a function of response deadline and pair presentations for the dual-process signal-to-respond model.
predicts the observed nonmonotonic relationship between response delay and false-alarm rates if the initial encoding of the trace is relatively strong (i.e., the target pairs were presented more than one time). The contribution of recollection to signal-to-respond performance is greater for targets than for foils when pairs are presented infrequently, but as the number of presentations increases the contribution is greater for foils than for targets. In addition, the contribution of recollection increases for both targets and foils as the number of presentations and the delay increase. Before recovery and trace comparisons are completed, performance is based on familiarity only, and hence false-alarm rates initially increase with delay if the targets are sufficiently well encoded. After recollective processing is assumed to be completed (i.e., T = 4), the evidence obtained tends to indicate that rearranged pairs are familiar and that the items comprising the test pair were not studied together. Both sources of evidence tend to increase in probability as the delay increases, and thus false-alarm rates decrease with delay thereafter.

The performance of the signal-to-respond procedure is assumed to be relatively immune to strategic factors, as the response is demanded at a specific point in time. In free-response recognition, however, the choice of how long one waits before making a response is up to the subject. Therefore, it is in the free-response procedure where we would expect to find greater influences of strategic differences on performance. One way to affect the length of time before a response is initiated is by increasing the spread between the familiarity-based decision criteria. Figure 5 shows that asymmetrically increasing the spread of the old and new criteria by increasing the old criterion relative to the new criterion produces an increase in accuracy and a decrease in speed in the familiarity-based model. Figure 7 shows how the same manipulations of the decision criteria affect the free-response performance of the dual-process model.

The left panel of Fig. 7 shows that accuracy increases as the spread of the criteria increases, primarily by reducing false-alarm rates. In addition, when the spread is relatively small, false-alarm rates increase with increases in the number of target presentations. In this case, performance is very similar to what was observed in Experiment 2 of Malmberg and Xu (2007, Fig. 1). When there is a relatively moderate spread in the decision criteria, there is little or no effect of pair strength on false-alarm rates, and the contribution of recollection to performance increases. In several experiments, Kelley and Wixted (2001) observed a similar pattern of false-alarm rates. It is important to note, however, that in their experiments Kelley and Wixted only varied pair strength at two levels.
Thus, the function relating their two observations is unknown, and it is possible that a nonlinear relationship exists between pair strength and false-alarm rates (e.g., Experiments 3 and 4 in Malmberg & Xu, 2007). We will return to this matter shortly. As the spread of the criteria increases, fewer responses to rearranged pairs are based on familiarity, and hence the false-alarm rates can even decrease as the number of target presentations increases. This can explain the variability in the observed patterns of false-alarm rates between several experiments in the literature if one assumes that different groups of subjects adopt sets of decision criteria that lead to more or fewer responses based on recollection (Cleary et al., 2001; Kelley & Wixted, 2001; Malmberg & Xu, 2007; Xu & Malmberg, 2007). This could be due to factors such as motivation, instructions, fatigue, and so on. The middle panel of Fig. 7 shows that asymmetrically increasing the spread of decision criteria increases the latencies of the responses. When the spread of the criteria is relatively small the latencies of the correct
[Figure 7. Three panels: Accuracy (y-axis: p(old)), Latency (y-axis: latency), and Recollection (y-axis: p(correct response based on recollection)). x-axis: Pair presentations. Curves: targets and foils for KO = 1, 3, and 5.]
Note: This figure shows the relationship between associative recognition performance and the spread of the decision criteria in the dual-process model. Hit rates and false-alarm rates as a function of the number of target presentations are plotted in the left panel. The middle panel plots the latency of the correct responses, and the right panel plots the probability that recollection was the basis of a response. The parameter values were: w = 10, t = 8, c = .7, g = .4, KN = −.5, KR = 10, s = .25.
Fig. 7. The effect of an asymmetrical shift in the decision criteria on the accuracy and latency of the dual-process model and the contribution of recollection to its performance.
rejections are slower than the latencies of the hits, as was observed in Experiment 2 of Malmberg and Xu (2007) and corresponding to the rise in false-alarm rates with increases in target presentations. As the spread increases, hits tend to become slower than correct rejections because it becomes less and less likely that familiarity will lead to an “old” response. This is illustrated in the right panel of Fig. 7 by increases in the amount of recollection-based responding as the spread of the decision criteria increases. In addition, the contribution of recollection to performance tends to decrease for targets and increase for foils as the number of times targets are presented increases. Thus, repetitions have opposite effects on the contribution of recollection to hit rates for free-response and signal-to-respond performance.

The increase in latencies associated with the increase in the spread of the decision criteria leads to an increase in accuracy for foils but not for targets, which is to say that overall accuracy increases. The increase in latencies is much greater for targets than for foils, such that the latencies of hits are greater than the latencies of correct rejections when KO = 5. This model is not necessarily disconfirmed by the data shown in Fig. 1, since those findings are that false-alarm rates increase as the number of target presentations increases, which is what we observe in Fig. 7 when KO = 1. More data are therefore required in order to evaluate this prediction of the model. Specifically, it would be of interest to devise a manipulation that affects the subject's tendency to respond based on recollection and determine if the latencies of hits and correct rejections are differentially affected.

Distinguishing between strategic and structural sources of systematic variability in recognition performance is an important component of understanding individual differences. The prior analysis shows that differences in the contribution of recollection to performance can be achieved by controlling the spread between the familiarity-based decision criteria. How would a change in some structural aspect of memory affect associative recognition? Figure 8 shows the effect of an increase in the sample size, s, on dual-process free-response performance. As the sample size increases, latencies tend to decrease, and there is little effect of sample size on accuracy: Both hit rates and false-alarm rates increase with sample size. This is because increasing the sample size decreases the latencies of the familiarity-based responses. For rearranged pairs this generally leads to false alarms, and these tend to occur prior to when recollection is assumed to be complete in this model. For targets, the larger sample sizes are more likely to make their familiarity exceed KO, and they tend to exceed KO more quickly, prior to the completion of recollection. This is illustrated in the right panel of Fig. 8, which shows that the increase in sampling rate has little effect on the contribution of recollection to correct rejections, but the contribution of recollection to hits decreases as the sampling rate increases. This is because more responses are based on familiarity prior to when recollection is assumed to be complete. Thus, there appear to be patterns of accuracy and latency data that can distinguish between strategic and structural differences in the manner in which associative recognition is performed. Changes in the location of the “old” decision criterion can possibly account for differences in the patterns of false-alarm rates observed in the literature, whereas differences in sampling rate probably cannot.

So far the dual-process model performs fairly well, with the possible exception of the longer response latencies for hits than for correct rejections when the spread of the decision criteria is relatively large. While it is unknown if this is a truly problematic prediction, it is still worth exploring a different way to delay responding as a means of increasing accuracy. One way this might be accomplished is by leaving the old and new criteria fixed, but
[Figure 8. Three panels: Accuracy (y-axis: p(old)), Latency (y-axis: latency), and Recollection (y-axis: p(correct response based on recollection)). x-axis: Pair presentations. Curves: targets and foils for s = .10, .25, and .40.]
Note: This figure shows the relationship between associative recognition performance and the sampling rate for the dual-process model. Hit rates and false-alarm rates as a function of the number of target presentations are plotted in the left panel. The middle panel plots the latency of the correct responses, and the right panel plots the probability that recollection was the basis of a response. The parameter values were: w = 10, t = 8, c = .7, u* = .04, g = .4, KO = 3, KN = −.5, KR = 5.
Fig. 8. The effect of sampling rate on the accuracy and latency of the dual-process model and the contribution of recollection to its performance.
delaying the responses until a subjective metacognitive estimate of when recollection should be completed is met. That is, subjects may have meta-level beliefs about the amount of time recollection takes to succeed, and responses could be delayed until that time even though the familiarity-based evidence is already sufficient for making a response. This could, of course, be affected by the nature or difficulty of the task. In this case, I will assume that subjects delay their responding until T = 4. Figure 9 reports a simulation in which responses are made as soon as either familiarity-based or recollective-based evidence is sufficient for a response versus the situation where all responses are delayed until T = 4. The latter simulation is similar to Experiment 3 in Malmberg and Xu (2007). In all respects that experiment was the same as Experiment 2 from
[Figure 9 appears here. Four panels: Accuracy (model) and Data (both with y-axis: p(old)), Latency (y-axis: latency), and Recollection (y-axis: p(correct response based on recollection)). The x-axis in each panel is the number of pair presentations. Legend: targets and foils under fast and slow responding.]
Fig. 9. The effect of delaying responses on the accuracy and latency of the dual-process model and the contribution of recollection to its performance.
Note: This figure shows the relationship between associative recognition performance and the delay in a free response. Hit rates and false-alarm rates as a function of the number of target presentations are plotted in the left panels. The left-most panel shows the performance of the model and the next panel shows the data from Malmberg and Xu (2007). The middle-right panel plots the latency of the correct responses, and the right-most panel plots the probability that recollection was the basis of a response. The parameter values were: w = 10, t = 8, c = .7, g = .4, KO = 1.0, KN = −.5, KR = 10, s = .25.
Malmberg and Xu (2007), which produced the data in Fig. 1. The only difference is that responses were delayed by 2 s. That is, the subjects were to respond after 2 s had elapsed since the presentation of the test stimuli. The accuracy data from both experiments are shown in the middle-left panel of Fig. 9. The primary effect of delaying the yes-no response was to induce a statistically reliable increase and then a decrease in false-alarm rates as the number of target presentations increased. The left panel of Fig. 9 shows a similar pattern of data derived from the computer simulation. In addition, the relationship between the latencies of the hits and the correct rejections is maintained, even as accuracy increases as a result of delaying the response.
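The contrast between responding as soon as evidence suffices and withholding every response until T = 4 can be illustrated with a toy simulation. The Python sketch below is not the chapter's model: it replaces the REM-based familiarity computation with a simple Gaussian random walk, and the drift rates, the recollection probability `p_recollect`, the completion time `t_recollect`, and the function names are all hypothetical choices made for illustration. Only the roles of the criteria (`k_old` and `k_new`, standing in for KO and KN) and the deadline of 4 are taken from the text.

```python
import random


def simulate_trial(is_target, rng, delay_until=None, k_old=1.0, k_new=-0.5,
                   t_recollect=4, p_recollect=0.6, max_steps=50):
    """Simulate one test trial; return (said_old, latency, basis).

    Assumptions (not from the chapter): familiarity is a Gaussian random
    walk, and recollection either succeeds or fails with a fixed probability.
    """
    # Rearranged pairs (foils) contain studied items, so they are familiar too.
    drift = 0.4 if is_target else 0.2
    recollected = rng.random() < p_recollect
    evidence = 0.0
    for t in range(1, max_steps + 1):
        evidence += rng.gauss(drift, 1.0)
        if delay_until is not None and t < delay_until:
            continue  # delayed variant: withhold any response until the deadline
        if recollected and t >= t_recollect:
            # Recollection is assumed to yield the correct answer:
            # "old" for intact pairs, "new" for rearranged pairs.
            return is_target, t, "recollection"
        if evidence >= k_old:
            return True, t, "familiarity"
        if evidence <= k_new:
            return False, t, "familiarity"
    return evidence > 0, max_steps, "guess"  # fallback if no criterion is reached


def summarize(delay_until, n_trials=5000, seed=1):
    """Return (hit rate, false-alarm rate, mean latency) over simulated trials."""
    rng = random.Random(seed)
    hits = false_alarms = 0
    total_latency = 0
    for i in range(n_trials):
        is_target = i % 2 == 0  # alternate intact and rearranged pairs
        said_old, latency, _ = simulate_trial(is_target, rng, delay_until)
        total_latency += latency
        if said_old and is_target:
            hits += 1
        elif said_old:
            false_alarms += 1
    n_each = n_trials // 2
    return hits / n_each, false_alarms / n_each, total_latency / n_trials


for label, deadline in (("free response", None), ("delayed to T = 4", 4)):
    hr, far, latency = summarize(deadline)
    print(f"{label}: hit rate={hr:.2f}, false-alarm rate={far:.2f}, "
          f"mean latency={latency:.2f}")
```

Under these assumed settings, the delayed variant trades longer latencies for fewer familiarity-based false alarms, because a response to a rearranged pair is withheld until recollection has had a chance to reject it. The exact rates depend on the seed and on the assumed parameters, so the sketch should be read as a qualitative illustration only.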
How does a change in a structural aspect of memory affect the performance of this model? Figure 10 shows the effect of sampling rate on associative recognition performance based on the free-response model that assumes that all responses are delayed until some T = m. Unlike the free-response model that assumes that responses will be made as soon as either the familiarity-based evidence or the recollective-based evidence provides a sufficient reason to respond (Fig. 8), increasing the sampling rate produces a robust increase in accuracy and a decrease in latencies. Comparing Figs. 8 and 10 shows that increasing the sampling rate has large effects on the contribution of recollection to correct rejections when responses are delayed
[Figure 10 appears here. Three panels: Accuracy (y-axis: p(old)), Latency (y-axis: latency), and Recollection (y-axis: p(correct response based on recollection)). The x-axis in each panel is the number of pair presentations (0 to 14). Legend: targets and foils at s = .10, .25, and .40.]
Fig. 10. The effect of sampling rate on the accuracy and latency of the dual-process model and the contribution of recollection to its performance when responses are delayed.
Note: This figure shows the relationship between associative recognition performance and the sampling rate for the dual-process model when responses are assumed to be delayed until the subjective estimate of when recollection should be completed. Hit rates and false-alarm rates as a function of the number of target presentations are plotted in the left panel. The middle panel plots the latency of the correct responses, and the right panel plots the probability that recollection was the basis of a response. The parameter values were: w = 10, t = 8, c = .7, u* = .04, g = .4, KO = 3, KN = −.5, KR = 10.
but not when the subject is free to respond based on familiarity before recollection becomes available. This is because the present model responds based on recollective evidence when it is available. Thus, a structural change in memory can have different effects on recognition memory performance, depending on the nature of the decision invoked at the time of retrieval.
It is important to note that these predictions, and all of the predictions derived for the purposes of these analyses, were generated without formally obtaining a "best fit" of the models to the data. It is therefore quite probable that more accurate representations of the data can be generated by the models. The point is that the qualitative predictions of different models can be used to distinguish between them, which is an encouraging sign, given the goal of understanding the sources of individual differences in recognition memory performance.
Thus, there are at least two models for achieving a speed-accuracy trade-off for associative recognition in a dual-process framework. One model assumes that subjects vary their familiarity-based decision criteria. The greater the distance between them, the less likely it is that responses will be based on familiarity, and therefore accuracy increases. The other model assumes that subjects simply delay responding until a metacognitively determined estimate of the amount of time one should wait in order to achieve one's goals has elapsed. The longer one waits, the more evidence there is on which to base a decision and the more accurate performance becomes. Interestingly, there appears to be a way to distinguish empirically between these models because the patterns of accuracy and latency data are different. According to the model that assumes that subjects vary their decision criteria (Fig. 7), increases in the number of times targets are presented should correspond to small decreases in the contribution of recollection to hit rates, and the latencies of the hits should increase relative to the latencies of correct rejections. According to the response-delay model, by way of comparison, when responses are relatively slow (Fig. 9), increases in the number of times targets are presented should increase the amount that recollection contributes to hits, but the latencies of hits should be less than the latencies of correct rejections. Thus, it is possible to distinguish between two models of the speed-accuracy trade-off by observing the patterns of accuracy and latency data.
At present, there are no data that can help us make an empirically informed decision about which model of speed-accuracy trade-offs is correct. Obtaining such data could have important implications for how fMRI and EEG data are interpreted. That is, the important implication is how we interpret the effect of an increase in accuracy and latency. According to one model, increases in pair strength produce an increase in the contribution of recollection to hits and correct
[Figure 11 appears here. Four panels: latency distributions of hits (top row) and correct rejections (bottom row) for slow (KO = 5.0) and fast (KO = 1.0) responding. x-axis: latency (0 to 10); y-axis: p(latency). Legend: number of presentations (1, 2, 3, 6, 12).]
Fig. 11. The distribution of the latencies of hits and correct rejections for different levels of bias to respond "old" based on familiarity in the dual-process delayed-response model.
rejections, and according to the other model increases in pair strength increase only the contribution of recollection to correct rejections. Thus, such modeling endeavors, when used in combination with neuroscientific methods, can produce conclusions based on a combination of our understanding of different brain structures and our understanding of the dynamics of memory systems.
In recent years, an important means of evaluating models of response latencies has been to observe not only the central tendency of responses but also the distribution of response latencies. A critical finding in many areas of research is that latency distributions are not normal but skewed, such that the leading edge is relatively steep compared to the tail of the distribution. As a final test of whether the current model is viable, I conducted a simulation of the free-response model from which the latency distributions for hits and correct rejections could be obtained. Figure 11 shows the hit and correct rejection latency distributions of the dual-process model for two levels of KO. In all cases, the distributions are skewed. The correct rejection latency distribution is clearly bimodal, owing to an early mode corresponding to an initial contribution of familiarity to performance and a later mode corresponding to a later contribution of recollection. The later mode diminishes as the spread between the old and new criteria decreases, as this reduces the contribution of recollection to performance.
V. Conclusions
Over the past few decades, we have learned a lot about the nature of human memory. Perhaps the best examples of the advances in our understanding are the formal models that have been developed. A shortcoming of these models, however, is that their scope has traditionally been limited to providing accounts of how the average individual remembers. The reason for this has to do not only with the incremental nature of the scientific method, but also with the empirical limitations of conducting memory research and the relegation of individual differences to the error term in our description of the data. If, however, formal modeling of memory is to have an impact on the everyday lives of people, we must apply what we know about how the average individual remembers to understanding why some people or populations are better rememberers than others.
Other areas of memory research have promoted methodologies that are more amenable to identifying individual differences. The recent upsurge in interest in cognitive neuroscience is due in large part to the saliency of the data that provide a glimpse into the brain of the rememberer. Such glimpses not only can tell us about where the brain supports human memory, but they can also tell us about how individual brains differ when they perform a memory task. In my experience, subjects and laypeople have little interest in how memory works. Their primary interest is whether they are as bad at remembering as they think they are. Indeed, the greatest prize one often receives when visiting a cognitive neuroscience laboratory is a photograph of one's brain. As such, the cognitive neuroscience approach to memory is often more appealing to the layperson
than the intrinsically dense mathematical formulations that increasingly characterize behavioral research. A limitation of the current cognitive neuroscience approach is that extant theories have more to say about where or when memory occurs and have relatively little to say about how or why memory occurs. Thus, we can point to an area of the brain that discriminates between good and not-so-good rememberers and not have a very good understanding of why one person remembers better than another. For instance, are differences in the patterns of brain activation due to structural abnormalities of the brain or due to differences in the utilization of different memory strategies?
Viewing memory as skilled cognition provides an opportunity to take advantage of the explanatory power of the modeling approach to memory and the specificity of the cognitive neuroscience approach to understanding individual differences. In this chapter, I have described a theory that assumes that there is a variety of ways to perform a given memory task, and the strategy adopted is assumed to be efficient with respect to the subject's goal to achieve a given level of accuracy in the shortest time possible. The extent to which one is a "good rememberer" depends (a) on the operations of the structural components of memory and (b) on the quality of one's meta-level understanding of the nature of the task and the nature of memory and decision. Discriminating between structural and strategic sources of variance depends on having a model of how they affect performance, and doing so might allow us to discriminate between those with serious structural memory impairments and those who are not effectively choosing efficient strategies for remembering. Such a theory could be further extended to include assumptions about the effects of implementing different memory strategies on brain activity, and these assumptions could be tested using traditional behavioral and cognitive neuroscience methodologies. To the extent that these joint efforts are successful, we may in the future use this knowledge to support diagnostic tools for evaluating human memory performance in order to more effectively characterize and treat memory impairments.
ACKNOWLEDGMENTS
Many thanks to Harry Bahrick and Aaron Benjamin for their comments on prior versions of this chapter and to Roger Ratcliff for a very generous and enlightening conversation concerning the random walk model. Please address correspondence to:
[email protected].
REFERENCES
Atkinson, R. C., & Juola, J. F. (1974). Search and decision processes in recognition memory. In D. H. Krantz, R. C. Atkinson, R. D. Luce, & P. Suppes (Eds.), Contemporary developments in mathematical psychology: Vol. 1. Learning, memory, and thinking (pp. 243–293). San Francisco, CA: Freeman.
Baranski, J. V., & Petrusic, W. M. (1998). Probing the locus of confidence judgments: Experiments on the time to determine confidence. Journal of Experimental Psychology: Human Perception and Performance, 24, 929–945.
Cleary, A. M., Curran, T., & Greene, R. L. (2001). Memory for detail in item versus associative recognition. Memory & Cognition, 29, 413–423.
Criss, A. H., & McClelland, J. L. (2006). Differentiating the differentiation models: A comparison of the retrieving effectively from memory model (REM) and the subjective likelihood model (SLiM). Journal of Memory & Language: Special Issue on Computational Models of Memory, 55, 447–460.
Criss, A. H., & Shiffrin, R. M. (2005). List discrimination and representation in associative recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(6), 1199–1212.
Diller, D. E., Nobel, P. A., & Shiffrin, R. M. (2001). An ARC-REM model for accuracy and response time in recognition and recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 414–435.
Dosher, B. A. (1984). Discriminating preexperimental (semantic) from learned (episodic) associations: A speed-accuracy study. Cognitive Psychology, 16, 519–555.
Dunn, J. A. (2004). Remember-Know: A matter of confidence. Psychological Review, 111, 524–542.
Egan, J. P. (1958). Recognition memory and the operating characteristic. Indiana University Hearing and Communications Laboratory, AFCRC-TN-58-51.
Gillund, G., & Shiffrin, R. M. (1984). A retrieval model for both recognition and recall. Psychological Review, 91, 1–67.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York, NY: Wiley.
Greene, R. L. (1996). Mirror effect in order and associative information: Role of response strategies. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(3), 687–695.
Gronlund, S. E., & Ratcliff, R. (1989). Time course of item and associative information: Implications of global memory models. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 846–858.
Hintzman, D. L. (1986). Schema abstraction in a multiple trace memory model. Psychological Review, 93, 411–428.
Hintzman, D. L., Curran, T., & Oppy, B. (1992). Effects of similarity and repetition on memory: Registration without learning? Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 667–680.
Hockley, W. E. (1994). Reflections on the mirror effect for item and associative recognition. Memory & Cognition, 22, 713–722.
Hockley, W. E., & Niewiadomski, M. W. (2007). Strength-based mirror effects in item and associative recognition: Evidence for within-list criterion changes. Memory & Cognition, 35, 679–688.
Jacoby, L. L. (1991). A process dissociation framework: Separating automatic from intentional uses of memory. Journal of Memory and Language, 30, 513–541.
Kelley, R., & Wixted, J. T. (2001). On the nature of associative information in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 701–722.
Lewandowsky, S., & Heit, E. (2006). Some targets for memory models. Journal of Memory and Language, 55, 441–446.
Light, L. L., Patterson, M. M., Chung, C., & Healy, M. R. (2004). Effects of repetition and response deadline on associative recognition in young and older adults. Memory & Cognition, 32(7), 1182–1193.
Macmillan, N. A., & Creelman, C. D. (1990). Response bias: Characteristics of detection theory, threshold theory, and nonparametric indexes. Psychological Bulletin, 107, 401–413.
Malmberg, K. J., Holden, J. E., & Shiffrin, R. M. (2004). Modeling the effects of repetitions, similarity, and normative word frequency on judgments of frequency and recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 319–331.
Malmberg, K. J., & Shiffrin, R. M. (2005). The "one-shot" hypothesis for context storage. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 322–336.
Malmberg, K. J., & Xu, J. (2006). The influence of averaging and noisy decision strategies on the recognition memory ROC. Psychonomic Bulletin & Review, 13(1), 99–105.
Malmberg, K. J., & Xu, J. (2007). On the flexibility and on the fallibility of associative memory. Memory & Cognition, 35(3), 545–556.
Malmberg, K. J., Zeelenberg, R., & Shiffrin, R. M. (2004). Turning up the noise or turning down the volume? On the nature of the impairment of episodic recognition memory by midazolam. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 540–549.
Murdock, B. B. (1982). A theory for the storage and retrieval of item and associative information. Psychological Review, 89, 609–626.
Murdock, B. B. (1997). Context and mediators in a theory of distributed associative memory (TODAM2). Psychological Review, 104, 839–862.
Nelson, T. O., & Leonesio, R. J. (1988). Allocation of self-paced study time and the "labor-in-vain effect". Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 676–686.
Nelson, T. O., & Narens, L. (1990). Metamemory: A theoretical framework and new findings. In G. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory. New York, NY: Academic Press.
Raaijmakers, J. G. W., & Shiffrin, R. M. (1980). SAM: A theory of probabilistic search of associative memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 14, pp. 207–262). New York, NY: Academic Press.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85, 59–108.
Ratcliff, R., & Smith, P. L. (2004). A comparison of sequential sampling models for two-choice reaction time. Psychological Review, 111, 333.
Reder, L. M., Nhouyvanisvong, A., Schunn, C. D., Ayers, M. S., Angstadt, P., & Hiraki, K. (2000). A mechanistic account of the mirror effect for word frequency: A computational model of remember-know judgments in a continuous recognition paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 294–320.
Rotello, C. R., Macmillan, N. A., & Reeder, J. A. (2004). Sum-difference theory of remembering and knowing: A two-dimensional signal-detection model. Psychological Review, 111(3), 588–616.
Shiffrin, R. M., & Steyvers, M. (1997). A model for recognition memory: REM-retrieving effectively from memory. Psychonomic Bulletin & Review, 4, 145–166.
Shiffrin, R. M., & Steyvers, M. (1998). The effectiveness of retrieval from memory. In M. Oaksford & N. Chater (Eds.), Rational models of cognition (pp. 73–95). Oxford, England: Oxford University Press.
Van Zandt, T., & Maldonado-Molina, M. M. (2004). Response reversals in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1147–1166.
Xu, J., & Malmberg, K. J. (2007). Modeling the effects of verbal- and nonverbal-pair strength on associative recognition. Memory & Cognition, 35(3), 526–544.
Yonelinas, A. P. (1997). Recognition memory ROCs for item and associative information: The contribution of recollection and familiarity. Memory & Cognition, 25, 1397–1410.
MEMORY AS A FULLY INTEGRATED ASPECT OF SKILLED AND EXPERT PERFORMANCE
K. Anders Ericsson and Roy W. Roring
THE PSYCHOLOGY OF LEARNING AND MOTIVATION, VOL. 48. DOI: 10.1016/S0079-7421(07)48009-4. Copyright 2007, Elsevier Inc. All rights reserved. 0079-7421/08 $35.00
I. Introduction
Psychology has long searched for general theories based on invariant memory systems. In this chapter, we show that skilled and expert performers acquire complex skills and neurological/physiological adaptations that transform the mediating cognitive mechanisms and associated brain regions to make them qualitatively different from the processes and representations traditionally examined in laboratory studies of memory. We will argue that the development of expert performance drives the acquisition of mechanisms that allow experts to meet the demands of representative task performance, where memory is only one aspect of a highly integrated system that mediates superior task performance and its acquisition.
For the last several decades, researchers have increasingly focused on measuring the reproducibly superior performance of experts and describing its structure and acquisition (Ericsson, Charness, Feltovich, & Hoffman, 2006). This body of research has uncovered many important insights related to the learning and acquisition of skills in different domains within the framework of the expert-performance approach (Ericsson, 2006a,b; Ericsson & Smith, 1991); however, the list of explored domains regularly expands from initial domains, such as chess and music, to the recent additions of domains as diverse as simultaneous language translation, wine tasting, and volleyball. This general approach of extending the analysis of phenomena in new domains might appear excessively inductive and descriptive to a classically trained scientist. Traditional psychological and behavioral science is committed to studying the most elementary cognitive processes of perception, memory, and learning first, and only later applying the discovered general laws to complex skills and everyday phenomena. According to this general deductive approach, the basic mechanisms and laws should provide the most parsimonious accounts of complex findings from domains of expertise and even create novel predictions for hypothetical situations created in the laboratory. Such an approach was first established in the physical sciences, where physicists were able to explain mechanical phenomena by invariant general laws. Later the same approach allowed chemists to explain chemical reactions in terms of a small set of fundamental atomic elements. Today mainstream psychological and cognitive scientists continue to follow this general approach by studying many interesting phenomena in the laboratory in search of basic invariant elements. Ever since the emergence of applied psychology, investigators have attempted to apply these basic findings to domains of activity in everyday life. However, in the last few decades, numerous influential researchers in psychology have started to criticize this approach, proposing a return to the study of everyday cognitive phenomena (cf. Gibson & Pick, 2000; Neisser, 1976). Similarly, the expert-performance approach (Ericsson & Smith, 1991) restricts scientific investigation to superior performance in everyday life and argues that this type of expert performance can be captured with representative tasks and thus reproduced in the laboratory for further analyses of the mediating cognitive processes.
Process-tracing studies of skilled and expert performance have shown that it is difficult to distinguish different types of cognitive functions such as memory, problem solving, and decision making. These functions are tightly integrated during very high levels of performance on representative tasks in the associated domain of expertise. This integration has to be achieved virtually continuously as the structure of the acquired mechanisms is changed and modified as individuals develop from struggling beginners to experts with vastly superior performance (Ericsson, 1996, 2006a). In this chapter, we will examine how this theoretical framework permits us to discuss research on memory in skilled and expert performers. Rather than studying memory for domain-specific information in separate memory experiments, we will advocate that researchers focus on those representative activities that capture the essence of expertise in specific domains of activity. Understanding the organization of the mechanisms mediating expert performance requires that investigators examine representative tasks that elicit the encoding, retrieval, reasoning, decision making, and thinking that permit the expert to reproduce his or her superior performance.
II. Outline of the Chapter
This chapter begins by sketching the rationale for the original mainstream approach in experimental psychology that searched for basic nonmodifiable human capacities and processes. We will start by briefly reviewing Ebbinghaus's pioneering work on basic memory processes and some of the criticisms raised by his contemporaries. We will then discuss the emergence of information-processing models of cognitive processes that could account for performance in a wide range of laboratory tasks that were designed to study problem solving, decision making, concept formation, and reasoning. We will then briefly review subsequent models of skill acquisition and expertise and how modern formulations of skilled and expert performance account for findings from a variety of domains that integrate concepts from all areas of cognition. We will discuss recent empirical evidence on the effects of practice and adaptation, and demonstrate that under certain external conditions, biological systems (including those of human children and adults) are capable of dramatic change, even at the cellular level. The remarkable modifiability of human performance raises issues for research committed to the traditional search for basic invariant phenomena.
Our approach, that is, the expert-performance approach (Ericsson, 2006a; Ericsson & Smith, 1991), searches for reliably superior performance by experts and then attempts to capture that performance with representative tasks in the laboratory. The structure of the intact performance is examined using standard process-tracing techniques, such as protocol analysis of verbal reports of thinking, analysis of latencies, and analysis of eye-movement sequences. Given that most forms of skilled performance involve the generation and/or selection of superior courses of action under time-constrained conditions, we can develop representative situations with immediate demands for action that allow this type of performance to be reproduced repeatedly in the laboratory. These tasks allow the experimenter to use experimental techniques to determine how the mechanisms responsible for superior performance rely on temporary and more permanent storage of presented information and the intermediate products of task processing. In this chapter, we will discuss how integrated structures and mechanisms are acquired through deliberate practice, and why a general scientific description of the mediating mechanisms underlying skilled performance requires studying a variety of expert domains.
A. HISTORICAL BACKGROUND TO THE TRADITIONAL APPROACH SEARCHING FOR BASIC ABILITIES
Psychologists in the nineteenth century were inspired by the successful approach of the natural sciences during the seventeenth through nineteenth centuries. Motivated by such examples as Newton's three laws in mechanics, these psychologists searched for general laws and the simplest observable mental phenomena that could be elicited in controlled laboratory environments. The pioneering laboratory study of basic memory processes was conducted by Ebbinghaus (1885/1964). A critical goal of Ebbinghaus was to uncover scientific laws underlying the acquisition and retention of elementary associations in memory independent of acquired knowledge and experience. To minimize the effects of knowledge, Ebbinghaus's experiments used lists of nonsense syllables. Ebbinghaus sought general laws governing healthy adult memory, and his belief in the generality of those laws was so strong that he studied only a single participant: himself. Over several years of testing, he recorded his performance while memorizing over 2000 lists of nonsense syllables (Dukes, 1965). His faith in the value of his single-participant studies was apparently well founded, as even 100 years after publication of his dissertation, scientists noted the validity and reproducibility of his original discoveries and even questioned how much significant progress had been achieved in the study of human memory in the century that followed (Slamecka, 1985a,b).
However, memory for nonsense syllables is much worse than memory for typical information encountered in everyday life; in fact, Ebbinghaus needed 10 times more time to memorize lists of nonsense syllables compared to poems with the same number of syllables. Other psychologists of the time, most notably Alfred Binet, raised doubts about whether memory for nonsense syllables was mediated by the same types of processes as memory in everyday life and the exceptional memory performance of mental calculators and chess players. Binet's report on chess players' "mnemonic virtuosity" was arguably the first published study of memory and expertise (Binet, 1894/1966, p. 127). In this paper Binet reports on interviews with chess players about their ability to play chess without seeing a chessboard, or "blindfolded." Based on anecdotes and the answers to his interview questions, Binet argued for the role of serious study in chess playing, as any amateur who has just learned the rules cannot play blindfold chess "[n]o matter how good his memory" (p. 145). Hence, it is the ability to discover the meaning that is the basis for the superior memory. However, Binet found that the verbal descriptions of the visual images used by chess players differed enormously between different individuals. Some claimed to see the board perfectly with all the details and even
shadows. Other chess players claimed merely to rely on abstract characteristics of the chess position. Unfortunately, there was no independent evidence to support, question, or falsify the validity of these diverse introspective reports. Despite this, Binet’s classic report is valuable and sets the stage for subsequent psychometric tests of performance and experimental laboratory studies (Binet, 1893/1966). Whereas initial proficiency in some domains may be attained within weeks or months, development to very high levels of achievement appears to require many years or even decades of experience. Similar transformative phases of acquisition have been shown to account for development in professional domains such as telegraphy (Bryan & Harter, 1897, 1899) and typing (Book, 1925a,b). In fact, Bryan and Harter claimed as early as 1899 that over 10 years are necessary for becoming an expert. The next major contribution to the study of memory and expertise involved testing the basic abilities of world‐class chess players and comparing their abilities to those of regular adults. Djakow, Petrowski, and Rudik (1927) measured many different psychometric abilities of 12 international‐level chess players and compared their performance to the average of a large sample of non‐chess players. Contrary to the assumed importance of natural gifts, the international chess players were superior on only a single test: a test involving memory for chess positions. A later fundamental advance was achieved when theories of normal skill acquisition were integrated with the acquisition of expertise (Fitts & Posner, 1967). When anyone is introduced to a skilled activity such as driving a car, typing on a computer, or playing golf, their initial goal is to reach a level of proficiency that will allow them to perform these everyday tasks at a functional level. 
During the first phase of skill acquisition (Fitts & Posner, 1967), beginners try to understand the requirements of the activity and focus on generating actions while avoiding gross mistakes. In the second phase, when people have had more experience, noticeable mistakes become increasingly rare, performance appears smoother, and learners no longer need to focus as intensely on their performance. As individuals adapt to a domain during the third phase of learning, their performance skills become automated, and they are able to execute these skills with minimal effort. As a consequence of automatization, performers lose control over the execution of those skills, making intentional modifications and adjustments difficult. In their influential formal theory of expertise, Simon and Chase (1973) reconciled this discrepancy between the rudimentary performance on unfamiliar tasks in the laboratory and the skilled, effortless performance in everyday life and described how expert performance could be developed without violating invariant memory constraints, such as short‐term memory (STM) capacity, and the fixed speed of basic processing. Simon and Chase
argued that with experience, individuals could build associations in memory based on patterns extracted from previously encountered situations [i.e., chunks of information in long‐term memory (LTM)], as well as the appropriate actions that should be taken in these situations. During performance and competitions, these experienced individuals would simply use the situational cues matching specific LTM chunks to retrieve the associated action from memory and thus bypass the complex generation and decision‐making processes, often heavily involving STM, that were found for unpracticed laboratory tasks. Simon and Chase (1973) argued that the acquisition of expertise was closely linked to the gradual accumulation of such patterns and knowledge gained from much extended experience in the domain. For example, Simon and Chase found that chess experts typically spent at least 10 years playing before attaining international levels of performance. Based on Newell and Simon’s (1972) theory for human information processing, Chase and Simon (1973) argued that the overall number of chunks held by chess players with brief exposures to chess positions is limited to seven plus or minus two (Miller, 1956). Chase and Simon (1973) argued that the memory recall advantage of stronger chess players was based on the larger number and the greater complexity of stored patterns of chess pieces (chunks). In support of this, they found that the recall advantage essentially disappears when using chess positions where pieces are randomly assigned to squares on the board. Given that chess players accumulate the chunks in LTM from experience with real chess positions in games, these random positions were unlikely to contain meaningful chunks. 
Later research demonstrated this interaction of skill by stimulus meaningfulness on memory recall in a large number of other domains, including bridge (Charness, 1979; Engle & Bukstel, 1978), Go (Reitman, 1976), medicine (Norman, Brooks, & Allen, 1989), music (Sloboda, 1976), and several different sports (Allard & Starkes, 1991). These studies supported the notion that expert performance is largely a function of acquiring a database of LTM chunks that are accessed from STM during task performance. However, empirical research has uncovered critical problems with chunking theory and its assumptions of memory stores with invariant storage capacity and with fixed durations for the associated processes of encoding, storage, and retrieval. In particular, the assumption that the recalled information after a brief exposure was stored temporarily only in a limited‐capacity STM was found untenable. Charness (1976) showed that recall of a briefly presented chess position was essentially unaffected for experts even when they had to perform a highly attention‐demanding task interpolated between the end of presentation and the beginning of recall. If the processing of the attention‐demanding task had replaced the original contents of the limited‐capacity STM, then chess
experts must be able to store information rapidly in a more persistent storage system that would allow subsequent retrieval. More generally, Ericsson and Kintsch (1995) showed in a review that expert performers in chess and other domains are able to store and encode information in LTM with the flexible and rapid access to information that is typically characteristic of STM, and they referred to this type of memory system as long‐term working memory (LTWM). Their reviewed findings showed that during performance of representative tasks, the primary challenge for expert performers is to anticipate future retrieval demands for the generated and encountered information so it can be encoded appropriately in LTM. These findings are inconsistent with the idea of general elementary processes that perform encoding, storage, and retrieval for unitary structures, such as chunks, and the associated human information processing theory of Newell and Simon (1972). Most research in the laboratory, especially on memory, rarely confronts these issues as participants are tested on relatively unfamiliar stimuli and the duration of the experiments is limited to around one hour. In order to bridge expert performance and the one‐hour laboratory experiment, we would need to conduct training studies in the laboratory that would allow us to trace the development of more complex processes with tens, hundreds, or thousands of hours of practice, which is generally impractical for many reasons. Simon and Chase (1973) were aware of the enormous investment of preparation to reach the highest levels of skill and they elaborated the 10‐year rule (Bryan & Harter, 1899) and proposed that it is not possible in modern times to win consistently at the international level with less than 10 years of experience.
B. THE POWERFUL EFFECTS OF SOME TYPES OF PRACTICE
Over 25 years ago, Bill Chase and K. Anders Ericsson (Ericsson, Chase, & Faloon, 1980) were interested in whether it was possible to increase what was at that time commonly believed to be the primary constraint on information processing, namely the limited capacity of STM. The standard test of STM involves immediate recall of a series of digits (digit span). The average performance of college students for such an activity is about seven digits, the equivalent of a local phone number. This study began by establishing the digit memory of several college students prior to training, and we verified that recall performance was normal (i.e., limited to around seven digits), implying typical capacity levels. After 50 h of practice, all of the trained students increased their memory performance to over 20 digits. After 200–400 h, two of the students improved their recall to over 80 digits, an increase of more than 1000%, and later studies replicated these dramatic improvements. It was discovered that the observed 1000% improvement in digit recall relied on mechanisms of LTM,
which is quite similar to the superior memory of expert performers in chess mentioned earlier (Chase & Ericsson, 1982). In fact, adults attaining skilled performance in reading and text comprehension, for example, acquire similar mechanisms for expanding their working memory through storage in LTWM (Ericsson & Kintsch, 1995). If extended training on a task can result in dramatically superior performance due to mechanisms that are independent of basic capacity constraints, such as STM capacity, then such capacity measures may be unhelpful in predicting this superior achievement, even though they may predict performance on unpracticed laboratory tasks. When we limit our review to only objective measures of performance in domains of skill, we find that traditional capacity measures and intelligence tests are not related to individual differences among skilled and expert performers. In general, our reviews show that although IQ is frequently correlated with initial performance on an unfamiliar task, after extended periods of skill acquisition the relation is no longer statistically reliable. For example, a recent longitudinal study of children’s improvements in chess found that the predictive power of IQ diminishes as skill improves and IQ did not predict rate of improvement after accounting for practice activity (Bilalić, 2006). Research on expert performance has also found that full‐scale IQ tests and heavily g‐loaded tests, such as Raven’s matrices, are not reliably correlated with expert performance in many types of intellectual domains, such as chess (Doll & Mayr, 1987), Go (Masunaga & Horn, 2001), and several others (Ericsson & Lehmann, 1996). One initial objection to these findings is that the entire samples of skilled chess players have an average IQ that is higher than the mean of the general population (Doll & Mayr, 1987). 
More recently collected samples that vary widely in skill and IQ (Grabner, Neubauer, & Stern, 2006; Unterrainer, Kaller, Halsband, & Rahm, 2006) have replicated the lack of relation between chess performance and IQ even when restriction of range cannot be the explanation, because the studied samples covered an appropriate range. There are also numerous documented cases of individuals reaching very high levels of achievement with an IQ below 100. For instance, some of the grandmaster chess players in Doll and Mayr’s (1987) study had IQ scores below the normative mean, and even in the verbal board game Scrabble, some top players have below‐average verbal ability (Tuffiash, Roring, & Ericsson, in press). That IQ and related cognitive‐ability tasks fail to predict reliably objective measures of domain achievement raises the question of whether the underlying mechanisms overlap. Indeed, research on processes mediating reproducibly superior expert performance shows that the associated mechanisms differ fundamentally from those used by novices (Ericsson, 2006a): during years of practice and training, experts acquire elaborate mechanisms for encoding and maintaining flexible access to critical task information that bypass basic capacities,
such as STM capacity (Ericsson & Kintsch, 1995). Moreover, evidence suggests that with increasing level of skill there are changes in the patterns of neural activation (Hill & Schneider, 2006), and some evidence even suggests that intense training can change functional and structural aspects of the brain (Ericsson, 2006b). For instance, early and extended training in music has been shown to change the cortical mapping of the brain area controlling fingers of string players (Elbert, Pantev, Wienbruch, Rockstroh, & Taub, 1995) and the flexibility of fingers (Ericsson & Lehmann, 1996), and intense music practice has been shown to influence the development of myelin around nerves in critical brain regions (Bengtsson et al., 2005). Notably, several studies have also found that regions of brain activity may dramatically change as chess skill increases (Amidzic, Riehle, Fehr, Wienbruch, & Elbert, 2001; Grabner et al., 2006; Volke, Dettmar, Richter, Rudolf, & Buhss, 2002), and this may hold true in many other intellectual domains as well, including memory (Maguire, Valentine, Wilding, & Kapur, 2003), mental calculation (Pesenti et al., 2001), and even in taxi drivers (Hartley, Maguire, Spiers, & Burgess, 2003). In general, achieving expert proficiency in a domain requires thousands of hours of effortful deliberate practice involving problem solving and intense concentration (Ericsson, 2006b; Ericsson, Krampe, & Tesch‐Römer, 1993), and critical changes in neural substrates will likely accumulate over this period of time, in a manner that cannot be elicited during a brief training period with the tasks. 
Given that the structure and the mechanisms mediating performance change qualitatively during the skill acquisition process, any theory claiming the invariant involvement of stable mechanisms, such as innate capacities, must not only provide supporting evidence, but must also account for the accumulated evidence failing to support the involvement of any general and stable mechanisms or structure.
C. TOWARD A SCIENTIFIC STUDY OF HIGH LEVELS OF SKILL AND EXPERT PERFORMANCE
The expert‐performance approach (Ericsson, 2006a; Ericsson & Smith, 1991) focuses on those observable activities that define the essence of expert performance in a domain. For example, the focus should be on the diagnosis and treatment of patients in a superior manner, on the consistent selection of the best moves for chess positions, and on superior performance in music and sport competitions. The first step for a science of expert performance requires that scientists capture, with standardized tasks, the reproducibly superior domain performance. The second step involves analyzing performance on these tasks with cognitive methods and experimental techniques to delineate the underlying mediating mechanisms. Once cognitive mechanisms have been identified, the third and final step involves research on what types of practice
activities have led to the acquisition of these structures. The findings from these steps will be discussed in the next sections. We will also discuss how these mediating mechanisms could explain the various findings on superior expert memory recall.
D. CAPTURING REPRODUCIBLY SUPERIOR PERFORMANCE UNDER STANDARDIZED CONDITIONS
In everyday life, experts encounter unique challenges under different conditions, which is problematic for comparing levels of performance among different experts. For example, one doctor may treat two clients with complex and difficult treatment problems, whereas another may treat six clients with relatively routine problems. Unless skilled individuals encounter the same or comparable situations it will be very difficult to measure individual differences in performance in a meaningful manner. In his pioneering study De Groot (1978) was able to design representative tasks that captured the superior performance of world‐class chess players. The ultimate criterion for success in chess is success at chess tournaments, where players match their chess skill in matches that last for several hours. Using an innovative approach, De Groot (1978) found a way to elicit the critical processes distinguishing chess experts that avoided an analysis of extended chess playing behavior. He identified particular chess positions, taken from real games between chess masters, where one move was much better than other legal moves. He then presented the selected positions to chess players during an individual testing session and asked them to “think aloud” while they generated the best possible next move for each position. De Groot (1978) demonstrated that world‐class players reliably found the best moves for these positions, whereas skilled club players only found the best chess moves for some of them. Subsequent research with large groups of chess players differing widely in skill has shown that the selection of the best move for selected chess positions is correlated with the tournament ratings (Ericsson, Patel, & Kintsch, 2000; van der Maas & Wagenmakers, 2004). When performance on 20 selection tasks is aggregated, the resulting score is highly correlated with chess ratings, allowing researchers to measure a chess player’s skill after less than 15 min of testing. 
Of particular interest, De Groot (1978) was able to identify how the thought processes of world‐class players differed from those of highly skilled club players by analyzing their think‐aloud protocols from the move selection tasks. Ericsson and Smith (1991) proposed a new approach to the study of expertise based on a generalization of De Groot’s paradigm, which was later elaborated as the expert‐performance approach. According to this framework, the experts’ performance in a given domain of expertise, such as music, sports, or medicine, is examined to identify naturally occurring situations that require immediate action and
that capture the essence of expertise in the associated domain. For example, a doctor or a nurse will frequently encounter situations where they must assess the symptoms of a patient for immediate diagnosis and treatment, and this can be captured by a task that requires identification of symptoms and subsequent diagnosis and treatment of disease. Similarly, it is possible to film soccer game situations where a given player is required to execute an action without any delay (Ward & Williams, 2005). Once these situations have been selected and appropriate courses of action identified, it is then possible to reproduce them with appropriate context and an immediate demand for action under standardized conditions for all tested participants. In a controlled environment it is possible to present these representative situations to participants differing in skill, and then use experimental techniques and process‐tracing methods (e.g., eye tracking) to explain the mediating mechanisms of superior performance.
E. APPLYING THE EXPERT‐PERFORMANCE APPROACH TO IDENTIFY THE MEDIATING MECHANISMS
The expert‐performance approach (Ericsson, 2006a) proposes that the reproduction of the essential performance under controlled experimental conditions in the laboratory provides opportunities to analyze the mediating processes with traditional methods in cognitive psychology and cognitive science, such as protocol analysis, eye‐movement recordings, and experimental manipulations. From a series of such analyses, it is possible to identify the mechanisms that are responsible for the superior performance and then turn to the final step, namely explaining how the complex mechanisms mediating the superior performance were acquired by deliberate practice (Ericsson, 2006b). Our aim for the remainder of this chapter is twofold. First, we will describe the assessment of the mechanisms mediating expert performance in different domains and describe evidence on the critical practice activities. Second, we will examine if our understanding of these acquired task‐related mechanisms can predict the performance of experts on transfer tasks involving various memory tests, such as Chase and Simon’s 5‐s presentation and recall of chess positions. In particular, we will review how studies focusing only on these intentional memory tasks often report a different pattern of findings than those observed for incidental memory after performance on regular (non‐memory) tasks that are directly representative of expertise in the associated domains. Such independence suggests that the processes used for memory tasks may be distinct from the processes used by experts to generate reproducibly superior performance in their domain. We will discuss research on specific domains and highlight how unique information has been discovered by studying novel areas of expert performance. First, we will
discuss acquired memory performance followed by chess, the drosophila of skill domains (Simon & Chase, 1973), and follow this with domains having both predictable and less predictably occurring situations.
1. Memory Expertise
Perhaps the domain most extensively studied with the expert‐performance approach is exceptional memory (Ericsson, 2003; Ericsson & Lehmann, 1996). Following Binet’s (1894/1966) pioneering work studying individuals with exceptional memory, several subsequent studies interviewed exceptional memorizers, such as Luria’s (1968) subject S and Hunt and Love’s (1972) VP (see Wilding & Valentine, 1997, for a review). The capturing of expert memory performance is unusually straightforward because the superior performance is defined by the memory task as it is performed in public demonstrations. The unfamiliar material, such as lists or matrices of digits or letters, is presented to the performer followed by a recall. The first study to trace the development of the acquired mechanisms mediating exceptional memory was conducted by Chase and Ericsson (1981, 1982) and Ericsson et al. (1980). They studied how college students with initially an average memory performance acquired the best memory performance in the world for some memory tasks. In this case, the memory task involved the memorization of orally presented random sequences of digits at a rate of 1 digit per second. In the first study, Chase and Ericsson (1981, 1982) and Ericsson et al. (1980) collected retrospective reports after most memory trials to assess changes in the thought processes as a college student (SF) improved his performance on the digit‐span task. The analysis of the protocols revealed that as SF increased his memory performance, he reported segmenting the presented lists into 3‐digit groups and whenever possible encoded them as running times given that SF was an avid cross‐country runner. 
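The segmentation strategy SF reported can be illustrated with a minimal sketch. It splits a digit string into 3‐digit groups and checks whether each group can be read as a running time. The validity rule used here (first digit as minutes, last two digits as seconds below 60) and the function names are assumptions made for illustration only; they are not a description of the coding scheme SF actually used.

```python
def segment(digits, size=3):
    """Split a digit string into consecutive groups of `size` digits."""
    return [digits[i:i + size] for i in range(0, len(digits), size)]


def as_running_time(group):
    """Read a 3-digit group as M:SS if possible, else return None.

    Assumed rule (illustrative only): the first digit is minutes and the
    last two digits are seconds, which must be below 60.
    """
    if len(group) != 3 or not group.isdigit():
        return None
    minutes, seconds = int(group[0]), int(group[1:])
    if seconds >= 60:
        return None
    return f"{minutes}:{seconds:02d}"


# Hypothetical digit list, not taken from the original study.
groups = segment("358483912")
encodings = [as_running_time(g) for g in groups]
print(groups)     # ['358', '483', '912']
print(encodings)  # ['3:58', None, '9:12']
```

Under this assumed rule, a group such as 483 has no running‐time reading, so a list built entirely from such groups would deprive the performer of this particular mnemonic.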
To address the validity of the verbal reports, Chase and Ericsson designed an experiment to test the effects of mnemonic encodings and presented SF with special types of lists of constrained digits, such as lists that were constructed to contain only 3‐digit groups that could not be encoded as running times, and his performance reliably decreased. Based on a combination of experiments and process‐tracing studies, Chase and Ericsson attained a deep understanding of the acquired skill. This study is relatively unique in that all of SF’s training was monitored in the laboratory with retrospective reports. SF’s gradual changes could be observed and related to the problem solving and exploration of alternative encoding methods and the opportunity to practice tasks at the limits of his ability. The training procedure involved presenting SF with a list of digits with a length that was kept at his performance limit: the length was increased by one digit if the preceding trial was successful and reduced by one digit if he had failed, which
led to perfect recall for 50% of the memory trials. He was given immediate feedback on any errors in recall so he could identify specific problems and develop better encoding methods. This type of practice is designed to present tasks that challenge the current level of performance and provide immediate detailed feedback and opportunities for repetition, and was subsequently viewed as an example of deliberate practice by Ericsson et al. (1993). Deliberate practice has been documented in a wide range of different domains of expertise (Ericsson, 2006a; Ericsson et al., 1993). The knowledge of the mechanisms mediating the exceptional performance and the deliberate practice that led to the gradual acquisition of the performance permitted us to make predictions for performance on near and far transfer tasks of memory. When SF was tested on visually presented matrices he was able to adapt his encoding methods and show nearly perfect transfer, but when he was tested on series of consonants his performance showed no benefit of the training. In general, this can be extended to any individual with exceptional memory performance. Initially, the exceptional individuals verbally report their thoughts while performing or directly after performing representative memory tasks. These reports allow investigators to identify the mediating encoding and retrieval mechanisms of each exceptional individual. The training history of the individual is assessed by interviews to determine the nature of the practice activities in relation to the acquired mechanisms. These models of the exceptional performance are then evaluated experimentally by presenting each individual with experimentally designed transfer tasks, where performance and decrements in performance are predicted in advance (Ericsson, 1985, 1988; Wilding & Valentine, 1997). 
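The adaptive rule used in SF’s training (one digit longer after a success, one digit shorter after a failure) can be sketched as a simple staircase. The simulated performer below is a deliberate simplification and an assumption of this sketch: recall is taken to be perfect whenever the list length does not exceed a fixed `span`. The function names are likewise hypothetical.

```python
def next_length(length, recalled_perfectly, minimum=1):
    """One-up/one-down rule: lengthen after success, shorten after failure."""
    if recalled_perfectly:
        return length + 1
    return max(minimum, length - 1)


def run_staircase(start_length, span, n_trials):
    """Simulate trials for a hypothetical performer with a fixed span.

    Assumption (illustrative only): a list is recalled perfectly if and
    only if its length is at most `span`.
    """
    length = start_length
    history = []
    for _ in range(n_trials):
        success = length <= span
        history.append((length, success))
        length = next_length(length, success)
    return history


history = run_staircase(start_length=5, span=7, n_trials=20)
# Skip the initial climb toward the performance limit.
settled = history[4:]
success_rate = sum(ok for _, ok in settled) / len(settled)
print(history[:6])
print(success_rate)
```

Once the list length reaches the performer’s limit, the rule alternates between a length that succeeds and one that fails, which is why such a procedure holds the rate of perfect recall near 50%.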
With the same methodology of the expert‐performance approach, verbally reported mechanisms of superior performance and the associated practice activities have been validated with designed experiments in a wide range of domains, such as a waiter having superior memory for dinner orders (Ericsson & Polson, 1988a,b) and other individuals with exceptional memory performance (Ericsson, 2003; Ericsson, Delaney, Weaver, & Mahadevan, 2004).
2. Chess
The task that captures skilled performance in chess is the selection of the best moves for unfamiliar positions, as we discussed earlier. De Groot (1978) applied the process‐tracing methodology to assess the processes mediating this performance. He collected ‘‘think aloud’’ protocols while the chess players generated their moves and he collected incidental memory for the positions after the move selection had been completed. The verbal protocols of both world‐class and skilled club‐level players showed that both types of players first familiarized themselves with the position and verbally reported
salient and distinctive aspects of the position along with potential lines of attack or defense. The players then explored the consequences of longer move exchanges by planning alternatives and evaluating the resulting positions. During these searches, the players would identify moves with the best prospects in order to select the single best move. De Groot’s analysis of the protocols identified two important differences in cognitive processes that explained the ability of world‐class players to select superior moves compared to club players (De Groot, 1946/1978). De Groot noticed that the less skilled players rarely verbalized the best move during move selection, implying that they did not, in fact, consider it. Thus, their initial inferior representation of the position must not have revealed the value of lines of play starting with that move. In contrast, the world‐class players reported many strong first moves even during their initial familiarization with the chess position. For example, they would notice weaknesses in the opponent’s defense that suggested various lines of attack and then examine and systematically compare the consequences of various sequences of moves. During this second detailed phase of analysis, these world‐class players would often discover new moves which were superior to all the previously generated ones. Subsequent research has shown that after selecting the best move for positions, chess players are able to recall a substantial amount of the presented chess positions, even if the request for recall is unexpected. Higher levels of skill are associated with higher levels of recall. Studies have found that the structure of the search for alternative moves becomes more advanced with increased levels of chess skill (Campitelli & Gobet, 2004; Charness, 1981). Additional time to plan and explore alternative moves has been shown to lead to increased quality of moves (Chabris & Hearst, 2003). 
During the planning phase, chess players are able to refine their representation of the chess position with newly discovered consequences of considered moves. While an exact theory for how chess players form and integrate such patterns and relationships during problem solving has not yet been proposed, there have been significant advances in understanding the type of practice activities that are associated with increased chess skill. Ericsson et al. (1993) proposed that the most effective type of practice involved selecting moves for chess positions, where it was possible to get immediate feedback on the quality of one’s proposed move. They suggested that trying to select moves for published master games, one move at a time, and then checking whether the selected move matched the master’s would provide such an opportunity. Subsequent research by Charness and colleagues (Charness, Krampe, & Mayr, 1996; Charness, Tuffiash, Krampe, Reingold, & Vasyukova, 2005) has supported these predictions. They have found that the accumulated amount of solitary study of chess games is the best single predictor of chess ratings and that the
number of chess books and journals owned (the source of games between chess masters) is also a very good predictor of chess playing. In contrast, merely playing chess games does not seem to lead to improved chess performance when other more relevant activities are controlled statistically. This type of training in selecting the best moves and correcting errors in move selection requires extensive reasoning about future potential chess positions and thus the refinement of representations and skill to encode all relevant information in LTWM. Some evidence from neuroscience even suggests that as chess players improve, they rely increasingly on the temporal lobe structures and less on parietal lobe structures, indicating the use of episodic LTWM during task performance (Amidzic et al., 2001). This analysis of the mechanisms mediating chess proficiency is consistent with research on memory tests. The most remarkable feat originally documented by Binet (1893/1966) is the ability of skilled players to play chess without seeing the chessboard, referred to as blindfold chess. Blindfold chess is similar in many respects to the planning operations carried out by chess players in normal games, particularly in complex positions where players must calculate many moves deep beyond the current position. Ericsson and colleagues found that a chess master without any prior experience of blindfold chess was able to play at a skilled level and could play out games in his mind from merely being given a series of the moves (Ericsson & Oliver, 1988; Ericsson & Staszewski, 1989). These findings have been replicated and extended by Saariluoma (1989), who found evidence that chess players could encode and recall chess pieces associated with specific squares of the chessboard even when the pieces were presented in random orders. 
In fact, Chabris and Hearst (2003) have shown that the quality of chess playing under blindfold conditions by very strong players is almost as good as under normal conditions. Similarly, when chess players perform the Chase and Simon (1973) type of memory task with a brief presentation, they should be able to use the same encoding mechanisms developed during deliberate practice; however, they may also use meaningful mnemonic encodings during the memory task that are unrelated to choosing good chess moves and more related to common positional structures or patterns. Consistent with this latter idea, Ericsson and Harris (1990) found that a beginner at chess could be trained to produce master-level memory performance for chess positions in about 50 h of practice, which is dramatically less time than is required to reach master levels of chess skill. In fact, highly skilled chess players have superior memory even for pseudorandom positions after engaging in the move-selection task (Schultetus & Charness, 1999). This and related findings emphasize that memory must be studied in the context of representative domain situations, rather than in distinct memory tasks that may differ with respect to the representations used by skilled individuals. If researchers want to study
the essence of domain performance, they must avoid contrived memory tasks and focus on the performance in question. As we will see, in many domains, the study of memory performance in isolation leads to findings incongruent with data from representative tasks.
F. THE REPRESENTATIVE REHEARSED PERFORMANCE AND ITS RELATION TO MEMORY PERFORMANCE
The type of performance captured by representative domain tasks differs markedly across domains of skill. In the arts, expert performers are typically judged by an audience. The scores of famous music pieces and the lines of famous plays need to be successfully reproduced, which requires that performers memorize and practice them until they can express their interpretation of the work. In other domains, the expert performer cannot make detailed preparations in advance. For example, soccer players and medical doctors cannot know what specific situations they will encounter and must thus develop skills that allow them to generate appropriate actions in a whole range of possible situations.
1. Displaying a Prepared Performance: Acting
The essence of expert acting is to present a rendition of a given role that is authentic and engages the audience. This is the end product of a very long process of internalizing the lines of the portrayed character. Actors generally approach learning a script in two stages (Noice & Noice, 2002b). First, they derive the character’s intentions from parsed units of the script, which requires extensive analysis in which they attempt to understand the character’s motivation for specific aspects of every line. Second, they rehearse and perform these scripts through active experience. Empirical evidence suggests that the intentions (and possibly information from motor movements and situational cues: Noice & Noice, 1999) can later serve as retrieval cues (Noice & Noice, 2006). Important for our purposes, some studies have even found that actors can recall scripts almost perfectly several years after the final performance despite having learned other scripts in the interim (Noice & Noice, 2002a; Schmidt, Boshuizen, & van Breukelen, 2002). This means that actors accumulate knowledge about scripts and lines and that they expand their repertoire by adding new scripts while keeping older scripts accessible and associated with relevant retrieval cues. This type of accumulation of associations of new material with a limited set of retrieval cues is likely to lead to interference problems during the acquisition of new scripts rather than to facilitation of memorization of new material. Consistent with the hypothesis that actors become skilled at interpreting the intentions and thoughts of the characters they portray, actors have minimal advantage over untrained individuals when first learning a script
(Noice, 1993). The amount of time required by actors to memorize text is not reliably different from that required by random samples of college students (Intons-Peterson & Smyth, 1987). More detailed studies of the memory structure of lines have been conducted with actors performing the works of classical playwrights, such as Shakespeare (whose lines cannot be paraphrased ad lib). In fact, memorizing the lines of Shakespearean characters typically takes remarkable amounts of time, even in the range of 300–600 h depending on the size of the part. Oliver and Ericsson (1986) found that actors could reliably recall individual words from a script when given a verbal probe (one to four words with no additional contextual information), and that retrieval was slower when crossing sentence boundaries. This suggests a hierarchical organization for the retrieval structure that allows actors to remember and perform lines in a script, where specific phrases can be directly retrieved as chunks without the need to first recall previous phrases. The case of actors illustrates how LTWM mechanisms can provide an integrated memory representation that mediates exceptional memory performance for rehearsed material, yet may not lead to reproducibly superior performance in encoding new information. Although a structure with retrieval cues may be generated during learning and may mediate performance, actors should not have an advantage over college students at encoding additional new textual information in LTM. This domain illustrates an important point: although skilled performance often transfers to memory tasks, it need not always be accompanied by superior memory performance.
2. Rehearsed Music Performance
Similar to actors, expert pianists must memorize large amounts of information, namely scores of music, for their public performances. Also like actors, pianists must go beyond mere recall of the information and must produce a pleasing musical experience. Chaffin and Imreh (2002) argue that the structure of a music piece possesses a natural hierarchy of movements, sections, subsections, and bars that could serve to organize a set of performance (retrieval) cues in a retrieval structure. Examples of performance cues could include dynamics, tempo, use of pedal, and emotions to be conveyed during performance (the latter being the most effective in Chaffin and Imreh’s analysis). Chaffin and Imreh (2002) found evidence that a concert pianist’s practice was organized around this structure and that the structure was apparent in an incidental recall test two years later. Moreover, Williamon and Valentine (2002) found evidence from systematic observations of practice and interviews that pianists’ segmentation of a to-be-learned piece is hierarchically organized, such as the pianists’ identification of bars denoted “difficult” by their specific section of the piece. Like Chaffin and Imreh (1997), these authors found that practice was stopped most frequently at structural boundaries in the musical score. Hence, evidence suggests that skilled pianists
form an integrated mental representation of a musical piece organized by the structure of the piece itself, so that individual sequences of notes, possibly encoded as patterns in many cases, are associated with structural sections of the piece, which serve as retrieval cues organized into a hierarchy that mirrors the large-scale organization of the piece. Research on deliberate practice in music (Ericsson et al., 1993) shows that highly skilled music performance requires thousands of hours of continued attempts at mastery and requires that the performer always try to correct some specific weakness while preserving other successful aspects of function. With increased skill in monitoring, skilled performers in music master increasing challenges through goal-directed deliberate practice involving problem solving and specialized training techniques (Chaffin & Imreh, 1997; Ericsson, 2002; Gruson, 1988; Nielsen, 1999). In parallel with the findings observed for actors, we would not expect expert musicians to show superior memory for auditorily presented stimuli. Indeed, Sloboda and Parker (1985) did not observe a memory advantage for music experts for presented melodies. It might be expected that musicians who can read music would demonstrate a memory advantage for visually presented notes; however, the superiority of skilled musicians’ recall for written sequences of notes is rather small (in absolute terms) and even extends to a lesser degree to random music notation (Halpern & Bower, 1982; Kauffman & Carlsen, 1989; Sloboda, 1976). Vastly superior memory for music information is observed only when domain-related tasks require such an ability, leading to an explicit motivation to acquire it through deliberate practice. For example, autistic savants who are blind or cannot read music are forced to acquire new music pieces by listening to them on the radio. They must therefore develop the skill of rapidly committing these pieces to memory after only a small number of exposures.
These individuals can display superior music memory and can reproduce a piece of traditionally structured (tonal) music, but not atonal music, after having heard it once or twice (Charness et al., 1988; Miller, 1989; Sloboda, Hermelin, & O’Connor, 1985).
G. EXPERT PERFORMANCE IN DOMAINS WITH LESS PREDICTABLE SITUATIONS AND ITS RELATION TO MEMORY PERFORMANCE
Many domains of expertise do not involve the delivery of a preplanned and rehearsed performance. In any competitive domain, such as chess or sports, it is possible to make some preparations and adjustments for competing with specific opponents, but many novel situations arise. Similarly, a medical doctor in a clinic will not know the patients’ problems in advance and will usually have to react to the current situation. Expert performance in these domains appears to be related to the performers’ ability to create a mental representation that allows them to reason about various courses of action.
1. Soccer
The essence of expert performance in soccer involves having the knowledge of the best action in a given situation and the control to consistently execute that action. Studies of soccer experts have shown that when videos of real match situations are presented to soccer players at the local, national, and international levels, reliable differences are found in the quality of the selected action and in some cases even in the speed of the execution of the action (Helsen & Starkes, 1999; Ward & Williams, 2005). In playing soccer, expertise requires accurate anticipation of plays based on perceptual cues (such as the locations of players on the field and body movement information from the players in possession of the ball) to prepare movements and countermeasures. Evidence suggests that better soccer players have greater selective access to possible alternative plays at points in a game (Williams, Davids, Burwitz, & Williams, 1993). There is compelling evidence that increased duration of deliberate practice in soccer is associated with advancement to the highest levels of competition (Ward, Hodges, Williams, & Starkes, 2004). A detailed account of the microstructure of deliberate practice that causes measurable increases in soccer performance has not yet been developed. Memory for videos of dynamically changing situations in team sports is superior for players at higher levels of competition, not just in soccer (Williams et al., 1993) but also in volleyball (Borgeaud & Abernethy, 1987). Increased memory for snapshots of game situations in basketball (Allard, Graham, & Paarsalu, 1980) is also associated with higher levels of skill. These results are similar to those observed in chess; however, the game situations are not sufficiently predictable to plan several consecutive actions deep, nor is there enough time for extensive planning.
2. Medicine
The task of medical diagnosis can be captured in the laboratory by presenting medical doctors and students with information about specific patients, then comparing their diagnoses with the correct diagnoses, as determined from additional information and tests of the associated patient. Typically, diagnoses require medical experts to review and integrate a large body of evidence, such as the patients’ descriptions of their symptoms and results from the physical examination, as well as data from laboratory tests and other ancillary procedures, before making final diagnoses. Accuracy of medical diagnosis for frequent and representative medical problems typically reaches a stable level after the completion of residency; however, the most experienced experts demonstrate superior diagnosis of more difficult and infrequent cases (Norman, Coblentz, Brooks, & Babcook, 1992;
Schmidt, Norman, & Boshuizen, 1990; see also Patel, Arocha, & Kaufmann, 1994; Patel & Groen, 1991 for evidence of reliably superior medical diagnostic performance). Consistent with an LTWM explanation for what might mediate medical experts’ superior performance, studies asking medical experts, students, and novices to unexpectedly recall information about a patient after making a diagnosis found that information recall was a monotonic function of expertise, and this was found regardless of whether the information was in order or scrambled (Norman et al., 1989). It is likely that medical expertise is similar to other forms of skill, such as chess skill, in that experts encode significant aspects of patient symptoms and test results in LTWM (Ericsson & Kintsch, 1995; Ericsson et al., 2000) in a way that can be flexibly accessed from LTM (cf. Patel et al., 1994) to support reasoning about diagnostic alternatives. Currently, there is very limited knowledge about what types of practice activities are associated with increased diagnostic accuracy other than increased training and specialization. If skill in diagnostic ability leads to better encoding of the diagnostically relevant inferences, it would not necessarily imply increased memory for literal details. In fact, unlike in many other domains, when participants are asked to explicitly memorize patient information for later recall, Schmidt and Boshuizen (1993) found an inverted-U function with expertise. Participants with an intermediate level of skill recalled more information than both novices and more experienced medical experts. However, when Groen and Patel (1988) analyzed the recalled information and coded only the diagnostically relevant higher-level information for the described patients, they found the expected increase in recall as a function of expertise. These findings from medicine again emphasize the importance of studying representative tasks, rather than memory tasks.
Domain skill must be studied in the context of reproducibly superior performance, and although performance often transfers to contrived memory tasks, these tasks need not capture the essence of expertise. In medical skill, the inverted-U function is one example illustrating how counterintuitive findings can arise for these contrived tasks, which may lead investigators away from the actual performance central to their research interests.
H. EXPERT PERFORMANCE INVOLVING CALCULATION AND ITS RELATION TO MEMORY PERFORMANCE
There are several domains of expertise that involve mental calculation. The large number of different problems makes it very difficult to store and later retrieve solutions to problems like 345 × 6789 and the billions of other problems involving other number combinations. There appears to be great consistency in the manner in which expert calculators acquire their skills (Smith, 1983).
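To make the scale of this storage problem concrete, the following Python sketch counts how many distinct multiplication problems exist for given operand lengths and lists the intermediate products that a problem such as 345 × 6789 generates. It assumes the ordinary schoolbook method of digit-by-digit multiplication, which is only an illustration and not necessarily the algorithm that expert calculators use; the function names `partial_products` and `count_problems` are hypothetical and chosen for this example.

```python
def partial_products(a: int, b: int) -> list[int]:
    """Return the place-weighted single-digit products of a * b.

    Assumes positive integers and the schoolbook method (an assumption
    made for illustration, not a claim about expert strategies).
    """
    products = []
    for i, digit_a in enumerate(reversed(str(a))):
        for j, digit_b in enumerate(reversed(str(b))):
            # Each digit pair contributes one intermediate product,
            # scaled by the combined place value of the two digits.
            products.append(int(digit_a) * int(digit_b) * 10 ** (i + j))
    return products


def count_problems(digits_a: int, digits_b: int) -> int:
    """Count ordered operand pairs with exactly the given digit lengths."""
    count_a = 9 * 10 ** (digits_a - 1)  # e.g. 900 three-digit numbers
    count_b = 9 * 10 ** (digits_b - 1)
    return count_a * count_b


parts = partial_products(345, 6789)
print(len(parts))            # 12 intermediate products to keep track of
print(sum(parts))            # 2342205, the product 345 * 6789
print(count_problems(3, 4))  # 8100000 distinct 3-digit by 4-digit problems
print(count_problems(5, 5))  # 8100000000 distinct 5-digit by 5-digit problems
```

Under these assumptions, even a single 3-digit by 4-digit problem involves a dozen intermediate products, and the number of distinct problems reaches the billions once both operands have five digits, which is why retrieving memorized answers is not a workable strategy.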
1. Mental Calculation
High skill in mental calculation is captured by presenting performers with problems that typically involve the multiplication of two multidigit numbers. Mental calculators produce their answers with impressive speed in spite of the fact that the solution of these problems requires many intermediate products and steps that would have to be maintained in immediate memory (Binet, 1894/1966). For instance, the famous mental calculator Inaudi could calculate the product of two 3-digit numbers in less than 10 s, even though most individuals cannot solve such problems quickly even with pencil and paper. Studies of mental calculators have revealed how, through study, these individuals have learned a vast number of numerical facts and procedures. Evidence from concurrent verbal protocols and analyses of errors suggests that these individuals develop techniques for expanding their functional working memory for the intermediate products during calculation by storing the results accessibly in LTM. For instance, Chase and Ericsson (1982) found evidence that a mental calculator would use mnemonics to store some of the intermediate results of his algorithms. Dansereau (1969) had skilled college students think aloud while they solved problems and discovered how some mental calculators used patterns and relations to encode problems and intermediate products. After test sessions with many multiplication problems, Staszewski (1988) found that trained college students recognized and recalled numbers in the original problems and numbers generated as intermediate products. More recently, neuroimaging evidence revealed how a calculation expert recruits areas of the brain associated with episodic LTM. For retrieving the simple products of two numbers (e.g., 3 × 5 = 15), control participants tend to use a left parieto-precentral region plus naming regions in the left anterior insula and right cerebellum (Zago et al., 2001).
More complex calculations by unskilled individuals also recruit visual short-term working memory areas and areas related to mental imagery, such as the left parieto-superior frontal region and bilateral inferior temporal gyri (Butterworth, 2006). However, an expert mental calculator was found to additionally use brain areas implicated in episodic LTM storage, such as right medial frontal and parahippocampal gyri (Pesenti et al., 2001). Hence, converging evidence suggests that STM is rarely used by expert mental calculators, who likely retrieve intermediate products from LTWM, although the precise systems of retrieval cues used by these experts are not yet fully understood. Memory tests of mental calculators show that their memory span for digits is in the 12–15 digit range (Ericsson, 1985). This superior digit span can be accounted for in terms of the mental calculators’ knowledge of numbers. Most of them know each of the numbers between 0 and 999 individually and
know whether they are primes or can be represented as products of primes, such as 210 = 2 × 3 × 5 × 7. Their memory span can thus be described as only three or four different 3-digit numbers (Ericsson & Kintsch, 1995). The superior knowledge about 3-digit numbers and numerical relations among numbers allows mental calculators to memorize lists or matrices of numbers at a faster rate than average college students (Ericsson, 1985).
2. Mental Abacus Calculation
Adding numbers rapidly with an abacus is a common skill in Japan. Some individuals develop the skill of adding numbers without a perceptually available abacus. After a session of several addition problems solved with a mental abacus, memory for the presented problems and intermediate sums is poor (Hatano & Osawa, 1983). There is evidence that it takes considerable time to develop a mental abacus: roughly a year of practice is necessary for expanding the mental abacus by one additional location to allow the calculator to represent larger numbers, such as moving from 5-digit to 6-digit numbers. Memory studies have shown that mental abacus calculation experts have a larger digit span than less skilled abacus calculators, which is consistent with the development of a mental abacus representation. If mental abacus experts are presented with lists of digits for consecutive recall, their recall is only accurate for the most recent list (Hatano & Osawa, 1983). Time to recall digits appears to be independent of their serial position (Hatano, Amaiwa, & Shimizu, 1987), which supports the notion of a retrieval structure flexibly allowing information recall, rather than a chaining strategy. The experimental technique of forcing calculators to engage in a dual task involving articulatory suppression while calculating showed that the suppression did not affect the performance of abacus experts, only that of controls (Hatano & Osawa, 1983; Hatano et al., 1987). It is likely that abacus experts use a retrieval structure similar to the physical abacus, where digits are stored at locations on the mental representation of an abacus. Recent evidence from neuroscience supports the role of LTWM in mental abacus calculation, particularly the involvement of temporal areas for the imagined abacus (Chen et al., 2006).
III. Summary
The essence of expert performance, reflected by superior performance on representative tasks, can be classified according to the cognitive demands of the associated tasks. Some domains require skilled performers to encode large quantities of organized information in LTWM in a way that allows flexible and efficient access to the information during the public performance. Examples of such domains include acting and music performance.
Other domains impose some real-time constraints in less predictable environments, where skilled performers acquire LTWM skills to rapidly encode information to elicit appropriate actions during their performance. For example, during a chess match the opposing chess player will make moves that could not be predicted in advance. The chess players must encode the new chess position into an integrated LTWM structure that allows flexible access to relevant details, facilitates planning operations, and supports rapid decision making. Similarly, in medical diagnosis the experts will more effectively encode the current situation and the presented information. For example, a medical expert will integrate information about patients and use these details to reason about possible diagnoses consistent with all the relevant information. We also examined domains with more severe real-time constraints that involve the calculation of many intermediate products, such as mental calculation. Our central hypothesis is that the demands of the target skill will influence the structure of the acquired skills, which will in turn determine the performance on memory tasks administered to experts and novices. Many domains appear to share similar processing demands and thus require similar mediating mechanisms, and we have suggested some general principles for categorization that can guide us toward more global and detailed theories of performance and consequences for memory tests. The diversity of structures mediating skilled performance would never emerge from studying unskilled individuals nor from examining skilled performance on a single cognitive task, but requires continued exploration of new domains where the acquisition and structure of the expert performance have not yet been examined in detail.
IV. Conclusion
Our chapter has shown that general theories of memory based on basic memory capacities and elementary memory processes do not explain the structure of superior performance of experts in different domains of expertise. In each studied domain, we find that the superior representative performance is mediated by complex acquired mechanisms that have emerged through engagement in deliberate practice activities. Sometimes it is possible to define a memory task that matches, at least partially, the processes and the task demands of the representative task for a given domain, and in those cases the experts exhibit superior memory performance. However, experts may also use alternative processing strategies, such as meaningful mnemonics based on patterns from the respective skill domains, that then lead to superior memory performance yet need not directly mediate their superior skill. Only by examining those tasks capturing the essence of domain
expertise can researchers hope to reveal the underlying mediating mechanisms. Importantly, some domains do not benefit from rapid encoding of surface aspects of the encountered situation and task, and in such domains superior memory may not emerge at all. General theories of expertise, such as that of Simon and Chase (1973), cannot explain these qualitative differences in expert performance in different domains as the slow accumulation of patterns and chunks. Their proposal for expert performance and the performance on associated memory tasks has been shown to be inadequate to explain the diversity of empirical findings (Ericsson & Kintsch, 1995; Gobet & Simon, 1996; Vicente & Wang, 1998). In this chapter, we have adopted a very different and more inductive framework. The general argument is that through training and deliberate practice it is possible to acquire complex mechanisms that mediate performance for representative tasks (Ericsson, 2006b; Ericsson et al., 1993). To account for the diversity of expert performance in the respective task domains and the associated need for working memory and more permanent storage of information, a more versatile and complex framework such as LTWM (Ericsson & Kintsch, 1995; Ericsson et al., 2000) is required. From this perspective, new findings from neuroscience provide intriguing evidence for the qualitative changes in processing associated with skill development. Laboratory research is not limited to studies of unfamiliar and simple tasks. In fact, the expert-performance approach offers a proven methodology for capturing phenomena of high-level skill so that they can be reproduced in the laboratory and in an MRI scanner for analysis and laboratory experiments. In Section I, we discussed how many cognitive scientists point to physics as a model science in their justification for a reductionistic approach to the study of behavior and cognition.
In contrast, we propose that a more appropriate scientific model for the study of skilled and expert performance is biology, where scientists study specific organisms for a better understanding of the cellular and developmental processes that, in general, result in successful adaptation to the ecological environment. The organisms of research on skill are the expert performers in domains of real-world expertise, who have successfully acquired the mechanisms necessary for successful adaptation to the environment, as evidenced by their superior performance on the representative tasks. In line with the expert-performance approach, we believe that the first few steps of the new science of skilled and expert performance will entail capturing different types of performance and then dissecting their structure to identify the integrated mechanisms that have developed in response to months, years, and decades of practice and performance. This chapter has attempted to sketch the diversity of these types of mechanisms and how some of the mechanisms can transfer to superior performance on other domain-related tasks.
ACKNOWLEDGMENT
This chapter was prepared in part with support from the FSCW/Conradi Endowment Fund of Florida State University Foundation to K.A.E.
REFERENCES
Allard, F., Graham, S., & Paarsalu, M. E. (1980). Perception in sport: Basketball. Journal of Sport Psychology, 2, 14–21.
Allard, F., & Starkes, J. L. (1991). Motor-skill experts in sports, dance, and other domains. In K. A. Ericsson and J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 126–152). New York, NY: Cambridge University Press.
Amidzic, O., Riehle, H. J., Fehr, T., Wienbruch, C., & Elbert, T. (2001). Pattern of focal gamma bursts in chess players. Nature, 412, 603.
Bengtsson, S. L., Nagy, Z., Skare, S., Forsman, L., Forsberg, H., & Ullén, F. (2005). Extensive piano practicing has regionally specific effects on white matter development. Nature Neuroscience, 8, 1148–1150.
Bilalič, M. (2006). Acquisition of chess skill. Unpublished doctoral dissertation, Oxford, UK: Oxford University.
Binet, A. (1894/1966). Mnemonic virtuosity: A study of chess players. [Trans. M. L. Simmel & S. B. Barron.]. Genetic Psychology Monographs, 74, 127–162.
Book, W. F. (1925a). Learning to typewrite. New York, NY: The Gregg Publishing Co.
Book, W. F. (1925b). The psychology of skill. New York, NY: The Gregg Publishing Co.
Borgeaud, P., & Abernethy, B. (1987). Skilled perception in volleyball defense. Journal of Sport Psychology, 9, 400–406.
Bryan, W. L., & Harter, N. (1897). Studies in the physiology and psychology of the telegraphic language. Psychological Review, 4, 27–53.
Bryan, W. L., & Harter, N. (1899). Studies on the telegraphic language: The acquisition of a hierarchy of habits. Psychological Review, 6, 345–375.
Butterworth, B. (2006). Mathematical expertise. In K. A. Ericsson, N. Charness, P. J. Feltovich, and R. R. Hoffman (Eds.), The Cambridge handbook of expertise and expert performance (pp. 553–568). New York, NY: Cambridge University Press.
Campitelli, G., & Gobet, F. (2004). Adaptive expert decision making: Skilled chess players search more and deeper. ICGA Journal, 27, 209–216.
Chabris, C. F., & Hearst, E. S. (2003). Visualization, pattern recognition, and forward search: Effects of playing speed and sight of the position on grandmaster chess errors. Cognitive Psychology, 27, 637–648.
Chaffin, R., & Imreh, G. (1997). “Pulling teeth and torture”: Musical memory and problem solving. Thinking and Reasoning. Special Issue: Expert thinking, 3, 315–336.
Chaffin, R., & Imreh, G. (2002). Practicing perfection: Piano performance as expert memory. Psychological Science, 13, 342–349.
Charness, N. (1976). Memory for chess positions: Resistance to interference. Journal of Experimental Psychology: Human Learning & Memory, 2, 641–653.
Charness, N. (1979). Components of skill in bridge. Canadian Journal of Psychology, 33, 1–16.
Charness, N. (1981). Aging and skilled problem solving. Journal of Experimental Psychology: General, 110, 21–38.
Charness, N. (1988). The role of theories of cognitive aging: Comment on Salthouse. Psychology and Aging, 3, 17–21.
Charness, N., Krampe, R. Th., & Mayr, U. (1996). The role of practice and coaching in entrepreneurial skill domains: An international comparison of life-span chess skill acquisition. In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games (pp. 51–80). Mahwah, NJ: Erlbaum.
Charness, N., Tuffiash, M., Krampe, R., Reingold, E., & Vasyukoya, E. (2005). The role of deliberate practice in chess expertise. Applied Cognitive Psychology, 19, 151–165.
Chase, W. G., & Ericsson, K. A. (1981). Skilled memory. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 141–189). Hillsdale, NJ: Lawrence Erlbaum Associates.
Chase, W. G., & Ericsson, K. A. (1982). Skill and working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (pp. 1–58). New York, NY: Academic Press.
Chase, W. G., & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4, 55–81.
Chen, F., Hu, Z., Zhao, X., Wang, R., Yang, Z., Wang, X., et al. (2006). Neural correlates of serial abacus mental calculation in children: A functional MRI study. Neuroscience Letters, 403, 46–51.
Dansereau, D. F. (1969). An information processing model of mental multiplication. Dissertation Abstracts International, 30, 1916.
De Groot, A. (1978). Thought and choice in chess. The Hague, The Netherlands: Mouton.
Djakow, J. N., Petrowski, N. W., & Rudik, P. A. (1927). Psychologie des Schachspiels [The psychology of chess]. Berlin, Germany: Walter de Gruyter.
Doll, M., & Mayr, U. (1987). Intelligence and success in chess playing: An examination of chess experts/ Intelligenz und Schachleistung: eine Untersuchung an Schachexperten. Psychologische Beitrage, 29, 270–289.
Dukes, W. F. (1965). N = 1. Psychological Bulletin, 64, 74–79.
Ebbinghaus, H. (1885/1964). Memory: A contribution to experimental psychology. Oxford, UK: Dover.
Elbert, T., Pantev, C., Wienbruch, C., Rockstroh, B., & Taub, E. (1995). Increased cortical representation of the fingers of the left hand in string players. Science, 270, 305–307.
Engle, R. W., & Bukstel, L. H. (1978). Memory processes among bridge players of differing expertise. American Journal of Psychology, 91, 673–689.
Ericsson, K. A. (1985). Memory skill. Canadian Journal of Psychology, 39, 188–231.
Ericsson, K. A. (1988). Analysis of memory performance in terms of memory skill. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 4, pp. 137–179). Hillsdale, NJ: Erlbaum.
Ericsson, K. A. (1996). The acquisition of expert performance: An introduction to some of the issues. In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games (pp. 1–50). Mahwah, NJ: Erlbaum.
Ericsson, K. A. (2002). Attaining excellence through deliberate practice: Insights from the study of expert performance. In M. Ferrari (Ed.), The pursuit of excellence in education (pp. 21–55). Hillsdale, NJ: Erlbaum.
Ericsson, K. A. (2003). Exceptional memorizers: Made, not born. Trends in Cognitive Sciences, 7, 233–235.
Ericsson, K. A. (2006a). Protocol analysis and expert thought: Concurrent verbalizations of thinking during experts’ performance on representative task. In K. A. Ericsson, N. Charness, P. Feltovich, and R. R. Hoffman (Eds.), Cambridge handbook of expertise and expert performance (pp. 223–242). Cambridge, UK: Cambridge University Press.
Ericsson, K. A. (2006b). The influence of experience and deliberate practice on the development of superior expert performance. In K. A. Ericsson, N. Charness, P. Feltovich, and R. R. Hoffman (Eds.), Cambridge handbook of expertise and expert performance (pp. 685–706). Cambridge, UK: Cambridge University Press.
Ericsson, K. A., Charness, N., Feltovich, P., & Hoffman, R. R. (2006). The Cambridge handbook of expertise and expert performance. New York, NY: Cambridge University Press.
Memory as an Aspect of Skilled and Expert Performance
Ericsson, K. A., Chase, W. G., & Faloon, S. (1980). Acquisition of a memory skill. Science, 208, 1181–1182.
Ericsson, K. A., Delaney, P., Weaver, G., & Mahadevan, S. (2004). Uncovering the structure of a memorist’s superior “basic” memory capacity. Cognitive Psychology, 49, 191–237.
Ericsson, K. A., & Harris, M. S. (1990). Expert chess memory without chess knowledge: A training study. Poster presented at the 31st Annual Meeting of the Psychonomic Society, New Orleans, LA.
Ericsson, K. A., & Kintsch, W. (1995). Long‐term working memory. Psychological Review, 102, 211–245.
Ericsson, K. A., Krampe, R. Th., & Tesch‐Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100, 363–406.
Ericsson, K. A., & Lehmann, A. C. (1996). Expert and exceptional performance: Evidence of maximal adaptations to task constraints. Annual Review of Psychology, 47, 273–305.
Ericsson, K. A., & Oliver, W. L. (1988). Methodology for laboratory research on thinking: Task selection, collection of observation and data analysis. In R. J. Sternberg and E. E. Smith (Eds.), The psychology of human thought (pp. 392–428). Cambridge, UK: Cambridge University Press.
Ericsson, K. A., Patel, V., & Kintsch, W. (2000). How experts’ adaptations to representative task demands account for the expertise effect in memory recall: Comment on Vicente and Wang (1998). Psychological Review, 107, 578–592.
Ericsson, K. A., & Polson, P. G. (1988a). Memory for restaurant orders. In M. Chi, R. Glaser, and M. Farr (Eds.), The nature of expertise (pp. 23–70). Hillsdale, NJ: Erlbaum.
Ericsson, K. A., & Polson, P. G. (1988b). An experimental analysis of a memory skill for dinner orders. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 305–316.
Ericsson, K. A., & Smith, J. (1991). Prospects and limits in the empirical study of expertise: An introduction. In K. A. Ericsson and J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 1–38). Cambridge, UK: Cambridge University Press.
Ericsson, K. A., & Staszewski, J. (1989). Skilled memory and expertise: Mechanisms of exceptional performance. In D. Klahr and K. Kotovsky (Eds.), Complex information processing: The impact of Herbert A. Simon (pp. 235–267). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Fitts, P., & Posner, M. I. (1967). Human performance. Belmont, CA: Brooks/Cole.
Gibson, E. J., & Pick, A. D. (2000). An ecological approach to perceptual learning and development. New York, NY: Oxford University Press.
Gobet, F., & Simon, H. A. (1996). Templates in chess memory: A mechanism for recalling several boards. Cognitive Psychology, 31, 1–40.
Grabner, R. H., Neubauer, A. C., & Stern, E. (2006). Superior performance and neural efficiency: The impact of intelligence and expertise. Brain Research Bulletin, 69, 422–439.
Groen, G. J., & Patel, V. L. (1988). The relationship between comprehension and reasoning in medical expertise. In M. T. H. Chi, R. Glaser, and M. J. Farr (Eds.), The nature of expertise (pp. 287–310). Hillsdale, NJ: Erlbaum.
Gruson, L. M. (1988). Rehearsal skill and musical competence: Does practice make perfect? In J. A. Sloboda (Ed.), Generative processes in music (pp. 91–112). Oxford, UK: Clarendon Press.
Halpern, A. R., & Bower, G. H. (1982). Musical expertise and melodic structure in memory for musical notation. American Journal of Psychology, 95, 31–50.
Hartley, T., Maguire, E. A., Spiers, H. J., & Burgess, N. (2003). The well‐worn route and the path less traveled: Distinct neural bases of route following and wayfinding in humans. Neuron, 37, 877–888.
Hatano, G., Amaiwa, S., & Shimizu, K. (1987). Formation of a mental abacus for computation and its use as a memory device for digits: A developmental study. Developmental Psychology, 23, 832–838.
Hatano, G., & Osawa, K. (1983). Digit memory of grand experts in abacus‐derived mental calculation. Cognition, 15, 95–110.
Hill, N. M., & Schneider, W. (2006). Brain changes in the development of expertise: Neuroanatomical and neurophysiological evidence about skill‐based adaptations. In K. A. Ericsson, N. Charness, P. Feltovich, and R. R. Hoffman (Eds.), Cambridge handbook of expertise and expert performance (pp. 223–242). Cambridge, UK: Cambridge University Press.
Hunt, E., & Love, T. (1972). How good can memory be? In A. W. Melton and E. Martin (Eds.), Coding processes in human memory (pp. 237–260). New York, NY: Holt.
Intons‐Peterson, M. J., & Smyth, M. M. (1987). The anatomy of repertory memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 490–500.
Kauffman, W. H., & Carlsen, J. C. (1989). Memory for intact music works: The importance of music expertise and retention interval. Psychomusicology, 8, 3–19.
Luria, A. R. (1968). The mind of a mnemonist. New York, NY: Avon.
Maguire, E. A., Valentine, E. R., Wilding, J. M., & Kapur, N. (2003). Routes to remembering: The brains behind superior memory. Nature Neuroscience, 6, 90–95.
Masunaga, H., & Horn, J. (2001). Expertise and age‐related changes in components of intelligence. Psychology and Aging, 16, 293–311.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits of our capacity for processing information. Psychological Review, 63, 81–97.
Miller, L. K. (1989). Musical savants: Exceptional skill in the mentally retarded. Hillsdale, NJ: Erlbaum.
Neisser, U. (1976). Cognition and reality: Principles and implications of cognitive psychology. New York, NY: W H Freeman/Times Books/ Henry Holt & Co.
Newell, A., & Simon, H. A. (1972). Human problem solving. Oxford, UK: Prentice‐Hall.
Nielsen, S. (1999). Regulation of learning strategies during practice: A case study of a single church organ student preparing a particular work for a concert performance. Psychology of Music, 27, 218–229.
Noice, H. (1993). Effects of rote versus gist strategy on the verbatim retention of theatrical scripts. Applied Cognitive Psychology, 7, 75–84.
Noice, H., & Noice, T. (1999). Long‐term retention of theatrical roles. Memory, 7, 357–382.
Noice, H., & Noice, T. (2006). Artistic performance: Acting, ballet, and contemporary dance. In K. A. Ericsson, N. Charness, P. J. Feltovich, and R. R. Hoffman (Eds.), The Cambridge handbook of expertise and expert performance (pp. 553–568). New York, NY: Cambridge University Press.
Noice, T., & Noice, H. (2002a). The expertise of professional actors: A review of recent research. High Ability Studies, 13, 7–20.
Noice, T., & Noice, H. (2002b). Very long‐term recall and recognition of well‐learned material. Applied Cognitive Psychology, 16, 259–272.
Norman, D. A., Coblentz, C. L., Brooks, L. R., & Babcook, C. J. (1992). Expertise in visual diagnosis: A review of the literature. Academic Medicine (Rime Supplement), 67, 78–83.
Norman, G. R., Brooks, L. R., & Allen, S. W. (1989). Recall by expert medical practitioners and novices as a record of processing attention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1166–1174.
Oliver, W., & Ericsson, K. A. (1986). Repertory actor’s memory for their parts. Proceedings of the Eighth Annual Conference of the Cognitive Science Society (pp. 399–406). Hillsdale, NJ: Erlbaum.
Patel, V. L., Arocha, J. F., & Kaufmann, D. R. (1994). Diagnostic reasoning and medical expertise. In D. Medin (Ed.), The psychology of learning and motivation (Vol. 30, pp. 187–251). New York, NY: Academic Press.
Patel, V. L., & Groen, G. J. (1991). The general and specific nature of medical expertise: A critical look. In K. A. Ericsson and J. Smith (Eds.), Toward a general theory of expertise (pp. 93–125). Cambridge, MA: Cambridge University Press.
Pesenti, M., Zago, L., Crivello, F., Mellet, E., Samson, D., Duroux, B., et al. (2001). Mental calculation in a prodigy is sustained by right prefrontal and medial temporal areas. Nature Neuroscience, 4, 103–107.
Reitman, J. S. (1976). Skilled perception in Go: Deducing memory structures from inter‐response times. Cognitive Psychology, 8, 336–356.
Saariluoma, P. (1989). Chess players’ recall of auditorily presented chess positions. European Journal of Cognitive Psychology, 1, 309–320.
Schmidt, H. G., & Boshuizen, H. P. A. (1993). On the origin of intermediate effects in clinical case recall. Memory & Cognition, 21, 338–351.
Schmidt, H. G., Boshuizen, H. P. A., & van Breukelen, G. J. (2002). Long‐term retention of a theatrical script by repertory actors: The role of context. Memory, 10, 21–28.
Schmidt, H. G., Norman, G. R., & Boshuizen, H. P. A. (1990). A cognitive perspective on medical expertise: Theory and implications. Academic Medicine, 65, 611–621.
Schultetus, S., & Charness, N. (1999). Recall vs. position evaluation revisited: The importance of position‐specific memory in chess skill. American Journal of Psychology, 112(4), 555–569.
Simon, H. A., & Chase, W. G. (1973). Skill in chess. American Scientist, 61, 394–403.
Slamecka, N. J. (1985a). Ebbinghaus: Some associations. Journal of Experimental Psychology: Learning, Memory, & Cognition, 11, 414–435.
Slamecka, N. J. (1985b). Ebbinghaus: Some rejoinders. Journal of Experimental Psychology: Learning, Memory, & Cognition, 11, 496–500.
Sloboda, J. (1976). Visual perception of musical notation: Registering pitch symbols in memory. Quarterly Journal of Experimental Psychology, 28, 1–16.
Sloboda, J. A., Hermelin, B., & O’Connor, N. (1985). An exceptional musical memory. Music Perception, 3, 155–169.
Sloboda, J. A., & Parker, H. H. (1985). Immediate recall of melodies. In I. Cross, P. Howell, and R. West (Eds.), Musical structure and cognition (pp. 143–167). New York: Academic Press.
Smith, S. B. (1983). The great mental calculators: The psychology, methods, and lives of calculating prodigies past and present. New York, NY: Columbia University Press.
Staszewski, J. J. (1988). Skilled memory and expert mental calculation. In M. T. H. Chi, R. Glaser, and M. J. Farr (Eds.), The nature of expertise (pp. 71–128). Hillsdale, NJ: Erlbaum.
Tuffiash, M., Roring, R. W., & Ericsson, K. A. (in press). Expert world play: Capturing and explaining superior reproducible task performance.
Unterrainer, J. M., Kaller, C. M., Halsband, U., & Rahm, B. (2006). Planning abilities and chess: A comparison of chess and non‐chess players on the Tower of London task. British Journal of Psychology, 97, 299–311.
van der Maas, H. L. J., & Wagenmakers, E.‐J. (2004). A psychometric analysis of chess expertise. American Journal of Psychology, 118, 29–60.
Vicente, K. J., & Wang, J. H. (1998). An ecological theory of expertise effects in memory recall. Psychological Review, 105, 33–57.
Volke, H. J., Dettmar, P., Richter, P., Rudolf, M., & Buhss, U. (2002). On‐coupling and off‐coupling of neocortical areas in chess experts and novices as revealed by evoked EEG coherence measures and factor‐based topological analysis—a pilot study. Journal of Psychophysiology, 16, 23–36.
Ward, P., Hodges, N. J., Williams, A. M., & Starkes, J. L. (2004). Deliberate practice and expert performance. In A. M. Williams and N. J. Hodges (Eds.), Skill acquisition in sport (pp. 231–258). London, UK: Routledge.
Ward, P., & Williams, A. M. (2005). Perceptual and cognitive skill development in soccer: The multidimensional nature of expert performance. Journal of Sport & Exercise Psychology, 25, 93–111.
Wilding, J. M., & Valentine, E. R. (1997). Superior memory. Essays in cognitive psychology. Hove, UK: Psychology Press/Erlbaum.
Williamon, A., & Valentine, E. R. (2002). The role of retrieval structures in memorizing music. Cognitive Psychology, 44, 1–32.
Williams, A. M., Davids, K., Burwitz, L., & Williams, J. G. (1993). Cognitive knowledge and soccer performance. Perceptual and Motor Skills, 76, 579–593.
Zago, L., Pesenti, M., Mellet, E., Crivello, F., Mazoyer, B., & Tzourio‐Mazoyer, N. (2001). Neural correlates of simple and complex mental calculation. NeuroImage, 13, 314–327.
INDEX A Accuracy and latency models of associative recognition, 323–345 decision process, 326–327 dual-process decision strategies, 335–345 encoding of associative information, 324 familiarity-based performance, 327–333 familiarity-based retrieval process, 324–326 recollection, 333–335 Acting rehearsed performance, relation to memory performance, 366–367 Actual memories, 46, 189 Adaptive control of thought (ACT) model, 147–148, 277 Adult life span development and memory. See Memory and adult life span development Aging and memory adaptive control and skilled cognition, 256–257 brain mechanisms, 254–255 control and use of value assignment, 257–259 implications for training, 259–260 individual differences, 257–259, 313–318, 345 two-parameter model of, 290 value directed remembering, 239, 259–260 value implications, 232, 254–260 Alzheimer’s disease, 288 Anterograde amnesia, 288, 301 Associative memory, 149 Associative recognition accuracy and latency models, 323–345 decision process, 326–327
dual-process decision strategies, 335–345 encoding of associative information, 324 familiarity-based performance, 327–333 familiarity-based retrieval process, 324–326, 335 recollection, 333–335 sampling-without-replacement model, 325–333 sampling-with-replacement model, 325–333, 335 classical models, 319–323 compound-cue model, 319–321, 328 compound-cue random walk model, 320 diffusion model, 319 dual-process models, 321–323, 329 independent-cue models, 320–321 random walk model, 320 REM dual-process model, 322 response-delay model, 343–344 signal detection models, 320 cognitive neuroscience approach, 346 free-respond procedure, 319, 322, 327, 329, 332, 345 individual differences in, 313–318, 345 signal-to-respond procedure, 319–320, 322, 327–329, 335, 337–338 speed-accuracy trade-off in, 313–318 tasks, 294 testing procedures, 318–319 Attentional process, in prospective memory, 149–154, 166 Automatic associative memory system, 153
B Basic memory processes in prospective memory, 149–154 traditional approach searching for, 354–357, 373 Benzodiazepine, 301 Blindfold chess, 354, 365 Brain, intense training influence on, 359 Brown-Peterson paradigm, 121–122, 163
C Chess players Binet’s classic report, 354–355 mechanisms mediating chess proficiency of, 363–366 memory recall advantage of, 356 mnemonic virtuosity, 354 performance and IQ, 358 psychometric abilities of, 355 skilled performance, 363–366 superior performance of, 360–361 Children’s eyewitness testimony, 31–33 Chunking theory assumptions, 356 Classical models, of associative recognition, 319–323 compound-cue model, 319–321 compound-cue random walk model, 320 dual-process models, 321–323 independent-cue models, 320–321 random walk model, 320 REM dual-process model, 322 signal detection models, 320 COBRA (COmpetition Between Reward and Accuracy), 69–70 COBRM (COmpetition Between Reward and probability Matching), 69–70 Cognitive aging, lifespan theories of, 226–229 Cognitive processes, information-processing models of, 353 Compound-cue model, of associative recognition, 319–321 Confidence and accuracy relationship item-based memory decisions, 113–126 dissociations, 114, 118–125 1D-SDT model of 2AFC, 115–118 inversion paradigm, 115–118 scene manipulations, 113–115, 121 Confidence ratings and response bias, 84–88
Controlled recognition process, in prospective memory, 149–151 Cue-recall test, 187–188
D Decision making in associative recognition, 326–327 cognitive processes of, 352 Deese/Roediger/McDermott (DRM) paradigm, 72, 74, 249–250, 252 De Groot’s paradigm, 360 Diamond–water paradox, 231, 261 Digital memory, 189 Directed recognition process, in prospective memory, 149–151 Direct retrieval strategy, 274 1D-SDT model, 102, 121–131, 138–140 of 2AFC, 115–118 item-based memory decisions, 96–100 optimal decision criterion in, 126 and source memory in item memory judgements, 132 Dual-process decision strategies in associative recognition, 335–345 models of associative recognition, 321–323 theories of recognition decisions, 98
E Emotional information, processing of, 244–245 Emotional valence and subjective memorability, 77–78 Encoding of associative information, in associative recognition, 324 Encoding strategic decisions about memory, 178–189, 209–210 encode material of greater interest and value, 179–181 encoding to control memory, 188–189 learning about, 185–188 association and categorization strategies, 187–188 encoding strategies, 186–187 stimulus characteristics, 185–186 material selection for encoding, 179–181, 210 means of encoding, 181–185
controlling processing, 181–183 self-scheduling of study events, 183–185 Episodic memory, individual differences in, 313–345 Evaluative processing concept of, 232–233 memory performance in older adults, 225–226, 231 metacognition and, 242–245 model and value directed remembering, 242–245 as skilled cognition in older adults, 239–241 slowing cognitive process and, 242–245 Experimental manipulations, 361 Expertise, influential formal theory of, 355–356 Expert-performance approach, 351–353, 360 in domains with less predictable situation, 368–370 in medicine, 369–370, 373 in soccer, 369 to identify mediating mechanism, 361–366 in chess, 363–366 memory expertise, 362–363 to individuals, 363 involving calculation, 370–372 mental abacus calculation, 372 mental calculation, 371–372 process-tracing studies of, 352, 362–363 rehearsed performance acting, 366–367 displaying prepared performance, 366–367 music performance, 367–368, 372 relation to performance memory, 368–370 scientific study of, 359–360, 373–374 External memory, 189, 211 Eye-movement sequences analysis, 353, 361 Eyewitness testimony, 3–4
F False-alarm rates (FARs), 63, 69, 75, 79–80, 101–103, 105, 108–111, 127–128, 190–192, 206–208, 294, 296, 317, 320–322, 328, 331, 333, 336–339, 341 Familiarity-based performance, in associative recognition, 327–333 Familiarity-based retrieval process, in associative recognition, 324–326
Fan effect experts paradox, 273–274, 291 mechanistic account of retrieval effects, 276–277 on memory retrieval, 273–281, 291, 304, 306 with real-world knowledge, 275–276 variability and selection strategy, 274–275 Feedback contingencies, in criterion placement in item-based memory decisions, 102–104 Fill-in-the-blank cued-recall test, 186 Focusing, concept of, 233 Forced-choice recognition, 6–7 Forced-report tests, 6 Forgetting curves, showing actual memory accuracy performance, 46 Free-recall test, 31, 187–188 Free-recognition test, 31 Free-report conditions, 6–7 Free-report memory performance, 7, 9 Functional neuroimaging of source memory, and item memory, 132–137 Fuzzy-trace theory, 249–250
G Gist-based processing, 250, 262 Global-matching process, 327–328 Grain size, concept of, 232–233
H Heuristics and Biases approach, to judgment and decision making, 99, 139 High-frequency words, memory representations, 281, 285, 289, 291–294, 297, 299–301, 303, 305–306 Hit/false alarm correlation, item-based memory decisions, 127–130 Hit rate (HR), 206, 208 Human cognitive system, 147 Human information processing, theory for, 356 Human memory characteristics of, 190 generalization gradients, 190 matching process, 190–192 system, 189 Hypermnesia phenomenon, 201
I Independent-cue models, of associative recognition, 320–321 Individual differences, in associative recognition, 313–318, 345 Information processing powerful effects, 357–359 theory for, 356 Informativeness-accuracy tradeoff function, 204 Inversion paradigm, of confidence and accuracy relationship, 115–118 Involuntary memory, 152–154 Iowa Gambling Task, 230 Item-based memory decisions characteristics and neural substrates, 100–138 confidence and accuracy relationship, 113–126 context vs. item memory judgements and PFC, 132–138 recognition criterion by individual skill, 126–132 recognition criterion lability during testing, 100–113 definition of, 95–96 1D-SDT model, 96–100 model of, 95–140 simple decision model, 96–100 Item-based memory discrimination, 96–97 Item memory judgements and PFC context memory and, 96, 99–100, 132–138 1D-SDT framework and source memory, 132 functional neuroimaging of source memory vs. item memory, 132–137 item-based memory decisions, 132–138
J Judgment and decision making, Heuristics and Biases approach to, 99 Judgments of learning (JOL), and encoding manipulations, 182, 186, 205
K Knowing effects feeling, 277
L Labor-in-vain effect, 316 Latencies, analysis of, 353 Latency models of associative recognition. See Accuracy and latency models of associative recognition Learning first, cognitive processes of, 352 Life logging/lifeloggers, prospect of, 178, 189, 201, 209–211 Lifelong experience consequences on memory encoding, 300–301 costs on memory retrieval, 289–290, 304 Lifespan theories, of cognitive aging, 226–229 Likelihood ratio rule, 86 Lockstep model, 86 Long-term memory (LTM), 356–357, 367, 370–371 Long-term working memory (LTWM), 357–358, 365, 367, 370, 372–374 Lose-shift strategy, 201 Low-frequency words, memory representations, 278, 283, 285, 288–289, 291, 293–294, 296–301, 305–306
M Medicine, expert-performance approach in, 369–370, 373 Memorability heuristics, in item-based memory decisions, 101–102 Memory. See also Encoding strategic decisions access strategic decisions, 189–202, 210–212 access to traces of, 178, 190–192 accuracy, 2–3 accuracy-informativeness trade-off, 40–41 and aging, 254–260 approaches to, 2–5 autobiographical, 3–4 characterization and control, 177–179 chunking and clustering in, 183 clinical impairment, 35–36 cognitive processes of, 352 contextualistic views of, 124, 168 control of, 189 cueing, 36–37 digital, 189 distortions and fabrications, 3 encoding strategic decisions, 178–189, 209–210
episodic, 227 episodic reporting over time, 45–48 errors in, 189 evidence evaluation, 101 expert performance, 362–363 external, 189, 211 false, 3–4 flashbulb, 4 generation effect, 182, 186 grain size control, 39–48 accuracy-informativeness trade-off, 40–41 accuracy-quantity trade-off, 41 empirical evidence, 42–45 in episodic memory reporting over time, 45–48 integrated model and report option, 48–52 regulation in old age, 48 relative expected-utility maximizing model, 42 and report option, 48–52 satisficing and utility maximizing model, 41–42, 44 impairment in old age, 33–35 impairments in older adults, 225–227 implicit and explicit effects, 306 individual differences in, 313–318 integrated model of grain size and report option, 48–52 coarse-grained answer, 50–51 need for informativeness criterion, 51–52 satisficing model, 48–50 interaction with, 177–179 laws of, 271 long-term storage of, 176 metacognitive monitoring and control processes role in performance, 47 and metamemory illusions, 3 for nonsense syllables, 354 performance, 181, 185, 188–189, 196, 200, 225–226, 313 age influences on, 226–229 in domains with less predictable situation relation to, 368–370 evaluative processing by older adults in, 213, 225–226 experience effect on, 273–274 expert-performance involving calculation relation to, 370–372 rehearsed performance relation to, 366–368
sources of variability in, 314 value influences on, 225–227, 231–239 postaccess decision processes, 202–209 psychometric testing, 37–39 rational analysis of, 199 regulation of accuracy and quantity performance model, 8, 10 rememberer role, 3–4 reporting, 4–6 (see also Memory reporting) representations of high-frequency words, 281, 285, 289, 291–294, 297, 299–301, 303, 305–306 low-frequency words, 278, 283, 285, 288–289, 291, 293–294, 296–301, 305–306 response bias in recognition, 61–90 schema-based errors and, 3 self-spacing, 183–184 semantic, 305 sensitivity, 61–62 separable processes, 175 as skilled cognition perspective, 176 spatial, 3 strategic regulation of, 1–2 strengths for targets and lures, 61–63, 70–71, 78 structural changes impact on, 314–315, 346 subjective experience, 4–5 systems, 175 value and grain size at retrieval, 252–253 Memory access as cognitive skill, 201–202 decisions about, 192–199 continuing and discontinuing search of memory, 197–199 effects of incentives, 192–193, 199 output order, 194–195 probes for, 195–198 retrieval access, 193 retrieval and plausible inference, 193–194 retrieval plans, 194–195 search order, 194–195 learning about, 199–201 retrieval plan formulation, 200–201 self-testing strategies, 200 matching and retrieval, 190–192 means of, 190–192 processes, 178–179 response times, 196–197
strategic decisions about, 189–202, 210–212 time pressure influence on, 190–192 to traces of, 178, 190–192 Memory and adult life span development, 226–231 lifespan theories of cognitive aging, 226–229 Jenkins’s tetrahedral model of memory experiments, 228 SOC framework, 227 SST framework, 227–228 motivated cognition and goals of older adults, 229–231 Memory decisions. See Item-based memory decisions Memory encoding, experience affects on, 272, 291–304 chunking, 301–303 enabling utilization, 301–303 low-frequency stimuli, 301–303 minimal lifelong experience consequences, 300–301 model with new encoding assumptions, 297–300 SAC augmentation, 294–297, 304, 307 partial match and spurious recollection, 296–297 strengthening concept limit, 296 WM and prior experience interaction affect, 294–297 Memory reporting accuracy, 6 dependability, 6 forced-report memory performance, 23–24, 27–28, 32 free-report memory performance, 7, 9, 12, 15–17, 20–21, 24, 27–29, 32, 48 informativeness in, 5–6 input-bound and output-bound measures, 6 metacognitive framework, 6–29 applications of, 29–39 children’s eyewitness testimony, 31–33 clinical memory impairment, 35–36 encoding specificity and memory cueing, 36–37 impairment in old age, 33–35 psychometric testing, 37–39 recall-recognition paradox, 30–31 model to control report option, 8–12 monitoring
accuracy and quantity performance, 10–12, 15–16 and control mechanism, 9 effectiveness role, 12, 16–20, 24 and retention, 17, 19–20 quantity-accuracy profile (QAP) methodology, 12–14, 20–30 quantity-accuracy trade-off, 9–10, 12, 15, 23, 32 simulation analyses, 10 strategic control, 6–29 empirical evidence, 12, 15–21 model, 8–12 QAP methodology, 12–14, 20–30 of report option, 8–12 Type-2 SDT, 27–29 Memory retrieval, experience affects on, 272–277, 304, 306 costs of lifelong experience on, 289–290, 304 direct retrieval strategy for, 274–275, 290 evidence for SAC explanation, 286–290, 304 costs of lifelong experience on retrieval, 289–290, 304 source memory studies, 289 using synthetic amnesia, 288–289 experience hurts retrieval, 290–291 fan effect, 273–277, 291, 304, 306 experts paradox, 273–274, 291 mechanistic account of retrieval effects, 276–277 with real-world knowledge, 275–276, 304 variability and selection strategy, 274–275 inference strategy for, 274–275 recognition memory and, 277–281 SAC model, 277–281 of word recognition, 281–286 word frequency mirror effect, 281–286 Mental abacus calculation, 372 Mental calculation, 371–372 Metacognition, 8, 10, 186 Metamemory and prospective memory, 155–165, 170 delayed execution of retrieved intentions, 162–164 monitoring cost, 164–165 sensitivity to ongoing task encouraging focal processing of target cue, 155–160
Midazolam, 288–289, 294, 301–302 Mirror effect, 206–207, 281–286, 288–289, 294, 305–306 Monitoring effectiveness indices, 11 Motivational selectivity, concept of, 233 Multiprocess theory, in prospective memory, 154–155 Music performance, relation to memory performance, 367–368, 372
N Negative fan effect, 275 Neyman–Pearson decision process, 208
O Old age, memory impairment in, 33–35 Older adults adaptive decision making by, 230 associative memory impairments, 246–247, 250–251, 261–262 evaluative processing in memory performance, 225–226, 231 metacognition and, 242–245 model and value directed remembering, 242–245 as skilled cognition in, 239–241 slowing cognitive process and, 242–245 false memory, 249–250 flexible remembering, 249–250 implications for training, 259–260 individual differences, 257–259 memory, value and grain size at retrieval, 252–253 memory impairments, 225–226 and value, 246–247, 250–251, 261–262 motivated cognition and goals of, 229–231 proper names as low value information, 250–251 realistic reliance on memory and reasoning, 230 recollection, familiarity and value, 247–248 selectivity in use of memory by, 234–239 value directed remembering, 239, 259–260 memory and grain size at retrieval, 252–253
and memory impairments, 246–247, 250–251, 261–262 as memory modifiers for, 231–239, 241–242 motivation and emotional priority for, 241–242 recollection and familiarity, 247–248 and use of memory by, 234–239 One dimensional signal detection theory. See 1D-SDT model
P Paired associate learning and cued recall, 277 Perception, cognitive process of, 352 Perceptual match effects, 277 Perceptual source memory, 133 Positioning decision criteria, in item-based memory decisions, 105–112 Positivity effect, 229 Postaccess decision processes, in memory, 202–209 criterion placement and adjustment in recognition, 204–207 criterion shifts in, 208–209 means of control over memory, 209 memory decision making, 207–209 output grain, 204 stimulus memorability and mirror effect, 206–207 suppression of output, 202–204 Practice benefits retention, laws of memory, 271 Prefrontal cortex (PFC), context vs. item memory judgements and, 96, 99–100, 132–138 Preparatory attentional processes and memory (PAM) theory, 150, 154 Priority-binding theory, 251 Problem solving, cognitive process of, 352 Prospective and Retrospective Memory Questionnaire (PRMQ), 167 Prospective memory basic memory and attentional processes in, 149–154, 166 behavioral and cognitive neuroscience studies, 151 characteristic of, 146–147, 170 controlled recognition process, 149–151 deficits in, 145 directed recognition process, 149–151
ERP study, 152 focal nature of, 159–160, 168 future directions, 165–170 importance, 146, 155, 160–162 mechanism for, 147–148, 165–166 metamemory and, 155–165, 170 delayed execution of retrieved intentions, 162–164 monitoring cost, 158, 164–165 sensitivity to ongoing task encouraging focal processing of target cue, 155–160 multiprocess theory, 154–155, 166 retrospective memory, 149 spontaneous recognition process, 151–152 spontaneous reflexive associative memory process, 152–154 system, 147–148 task importance, 155, 158, 160–162, 164, 166, 168–169 Protocol analysis, 353, 361
Q Quantity-accuracy profile (QAP) methodology curves, 22–25 for strategic control of memory reporting, 12–14, 20–30, 44
R Random walk model, of associative recognition, 320 Rational choice theory, 99, 139 Raven’s matrices, 358 Real memorability, 78 Real-world knowledge, fan effect with, 275–276 Recall-recognition paradox, 30–31 Receiver-operating characteristic (ROC) data, 62, 85, 90 Recognition criterion by individual skill item-based memory decisions, 126–132 hit/false alarm correlation, 127–130 individual variation, 127–130 optimal decision criterion in 1D-SDT, 126 Recognition criterion lability during testing in item-based memory decisions, 100–113 biased feedback contingencies, 102–104 criterion rigidity, 105–107
memorability heuristics, 101–102 positioning decision criteria, 105–112 simulation outcomes, 108–110 Recognition memory, 149 activation spread, 280 current activation of node, 281 dynamics of, 313–345 link strength, 280 and memory retrieval, 277–281 node strength, 279 normative word frequency affects, 282–283 response bias in, 61–90 structure of SAC model, 278–279 testing procedures, 318–319 Rehearsed performance, memory performance and acting, 366–367 displaying prepared performance, 366–367 music performance, 367–368 REM dual-process model, of associative recognition, 322 Remembering/Rememberers, 3–4, 194, 198–199, 203. See also Memory Remember/Know judgments, 285 Remember-know paradigm, in response bias, 82–84, 89, 282–286 Response bias between group criterion differences, 72 list criterion differences, 72–74 between-test criterion shifts, 74–78 processing and stimulus manipulations, 76–78 processing time and revelation, 76–77 against shifts, 74–76 for shifts, 76–78 strength manipulations, 74–76 definition, 63 designs with multiple responses, 81–88 confidence ratings, 84–88 remember-know paradigm, 82–84 distribution shifts masquerading as criterion shifts, 79–81 context effects, 80–81 identifying change, 81 study-test delays, 80 emotional valence and subjective memorability, 77–78 experimental strategy, 67–71
invariants within data, 67–68 by subject, 68–71 explicit model application to data, 83 feedback use, 69–70, 89 forced-choice design, 89 measurement in, 63–67 equal sensitivity conditions, 65–66 single experimental condition, 63–64 unequal sensitivity conditions, 66–67 processing effects, 76–77 emotional valence and subjective memorability, 77–78 and stimulus manipulations, 76–78 time and revelation, 76–77 in recognition memory experiments, 61–90 right measure selection, 88 right sensitivity measure selection, 88 stimulus effects, 77–78 use ratings and plot ROCs, 62, 85, 90 Response-signal paradigm, 76, 319–320, 322, 327–329, 335, 337–338 Response times (RTs), 196–197 Retention processing variables on, 182 ways to enhance, 183–184 Retrospective memory, 146, 155–157, 170 Revelation effect, 65–66, 77
S SAC model, 277–281, 286–290, 294–297, 304–305, 307 activation spread, 280, 283–284 assumptions, 278–281, 292 augmentation for memory encoding, 294–297, 304, 307 partial match and spurious recollection, 296–297 strengthening concept limit, 296 current activation of node, 281 dual-process account of recognition, 281–282, 289 evidence for, 286–290, 304 explaining related phenomena with, 304–305 high and low-frequency words representation in, 283
high fan and low fan fonts representation, 286–288 link strength, 280 of memory retrieval, 277–281 node strength, 279 picture–word interference experiment, 298 for remember-know responses, 297–299 simulation results, 300 structure of, 278–279 of word recognition, 281–286 word–word interference experiment, 299 Satisficing and unsatisficing knowledge, 52 Scene manipulations, item-based memory decisions, 113–115, 121 Selection, optimization and compensation (SOC) framework, 227, 253, 257 Selectivity concept of, 232–233 in use of memory by older adults, 234–239 Self-cued prospective memory, 167 Self-guided learning, 181 Semantic-memory grain-size study, 42, 49–50 Sensitivity conditions, response bias measurement in, 65–67 Short-term memory (STM) capacity, 355–359, 371 scanning, 196 Sicilian defense, 303 Signal detection models, of associative recognition, 320 Signal-detection theory (SDT), 7–9, 19, 22, 61–63, 65, 81–82, 84–85 Single-participant studies, 354 Skill and expert performance scientific study of, 359–360 under standardized conditions, 360–361 Soccer, expert-performance approach in, 369 Socioemotional selectivity theory (SST) framework, in lifespan theories of cognitive aging, 227 Source memory studies, evidence for SAC explanation, 289 Source of activation confusion model. See SAC model Speed-accuracy trade-off in associative recognition, 313–318 in free-response paradigm, 322–323
Spontaneous recognition process, in prospective memory, 151–152 Spontaneous reflexive associative memory process, in prospective memory, 152–154 Stimulus memorability, 206–207 Strategic control, concept of, 232 Strength-based decision model, 87 Students’ memory, 39 Subjective memorability and emotional valence, 77–78 Synthetic amnesia, and memory retrieval, 288–289
T Telegraphy, 355 Test-expectancy manipulation, 187 Theory of signal detection (TSD), 205–206 Two-alternative forced-choice (2AFC) format, 89, 113 1D-SDT model, 115–118 Type-2 SDT, 27–30
V Value, concept of, 232–233 Value-directed remembering, and evaluative processing model, 239, 242–245 Verbal reports of thinking, protocol analysis of, 353
W Win-stay strategy, 201 Word frequency mirror effect, in memory retrieval, 277, 281–286, 291 Word-rating task, 148 Working memory (WM) capacity, 124 and prior experience affect on memory encoding, 278, 294–297, 304
Y Younger adults, as expert memorizers, 233
CONTENTS OF RECENT VOLUMES
Volume 30 Perceptual Learning Felice Bedford A Rational-Constructivist Account of Early Learning about Numbers and Objects Rochel Gelman Remembering, Knowing, and Reconstructing the Past Henry L. Roediger III, Mark A. Wheeler, and Suparna Rajaram The Long-Term Retention of Knowledge and Skills Alice F. Healy, Deborah M. Clawson, Danielle S. McNamara, William R. Marmie, Vivian I. Schneider, Timothy C. Rickard, Robert J. Crutcher, Cheri L. King, K. Anders Ericsson, and Lyle E. Bourne, Jr. A Comprehension-Based Approach to Learning and Understanding Walter Kintsch, Bruce K. Britton, Charles R. Fletcher, Eileen Kintsch, Suzanne M. Mannes, and Mitchell J. Nathan Separating Causal Laws from Causal Facts: Pressing the Limits of Statistical Relevance Patricia W. Cheng Categories, Hierarchies, and Induction Elizabeth F. Shipley Index
Volume 31 Associative Representations of Instrumental Contingencies Ruth M. Colwill A Behavioral Analysis of Concepts: Its Application to Pigeons and Children Edward A. Wasserman and Suzette L. Astley The Child’s Representation of Human Groups Lawrence A. Hirschfeld Diagnostic Reasoning and Medical Expertise Vimla L. Patel, José F. Arocha, and David R. Kaufman Object Shape, Object Name, and Object Kind: Representation and Development Barbara Landau The Ontogeny of Part Representation in Object Concepts Philippe G. Schyns and Gregory L. Murphy Index
Volume 32 Cognitive Approaches to Judgment and Decision Making Reid Hastie and Nancy Pennington And Let Us Not Forget Memory: The Role of Memory Processes and Techniques in the Study of Judgment and Choice Elke U. Weber, William M. Goldstein, and Sema Barlas Content and Discontent: Indications and Implications of Domain Specificity in Preferential Decision Making William M. Goldstein and Elke U. Weber An Information Processing Perspective on Choice John W. Payne, James R. Bettman, Eric J. Johnson, and Mary Frances Luce Algebra and Process in the Modeling of Risky Choice Lola L. Lopes Utility Invariance Despite Labile Preferences Barbara A. Mellers, Elke U. Weber, Lisa D. Ordóñez, and Alan D. J. Cooke Compatibility in Cognition and Decision Eldar Shafir Processing Linguistic Probabilities: General Principles and Empirical Evidence David V. Budescu and Thomas S. Wallsten Compositional Anomalies in the Semantics of Evidence John M. Miyamoto, Richard Gonzalez, and Shihfen Tu Varieties of Confirmation Bias Joshua Klayman Index
Volume 33 Landmark-Based Spatial Memory in the Pigeon Ken Cheng The Acquisition and Structure of Emotional Response Categories Paula M. Niedenthal and Jamin B. Halberstadt Early Symbol Understanding and Use Judy S. DeLoache Mechanisms of Transition: Learning with a Helping Hand Susan Goldin-Meadow and Martha Wagner Alibali The Universal Word Identification Reflex Charles A. Perfetti and Sulan Zhang Prospective Memory: Progress and Processes Mark A. McDaniel Looking for Transfer and Interference Nancy Pennington and Bob Rehder Index
Volume 34 Associative and Normative Models of Causal Induction: Reacting to versus Understanding Cause A. G. Baker, Robin A. Murphy, and Frédéric Vallée-Tourangeau Knowledge-Based Causal Induction Michael R. Waldmann A Comparative Analysis of Negative Contingency Learning in Humans and Nonhumans Douglas A. Williams Animal Analogues of Causal Judgment Ralph R. Miller and Helena Matute Conditionalizing Causality Barbara A. Spellman Causation and Association Edward A. Wasserman, Shu-Fang Kao, Linda J. Van Hamme, Masayoshi Katagiri, and Michael E. Young
Distinguishing Associative and Probabilistic Contrast Theories of Human Contingency Judgment David R. Shanks, Francisco J. Lopez, Richard J. Darby, and Anthony Dickinson A Causal-Power Theory of Focal Sets Patricia W. Cheng, Jooyong Park, Aaron S. Yarlas, and Keith J. Holyoak The Use of Intervening Variables in Causal Learning Jerome R. Busemeyer, Mark A. McDaniel, and Eunhee Byun Structural and Probabilistic Causality Judea Pearl Index
Volume 35 Distance and Location Processes in Memory for the Times of Past Events William J. Friedman Verbal and Spatial Working Memory in Humans John Jonides, Patricia A. Reuter-Lorenz, Edward E. Smith, Edward Awh, Lisa L. Barnes, Maxwell Drain, Jennifer Glass, Erick J. Lauber, Andrea L. Patalano, and Eric H. Schumacher Memory for Asymmetric Events John T. Wixted and Deirdra H. Dougherty The Maintenance of a Complex Knowledge Base After Seventeen Years Marigold Linton Category Learning As Problem Solving Brian H. Ross Building a Coherent Conception of HIV Transmission: A New Approach to AIDS Education Terry Kit-fong Au and Laura F. Romo Spatial Effects in the Partial Report Paradigm: A Challenge for Theories of Visual Spatial Attention Gordon D. Logan and Claus Bundesen Structural Biases in Concept Learning: Influences from Multiple Functions Dorrit Billman Index
Volume 36 Learning to Bridge Between Perception and Cognition Robert L. Goldstone, Philippe G. Schyns, and Douglas L. Medin
The Affordances of Perceptual Inquiry: Pictures Are Learned From the World, and What That Fact Might Mean About Perception Quite Generally Julian Hochberg Perceptual Learning of Alphanumeric-Like Characters Richard M. Shiffrin and Nancy Lightfoot Expertise in Object and Face Recognition James Tanaka and Isabel Gauthier Infant Speech Perception: Processing Characteristics, Representational Units, and the Learning of Words Peter D. Eimas Constraints on the Learning of Spatial Terms: A Computational Investigation Terry Regier Learning to Talk About the Properties of Objects: A Network Model of the Development of Dimensions Linda B. Smith, Michael Gasser, and Catherine M. Sandhofer Self-Organization, Plasticity, and Low-Level Visual Phenomena in a Laterally Connected Map Model of the Primary Visual Cortex Risto Miikkulainen, James A. Bednar, Yoonsuck Choe, and Joseph Sirosh Perceptual Learning From Cross-Modal Feedback Virginia R. de Sa and Dana H. Ballard Learning As Extraction of Low-Dimensional Representations Shimon Edelman and Nathan Intrator Index
Volume 37 Object-Based Reasoning Miriam Bassok Encoding Spatial Representation Through Nonvisually Guided Locomotion: Tests of Human Path Integration Roberta L. Klatzky, Jack M. Loomis, and Reginald G. Golledge Production, Evaluation, and Preservation of Experiences: Constructive Processing in Remembering and Performance Tasks Bruce W. A. Whittlesea Goals, Representations, and Strategies in a Concept Attainment Task: The EPAM Model Fernand Gobet, Howard Richman, Jim Staszewski, and Herbert A. Simon Attenuating Interference During Comprehension: The Role of Suppression Morton Ann Gernsbacher
Cognitive Processes in Counterfactual Thinking About What Might Have Been Ruth M. J. Byrne Episodic Enhancement of Processing Fluency Michael E. J. Masson and Colin M. MacLeod At a Loss From Words: Verbal Overshadowing of Perceptual Memories Jonathan W. Schooler, Stephen M. Fiore, and Maria A. Brandimonte Index
Volume 38 Transfer-Inappropriate Processing: Negative Priming and Related Phenomena W. Trammell Neil and Katherine M. Mathis Cue Competition in the Absence of Compound Training: Its Relation to Paradigms of Interference Between Outcomes Helena Matute and Oskar Pineño Sooner or Later: The Psychology of Intertemporal Choice Gretchen B. Chapman Strategy Adaptivity and Individual Differences Christian D. Schunn and Lynne M. Reder Going Wild in the Laboratory: Learning About Species Typical Cues Michael Domjan Emotional Memory: The Effects of Stress on ‘‘Cool’’ and ‘‘Hot’’ Memory Systems Janet Metcalfe and W. Jake Jacobs Metacomprehension of Text: Influence of Absolute Confidence Level on Bias and Accuracy Ruth H. Maki Linking Object Categorization and Naming: Early Expectations and the Shaping Role of Language Sandra R. Waxman Index
Volume 39 Infant Memory: Cues, Contexts, Categories, and Lists Carolyn Rovee-Collier and Michelle Gulya The Cognitive-Initiative Account of Depression-Related Impairments in Memory Paula T. Hertel Relational Timing: A Theromorphic Perspective J. Gregor Fetterman The Influence of Goals on Value and Choice Arthur B. Markman and C. Miguel Brendl The Copying Machine Metaphor Edward J. Wisniewski Knowledge Selection in Category Learning Evan Heit and Lewis Bott Index
Volume 40 Different Organization of Concepts and Meaning Systems in the Two Cerebral Hemispheres Dahlia W. Zaidel The Causal Status Effect in Categorization: An Overview Woo-kyoung Ahn and Nancy S. Kim Remembering as a Social Process Mary Susan Weldon Neurocognitive Foundations of Human Memory Ken A. Paller Structural Influences on Implicit and Explicit Sequence Learning Tim Curran, Michael D. Smith, Joseph M. DiFranco, and Aaron T. Daggy Recall Processes in Recognition Memory Caren M. Rotello Reward Learning: Reinforcement, Incentives, and Expectations Kent C. Berridge Spatial Diagrams: Key Instruments in the Toolbox for Thought Laura R. Novick Reinforcement and Punishment in the Prisoner’s Dilemma Game Howard Rachlin, Jay Brown, and Forest Baker Index
Volume 41 Categorization and Reasoning in Relation to Culture and Expertise Douglas L. Medin, Norbert Ross, Scott Atran, Russell C. Burnett, and Sergey V. Blok On the Computational Basis of Learning and Cognition: Arguments from LSA Thomas K. Landauer Multimedia Learning Richard E. Mayer Memory Systems and Perceptual Categorization Thomas J. Palmeri and Marci A. Flanery Conscious Intentions in the Control of Skilled Mental Activity Richard A. Carlson Brain Imaging Autobiographical Memory Martin A. Conway, Christopher W. Pleydell-Pearce, Sharon Whitecross, and Helen Sharpe The Continued Influence of Misinformation in Memory: What Makes Corrections Effective? Colleen M. Seifert
Making Sense and Nonsense of Experience: Attributions in Memory and Judgment Colleen M. Kelley and Matthew G. Rhodes Real-World Estimation: Estimation Modes and Seeding Effects Norman R. Brown Index
Volume 42 Memory and Learning in Figure–Ground Perception Mary A. Peterson and Emily Skow-Grant Spatial and Visual Working Memory: A Mental Workspace Robert H. Logie Scene Perception and Memory Marvin M. Chun Spatial Representations and Spatial Updating Ranxiao Frances Wang Selective Visual Attention and Visual Search: Behavioral and Neural Mechanisms Joy J. Geng and Marlene Behrmann Categorizing and Perceiving Objects: Exploring a Continuum of Information Use Philippe G. Schyns From Vision to Action and Action to Vision: A Convergent Route Approach to Vision, Action, and Attention Glyn W. Humphreys and M. Jane Riddoch Eye Movements and Visual Cognitive Suppression David E. Irwin What Makes Change Blindness Interesting? Daniel J. Simons and Daniel T. Levin Index
Volume 43 Ecological Validity and the Study of Concepts Gregory L. Murphy Social Embodiment Lawrence W. Barsalou, Paula M. Niedenthal, Aron K. Barbey, and Jennifer A. Ruppert The Body’s Contribution to Language Arthur M. Glenberg and Michael P. Kaschak Using Spatial Language Laura A. Carlson In Opposition to Inhibition Colin M. MacLeod, Michael D. Dodd, Erin D. Sheard, Daryl E. Wilson, and Uri Bibi Evolution of Human Cognitive Architecture John Sweller Cognitive Plasticity and Aging Arthur F. Kramer and Sherry L. Willis Index
Volume 44 Goal-Based Accessibility of Entities within Situation Models Mike Rinck and Gordon H. Bower The Immersed Experiencer: Toward an Embodied Theory of Language Comprehension Rolf A. Zwaan Speech Errors and Language Production: Neuropsychological and Connectionist Perspectives Gary S. Dell and Jason M. Sullivan Psycholinguistically Speaking: Some Matters of Meaning, Marking, and Morphing Kathryn Bock Executive Attention, Working Memory Capacity, and a Two-Factor Theory of Cognitive Control Randall W. Engle and Michael J. Kane Relational Perception and Cognition: Implications for Cognitive Architecture and the Perceptual-Cognitive Interface Collin Green and John E. Hummel An Exemplar Model for Perceptual Categorization of Events Koen Lamberts On the Perception of Consistency Yaakov Kareev Causal Invariance in Reasoning and Learning Steven Sloman and David A. Lagnado Index
Volume 45 Exemplar Models in the Study of Natural Language Concepts Gert Storms Semantic Memory: Some Insights From Feature-Based Connectionist Attractor Networks Ken McRae On the Continuity of Mind: Toward a Dynamical Account of Cognition Michael J. Spivey and Rick Dale Action and Memory Peter Dixon and Scott Glover Self-Generation and Memory Neil W. Mulligan and Jeffrey P. Lozito Aging, Metacognition, and Cognitive Control Christopher Hertzog and John Dunlosky
The Psychopharmacology of Memory and Cognition: Promises, Pitfalls, and a Methodological Framework Elliot Hirshman Index
Volume 46 The Role of the Basal Ganglia in Category Learning F. Gregory Ashby and John M. Ennis Knowledge, Development, and Category Learning Brett K. Hayes Concepts as Prototypes James A. Hampton An Analysis of Prospective Memory Richard L. Marsh, Gabriel I. Cook, and Jason L. Hicks Accessing Recent Events Brian McElree SIMPLE: Further Applications of a Local Distinctiveness Model of Memory Ian Neath and Gordon D. A. Brown What is Musical Prosody? Caroline Palmer and Sean Hutchins Index
Volume 47 Relations and Categories Viviana A. Zelizer and Charles Tilly Learning Linguistic Patterns Adele E. Goldberg Understanding the Art of Design: Tools for the Next Edisonian Innovators Kristin L. Wood and Julie S. Linsey Categorizing the Social World: Affect, Motivation, and Self-Regulation Galen V. Bodenhausen, Andrew R. Todd, and Andrew P. Becker Reconsidering the Role of Structure in Vision Elan Barenholtz and Michael J. Tarr Conversation as a Site of Category Learning and Category Use Dale J. Barr and Edmundo Kronmüller Using Classification to Understand the Motivation-Learning Interface W. Todd Maddox, Arthur B. Markman, and Grant C. Baldwin Index