THE PSYCHOLOGY OF LEARNING AND MOTIVATION
Advances in Research and Theory
EDITED BY DOUGLAS L. MEDIN
DEPARTMENT OF PSYCHOLOGY, NORTHWESTERN UNIVERSITY, EVANSTON, ILLINOIS
Volume 39
ACADEMIC PRESS
San Diego London Boston New York Sydney Tokyo Toronto
This book is printed on acid-free paper.
Copyright © 2000 by ACADEMIC PRESS. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher. The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-2000 chapters are as shown on the title pages. If no fee code appears on the title page, the copy fee is the same as for current chapters. 0079-7421/00 $30.00
Explicit permission from Academic Press is not required to reproduce a maximum of two figures or tables from an Academic Press article in another scientific or research publication provided that the material has not been credited to another source and that full credit to the Academic Press article is given.
Academic Press, A Harcourt Science and Technology Company, 525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com
Academic Press, 24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
International Standard Book Number: 0-12-543339-5
PRINTED IN THE UNITED STATES OF AMERICA
99 00 01 02 03 04 BB 9 8 7 6 5 4 3 2 1
CONTENTS

Contributors ..... ix

INFANT MEMORY: CUES, CONTEXTS, CATEGORIES, AND LISTS
Carolyn Rovee-Collier and Michelle Gulya
I. General Procedures ..... 2
II. Ontogeny of Memory ..... 4
III. Cues and Contexts ..... 13
IV. Categories ..... 22
V. Lists ..... 32
VI. Infantile Amnesia ..... 40
VII. Summary ..... 41
References ..... 41

THE COGNITIVE-INITIATIVE ACCOUNT OF DEPRESSION-RELATED IMPAIRMENTS IN MEMORY
Paula T. Hertel
I. Introduction ..... 47
II. The Framework and the Findings ..... 48
III. The Role of Motivation in Memory Impairments ..... 61
IV. Comparisons and Conclusions ..... 65
References ..... 69

RELATIONAL TIMING: A THEROMORPHIC PERSPECTIVE
J. Gregor Fetterman
I. General Method ..... 75
II. Ordinal Comparisons of Duration ..... 77
III. Ratio Comparisons of Duration ..... 82
IV. The Role of Instructions and Extraexperimental Experience ..... 86
V. [entry illegible in source]
References ..... 94

THE INFLUENCE OF GOALS ON VALUE AND CHOICE
Arthur B. Markman and C. Miguel Brendl
I. Goals, Value, and Choice ..... 97
II. Goals and Their Relationship to Objects ..... 98
III. Goals and the Processing of Choice ..... 103
IV. Goals and the Determination of Value ..... 106
V. Conclusions and Further Directions ..... 124
References ..... 125

THE COPYING MACHINE METAPHOR
Edward J. Wisniewski
I. The Copying Machine Metaphor in Cognitive Psychology ..... 130
II. Why Is the Copying Machine Metaphor so Prevalent? ..... 132
III. The Case for Integrative/Constructive Processing ..... 135
IV. How Is Knowledge Integrated and Constructed? ..... 143
V. Knowledge Construction across Cognitive Domains ..... 150
VI. Conclusions and Concerns ..... 157
References ..... 159

KNOWLEDGE SELECTION IN CATEGORY LEARNING
Evan Heit and Lewis Bott
I. Introduction ..... 163
II. Experiment 1 ..... 175
III. Experiment 2 ..... 179
IV. Discussion of Experiments ..... 180
V. Putting Knowledge into Neural Network Models ..... 183
VI. The Baywatch Model ..... 186
VII. Evaluation of the Baywatch Model ..... 191
VIII. Conclusions ..... 196
References ..... 197

THE ROLE OF LANGUAGE IN THE CONSTRUCTION OF KINDS
Susan A. Gelman, Michelle Hollander, Jon Star, and Gail D. Heyman
I. Introduction ..... 201
II. Categories, Kinds, and Essences ..... 204
III. Four Linguistic Devices ..... 208
IV. The Word "Kind" ..... 212
V. Lexicalization ..... 217
VI. Logical Quantifiers ..... 225
VII. Generic Noun Phrases ..... 228
VIII. Summary and Conclusions ..... 251
References ..... 256

Index ..... 265
Contents of Recent Volumes ..... 273

CONTRIBUTORS

Numbers in parentheses indicate the pages on which the authors' contributions begin.

Lewis Bott (163), Department of Psychology, University of Warwick, Coventry CV4 4AL, United Kingdom
C. Miguel Brendl (97), Universität Heidelberg, Psychologisches Institut, Heidelberg D-69117, Germany
J. Gregor Fetterman (73), Department of Psychology, Indiana University/Purdue University-Indianapolis, Indianapolis, Indiana 46202
Susan A. Gelman (201), Department of Psychology, University of Michigan, Ann Arbor, Michigan 48109
Michelle Gulya (1), Department of Psychology, Rutgers University, Piscataway, New Jersey 08854
Evan Heit (163), Department of Psychology, University of Warwick, Coventry CV4 4AL, United Kingdom
Paula T. Hertel (47), Department of Psychology, Trinity University, San Antonio, Texas 78212
Gail D. Heyman (201), Department of Psychology, University of California-San Diego, San Diego, California 92093
Michelle Hollander (201), Department of Psychology, University of Michigan, Ann Arbor, Michigan 48109
Arthur B. Markman (97), Department of Psychology, University of Texas, Austin, Texas 78712
Carolyn Rovee-Collier (1), Department of Psychology, Rutgers University, Piscataway, New Jersey 08854
Jon Star (201), Department of Psychology, University of Michigan, Ann Arbor, Michigan 48109
Edward J. Wisniewski (129), Department of Psychology, University of North Carolina at Greensboro, Greensboro, North Carolina 27412

INFANT MEMORY: Cues, Contexts, Categories, and Lists
Carolyn Rovee-Collier
Michelle Gulya

At the end of the nineteenth century, Ebbinghaus introduced the study of memory into psychology. Since then, research on memory has undergone a veritable explosion, and our current understanding of memory would amaze even Ebbinghaus. In the face of this burgeoning knowledge, however, infant memory has remained an enigma. Perhaps because of its inscrutability, most cognitive scientists have dismissed memory processing by preverbal infants as being of little or no consequence for memory processing by verbally competent children and adults. Recently, however, infants have been interrogated about their memories in new and different ways. Their answers suggest that the basic processes that mediate memory processing by humans free of brain damage are invariant over ontogeny. Although memory processing clearly changes quantitatively with age, there is no evidence that it changes qualitatively. Moreover, because infants' memory processing is slow and hence accessible to experimental observation over a relatively long period, some of the most basic phenomena of human memory can only be studied in the very young.

This chapter reviews how infant memory changes over the first year-and-a-half of life. We next consider how different aspects of an event that are encoded in the same memory (the focal cue and the context in which it is encountered) contribute to its retention. Third, we consider how the structure of an event affects what infants learn and remember. Here we focus on infants' ability to remember categories and lists. Finally, we consider implications of our findings for the phenomenon of infantile amnesia.

THE PSYCHOLOGY OF LEARNING AND MOTIVATION, VOL. 39
Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved. 0079-7421/00 $30.00

I. General Procedures

Because very young infants do not have a verbal response to tell us what they recognize, we initially teach them a motoric one, an operant footkick, to use instead. During the retention test, we show them a display that is either the same as or different from the training display, and the infants "tell" us whether or not they recognize it by whether or not they produce the motoric response. If they recognize it, then they kick above their pretraining baseline rate, saying "yes"; if they do not recognize it, then they do not kick above baseline rate, saying "no."

We teach infants to kick by stringing a ribbon from one ankle to the overhead hook that suspends a crib mobile (see Fig. 1). Infants learn rapidly that kicking moves the mobile and usually double or triple their rate of kicking within a few minutes; thereafter, kick rates remain high and stable, both within sessions and over successive days. Moreover, their increase in kick rate is due solely to the contingency and not to behavioral arousal (Rovee & Rovee, 1969). Both before and after training, we attach the ribbon to a second, "empty" hook so that infants can see the mobile, but their kicks cannot move it. We measure their baseline kick rate before training and their final level of learning after training under these conditions. Because we also measure long-term retention under these conditions, infants' memory performance reflects only what they bring into the test session, and not new learning or savings at the time of testing.

After training is over, we test infants in one of two different memory paradigms (see Fig. 2). In the delayed recognition paradigm, after a given amount of time has passed, we simply hang the stationary mobile over infants' heads and ask if they recognize it. In the reactivation paradigm, the procedure is identical, except that we expose infants to a memory prime in advance of the long-term retention test (see Fig. 3).
In most studies, we wait for infants to forget (i.e., responding during a delayed recognition test has returned to baseline) before we present the prime. The memory prime is an isolated component of the original event, such as the original mobile (in motion) or the original training context with no mobile present at all. The prime presumably activates the latent or dormant memory, increasing its accessibility. Later, we assess the effectiveness of the prime in a standard delayed recognition test.

Figure 4 illustrates the memory performance that infants typically exhibit in the delayed recognition and reactivation paradigms. The first curve in Fig. 4 is the forgetting function, which was obtained during delayed recognition tests with independent groups of 3-month olds at different delays after training was over at time 0. Thirteen days after training, when the original training memory was forgotten, infants were briefly exposed to a memory prime. The second curve in Fig. 4 is the forgetting function of the reactivated memory beginning 1 day after priming (14 days after training was over), also obtained during standard recognition tests with independent groups at different delays after priming on day 13. As shown, the magnitude of retention 1 day after priming is the same as the magnitude of retention only 1 day after original training, and the reactivated memory is forgotten at almost the same rate as the original memory (Rovee-Collier, Sullivan, Enright, Lucas, & Fagen, 1980).

Fig. 1. A 3-month-old in the mobile task during acquisition. The infant's footkicks move the mobile by means of the ribbon that is connected from the ankle to the mobile hook. During baseline and all retention tests, the ribbon and the mobile are connected to different hooks, so that kicks cannot move the mobile.

[Figure 2: schematic timelines in two panels. A. Delayed Recognition Paradigm: Baseline, Acquisition, Immediate Retention Test, time passage, Long-Term Retention Test. B. Reactivation Paradigm: training as in A, time passage, Reactivation Treatment, time passage, Long-Term Retention Test.]
Fig. 2. (A) The delayed-recognition task, showing training and the long-term retention test. (B) The reactivation task, showing training and the brief reactivation (priming) treatment prior to the long-term retention test. The test cue in (A) is the memory prime in (B).
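
The response measure described in this section can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' analysis code. The function names, the kicks-per-minute values, and the two ratio definitions (test rate divided by pretraining baseline rate, and test rate divided by the rate in the immediate retention test) are assumptions introduced here to express the "yes"/"no" criterion numerically. A single-infant comparison like this one stands in for the group-level statistical comparison against 1.00 that the Fig. 8 caption mentions.

```python
def baseline_ratio(test_rate, baseline_rate):
    """Kick rate in the long-term test relative to the pretraining baseline.

    Under the assumption made here, a value above 1.00 corresponds to
    kicking above baseline, i.e., the infant "says yes."
    """
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return test_rate / baseline_rate


def retention_ratio(test_rate, immediate_rate):
    """Kick rate in the long-term test relative to the immediate retention test.

    Under the assumption made here, a value of 1.00 corresponds to no
    forgetting since the end of training.
    """
    if immediate_rate <= 0:
        raise ValueError("immediate_rate must be positive")
    return test_rate / immediate_rate


def says_yes(test_rate, baseline_rate):
    """Return True when responding in the test exceeds the baseline rate."""
    return baseline_ratio(test_rate, baseline_rate) > 1.0


# Hypothetical kicks per minute for one infant (not data from the chapter).
baseline = 8.0    # before training, ribbon on the "empty" hook
immediate = 20.0  # immediate retention test, just after training
long_term = 18.0  # long-term retention test, ribbon on the "empty" hook

print(baseline_ratio(long_term, baseline))    # 2.25
print(retention_ratio(long_term, immediate))
print(says_yes(long_term, baseline))          # True
```

Because the mobile cannot move during any of the three measurements, all of the rates in such a calculation would be collected under the same nonreinforced conditions, which is what allows them to be compared directly.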

II. Ontogeny of Memory

A. AGE CHANGES IN THE DURATION OF RETENTION

In the past, the major impediment to the systematic study of infant memory development was the lack of a task that is interesting to infants across a broad age range but is also relatively easy in terms of its motoric demands for all ages. A quick glance at the infants in Fig. 5 reveals the magnitude of this problem. From left to right, these infants are 2, 3, 6, 9, 12, 15, and 18 months of age. The physical and behavioral differences between the youngest and oldest infant are obvious. Because the mobile task is unsuitable for infants older than 6 months, we developed a second operant task that could be used as an "upward extension" of it. In this task, infants sit in front of a large box that houses a miniature train amidst a complex of toys. By pressing a lever, they can move the train around a circular track. Six-month olds, trained in both tasks with the same set of parameters, produce identical learning and retention data and respond identically to both cue and context changes (Hartshorn & Rovee-Collier, 1997), confirming that the train and mobile tasks are functionally equivalent. In all subsequent studies, therefore, we have used the mobile task with infants from 2 to 6 months and the train task with infants from 6 to 18 months of age. Standardized training parameters, calibrated for age, are always used unless otherwise specified.

Fig. 3. A reactivation treatment with a 2-month-old. The far end of the ribbon is held by the experimenter, standing at the side of the crib, who uses it to move the mobile for 3 minutes at the same rate that the infant had kicked to move it in the final 3 minutes of acquisition.

[Figure 4: retention plotted against retention interval (days); curves labeled "Original memory," "Memory reactivated at 13 days," and "Memory reactivated at 27 days."]
Fig. 4. The forgetting and reforgetting functions of the original memory and the reactivated memory, respectively, as a function of the number of days after training or priming when retention testing occurred. The memory prime was presented either 13 or 27 days after training was over, when forgetting was complete. Each data point represents a different group of 3-month-old infants. Reprinted with permission from Rovee-Collier, C., Sullivan, M. W., Enright, M. K., Lucas, D., & Fagen, J. W. (1980). Reactivation of infant memory. Science, 208, 1159-1161. Copyright © 1980 by the American Association for the Advancement of Science.

Fig. 5. From left to right, infants are 2, 3, 6, 9, 12, 15, and 18 months of age. Note the vast physical and behavioral differences between the youngest and oldest infants.

Taking this approach, we have found that human infants, like infants of species ranging from frogs (Miller & Berk, 1977) to monkeys (Green, 1962), exhibit equivalent retention after short delays but remember progressively longer as they get older (Hartshorn, Rovee-Collier, Gerhardstein, Bhatt, Wondoloski, Klein, Gilch, Wurtzel, & Campos-de-Carvalho, 1998b). In fact, the maximum duration of delayed recognition is a linearly increasing function of age (see Fig. 6). Because infants' original learning and baseline levels were equivalent at all ages irrespective of task, these factors did not contribute differentially to their memory performance.

[Figure 6: maximum duration of retention (weeks) plotted against age (months: 2, 3, 6, 9, 12, 15, 18); legend includes "Mobile Task."]
Fig. 6. The maximum duration of retention (in weeks) of independent groups of infants over the first 18 months of life in studies in two operant tasks.

Two aspects of this retention function are particularly important. First, memory performance does not change abruptly at the end of the first year. This evidence is inconsistent with claims that a qualitatively different memory system (Mandler, 1984; Nelson, 1995; Schacter & Moscovitch, 1984), including the capacity for long-term memory (Kagan & Hamburg, 1981), matures late in the first year of life. Second, memory performance does not change in the second year with the appearance of language. This evidence is also inconsistent with K. Nelson's (1990) claim that it is not possible for individuals to retain a memory over a long period of time, an ability that she describes as "a specifically human characteristic of memory" (p. 307, italics ours), before they can rehearse it by talking about it.

Exposing infants to periodic reminders can significantly protract their retention (Hayne, 1990). These reminders can take the form of reinstatement (periodic repetition of a partial training trial throughout the retention interval; Campbell & Jaynes, 1966) or reactivation (exposure to a fractional component of the training event at the end of the retention interval; Spear & Parsons, 1976). In one study, we gave 2-month olds a 3-minute reminder every 3 weeks through 6 1/2 months of age (six reminders altogether) and a final retention test at 7 1/4 months, when the study was terminated because infants outgrew the task (Rovee-Collier, Hartshorn, & DiRubbo, 1999). Before each reminder, infants received a preliminary retention test. Infants who remembered the training event during the preliminary test received a reinstatement reminder; infants who did not remember received a reactivation reminder. Although the maximum duration of retention at 2 months is only 1-2 days (see Fig. 6), infants who were repeatedly reminded still exhibited significant retention 18 weeks later, at 6 1/2 months of age, and most of them still remembered 21 weeks later, at 7 1/4 months of age. Yoked controls, who were not initially trained but received the same regimen of reminders as their experimental counterparts, exhibited no retention after any delay, confirming that the reminders themselves did not produce new learning.

Figure 7 presents the individual data of reminded infants (open squares) superimposed on the retention function of unreminded infants from Fig. 6 (filled squares). The dashed line extends that function through 30 months of age. This figure reveals that four 2-month olds who received multiple reminders remembered as long as 2 1/4 year olds, one remembered as long as 2-year olds, and the infant whose memory was "poorest" remembered as long as children aged almost 1 1/2 years. Had it not been necessary to terminate the study, some infants undoubtedly would have remembered longer.

For her dissertation, Hartshorn (1998) trained 6-month olds in the train task and gave them a 2-minute reinstatement reminder at 7, 8, 9, and 12 months of age.
Their final memory test was at 18 months, 1 year after the original training event. Although 6-month olds remember this task for only 2 weeks, infants exposed to four reminders still exhibited significant retention during the 18-month test. Yoked reminder controls again exhibited no retention after any delay. Immediately after the 18-month test, four infants received another 2-minute reminder. Of these infants, three still remembered at 2 years--1 1/2 years after original training--despite receiving only one reminder (at 18 months) in the preceding year. These studies clearly refute current views that preverbal infants are unable to maintain memories over the long term because of neural immaturity at the time of encoding or an inability to rehearse prior experiences by talking about them. Apparently, as long as organisms of any age periodically encounter appropriate reminders, their memories of prior events are likely to be maintained. Because periodic nonverbal reminders maintained two
memories of comparable events over an overlapping period from 2 months through at least 1 1/2-2 years of age, it seems likely that they could also maintain a single memory over this same period, if not indefinitely.

Some researchers have proposed that older infants exhibit superior retention because their memories are more deeply encoded (Brainerd & Reyna, 1988; Howe, 1991). Indeed, we have found that infants remember longer when their attention to a target is enhanced, presumably leading to its deeper encoding. In a levels-of-processing study with 3-month olds (Adler, Gerhardstein, & Rovee-Collier, 1998), we trained infants with a "pop-out" mobile that displayed a single L block (the target) amidst six + blocks (the distractors). When infants are tested with this mobile, the single L apparently pops out and captures their attention, because they behave as if the test mobile were composed entirely of Ls. By training infants with this mobile, we hoped to enhance their attention to the target, thereby increasing its depth of processing during encoding and prolonging its retention.

[Figure 7: maximum duration of retention (weeks) plotted against age (months, 2 through 30), with a dashed extension of the Fig. 6 function.]
Fig. 7. Individual data showing the maximum duration of retention (in weeks) of six infants who were trained at 2 months of age, reminded at 3-week intervals, and given a final retention test at 29 weeks of age (open squares). Infants received a memory test before each reminder. Also shown is the maximum duration of retention function for infants between 2 and 18 months of age who were tested without an interpolated reminder (filled squares; see Fig. 6). The dashed line predicts the maximum duration of retention through 30 months of age. All 2-month olds who received multiple reminders remembered as long as predicted for infants 2 or more years of age except one, who remembered as long as infants almost 1 1/2 years of age.
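
The linear retention function of Fig. 6 and its dashed-line extension to 30 months in Fig. 7 amount to fitting a straight line to (age, maximum retention) pairs and reading off predictions at older ages. The sketch below shows that calculation in Python under stated assumptions: the chapter does not say how the line was obtained, so ordinary least squares is used here as one plausible choice. The data points are placeholders chosen to lie exactly on a line, not values read from Fig. 6, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("ages must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(age_months, slope, intercept):
    """Predicted maximum retention (weeks) at a given age (months)."""
    return slope * age_months + intercept


# Placeholder points for illustration only; these are NOT the chapter's data.
ages_months = [3, 6, 9, 12]
max_retention_weeks = [1.5, 3.0, 4.5, 6.0]

slope, intercept = fit_line(ages_months, max_retention_weeks)

# Extrapolate beyond the tested ages, as the dashed line in Fig. 7 does.
predicted_at_30 = predict(30, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(predicted_at_30, 3))
```

With the actual group data in place of the placeholders, a reminded infant's observed retention could be compared with the line's prediction at older ages, which is the comparison that Fig. 7 presents.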

In fact, infants who were trained with a mobile displaying one L amidst six +s recognized a test mobile displaying seven Ls longer than infants who were trained with a mobile displaying seven Ls in the first place (see Fig. 8). Conversely, they recognized a test mobile displaying seven +s after shorter delays than infants who were initially trained with a mobile containing seven +s. These results suggest that infants' attention to and processing of the target was enhanced at the expense of their attention to and processing of the distractors. A corresponding set of results was found when infants were trained with a mobile displaying the opposite configuration (one + amidst six Ls) and were tested with a mobile displaying either seven +s (increased retention) or seven Ls (decreased retention). These data demonstrate that enhancing attention protracts retention, presumably because deeper processing is associated with enhanced attention, and are consistent with evidence that adults remember central stimuli, which presumably are better attended, better than peripheral stimuli (Belli, Windschitl, McCarthy, & Winfrey, 1992). Whether older infants attend more intensely or encode more deeply than younger infants, however, is impossible to determine. Moreover, depth of processing is usually measured in terms of the duration of retention, making it a circular account of retention.

[Figure 8: two panels, "Test +" and "Test L," plotted against retention interval (days); legend: Target test, Distractor test, Control.]
Fig. 8. The levels-of-processing (LOP) effect at 3 months of age, shown as the protracted recognition over days of a test mobile displaying seven +s (left panel) and seven Ls (right panel) when they were the target (white columns) on the pop-out training mobile and the diminished recognition over days of a test mobile displaying seven +s (left panel) and seven Ls (right panel) when they were the distractors (dark columns) on the L or + pop-out training mobile, respectively. The standard of comparison for the LOP effect is the typical duration of retention exhibited by control groups (striped columns), who were both trained and tested with a homogeneous mobile composed of either seven +s (left panel) or seven Ls (right panel). A retention ratio = 1.00 indicates no forgetting. Asterisks mark the groups that displayed significant retention (M baseline ratio significantly > 1.00). Reprinted with permission from Adler, S. A., Gerhardstein, P., & Rovee-Collier, C. (1998). Levels-of-processing effects in infant memory? Child Development, 69, 280-294.

B. AGE CHANGES IN THE SPEED OF MEMORY RETRIEVAL

We have found that the speed of memory retrieval also increases with age over the first year of life, a result consistent with other reports that the speed of information processing increases with age (for review, see Colombo & Mitchell, 1990). In these studies, we measured how rapidly infants displayed retention after exposure to a reactivation treatment. Infants between 3 and 12 months of age were trained in either the mobile or train task, were allowed to forget the training memory, and then were exposed to the original cue (the mobile or the train) as the memory prime during a reactivation treatment 1 week after they had last exhibited retention. Despite the fact that the delay between training and priming increased linearly between 3 and 12 months, the latency of priming decreased linearly over this same period until, at 12 months, the response to a prime was instantaneous (Hildreth & Rovee-Collier, in press; see Fig. 9). At any given age, however, priming latency is a function of the retention interval: If the prime is presented 2 weeks after training (1 week after they forget the training memory), 3-month-olds require 24 hours to respond to it, but if the prime is presented only 1 day after training, 3-month-olds respond to it instantaneously (Gulya, Rovee-Collier, Galluccio, & Wilk, 1998).

[Figure 9: retrieval latency (axis marked 0-1 s, 1 min, 1 h, 24 h) plotted against age (months: 3, 6, 9, 12) for the mobile task and the train task.]
Fig. 9. The time required for a memory to be recovered in a reactivation task following the presentation of a memory prime to independent groups of infants 1 week after they last exhibited retention (see Fig. 6). The memory prime was presented 2 weeks after training at 3 months of age, 3 weeks after training at 6 months of age, 7 weeks after training at 9 months of age, and 14 weeks after training at 12 months of age. At 6 months of age, infants trained in the mobile and train tasks had identical retrieval latencies.

Reinitz, Wright, and Loftus (1989) found that priming increases the rate of visual encoding and proposed that this is how semantic priming facilitates subsequent retention. Subsequently, Reinitz and Alexander (1996) found that adults' perceptual identification of primed and unprimed stimuli (both pictures and words) was perfectly predicted by a multiplicative model that assumes that prior exposure to a stimulus increases the rate of visual information processing when that stimulus is subsequently encountered. This mechanism is similar to the account that a memory prime increases the accessibility of a prior memory representation (Spear & Parsons, 1976). An extension of this account also predicts our finding that the speed of memory retrieval increases over successive reactivations. At 3 months of age, when the rate of priming is relatively lethargic, the speed of memory retrieval increases from 24 hours to 4 hours or less after only two reactivations (Hayne, Hildreth, & Rovee-Collier, 1998). This result reveals that developmental changes in the speed of priming may also be experientially based. This result is reminiscent of Tulving's (1983) description of the reduction in access time to information in semantic memory as a consequence of retrieval, a reduction he attributed to improvement in retrieval skill.

C. THE DEVELOPMENT OF MULTIPLE MEMORY SYSTEMS

In an influential chapter, Tulving (1972) proposed that adults possess two functionally distinct memory systems--an episodic memory system that contains information about specific prior events that are dated by their time and place of occurrence and is altered by subsequent retrievals, and a semantic memory system that contains general knowledge that is devoid of time and place information and is unaffected by subsequent retrievals. He subsequently proposed that the semantic system develops first (Tulving, 1983). This dichotomy was supported by clinical reports that aging amnesics and Korsakoff patients were impaired relative to normal adults on one kind of memory test but not on another (Warrington & Weiskrantz, 1970). Amnesics performed poorly on recognition tests when asked to choose which of four words was on a list they had studied just minutes earlier; however, they performed as well as normal adults on priming tests when asked to complete a word fragment with the first word that came to mind. Despite being unable to recognize the studied words, amnesics typically completed the word fragment with a word from the prior study list. Dissociations in
Infant Memory
memory performance suggested that recognition and priming tests tap different underlying memory systems--one that is impaired in amnesia (episodic memory), and one that is not (semantic memory). Later, researchers applied the Jacksonian principle of the hierarchical development and dissolution of function to memory (Rozin, 1976). By this last in/first out account, the memory system that fails in amnesia (the "first to go") was assumed to mature last in ontogeny, whereas the memory system that is preserved in amnesia (the "last to go") was assumed to develop first (Naito & Komatsu, 1993). The same developmental sequence characterizes all major, dichotomous memory systems (semantic and episodic memory: Tulving, 1983; early- and late-maturing memory system: Schacter & Moscovitch, 1984; implicit and explicit memory: Graf & Schacter, 1985; nondeclarative and declarative memory: Squire, 1987; habit system and memory system: Bachevalier & Mishkin, 1984), yet it has never been studied with human infants. Rather, it has merely been inferred from the memory dissociations of aging amnesics (McDonough, Mandler, McKee, & Squire, 1995; McKee & Squire, 1993). More recently, we found memory dissociations in the memory performance of preverbal infants on reactivation (priming) and delayed recognition tests as well (Rovee-Collier, 1997). These dissociations are identical to those produced by adults on priming and recognition tests, respectively, in response to the same independent variables--age, the retention interval, vulnerability (interference), the number of study trials, study time, item number, level of processing, trial and session spacing, affect, serial position, studied size, and the memory load. For both infants and adults, manipulating these variables produces major effects on recognition performance but no effects on priming performance. If memory dissociations are diagnostic of two memory systems, then both are clearly functional by 2 months of age. 
We take this result as further evidence that memory processing by infants and adults is fundamentally the same.
III. Cues and Contexts
Although the focal cue and the training context are encoded in the same memory, they contribute differently to its retrieval and do so after different delays at different ages. We define the focal cue as the object uniquely associated with the contingency (the mobile or train) and the context as the setting where training occurs, which does not affect the task characteristics or demands. In our studies, the context is either a distinctively colored-and-patterned cloth that is draped over the sides of the crib or playpen or a particular room in the home. Other researchers using the mobile task
have manipulated the auditory (Fagen, Prigot, Carroll, Pioli, Stein, & Franco, 1997) and olfactory (Rubin, Fagen, & Carroll, 1999) context.

A. EFFECT OF A CUE CHANGE ON RETENTION
As a rule, novel cues in familiar contexts and familiar cues in novel contexts are ineffective retrieval cues. This rule is qualified by age and the retention interval. Between 2 and 6 months of age, even the slightest change in the cue impairs retention 1 day later. If more than a single object on the test mobile is different, for example, infants fail to recognize it (Hayne, Greco, Earley, Griesler, & Rovee-Collier, 1986). As the retention interval increases, infants increasingly respond to generalized cues--suggesting that they forget the details of the original cue over time (Rovee-Collier & Sullivan, 1980)--unless they are trained and tested in a distinctive context (Butler & Rovee-Collier, 1989; Hayne & Rovee-Collier, 1995), which facilitates discrimination of the test cue from the training cue after long delays. Even though the training-reactivation delay is longer than the delay after which infants generalize to a novel cue, only the original cue--not a generalized one--is an effective memory prime. At 3 months, for example, if a mobile contains more than a single novel object, then it will not reactivate the original memory (Rovee-Collier, Patterson, & Hayne, 1985b). At 6 months, neither a novel mobile (Hill, Borovsky, & Rovee-Collier, 1988) nor a novel train (Hartshorn & Rovee-Collier, 1997) will reactivate it.

B. DEVELOPMENTAL CHANGES IN CUE SPECIFICITY
A novel cue cannot retrieve a training memory after relatively short delays at 2-6 months of age; by 9-12 months, it can after short but not after long delays (Hartshorn, Rovee-Collier, Gerhardstein, Bhatt, Klein, Aaron, Wondoloski, & Wurtzel, 1998a; see Fig. 10, left panel). In deferred imitation studies, infants between 6 and 24 months of age first generalize to a novel test cue at 12 months--again, only after a short delay (e.g., 10 minutes). With age, they generalize to increasingly novel test cues after increasingly longer delays (Barnat, Klein, & Meltzoff, 1996; Hayne, MacDonald, & Barr, 1997). Because older infants generalize after relatively short delays but not after longer ones, they are able to discriminate between old and new cues but actively disregard the difference when the memory is highly accessible. This disregard, in turn, allows older infants to "test the waters" and determine if new and old cues are functionally equivalent.

C. EFFECT OF A CONTEXT CHANGE ON RETENTION
For years, neuropsychologists thought that before 8-9 months of age, infants' brains were too immature to store information about the environmental surround. Nadel, Willner, and Kurz (1985), for example, asserted that "Virtually all learning during infancy is ... independent of context" (p. 398). However, this assertion is incorrect. In our initial study, 3-month-olds recognized the original mobile 1 week after training in the original context but not in a different one, and the training context alone--without the mobile--could reactivate the memory 2 weeks after training, but a novel context could not (Rovee-Collier, Griesler, & Earley, 1985a). Hayne and Findlay (1995) replicated the context-alone reactivation result after 3 and 4 weeks. If the original context, by itself, is an effective retrieval cue, it must be represented in infants' training memory.

Butler and Rovee-Collier (1989) tested 3-month-olds after delays ranging from 1 to 5 days with all combinations of cues and contexts that were the same or different from training to testing. This study yielded three important results. First, infants did not treat the cue and context as a stimulus configuration. Had they done so, then changing either would have impaired retention, but a context change did not impair retention during the 1-day test. Second, the focal cue, an otherwise effective memory prime, was rendered ineffective in a novel context. This result has now been replicated many times with both 3- and 6-month-olds. Third, a highly distinctive training and test context facilitated discrimination of a novel test mobile from the training mobile after delays when generalization to a novel mobile was seen in its absence. This result reflects the disambiguating function of context when the memory of the original cue is fuzzy (Bouton & Bolles, 1985).

[Figure 10. Left panel: Cue Change; right panel: Context Change. Vertical axis: baseline ratio; horizontal axis: relative retention interval (early, middle, late). Separate functions for 2-, 3-, 6-, 9-, and 12-month-olds.]

Fig. 10. Mean baseline ratios of independent groups of infants between 2 and 12 months of age who were trained for two sessions and tested with a different cue in the original context (left panel) or with the original cue in a different context (right panel) after common relative retention intervals that corresponded to the shortest (early) or longest (late) test delays or to the midpoint of the test delays (middle) on the forgetting function of each age group. An asterisk indicates that a group exhibited significant retention (i.e., M baseline ratio significantly > 1.00). Vertical bars indicate ± 1 SE.

D. DEVELOPMENTAL CHANGES IN CONTEXTUAL SPECIFICITY
Context effects have been studied through the first year of life (Hartshorn et al., 1998a). Although the absolute delay after which a context change impairs retention increases over this period, so too does the maximum duration of retention. Therefore, we anchored the forgetting functions obtained by Hartshorn et al. (1998b) from 3-, 6-, 9-, and 12-month-olds at their respective beginning and end points and compared retention across ages at the first, middle, and last points on each forgetting function. These points correspond, respectively, to absolute retention intervals of 1, 3, and 5 days at 3 months; 1, 7, and 14 days at 6 months; 1, 21, and 42 days at 9 months; and 1, 28, and 56 days at 12 months. This strategy revealed that a context change impairs retention only after delays at the end of the forgetting function at all ages except 6 months (Borovsky & Rovee-Collier, 1990), when it impairs recognition after short delays only (see Fig. 10, right panel). We interpret the latter result as a functional adaptation that anticipates independent locomotion at 7 months of age. Having already learned what objects are in what places or contexts, infants can then use independent locomotion to learn what paths lead to those places (i.e., they form cognitive maps). In deferred imitation studies with 12- to 18-month-olds, infants similarly generalize across widely varying training-test contexts after delays from 3 minutes to 28 days (Hanna & Meltzoff, 1993; Klein & Meltzoff, 1999). Because 14-month-olds can imitate after delays of at least 14 weeks (Meltzoff, 1995), a context change should not impair retention until that delay or longer.
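The logic of comparing ages at common relative retention intervals can be illustrated with a short sketch. It uses only the first and last test delays listed above; the linear scaling, the function names, and the one-third cutoffs are assumptions introduced here for illustration and are not a procedure described in the chapter.

```python
# Illustrative only: first and last test delays (in days) on each age group's
# forgetting function, taken from the retention intervals listed in the text.
FORGETTING_FUNCTION_DAYS = {
    3: (1, 5),
    6: (1, 14),
    9: (1, 42),
    12: (1, 56),
}


def relative_position(age_months, delay_days):
    """Return where a delay falls on an age group's forgetting function.

    0.0 is the earliest test delay and 1.0 the latest. Linear scaling is an
    assumption made for this sketch, not a method stated in the chapter.
    """
    first, last = FORGETTING_FUNCTION_DAYS[age_months]
    if not first <= delay_days <= last:
        raise ValueError("delay lies outside the tested forgetting function")
    return (delay_days - first) / (last - first)


def label(position):
    """Map a relative position to early/middle/late (hypothetical cutoffs)."""
    if position < 1 / 3:
        return "early"
    if position < 2 / 3:
        return "middle"
    return "late"


# The same absolute delay occupies different relative positions at different ages.
for age in (6, 12):
    pos = relative_position(age, 14)
    print(f"{age} months, 14-day delay: {pos:.2f} ({label(pos)})")
```

Under these assumptions, a 14-day delay sits at the end of the forgetting function at 6 months but near its beginning at 12 months, which is why absolute delays alone would be a misleading basis for comparing age groups.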
Although context effects are a major source of retrieval failures in animals, Crowder (1985) argued that context effects in human adults result only from "sledge-hammer" manipulations, such as learning a word list underwater and recalling it on land (Godden & Baddeley, 1975): "As far as the flow of time without such radical interventions, contextual drift is more an article of faith than it is an operational concept" (p. 33). The infant data, however, suggest that less "radical interventions" may have failed to yield context effects in adults because their retention tests are usually administered after relatively short delays.

E. OVERRIDING THE EFFECTS OF CUE AND CONTEXT CHANGES
The debilitating effect on retention of altered cues and contexts can be overridden at 3 and 6 months by initially training infants with two or more mobiles (Fagen, Morrongiello, Rovee-Collier, & Gekoski, 1984) or in two
or more contexts (Amabile & Rovee-Collier, 1991). The same result is achieved by merely exposing infants to a novel mobile (Rovee-Collier, Borza, Adler, & Boller, 1993a) or novel context (Boller & Rovee-Collier, 1992) after training them with a single mobile in a single context. Apparently, the novel mobile or context is integrated with the prior training memory because infants subsequently respond to the novel mobile or in the novel context, and the novel mobile primes the training memory in a reactivation paradigm.

Cue- and context-dependent retrieval can also be eliminated by multiple reactivations, although the memory attributes representing the cue are more resilient than those representing the context. Hitchcock and Rovee-Collier (1996) reactivated the memory of 3-month-olds both 6 and 20 days after training, but during the second reactivation, either the cue or the context was novel. The second reactivation was successful when the context--but not the cue--was novel. They then gave reactivation treatments 6, 13, and 20 days after training, but this time, the cue or context was novel during the third reactivation only. As before, reactivation was successful when the context--but not the cue--was novel. When this experiment was repeated with the third reactivation 34 days after training, when the twice-reactivated memory was clearly reforgotten, the reactivation was effective when the cue was novel (see Fig. 11). A control group whose second reactivation was with a novel cue after the same delay, however, exhibited no retention, confirming that the age of the memory itself was not responsible for the effectiveness of the novel cue after 34 days. These data reveal that the attributes representing the specific details of the cue and the context effectively disappear from episodic memory at different rates after different numbers of prior reactivations. Thereafter, the memory can be reactivated by generalized cues in generalized contexts.
The finding that specific contextual details are so resistant to forgetting in delayed recognition tasks (being absolutely essential for memory retrieval after long delays) but are so vulnerable in reactivation tasks (being unnecessary for memory retrieval after just one prior reactivation) is paradoxical. Given that specific place information is "lost" by the time of a second reactivation, the inability of children and adults to remember the time or place of early life events is hardly surprising and reveals how easily information can be transferred from episodic to semantic memory. This result was anticipated by Furlong (1951, cited in Tulving, 1983, p. 17), who distinguished "retrospective" from "non-retrospective" memory by its reference to context in time and space. He hypothesized that retrospective memory became non-retrospective memory as the context faded. Tulving (1972) subsequently proposed that information is more readily lost from the episodic than from the semantic memory system. The multiple-reactivation study reveals that the first information to be lost from episodic memory in infancy is contextual and that, after additional reactivations and longer delays, information about the focal cue is lost. Whether the same general result will be found for older infants and young children is unknown. Given that few adults remember specific events that occurred before 2 or 3 years of age (Usher & Neisser, 1993), however, the effect of repeated priming on access to contextual details probably does not change radically over ontogeny.

[Figure 11. Panels: No Change, Cue Change, Context Change. Vertical axis: baseline ratio; horizontal axis: number of reminders (1, 2, 3).]

Fig. 11. The progressively diminishing effects on memory retrieval of either a cue or a context change during the final (or only) reactivation reminder for groups of 3-month-olds who received one, two, or three reactivations. Asterisks indicate that a group exhibited significant retention 1 day following reactivation (i.e., M baseline ratio significantly > 1.00); that is, the reactivation successfully retrieved the memory. Vertical bars indicate ± 1 SE. Reprinted with permission from Hitchcock, D. F. A., & Rovee-Collier, C. (1996). The effect of repeated reactivations on memory specificity in infants. Journal of Experimental Child Psychology, 27, 746-762.

F. CUE AND CONTEXT: DIFFERENCES IN PROCESSING TIME
The processing of an event continues for a period of time after the event is over, during which time the memory is particularly susceptible to modulation. This phenomenon has been the basis for studies of consolidation, retroactive interference, and the administration of amnesic or memory-enhancing agents. Because the cue and context contribute differentially to an event's memorability, we thought that they might also be processed for different periods of time after the event is over. If this were the case, then infants' memory for the cue and the context would be differentially susceptible to retroactive interference. We previously found that exposing infants to a novel cue (Rovee-Collier et al., 1993a) or a novel context (Boller & Rovee-Collier, 1992) immediately after training disrupted recognition of the original cue in the original context 1 day later--a classic retroactive interference effect. Exposing them to the novel cue or context after a 1-day delay, however, produced no retroactive interference 1 day after that.

To test the processing-time hypothesis, we exposed 3-month-olds to a novel cue, a novel context, or both after posttraining delays of 0-24 hours and asked when the postevent exposure no longer impaired their recognition of the original cue in the original context (Rossi-George & Rovee-Collier, in press). The results were surprising (see Fig. 12). A novel cue impaired recognition after exposure delays from 0 to 20 minutes; after exposure delays longer than 20 minutes, no retroactive interference was seen. A novel context impaired recognition after an exposure delay of 2 hours, but after 4 hours it did not. Exposure to a cue and context that were both novel did not interfere with recognition, confirming that the degree of overlap between the original and the interpolated stimuli is a major factor in retroactive interference. The finding that the cue remains vulnerable to retroactive interference for a shorter time than the context suggests that it is processed more rapidly.
The finding that infants' memory for the cue is more buffered against retroactive interference than their memory for the context is consistent with the finding that it is also more resistant to the effects of multiple reactivations. However, this finding does not explain why the original context is requisite for memory retrieval only after relatively long delays. We found that these retroactive interference effects are only temporary (Gulya, Rossi-George, & Rovee-Collier, 1999). When we exposed 3-month-olds to a novel cue immediately after training, they again recognized the original training cue 2 days later. Chandler (1991) obtained a similar finding with adults: Interference resulting from exposure to a novel visual stimulus disappeared after 2 days.

G. DISTORTING MEMORY FOR CUE AND CONTEXT
Although some retroactive interference effects are relatively transient, exposing infants to a novel cue or context after delays so long that the details of the original cue or context have been forgotten can permanently
> 1.00); vertical bars indicate ± 1 SE.

they were not simply generalizing to all test mobiles but had learned the identity of all five mobiles on the training list. Because training with a five-item list eliminated serial position effects but not the discrimination of a mobile that was not on the list, we conclude that increasing list length impaired infants' memory for serial order but did not affect their memory for item identity. Even when infants were tested 1 hour after the end of training with the mobile from Serial Position 3, they still recognized the mobile from the middle of the list. These results indicate that infants' retention of serial-order information is compromised on delayed recognition tests when the length of their study list is increased. Infants forgot the five-item list 2 weeks later, just as they had forgotten a three-item list after that same delay. When they received a reactivation
treatment 13 days after training and were tested with mobiles from the five-item list 24 hours later, however, infants exhibited a primacy effect, recognizing only the mobile from Serial Position 1. These results suggested that infants who were originally trained with the five-item list had learned serial-order information after all and confirmed prior findings from our laboratory that a reactivation paradigm is more sensitive to the information that was originally encoded than a delayed recognition paradigm (Rovee-Collier et al., 1985b). Because infants trained on a five-mobile list had demonstrated a primacy effect 24 hours after reactivation just as they had 24 hours after training on a three-mobile list, we decided to prime their reactivated memory with valid and invalid order cues immediately before the test, as in Gulya et al. (1998), to determine whether they had originally learned the serial order of the five list members. For methodological reasons (infants would not tolerate the 8 minutes required to prime the mobile from Serial Position 5 with mobiles from the preceding serial positions), we again primed and tested infants with mobiles from the second and third serial positions only. As before, infants recognized the test mobiles from Serial Positions 2 and 3 only when they were preceded by valid order cues. These findings confirmed that infants had learned the order of items even when the list was almost twice as long.

C. INFANTS' ABILITY TO DETECT STRUCTURE
Given that infants as young as 3 months can learn the structure of a category in a succession of items, it is not surprising that they can also detect the structure of a serial list. Other researchers have also demonstrated that young infants can detect the structure in their environment. Hull-Smith, Arehart, Haaf, and deSaint Victor (1989), for example, reported that 5-month-olds who saw a stimulus appear in each of four locations in a specific order, when cued with the first stimulus 1 minute and 1 week later, looked at the remaining three locations in the correct order. Similarly, Haith, Hazan, and Goodman (1988) exposed 3.5-month-olds to a stimulus that appeared in one of two locations in either an alternating sequence or randomly. When stimuli alternated, infants' reaction times decreased, and they anticipatorily fixated the next (correct) location. Mandel, Nelson, and Jusczyk (1996) reported that 2-month-olds can detect if the order of words in a sentence has changed. They familiarized infants with either a well-ordered, complete sentence or a sentence fragment and tested them immediately after training with the original words in a different order. During testing, the group that originally heard a sentence fragment failed to detect that the order of the words was different, but the group whose words were originally embedded in a complete sentence did. The authors hypothesized
that the internal structure of the well-ordered sentence allowed infants to remember its word order. Taken together, the preceding evidence reveals that very young infants are capable of detecting structure in their environment and learning the serial order of arbitrarily ordered items. These results clearly demonstrate that enabling relations are not necessary for infants' learning of serially ordered information (Bauer, 1996). Not only is the capacity for learning and remembering serially ordered information present very early in life and utilized long before infants are able to talk, but also infants' ability to learn an arbitrary list of items is critical for their subsequent development of language (Terrace, 1998, cited in Bower, 1998).
VI. Infantile Amnesia
The widely held view that memory processing by preverbal infants is of little or no consequence for memory processing by older children and adults reflects in large part the ubiquity of the phenomenon that older children and adults usually cannot remember events that occurred before 2-3 years of age (Usher & Neisser, 1993; White & Pillemer, 1979)--the phenomenon of infantile amnesia. This phenomenon is usually attributed either to the neurological immaturity of the brain mechanisms responsible for encoding and/or maintaining memories over the long term (Nelson, 1995; Schacter & Moscovitch, 1984) or to the inability of infants to maintain memories over the long term because they cannot rehearse events by periodically talking about them (K. Nelson, 1990). Our studies of periodic, nonverbal reminders with preverbal infants (see Section I), however, effectively dismiss both accounts.

If current accounts of infantile amnesia are inadequate, then what does explain it? We have demonstrated that, for infants, a fairly veridical match between the encoding and retrieval cues is critical for memory retrieval after relatively long delays. If this general rule continues to apply into adulthood, then contextual changes--both natural and perceived--would severely reduce the probability that a memory encoded in infancy would be retrieved later in life. Shifting from nonverbal to verbal retrieval cues as children become increasingly reliant on language would further exacerbate this problem. Finally, because contextual information disappears from infants' memories that have been either repeatedly retrieved in different contexts or reactivated just once or twice in the original context, older children and adults may actually remember some early life events but be unable to identify when or where those events happened.
VII. Summary
The findings presented in the preceding sections stand in stark contrast to the widely held view that memory processing by preverbal infants is qualitatively different from that of verbally competent children and adults. These findings demonstrate that quantitative aspects of memory processing (e.g., duration of retention, speed of retrieval) change with age, but the basic mechanisms that underlie memory processing apparently do not. At all ages, memories are forgotten gradually, are recovered by reminders, and are modified by new information that overlaps with old. In addition, because young infants exhibit a number of phenomena that are difficult if not impossible to study in older subjects, whose processing is so rapid, and their data are free of linguistic influences and social demands, they are the subjects of choice for studies of many memory phenomena.

ACKNOWLEDGMENTS

Preparation of this chapter was supported by grant nos. R37-MH32307 and K05-MH00902 from the National Institute of Mental Health to the first author. We thank the many students and colleagues who selflessly contributed to the research findings that we have described.

REFERENCES

Adler, S. A., Gerhardstein, P., & Rovee-Collier, C. (1998). Levels-of-processing effects in infant memory? Child Development, 69, 280-294.
Amabile, T. A., & Rovee-Collier, C. (1991). Contextual variation and memory retrieval at six months. Child Development, 62, 1155-1166.
Bachevalier, J., & Mishkin, M. M. (1984). An early and a late developing system for learning and retention in infant monkeys. Behavioral Neuroscience, 98, 770-778.
Barnat, S. A., Klein, P. J., & Meltzoff, A. N. (1996). Deferred imitation across changes in context and object: Memory and generalization in 14-month-old infants. Infant Behavior and Development, 19, 241-251.
Bauer, P. J. (1996). What do infants recall of their lives? Memory for specific events by one- to two-year-olds. American Psychologist, 51, 29-41.
Bauer, P. J., & Fivush, R. (1992). Constructing event representations: Building on a foundation of variation and enabling relations. Cognitive Development, 7, 381-401.
Bauer, P. J., Hertsgaard, L. A., & Dow, G. A. (1994). After 8 months have passed: Long-term recall of events by 1- to 2-year-old children. Memory, 2, 353-382.
Bauer, P. J., Hertsgaard, L. A., & Wewerka, S. S. (1995). Effects of experience and reminding on long-term recall in children: Remembering not to forget. Journal of Experimental Child Psychology, 59, 260-298.
Bauer, P. J., & Mandler, J. M. (1992). Putting the horse before the cart: The use of temporal order in recall of events by one-year-old children. Developmental Psychology, 28, 441-452.
Belli, R. F., Windschitl, P. D., McCarthy, T. T., & Winfrey, S. E. (1992). Detecting memory impairment with a modified test procedure: Manipulating retention interval with centrally presented event items. Journal of Experimental Psychology: Learning, Memory, & Cognition, 18, 356-367.
Bhatt, R. S., & Rovee-Collier, C. (1994). Perception and 24-hour retention of feature relations in infancy. Developmental Psychology, 30, 142-150.
Bhatt, R. S., Wilk, A., & Rovee-Collier, C. (1996, April). Feature relations and the development of categorization. Paper presented at the International Conference on Infant Studies, Providence, RI.
Blough, D. S. (1982). Pigeon perception of letters of the alphabet. Science, 218, 397-398.
Boller, K., & Rovee-Collier, C. (1992). Contextual coding and recoding of infant memory. Journal of Experimental Child Psychology, 52, 1-23.
Boller, K., Rovee-Collier, C., Gulya, M., & Prete, K. (1996). Infants' memory for context: Timing effects of postevent information. Journal of Experimental Child Psychology, 63, 583-602.
Borovsky, D., & Rovee-Collier, C. (1990). Contextual constraints on memory retrieval at 6 months. Child Development, 61, 1569-1583.
Bouton, M. E., & Bolles, R. C. (1985). Contexts, event-memories, and extinction. In P. D. Balsam & A. Tomie (Eds.), Context and learning (pp. 133-166). Hillsdale, NJ: Erlbaum.
Bower, B. (1998). Babies get a kick out of serial memories. Science News, 154, 53.
Brainerd, C. J., & Reyna, V. F. (1988). Memory loci of suggestibility development: Comment on Ceci, Ross, and Toglia (1987). Journal of Experimental Psychology: General, 117, 208-211.
Butler, J., & Rovee-Collier, C. (1989). Contextual gating of memory retrieval. Developmental Psychobiology, 22, 533-552.
Campbell, B. A., & Jaynes, J. (1966). Reinstatement. Psychological Review, 73, 478-480.
Chandler, C. C. (1991). How memory for an event is influenced by related events: Interference in modified recognition tests. Journal of Experimental Psychology: Learning, Memory, & Cognition, 17, 115-125.
Clayton, K., Habibi, A., & Bendele, M. S. (1995). Recognition priming effects following serial learning: Implications for episodic priming effects. American Journal of Psychology, 108, 547-561.
Colombo, J., & Mitchell, D. W. (1990). Individual differences in early visual attention: Fixation time and information processing. In J. Colombo & J. W. Fagen (Eds.), Individual differences in infancy (pp. 193-227). Hillsdale, NJ: Erlbaum.
Cornell, E. H., & Bergstrom, L. I. (1983). Serial-position effects in infants' recognition memory. Memory & Cognition, 11, 494-499.
Crowder, R. G. (1985). Basic theoretical concepts in human learning and cognition. In L.-G. Nilsson & T. Archer (Eds.), Perspectives on learning and memory (pp. 19-37). Hillsdale, NJ: Erlbaum.
Fagen, J. W., Morrongiello, B. A., Rovee-Collier, C., & Gekoski, M. J. (1984). Expectancies and memory retrieval in three-month-old infants. Child Development, 55, 936-943.
Fagen, J. W., Prigot, J., Carroll, M., Pioli, L., Stein, A., & Franco, A. (1997). Auditory context and memory retrieval in young infants. Child Development, 68, 1057-1066.
Gerhardstein, P., Renner, P., & Rovee-Collier, C. (1999). The effect of conceptual and perceptual target-distractor similarity on color pop-out in infants. British Journal of Developmental Psychology.
Gibson, E. J. (1969). Principles of perceptual learning and development. New York: Appleton-Century-Crofts.
Godden, D. R., & Baddeley, A. D. (1975). Context-dependent memory in two natural environments: On land and underwater. British Journal of Psychology, 66, 325-332.
Graf, P., & Schacter, D. L. (1985). Implicit and explicit memory for new associations in normal and amnesic patients. Journal of Experimental Psychology: Learning, Memory, & Cognition, 11, 501-518.
Infant Memory
43
Greco, C., Hayne, H., & Rovee-Collier, C. (1990). The roles of function, reminding, and variability in categorization by 3-month-old infants. Journal of Experimental Psychology: Learning, Memory, & Cognition, 16, 617-633. Greco, C., Rovee-Collier, C., Hayne, H., Griesler, P., and Earley, L. (1986). Ontogeny of early event memory: I. Forgetting and retrieval by 2- and 3-month-olds. Infant Behavior and Development, 9, 461-472. Green, P. C. (1962). Learning, retention, and generalization of conditioned responses by young monkeys. Psychological Reports, 10, 731-738. Gulya, M., Rovee-Collier, C., Galluccio, L., & Wilk, A. (1998). Memory processing of a serial list by very young infants. Psychological Science, 9, 303-307. Gulya, M., Rossi-George, A., & Rovee-Collier, C. (1999, April). Time-dependent retroactive interference on a recognition task. Paper presented at the meeting of the Eastern Psychological Association, Providence, RI. Gulya, M., Sweeney, B., & Rovee-Collier, C. (1999). Infants' memory processing of a serial list: List length effects. Journal of Experimental Child Psychology, 73, 72-91. Haith, M. M., Hazan, C., & Goodman, G. S. (1988). Expectation and anticipation of dynamic visual events by 3.5-month-old babies. Child Development, 59, 467-479. Hanna, E., & Meltzoff, A. N. (1993). Peer imitation by toddlers in laboratory, home, and daycare contexts: Implications for social learning and memory. Developmental Psychology, 29, 701-710. Hartshorn, K. (1998, October). The effect of reinstatement on infant long-term retention. Unpublished doctoral dissertation, Rutgers University, New Brunswick, NJ. Hartshorn, K., & Rovee-Collier, C. (1997). Infant learning and long-term memory at 6 months: A confirming analysis. Developmental Psychobiology, 30, 71-85. Hartshorn, K., Royce-Collier, C., Gerhardstein, P., Bhatt, R. S., Klein, P. J., Aaron, F., Wondoloski, T. L., & Wurtzel, N. (1998a). Developmental changes in the specificity of memory over the first year of life. 
Developmental Psychobiology, 33, 61-78. Hartshorn, K., Rovee-Collier, C., Gerhardstein, P., Bhatt, R. S., Wondoloski, T. L., Klein, P., Gilch, J., Wurtzel, N., & Campos-de-Carvalho, M. (1998b). The ontogeny of longterm memory over the first year-and-a-half of life. Developmental Psychobiology, 32, 1-31. Hayne, H. (1990). The effect of multiple reminders on long-term retention in human infants. Developmental Psychobiology, 23, 453-477. Hayne, H., & Findlay, N. (1995). Contextual control of memory retrieval in infancy: Evidence for associative priming. Infant Behavior and Development, 18, 195-207. Hayne, H., Greco, C., Earley, L. A., Griesler, P. C., & Rovee-Collier, C. (1986). Ontogeny of early event memory: I. Encoding and retrieval by 2- and 3-month-olds. Infant Behavior and Development, 9, 441-460. Hayne, H., Hildreth, K., & Rovee-Collier, C. (1998, April). Repeated reminders facilitate memory retrieval Paper presented at the meeting of the International Society of Infant Studies, Atlanta, GA. Hayne, H., MacDonald, S., & Barr, R. (1997). Developmental changes in the specificity of memory over the second year of life. Infant Behavior and Development, 20, 233-245. Hayne, H., & Rovee-Collier, C. (1995). The organization of reactivated memory in infancy. Child Development, 66, 893-906. Hayne, H., Rovee-Collier, C., & Perris, E. E. (1987). Categorization and memory retrieval in 3-month-olds. Child Development, 58, 750-767. Herrnstein, R. J., & de Villiers, P. A. (1980). Fish as a natural category for people and pigeons. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 10, pp. 59-95). San Diego: Academic Press.
44
Carolyn Rovee-Collier and Michelle Gulya
Herrnstein, R. J., & Loveland, D. H. (1964). Complex visual concept in the pigeon. Science, 146, 549-551. Hildreth, K., & Rovee-Collier, C. (in press). Decreases in the response latency to priming over the first year of life. Developmental Psychobiology. Hill, W. H., Borovsky, D., & Rovee-Collier, C. (1988). Continuities in infant memory development over the first half-year. Developmental Psychobiology, 21, 43-62. Hitchcock, D. F. A., & Rovee-Collier, C. (1996). The effect of repeated reactivations on memory specificity in infants. Journal of Experimental Child Psychology, 62, 378-400. Howe, M. L. (1991). Misleading children's story recall: Forgetting and reminiscence of the facts. Developmental Psychology, 27, 746-762. Hull Smith, P., Arehart, D. M., Haaf, R. A., & deSaint Victor, C. M. (1989). Expectancies and memory for spatiotemporal events in 5-month-old infants. Journal of Experimental Child Psychology, 47, 210-235. Kagan, J., & Hamburg, M. (1981). The enhancement of memory in the first year. Journal of Genetic Psychology, I38, 3-14. Keller, F. S., & Schoenfeld, W. N. (1950). Principles of psychology. New York: AppletonCentury-Crofts. Klein, P. J., & Meltzoff, A. N. (1999). Long-term memory, forgetting, and deferred imitation in 12-month-olds. Developmental Science, 2, 102-113. Leaton, R. N. (1976). Long-term retention of the habituation of lick suppression and startle response produced by a single auditory stimulus. Journal of Experimental Psychology: Animal Behavior Processes, 2, 248-259. Mandel, D. R., Nelson, D. G. K., & Jnsczyk, P. W. (1996). Infants remember the order of words in a spoken sentence. Cognitive Development, 11, 181-196. Mandler, J. M. (1984). Representation and recall in infancy. In M. Moscovitch (Ed.), Advances in the study of communication and affect. Vol. 9: Infant memory (pp. 75-101). New York: Plenum. Mandler, J. M., & McDonough, L. (1995). Long-term recall of event sequences in infancy. 
Journal of Experimental Child Psychology, 59, 457-474. Meltzoff, A. N. (1995). What infant memory tells us about infantile amnesia: Long-term recall and deferred imitation. Journal of Experimental Child Psychology, 59, 497-515. McDonough, L., Mandler, J. M., McKee, R. D., & Squire, L. R. (1995). The deferred imitation task as a nonverbal measure of declarative memory. Proceedings of the National Academy of Sciences, 92, 7580-7584. McKee, R. D., & Squire, L. R. (1993). On the development of declarative memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, I9, 397-404. Merriman, J., & Rovee-Collier, C. (1994, June). Developmental changes in infants' sensitivity to temporal order. Paper presented at the International Conference on Infant Studies, Paris, France. Merriman, J., Rovee-Collier, C., & Wilk, A. (1997). Exemplar spacing and infants' memory for category information. Infant Behavior and Development, 20, 219-232. Miller, R. R., & Berk, A. N. (1997). Retention over metamorphosis in the African claw-toed frog. Journal of Experimental Psychology: Animal Behavior Processes. 3, 343-356. Morgan, M. J., Fitch, M. D., Holman, J. G., & Lea, S. E. G. (1975). Pigeons learn the concept of an "A." Perception, 5, 57-66. Murdock, B. B. (1962). The serial position effect of free recall. Journal of Experimental Psychology, 64, 482-488. Muzzio, I. A., & Rovee-Collier, C. (1996). Timing effects of postevent information on infant memory. Journal of Experimental Child Psychology, 63, 212-238.
Infant Memory
45
Nadel, L., Willner, J., & Kurz, E. M. (1985). Cognitive maps and environmental context. In P. D. Balsam & A. Tomie (Eds.), Context and learning (pp. 385-406). Hillsdale, NJ: Erlbaum. Naito, M., & Komatsu, S. I. (1993). Processes involved in childhood development of implicit memory. In P. Graf & M. E. J. Masson (Eds.), Implicit memory: New directions in cognition, development, and neuropsychology (pp. 231-260). Hillsdale, NJ: Erlbaum. Neisser, U. (1987). Preface. In U. Neisser (Ed.), Concepts and conceptual development (pp. vii-ix). Cambridge: Cambridge University Press. Neisser, U. (1997, November). Enabling conditions for false memories. Colloquium presented to the Department of Psychology, Rutgers University, New Brunswick, NJ. Nelson, C. A. (1995). The ontogeny of human memory: A cognitive neuroscience perspective. Developmental Psychology, 31, 723-738. Nelson, K. (1990). Remembering, forgetting, and childhood amnesia. In R. Fivush & J. A. Hudson (Eds.), Knowing and remembering in young children (pp. 301-316). Cambridge: Cambridge University Press. Nelson, K. (1993). Events, narratives, memory: What develops? In C. A. Nelson (Ed.), Minnesota symposia on child psychology. Vol. 24: Memory and affect in development (pp. 1-24). Hillsdale, NJ: Erlbaum. Reinitz, M. T. & Alexander, R. (1996). Mechanisms of facilitation in primed perceptual identification. Memory & Cognition, 24, 129-135. Reinitz, M. T., Wright, E., & Loftus, G. R. (1989). Effects of semantic priming on visual encoding of pictures. Journal of Experimental Psychology: General, 118, 280-297. Reznick, J. S., & Kagan, J. (1983). Category detection in infancy. In L. P. Lipsitt (Ed.), Advances in infancy research (Vol. 2, pp. 80-108). Norwood, NJ: Ablex. Rosch, E., Mervis, C. G., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382-439. Rossi-George, A., & Rovee-Collier, C. (in press). Retroactive interference in human infants.
Developmental Psychobiology. Rovee, C. K., & Rovee, D. T. (1969). Conjugate reinforcement of infant exploratory behavior. Journal of Experimental Child Psychology, 8, 33-39. Rovee-Collier, C. (1995). Time windows in cognitive development. Developmental Psychology, 51, 1-23. Rovee-Collier, C. (1997). Dissociations in infant memory: Rethinking the development of implicit and explicit memory. Psychological Review, 104, 467-498. Rovee-Collier, C., Adler, S. A., & Borza, M. A. (1994). Substituting new details for old? Effects of delaying postevent information on infant memory, Memory & Cognition, 22, 644-656. Rovee-Collier, C., Borza, M. A., Adler, S. A., &Boller, K. (1993a). Infants' eyewitness testimony: Integrating postevent information with a prior memory representation. Memory & Cognition, 21, 267-279. Rovee-Collier, C., Greco-Vigorito, C., & Hayne, H. (1993b). The time window hypothesis: Implications for categorization and memory modification. Infant Behavior and Development, 16, 149-176. Rovee-Collier, C., Griesler, P. C., & Earley, L. A. (1985a). Contextual determinants of infant retention. Learning and Motivation, 16, 139-157. Rovee-Collier, C., Hartshorn, K., & DiRubbo, M. (1999). Long-term maintenance of infant memory. Developmental Psychobiology. Rovee-Collier, C., Patterson, J., & Hayne, H. (1985b). Specificity in the reactivation of infant memory. Development Psychobiology, 18, 559-574.
46
Carolyn Rovee-Collier and Michelle Gulya
Rovee-Collier, C., & Sullivan, M. W. (1980). Organization of infant memory. Journal of Experimental Psychology: Human Learning and Memory, 6, 798-807. Rovee-Collier, C., Sullivan, M. W., Enright, M. K., Lucas, D., & Fagen, J. W. (1980). Reactivation of infant memory. Science, 208, 1159-1161. Rozin, P. (1976). The psychobiological approach to human memory. In M. R. Rosenzweig & E. L. Bennett (Eds.), Neural mechanisms of learning and memory (pp. 3-48). Cambridge, MA: MIT Press. Rubin, G. B., Fagen, J. W., & Carroll, M. (1999). Olfactory context and memory retrieval in 3-month-old infants. Infant Behavior and Development, 21, 641-658. Schacter, D. L., & Moscovitch, M. (1984). Infants, amnesics, and dissociable memory systems. In M. Moscovitch (Ed.), Advances in the study of communication and affect. Vol. 9: Infant memory (pp. 173-216). New York: Plenum. Sherman, T. (1985). Categorization skills in infants. Child Development, 56, 1561-1573. Shields, P. J., & Rovee-Collier, C. (1992). Long-term memory for context-specific category information at 6 months. Child Development, 63, 175-214. Spear, N. E., & Parsons, P. J. (1976). Analysis of a reactivation treatment: Ontogenetic determinants of alleviated forgetting. In D. L. Medin, W. A. Roberts, & R. T. Davis (Eds.), Processes of animal memory (pp. 135-165). Hillsdale, NJ: Erlbaum. Strauss, M. S. (1979). Abstraction of prototypical information by adults and 10-month-old infants. Journal of Experimental Psychology: Human Learning and Memory, 5, 618-632. Squire, L. R. (1987). Memory and brain. New York: Oxford University Press. Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381-403). New York: Academic Press. Tulving, E. (1983). Elements of episodic memory. New York: Oxford University Press. Usher, J. A., & Neisser, U. (1993). Childhood amnesia and the beginnings of memory for four early life events. Journal of Experimental Psychology: General, 122, 155-165. 
Warrington, E. K., & Weiskrantz, L. (1970). Amnesic syndrome: Consolidation or retrieval? Nature, 228, 629-630. White, S. H., & Pillemer, D. B. (1979). Childhood amnesia and the development of a socially accessible memory system. In J. F. Kihlstrom & F. J. Evans (Eds.), Functional disorders of memory (pp. 29-74). Hillsdale, NJ: Erlbaum. Wright, A. A., Santiago, H. C., Sands, S. F., Kendrick, D. F., & Cook, R. G. (1985). Memory processing of serial lists by pigeons, monkeys, and people. Science, 22, 287-289. Younger, B. A., & Cohen, L. B. (1986). Developmental change in infants' perceptions of correlations among attributes. Child Development, 57, 803-815.
THE COGNITIVE-INITIATIVE ACCOUNT OF DEPRESSION-RELATED IMPAIRMENTS IN MEMORY
Paula T. Hertel
THE PSYCHOLOGY OF LEARNING AND MOTIVATION, VOL. 39. Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved. 0079-7421/00 $30.00

I. Introduction

The many and diverse interpretations of the word control make it clear that control constitutes a fundamental concern in most areas of psychology. In an illustration of this diversity, I described my interest in controlled uses of memory at a social gathering; my new acquaintances, without realizing the non sequitur, subsequently raised issues about self control and loss of control--issues much more relevant to their own interests in psychological phenomena than are my narrow musings. Yet a second thought devoted to the semantics of control reveals underlying commonalities. For example, when older people begin to have problems with controlled uses of memory, they sometimes feel like they are losing control in a more general sense.

Consider a related concept: initiative. When it is used in the context of research on memory and cognition, it refers to the research participants' use of cognitive procedures that are not specified fully by the constraints of the experimental task (e.g., Hertel & Hardin, 1990). When it is used in an everyday context, however, it suggests an active motivational state or a certain readiness to perform, as in, "She showed excellent initiative in organizing the meeting." Yet, as the example illustrates, this everyday sense of initiative also sometimes includes the notion that procedures were accomplished without prior specification, that someone has done something they were not directly told to do. Turning the concept of initiative in the clinical direction, we easily observe that depressed people do not show much of it. The lack of initiative in the everyday sense is understood to be a fundamental characteristic of depression. Regardless of their readiness, however, people who are in depressed mood states often show deficits in initiative in the sense that refers to deficient cognitive control. They have difficulty initiating thoughts and actions, or at least the sorts of thoughts and actions that happen to produce benefits on routine cognitive tasks.

My interest in depression-related impairments in remembering was brought about by an accidental discovery that experimental control eliminated such impairments. When attention was well controlled by the demands of the task, and therefore did not have to be self-controlled, depressed participants recalled as well as did others. In this chapter, I review the lines of research that grew out of this accidental discovery and that my colleagues and I have called the cognitive-initiative framework. The reader should see that the sense of initiative that we have tried to capture since the late 1980s or so refers to the unspecified or uninstructed use of procedures ("someone has done something they were not directly told to do"). However, this approach has occasionally been interpreted in a more broadly motivational sense. After reviewing some studies, both old and new, that illustrate the framework, the chapter discusses the broader interpretation and related approaches.
II. The Framework and the Findings
As a general approach for organizing findings of depression-related impairments in deliberate remembering, the cognitive-initiative framework makes three basic claims. First, evidence of impairments should be found under conditions in which attention is poorly controlled and cognitive procedures loosely constrained. Second, task structures and constraints that have been shown to benefit deliberate attempts to remember, when employed with participants in depressed and nondepressed states, should close the gap otherwise associated with the difference in state. In other words, the experimental control of attention--during initial exposure to the materials or during the memory test later on--should provide a good substitute for self-initiated control. Third, if the criterion task--the one that shows effects of prior experience--does not typically invoke deliberate remembering, depression-related impairments should not be found. In fact, on tasks that are actually disrupted by deliberate attempts to remember, depressed participants should fail to show the disruption. This section reviews research that supports each of these claims, beginning with the last as it plays out in the realm of problem solving.
A. WHEN CONTROL DISRUPTS
Like the memory literature, the literature on problem solving reveals difficulties associated with depressed mood states (see Williams, Watts, MacLeod, & Mathews, 1997). According to most theoretical frameworks, these difficulties would be expected to emerge in some types of problem-solving tasks, particularly those tasks that require sustained attention to systematic steps in hypothesis testing. There is, however, a paradigm for studying spontaneous analogical transfer in problem solving that is procedurally similar to many memory paradigms and that has revealed an advantage to being in a depressed mood. Spontaneous analogical transfer is typically studied by presenting logic and other kinds of word problems together with their solutions during a training phase. The training phase is followed by a transfer phase in which analogous problems are presented for solution without mention of the prior analogs. The paradigm, therefore, is procedurally similar to indirect tests of memory, during which no mention is made of the prior phase of initial exposure to materials. This parallel extends to the researchers' interests as well, because in both cases we are interested in revealing a benefit of prior "training" through its nondeliberate use. With this arrangement, Needham and Begg (1991) showed that spontaneous transfer profited from problem-oriented training. In the problem-oriented condition, students were asked to try to solve each training problem before its solution was explained. Students in this condition accurately solved more transfer problems than did the students who had been given memory-oriented instructions during training--students who tried to learn the training problems and their solutions for a later test. (In a transfer-appropriate fashion, memory-oriented training produced superior recall of the training problems.)
One possible interpretation of the problem-solving results holds that the solutions to the training problems come to mind spontaneously as the students are introduced to the corresponding analogical problem structures, much in the same way that prior experience influences performance on indirect tests of memory. In other words, memory operates in a truly spontaneous or automatic fashion. Alternatively, problem solvers might notice similarities between the problem sets and initiate a deliberate and controlled search of memory for the appropriate solutions. In other words, memory might operate predominately in automatic versus controlled ways to facilitate the solutions to the transfer problems. Knowing the extent to which self-initiated, controlled reflection plays a role is important to the consideration of how well students in depressed or dysphoric¹ moods should do.

¹ To avoid misrepresenting the state of depression, dysphoria is a term used to refer to naturally occurring negative affect, as it is measured by such instruments as the Beck Depression Inventory (Beck, Ward, Mendelson, Mock, & Erbaugh, 1961) in the absence of clinical diagnosis through structured clinical interviews.
Alicia Knoedler and I (Hertel & Knoedler, 1996) predicted that performance of dysphoric students should be impaired if self-initiated reflection plays a significant role in the transfer phase. We also reasoned that if reflection plays a significant role, everyone (and especially the dysphoric students) should be helped by the provision of hints to think back to the analogous training problems before they try to solve the corresponding transfer problems. If memory is used deliberately and explicitly, cues should help. So in experiment 1, problem-oriented training was followed by a transfer phase with two conditions. Prior to each analogous transfer problem, half of the students--both dysphoric and nondysphoric--were given explicit hints to think of the appropriate training problems, whereas the other half were told merely to clear their minds and prepare for the next problem to solve. Much to our surprise, the hints actually disrupted the performance of the nondysphoric students, compared to the no-hint controls. After replicating this finding, we began to see it as an example of transfer-appropriate processing (Morris, Bransford, & Franks, 1977). Problem-oriented training encouraged an initial focus on the structure of the problem; this focus had more in common with the focus in the no-hint condition of the transfer phase than it did with the focus in the hint condition. Instructions in the hint condition focused attention on the past, and perhaps on details of the problems in place of the more abstract relations among their elements. Regardless of the reason, the fact that the hints actually hindered performance by nondysphoric problem-solvers argued against self-initiated reflection as the primary way transfer was achieved in the no-hint condition.
Knoedler and I (1996) concluded that spontaneous transfer in the no-hint condition was more likely achieved by the spontaneous use of memory than by a self-initiated and deliberate search; the method of solution came to mind as the analogous structure was conceived. On this view, we would not expect to find a depression-related impairment in the no-hint groups, and we did not. More surprising (again, and replicated) was the finding of a reliable advantage to feeling depressed when hints were provided. Figure 1 reproduces the mean percentage of problems solved by dysphoric and nondysphoric participants in both conditions of experiment 1. It is apparent that the dysphoric participants solved reliably more problems than the nondysphoric participants when everyone was given hints; Knoedler and I surmised that the dysphoric participants did not follow instructions to think back and thereby avoided the corresponding pitfalls. Now, cases in which controlled reflection disrupts performance are probably rare, particularly in the memory investigations that experimentalists tend to emphasize. We do like to imagine situations in which past experience is more beneficial if one "goes with the flow" in place of being more reflective, and in those cases a depressed mood could help us avoid the pitfalls of searching our memory in vain. Much more common is the situation in which memory operates spontaneously and sometimes without awareness. No one tells us to reflect back and no one thinks to do so.

[Figure 1: bar chart. Vertical axis: Mean % Solved. Horizontal axis: Instructions (No-Hint, Hint). Legend: Mood Group (Nondysphoric, Dysphoric).]

Fig. 1. The mean percentage of transfer problems solved by dysphoric and nondysphoric students who were either given hints to remember the training analogs or told to clear their minds in the no-hint condition. From a table in "Solving Problems by Analogy: The Benefits and Detriments of Hints and Depressed Moods," by P. T. Hertel and A. J. Knoedler, 1996, Memory & Cognition, 107, p. 19. Copyright © 1996 by the Psychonomics Society; adapted with permission of the authors.

B. WHEN CONTROL IS BESIDE THE POINT
Successful indirect tests of memory put nondepressed participants on the same footing with depressed participants, because neither group shows initiative in controlled uses of memory. Indirect tests are carefully designed to insure that participants do not deliberately think back to a prior experimental phase as they are spelling homophones, completing word fragments or word stems, or freely associating to cues. Indeed, no differences have been found on tests of homophone spelling (Hertel & Hardin, 1990), word
completion (Danion et al., 1991; Denny & Hunt, 1992; Watkins, Mathews, Williamson, & Fuller, 1992), or free association (Watkins, Vache, Verney, Muller, & Mathews, 1996). Yet the extent to which a particular word comes to mind to provide a spelling, complete a stem, or relate to a cue should also reflect the extent to which that particular word was attended initially. The results of one experiment that revealed differences associated with depression perhaps did so because the words were poorly attended in the first place (Hertel, 1994). In that experiment, performed with clinically depressed and nondepressed participants from the community, two types of rating tasks were used in phase 1. In one block the words were rated for their emotional value, and in the other block they were rated for the degree of curvature in their perceptual form. These rating blocks were followed by a test of perceptual identification, in which new words and previously rated words were presented very briefly and back-masked; the task was to read the words aloud. As anticipated, words previously rated for their emotional value were identified more readily than were new words, regardless of the mood group. However, the effect of prior exposure in the curvature task was reliably smaller in the depressed group than in the nondepressed group. Blocking according to the type of rating might have been the key to producing this depression-related impairment, because a block of curvature ratings could be completed without even noticing what the letters spelled. In other words, some of these words might have been read by depressed participants for the first time on the test. The focus of attention matters, if merely to establish a perceptual record of having read a word. The larger principle illustrated by this isolated finding is that procedures across episodes should be transfer appropriate if the past is to benefit the present.

C. WHEN CONTROL IS BENEFICIAL
Impaired performance on tests of intentional or deliberate memory typically occurs when attention is poorly controlled by external means. The extreme "proof" of this claim is the simple demonstration in which a neurologically healthy person is constrained to attend to nothing else except the material at hand, and then the material is swept away and an instant later the request for memory is posed. To the extent that variations in attention are introduced--perhaps in the form of a retention interval in which this person must attend to other things or perhaps through the introduction of other materials that vie for attention--performance on the memory test suffers. Performance always relies on the ability of the rememberer to direct attention to events that are no longer occurring--events in the past. And performance always benefits from the transfer-appropriate use of attention during
initial exposure, which is rarely well arranged, either naturally or experimentally. Therefore, variations in performance on tests of deliberate remembering are correlated with the extent to which the rememberer initiates beneficial procedures without being constrained to do so by the demands of the task. Initiative is important in an unplanned world.

1. On Tests of Free Recall
Stephanie Rude and I accidentally discovered that initiative is important to the understanding of memory in depressed states when we tried to replicate some results obtained by Ellis, Thomas, and Rodriguez (1984). Ellis et al. experimentally induced sad or neutral moods, and then presented the students with a semantic orienting task followed by a surprise test of free recall. The orienting task was to judge whether a target word fit meaningfully in a corresponding sentence frame. The frames established more or less difficult or distinctive contexts for the words to be recalled; distinctiveness benefited target recall by participants in a neutral mood, but not by those in a sad mood. Another way of looking at the results revealed that the students in a sad mood recalled fewer words from the more distinctive contexts than did the students in a neutral mood. In a prior study (Tyler, Hertel, McCallum, & Ellis, 1979), the more distinctive frames, compared to the less distinctive ones, had also produced longer latencies on a secondary task intended to measure cognitive effort, or the amount of attentional resources expended in judging whether the words fit into the frames. Therefore, Ellis et al. concluded that the students induced to feel sad had insufficient resources available to encode the target words in those frames. They also generalized these results to depressed people, and Rude thought the justification for this generalization was worth investigating. Judging the fit of words like "artist" in sentences such as "The young man's physique was admired by the ____" did not seem too effortful for even a depressed person to do. Our first study (Hertel & Rude, 1991b, experiment 1; Rude & Hertel, 1987) revealed quite the opposite pattern to what Ellis et al. found. Naturally dysphoric students actually recalled more words from both types of frames than did the students who were not dysphoric.
Although we speculated about why they performed better (why the finding might have been real instead of a type-I error), the more important point was that they did not perform worse than the nondysphoric participants. This result led us to consider the methodological differences between the two experiments, other than the nature of participants' moods. The main differences were that Rude and I had required participants to repeat the target word at the end of each trial and then report whether it fit the frame, whereas Ellis et
al. (1984) did not require repetition and had accepted the judgment of fit at any point during the trial. These differences were incorporated into the design of subsequent experiments, as was the variation in the type of mood (naturally occurring versus experimentally induced). We replicated the recall deficit found by Ellis et al. when we used their method and eliminated it when we used our own. Later, we extended our findings to clinically depressed outpatients, nondepressed outpatients, and outpatients recovered from depressive episodes (Hertel & Rude, 1991a). The mean percentages of words recalled from the more distinctive frames are shown in Fig. 2. The figure illustrates the point that depressed people have sufficient "resources" to attend in ways that benefit subsequent recall. What matters is how that attention is controlled. The focused condition of initial exposure required participants to keep each word in mind for the duration of the 8-s trial in order to repeat it; this procedure also might have encouraged additional attention to the contextual frame and a more distinctive record for retrieval. The unfocused condition permitted such episodic enrichment, but at the behest of self-sustained attention. It was possible to think about other matters or not think at all. In short, there was room for initiative in the focus of attention. Again, the focus of attention matters, this time to establish a richer basis for deliberate retrieval.

2. On Tests of Recognition
So far, it seems that tests that require deliberate retrieval are most sensitive to prior variations in cognitive initiative and corresponding depression-related impairments, whereas tests that do not--indirect or implicit tests--are least sensitive. One might imagine that recognition tests fall somewhere in the middle, given that they typically invoke a mixture of controlled and automatic retrieval processes (see Jacoby, 1991). Controlled judgments of prior occurrence can be based on a deliberate consideration of the prior context of the test items. Automatic influences can also guide recognition judgments; items are perceived or conceived more fluently the second time around (on the test), and that fluency is experienced as familiarity (see Jacoby, Kelly, & Dywan, 1989). Most likely, both controlled and automatic uses of the past are invoked on most recognition tests. However, probably because recognition tests can be performed by relying on more automatic processes, depression-related impairments are rarely observed. Therefore, to reveal possible impairments in the controlled component alone, Stephanie Milan and I used Jacoby's (1991) process-dissociation procedure for recognition (Hertel & Milan, 1994). We presented essentially unrelated pairs of words during phase 1 and asked dysphoric and nondysphoric students to judge their relatedness. In
Depression and Memory
[Figure 2. Bar chart: mean % recalled (y-axis) by phase-1 condition (unfocused, focused; x-axis) for the nondepressed, recovered, and depressed mood groups.]
Fig. 2. The mean percentage of words recalled from the more distinctive sentence frames presented in phase 1. Participants were clinically depressed, recovered from episodes of depression, or nondepressed controls. Their attention to the experimental materials in phase 1 was either unfocused or focused by the demands of the task. From a table in "Depressive Deficits in Memory: Focusing Attention Improves Subsequent Recall," by P. T. Hertel and S. S. Rude, 1991, Journal of Experimental Psychology: General, 120, p. 304. Copyright © 1991 by the American Psychological Association and adapted with permission of the authors.
phase 2, they listened to single words on audio tape and tried to remember them for a later test. Then the recognition test was performed on the first members of the pairs in phase 1, the single words from phase 2, and words not previously presented. On half the trials (inclusion) the participants were instructed to call the words old regardless of whether they thought they had occurred in phase 1 or 2; they could make this judgment either on the basis of controlled recollection or on familiarity in the absence of control. On the other half (exclusion), they were instructed to call only
words from phase 2 old; phase-1 words should be excluded. By assuming that the two bases of recognition judgments were independent,² we computed estimates of each component and used those estimates as the dependent variables in separate analyses. The estimates of the controlled, recollective component of recognition memory were reliably lower in the dysphoric group than in the nondysphoric group, but the estimates of the automatic component of familiarity were similar. Unlike the situation with free recall, in which participants have little recourse when they cannot reflectively attend (other than to guess, of course), old/new recognition decisions can be made without going to a lot of trouble to think back. By using a process-dissociation procedure, however, Milan and I were able to show that the flexibility in how recognition is performed allowed room for initiative in cognitive control. (Exclusion instructions can be taken with a grain of salt if one finds it difficult to conjure up prior contexts.)
We also thought that if we could constrain attention better in this testing situation, we might be able to close the dysphoria-related gap in the estimates of control. Therefore, in another condition of the test, pairs of words were presented. These pairs were presented intact from phase 1, or they were phase-2 words paired with new words, or they were pairs of entirely new words. We instructed participants in the paired condition that the second member of the pair could help them make the recognition decision. In effect, we were giving them a basis for source monitoring on exclusion trials, because if the second member of the pair also seemed old they could be more certain that the first member of the pair came from phase 1 and should be excluded. The contextual support helped, of course.
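The component estimates used in this procedure follow directly from the independence assumption. As a minimal sketch (not the analysis code used in the study), suppose a participant endorses phase-1 words with proportion `p_inclusion` on inclusion trials and `p_exclusion` on exclusion trials. Under Jacoby's (1991) equations, inclusion = C + (1 - C)A and exclusion = (1 - C)A, so C is the difference between the two proportions and A is the exclusion proportion divided by (1 - C). The function name, argument names, and example proportions below are illustrative assumptions, not values from the experiment.

```python
import math
from typing import Tuple


def process_dissociation(p_inclusion: float, p_exclusion: float) -> Tuple[float, float]:
    """Estimate the controlled (C) and automatic (A) components for one participant.

    Assumes the two bases of responding are independent:
        P(old | inclusion) = C + (1 - C) * A
        P(old | exclusion) = (1 - C) * A
    """
    for p in (p_inclusion, p_exclusion):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie between 0 and 1")
    # Subtracting the exclusion equation from the inclusion equation leaves C.
    c = p_inclusion - p_exclusion
    # A is undefined when C equals 1 (nothing is left to attribute to familiarity).
    a = math.nan if math.isclose(c, 1.0) else p_exclusion / (1.0 - c)
    return c, a


# Hypothetical proportions for a single participant (not data from the study).
c_hat, a_hat = process_dissociation(p_inclusion=0.70, p_exclusion=0.30)
print(f"C = {c_hat:.2f}, A = {a_hat:.2f}")
```

Computing the estimates separately for each participant, as in this sketch, is what allows C and A to serve as dependent variables in separate analyses. Note that C can come out negative for a participant who endorses more phase-1 words under exclusion than under inclusion; the sketch returns such values unchanged rather than deciding how to treat them.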
Compared to the single-item test, the paired test raised estimates of the automatic component for everyone; greater fluency from the replicated partners made the target words feel more familiar. The paired test also provided everyone with a better basis of controlled reflection, dysphoric and nondysphoric participants alike. Thus, we failed to even partly alleviate the dysphoric participants' deficiency in controlled recollection. One possible reason was that the source-monitoring strategy was not guided on a trial-by-trial basis. Another possible reason was that 6 s were allotted for each relatedness judgment in phase 1--enough time to produce variation in attention during initial exposure to the pairs and consequently variable bases for controlled reflection.
In a much earlier series of recognition experiments that relied on source monitoring in a different way, Tammy Hardin and I were more successful in closing the dysphoria gap (Hertel & Hardin, 1990). We obtained stochastic independence between performance on an indirect test of homophone spelling and performance on a subsequent recognition test, but only when participants were dysphoric. The nondysphoric students' responses on the two tests were correlated. They seemed to use the strategy of checking memory for how they had spelled the word on the indirect test and then asked themselves if that word (e.g., "pear" instead of "pair") had been presented in the first phase. (The first phase posed questions such as, "What color is a pear?") We led all participants through the steps of that strategy on each trial in a subsequent experiment. The dysphoric participants showed stochastic dependence, just as the others had done without guidance, and the previously obtained deficit in (d-prime) recognition scores was now not found. These early experiments with Hardin provided a good set of examples of how low initiative on the part of dysphoric or depressed people could be compensated for by successful experimental control of attention. In an ongoing series of recognition experiments, Colleen Parks and I have been trying a different tack--one that seeks to take advantage of the tendency to attend to mood-congruent events.
² Jacoby's equation to represent the probability of (correctly) endorsing a phase-1 word on inclusion trials captures the assumption of independent controlled and automatic processes: $P(\text{old} \mid \text{inclusion}) = C + (1 - C)A$, where $C$ equals the probability of controlled recollection and $A$ equals the probability of automatic influences. The equation to represent the probability of (erroneously) endorsing a phase-1 word on exclusion trials is $P(\text{old} \mid \text{exclusion}) = (1 - C)A$. By subtracting the second equation from the first--in practice, by subtracting the proportion of erroneously endorsed phase-1 words during exclusion from the proportion of correctly endorsed phase-1 words during inclusion--estimates of controlled recollection are obtained for each participant.
3. On Tests of Recognizing Emotional Material
For people who are in depressed or dysphoric states, mood-congruent memory is a fairly robust phenomenon (for reviews, see Gotlib, Roberts, & Gilboa, 1996; Williams et al., 1997). Depressed participants produce superior recall for negative trait adjectives, for example, compared to their own recall of positive trait adjectives and sometimes compared to nondepressed participants' recall of negative adjectives. Clearly in these paradigms, depression is not associated with problems in initiating or sustaining attention to mood-congruent materials. Parks and I are attempting to use this attentional tendency as a focusing device by providing emotional contexts for the neutral materials to be remembered later. If this method is successful, in the long run it will offer the extra advantage of solving a problem inherent in mood-congruent research designs.
One of the methodological sticky points encountered by mood-congruent experiments has been the "materials" problem. These experiments typically are built around emotionally valenced nouns or adjectives. However, no matter how carefully one tries to balance positive and negative word lists
according to characteristics like concreteness, meaningfulness, and frequency, other differences between the lists cannot be ruled out. Researchers have often suspected that negatively toned words are better interrelated, for example. To address this problem, we have been trying to make neutral words emotional by manipulating how they are experienced during initial exposure. Nouns were selected for their emotional neutrality and then paired with adjectives such that each noun could be presented as an emotionally positive or negative pair (e.g., "flawless skin" or "slashed skin"; "warm cottage" or "gloomy cottage"). Lists were balanced on a number of dimensions, including pilot ratings for the emotional values of all pairs, and counterbalanced with the levels of the within-subjects factors in the design. Equal numbers of participants in each mood group experienced "skin" (for example) in a positive sense, in a negative sense, as a single word in a separate study phase, and as a new word on the recognition test.
In one of these experiments, we told the students that they were participating in a memory experiment, but before it was to begin, we needed some ratings for materials to be used in future experiments. That was our cover story for phase 1, and it provided the rationale for exclusion instructions on the test. During phase 1, 30 word pairs (in blocks of 5 positive or 5 negative) were presented for 6 s each, and the participants were instructed to generate an image of themselves interacting with the event the pair described. The pair's presentation was followed by a rating scale, which they used to rate the emotional value of the image. The rating was self-paced. Thus, we believed that we gave participants ample opportunity to devote as much or as little attention to the pairs as they chose to do. The rating task was followed by the so-called memory experiment. In phase 2, 30 single nouns were presented for 1.5 s each.
(Fifteen of those were "critical" nouns, from a list that was rotated through all conditions as part of the counterbalancing procedures.) Instructions for the "yes/no" recognition test alerted subjects about the need to exclude words from the rating task and endorse as recognized only those words they studied in the preceding phase. The test consisted of 90 trials: 15 nouns from positive pairs in phase 1, 15 from negative pairs in phase 1, 15 critical phase-2 nouns, 15 critical new nouns, 15 phase-2 fillers, and 15 new fillers. (Fillers were similar to the other nouns but were not rotated through all the conditions.) On each test trial, a single noun was presented for 2000 ms, and during the last 750 ms it was accompanied by a row of asterisks underneath. The participants understood that they should not press the Y or N key until the asterisks appeared, and they were given 15 practice trials to become accustomed to this procedure. After the main test, the participants filled out a Beck Depression Inventory (BDI). They had been preselected according to scores on a classroom administration of that inventory, and only the
data from those participants whose scores remained in the same categories were analyzed.³ Parks and I predicted that 2 s would be ample time to recruit the prior context and exclude the nouns from well-attended trials during phase 1. If the dysphoric students had sustained attention and constructed distinctive images on the negative trials in particular, those nouns should have been more successfully excluded than nouns from the positive trials and as successfully excluded as were negative nouns by nondysphoric students. The mean percentages of yes responses are presented in Fig. 3.
We first performed an analysis of variance (ANOVA) on the number of yes responses from the so-called memory experiment. Mood group (dysphoric versus nondysphoric) constituted the between-subjects factor and item type (phase-2 versus new critical nouns) the within-subjects factor. The main effect of item type was obviously reliable, F(1, 30) = 213.3, MSE = 4.80, p < .001. Neither the interaction with mood nor the main effect of mood was reliable (Fs < 1.0). Next we evaluated differences in the number of exclusion errors (the number of yes responses) according to mood group and valence of the phase-1 material. Dysphoric participants made more errors overall, F(1, 30) = 8.58, MSE = 7.46, p < .01. This difference seemed greater for positive trials than for negative ones, although the interaction with valence was not reliable, F(1, 30) = 2.38, MSE = 3.18, p > .10. Yet, the difference between mood groups in excluding nouns from negative trials was not reliable at the .05 level of significance, particularly when the baseline difference was used as a covariate. In short, we have been somewhat successful in closing the dysphoria-related gap in memory for neutral materials by capitalizing on mood-congruent interests. This line of research continues.
One other aspect of these results deserves mention: The absolute value of the difference in exclusion errors made to nouns from positive versus negative contexts was 1.1 on average in the nondysphoric condition, but 3.0 in the dysphoric group, t(30) = 4.20, SE = 0.45, p < .001. The manipulation of valence clearly had a larger effect for the dysphoric students, if not always the same sort of effect; 10 participants made fewer exclusion errors on negative nouns, but 6 participants made fewer errors on positive nouns. Mood-incongruent memory would be produced if those students attempted mood repair by attending more carefully on the positive trials (see Gotlib et al., 1996).
³ Participants were selected initially if they scored below 6 or above 9 on the BDI. The data from 7 participants were set aside and replaced, because the end-of-session scores did not fall in the same category. The data from 8 additional participants were replaced due to a variety of running errors (insufficient fluency in English, misunderstood instructions, interruptions by maintenance workers). Finally, the data from 16 participants in each mood group were analyzed; they were equally distributed across the four counterbalancing conditions.
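The counterbalancing and test composition described for this experiment can be sketched as a simple rotation. The sketch below assumes four lists of 15 critical nouns rotated through the four item roles (noun from a positive pair, noun from a negative pair, single phase-2 noun, new noun on the test), with the 16 participants in each mood group spread evenly over the four rotations. The list labels, role names, and function names are hypothetical; the text does not specify the assignment procedure beyond these counts.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical labels for the four roles a critical noun can play.
ROLES: List[str] = ["positive_pair", "negative_pair", "phase2_single", "new_on_test"]
# Hypothetical labels for four lists of critical nouns.
LISTS: List[str] = ["list_A", "list_B", "list_C", "list_D"]


def rotation(condition: int) -> Dict[str, str]:
    """Assign each critical-noun list to a role for one counterbalancing condition."""
    n = len(ROLES)
    return {LISTS[i]: ROLES[(i + condition) % n] for i in range(n)}


def recognition_test_composition(critical_per_role: int = 15,
                                 fillers_per_type: int = 15) -> Counter:
    """Count recognition-test trials by item type."""
    counts = Counter({role: critical_per_role for role in ROLES})
    counts["phase2_filler"] = fillers_per_type
    counts["new_filler"] = fillers_per_type
    return counts


def participants_per_condition(n_in_group: int = 16) -> Counter:
    """Spread one mood group's participants evenly across the four rotations."""
    return Counter(p % len(ROLES) for p in range(n_in_group))


for cond in range(len(ROLES)):
    print(cond, rotation(cond))
print("test trials:", sum(recognition_test_composition().values()))
print("participants per condition:", dict(participants_per_condition()))
```

A rotation of this kind ensures that every list serves in every role equally often across participants, so differences between positive and negative contexts cannot be attributed to the particular nouns involved.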
[Figure 3. Bar chart: mean % "yes" responses (y-axis) by item type (positive, negative, phase 2, new; x-axis) for the nondysphoric and dysphoric mood groups.]
Fig. 3. The mean percentage of words endorsed as studied by dysphoric and nondysphoric students. The words were made positive or negative by a phase-1 task, studied in phase 2, or newly presented on the test. "Yes" responses were appropriate only for phase-2 words (Hertel and Parks, in progress).
4. On Tests of Prospective Memory
With sufficient initiative or assistance from others, depressed people seek out psychotherapists to help with mood-repair efforts. Among the complaints presented to psychotherapists are difficulties with memory. What do people mean when they say they have trouble with memory? Certainly they do not mean that memory is failing them in spontaneous or automatic ways. If we are not aware that memory is operating--on those ubiquitous indirect tests of everyday life--we do not think to complain about trouble with memory. Moreover, as discussed in an earlier section, depressed people's memories are probably not failing them in this respect. Sometimes when people complain about their memory they mean that they forget names. (Memory researchers are often asked for hints about
how to remember names, probably because forgetting names can be embarrassing.) More likely, however, people mean that they forget to do things, because forgetting to do things can have important consequences. The field of prospective memory is the study of memory for carrying out intentions in the future, and it is an obvious domain for investigations of depression-related deficits in self-initiated uses of memory. Einstein and McDaniel (e.g., 1990) conducted a number of experiments on prospective memory and aging. A useful distinction established by their work is the one between time-based and event-based prospective tasks. Event-based tasks essentially provide cues for carrying out the intention (the pill container placed by the coffee pot), whereas in time-based tasks the passage of time is the signal for the act (of taking a pill every 4 hours, for example). In particular, Einstein and McDaniel's older participants have shown impaired prospective memory on time-based tasks, which require more self-initiation than do event-based tasks.
Depressed people have a lot in common cognitively with older people, who are also impaired in controlled uses of memory for past events (e.g., Jennings & Jacoby, 1993). Stephanie Rude and I anticipated that the similarity would be found in this prospective domain as well (see Rude, Hertel, Jarrold, Covich, & Hedlund, 1999). We recruited clinically depressed and nondepressed volunteers from the community and used procedures similar to those used by Einstein and McDaniel in their time-based condition. The participants were instructed to press the F1 key on the computer keyboard every 5 min while they were answering general-knowledge questions (a 30-min task), and they could access a digital clock by pressing another key. We found a depression-related impairment in the number of prospective responses and also in the number of times the clock was checked. As seen in Fig.
4, time monitoring increased in frequency toward the end of the 5-min interval to a greater extent for the nondepressed participants than for the depressed. (The dropoff in the fifth segment of the interval reflects the fact that the prospective response itself could be made at the end of that interval.) These results are quite compatible with an initiative account of depression-related or age-related deficits, as well as with other accounts that stress deficient control. The next step is to try to remediate this impairment by introducing a focusing manipulation. In the meantime, let's consider why such a manipulation might work by examining the possible reasons for poor initiation in depressed states.
[Figure 4. Line graph: mean number of time checks (y-axis) by minute-segment of interval (1 to 5; x-axis) for the nondepressed and depressed mood groups.]
Fig. 4. The mean number of times the clock was checked in each minute prior to the time for the prospective response. Means were computed across participants in each mood condition and across the six opportunities for the prospective response. From a table in "Depression related impairments in prospective memory," by S. S. Rude, P. T. Hertel, W. Jarrold, J. Covich, & S. Hedlund, 1999, Cognition & Emotion, 13, p. 273. Copyright © 1999 by Psychology Press Ltd. Reprinted by permission of Psychology Press Limited, Hove, UK.
III. The Role of Motivation in Memory Impairments
The cognitive-initiative account of memory impairments is a general cognitive framework. By that I mean that it should apply to any situation or state that would occasion poor self-initiated processing, not just depressed mood. In fact, the notion of reduced initiative has been used by Craik (1986) to address age-related memory impairments. A separate issue is the question of why the account should apply to depressed people in the first place. As is suggested by the everyday meaning of initiative, perhaps these difficulties are motivational in nature (see Abramson, Metalsky, & Alloy, 1989). Indeed, researchers have sometimes referred to the initiative framework as a motivational account (e.g., Hartlage, Alloy, Vazquez, & Dykman, 1993). Consider this description by Ellis, Ottaway, Varner, Becker, and Moore (1997, p. 132): "Individuals in a disruptive mood state may simply be less motivated or energized to perform well in demanding tasks, that is, they may lack sufficient initiative to perform the task adequately (e.g.,
Hertel & Rude)." Plainly, the word initiative has been taken to mean either incentive or arousal, although we have never described memory impairments in those ways. However, the cognitive-initiative account can be conceptualized as motivational in two other ways. As is sometimes suspected with aging, the difficulty in self-initiation is potentially related to reduced activation in the frontal regions of the cortex (e.g., Henriques & Davidson, 1991), the areas understood as responsible for planning, monitoring, initiating, and sustaining attention. The extent of this neuropsychological basis for reduced initiative should vary with the severity of the depression (Johnson & Magaro, 1987). Furthermore, on neuropsychological grounds, reduced initiative in the control of attention might underlie depression-related difficulties in other goal-directed behaviors, in addition to those related to remembering. The commonalities among the different meanings of the terms control and initiative, mentioned at the beginning of the chapter, can probably be attributed to their frontal roots. In that fundamental sense, the cognitive-initiative account does seem to be a motivational account of memory difficulties. Another motivational aspect of the initiative framework is quite different. Consider Simon's (1994) description of the relations among attention, memory, and emotion: "Items in memory with which emotion is associated are, ceteris paribus, more easily aroused than other items and hence more capable of directing attention or causing interruption of attention. They operate much like motives but are associated with perhaps less specific goals than motives usually are" (p. 19). In this sense, most approaches to depression and memory (e.g., Ellis & Ashbrook, 1988; Williams et al., 1997) are motivational. They generally assume that depressed people are motivated to attend to personal concerns--"to items in memory with which emotion is associated." 
Task materials that are related to those concerns are attended and remembered, thereby producing mood-congruent memory, and those that are unrelated suffer from neglect while personal concerns preempt attention. Like these other views, the initiative account is a motivational account in that it acknowledges that personal concerns and interests can divert attention that would otherwise be focused on the task at hand. Is there reason to believe that such diversions underlie impaired memory for neutral events? I conducted an experiment that addresses this question (Hertel, 1998). This experiment had three phases: The first and last were close replications of the materials and procedures used by Jacoby (1996). Word pairs (e.g., "building stone") were presented at a 2-s rate, and the participants read them aloud in anticipation of a later memory test. The form of the test was fragment completion on the second member of the pairs from phase 1 (e.g., "building to e"), conducted according to process-dissociation
procedures. On half of the trials, inclusion instructions were given. The participants were told that they should complete the fragment with a word from phase 1 that was related to the context word ("building") or, if they could not remember such a word, they should complete the fragment with the first word that comes to mind that is related to the context word. On exclusion trials they were told to try to remember a word from phase 1 that completes the fragment but not to use that word; instead they should use another word that fits the fragment and is related to the context word (e.g., "store"). By assuming that controlled and automatic uses of memory operate independently in this paradigm, we calculated estimates of each component. What we were most interested in were the estimates of the controlled component. Recall that we had previously shown dysphoria-related impairments in controlled reflection within a recognition paradigm (Hertel & Milan, 1994). I hoped to replicate this effect and, moreover, gain some insight about why it might occur. To do that, I used three versions of a second phase of the experiment, interspersed between the study and test phases.
In the unconstrained version of phase 2, the experimenter fiddled with the computer and shuffled papers for 7 min, while the participants sat quietly and did nothing. At least they did nothing that the experimenters could observe; certainly, they were entitled to their thoughts. The point of using such a long interval was to invite the dysphoric participants to entertain the kind of thoughts that are often blamed for poor performance in laboratory memory tasks: automatic negative thoughts that Beck and others documented in the clinical literature (see Beck, Rush, Shaw, & Emery, 1979; Williams et al., 1997). Clearly, prior practice in entertaining these concerns during "free times" causes them to come to mind automatically during future unconstrained periods.
Practice in attending makes the attended thought automatic (see Logan & Etherton, 1994). Once personal concerns come to mind, it should be difficult to dismiss them and control one's attention to the past during the memory test. The thoughts are simply more compelling or attention demanding than are the mundane events of the experiment. A difficulty for the researcher, however, is to determine if indeed these kinds of thoughts occur. If you ask the participants during the interval, you encourage the thoughts, and if you wait until later, you rely on memory and establish demand characteristics associated with the timing of the request (see Hertel, 1997; Parrott & Hertel, 1999). For these reasons, I chose to use two other conditions of the experiment as alternative models for what might have happened in the unconstrained condition. Only three kinds of cognitive activities were possible during the 7 min. The participants could have entertained no thoughts at all, self-focused thoughts, or other-focused thoughts. (Admittedly, various combinations were also possible.) I eschewed the condition of no thoughts at all, because it
seemed like an impossible outcome to pull off experimentally. To encourage self-focused or neutral (other-focused) thoughts, I slightly modified phrases borrowed from Nolen-Hoeksema and Morrow (1993). Their neutral phrases refer to geographic locations and objects (e.g., "the shape of the continent of Africa"). Although their self-focused phrases (e.g., "my character and who I strive to be") are not inherently negative, they have been shown to encourage ruminative thoughts in depressed and dysphoric participants. In each of the separate phase-2 conditions--self-focused and neutral--dysphoric and nondysphoric participants were instructed to read each phrase, form an idea of the meaning of the phrase, and then rate the clarity of that idea. They performed this rating task for 7 min. If the self-focused phrases encouraged rumination in the dysphoric group, I expected to see a corresponding impairment on estimates of the controlled component of memory on the subsequent test. The neutral condition was an important control for the nature of the thoughts. Frontal hypoactivation or some other cause of generalized distractibility might make it difficult to focus attention on the past. After all, it was possible to perform the test by relying on more automatic influences of memory.
Figure 5 shows the mean estimates of control in this experiment. The impairment in the unconstrained condition was reliable. It was mimicked by the pattern obtained from participants in the self-focused condition, and the difference was eliminated in the condition in which the participants thought about other matters. A very similar pattern also obtained when these data were reanalyzed by using the extended measurement model of Buchner, Erdfelder, and Vaterrodt-Plünnecke (1995; see Hertel & Meiser, in press). For dysphoric students, then, the self-focused condition provided a good model for the deficit obtained in the unconstrained condition.
Of course, we cannot be sure of this interpretation, given the fact that the same pattern could be produced by different processes. However, we do know that a simple mood-change account probably is not sufficient. Mood ratings at the end of the test were no more extreme in the self-focused condition than in the neutral condition (nor were the scores on the BDI). It is also important to realize that more fundamental motivational difficulties might characterize the impairments of clinically depressed patients--perhaps the sort that are associated with frontal hypoactivation. These issues merit further research, perhaps more creatively conceived, aimed at the understanding of the processes responsible for poor initiation and control in depressed and dysphoric moods.
IV. Comparisons and Conclusions
In addition to a shared motivational flavor, the cognitive-initiative approach is compatible in other respects with earlier conceptualizations of
[Figure 5. Bar chart: mean estimated C (y-axis) by interval task (unconstrained, self-focused, neutral; x-axis) for the nondysphoric and dysphoric mood groups.]
Fig. 5. The mean estimate of the controlled component of memory in a fragment-completion test, following an unconstrained interval, a self-focused rating task, or a neutral rating task. Adapted from "The Relationship between Rumination and Impaired Memory in Dysphoric Moods," by P. T. Hertel, 1998, Journal of Abnormal Psychology, 107, p. 170. Copyright © 1998 by the American Psychological Association.
depression-related impairments (e.g., Ellis & Ashbrook, 1988; Hasher & Zacks, 1979; Weingartner, Cohen, Murphy, Martello, & Gerdt, 1981; Williams et al., 1988). In the main, these earlier approaches employed spatial metaphors and put forward the notion that competing thoughts occupy capacity or tie up resources that the depressed or dysphoric person might otherwise use to perform a cognitively demanding task. In contrast, my collaborators and I employed procedural metaphors and argued that capacity metaphors are not sufficient to explain the entire pattern of depression-related performance. Essentially, there are three types of phenomena that are difficult to explain by using a capacity-based metaphor alone. First, in the experiment just described, the elimination of the opportunity to ruminate also eliminated the impairment in controlled use of memory (Hertel, 1998); capacity
accounts rarely permit this degree of flexibility. Second, a deficit has been found on the simple task of perceptual identification (Hertel, 1994), and the condition of initial exposure that produced the deficit was the less resource-demanding task: the task of rating curvature, not the task of rating emotionality. Third is evidence that external control can sometimes compensate for deficiencies in less-structured situations, even when the task is resource-intensive (Hertel & Hardin, 1990; Hertel & Rude, 1991a,b). Thus, the contribution of the initiative framework has been to emphasize the importance of attentional control during unstructured conditions. It is important to know whether such control is disruptive, irrelevant, or beneficial to performance on the memory task. When it is beneficial, external means can be used to focus attention appropriately.
Whether guidance remediates depression-related impairments, however, depends on a careful analysis of the cognitive procedures that are instrumental to successful performance. Consider a series of experiments by Ellis et al. (1997) on the effects of "depressed" mood inductions on the detection of contradictions in text. Ellis et al. found that students who had received the "depressing" version of a mood-induction procedure detected fewer contradictions than did those in the neutral condition, even after the students had been alerted to the possible presence of contradictions. The authors saw the warning about possible contradictions as a focusing manipulation and interpreted the mood effect for participants so warned as evidence counter to an initiative account. Although the search for boundary conditions on the account is important, the justification for Ellis et al.'s interpretation is questionable. First, the experimental induction of depressed mood does not always serve as a reasonable model for naturally occurring depressed or dysphoric mood states (see Hertel & Rude, 1991b; Parrott & Hertel, 1999).
Second, the instructions to search for contradictions were given right after the "depressed" participants (and not those in the neutral condition) had been encouraged to entertain any thoughts that came to mind as part of their induction procedure. Differential carry-over effects of distraction possibly interfered with the control intended by the warning. The third point is the more general point about the adequacy of experimental control. Consider that evidence for remediation of dysphoria- or depression-related impairments has been based on manipulations that guided participants' processing quite specifically. For example, Hertel and Hardin (1990) inferred that nondysphoric participants' stochastic dependence of homophone recognition on prior spelling reflected the use of an attentional strategy at the time of the test; we therefore guided the dysphoric participants' use of that strategy, trial-by-trial, and remediated the recognition deficit. Hertel and Rude (1991a,b) surmised that the constraints on attention
during initial exposure to target materials were lax; we therefore tightened the demands of the orienting task by requiring sustained attention to the targets and thereby remediated the deficit in subsequent recall. In contrast, the pretask instructions given by Ellis et al. (1997) might not have served the purpose of guiding participants to engage in beneficial procedures during the task itself. In short, "depression-induced" participants were encouraged to let their minds wander, and attention was not focused by specific response requirements as the task proceeded. The cognitive-initiative account (or any other account that stresses the attention-directing aspects of task requirements) cannot be ruled out on the basis of the failure to focus attention sufficiently. Understanding the specific procedures that contribute to successful task performance is, of course, a very general goal in cognitive psychology. The point of studying effects of mood or relationships to clinical syndromes is not only to achieve an understanding of the state or syndrome, but also to determine whether our theoretical frameworks can accommodate emotion-related phenomena. There is little reason to believe that these goals should be easily achieved, given the difficulty of understanding the level of procedural specificity that underlies task performance (see Kolers & Roediger, 1984). Thus far, my colleagues and I have shown that deficiencies in deliberate memory associated with depressed and dysphoric states can be understood in terms of attentional control. We know that personal concerns can occupy attention and interfere with the use of procedures that would otherwise be self-initiated. Truly successful remediation of impairments, through guiding the use of beneficial procedures, has been achieved in only two experimental paradigms: one that focused attention sufficiently during initial exposure and one that directed the use of recognition strategies.
Clear failures to gain experimental control have also been demonstrated. For example, Hertel and Milan (1994) demonstrated that dysphoria-related impairments in the controlled component of recognition memory could not be remediated simply by reinstating contextual cues at the time of testing. However, the discovery that a particular attention-focusing procedure is inadequate should not discourage the search for better means. In general, investigators of depression-related impairments must move beyond the mere assertion that attention is diverted by personal concerns to an understanding of the specific procedures involved in producing and remediating the impairments. A proper understanding of a phenomenon can be shown by its experimental reduction or elimination. To do that in the case of cognitive impairments in depression, we need to know the nature of the "something" that someone has done without being told.
REFERENCES
Abramson, L. Y., Metalsky, G. I., & Alloy, L. B. (1989). Hopelessness depression: A theory-based subtype of depression. Psychological Review, 96, 358-372.
Beck, A. T., Rush, A. J., Shaw, B. F., & Emery, G. (1979). Cognitive therapy of depression. New York: Guilford.
Beck, A. T., Ward, C., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 561-571.
Buchner, A., Erdfelder, E., & Vaterrodt-Plünnecke, B. (1995). Toward unbiased measurement of conscious and unconscious memory processes within the process dissociation paradigm. Journal of Experimental Psychology: General, 124, 137-160.
Craik, F. I. M. (1986). A functional account of age differences in memory. In F. Klix & H. Hagendorf (Eds.), Human memory and cognitive capabilities: Mechanisms and performances (pp. 409-422). Amsterdam: Elsevier.
Danion, J.-M., Willard-Schroeder, D., Zimmermann, M.-A., Grange, D., Schlienger, J.-L., & Singer, L. (1991). Explicit memory and repetition priming in depression. Archives of General Psychiatry, 48, 707-711.
Denny, E. B., & Hunt, R. R. (1992). Affective valence and memory in depression: Dissociation of recall and fragment completion. Journal of Abnormal Psychology, 101, 575-580.
Einstein, G. O., & McDaniel, M. A. (1990). Normal aging and prospective memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 717-726.
Ellis, H. C., & Ashbrook, P. W. (1988). Resource allocation model of the effects of depressed mood states on memory. In K. Fiedler & J. Forgas (Eds.), Affect, cognition and social behavior (pp. 25-43). Toronto: Hogrefe.
Ellis, H. C., Thomas, R. L., & Rodriguez, I. A. (1984). Emotional mood states and memory: Elaborative encoding, semantic processing, and cognitive effort. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 470-482.
Ellis, H. C., Ottoway, S. A., Varner, L. J., Becker, A. S., & Moore, B. A. (1997). Emotion, motivation, and text comprehension: The detection of contradictions in passages. Journal of Experimental Psychology: General, 126, 131-146.
Gotlib, I. H., Roberts, J. E., & Gilboa, E. (1996). Cognitive interference in depression. In I. G. Sarason, G. R. Pierce, & B. R. Sarason (Eds.), Cognitive interference: Theories, methods, and findings (pp. 347-377). Mahwah, NJ: Erlbaum.
Hartlage, S., Alloy, L. B., Vazquez, C., & Dykman, B. (1993). Automatic and effortful processing in depression. Psychological Bulletin, 113, 247-278.
Hasher, L., & Zacks, R. T. (1979). Automatic and effortful processes in memory. Journal of Experimental Psychology: General, 108, 356-388.
Henriques, J. B., & Davidson, R. J. (1991). Left frontal hypoactivation in depression. Journal of Abnormal Psychology, 100, 535-545.
Hertel, P. T. (1994). Depressive deficits in word identification and recall. Cognition and Emotion, 8, 313-327.
Hertel, P. T. (1997). On the contribution of deficient cognitive control to memory impairment in depression. Cognition and Emotion, 11, 569-583.
Hertel, P. T. (1998). The relationship between rumination and impaired memory in dysphoric moods. Journal of Abnormal Psychology, 107, 166-172.
Hertel, P. T., & Hardin, T. S. (1990). Remembering with and without awareness in a depressed mood: Evidence of deficits in initiative. Journal of Experimental Psychology: General, 119, 45-59.
Hertel, P. T., & Knoedler, A. J. (1996). Solving problems by analogy: The benefits and detriments of hints and depressed moods. Memory & Cognition, 24, 16-25.
Hertel, P., & Meiser, T. (in press). Capacity and procedural accounts of impaired memory in depression. In U. von Hecker, S. Dutke, & G. Sedek (Eds.), Generative thought and psychological adaptation: New perspectives on cognitive resources and control functions. Dordrecht, The Netherlands: Kluwer Press.
Hertel, P. T., & Milan, S. (1994). Depressive deficits in recognition: Dissociation of recollection and familiarity. Journal of Abnormal Psychology, 103, 736-742.
Hertel, P. T., & Rude, S. S. (1991a). Depressive deficits in memory: Focusing attention improves subsequent recall. Journal of Experimental Psychology: General, 120, 301-309.
Hertel, P. T., & Rude, S. S. (1991b). Recalling in a state of natural or induced depression. Cognitive Therapy and Research, 15, 103-127.
Jacoby, L. L. (1991). A process dissociation framework: Separating automatic from intentional uses of memory. Journal of Memory and Language, 30, 513-541.
Jacoby, L. L. (1996). Dissociating automatic and consciously controlled effects of study/test compatibility. Journal of Memory and Language, 35, 32-52.
Jacoby, L. L., Kelley, C. M., & Dywan, J. (1989). Memory attributions. In H. L. Roediger III & F. I. M. Craik (Eds.), Varieties of memory and consciousness (pp. 391-422). Hillsdale, NJ: Erlbaum.
Jennings, J. M., & Jacoby, L. L. (1993). Automatic versus intentional uses of memory: Aging, attention and control. Psychology and Aging, 8, 283-293.
Johnson, M. H., & Magaro, P. A. (1987). Effects of mood and severity on memory processes in depression and mania. Psychological Bulletin, 101, 28-40.
Kolers, P. A., & Roediger, H. L. III (1984). Procedures of mind. Journal of Verbal Learning and Verbal Behavior, 23, 425-449.
Logan, G. D., & Etherton, J. L. (1994). What is learned during automatization? The role of attention in constructing an instance. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1022-1050.
Morris, C. D., Bransford, J. P., & Franks, J. J. (1977). Levels of processing versus transfer-appropriate processing. Journal of Verbal Learning and Verbal Behavior, 16, 519-533.
Needham, D. R., & Begg, I. M. (1991). Problem-oriented training promotes spontaneous analogical transfer: Memory-oriented training promotes memory for training. Memory & Cognition, 19, 543-557.
Nolen-Hoeksema, S., & Morrow, S. (1993). The effects of rumination and distraction on naturally occurring depressed moods. Cognition and Emotion, 7, 561-570.
Parrott, W. G., & Hertel, P. T. (1999). Research methods in cognition and emotion. In T. Dalgleish & M. Power (Eds.), The handbook of cognition and emotion (pp. 61-81). Chichester: Wiley.
Rude, S. S., & Hertel, P. T. (1987, November). Remembering as a consequence of cognitive effort and depression. Paper presented at the meeting of the Association for the Advancement of Behavior Therapy, Boston.
Rude, S. S., Hertel, P. T., Jarrold, W., Covich, J., & Hedlund, S. (1999). Depression-related impairments in prospective memory. Cognition and Emotion, 13, 267-276.
Simon, H. A. (1994). The bottleneck of attention: Connecting thought with motivation. In W. D. Spaulding (Ed.), Integrative views of motivation, cognition, and emotion: Nebraska symposium on motivation (Vol. 41, pp. 1-21). Lincoln: University of Nebraska Press.
Tyler, S. W., Hertel, P. T., McCallum, M. C., & Ellis, H. C. (1979). Cognitive effort and memory. Journal of Experimental Psychology: Human Learning and Memory, 5, 607-617.
Watkins, P. C., Mathews, A., Williamson, D. A., & Fuller, R. D. (1992). Mood-congruent memory in depression: Emotional priming or elaboration? Journal of Abnormal Psychology, 101, 581-586.
Watkins, P. C., Vache, K., Verney, S. P., Muller, S., & Mathews, A. (1996). Unconscious mood-congruent memory bias in depression. Journal of Abnormal Psychology, 105, 34-41.
Weingartner, H., Cohen, R. M., Murphy, D. L., Martello, J., & Gerdt, C. (1981). Cognitive processes in depression. Archives of General Psychiatry, 38, 42-47.
Williams, J. M. G., Watts, F. N., MacLeod, C., & Mathews, A. (1988). Cognitive psychology and emotional disorders. New York: Wiley.
Williams, J. M. G., Watts, F. N., MacLeod, C., & Mathews, A. (1997). Cognitive psychology and emotional disorders (Second Edition). New York: Wiley.
RELATIONAL TIMING: A Theromorphic Perspective
J. Gregor Fetterman
Many studies of perception, cognition, and learning in humans and other animals have been inspired by the "comparative imperative," the notion that comparisons between humans and other species have a privileged status (e.g., see Wasserman, 1993, for an overview). These investigations have typically used the same independent and dependent variables in similar tasks to compare behaviors subserved by various perceptual and cognitive mechanisms (e.g., categorization; visual search; working memory). These experiments have at times demonstrated marked similarities across different species (e.g., Blough & Blough, 1990) and occasionally illuminated clear differences (e.g., Premack, 1983). An evolutionary biologist might view the similarities as reflective of invariant properties of the world and the differences as due to variability among niches (e.g., Shepard, 1984). The work presented in this chapter is representative of modern comparative cognitive psychology as it illustrates similarities and differences in the relational timing abilities of two species, research based on prior work with humans. Numerous studies of learning and memory in nonhuman animals have been predicated, both conceptually and methodologically, on research with humans (e.g., Blough, 1992; Grant, 1981). Oftentimes the procedural translations from the human to the animal domain seem straightforward as, for instance, with the delayed-matching-to-sample procedure (DMTS; e.g., Blough, 1959). DMTS is a standard technique used to study working memory in nonhuman organisms. There is a general consensus (e.g.,
Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved. 0079-7421/00 $30.00
White & Cooney, 1996) that this procedure and the process it engages bear important similarities to human working memory tasks (e.g., Peterson & Peterson, 1959), and many results from this area of research appear consistent with those obtained with humans (e.g., Roberts & Grant, 1976). In this chapter, however, I argue that conclusions about similarities and differences should be carefully evaluated because other species may not engage various "cognitive" tasks in the same manner as humans. That is, I suggest that there is a tendency to adopt an anthropomorphic (human-centered) as opposed to a theromorphic (animal-centered; Timberlake, 1994, 1997) perspective when considering methods and results in the field of comparative cognition. The basic points are that animals may adopt behavioral strategies different from those expected (and intended) by the experimenter, and that intuitions about what the animal "should" be doing derive from our own experiences in comparable situations; these experiences often guide inferences about data. Romanes (e.g., 1884), of course, was well-known for his views concerning the utility of anthropomorphic hypotheses about the minds of other animals. Boakes (1984) summarizes Romanes' position on this matter in the following way:

Our subjective experience, "consciousness", provides the only direct way of understanding the workings of our own minds and the basis of our actions. When we perceive that the activities of other people resemble what we do ourselves, then, on the basis of analogy, we attribute to them minds like our own. And the same holds with regard to animals: to the extent that their behavior is analogous to ours, then they possess minds. (p. 27)
Contemporary psychologists recognize that, although such attributions do occur, they are not logically justified. Nonetheless, anthropomorphic biases may influence the conduct of research in subtler ways. For example, some nonhuman organisms, including chimpanzees, monkeys, rats, and pigeons (e.g., Davis, 1992; Gillan, 1981), can be taught discriminations that appear to demand reasoning based on a process of transitive inference. Research on transitive inference in animals has attracted a great deal of attention because it seems to indicate that some nonhuman organisms possess human-like reasoning abilities. Research suggests, however, that behaviors that have been taken as evidence for a process of transitive inference may emerge as a result of simpler learning mechanisms, such as differential histories of reinforcement (Couvillon & Bitterman, 1992) or a process of value transfer (Steirn, Weaver, & Zentall, 1995). Zentall (1995) has framed this issue in terms of sufficiency versus necessity. Researchers in the field of animal cognition often begin by asking what behaviors are sufficient to support inferences that an animal possesses
a particular cognitive ability, such as the ability to chunk information (e.g., Terrace, 1993). Zentall notes, however, that researchers may fail to address the question of necessity; that is, although a behavior might be taken as sufficient evidence for demonstrating some cognitive capacity, that capacity may not be a prerequisite for the behavior in question, as a simpler mechanism could account for observed behavior. In other words, researchers sometimes favor explanations in terms of complex cognitive mechanisms when simpler explanations might suffice and the posited mechanisms typically are ones that a human might use in similar circumstances. What should one do to become less anthropomorphic and more theromorphic? A simple answer is that researchers should "place themselves in the position of the animal" (Timberlake, 1997, p. 117). This involves knowing about the perceptual, cognitive, motivational, and motor capacities of the animal, and projecting how these will be brought to bear on some specific task. It does not involve modeling what you would perceive and how you would act if you were in the position of the animal. Such a strategy is likely to lead to specious inferences and poor predictions, as the research in this chapter demonstrates. These issues are brought to bear on a program of research carried out by my colleagues and me, research that involved human-animal comparisons. I begin by laying out the general method and then attempt to consider the task from a theromorphic perspective. The remainder of the chapter is devoted to considering our results in light of the initial analysis of the task structure and a priori intuitions about how pigeons and people should engage the task.
I. General Method
The two-alternative forced-choice procedure (2AFC; e.g., Macmillan & Creelman, 1991) was used in all of the studies reported in this chapter. The technique is a classic in the field of human psychophysics. This procedure has been used to study perceptual comparisons along many continua (e.g., tone frequencies, line lengths) including stimulus duration (e.g., see Getty, 1975). In timing experiments, each trial involves a sequence of stimulus durations, t1 followed by t2. After the stimuli are presented, a subject might be required to identify which observation interval contained the "standard" duration; in this version, the value of the standard duration remains fixed over trials whereas the comparison duration sometimes may be very similar to the standard and at other times very different (e.g., Getty, 1975). A related version, referred to as the roving standard design (Allan, 1979), arranges different values of both stimuli; in this design, a subject
must identify which observation interval contains the shortest (or longest) duration. Although many studies of human timing have used the 2AFC task, most experiments on animal timing have used schedule-related tasks, such as the peak procedure (Roberts, 1981), or simple psychophysical methods, such as the method of single stimuli (e.g., Stubbs, 1968). Very few studies with nonhuman animals have employed the 2AFC technique. An analysis of this task based on current knowledge about learning and memory in nonhuman organisms provides some insights about this disparity. Researchers normally prefer to study behavior in simple rather than complex situations, and from the perspective of research on animal learning and memory, the 2AFC task may seem quite complex. For example, the 2AFC discrimination appears to demand that subjects base discriminative responses on the relation between successively presented stimuli (i.e., which stimulus lasted longer) because the values of individual durations vary across trials (the roving standard design). Many studies indicate, however, that nonhuman animals, especially pigeons, have great difficulty learning relational discriminations (e.g., Premack, 1978). Thus, from the standpoint of much research on the ability of pigeons and other nonhuman species to learn and transfer discriminations based on stimulus relations, the 2AFC task would seem less than ideal as a methodology for studying animal timing. Working memory also could affect discrimination because the value of the first interval must be retained over the length of the second interval for comparison at the time of choice. It seems plausible to assume some degradation of memory for the initial stimulus during the presentation of the second stimulus, and the extent of the degradation should depend on the value of the second stimulus. 
However, this assumption pits established limits on working memory for temporal intervals (e.g., Fetterman, 1995) against a basic principle of discriminations along prothetic continua: Weber's law, which states that our ability to discriminate stimuli depends on the relative (not absolute) differences between the stimuli. Thus, the accuracy of discriminating a 1-s interval against a 2-s interval should equal that observed when the discrimination involves a comparison of 5 against 10 s (both involve a 1 : 2 stimulus ratio). In the latter example, however, a degraded memory for the first interval might reduce accuracy as compared to the 1- versus 2-s discrimination, constituting a violation of Weber's law. Although the 2AFC task might seem complex, and therefore less than ideal as a timing methodology for nonhuman animals, there are at least two advantages in using such a technique. First, there is a vast human timing literature, and many experimenters have used the 2AFC procedure as an assay of timing. The extant database thus affords the possibility of numerous comparisons of human 2AFC performance against that of pigeons, a major focus of comparative cognitive research. Second, as noted, the task involves at least three cognitive processes of interest to researchers in comparative cognition (timing, relational learning, and working memory), processes that are often studied in isolation from one another. Whereas a strategy of isolating the effects of one variable from the influence of others has obvious advantages, it could be argued that strategies that involve the commingling of variables also are relevant, as they may possess greater ecological validity. In the remainder of this chapter I describe our research on temporal discrimination in pigeons and humans using the 2AFC procedure. I place particular emphasis on the data in the context of a priori notions concerning how a pigeon or human might approach these discriminations, notions based both on extant literatures and on anthropomorphic intuitions.
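The tension between Weber's law and working memory described in this section can be made concrete with a small numerical sketch. The model below is an illustrative assumption, not one taken from the chapter: each duration is assumed to be perceived with Gaussian noise whose standard deviation is proportional to the duration (a hypothetical Weber fraction `weber`), and a hypothetical `decay` parameter inflates the noise on the remembered first interval in proportion to the length of the second interval.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_correct(t1, t2, weber=0.25, decay=0.0):
    """Probability of correctly judging which of two durations is longer.

    Assumptions (hypothetical, for illustration only):
    - perceptual noise on each duration has SD = weber * duration;
    - memory for t1 degrades while t2 is presented, so its SD is
      multiplied by (1 + decay * t2).
    """
    sd1 = weber * t1 * (1.0 + decay * t2)
    sd2 = weber * t2
    return phi(abs(t2 - t1) / math.sqrt(sd1 ** 2 + sd2 ** 2))


# With no memory decay, equal ratios give equal accuracy (Weber's law).
same_ratio_equal = math.isclose(p_correct(1, 2), p_correct(5, 10))

# With memory decay, the longer pair is discriminated less accurately.
longer_pair_worse = p_correct(5, 10, decay=0.1) < p_correct(1, 2, decay=0.1)

print(same_ratio_equal, longer_pair_worse)
```

Under these assumptions, the 1-s versus 2-s and 5-s versus 10-s comparisons are equally discriminable only when `decay` is zero; any positive value produces the violation of Weber's law that the working-memory argument anticipates.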
II. Ordinal Comparisons of Duration
Fetterman and Dreyfus (1986) arranged a temporal 2AFC discrimination for pigeons. The stimuli consisted of all combinations of 0.5, 1, 2, and 4 s in one condition and 2, 4, 8, and 16 s in another, excluding combinations in which the first and second stimuli were equal. The durations were signaled by a sequence of red and green lights on the center key of a standard three-key operant chamber; that is, the red light was turned on for one duration and then replaced by the green light, which lasted for another duration. At the end of the red-green sequence, the center key was darkened and the left and right side keys illuminated with amber lights. Responses to one side key were reinforced when red (t1) lasted longer and responses to the alternate side key were reinforced when green (t2) lasted longer. Each condition included probe trials containing equal pairs of stimuli (e.g., 2 s of red followed by 2 s of green); humans sometimes exhibit systematic biases that reflect the order of stimulus presentation, known as time-order effects (e.g., Hellstrom, 1985). Choices on trials with equal stimulus pairs provide a sensitive measure of these effects. In addition, novel combinations of unequal stimuli were presented; some of these stimuli were outside the range used in training. Choices were never reinforced on these probe trials. All birds acquired the discrimination to a level of about 85% correct responses, roughly equal to that obtained with simpler psychophysical tasks (e.g., Stubbs, 1968). There were no differences in accuracy across stimulus ranges in spite of a fourfold difference in the values of the stimuli. When equal pairs of stimuli were introduced, subjects tended to respond that the second stimulus was longer, a negative time-order error (Hellstrom, 1985);
such errors are common in human psychophysical judgments, including judgments about the relative durations of stimuli (e.g., Allan, 1977). Although accuracy on this complex discrimination exceeded expectations, transfer tests with novel stimulus values, a standard assay of relational discrimination, suggested that the birds generally were not discriminating on a relational basis. For instance, after training with pairs composed of 0.5, 1, 2, and 4 s, the birds received probe trials where red lasted for 6 s and green lasted for 4 s. On these trials all birds consistently (and incorrectly) responded that green lasted longer, a result suggesting that the birds were responding according to the value of the second stimulus only. During training, 4 s of green always was longer, no matter what the value of red. Thus, a subject could ignore the value of one stimulus and respond accurately on the basis of the other whenever one of the stimuli was the shortest or longest value in the training set. A significant number of trials afforded the possibility of such a strategy. Although the method of Fetterman and Dreyfus (1986), strictly speaking, involved a roving standard, their task allowed the pigeons to sometimes discriminate on the basis of a single stimulus; subjects appeared to capitalize on this procedural shortcoming. Subsequent research with pigeons and humans aimed to significantly reduce this possibility, and to further explore similarities and differences in relational duration comparisons. Fetterman and Dreyfus (1987) changed the method for creating the duration pairs, using a technique that generated more than 900 possible combinations of the stimuli. In this version, each stimulus could last from 1 to 32 s in 1-s increments, making it more difficult for the pigeons to base choices on the value of a single duration. Sessions contained 80 trials, and therefore subjects experienced a subset of the pool of stimuli within each session. 
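The single-stimulus shortcut discussed above can be quantified by simple enumeration. The sketch below is an illustration rather than an analysis from the original studies; the function name `shortcut_share` is hypothetical. It counts the ordered, unequal duration pairs in which at least one member is the shortest or longest value of the set, since on those trials the extreme value alone determines the correct choice.

```python
from itertools import permutations


def shortcut_share(values):
    """Count ordered, unequal pairs solvable from a single stimulus.

    A pair is solvable from one stimulus when either member is the
    shortest or the longest value in the set. Returns a tuple
    (solvable pairs, total pairs).
    """
    values = list(values)
    lo, hi = min(values), max(values)
    pairs = list(permutations(values, 2))  # ordered pairs, no equal pairs
    solvable = [pair for pair in pairs if lo in pair or hi in pair]
    return len(solvable), len(pairs)


# Four-value training set of Fetterman and Dreyfus (1986).
print(shortcut_share([0.5, 1, 2, 4]))   # (10, 12)

# Durations of 1 to 32 s in 1-s steps, as in Fetterman and Dreyfus (1987).
print(shortcut_share(range(1, 33)))     # (122, 992)
```

With the four-value set, 10 of the 12 pairs contain an extreme value, whereas with the 1 to 32 s set only 122 of 992 pairs (about 12%) do, which illustrates why the revised method made a single-stimulus strategy far less effective.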
All other details of the task were as described earlier. Figure 1 displays the data for a representative pigeon in a way that gives the reader a sense both of the number of different duration combinations and a global picture of the resulting performance. The purpose of the figure is to provide the reader with a "gestalt" of the birds' performance, not to illustrate the finer details of discrimination. The figure is arranged as a matrix, with the value of the first duration represented on the ordinate and the value of the second on the abscissa. The cells represent different duration combinations and the symbols identify the outcomes of individual trials. Filled circles indicate correct responses and Xs indicate incorrect responses. The major diagonal separates the matrix according to problems in which the first duration was longer (above the diagonal), and where the second duration was longer (below the diagonal). The other lines separate the matrix into regions based on duration ratio (first to second duration), with the ratios identified by the numbers on the top and right borders of
[Figure 1: matrix of trial outcomes. Ordinate: first duration; abscissa: second duration (1 to 20 s). Lines mark the duration ratios 4:1, 2:1, 1.5:1, and 1:1.]
Fig. 1. Duration of the first stimulus (ordinate) against the duration of the second stimulus (abscissa). The intersection of the vertical and horizontal axes identifies specific duration combinations, and the symbols in the imaginary cells represent the outcomes on individual trials. Filled circles indicate correct responses and Xs indicate incorrect responses. Although the duration of each stimulus could range between 1 and 32 s, the figure shows a subset of possible combinations because values above 20 s were relatively infrequent. The matrix shows data of a single subject, Pigeon 91.
the figure. For instance, the line identified as 4 : 1 represents duration pairs that stand exactly in a 4 to 1 ratio; points to the left of this line have ratios greater than 4 to 1; points to the right of the line have ratios less than 4 : 1, and so forth. The major diagonal is labeled as 1:1 because it represents cases where the two durations were equal. Note first that filled circles predominate in the upper left and lower right quadrants of the matrix, regions that represent easy discriminations in the sense that one stimulus was considerably longer than the other. For example, the cell in the upper left corner of the matrix indicates trials where the first stimulus lasted 20 s and the second lasted 1 s; the symbols in this cell show that this subject was correct on every presentation of this pair. Similarly, the cell in the lower right corner shows the complementary problem, where the first stimulus lasted one second and the second lasted twenty
seconds. Again, the symbol indicates that the subject was correct on the single presentation of this particular duration pair. Cells near the major diagonal (between 1.5:1 and the major diagonal and 1 : 1.5 and the major diagonal) represent more difficult discriminations in which both the relative and absolute differences in the durations were smaller than for the cells in the upper left and lower right quadrants. Casual inspection indicates that these regions of the matrix contain more Xs (errors) than other quadrants. For instance, the cell representing the duration pair 10 s versus 7 s contains three Xs and three filled circles, showing that the pigeon was correct three times and incorrect three times. The keen observer may also detect an asymmetry in the distribution of symbols on either side of the diagonal; more Xs appear above the diagonal than below it. This asymmetry may be interpreted as another example of negative time-order effects; subjects tended to be more accurate when the second stimulus was longer (e.g., 7 s vs. 8 s) than for (seemingly) comparable problems where the first stimulus lasted longer (e.g., 8 s vs. 7 s). The unfilled symbols along the major diagonal show the results of trials on which the durations of the stimuli were equal. Unfilled circles indicate a response that the first stimulus was longer and unfilled triangles indicate a response that the second stimulus was longer. When the durations were relatively short, the pigeon tended to report that the first stimulus was longer (positive time-order error), but when the durations were relatively long, the pigeon tended to report that the second stimulus was longer (negative time-order error). This pattern was observed in all birds (see Fetterman & Dreyfus, 1987, for a detailed discussion of this pattern). Figure 1 gives the impression that the accuracy of discrimination was related to relative stimulus differences, or, stated more precisely, to the ratio of the stimuli. 
On this analysis, equal stimulus ratios should produce equal levels of discriminability, irrespective of the absolute values of the stimuli (e.g., 2 s vs. 1 s and 10 s vs. 5 s should be equally discriminable); this is simply a restatement of Weber's law. Figure 2 replots the matrix data in a psychophysical format, formalizing the idea that the ratio of the two durations was the relevant variable. The various duration pairs were grouped into categories that included a narrow range of ratios (e.g., 1:4 to 1:3, 1:3 to 1:1.5, etc.); note, however, that each category included pairs with different absolute values of the stimuli. The figure shows the probability of responding that the first stimulus was longer ("t1 > t2") as a function of the ratio of t1 to t2. The curve is a smooth ogive reflecting orderly changes in performance along the duration ratio dimension. As before (Fetterman & Dreyfus, 1986), accuracy was a function of stimulus ratio, irrespective of the values of the stimuli that composed the ratio (Dreyfus, Fetterman, Smith, & Stubbs, 1988, report similar findings).
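The grouping of duration pairs into ratio categories can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis code: the bin edges in `EDGES`, the function name `psychometric`, and the example trials are all hypothetical. Each trial is a tuple of the first duration, the second duration, and whether the subject responded "t1 > t2".

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical category boundaries on the t1/t2 ratio
# (1:4, 1:3, 1:1.5, 1:1, 1.5:1, 3:1, 4:1).
EDGES = [1 / 4, 1 / 3, 1 / 1.5, 1.0, 1.5, 3.0, 4.0]


def psychometric(trials, edges=EDGES):
    """Return {bin index: proportion of "t1 > t2" responses}.

    Pairs with different absolute durations but similar ratios fall
    into the same bin, so absolute duration is deliberately ignored.
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [first-longer responses, trials]
    for t1, t2, chose_first in trials:
        b = bisect_right(edges, t1 / t2)
        counts[b][0] += int(chose_first)
        counts[b][1] += 1
    return {b: k / n for b, (k, n) in sorted(counts.items())}


# Made-up example trials: (t1 in s, t2 in s, responded "t1 > t2").
example = [
    (2, 1, True), (10, 5, True), (10, 5, False),  # ratio 2:1
    (1, 2, False), (5, 10, False),                # ratio 1:2
]
print(psychometric(example))
```

In this example the 2 s vs. 1 s and 10 s vs. 5 s trials share one bin and the 1 s vs. 2 s and 5 s vs. 10 s trials share another; plotting the resulting proportions against ratio would give the kind of function the text describes.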
Relational Timing

Fig. 2. Probability of responding that the first stimulus was longer than the second stimulus ("t1 > t2") as a function of duration ratio (t1/t2).
These results are similar to those obtained with humans under comparable conditions, but are somewhat surprising in light of other literatures in the field of animal learning. For example, these and related results (Dreyfus, Fetterman, Stubbs, & Montello, 1992) demonstrate that the majority of the birds' choices were controlled by relational rather than absolute stimulus information; yet pigeons are notoriously inept at learning and transferring discriminations based on stimulus relations (e.g., Premack, 1978). In addition, the experiments failed to reveal an effect of absolute duration; that is, accuracy was invariant across different duration pairs that maintained a constant ratio of the stimuli, such as 2:1 (e.g., 8 s vs. 4 s, 16 s vs. 8 s, etc.). Although this result is consistent with predictions based on Weber's law for temporal discriminations (Killeen & Weiss, 1987), it is difficult to reconcile with facts about working memory in the pigeon. Many studies show that pigeons' memories for the properties of recent events, such as stimulus duration (e.g., Fetterman, 1995; Spetch & Wilkie, 1983), decrease with increases in the time since the to-be-remembered events. If subjects must maintain a "representation" of the duration of the first interval for comparison with the second interval, increases in the duration of the second interval should have a deleterious effect on the fidelity of that memory; yet, as noted, the present results did not bear out this expectation. However, subsequent research with carefully constructed duration pairs reveals a decrease in accuracy with increases in absolute duration, consistent with the hypothesis that there is a substantial working memory component to the temporal 2AFC procedure (see Stubbs, Dreyfus, Fetterman, Boynton, Locklin, & Smith, 1994, for details).
J. Gregor Fetterman
III. Ratio Comparisons of Duration
The abscissa of Fig. 2 characterizes the controlling dimension as a ratio of the stimuli, even though subjects were not required to partition the stimuli on a ratio scale. The contingencies simply specified that different responses could be reinforced depending on which stimulus lasted longer, an ordinal comparison. Nonetheless, discrimination was an orderly function of changes in the duration ratio. In subsequent experiments, both pigeons and humans were studied under a version of the 2AFC procedure that explicitly required a partitioning of the stimuli based on duration ratios. Fetterman, Dreyfus, and Stubbs (1989) tested pigeons on a task that humans might see as a qualitatively different and more complex version of the 2AFC procedure. The basic task structure remained intact: subjects viewed a sequence of red and green lights, each lasting for some duration; two choices were offered at the end of the sequence and one choice was correct after one class of duration pairs and the other after another, mutually exclusive class. In this experiment, however, duration pairs were separated into classes according to whether the ratio of the first to the second interval was less or greater than a criterion ratio. In one condition, for instance, one choice was reinforced when the duration ratio was greater than 2:1 (e.g., 10 s followed by 3 s) and the alternate response was reinforced when the ratio was less than 2:1 (e.g., 10 s followed by 7 s). Many different values of the stimuli were used, as with Fetterman and Dreyfus (1987), to
Fig. 3. Probability of responding that the ratio of the first to the second duration exceeded a criterion ratio as a function of duration ratio (t1/t2). From left to right, the curves signify performance under criterion ratios of 1:4, 1:2, 1:1, 2:1, and 4:1. See text for additional details.
minimize the possibility that a subject could base the discrimination on a single interval. Three pigeons experienced criterion ratios (t1 to t2) of 1:4, 1:2, 1:1 (equivalent to choosing according to which stimulus lasted longer), 2:1, and 4:1, and the conditions were presented in a pseudo-random order. Prior to exposure to conditions explicitly based on duration ratios, all birds had extensive experience discriminating according to which color lasted longer (1:1 ratio); the birds were returned to the 1:1 condition after exposure to different criterion ratios. The redetermination of the 1:1 performance was used as the baseline for comparison against the other conditions, and it is important to note that accuracy in the redetermination of the 1:1 condition (which followed conditions with different criterion ratios) did not differ significantly from the first exposure. Figure 3 provides a psychophysical portrait of the resulting discrimination. Each curve represents the probability of responding that a duration ratio exceeded the criterion ratio as a function of categories of duration ratios, as in Fig. 2. From left to right, the curves depict performance under criterion ratios of 1:4, 1:2, 1:1, 2:1, and 4:1. The break in the connected symbols indicates the point at which the contingencies for choosing according to the criterion ratio changed. The curves are orderly and ogival, reflecting control by the duration ratio dimension across all conditions. Most important (and surprising), analyses demonstrated that accuracy did not differ across the various conditions, in spite of the fact that (to most human observers) ordinal comparisons ("which stimulus lasted longer") appear less complex than those that call for comparisons based on ratios of stimuli. A priori notions about the relative difficulty of the duration ratio discrimination proved wrong, at least for pigeons.
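The criterion-ratio contingency described above can be sketched as a small classification rule. The function name `reinforced_choice` and the response labels are hypothetical. The text does not say how pairs falling exactly on the criterion were treated, so returning `None` for that case is an assumption.

```python
from fractions import Fraction
from typing import Optional


def reinforced_choice(t1: int, t2: int,
                      criterion: Fraction = Fraction(2, 1)) -> Optional[str]:
    """Return which response class is reinforced for a duration pair.

    t1, t2: durations of the first and second stimulus (same units).
    criterion: criterion ratio of t1 to t2 (e.g., Fraction(2, 1) for 2:1).
    Pairs exactly at the criterion return None (an assumption; the
    source does not describe this case).
    """
    ratio = Fraction(t1, t2)  # exact arithmetic avoids rounding at the boundary
    if ratio == criterion:
        return None
    return "ratio exceeds criterion" if ratio > criterion else "ratio below criterion"


# Examples from the text for the 2:1 condition.
print(reinforced_choice(10, 3))   # 10 s followed by 3 s
print(reinforced_choice(10, 7))   # 10 s followed by 7 s

# With a 1:1 criterion the rule reduces to "which stimulus lasted longer".
print(reinforced_choice(8, 7, Fraction(1, 1)))
print(reinforced_choice(7, 8, Fraction(1, 1)))
```

Using `Fraction` rather than floating-point division is a design choice for the sketch: it keeps pairs such as 10 s vs. 5 s exactly on a 2:1 boundary instead of relying on rounding.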
This finding led, somewhat obviously, to another experiment designed to evaluate the correctness of our intuitions for humans. Fetterman, Dreyfus, and Stubbs (1993) presented college students with two versions of the 2AFC task with the intent of reproducing the essential features of the pigeon research. Participants viewed a sequence of red and green lights and then judged the relative durations of the stimuli by pressing one of two telegraph keys. Informative feedback was provided after correct responses. Each subject served under two versions of the procedure, with task order appropriately counterbalanced across subjects. All participants judged which stimulus lasted longer in one condition (200 trials), and whether the ratio of the durations was less or greater than a criterion ratio in another condition (400 trials). Criterion ratios of 1:4, 1:2, 2:1, and 4:1 were used for different groups. Many duration combinations were used (again, the roving standard design), and the stimuli were relatively brief (the great majority were less than 4 s; see Fetterman et al.,
1993, for details) in an effort to preclude chronometric counting strategies (e.g., Fetterman & Killeen, 1990). At the start of each session, participants were given a verbal description of the rule for comparing durations and the contingencies for choices; this information also was displayed on the computer monitor on every trial at the time of choice. Figure 4 shows the main result as a scatterplot. Each point represents a subject's performance on both tasks, with the vertical dimension specifying accuracy in judging which stimulus lasted longer and the horizontal dimension showing accuracy in comparing the stimuli according to a duration ratio. The dependent variable, A', is a nonparametric signal detection index (Grier, 1971). A' typically ranges between 0.5 (chance) and 1.0 (perfect discrimination). The symbols identify subgroups exposed to different criterion duration ratios. Most of the points fall above the major diagonal, indicating more accurate discrimination for judgments based on which stimulus lasted longer. Statistical comparisons revealed that accuracy was significantly higher for the ordinal ("longer") comparisons of duration, contrary to the results obtained with pigeons.

Fig. 4. Accuracy (A') in judging which stimulus lasted longer (ordinate) against accuracy in judging according to a criterion duration ratio (abscissa). Each data point represents the paired scores of one subject. The symbols identify different criterion ratios. From Fetterman, J. G., Dreyfus, L. R., & Stubbs, D. A. (1993). Discrimination of duration ratios by pigeons (Columba livia) and humans (Homo sapiens). Journal of Comparative Psychology, 107, 3-11. Copyright © 1993 by the American Psychological Association. Reprinted with permission.

Although these results illustrate important differences between pigeons and humans in temporal comparisons based on ordinal and ratio rules, some questions about the differences remain. First, the difference between ordinal and ratio comparisons for humans could involve the rate of acquisition only. Accuracy on the ratio task after 400 trials was significantly lower than on the ordinal task after 200 trials, but we cannot say whether the accuracy of ratio judgments was asymptotic. Ratio comparisons improved significantly between the first 200 trials of practice and the second 200 trials, and might have continued to improve with additional practice, eventually equaling that observed with ordinal comparisons. Second, pigeons might acquire the ratio discrimination more slowly than the ordinal version, but asymptotically learn the two tasks to the same degree. Fetterman et al. (1989) used a completely within-subjects design and thus could not draw meaningful conclusions about differences in acquisition. However, Fetterman and Dreyfus (1994) compared rates of acquisition in different groups of pigeons and found that the two tasks were acquired at the same rate. Thus, pigeons acquire these seemingly different discriminations at the same rate, and asymptotically to the same level of performance. At a minimum, however, humans take longer to learn the ratio discrimination than the ordinal discrimination.

Figure 5 summarizes the findings of Fetterman et al. (1989, 1993). The figure shows the accuracy of discrimination for each species under both discrimination conditions. For humans, A' scores are significantly higher for the discrimination of which stimulus was longer, whereas the mean scores for pigeons are not reliably different. As demonstrated in Fig. 5, pigeons were equally accurate in comparing durations according to ordinal and ratio rules, even though the ratio rule
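For readers unfamiliar with the A' index reported in this section, the following minimal sketch shows the computing formula commonly given for it. Two things are assumptions rather than details taken from the studies: the mapping of the two-choice task onto hits and false alarms (a hit is taken to be a "first longer" response when the first stimulus was longer; a false alarm is the same response when the second was longer), and the mirror-image branch for below-chance performance, which is a commonly used extension. The function name `a_prime` is hypothetical.

```python
def a_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Nonparametric sensitivity index A' from hit and false-alarm rates.

    For H >= F:  A' = 0.5 + (H - F)(1 + H - F) / (4 H (1 - F))
    For H <  F:  the mirror-image expression is used (assumed extension).
    """
    h, f = hit_rate, false_alarm_rate
    if h == f:
        return 0.5  # chance performance
    if h > f:
        return 0.5 + ((h - f) * (1.0 + h - f)) / (4.0 * h * (1.0 - f))
    return 0.5 - ((f - h) * (1.0 + f - h)) / (4.0 * f * (1.0 - h))


# Hypothetical rates, chosen only to show the range of the index.
for h, f in [(0.5, 0.5), (0.8, 0.2), (1.0, 0.0)]:
    print(f"H = {h:.2f}, F = {f:.2f}: A' = {a_prime(h, f):.3f}")
```

The index equals 0.5 when hits and false alarms are equally likely and 1.0 when responding is perfect, which matches the range described in the text.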
.05) during either test session. It may seem somewhat surprising that the instructional manipulation had no impact on the accuracy of discrimination when participants were required to discriminate on the basis of which stimulus lasted longer. By comparison, the manipulation produced robust effects on the discriminations involving the ratio and same-different rules. Table I clarifies the issue by providing data obtained during debriefing. Participants serving in the
TABLE I
NUMBER OF PARTICIPANTS STATING THAT THEY JUDGED ACCORDING TO DIFFERENT RELATIONAL RULES ("STATED RULE") AGAINST THE EXPERIMENTER-DEFINED RULE ("PROGRAMMED RULE")

                        Programmed rule
Stated rule             Longer    3:1 Ratio    Same-different
"Longer"                  10          6              7
"Same-different"           0          6              4
"Other"                    1          0              1
"Don't know"               1          0              0
uninformed conditions were asked to state the rule they used to guide their choices (this was a free-response situation; no response categories were provided). The table summarizes the responses by showing both the programmed rule and participants' responses to the query concerning what rule they used for comparing the durations. The table is arranged as a matrix showing the programmed rule against various response categories provided by the participants. It is clear from Table I that the great majority of participants in the ORDINAL condition correctly surmised that the discrimination involved a judgment of which stimulus lasted longer; not surprisingly, the accuracy of discrimination for these participants was not significantly different from that of their informed counterparts. However, the majority of participants in the SAME-DIFFERENT condition also (incorrectly) guessed that the task involved judging which duration lasted longer; only four participants correctly guessed that the judgment involved the SAME-DIFFERENT rule, but the accuracy of their judgments did not differ significantly from that of participants who were not able to verbalize the correct rule. None of the participants in the RATIO condition guessed correctly; half said the judgment involved which stimulus lasted longer and the other half said that they judged according to a SAME-DIFFERENT rule. Figure 8 provides information on the acquisition of the discriminations under each rule condition. The figure shows mean A' scores across blocks of 50 trials; each panel shows performance for a different temporal rule and the filled and unfilled symbols represent the two instructional conditions. The gaps between the connected points identify the end of the first and the beginning of the second session, and the vertical bars signify standard errors of the means. It is clear from Fig. 8 that the discrimination based on which interval lasted longer was acquired very rapidly, within 150 trials, for both informed and uninformed participants. Asymptotic levels
Fig. 4. Results of Heit (1998, Experiment 1). Reprinted by permission of APA.
Knowledge Selection in Category Learning
Generally speaking, it now seems well-accepted that background knowledge has a selective role in category learning, in a number of ways, just as this point has been made in related areas of research such as memory (e.g., Alba, 1983) and reasoning (e.g., Wason, 1960).
D. CONCEPTS USED AS INPUT TO BACKGROUND KNOWLEDGE
Wisniewski and Medin (1994) have argued that knowledge-driven processing and data-driven processing must be tightly coupled. That is, information should flow in both directions. They demonstrated this point in a set of studies in which subjects' background beliefs about categories such as creative children had to be adapted when observing stimuli such as ambiguous drawings done by children. It appeared that the subjects used the stimuli to acquire more general knowledge about how to parse drawings into features. (See also Schyns, Goldstone, & Thibaut, 1998, for an extensive discussion of how people learn to represent categories in terms of features.) This point regarding the flow of information from newly learned concepts to background knowledge was demonstrated in another way by Heit (1994), who, following a standard procedure of teaching subjects about people in city W, asked the subjects to make background judgments about people in the whole state rather than just in this one city. Figure 5 shows sample results, adapted from Heit (1994, experiment 5). There were large effects of prior knowledge, which was not surprising given that subjects were asked a general knowledge question. However, the slope of the lines also indicates that what the subjects observed in city W (just eight observations per
Fig. 5. Results of Heit (1994, Experiment 5). Reprinted by permission of APA.
Evan Heit and Lewis Bott
category) had a substantial effect on their background knowledge judgments as well. The fact that subjects were tested immediately after they had observed the descriptions of people in city W could have led to an amplification of this effect. However, a longer delay between study and test may have made source discrimination even more difficult, as in a sleeper effect (e.g., Hovland & Weiss, 1952). Therefore, it is unclear to what extent Heit's procedure (1994) magnified the influence of new concepts on background knowledge. Heit (1994) accounted for the effects of recent observations on background knowledge judgments in the same way as the effects of background knowledge on a recently learned concept. In both cases, the integration model was applied, making the assumption that a categorization judgment would depend on the retrieval of memories for observations, corresponding to background knowledge and members of a recently learned category. The only difference in the accounts of the two cases was that for background knowledge judgments, it was assumed that a greater proportion of the retrieved memories would correspond to background knowledge compared to the judgments about a new concept. Predictions of the integration model are shown as the lines in Fig. 5. (For another discussion of revision of background knowledge, emphasizing rule-based systems, see Mooney, 1993.)
E. OBSERVATIONS USED TO SELECT BACKGROUND KNOWLEDGE
Until now in this chapter, a crucial issue concerning knowledge effects on category learning has been passed over. At the beginning of the chapter, we argue that people face the problem of too many individual cases, so they treat individual things as belonging to categories. Yet this solution raises another problem--that there can be an extremely large number of ways to group a set of individuals into categories. This problem can be addressed, we argue, by using background knowledge to constrain category learning. Unfortunately, this solution itself raises yet another problem; namely, the problem of selecting prior knowledge. Just as there are many individual observations to deal with, and many possible category structures that could be considered, there are many possible sources of background knowledge that could be helpful in learning about a new category. For example, imagine visiting a new town or university campus and looking at the buildings there, trying to learn about the general layout and architectural styles. Many sources of background knowledge could possibly be helpful, such as memories of other towns or other campuses. In fact, it would be easy for the number of past observations to greatly outnumber the number
of new observations! Even if past observations are organized and summarized, into a smaller number of categories, there will still be information corresponding to many different places and many different kinds of buildings. How could a person select useful information from all of this background knowledge and, in light of this knowledge selection problem, how could background knowledge actually make concept learning easier? On the surface, the knowledge selection problem would seem very troublesome for experimental and computational approaches to category learning and influences of past knowledge. It would be easy to justify not doing research on this topic. Although the knowledge selection problem does seem very imposing, and potentially unsolvable, it is still important to note that people do solve this problem every day. People face new situations and they manage to retrieve useful background knowledge somehow. In spite of the large numbers of things to observe, possible categories to put them in, and possible sources of background knowledge to guide this categorization, people are not normally left helpless due to issues in computational complexity. Therefore, we do see the knowledge selection problem as an appropriate issue for empirical study; namely, we are interested in how people find useful prior knowledge for category learning from the many possible sources of prior knowledge. In addition, it is encouraging to pick up any textbook on Bayesian statistics (e.g., Raiffa & Schlaifer, 1961) and find many techniques listed for combining multiple prior beliefs with observations, and selecting among these beliefs based on the data observed. In Bayesian statistics there is no assumption that a learner starts with optimal or perfectly correct prior beliefs. Instead, the learner begins with a reasonable guess that merely serves as an initial basis for learning, with corrective information then provided by the data. 
Indeed, it is possible to start with a whole set of different prior beliefs, with a distribution of initial degrees of confidence in each of these. When observations are made, confidence in various prior beliefs can be increased or decreased as appropriate (see also Heit, 1998b). That is, observations can be used to select from among a set of prior hypotheses. Therefore, Bayesian statistics already does provide an approach for addressing the knowledge selection problem, and indeed, our own categorization model to be proposed in this chapter takes some ideas from the Bayesian approach. Still, it might be argued that even Bayesian statistics does not fully address the knowledge selection problem because these methods merely indicate how to select among a set of prior hypotheses, but they do not say which prior hypotheses should be chosen. The key point is that Bayesian techniques can be applied to a large set of prior hypotheses, even when many of them are highly abstract, repetitive, or even ill chosen, as long as this set covers the hypothesis space well enough so that the target concept can be represented.
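The Bayesian idea described above, starting with a set of prior hypotheses and letting observations shift confidence among them, can be sketched in a few lines. This is a generic illustration of Bayes' rule over a discrete hypothesis set, not the categorization model proposed in this chapter. The hypotheses (candidate proportions of category members showing some feature), the equal starting confidences, and the observation sequence are all hypothetical.

```python
from typing import Dict, Sequence


def bernoulli_likelihood(p: float, observations: Sequence[int]) -> float:
    """Probability of a 0/1 observation sequence if the feature rate is p."""
    likelihood = 1.0
    for x in observations:
        likelihood *= p if x == 1 else (1.0 - p)
    return likelihood


def posterior(priors: Dict[float, float],
              observations: Sequence[int]) -> Dict[float, float]:
    """Update confidence in each hypothesis by Bayes' rule.

    priors maps each hypothesized feature rate to its initial confidence.
    """
    unnormalized = {p: w * bernoulli_likelihood(p, observations)
                    for p, w in priors.items()}
    total = sum(unnormalized.values())
    return {p: v / total for p, v in unnormalized.items()}


# Three rival prior beliefs held with equal initial confidence (hypothetical).
priors = {0.1: 1 / 3, 0.5: 1 / 3, 0.9: 1 / 3}
# Eight observations; 1 = feature present, 0 = feature absent (hypothetical).
observations = [1, 1, 0, 1, 1, 1, 1, 1]

for rate, confidence in posterior(priors, observations).items():
    print(f"hypothesized rate {rate:.1f}: posterior confidence {confidence:.4f}")
```

None of the three starting beliefs needs to be correct in advance; the data simply concentrate confidence on the hypothesis that best fits what was observed, which is the sense in which observations select among prior hypotheses.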
Many previous experiments on knowledge effects on category learning, including Heit (1994, 1995, 1998a), have avoided the knowledge selection problem by more or less telling the subjects which prior knowledge to use in learning new categories. For example, when subjects learned about shy people in city W, it was easily understood that they were supposed to use prior knowledge of shy people in the real world. In contrast, some experiments have given subjects a more difficult task, using unlabeled categories or nonsense labels that minimize the clues available that might indicate which prior knowledge might be useful (e.g., Murphy & Allopenna, 1994; Wisniewski, 1995). For example, in Murphy and Allopenna (1994), subjects learned about categories of animals, vehicles, and buildings, with labels such as "Category 1" and "Category 2." These labels obviously did not constrain the knowledge selection problem very much. When a subject learned about a category of vehicles, for example, there were many known types of vehicles that could be informative. It was impossible to know in advance whether to use prior knowledge about snowmobiles, ice cream vans, heavy trucks, or jeeps. However, the content of the category itself (that is, the descriptions of category members) was helpful in finding useful prior knowledge. For example, when subjects observed a category member with the description "Made in Africa, lightly insulated, and drives in jungles," they were able to access knowledge about vehicles used in hot weather such as jeeps, rather than knowledge about other vehicles such as snowmobiles and heavy trucks. This process is denoted in Fig. 1 by the arrow running from observations to background knowledge. In these experiments, subjects had so much possible prior knowledge to apply to category learning that they needed to use the observations themselves to select and assemble helpful prior knowledge.
Our own experiments were an attempt to further address the phenomenon of knowledge selection for category learning. Like Murphy and Allopenna, we used building categories (in experiment 1) and vehicle categories (in experiment 2). Given the extensive range of background knowledge people have for these domains, and the many familiar categories within these domains, we see these stimuli as encouraging knowledge selection processes. Unlike Murphy and Allopenna, we collected data over the course of learning. It seemed valuable to look at knowledge selection processes as they unfold over time. One of our goals was to show that in some situations, categorization judgments are not affected early on by prior knowledge until many observations have been made and relevant prior knowledge can be assembled--the opposite result of Heit (1995). Therefore, it was necessary to collect categorization judgments after various numbers of category members had been observed. Another advantage of
collecting data along the course of learning was that our data were suitable for developing and testing a computational model of category learning. The greater number of data points compared to Murphy and Allopenna's experiments provided a more constraining data set for modeling. Our general prediction for these experiments was that, in terms of various measures, there would be increasing knowledge effects over the course of learning because subjects would have no indication, at the start of learning, which of many sources of prior knowledge would be relevant. We see this as a useful area of empirical study because most past experiments in this area just have not addressed the time course of prior knowledge effects. More important, a major class of models would make just the opposite prediction--namely, that prior knowledge would have its greatest influences early on, and these influences would be reduced over the course of learning. This prediction is made by "knowledge-first" categorization models, such as the integration model of Heit (1994), that have an initial store of prior knowledge, represented as exemplars, rules, prototypes, or connection strengths, and simply revise this representation to reflect local conditions. Early on, prior knowledge dominates judgments because that is the only information available. However, error-correcting learning mechanisms would lead to a more veridical representation over time, diminishing any influences of prior knowledge. We next present our two experiments on knowledge selection in category learning, followed by a more general review of computational models that employ prior knowledge and then by the introduction of a new computational model that addresses knowledge selection effects.
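The "knowledge-first" prediction described above, that prior knowledge matters most early and fades as error correction proceeds, can be illustrated with a deliberately simple sketch. It is not the integration model of Heit (1994); it assumes a single estimate that starts at a prior value and is nudged toward each observation by a simple error-correcting (delta-rule) update. The function name, the learning rate, and the numerical values are hypothetical.

```python
from typing import List, Sequence


def knowledge_first_estimates(prior: float,
                              observations: Sequence[float],
                              rate: float = 0.2) -> List[float]:
    """Track an estimate that starts at the prior and is error-corrected.

    After each observation the estimate moves a fraction `rate` of the way
    toward that observation (hypothetical learning rate).
    """
    estimate = prior
    trajectory = [estimate]
    for obs in observations:
        estimate += rate * (obs - estimate)  # error-correcting update
        trajectory.append(estimate)
    return trajectory


# Prior belief of 0.90 versus a locally observed value of 0.25 (hypothetical).
trajectory = knowledge_first_estimates(prior=0.90, observations=[0.25] * 40)
for trial in (0, 8, 16, 24, 32, 40):
    print(f"after {trial:2d} observations: estimate = {trajectory[trial]:.3f}")
```

In this sketch the distance between the estimate and the observed value shrinks by a constant factor on every trial, so the influence of the prior is largest at the outset and declines monotonically. That is the opposite of the increasing knowledge effects predicted for the experiments reported next.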
II. Experiment 1
A. METHOD
In this first experiment, the 77 subjects learned about two categories of buildings, referred to as "Doe buildings" and "Lee buildings." The subjects were told to imagine that they were reading a book with a series of descriptions, each corresponding to a different building. The stimuli were organized in five blocks, with descriptions of four Doe buildings and four Lee buildings presented in each block. Each description included the category label (Doe or Lee) and a list of featural information, presented in a randomized order. There were two critical features presented in each description and two filler features. The critical features for each category were related to a known type of building (e.g., churches for Doe and office blocks for Lee or vice versa). The filler features, arbitrarily assigned to each category, were general
characteristics that could be true of just about any building. Finally, each description contained three pieces of individuating information (name of builder, surveyor, and photographer). This information was included simply to make the descriptions a bit longer and more difficult so that learning did not occur too quickly. Results for the individuating features are not reported here. The critical and filler features were derived from a pretest. The object of the pretest was to ensure that the critical features would be grouped together consistently to form two categories and that the filler features would be distributed evenly between these two categories. The pretest involved a series of sorting tasks in which subjects were asked to place each feature into one of two groups. (Subjects were not given category labels for the two groups; instead, they freely sorted cards with feature names into two piles.) Initially, there were 18 pairs of binary features: 9 intended to be critical features and 9 intended to be filler features. For successive runs of the pretest, critical features were dropped or replaced if subjects did not show a strong preference for putting them in one category, and likewise filler features were dropped if subjects did show a strong preference for one category or the other. After a series of iterations of this procedure, a set of 8 pairs of critical features and 8 pairs of filler features was obtained. A final pretest group of 20 subjects sorted each of the critical features with at least 90% preferring one group over the other, and for the filler features preference for one group was always less than 75%. In addition, subjects were readily able to describe one sorted pile of features as being related to churches or old buildings, and the other as being related to office buildings or other commercial buildings. The complete list of critical features as well as sample filler features is shown in Table II. 
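The pretest outcome described above can be expressed as a simple screening rule. The sketch below treats the reported figures (at least 90% agreement for critical features, less than 75% for filler features, 20 subjects in the final pretest group) as if they were selection thresholds, which is an assumption; the text reports them as properties of the final feature set. The function names and the "ambiguous" label for intermediate cases are hypothetical.

```python
def majority_share(votes_for_group_a: int, n_subjects: int = 20) -> float:
    """Proportion of subjects who placed the feature in its more popular group."""
    return max(votes_for_group_a, n_subjects - votes_for_group_a) / n_subjects


def classify_feature(votes_for_group_a: int, n_subjects: int = 20) -> str:
    """Label a feature by how consistently subjects sorted it (assumed rule)."""
    share = majority_share(votes_for_group_a, n_subjects)
    if share >= 0.90:
        return "critical"   # strongly tied to one of the two groups
    if share < 0.75:
        return "filler"     # spread fairly evenly across the two groups
    return "ambiguous"      # neither clearly critical nor clearly filler


# Hypothetical sorting counts out of 20 subjects.
for votes in (19, 2, 11, 15):
    print(f"{votes}/20 in group A -> {classify_feature(votes)}")
```

A feature sorted into the same pile by nearly everyone counts as critical regardless of which pile it is, which is why the rule looks at the larger of the two shares.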
From the 8 pairs of critical features, 4 pairs were randomly assigned to presentation frequency one. Each feature in each pair was presented in one description per block, either Doe or Lee. Two pairs were assigned to presentation frequency two, and each feature presented in two descriptions per block. Finally, two pairs of features were not presented at all in the study blocks (but they were tested in test blocks). The whole experiment was a sequence of five study-test blocks. In each study block, the building descriptions, each with a category label, were presented individually, for 6 s each. A sample description would be: {Lee building type, Builder: T Jones, near a river, has gas central heating, Surveyor: R Rawson, Photographer: A Ferraro, has steeply angled roof, has wooden furniture}. Subjects were instructed to try to memorize the stimuli. Following each study block was a test block in which subjects were asked to categorize 40 single features in the Doe or Lee categories. These test items included 24 individuating features, 8 critical features (4 presented
TABLE II
CRITICAL AND FILLER FEATURES FOR BUILDING STIMULI

Critical features
Has steeply angled roof           Has a flat roof
Has wooden furniture              Has metal furniture
Has an interesting structure      Has a repetitive structure
Old building                      New building
Quiet building                    Busy building
Lit by candles                    Lit by fluorescent light
Ornately decorated                Blandly decorated
Built with stone                  Built with metal and concrete

Sample filler features
Near a bus station                Not near a bus station
Designed by a local architect     Designed by an international architect
Has gas central heating           Has electric central heating
once, 2 presented twice, and 2 not presented), and 8 filler features (same distribution as critical features). Overall accuracy feedback was given at the end of each test block to encourage good performance.
B. RESULTS
Initial analyses did not reveal any significant differences between presentation frequency 1 and presentation frequency 2; therefore, the results were pooled over these two presentation frequencies. The average proportions correct are shown in Fig. 6. The top panel shows responses to features that had been presented during the study blocks. Overall, there is a trend for performance to improve over blocks. Although there is no difference between critical and filler features in the first block, the
Evan Heit and Lewis Bott
[Figure residue: BUILDINGS. Panel title: Data--Presented Items. Series: Critical. Horizontal axis: Block.]
never presented in study blocks, categorization performance clearly improved from the first block to the fifth block. The results were analyzed with a three-way ANOVA with block, feature type (critical or filler), and presentation (observed or not observed) entered as variables. Each of the variables had a statistically significant main effect, and likewise each of the two-way interactions was significant. Perhaps the most important interaction was the feature type by block interaction, supporting the observation that the difference between critical and filler features increased across blocks.
III. Experiment 2

A. METHOD
This experiment was intended to be a replication of the first experiment with a different stimulus set (vehicles rather than buildings). The main procedural change was that the experiment had six study-test blocks rather than five, in an effort to get a fuller picture of the course of learning. The critical and filler features were derived from a pretest in a similar manner to experiment 1. One set of critical features was intended to be related to tractors and the other was related to racing cars. The critical features as well as sample filler features are shown in Table III.

B. RESULTS
Again, there was not any significant effect or interaction due to presentation frequency of features (once or twice per block), so the data were pooled over these two presentation frequencies. The results, in terms of average proportion correct, are shown in Fig. 7. Again, the pattern is for performance to improve with increased training, for people to be more accurate on critical features than filler features, and for the difference between critical features and filler features to increase over time. For example, on presented features there is a 10% difference between critical and filler features in block 1, but a 22% difference in block 4. The advantage of critical features over filler features is diminished somewhat by block 6, but this result may be due to a ceiling effect on critical features. Also, on the nonpresented features, there is steady improvement on critical features from block 1 to block 6 (and judgments on filler features again represent chance guessing). The results of a three-way ANOVA were similar to those of the first experiment, in that each of the three main effects (block, type of feature, and presentation) as well as the two-way interactions were statistically significant.
TABLE III
CRITICAL AND FILLER FEATURES FOR VEHICLE STIMULI

Critical features
Useful for pulling heavy objects   Not useful for pulling heavy objects
Is very heavy                      Is very light
Used for doing work                Used for entertainment
Drives on dirt roads               Drives on smooth roads
Uses diesel                        Uses petrol
Driver sits high off the ground    Driver sits close to the ground
Not aerodynamic                    Aerodynamic
Goes slowly                        Goes fast

Sample filler features
Has a rectangular gearbox          Has a spherical gearbox
Tires made of synthetic rubber     Tires made of natural rubber
Has gas shock absorbers            Has hydraulic shock absorbers
IV. Discussion of Experiments
The similarities between these two experiments are more important than the differences. In both experiments, subjects were increasingly influenced by background knowledge over the course of learning, in contrast to the results of Heit (1995). One source of evidence for increasing influences of knowledge is the results for presented features, in the top panels of Figs. 6 and 7. For the building stimuli, there was no difference in classification accuracy for critical and filler features after the first training block, but by the end of the second block subjects had apparently retrieved prior knowledge that facilitated performance on critical features compared to filler features. Realizing that the Doe buildings are churchlike and the Lee buildings are like office buildings, for example, would help answer questions
about critical features but not filler features. Although performance on critical and filler features continued to improve over the course of learning, the advantage for critical features was persistent. The results for vehicles were similar, except that there was an advantage for critical features even after the first block. Perhaps for these stimuli, seeing just four observations per category was enough to retrieve some useful prior knowledge. It is possible that if we had tested subjects halfway through the first study block of experiment 2, the results would have been more similar to experiment 1. In addition, for the vehicle stimuli, the advantage for critical features over filler features increased over time, more than doubling from the first block to the fourth block.

Fig. 7. Results for Experiment 2. [VEHICLES. Panels: Data--Presented Items; Data--Non-Presented Items. Series: Critical, Filler. Horizontal axis: Block.]
The other source of evidence for changes in knowledge effects is the judgments on nonpresented critical features, shown in the bottom panels of Figs. 6 and 7. Subjects were never told the correct category for these features during training blocks. The only way to classify these features correctly was on the basis of general knowledge (about buildings or vehicles). In both experiments, performance on nonpresented critical features improved over the course of learning, suggesting that subjects were increasingly relying on appropriate knowledge for making judgments about these features. Why were the results of these experiments so different from those of Heit (1995)? Why do prior knowledge effects sometimes increase with learning and other times decrease with learning? The main difference between the present experiments and Heit (1995) is that in the present experiments, the category label names (e.g., Doe building type) did not suggest any particular source of prior knowledge, whereas in Heit (1995), the categories (e.g., shy people in city W) readily suggested which prior knowledge should be used. The Heit (1995) experiments failed to detect any increased use of prior knowledge over learning because there was an initial ceiling effect--the relevant prior knowledge was so easily retrieved at the start of the experiment, there was no chance for its influence to increase any further. Why didn't the present experiments find less use of prior knowledge over time? Indeed, there was a persistent advantage for critical features over filler features, even in blocks 5 and 6. It is hard to say whether performance on presented filler features would ever come up to the level of presented critical features, even with much more training. 
It seems likely that continued testing of individual features interleaved with training blocks would encourage subjects to learn about as many features as possible, but practical matters such as greater levels of motivation in early blocks compared to later blocks might make it difficult for filler features to ever be learned as well as critical features. One surprising result, or lack of result, from these experiments was the lack of difference between features presented once per block and features presented twice per block. For both critical and filler features, we did not find any statistically significant difference in judgments for the two levels of presentation, despite the 100% difference in presentation frequency. It is tempting to relate this finding to results from Murphy and Allopenna (1994), who also found low sensitivity to frequency manipulations for stimuli that lead to retrieval of prior knowledge. However, it would be wrong to conclude that people are not sensitive to frequency information when category learning involves prior knowledge. For example, Heit (1994, 1995, 1998a) documented a very robust pattern of responses to variations in frequency of presentation (see Figs. 2-4). Also, informal debriefing of
subjects suggested to us that because each description, containing eight pieces of information, only appeared for 6 s, there may have been some strategic scanning of information. For example, in each block some subjects might have looked for features that had not already been presented in that block to maximize the amount of fresh learning per block. So the effect of a second presentation of a feature within a block could have been diminished due to some subjects' learning strategies. Therefore, we find the lack of frequency effects interesting, but it seems to require further study before stronger conclusions are reached. Indeed, Spalding and Murphy (in press) have argued that the lack of sensitivity to frequency in Murphy and Allopenna would depend on the judgment task being used (e.g., classification or frequency judgment).
V. Putting Knowledge into Neural Network Models
Having collected some data on the time course of knowledge selection in category learning, we set out to develop and apply a computational model that could address these phenomena. Previous modeling efforts (Heit, 1994, 1995, 1998a) did not address knowledge selection at all. Rather than continuing along these lines of extending the framework of exemplar models, we decided to develop a new model within the framework of connectionist or neural network models. Although exemplar models have some advantages, such as their simplicity and their wide success in application to categorization data, connectionist models seem to provide a richer descriptive framework. That is, the greater complexity of connectionist models in terms of possibilities for different architectures, learning rates, activation rules, initial connection weights, and so on, provides more opportunities for describing distinctive effects of knowledge on learning, as well as an appropriate framework for describing the dynamics of learning and the interplay of knowledge, concepts, and observations. Also, there has already been a great deal of research, mainly outside of psychology, on different ways of putting knowledge into neural networks. Before we present our own model, we review some of this past work, largely from the field of engineering. A useful framework for discussing prior knowledge in neural networks has been developed by Geman, Bienenstock, and Doursat (1992). In their discussion of computational models of learning, they demonstrated that the generalization error when learning a concept can be broken down into a bias component and a variance component. Models that rely heavily on prior assumptions about the data (e.g., having architectural constraints that favor a particular conceptual structure) can lead to a high bias component;
that is, the model can persistently fail to capture aspects of the target concept that do not meet its prior assumptions. However, models that do not make strong assumptions about the concept to be learned can show a high variance component; that is, they will be easily swayed by noise in training samples. Therefore a model without many assumptions could require an excessively large training sample to achieve satisfactory generalization performance. Furthermore, reducing one type of error frequently is accompanied by an increase in the other type of error, leading to what Geman et al. (1992) referred to as the bias-variance dilemma. To reduce generalization error, both bias and variance must be reduced. One way of doing so would be to increase the number of training examples. Unfortunately, as Geman et al. show, in practice the number of training examples will be insufficient to achieve anywhere near optimal performance. We next review a number of learning algorithms from artificial intelligence (AI) research that are aimed at reducing generalization error, keeping in mind the need to minimize the number of training examples as well. One method for reducing the number of examples required for good generalization is to introduce "hints" into neural networks (Abu-Mostafa, 1993, 1995). Hints are general properties of a class of target concepts, independent of the specific details of the training data. For example, a hint in letter recognition might be that the mapping of a pixel image of a letter to the identification of that letter is position invariant. Hints are introduced into the network by presenting "virtual examples" of the hint and altering the error function to incorporate a term for the hint. (There is some similarity between virtual examples and "prior examples" in Heit, 1994.)
Building on the work of Vapnik and Chervonenkis (1971), Abu-Mostafa has derived a theoretical framework for predicting how much a particular hint will reduce the need for training examples. Another approach to prior knowledge is to insert biases directly into neural networks by setting the weights before learning begins. This approach has been taken by, for example, Frasconi, Gori, and Soda (1995) and Giles and Omlin (1993). In both cases the specific method was to insert transition rules into recurrent neural networks; known transitions were built into the network and then unknown transitions were learned from the data. Giles and Omlin showed that "malicious" rules or incorrect prior knowledge could be overcome gradually by corrective training data. As Frasconi et al. (1995) noted, however, a potential problem with this method is that the longer a network is trained, the more likely it is to use a solution based on the data, thereby forgetting its prior knowledge. Frasconi et al. suggested a compromise of allowing the weights to vary within a constrained space, which was the technique employed by Choi, McDaniel, and
Busemeyer (1993). Also, rather than inserting knowledge directly, it is possible to train the network in one input-output domain and then rely on this prior knowledge to help learning about a structurally similar domain, freezing a subset of the hidden units to prevent forgetting (Dienes, Altman, & Gao, in press). We next review ways of building in prior knowledge by varying the network architecture. The basic goal here is to allow the network to have sufficient representational power to capture the underlying concept, but at the same time to avoid fitting the noise in the data. This goal is another way of looking at the bias-variance dilemma--a network that is too small leads to a high bias, but a network that is too large leads to high variance (and fitting the noise). Constructive networks (e.g., Giles, Chen, Sun, Chen, Lee, & Goudreau, 1995; Mareschal & Schultz, 1996; Prechelt, 1997) expand their architecture during learning, allowing the complexity of the network to increase as the data suggest it. Destructive networks, however, start off with an excess of hidden units and then prune off the hidden units that are not useful (e.g., Mozer & Smolensky, 1989; Reed, 1993; or, for a more biological treatment, Brown, Hulme, Hyland, & Mitchell, 1994). The advantage constructive networks have is that they might require less computation than destructive nets and that there is no need to make an initial guess at the appropriate number of hidden units (Giles et al., 1995). Rather than varying the network architecture over the course of learning, another approach is to employ more than one architecture within a mixed network and allow the network itself to learn which of the architectures is best for a particular problem. An example of this approach is the mixture-of-experts network (Jacobs, 1997; Jacobs, Jordan, & Barto, 1991; see also Erickson & Kruschke, 1998). For example, Jacobs et al.
(1991) used a mixed network, with three modules having different structures (no hidden units, medium number of hidden units, and a high number of hidden units). In effect, each module took a different approach to the bias-variance dilemma, with the simplest network being most constrained in terms of what it could learn and the network with many hidden units being most sensitive to variation in a training sample. The network was trained to perform two tasks: an object localization task and an object recognition task. The localization task was simpler in that it did not require hidden units for good performance. The mixture-of-experts network learned to allocate the module without hidden units to the localization task while it allocated one of the modules with hidden units to the recognition task. We see the mixture-of-experts approach as coming close to the Bayesian idea of starting with multiple hypotheses then selecting among them based on the data (and see Jacobs, 1995, for a more substantial comparison).
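The selection step in a mixture-of-experts network can be illustrated with a small sketch. It is not code from the cited papers: it assumes the commonly used formulation in which a softmax gate assigns each module a prior weight and the module whose output is closest to the target receives the largest posterior share ("responsibility"). The function names, the three example outputs, and the unit-variance Gaussian error term are all illustrative assumptions.

```python
import math


def softmax(scores):
    """Turn arbitrary gate scores into mixing proportions that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def responsibilities(gate_scores, expert_outputs, target):
    """Posterior share of each expert: gate prior times fit to the target."""
    gates = softmax(gate_scores)
    # Assumed unit-variance Gaussian likelihood of the target under each expert.
    fits = [math.exp(-0.5 * (target - out) ** 2) for out in expert_outputs]
    joint = [g * f for g, f in zip(gates, fits)]
    total = sum(joint)
    return [j / total for j in joint]


# Three hypothetical modules, initially weighted equally by the gate.
gate_scores = [0.0, 0.0, 0.0]
expert_outputs = [0.2, 0.9, 0.5]
target = 1.0

post = responsibilities(gate_scores, expert_outputs, target)
print([round(p, 3) for p in post])
```

With equal gate scores, the module whose output lies nearest the target receives the largest share, which is the sense in which the data select among the competing modules.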
VI. The Baywatch Model

A. OVERVIEW
Our own approach to the knowledge selection problem has some parallels to the mixture-of-experts architecture, but instead of using modules with different structures, we used modules with different pools of pretrained knowledge. Therefore, our method also has some relations to techniques that insert prior knowledge directly into networks. Our own model, illustrated in Fig. 8, can be described as having one module or set of weights for strictly empirical learning. These weights do not get any pretraining. Then the model also has a set of experts that are pretrained to recognize different known categories. For example, a network for learning about buildings might have experts that can recognize different kinds of buildings such as churches, office blocks, restaurants, and schools. (Only two of these expert modules are illustrated in Fig. 8.) We refer to this model as the Baywatch model because it combines a general Bayesian approach to selecting among multiple sources of prior knowledge with an empirical learning component.

Fig. 8. Illustration of Baywatch model. [Output units: Doe, Lee. Prior knowledge nodes: Church, Office. Input units: four labeled F (left) and four labeled C (right).]

The Baywatch model is a feedforward network in which the input units represent the individual features and the output units represent the Doe and Lee category nodes. The two hidden units correspond to two expert modules, or prior knowledge category nodes (PK nodes). The four input units on the left of Fig. 8 represent filler features, and the four inputs on the right represent the critical features. The only difference between the two types of features is that the filler features are only connected to the output nodes, whereas the critical features are connected both directly to the output nodes and indirectly to the output nodes via the prior knowledge nodes. The difference between filler and critical features in the model reflects our assumptions about how learning would take place in our experiments. Consequently, we required filler features to be learned directly without the help of prior knowledge, whereas critical features were to be learned both directly and by a mediated connection through prior knowledge. The connections between the critical features and the PK nodes have fixed weights, so that values of critical features of the stimuli that correspond to church features would activate the church PK node, and likewise critical features of the stimuli that correspond to offices would activate the office PK node. It is assumed that these fixed weights would correspond to prior knowledge about familiar characteristics of churches and office blocks learned through ordinary means of association. The PK nodes have threshold functions, so that if any church feature, say, a steeply angled roof, is presented, then the church PK node will be activated. The activation from the PK node would then be propagated to the output units. In contrast to the connection weights between the critical features and the PK nodes, the other weights in the network are learnable through gradient descent on the error between the desired output of the network and the actual output. Adjusting the weights from filler units and the critical units to the output units allows the features to be associated with the category nodes in the empirical learning module.
Note that if these were the only weights in the network, there would be no difference between the two types of features. Finally, there are adjustable weights between the PK nodes and the category nodes. These represent the subject's capacity to associate known categories--say, churches and office blocks--with the new categories, Doe and Lee buildings. We see this part of the network as addressing (at least in part) the knowledge selection problem, because here the network is learning to select from already known categories and apply this knowledge to judgments about new categories. We also note that the same simulations were used to address experiments 1 and 2, which had the same stimulus structure and similar results. (We continue to refer to buildings rather than buildings and vehicles, for simplicity.)

B. TECHNICAL DETAILS OF THE MODEL
The input units can take on the values {+1, 0, -1}, which correspond to the Doe value of a feature, the feature not being present, and the Lee
value of a feature respectively. For instance, if the feature is the lighting feature (see Table II), then a -1 value would mean the "lit by candles" value, a 0 would correspond to not presenting the feature at all, and a +1 would mean "lit by fluorescent lights." The two output units vary continuously between -1 and +1. One output unit corresponds to the Doe category and the other to the Lee category. The activation on each category was given by the weighted sum of its inputs. This activation was then converted into a probability measure using the logistic transformation given in Gluck and Bower (1988, equation 7). If a Doe exemplar is presented during training, the teaching values for the category nodes are +1 on the Doe node and -1 on the Lee node (Table IV). These values would be reversed for a Lee training example. Critical features are connected by fixed weights to the PK nodes. As can be seen from Fig. 8, these were connected so that if the Lee value (-1) of a feature is presented, this leads to positive activation on the church PK node (because Lee buildings would correspond to churches) and a negative activation on the office node. The output of a PK node was a threshold transformation of the weighted sum of its inputs, such that the output was 1 if the sum was greater than or equal to 1, and 0 otherwise. All of the adjustable weights in the network were adjusted according to the standard delta rule (e.g., Gluck & Bower, 1988).

C. SIMULATION OF EXPERIMENTS
The network was trained for a total of 10 epochs, with the learning rate in the delta rule set at 0.1 and the probability mapping constant for the logistic transformation function set at 7.0 (both values were derived from an informal sampling of the parameter space). The training stimuli consisted of four examples of buildings--two Doe exemplars and two Lee exemplars--which are shown in Table IV. The first two rows are the Doe buildings and the second two rows are the Lee buildings. Note that the fourth features in the critical feature section and in the filler feature section always have a value of zero. These features correspond to those that were never presented to the subjects in the experiments.

TABLE IV
STRUCTURE OF THE TRAINING DATA

Filler features      Critical features    Desired output
 1   1   0   0        1   1   0   0         1  -1
 1   0   1   0        1   0   1   0         1  -1
-1  -1   0   0       -1  -1   0   0        -1   1
-1   0  -1   0       -1   0  -1   0        -1   1

Following each training epoch, the network was tested on the individual features by presenting a vector of all zeroes except for the particular feature of interest, which had a value of either +1 or -1. The results of the simulations are displayed in Fig. 9, with the proportion correct on the test set shown as a function of the number of learning epochs and feature type. The top panel shows the model's predictions for presented features. As for the results of the experiments, the predictions for features presented once per epoch and features presented twice per epoch are pooled together.
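The training and test procedure just described can be sketched in a few lines. This is a minimal reconstruction under stated assumptions, not the authors' simulation code: the weights start at zero, the four training items are presented in the fixed order of Table IV, the fixed weights into the two PK nodes are +1 and -1, and the probability of a correct response is taken to be a logistic function of the correct category node's activation scaled by the mapping constant. All function and variable names are illustrative.

```python
import math

LEARNING_RATE = 0.1  # delta-rule learning rate reported in the text
THETA = 7.0          # probability mapping constant reported in the text
EPOCHS = 10
N_FILLER = 4         # inputs 0-3 are filler features, inputs 4-7 are critical

# Table IV: each row is (input vector, (Doe target, Lee target)).
TRAINING = [
    ([1, 1, 0, 0, 1, 1, 0, 0], (1, -1)),
    ([1, 0, 1, 0, 1, 0, 1, 0], (1, -1)),
    ([-1, -1, 0, 0, -1, -1, 0, 0], (-1, 1)),
    ([-1, 0, -1, 0, -1, 0, -1, 0], (-1, 1)),
]


def unit_vector(x, use_pk):
    """Return the inputs followed by the two thresholded PK activations."""
    if not use_pk:
        return list(x) + [0.0, 0.0]
    total = sum(x[N_FILLER:])
    # Assumed fixed weights: +1 into one PK node, -1 into the other.
    plus_node = 1.0 if total >= 1 else 0.0    # driven by +1 critical values
    minus_node = 1.0 if -total >= 1 else 0.0  # driven by -1 critical values
    return list(x) + [plus_node, minus_node]


def outputs(units, weights):
    """Weighted sum of all units for the Doe and Lee output nodes."""
    return [sum(u * w[j] for u, w in zip(units, weights)) for j in range(2)]


def train(use_pk=True, epochs=EPOCHS):
    """Delta-rule training, starting from zero weights (an assumption)."""
    weights = [[0.0, 0.0] for _ in range(len(TRAINING[0][0]) + 2)]
    for _ in range(epochs):
        for x, target in TRAINING:
            units = unit_vector(x, use_pk)
            out = outputs(units, weights)
            for i, u in enumerate(units):
                for j in range(2):
                    weights[i][j] += LEARNING_RATE * (target[j] - out[j]) * u
    return weights


def p_correct(index, value, weights, use_pk=True):
    """Probability of the correct category for a single-feature test item."""
    x = [0] * len(TRAINING[0][0])
    x[index] = value
    doe, lee = outputs(unit_vector(x, use_pk), weights)
    correct_node = doe if value > 0 else lee  # +1 is a Doe value, -1 a Lee value
    return 1.0 / (1.0 + math.exp(-THETA * correct_node))


weights = train()
for label, index in [("filler, presented", 0), ("critical, presented", 4),
                     ("filler, not presented", 3), ("critical, not presented", 7)]:
    print(f"{label:24s} {p_correct(index, 1, weights):.3f}")
```

Calling `train(use_pk=False)` gives a comparison network without PK nodes. Because the sketch fixes details that the text leaves open, its numerical output should not be expected to match the published figures exactly.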
Fig. 9. Predictions for both experiments. [MODEL PREDICTIONS. Panels: Presented Items; Non-Presented Items. Series: Critical, Filler. Horizontal axis: Epoch.]
The bottom panel shows predictions for features that had not been presented during training. The predictions fit well with the main results of the experiments. Critical features were learned more quickly than filler features, and critical features that had not been presented were responded to more accurately than chance, whereas filler features that had not been presented were at chance level. The first result can be explained in terms of the extra connections from critical feature inputs to the output units, mediated by connections through the PK nodes. As the network progressively learned which sources of prior knowledge correspond to the Doe and Lee categories, responses on critical features were derived both from the empirical learning module and from prior knowledge. In addition to these two paths of influence on the category outputs, the other advantage for critical features over filler features is that there are two paths of learning, in effect leading to twice as much updating of weights after a particular learning trial. A similar advantage for presented critical features over presented filler features might be obtained without any PK nodes at all by simply increasing the learning rate on the critical features relative to the filler features. However, that scheme would not predict any advantage for nonpresented critical features over nonpresented filler features. In the Baywatch model, for nonpresented critical features and filler features, the weights leading from the input units directly to the output units remain at zero throughout learning. Because this is the only way the filler features can activate the output units, their accuracy stays at chance level. In contrast, the nonpresented critical features have another route to the category units, through the PK nodes whose weights are adjusted when any critical feature is presented. Therefore the PK nodes are critical to the Baywatch model's predictions on nonpresented critical features.
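The argument about nonpresented features can be written out with the delta rule. The notation below is assumed for illustration and does not appear in the chapter: \eta is the learning rate, x_i an input value, a_k the 0/1 output of PK node k, t_j and o_j the teaching value and activation of category node j, w_{ij} a direct input-to-category weight, and v_{kj} a PK-to-category weight. Weights are assumed to start at zero.

```latex
\begin{align*}
o_j &= \sum_i w_{ij}\, x_i + \sum_k v_{kj}\, a_k \\
\Delta w_{ij} &= \eta\,(t_j - o_j)\, x_i, \qquad
\Delta v_{kj} = \eta\,(t_j - o_j)\, a_k \\
x_i = 0 \text{ on every training trial}
  &\;\Rightarrow\; \Delta w_{ij} = 0
  \;\Rightarrow\; w_{ij} = 0 \text{ throughout learning} \\
\text{nonpresented filler feature tested alone:}\quad
  o_j &= w_{ij}\, x_i = 0 \\
\text{nonpresented critical feature tested alone } (a_k = 1)\text{:}\quad
  o_j &= w_{ij}\, x_i + v_{kj}\, a_k = v_{kj}
\end{align*}
```

Since v_{kj} is updated on every trial in which any critical feature activates PK node k, the nonpresented critical feature inherits whatever association that node has acquired, while the nonpresented filler feature has no route to the category nodes.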
To provide a better idea of how the Baywatch model uses prior knowledge, we reran the simulations without any PK nodes for comparison. In Fig. 10, we show simulated predictions on presented items, comparing versions of the model with and without prior knowledge. For critical features, in the top panel, it can be seen directly that prior knowledge does not have any influence initially on judgments; the model acts the same way with or without PK nodes. However, the beneficial effect of prior knowledge for critical features increases over the course of learning, as the network with PK nodes learns which categories to connect with its prior knowledge. In the bottom panel of Fig. 10, there is evidence for a slight detrimental effect of prior knowledge on the learning of filler features. This result can be explained as a kind of overshadowing effect, in which knowledge of some highly predictive cues can reduce learning on other predictive cues. As a consequence of the delta rule, when the network learns to predict the outputs increasingly well from the critical feature inputs, learning on
the filler features will increasingly be disadvantaged. However, one possible difference between our experiments and the model is that the repeated testing of individual features could encourage subjects in the experiments to learn as much as possible about each individual feature regardless of how much is known about other features.

Fig. 10. Predictions with and without prior knowledge nodes. [Panels: Critical Presented Items; Filler Presented Items. Series: With PK, Without PK. Horizontal axis: Epoch.]
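The overshadowing effect attributed to the delta rule can be seen in a stripped-down case. The sketch below is a generic illustration, not the Baywatch simulation: a single compound stimulus is trained toward a target of +1, every active unit has activation 1, and weights start at zero (all assumptions). The only thing varied is how many units share the error with the filler cue.

```python
LEARNING_RATE = 0.1
TRIALS = 200


def train_compound(n_units, trials=TRIALS):
    """Delta rule on one compound stimulus: all units active, target +1."""
    weights = [0.0] * n_units
    for _ in range(trials):
        error = 1.0 - sum(weights)  # every unit has activation 1
        weights = [w + LEARNING_RATE * error for w in weights]
    return weights


# Unit 0 is the filler cue; any further units compete with it for the error.
filler_alone = train_compound(1)[0]          # filler only
filler_with_critical = train_compound(2)[0]  # filler + critical feature
filler_with_pk = train_compound(3)[0]        # filler + critical feature + PK node

print(filler_alone, filler_with_critical, filler_with_pk)
```

Because the error is shared among all active units, the weight acquired by the filler cue shrinks as more predictive units are trained alongside it, which is the sense in which an added PK pathway can slightly depress learning about filler features.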
VII. Evaluation of the Baywatch Model
The Baywatch model captures many of the important features of the two experiments on knowledge selection in category learning. At the start of
learning, the model is not influenced by prior knowledge, because it does not know which past categories are useful for making predictions about the Doe and Lee categories. However, as observations are made, the model is able to select relevant prior knowledge to be used for judgments about the novel categories. This influence of prior knowledge leads to a persistent advantage for critical features over filler features. Admittedly, the Baywatch model would require more experimental testing before a complete evaluation can be made, but even this initial application brings up some interesting issues. One notable difference between the model's predictions and subjects' performance is that the model would predict a robust effect of presentation frequency; that is, more accurate judgments for features presented twice per block compared to features presented once per block. (This prediction is not shown in Fig. 8, however.) In contrast, there was no significant difference between these two levels of presentation in the experiments. This insensitivity to frequency could be an important aspect of concept learning in knowledge-rich domains (cf. Murphy & Allopenna, 1994), in which case it would be important to try to capture it in a future version of the Baywatch model. However, in the present experiments the lack of sensitivity to presentation frequency could just reflect subjects' reading strategies and might be highly dependent on number of features per presentation and the reading time allowed for each presentation. Therefore, further experimental study is required. Perhaps a more fundamental question is to what extent the Baywatch model is really addressing the knowledge selection problem. The simulations were run with just two sources of prior knowledge (e.g., churches and office blocks) and the network was able to link up these two sources with the correct output categories, Doe and Lee.
However, people would obviously have a much larger number of known categories when facing the knowledge selection problem due to large numbers of known kinds of buildings, vehicles, and so on. How well would the Baywatch model scale up? We think the model might scale up well, specifically in terms of adding more prior knowledge nodes. Our investigations so far have distinguished three different classes of PK nodes that might be added to the network in Fig. 8, in addition to the church and office nodes. First, completely irrelevant prior knowledge nodes might be added that have little or no connection to the input stimuli. For example, there could be prior knowledge nodes for space stations, igloos, tents, and cave dwellings added to the network, but these nodes would hardly be activated by the inputs. For example, an input feature such as "lit by fluorescent light" would not be strongly associated with these categories, according to prior
knowledge. Therefore, adding PK nodes that are irrelevant to the stimuli would not affect the results of the simulations very much. Second, additional PK nodes that are similar to the existing PK nodes might be added. For example, a PK node corresponding to cathedrals would entail much of the same connections to inputs as the church node. Likewise, there might be similar PK nodes for industrial parks and office buildings. In further simulations, we added a cathedral PK node that had two connections to the critical features for churches (to the critical feature presented twice and the nonpresented critical feature) and an industrial park PK node that likewise was connected to two critical features for office buildings. The results are shown in Fig. 11, comparing the original simulations with two PK nodes to the new simulations with four PK nodes. Inserting the two additional PK nodes improved performance on those critical features that now had two paths for knowledge-directed learning. However, inserting PK nodes did worsen performance on filler features because the additional reliance on critical features led to some overshadowing of filler features. Likewise, there was a slight decrement on performance (not shown in Fig. 11) on critical features that differed within a pair of PK nodes (e.g., features that were true of office buildings but not industrial parks). Still, to the extent that sources of prior knowledge were mutually supporting, having multiple sources of prior knowledge helped performance. Generally speaking, we did not find that adding additional similar PK nodes led to a knowledge selection problem. This result raises an interesting question about our experiments. Although we observed better performance on critical features than filler features, due to increased use of prior knowledge, the results themselves do not indicate which prior knowledge was being retrieved. 
Some subjects could well have been retrieving knowledge about cathedrals rather than churches, or industrial parks rather than office buildings. Indeed, informal debriefings of subjects revealed some variety of responses to questions about what the experimental stimuli were like in the real world. Third, "malicious" prior knowledge nodes could be added to the network, for example, prior knowledge about some kind of building that is half-church and half-office block. Although we initially expected that malicious PK nodes would hurt performance, we had some trouble finding any negative effects in simulations. A half-church, half-office PK node would not get activated very much by our training stimuli, which after all did not contain any items that were half-church, half-office. To the extent that the malicious PK node did get activated, the network would learn equal associations between it and both the Doe and Lee output units. In sum, the malicious PK node was poor competition for real PK nodes, because it did not match the inputs well and it did not become strongly associated
[Figure 11 omitted. Model predictions plotted against training epoch. Upper panel, critical items: Extra PK-Presented, Standard-Presented, Extra PK-Not Presented, Standard-Not Presented. Lower panel, filler presented items: Standard, Extra PK.]
Fig. 11. Predictions with additional prior knowledge nodes compared to standard network.
with one output rather than the other. Again, we failed to find any knowledge selection problem due to adding malicious PK nodes. Of course, we intend to conduct further simulations involving additional PK nodes, but so far prospects look fairly good for the model's potential to be scaled up with more PK nodes and perform knowledge selection. The success of the Baywatch model in dealing with multiple PK nodes bears a great deal of resemblance to the ability of Bayesian statistics to work with multiple prior hypotheses, including some that are irrelevant, some that are repetitive, and some that are incorrect.
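The analogy with Bayesian statistics can be made concrete with a minimal sketch. It is not part of the Baywatch model; the hypothesis names, the likelihood values, and the observation sequence below are illustrative assumptions chosen only to show how a Bayesian learner copes with redundant, uninformative, and incorrect prior hypotheses when estimating whether a novel building category has a church-like feature.

```python
from math import prod

# Hypothetical probability that a Doe building shows a church-like feature,
# under each prior hypothesis (illustrative values, not fitted to any data).
LIKELIHOOD = {
    "church": 0.9,      # relevant hypothesis
    "cathedral": 0.9,   # repetitive: makes the same prediction as "church"
    "igloo": 0.5,       # irrelevant: says nothing either way about the feature
    "hybrid": 0.5,      # "malicious" half-church, half-office hypothesis
    "office": 0.1,      # incorrect hypothesis
}

# Uniform prior over all hypotheses (an assumption of this sketch).
PRIOR = {h: 1.0 / len(LIKELIHOOD) for h in LIKELIHOOD}

# 1 = feature present in an observed Doe building, 0 = feature absent.
OBSERVATIONS = [1, 1, 1, 0, 1, 1]


def posterior(prior, likelihood, observations):
    """Return the normalized posterior probability of each hypothesis."""
    unnormalized = {}
    for h, p_h in prior.items():
        p_feature = likelihood[h]
        data_likelihood = prod(
            p_feature if obs else 1.0 - p_feature for obs in observations
        )
        unnormalized[h] = p_h * data_likelihood
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


def predictive(post, likelihood):
    """Posterior-weighted probability that the next building has the feature."""
    return sum(post[h] * likelihood[h] for h in post)


post = posterior(PRIOR, LIKELIHOOD, OBSERVATIONS)
for name, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {p:.4f}")
print(f"predictive {predictive(post, LIKELIHOOD):.4f}")
```

In this sketch the two repetitive hypotheses share most of the posterior mass, the uninformative ones retain a modest share, and the incorrect one is nearly eliminated, so the prediction stays close to what the relevant hypothesis alone would give.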
More generally, we see the knowledge selection problem as surely having many facets. Certainly one of them is that when learning about novel categories, a learner would need to link up knowledge of familiar categories with judgments about the novel categories. The Baywatch model seems to address this aspect of knowledge selection, in terms of the gradual selection of prior knowledge nodes to use for a particular novel output category. In contrast, the prior knowledge in terms of connections from input units to PK nodes is fixed at the start of the simulations. It is assumed that these connections would have been already learned through ordinary associative processes so that the network can more or less instantly recognize church or office buildings. However, there could be some gradual aspects of knowledge activation or retrieval that are not captured by the model. It could be the case that somehow the connections between input units and PK nodes would be strengthened over the course of making observations so that the recognition of relevant categories in prior knowledge would not be instantaneous when a single observation is made. It could be valuable to study this aspect of knowledge selection more directly, for example by showing subjects a series of training examples and asking them to judge directly which familiar categories are related to these stimuli. Finally, we would point out that the Baywatch model as presented in this chapter is but one possible variant within a larger class of models that could perform knowledge selection. For example, referring to Fig. 8, the model could have category label units (Doe and Lee) added to the input layer as well as feature units (F0, F1, etc.) added to the output layer, turning the model into an auto-associator. Such a model could make a greater variety of inferences, such as feature-to-feature inferences (e.g., Heit, 1992) in addition to the feature-to-category inferences in the present version of the model. 
Hence, the auto-associator version could be applied to a wider range of experimental tasks. There are several other ways that the architecture of the Baywatch model could be modified. These changes were not necessary for fitting the results of our experiments, but they could be useful for application to other experimental designs. First, hidden units could be added to the empirical side of the network, allowing it to solve nonlinear classification problems. Second, the various modules in the network, including the empirical module and all the PK nodes, could be placed in greater competition with each other. The present architecture of Baywatch encourages cooperation between different modules, in the sense that outputs from multiple modules are combined to make a prediction. Instead, the network could be encouraged to specialize; for example, learning that different modules should be used for different stimuli. Some stimuli might be best classified with the empirical module alone, whereas other stimuli would be best classified based on a
single PK node. This scheme would force the network, for example, to choose between a church PK node and a cathedral PK node, rather than allowing their influences to combine. (See Jacobs, Jordan, Nowlan, & Hinton, 1991, for a further discussion of ways to increase competition between modules.) Perhaps an even more radical change would be to alter the nature of the knowledge-driven side of the network. The knowledge-driven part of the network and all the PK nodes could be replaced by a module that has been pretrained with a set of rules for identifying buildings. This kind of architecture would make Baywatch closer to hybrid rule-plus-association networks such as those by Ashby, Alfonso-Reese, Turken, and Waldron (1998) and Erickson and Kruschke (1998). However, it is unclear to what extent such a network would make different predictions. Another, less extreme change to Baywatch would be to allow learning on the connections between critical input features and the PK nodes (again, see Fig. 8). At present, these connections are fixed at the start of learning, but it is possible that allowing these weights to change slowly would allow the network to address the issue of how global theories might change over time. That is, people may have a set of prior concepts that help learning, but these concepts themselves could be modified occasionally. To give a real example, one of the authors visited a church in Hungary that was in the shape of an owl; seeing this church led to learning about the local conditions as well as altering the author's general conception of churches. A last extension to the Baywatch model, following Abu-Mostafa (1993, 1995), would be to apply it to situations in which the learner is given a hint about how to solve a classification problem. 
For example, a rather specific hint would be that Lee buildings are office buildings; such a hint could be given to the network in terms of pre-training and likewise this hint could be given to subjects in an experiment. The use of hints could be a good way to generate and test more detailed predictions of the Baywatch model. The model could be used to predict a hierarchy of hints, with some hints aiding learning more than others.
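The idea of a hierarchy of hints can be illustrated with a toy simulation. The sketch below is not the Baywatch architecture: it assumes a single Lee output unit, four binary input features, one office PK node with fixed input connections, delta-rule learning, and it approximates pre-training on a hint by a nonzero starting weight from the office PK node to the Lee output. All names and parameter values are hypothetical.

```python
def train(hint_weight, epochs=5, lr=0.1):
    """Train a toy network with the delta rule.

    Returns the summed absolute error for each epoch, measured before
    the weights are updated on each item.
    """
    # Inputs: [office1, office2, church1, church2]; target 1.0 = Lee, 0.0 = Doe.
    items = [
        ([1.0, 1.0, 0.0, 0.0], 1.0),  # a Lee building with office features
        ([0.0, 0.0, 1.0, 1.0], 0.0),  # a Doe building with church features
    ]
    w = [0.0] * 4      # direct (empirical) feature-to-output weights
    v = hint_weight    # office PK node -> Lee output weight (the "hint")
    errors = []
    for _ in range(epochs):
        total = 0.0
        for x, target in items:
            # Fixed prior-knowledge connections: the office PK node is
            # driven by the two office features and is never retrained.
            pk = (x[0] + x[1]) / 2.0
            y = sum(wi * xi for wi, xi in zip(w, x)) + v * pk
            delta = target - y
            total += abs(delta)
            # Delta rule on both the empirical and the PK-to-output weights.
            w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
            v += lr * delta * pk
        errors.append(total)
    return errors


for label, strength in [("no hint", 0.0), ("weak hint", 0.25), ("strong hint", 0.5)]:
    print(label, [round(e, 3) for e in train(strength)])
```

Under these assumptions, stronger hints give lower error at every epoch, which is the kind of ordering over hints that could be compared with subjects' learning curves.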
VIII. Conclusions
Since the influential Murphy and Medin (1985) paper that raised the issue of background knowledge in terms of category learning and models of categorization, there has been much progress on this issue (again, see Heit, 1997, and Murphy, 1993, for reviews). In particular, there has been a great deal of documentation of the various ways that prior knowledge influences category learning, for which Fig. 1 is only a partial summary. At present,
we see the most pressing and most exciting issue in this area of research to be the knowledge selection problem. On the surface it is a very discouraging problem, as it requires choices from many potentially useful sources of prior knowledge. It is easy to see why little research on categorization, from either experimental or modeling approaches, has addressed the knowledge selection issue. Yet people manage to solve this problem every day and use their prior knowledge profitably. Therefore we think it is important to address this problem head on, rather than avoiding it any longer. Our own approaches, involving experimental research on the time course of category learning and computational modeling of knowledge selection processes, are in their earliest stages but we are hopeful that these approaches will continue to be informative about this most important issue.
ACKNOWLEDGMENTS

We thank Ulrike Hahn, Gregory Murphy, and Yves Rosseel for comments on this paper. This research was supported by the Economic and Social Research Council and the Biotechnology and Biological Sciences Research Council (United Kingdom) and the National Institute of Mental Health and National Science Foundation (United States). Please address correspondence to Evan Heit, Department of Psychology, University of Warwick, Coventry, United Kingdom; email:
[email protected].
REFERENCES

Abu-Mostafa, Y. S. (1993). Hints and the VC dimension. Neural Computation, 5, 278-288.
Abu-Mostafa, Y. S. (1995). Hints. Neural Computation, 7, 639-671.
Alba, J. W., & Hasher, L. (1983). Is memory schematic? Psychological Bulletin, 93, 203-231.
Ashby, F. G., Alfonso-Reese, L. A., Turken, A. U., & Waldron, E. M. (1998). A neuropsychological theory of multiple systems in category learning. Psychological Review, 105, 442-481.
Ashby, F. G., & Gott, R. E. (1988). Decision rules in the perception and categorization of multidimensional stimuli. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 33-53.
Brown, G. D. A., Hulme, C., Hyland, P. D., & Mitchell, I. J. (1994). Cell suicide in the developing nervous system: A functional neural network model. Cognitive Brain Research, 2, 71-75.
Choi, S., McDaniel, M. A., & Busemeyer, J. R. (1993). Incorporating prior biases in network models of conceptual learning. Memory & Cognition, 21, 413-423.
Dienes, Z., Altman, G., & Gao, S.-J. (in press). Mapping across domains without feedback. Cognitive Science.
Erickson, M. A., & Kruschke, J. K. (1998). Rules and exemplars in category learning. Journal of Experimental Psychology: General, 127, 107-140.
Frasconi, P., Gori, M., & Soda, G. (1995). Recurrent neural networks and prior knowledge for sequence processing: A constrained nondeterministic approach. Knowledge-Based Systems, 8, 313-328.
Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4, 1-58.
Giles, C. L., & Omlin, C. W. (1993). Extraction, insertion and refinement of symbolic rules in dynamically driven recurrent neural networks. Connection Science, 5, 307-337.
Giles, C. L., Chen, D., Sun, G., Chen, H., Lee, Y., & Goudreau, M. W. (1995). Constructive learning of recurrent neural networks: Limitations of recurrent cascade correlation and a simple solution. IEEE Transactions on Neural Networks, 6, 829-836.
Gluck, M. A., & Bower, G. H. (1988). From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, 117, 227-247.
Hayes, B. K., & Taplin, J. E. (1995). Similarity-based and knowledge-based process in category learning. European Journal of Cognitive Psychology, 7, 383-410.
Heit, E. (1992). Categorization using chains of examples. Cognitive Psychology, 24, 341-380.
Heit, E. (1994). Models of the effects of prior knowledge on category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1264-1282.
Heit, E. (1995). Belief revision in models of category learning. In Proceedings of the Seventeenth Annual Conference of the Cognitive Science Society (pp. 176-181). Hillsdale, NJ: Erlbaum.
Heit, E. (1997). Knowledge and concept learning. In K. Lamberts & D. Shanks (Eds.), Knowledge, concepts, and categories (pp. 7-41). London: Psychology Press.
Heit, E. (1998a). Influences of prior knowledge on selective weighting of category members. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 712-731.
Heit, E. (1998b). A Bayesian analysis of some forms of inductive reasoning. In M. Oaksford & N. Chater (Eds.), Rational models of cognition (pp. 248-274). Oxford: Oxford University Press.
Hovland, C. I., & Weiss, W. (1952). The influence of source credibility in communication effectiveness. Public Opinion Quarterly, 15, 635-650.
Jacobs, R. A. (1995). Methods for combining experts' probability assessments. Neural Computation, 7, 867-888.
Jacobs, R. A. (1997). Nature, nurture, and the development of functional specializations: A computational approach. Psychonomic Bulletin & Review, 4, 299-309.
Jacobs, R. A., Jordan, M. I., & Barto, A. G. (1991). Task decomposition through competition in a modular connectionist architecture. Cognitive Science, 15, 219-250.
Jacobs, R. A., Jordan, M. I., Nowlan, S. J., & Hinton, G. E. (1991). Adaptive mixtures of local experts. Neural Computation, 3, 79-87.
Keil, F. C. (1989). Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press.
Kelemen, D., & Bloom, P. (1994). Domain-specific knowledge in simple categorization tasks. Psychonomic Bulletin & Review, 1, 390-395.
Mareschal, D., & Shultz, T. R. (1996). Generative connectionist networks and constructivist cognitive development. Cognitive Development, 11, 571-603.
Markman, E. M. (1989). Categorization and naming in children. Cambridge, MA: MIT Press.
Medin, D. L., & Ross, B. H. (1997). Cognitive psychology (2nd ed.). Fort Worth: Harcourt Brace.
Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85, 207-238.
Medin, D. L., Wattenmaker, W. D., & Hampson, S. E. (1987). Family resemblance, conceptual cohesiveness, and category construction. Cognitive Psychology, 19, 242-279.
Mooney, R. J. (1993). Integration theory and data in category learning. In G. V. Nakamura, R. Taraban, & D. L. Medin (Eds.), The psychology of learning and motivation: Categorization by humans and machines (Vol. 29, pp. 189-218). San Diego: Academic Press.
Mozer, M. C., & Smolensky, P. (1989). Using relevance to reduce network size automatically. Connection Science, 1, 3-16.
Murphy, G. L. (1993). Theories and concept formation. In I. V. Mechelen, J. Hampton, R. Michalski, & P. Theuns (Eds.), Categories and concepts: Theoretical views and inductive data analysis (pp. 173-200). London: Academic Press.
Murphy, G. L., & Allopenna, P. D. (1994). The locus of knowledge effects in concept learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 904-919.
Murphy, G. L., & Medin, D. L. (1985). The role of theories in conceptual coherence. Psychological Review, 92, 289-316.
Murphy, G. L., & Wisniewski, E. J. (1989). Feature correlations in conceptual representations. In G. Tiberghien (Ed.), Advances in cognitive science (Vol. 2, pp. 23-45). Chichester: Ellis Horwood.
Nosofsky, R. M., Palmeri, T. J., & McKinley, S. C. (1994). Rule-plus-exception model of classification learning. Psychological Review, 101, 53-79.
Peirce, C. S. (1931-1935). Collected papers of Charles Sanders Peirce. Cambridge: Harvard University.
Prechelt, L. (1997). Investigation of the CasCor family of learning algorithms. Neural Networks, 10, 885-896.
Raiffa, H., & Schlaifer, R. (1961). Applied statistical decision theory. Boston: Harvard University, Graduate School of Business Administration.
Reed, R. (1993). Pruning algorithms: A survey. IEEE Transactions on Neural Networks, 4, 740-746.
Schyns, P. G., Goldstone, R. L., & Thibaut, J. P. (1998). The development of features in object concepts. Behavioral and Brain Sciences, 21, 1-40.
Spalding, T. L., & Murphy, G. L. (in press). What is learned in knowledge-related categories? Evidence from typicality and feature frequency judgments. Memory & Cognition, 27.
Vapnik, V., & Chervonenkis, A. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and Its Applications, 16, 264-280.
Ward, T. B. (1994). Structured imagination: The role of category structure in exemplar generation. Cognitive Psychology, 27, 1-40.
Wason, P. C. (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology, 12, 129-140.
Wisniewski, E. J. (1995). Prior knowledge and functionally relevant features in concept learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 449-468.
Wisniewski, E. J., & Medin, D. L. (1994). On the interaction of theory and data in concept learning. Cognitive Science, 18, 221-282.
THE ROLE OF LANGUAGE IN THE CONSTRUCTION OF KINDS
Susan A. Gelman, Michelle Hollander, Jon Star, Gail D. Heyman

THE PSYCHOLOGY OF LEARNING AND MOTIVATION, VOL. 39. Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved. 0079-7421/00 $30.00

I. Introduction

Human categories have two primary functions: to organize information efficiently and to enable inductive inferences. Thus, by identifying an object as a "kumquat," we not only have an efficient way of communicating about it to others and reducing the informational load in memory, but also we can make inferences that it is edible, juicy inside, and ripens over time. Inductive inferences include predictions about the future, and so are particularly critical for guiding intelligent behavior. Category-based inferences are not limited to those properties necessary for survival (such as edibility or danger), but extend to a variety of nonobvious features, underlying causal properties, and even theorized, invisible essences (such as DNA or souls). It is thus not surprising that questions of taxonomy are often the most fundamental and contested areas of discussion in science (see, for example, biological taxonomies in Ghiselin, 1969, or diagnostic categories of the American Psychological Association). Although a sizeable body of research demonstrates the ubiquity of category-based inductive inferences (Gelman & Markman, 1986; Osherson, Smith, Wilkie, Lopez, et al., 1990) and essentialist beliefs (Medin, 1989;
Gelman et al., 1994; Keil, 1989) even in young children, little is known about the conditions that foster or inhibit category-based induction or essentialist reasoning. In this chapter, we focus on the role of language. Specifically, in what ways (if any) does language promote the use of categories as inductive tools? We take a developmental approach because we assume that the most important shaping effects of language are those that occur early in development. Some have argued that language should have no substantive role in the formation or structure of richly structured or essentialized categories (e.g., Pinker, 1994). Arguments against linguistic effects on concepts include the following important observations. Language does not appear to be necessary for the formation of categories, because prelinguistic infants form a multitude of categories (Mehler & Fox, 1985; Waxman & Balaban, 1997) and even use categories to form inferences about unknown properties (Hayne, Rovee-Collier, & Perris, 1987; Baldwin, Markman, & Melartin, 1993). Furthermore, young children treat categories in essentialized ways, long before the introduction of formal schooling or scientific principles (Gelman & Coley, 1990). Moreover, parents provide little or no direct instruction about essences in their ordinary conversations with children (Gelman, Coley, Rosengren, Hartman, & Pappas, 1998). Thus, to some extent children seem to construct essentialist beliefs spontaneously. Finally, there are striking cross-cultural similarities in conceptual organization in speakers of widely varying cultural backgrounds (Atran, 1990; Berlin, 1992). These similarities appear to include an appeal to category essences. Altogether, the picture that emerges--of essentialist beliefs early in childhood, universally attained in the absence of instruction--might suggest a robust capacity that spontaneously develops rather than an acquired set of beliefs that are susceptible to varying linguistic input. 
Nonetheless, there are at least two reasons to suspect that essentialism is not wholly a wired-in capacity and that language may play a role in the nature of the categories that develop:

1. There is cross-cultural variation regarding which categories support rich inferences and essentialist accounts. For example, caste is recognized as an essentialized category in India (Mahalingam, 1998), but does not even exist as a category in the United States. To take another example that is not quite so extreme, occupations were essentialized in nineteenth century Britain (Thompson, 1963), but are treated as relatively superficial by preschool and elementary school children in the United States today (Hirschfeld, 1996). Similarly, class can be viewed as either fluid and circumstantial or as deeply rooted and essential. Consider the following discussion, which contrasts class distinctions made in the current-day United States with those
made in early twentieth century Peru, e.g., "gente decente" (respectable people) vs. "gente de pueblo" (common people; Parker, 1998, pp. 24-25):

Common to all these labels was the implication that they referred to sorts of people rather than to locations in a fluid social structure. Unlike such terms as upper, middle, and lower class, which might denote momentary economic status, gente decente and gente de pueblo were moral categories, signifying intrinsic qualities, not transitory circumstances. By using these terms, Latin Americans constructed a vision of society in which status was clearly ascribed: either one was born decente, or one was not. Respectability was a matter of blood and character, innate and unchanging. As a result, the distinction between gente decente and gente de pueblo tended to be seen in rigidly dualistic terms, following an "us and them" logic that left scant middle ground. . . . Members of the society shared the same linguistically constructed assumption that decency reflected some inner essence.

Some scholars have even suggested that there are broader cultural differences in the degree of category accessibility (Choi, Nisbett, & Smith, 1997). Obviously, cultural variation in essentialism cannot be innately determined. We must therefore look toward other means of expressing and conveying cultural differences in belief systems. Language is one potential means of conveying cultural beliefs.

2. Language has at least two expressive functions that are directly relevant to the inductive potential of categories: conveying membership in a kind (e.g., by labeling an entity with a common noun, or by referring to kind membership with the word kind), and expressing scope of a proposition (e.g., with logical quantifiers, such as all, some, or most, or with generic noun phrases, such as "Bears hibernate in winter"). 
It would be tedious, at best, to carry out either of these functions in the absence of language. For example, it is difficult to imagine how a nonlinguistic species could convey that a legless lizard really is a lizard, even though it looks outwardly just like a snake. With language, however, such a concept is elegantly expressed (e.g., "This is a lizard"). Likewise, no process of enumerating and displaying examples can convey that all birds have hollow bones, whereas this is an uncomplicated linguistic effort. Given the relevance of these functions for induction and category-based reasoning, given the relative ease of conveying these functions via language, and given the difficulty of expressing them by nonlinguistic means, there is reason to suspect that language plays a role in the structure of people's categories. In this chapter, we explore several potential ways that language may affect the construction of inference-promoting kinds. We begin with a discussion of what is meant by "kinds" and "essentialism" and an overview of some of the findings demonstrating essentialist beliefs even in young children. We then describe four distinct linguistic devices and evaluate
the role of each in conveying essentialism. Two of these forms convey membership in a richly structured category (the word kind; lexicalization) and two of the forms express scope of a proposition (logical quantifiers; generic noun phrases). We end by drawing some general conclusions regarding the nature of the effects of language and potential areas for future research.
II. Categories, Kinds, and Essences
Human categories are distinctive in their diversity, ranging from simple to complex, from concrete to abstract, from arbitrary groupings to those deeply rooted in theories. To understand the role of language in categorization, it is first necessary to make some distinctions. In this section we introduce some terminology that will be used in the remainder of the chapter. Specifically, we distinguish "category," "kind," and "essence."

A. SOME TERMINOLOGY
Whereas a category is any grouping together of two or more discriminably different things, a kind (or "natural kind"; see Schwartz, 1977, 1979) is a category that is treated by those who use it as being based in nature, discovered rather than invented, and capturing many deep regularities.¹ An example of a category that is not a kind is the set of things with stripes, including tigers, striped shirts, and barbershop poles. This categorical grouping is not a "kind" because it captures only a single, superficial property; it is not richly structured and does not capture nonobvious regularities (Mill, 1843; Markman, 1989). Similarly, ad-hoc categories, such as "things to take on a camping trip," are not kinds (Barsalou, 1991). In contrast, an example of a kind is the set of tigers. Kinds play an important role in human cognition because they are used to guide inductive inferences about novel properties, as discussed in the previous section. Moreover, children and adults use knowledge about kinds to form overarching hypotheses ("over-hypotheses") that apply generatively to novel categories, for example, "each kind of animal has its own characteristic sound" (Shipley, 1989, 1993). Children appear to generate such over-hypotheses on the basis of experience with a limited set of familiar kinds, which then allows them to generate novel inferences about unfamiliar kinds (e.g., that armadillos have a characteristic sound), even in the absence of any further information about that kind (e.g., without knowing anything about armadillos other than that they are a kind of animal).

¹ Our distinction between category and kind is equivalent to Shipley's (1993) distinction between "class" and "category" and Markman's (1989) distinction between "natural kinds" and "arbitrary categories."
Related to the notion of kind is that of essence. Framed as an intuitive folk construal, psychological essentialism is the belief that members of a kind share some underlying quality or substance that confers identity and is causally responsible for observable similarities among category members (Medin, 1989; Gelman, Coley, & Gottfried, 1994). For example, one plausible essence for the category of tigers might be shared DNA structure, which (according to folk belief) is what ultimately gives tigers their identity. Thus, the major difference between kind and essence is that the latter incorporates the former and adds to it the idea that there is a part, substance, or quality (i.e., the essence) that causes the properties shared by the kind. The two notions are thus closely related, but distinguishable. Essences are attributed only to those categories that are kinds, and not to categories that have a more arbitrary basis.²

B. EVIDENCE FOR KINDS AND ESSENCES
Empirical investigations of human concepts suggest that children, as well as adults, form categories that have rich inductive potential, capture nonobvious properties, and are treated as if they have essences. We briefly review some of this evidence below.

1. Rich Inductive Potential
Members of a category may share indefinitely many properties. For example, cats are alike not just in ways that are immediately perceptible (e.g., shape, fur), but also in nonobvious ways (e.g., anatomical structure, means of reproduction) that ordinary adults may come to learn. Therefore, facts learned about an individual often generalize to the kind. Studies with children demonstrate that 2½- through 5-year-olds readily draw inductive inferences from one category member to another even in the strong case when outward appearances are conflicting (Gelman & Coley, 1990; Gelman & Markman, 1986, 1987). For example, on learning that a stegosaurus ("dinosaur") has cold blood and a bird has warm blood, preschool children will infer that a pterodactyl (also labeled a "dinosaur") has cold blood, even though it more closely resembles the bird. Before school age, children expect category members to share important underlying commonalities that are not immediately apparent.

2. Nonobvious Properties
Category members can share properties that are not readily observable and not necessarily reflected in surface appearances. A member of a richly structured category may not necessarily resemble other category members (e.g., an ostrich does not look like a robin or a bluejay) and may even appear to be something else altogether (e.g., an insect camouflaged to look like a leaf). The ultimate arbiters of category membership are the nonobvious properties shared by ostriches, robins, and bluejays, or by leaf insects and other insects rather than leaf insects and leaves. Preschool children's sensitivity to non-obvious properties can be seen in various ways: in their attention to internal parts (Gelman & Gottfried, 1996; Gelman & O'Reilly, 1988; Gelman, 1990; Gelman, Durgin, & Kaufman, 1995; Simons & Keil, 1995; Johnson & Wellman, 1982); their reasoning about nonvisible or invisible entities, as widely ranging as germs, dissolved particles, and mental states (Au, Sidle, & Rollins, 1993; Kalish, 1996; Rosen & Rozin, 1993; Siegal, 1988; Wellman, 1990); their comprehension of appearance-reality contrasts (Flavell, Flavell, & Green, 1983); and their judgments about identity (Gelman & Wellman, 1991; Gutheil & Rosengren, 1996; Keil, 1989). Children realize that, for animals, internal parts (which are inherently nonobvious) differ from external parts (Gelman, Spelke, & Meck, 1983) and can be critical in object identity and functioning (Keil, 1989; Gelman & Wellman, 1991). Children also appreciate that animals have innate potential for certain behavioral and physical properties (e.g., shape of tail, movement pattern; Gelman & Wellman, 1991), and that a range of important characteristics are inherited (Solomon et al., 1996; Springer, 1996; Springer & Keil, 1989; Hirschfeld & Gelman, 1997).

² Essences need not be linked to categories, however. An individual (e.g., William Shakespeare) can be treated as having an essence.
3. Psychological Essentialism
Certain categories are treated as if they have an underlying reality or true nature (an "essence") that one cannot observe directly but that gives an object its identity and is responsible for other similarities that category members share (James, 1890; Medin, 1989; Locke, 1894/1959). For gold, it is the atomic mass of 196.97; for tigers, it is perhaps a particular genetic structure (but see Malt, 1994). Note that this is a psychological claim and not a metaphysical or linguistic one. Although many philosophers and biologists question whether categories truly have essences (e.g., Mayr, 1988), ordinary speakers treat categories as having this structure (Atran, 1990; Medin & Ortony, 1989; Gelman, Coley, & Gottfried, 1994). Moreover, in many (perhaps nearly all) cases, the hypothesized "essence" is not known by the ordinary language user. Instead, people have an essence placeholder (Medin, 1989). In other words, they hold the intuitive belief that an essence exists, even if its details have not yet been revealed. Thus, an essence typically could not be part of the semantic core of a word, nor could it determine word extensions. Nonetheless, it has implications for people's
beliefs regarding the depth and stability of a concept (Rothbart & Taylor, 1990). Essentialist reasoning is implicit in preschool children's judgments that an animal's identity is retained over even dramatic transformations (e.g., caterpillar to butterfly; Keil, 1989; Rosengren et al., 1991). 3 It is also implicit in preschoolers' judgments that animals have innate potential to exhibit various physical and behavioral characteristics (e.g., that a helpless tiger cub will grow to be fierce, even if raised by sheep; Gelman & Wellman, 1991). Categories with inductive potential, nonobvious properties, and a presumption of essence are richly structured in the sense that they presuppose a reality beyond the phenomenal. In other words, theoretical constructs provide a "truer" representation of reality than what can be observed, and the world is organized into densely complex and predictive clusters of correlated features. To give an example: When we classify an animal as a turtle, we are interested in much more than its outward appearance. We typically assume that this classification may have a nonobvious basis (e.g., although presence of a shell or particular markings may be useful to classifying a turtle, these features can be overridden by other, more "biological" properties), an essence (e.g., turtle DNA), rich inductive potential (e.g., regarding body temperature, number of offspring typically produced, means of gathering food), and openness to revision. We presume there may be turtles that look like rocks (but are not), and rocks that look like turtles (but are not), or that one could discover new species of turtles that are unusually tiny or unusually large or that do not even have distinct shells. The research cited does not imply that perceptual similarity is unimportant to children's concepts. Appearances are clearly salient and important in many contexts and on many tasks (Jones & Smith, 1993). 
Even within an essentialist framework, appearances provide crucial cues regarding an underlying essence (Gelman & Medin, 1993). Nevertheless, evidence strongly suggests that preschool children assume that some categories are structured in ways that cannot be characterized in terms of perceptual information alone.
C. WHY IS PSYCHOLOGICAL ESSENTIALISM IMPORTANT TO THE STUDY OF COGNITIVE DEVELOPMENT?
Essentialism in children is important to study for several reasons. Most importantly, the framework has revealed previously unsuspected abilities in young children, thus contradicting a widely accepted view that children's concepts are limited to concrete, perceptual, and obvious qualities. By extension, this portrait suggests a shift in views of knowledge development (what is most basic, what is derived, and how knowledge develops). For example, if unobservable constructs are present from the start, then observable surface features cannot be privileged, more "simple," or more "basic." Studies of psychological essentialism have also expanded the range of tasks used to study categorization, to include not only identification and naming, but also induction and causal reasoning (tasks that more directly probe essences; see Gelman & Medin, 1993; Gelman & Diesendruck, 1999). These new tasks have enriched our understanding of category functioning over development. Finally, studies of essentialism have educational implications. Much of our knowledge of the world is arrived at by induction rather than being directly taught. Thus, any full account of knowledge acquisition must consider the conditions that promote or discourage induction in children. Furthermore, the study of essentialism promises to shed light on naive biases, rooted in essentialist thinking, that interfere with the acquisition of scientific knowledge (see Mayr, 1991).

Footnote 3: Although a sizeable literature suggests that young children have difficulty understanding such transformations (e.g., DeVries, 1969; Kohlberg, 1966), this can be in part attributed to children's confusion regarding unnatural transformations (Rosengren et al., 1991), asymmetries between the salience of category membership information vs. property information (Gelman, Collman, & Maccoby, 1986), and pragmatic aspects of the tasks (Siegal & Robinson, 1987).
III. Four Linguistic Devices
The focus of this chapter concerns how language conveys that a category is a kind. We focus on an age range (2 years and older) in which children already have productive language and show evidence for kind concepts. Thus, we assume that these children have already established a notion of kind. The question before us is whether and how language helps children figure out which categories are kinds. For example, as we discuss in study 2, referring to a novel category with a noun label encourages children to view the category as more stable over time, and less susceptible to external influences. The studies thus take a detailed look at the linguistic mechanisms by which a notion of kind is conveyed.4 We consider four linguistic devices: use of the word kind, lexicalization, logical quantifiers, and generic noun phrases. This is not meant to be an exhaustive list of the sorts of language features that could affect essentialist construals of concepts. However, each of these constructions is found in everyday speech, is plausibly related to notions of kind or essence, and has received some empirical study. Furthermore, these four forms were selected as representing each of the expressive functions described in the introduction: conveying membership in a kind and expressing scope of a proposition. As noted earlier, our approach is primarily a developmental one, on the assumption that the most long-reaching effects of language would be those available early in a child's life.

Footnote 4: A further issue that is not examined in this chapter but that remains an important issue for future research is whether language plays a more foundational role in the initial construction of kinds. For example, hearing various perceptually distinct objects labeled with the same word may encourage children to construct the belief that these distinct objects must at some underlying level be alike. This level of effect would be implicated if prelinguistic children have not yet wholly constructed a notion of kind, and language is a critical factor in allowing such a concept to emerge.

A. EVIDENCE FOR OR AGAINST LANGUAGE EFFECTS
What would constitute evidence for or against language effects? There are at least three conditions that must be met for a linguistic form plausibly to play a role in the early construction of kinds: It must be available in the input to young children, it must be used in ways that map onto relevant conceptual distinctions (i.e., distinguishing kinds from other categories), and it must be understood by children. We briefly discuss each of these criteria in the remainder of this section.
1. Availability in the Input
Availability in the input is the starting point. Even if a linguistic structure is understood by children and has implications for their reasoning, it is unlikely to have much of an effect if it is not frequently encountered. For all four devices, we consider frequency. Obviously, count nouns are highly frequent in maternal speech, but the frequency of the others in the input is an open empirical question. Determining what constitutes "frequent" input is a tricky issue. Language is used to convey many different ideas, and we would not expect even salient topics to constitute more than a small fraction of the speech that children are hearing. As a comparison, consider that parents' discussion of mental states, which are clearly salient and frequent topics of conversation, constitutes less than 5% of their utterances (Bartsch & Wellman, 1995). Considering that parents can produce hundreds of utterances an hour, a form that appears in 1 out of every 100 utterances would give the child a sizeable database: several examples in an hour and, by extrapolation, dozens of examples during a single day. As a rough guide, we assume that forms that consistently exceed 1% of utterances constitute a theoretically significant amount of input. (This would be approximately the same order of magnitude as talk about mental states and causality; Bartsch & Wellman, 1995; Hickling & Wellman, 1998.) In the conclusions, we also consider the relative frequency of the various forms under study to draw conclusions regarding which forms are most available to young children.
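The arithmetic behind the 1% guideline can be made explicit with a short sketch. Only the 1-in-100 proportion comes from the text; the rate of 300 utterances per hour (one reading of "hundreds of utterances an hour") and the 8 hours of talk per day are illustrative assumptions, not values reported in the chapter.

```python
def expected_exposures(utterances_per_hour, proportion, hours):
    """Expected number of utterances containing a form over a stretch of talk."""
    return utterances_per_hour * proportion * hours


# Hypothetical input rate: "hundreds of utterances an hour" is taken as 300.
UTTERANCES_PER_HOUR = 300
# Threshold from the text: 1 out of every 100 utterances.
PROPORTION = 0.01
# Hypothetical amount of parent-child talk in a single day.
HOURS_PER_DAY = 8

per_hour = expected_exposures(UTTERANCES_PER_HOUR, PROPORTION, 1)
per_day = expected_exposures(UTTERANCES_PER_HOUR, PROPORTION, HOURS_PER_DAY)
print(f"per hour: {per_hour:.0f}, per day: {per_day:.0f}")
```

Under these assumed rates, the form would be heard a few times an hour and roughly two dozen times a day, which is the sense in which a 1% form counts as a "sizeable database." Different assumed rates scale the result proportionally.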
2. Conceptual Distinctions
Related to the frequency issue is the question of whether or how often these linguistic devices express the conceptual distinctions of interest. At a most fundamental level, it is important to determine how often these forms refer to kinds. All of the forms under consideration have some flexibility in their application and use. Although some of the uses map onto the distinctions we are investigating, other uses do not. Take as an example the word all, which, as we discuss later, can be used in any of three ways: to refer to an entire kind (e.g., "All bats sleep during the day"), to refer to instances of a kind within a specific context (e.g., "I cleaned out all my closets"), or not to refer to kinds whatsoever (e.g., "I'm all done"; "All right"). When considering frequency in the input, we thus also need to consider the various possible uses of these terms. Relatedly, it is also important to consider the categorical level and domains of application of these terms. It is well known that categories at an intermediate level of generality, either basic-level categories (Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976) or generic-specieme categories (Atran et al., 1997), are particularly richly structured, with many correlated features and rich inductive potential. In order for a linguistic construction to convey information about kinds to children, it should do so predominantly for basic-level categories (e.g., "All bats sleep during the day") and apply less frequently to subordinate-level categories (e.g., "All fruit bats sleep during the day"). Likewise, we assume that animate categories are more "kindlike" than categories of artifacts (see Keil, 1989; Gelman, 1988). Where direct comparisons of animals and human-made artifacts have been conducted, clear domain differences have been found as early as age 3 or 4 years.
These comparisons include studies of internal parts (Gelman, 1990; Simons & Keil, 1995), object identity (Keil, 1989), inheritance (Hirschfeld, 1995b; Springer, 1992), origins (Gelman & Kremer, 1991; Keil, 1989), self-generated movement (Gelman et al., 1995; Massey & Gelman, 1988), and spontaneous growth and healing (Backscheider et al., 1993; Rosengren et al., 1991). Specifically, animals are assumed to have richly structured internal parts that differ from their exteriors and are responsible for self-generated movement; in contrast, simple artifacts are thought to have the same parts inside as outside, and inner parts are unrelated to movement. Similarly, for animals, superficial changes that seemingly alter an object, making it into something else, cannot influence the item's identity; whereas for artifacts, such changes can alter identity. Regarding inheritance, animal properties such as skin color and build are thought to be inherited through the biological parents; for artifacts, no such inheritance process is possible.
Origins for animals implicate a natural, self-generated, inherent process; origins for artifacts implicate a human or human-like "other" who creates the item (Bloom, 1996). Finally, growth and healing in animals are assumed to be highly patterned, predictable processes stemming from the animal itself. In artifacts, the terms "growth" and "healing" do not even typically apply; moreover, changes over time and mending are accomplished in a less predictable way, and require external agents of change. Thus, much evidence suggests that children's early categories are richly structured, but that this rich structure is selectively applied to some domains and hierarchical levels more than others. If parental language is playing a role in the development of these concepts, then it should be providing information about which categories are kinds. This condition (differentiation in parental input) is a necessary, but not sufficient, condition that a linguistic form must meet, for it to play a role in development.
3. Comprehension by Children
The most critical piece in the puzzle concerns how children interpret these linguistic constructions. It is only when a form is understood, and is understood in kind-referring ways, that it could plausibly influence further conceptualization. For example, consider generic noun phrases, such as "Kitty cats love to play with yarn." Generics are abstractions in that they extend beyond the here-and-now to encompass the kind in its entirety. If children below a certain age are unable to appreciate that generics extend in reference beyond the immediate context, then they would be unlikely to benefit from hearing generics in the parental input. Therefore, a critical consideration here is the nature of children's early construal of these terms. In no case do we yet have complete evidence on the issue, but the data we report are sufficient to draw some initial conclusions.
4. The Studies That Follow
Below we report a variety of studies examining the conceptual implications of the word kind (study 1), lexicalization with common nouns (study 2), universal quantifiers (all, any, each, every; study 3), and generic noun phrases (studies 4-8). The different studies make use of different approaches to gather evidence. Studies 1, 3, 4, 5, and 6 focus on analyses of natural language in both parent and child speech, whereas studies 2, 7, and 8 take an experimental approach to reveal children's interpretations of these linguistic forms. Each study examines one or more of the three criteria sketched out previously. In some cases, we have focused more on the evidence for availability in the input, reasoning that when such evidence is sparse, it is rather unlikely that the linguistic strategy under investigation
will play a major role in conveying essentialist or kind beliefs. In other cases in which a strong case can be made for availability in the input, we provide converging evidence from experimental approaches to investigate children's understanding. Altogether, the studies are used to make inferences concerning the role of language in guiding children's kind concepts. We also point out along the way where there are currently gaps in the research, and where further research will be beneficial.
IV. The Word "Kind"
Perhaps the most explicit means of expressing membership in a kind is with the word kind itself, as in, "Robins are a kind of bird" (Wierzbicka, 1994). From her review of a variety of unrelated languages including Chinese, Japanese, Thai, Ewe (a Niger-Congo language), Acehnese (an Austronesian language of Indonesia), Kalam (a Papuan language), and Kayardild (an Australian language), among others, Wierzbicka concludes that not only do all languages sampled have a lexical entry for kind, but also that they distinguish kind from like (i.e., kinds do not reduce to similarity). For example, all the languages surveyed can express something like, "These trees are the same kind, not two different kinds," as well as, "This flower is like a rose, but it is not a rose." This result suggests the possibility of a universal conceptual distinction between kinds and other sorts of groupings. Wierzbicka (1994) did not speculate about the mechanisms by which a concept of "kind" is expressed to young children, although she implies that having a word for a concept plays an important role: "in human communication it is not enough to 'have' a concept, it is also important to have means to convey it to other people (even assuming that one COULD 'have' a concept without being able to communicate it to other people). For some concepts, this can be done by means of some circumlocution or paraphrase; for others, however, it is necessary to have a direct lexical exponent" (p. 348). However, although the word kind provides an intriguing window onto adult concepts, one cannot necessarily infer that it will be an important mechanism for conveying kind concepts to children. One problem is that, at least in English, the word kind can refer to one narrow sense of kind, that of a nested subtype (e.g., "What kind of cereal do you like best?").
This meaning is distinct from the notion of "kind" outlined earlier in this chapter: not all subtypes are richly structured and inference promoting (e.g., argyle socks are a kind of sock), and not all kinds are subtypes within a class-inclusion hierarchy (e.g., gold, water).
Several studies of lexical development have employed the phrase "This is a kind of Y" to express inclusion relations to young children. Notably, this phrase is equally appropriate to use for artifacts (such as clothing or furniture) and natural kinds (such as categories of animals and plants). Artifacts are not richly structured natural kinds, but they permit class inclusion relations, and thus allow for sentences like, "Sneakers are a kind of shoe." The experimental studies employing the "X is a kind of Y" construction generally find that children between the ages of 2 and 5 years are sensitive to this implication of the word kind (Callanan, 1989; Diesendruck & Shatz, 1997; Gottfried & Tonks, 1996). This suggests that the "subtype" construal of the word kind is salient in children's early speech. A further issue concerns the availability of the word kind in speech to young children. If the word kind is to be a mechanism for expressing kind concepts to young children, it would need to appear with some frequency in speech addressed to young children. Unfortunately, these questions cannot be answered by past work, which did not include analyses of the frequency and function of kind in actual speech.
A. STUDY 1: THE WORD "KIND" IN PARENT-CHILD CONVERSATIONS
To address these issues, we (Hollander & Gelman, 1999a) conducted a small-scale study of parents' and children's spontaneous speech to determine the relative frequency and usage of the word kind in parent-child conversations for children between the ages of 2 and 5 years. Data were obtained from the CHILDES database (MacWhinney & Snow, 1985, 1990). The researchers who contributed the data were Lois Bloom (1970), Roger Brown (1973), Stan Kuczaj (1976), Brian MacWhinney, Jacqueline Sachs (1983), and Catherine Snow. Subjects were eight children (age 2-5 years) followed longitudinally (with researcher's name in parentheses): Abe (Kuczaj), Adam (Brown), Mark (MacWhinney), Naomi (Sachs), Nathaniel (Snow), Peter (Bloom), Ross (MacWhinney), and Sarah (Brown). First, we simply tallied the frequency with which the word kind appeared at all in the speech of children and parents (Table I). This tally overestimates the frequency of relevant expressions because the word can also be used in ways that have nothing to do with categories (e.g., "This is kind of big"; "You are a kind boy"). Nonetheless, even with these potentially inflated numbers, it is clear that the word kind is rare in both parents' and children's speech, occurring on average in less than 1% of utterances. However, it is possible that even such rare instances could be theoretically significant if they convey important information of a sort not available through other means. For this reason, we undertook a more detailed analysis of the uses of the word kind. Specifically, we coded three aspects of each use:
1. Scope: Does the word refer to a generic kind (e.g., "What kind of flowers do you like?"; "The kind of balloon people used to fly inside of") or does it refer to an individual or set of individuals (e.g., "What kind of game were you playing?"; "What kind of basket is this?")? Uses were coded as referring to a generic kind if they made reference to the category as a whole, in a manner not tied to a specific individual or set of individuals. In contrast, uses were coded as referring to (an) individual(s) if they were tied to past events, or were requests for labels or modifying information (e.g., often in the form of, "What kind of X is this?"), and did not make reference to the entire category in any way.
2. Category level: Does the word refer to a basic-level category (e.g., "a kind of animal"), or does it refer to a subordinate-level category (e.g., "a kind of dog")?
3. Domain: Does the word refer to an animate entity (e.g., person, dog), an artifact (e.g., airplane), or other (e.g., food)?

If the word kind is to be a useful source of information to children about inference-promoting categories, then it should be used primarily (1) to provide information about a generic category (not just an individual), (2) to refer to basic-level categories, and (3) to refer to animates. All of these are shown to be more richly structured on a variety of tasks (basic- vs. subordinate-level categories: Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976; Atran, Estin, Coley, & Medin, 1997; category vs. individual: Gelman et al., 1998; animacy: Keil, 1989; Gelman, 1988).

TABLE I
STUDY 1: RELATIVE FREQUENCY OF THE WORD KIND IN THE NATURALLY OCCURRING SPEECH OF CHILDREN AND PARENTS IN THE CHILDES DATABASE, AS MEAN PERCENTAGE OF TOTAL UTTERANCES

                                                 Children (N = 8)    Parents^a (N = 8)
Mean percentage of utterances containing kind    0.28%               0.52%
Range (in percent) across parent-child dyads     0.05%-0.74%         0.09%-1.01%
Total number of kind instances                   559                 688

^a For each child, these data are from the one parent (mother or father) who provided the most data: the mothers of Adam, Naomi, Nathaniel, Peter, and Sarah; the fathers of Abe, Mark, and Ross.

We conducted an in-depth analysis of one mother-child pair (Roger Brown's Adam and his mother) that was then bolstered by a sampling of the speech of others in the CHILDES database. Adam was selected as being representative of children's language on a variety of other measures (e.g., Bloom, 1990; Marcus, Pinker, Ullman, Hollander, Rosen, & Xu, 1992)
and because his data are plentiful and cover a lengthy developmental range (from 2;3 to 5;2). Altogether, Adam was taped on 55 occasions, producing more than 46,000 child utterances and more than 20,000 maternal utterances. Each instance of the word was first identified as a target meaning or not (e.g., "You are a kind boy" and "He is kind of silly" would be excluded from further consideration). Nontarget uses accounted for only 3% of Adam's uses and 5% of his mother's uses. Of the remaining 365 instances of kind (161 for Adam, 204 for his mother), each was coded independently along each of the three dimensions described. A second coder coded 25% of the utterances, achieving reliability of 85 to 96% on each of the dimensions. As can be seen in Table II, the word kind was rarely used to refer to generic kinds, basic-level categories, or animate entities. Less than one-third of Adam's mother's uses at any age referred to a generic kind. When we focus specifically on those utterances in which kind did refer to a generic kind, we find neither an animacy bias nor a basic-level bias (Table III). Uses of the word kind to refer to a generic kind were in fact most frequent for artifacts and subordinate-level categories. When one takes the intersection of all three factors (kind referring to an animate, basic-level, generic kind), we found only 1 such use by Adam and only 4 such uses by his mother.

TABLE II
STUDY 1: USES OF THE WORD KIND IN THE SPEECH OF ADAM AND HIS MOTHER (AS PERCENTAGES OF TOTAL TARGET USES AT EACH AGE)

                                      Adam                       Adam's mother
                                      2 yrs    3 yrs    4 yrs    2 yrs    3 yrs    4 yrs
(1) Scope
    Kind                              0%       3%       15%      7%       23%      28%
    Individual                        100      97       75       93       77       72
(2) Level
    Basic level                       0        5        12       9        32       36
    Subordinate                       100      95       78       91       68       64
(3) Domain
    Animate                           12       14       15       14       35       32
    Artifact                          70       52       50       73       32       40
    Other                             18       34       35       14       32       28
(4) Number of target uses             66       73       68       22       111      25
(5) Percentage of total utterances    0.35     0.64     0.89     0.13     1.22     0.73

TABLE III
STUDY 1: USES OF THE WORD KIND THAT REFER TO A GENERIC KIND (ABSOLUTE NUMBER OF SUCH USES), IN THE SPEECH OF ADAM AND HIS MOTHER

              Adam                     Adam's mother
              Basic    Subordinate     Basic    Subordinate
Animate       1        0               4        2
Artifact      5        8               4        17
Other         0        6               2        8

In order to examine the generality of these findings with Adam, we analyzed a sampling of the speech from additional children in the CHILDES database. We focused on those parent-child pairs for whom data were
available at age 4 (the age at which parental input to Adam was most kindlike in its properties): Abe, Mark, Naomi, Ross, and Sarah. We selected 20% or more of the transcripts for each child at age 4, centering on the middle of the age range as being most representative of the input at that age. As can be seen in Table IV, the results from this sampling of five additional children support the patterns obtained from the in-depth analysis of Adam and his mother. Once again, the word kind was rarely used to refer to generic kinds, basic-level categories, or animate entities. Indeed, in the entire sample of more than 8000 parental utterances, only 3 instances referred to animate, basic-level kinds.

TABLE IV
STUDY 1: USES OF THE WORD KIND IN PARENTAL SPEECH DIRECTED TOWARD ABE, MARK, NAOMI, ROSS, AND SARAH AT AGE 4½ (AS MEAN PERCENTAGES OF TOTAL TARGET USES AT EACH AGE)^a

                                           Mean     Range
(1) Scope^b
    Kind                                   34%      0-73%
    Individual                             66%      27-100%
(2) Level^b
    Basic level                            32%      0-45%
    Subordinate                            68%      55-100%
(3) Domain^b
    Animate                                18%      0-45%
    Artifact                               42%      27-50%
    Other                                  39%      27-50%
(4) Mean number of target uses             4.6      0-10
(5) Percentage of utterances sampled       0.21%    0.00-0.65%

^a For each child, these data are from the one parent (mother or father) who provided the most data: the mothers of Naomi and Sarah; the fathers of Abe, Mark, and Ross.
^b For these calculations, we excluded those parents (N = 2) who did not produce kind at all within these transcripts.
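The tallying and coding steps of Study 1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the whole-word match on "kind", the dictionary layout of a coded use, the toy utterances and codes, and the use of simple percent agreement as the reliability measure are all hypothetical choices made here for clarity.

```python
import re

# Assumption: a raw "use" is any utterance containing the whole word "kind".
KIND_PATTERN = re.compile(r"\bkind\b", re.IGNORECASE)


def percent_with_kind(utterances):
    """Percentage of utterances that contain the word 'kind' at all."""
    if not utterances:
        return 0.0
    hits = sum(1 for u in utterances if KIND_PATTERN.search(u))
    return 100.0 * hits / len(utterances)


def count_kindlike_uses(coded_uses):
    """Count target uses coded as generic, basic-level, and animate."""
    return sum(
        1
        for use in coded_uses
        if use["scope"] == "generic"
        and use["level"] == "basic"
        and use["domain"] == "animate"
    )


def percent_agreement(codes_a, codes_b):
    """Simple percent agreement between two coders on the same items."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty item set")
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return 100.0 * matches / len(codes_a)


# Toy transcript (hypothetical, not CHILDES data).
utterances = [
    "You are a kind boy",            # nontarget use, still counted in the raw tally
    "What kind of basket is this?",  # target use
    "Look at the doggie",
    "Time for lunch",
]

# Hypothetical codes for three target uses along the three dimensions.
coded_uses = [
    {"scope": "generic", "level": "basic", "domain": "animate"},
    {"scope": "individual", "level": "subordinate", "domain": "artifact"},
    {"scope": "generic", "level": "subordinate", "domain": "artifact"},
]

raw_percent = percent_with_kind(utterances)
kindlike = count_kindlike_uses(coded_uses)
agreement = percent_agreement(
    ["generic", "individual", "generic", "individual"],
    ["generic", "individual", "individual", "individual"],
)
print(raw_percent, kindlike, agreement)
```

The raw tally deliberately includes nontarget uses such as "You are a kind boy," mirroring the point in the text that the first-pass frequency overestimates relevant expressions; the intersection count corresponds to the animate, basic-level, generic uses singled out in the results.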
1. Conclusions
The word kind is infrequent in parents' speech (occurring in well under 1% of parental utterances). Moreover, when it is used, it is most often used to label an individual member of a category subtype, often when requesting a label or modifying information (e.g., "What kind of game were you playing?"; "What kind of gun is that?"), and only rarely to label a broader kind (e.g., "Yours looks like a dirigible, the kind of balloon people used to fly inside of"; "What kind of flowers do you like?"). We conclude that explicit reference to kinds by means of the word kind is unlikely to be a source of much information to children regarding category structure. However, one piece that is missing from the current analysis is an examination of children's interpretation of the word kind. Past work has shown that young children are capable of interpreting the word as referring to subtypes within a hierarchy (e.g., Diesendruck & Shatz, 1997), but it is not yet known if use of the term also conveys membership in a richly structured kind. For example, it would be interesting to examine children's interpretations of kind when used in kind-referring expressions (e.g., "This kind of bear eats insects"), to determine if children treat such expressions as conveying category-wide properties. However, even if children were to interpret such input appropriately, we argue that it occurs too rarely in natural input to play a major role in children's developing kind concepts.
V. Lexicalization
In contrast to use of the word kind, which is infrequent and restricted in reference primarily to subordinate-level categories, labeling is a pervasive and powerful way of conveying category membership to young children. Moreover, most labels used with young children are basic level (Rosch et al., 1976). Through labeling, children incorporate novel instances and can even learn to redraw category boundaries. For example, naming a pterodactyl a "dinosaur" changes the type of inferences young children make about the animal (Gelman & Coley, 1990; Gelman & Markman, 1986). Category labels (e.g., "tattle-tale," "nerd") seem intuitively to tell us what an entity is, not just what an entity is like (Markman, 1989). Many
properties that could be construed as temporary states (e.g., "Sally didn't clean up her room today") may seem more enduring and fundamental (i.e., more like a kind) when expressed in the form of a category label (e.g., "Sally is a slob"). In this sense, labels convey membership in a kind. Carey (1995) further suggests that category labels do not merely reflect essentialism, but in fact are the root of essentializing:

Essentialism, like taxonomic structure, derives from the logical work done by nouns. The child has a default assumption that count nouns are substance sortals, i.e. naming concepts that provide conditions of identity during the maximal lifetime of an entity ... the application of every count noun carries with it the idea that the identity of the entity picked out by the noun is unchanged in the face of surface changes. I submit that biological essentialism is the theoretical elaboration of the logical-linguistic concept, substance sortal. (p. 277)

Furthermore, children's tendency to treat category labels as mutually exclusive (i.e., assuming that each object has only one label) may reflect their overapplication of this linkage between count nouns and categories with underlying essences (Carey, 1995). Mayr (1991) proposes a similar argument:

Essentialism's influence on pre-Darwinian philosophers was great in part because its principle is anchored in our language, in our use of a single noun in the singular to designate highly variable phenomena of our environment, such as mountain, home, water, horse, or honesty. Even though there is great variety in kinds of mountain and kinds of home, and even though the kinds do not stand in direct relation to one another (as do the members of a species), the simple noun defines the class of objects. (p. 41)

There is widespread interest in the effects of labeling on social categorization.
Researchers have theorized that labels lead to changed expectations, and as such can have either positive or negative effects. One benefit of labeling is that it can allow more complex interpretations of behavior that might otherwise be evaluated negatively (Wood & Valdez-Menchaca, 1996). For example, behaviors that would otherwise be considered disruptive are reassessed as creative when a child displaying the behaviors is labeled as "gifted" (Murphy, 1990). Labeling can also have the practical benefit of easing access to social services (Rosenfield, 1997). However, there is also an ample literature demonstrating that labeling can foster stereotypes and lead to negative expectations (e.g., Darley & Fazio, 1980; Fiske & Neuberg, 1990; Hamilton, Sherman, & Ruvolo, 1990; Miller & Turnbull, 1986). Labeling effects are found in children as well as adults (Milich, McAninch, & Harris, 1992). However, what is not completely understood is why labeling has negative effects. Certainly one cause of negative evaluations is the information literally conveyed in a label. For example, describing someone as a criminal provides information about that person's past behavior and increases the probability that one can make inferences about their trustworthiness. As Jussim, Nelson, Manis, and Soffin (1995) discuss, a label carries with it information about category base-rates. This information can lead to negative appraisals, which in turn can affect how a person is treated (e.g., Rosenthal & Jacobson, 1968). Relatedly, labeling may activate a stigma associated with that particular category. Word meanings can, in addition, include quite subtle and nonobvious social implications. A number of studies have found that the type of word selected (e.g., action verb vs. mental state; descriptive action verb vs. adjective) implicitly conveys information relevant to a social appraisal. For example, Brown and Fish (1983) found that adults draw different causal inferences depending on whether a verb describes an action or a mental state. When hearing behavioral or action verbs (e.g., helps), people give greater causal weight to the subject (e.g., Ted helps Paul because Ted is helpful, not because Paul is helpable). In contrast, with mental or state verbs (e.g., likes), people give greater causal weight to the object (e.g., Ted likes Paul because Paul is likeable, not because Ted is likeful). This effect is quite systematic across a wide range of behavioral/action vs. mental/state verbs. Relatedly, there are systematic differences in interpretation due to whether an event is described with a verb (e.g., Paul is lying) vs. an adjective (e.g., Paul is dishonest): the latter is viewed as more informative, and as reflecting a more enduring quality of the subject (Semin & Fiedler, 1988; Fiedler, Semin, & Bolten, 1989).
A. COGNITIVE IMPLICATIONS OF LEXICALIZATION
The effects described demonstrate that certain linguistic form classes are associated with certain types of meanings. However, they do not show that people use linguistic form class productively, as information to be used when interpreting novel words. Does lexicalization per se--that is, characterizing a person or object with a classificatory label--carry implications beyond the literal information conveyed? Specifically, labeling may imply that the information provided is particularly stable and immutable. Giving a label may reify a category in a way that other ways of referring to the same information do not. We find intuitive support for this hypothesis in noting that labels can be separated from the behaviors they describe (e.g., "He's not a criminal; he just made an error in judgment"; "I believe in equal rights for women, but I'm not a feminist"). In these cases, the label conveys that someone is a member of a category (with implied stability and centrality to identity), whereas the behavioral description conveys that
someone has a particular attribute (with implied temporary status and distance from central identity). There is now growing evidence that nouns may carry implications beyond other linguistic expressions (see Gentner, 1982). Markman (1989) discusses this distinction when contrasting nouns and adjectives. She hypothesizes that referring to a category with a noun conveys that a category (1) supports more inferences, (2) provides more essential information, (3) is central to the identity of an object, (4) is relatively enduring and permanent, (5) is organized into taxonomies, and (6) is unique and nonoverlapping with other categories. In contrast, referring to a category with an adjective implies that it supports fewer inferences, provides less essential information, is less central to an object's identity, and so on. Markman and Smith (cited in Markman, 1989) tested these ideas directly in a series of studies with adults. On one task, participants were asked to list properties of a series of categories. Depending on the condition, categories were either nouns (e.g., "an intellectual") or adjectives (e.g., "intellectual"), matched for semantic content. Subjects listed more properties of the nouns than of the content-matched adjectives (Ms per item of 4.0 vs. 3.1, respectively). On another task, subjects were given a direct contrast between nouns and adjectives and asked which was more important and why. These adults judged nouns as conveying more powerful information than adjectives, and often explained their choices by suggesting that the noun was more enduring and central to category identity.

B. COMPREHENSION BY CHILDREN
What about development? It is plausible that children would attend to lexicalization. A number of studies have shown that category labels are important sources of information for both children and adults, compared to conditions in which no labels are provided (Balaban & Waxman, 1997; Baldwin & Markman, 1989; Gelman & Markman, 1987; Markman & Hutchinson, 1984; Waxman & Hall, 1993; Waxman & Markow, 1995).⁵ Furthermore, children are sensitive to linguistic form class (e.g., nouns vs. adjectives) as early as 2 years of age or even earlier (Brown, 1957; Hall, 1994; Hall, Waxman, & Hurwitz, 1993; Katz, Baker, & Macnamara, 1974). For example, children appropriately assume that a novel noun refers to a class of like objects, whereas a novel adjective refers to a single property. However, we know of only a few studies that contrast nouns with other parts of speech in terms of the inferences that children draw. One study used familiar nouns and familiar adjectives, and found that nouns (not adjectives) were used by children to draw novel inferences (Gelman & Coley, 1990). Children 2 to 3 years of age inferred that two animals with the same noun label (e.g., "bird") shared the same properties (diet, habitat, etc.) even when they were perceptually dissimilar. This was not found when the animals were labeled with familiar change-of-state adjectives (e.g., "sleepy"). Relatedly, Gelman, Collman, and Maccoby (1986) found that gender nouns ("boy," "girl") imply richer inferences than gender-linked properties (e.g., "will grow up to be a Daddy," "will grow up to be a Mommy"). This finding is notable because the properties were central to category identity. Yamauchi and Markman (1998) also found that, for adults, category labels lead to different inferences than category features.

⁵ Davidson and Gelman (1990) also found some labeling effects on induction with novel labels, although these were limited to cases in which there was some perceptual support for the labeling.

Hall and Moore (1997) directly contrasted adjectives and nouns and found that preschool children and adults distinguish adjectives and nouns on the basis of form-class alone. In their studies, children heard familiar color terms in either adjective or noun form applied to a set of novel creatures. In one experiment, the distinction between nouns and adjectives was supplied morphosyntactically: for example, "This is a blue one" (adjective) vs. "This is a blue" (noun). In further experiments the distinction was supplied phonologically: for example, "This is a blue bird" (adjective) vs. "This is a bluebird" (noun). Children were then asked to judge which of two pictures was also "a blue one"/"a blue bird" (adjective condition) or "a blue"/"a bluebird" (noun condition). Participants chose between pictures depicting either an object kind match (the same creature/bird but now covered with a red substance) or a property match (a different creature/bird that was blue in color). Results indicated that both 4-year-olds and adults used lexical category (noun or adjective) as the basis of their judgments.
On hearing an adjective, participants typically selected the property match, whereas on hearing a noun, participants typically selected the object kind match. One way of interpreting these results is to say that nouns led to judgments of greater stability--that is, object identity was preserved with nouns but not adjectives.

C. STUDY 2: LEXICALIZATION OF SOCIAL CATEGORIES
To date, nearly all studies of lexicalization effects have focused on familiar labels. This makes it difficult to tease apart effects of the information conveyed in the label vs. effects of the label form itself. Study 2 examined whether the linguistic form itself is sufficiently powerful to produce inferences of stability (see Gelman & Heyman, in press, for a fuller report). We tested this possibility by using novel nominalized phrases to remove the possibility of contaminating effects of familiar labels that may cause listeners to retrieve predetermined meanings. During the experimental sessions, four
child characters were described. Each was described as having an idiosyncratic characteristic (e.g., loves to eat carrots). Then, each was further described with either a novel noun (e.g., "She is a carrot-eater"; Label condition) or a descriptive phrase (e.g., "She eats carrots whenever she can"; Verbal Predicate condition). Each characteristic was chosen as one that could be construed as either temporary or stable. We hypothesized that labels would imply greater stability of the characteristics. Children were then asked a series of questions designed to assess their judgments of the stability of the characteristic over time and across contexts. Participants were 5- and 7-year-old children, randomly assigned to either a Label condition or a Verbal Predicate condition. Each participant received four item sets. For each item set, participants heard a three-sentence description, followed by a set of four test questions. The three-sentence description included the character's name and age, a distinctive behavior that the character characteristically engages in, and either a noun label (Label condition) or a description in the form of a verbal predicate (Verbal Predicate condition). For example, for one story, the description was as follows: "Rose is 8 years old. Rose eats a lot of carrots. She is a carrot-eater (Label condition). She eats carrots whenever she can (Verbal Predicate condition)." The verbal predicates were designed to restate the information in the previous statement in a slightly different form. The labels were designed to refer to the same information using a single compound noun phrase. Aside from the carrot-eaters item, the other items concerned a boy who thinks creatures live on other planets ("a creature-believer"), a boy who wakes up early ("an early-waker"), and a girl who really loves guinea pigs ("a guinea-pig-lover"). The four test questions asked of each item set concerned the stability of the key property (e.g., eating carrots). 
They concerned: (1) past behavior ("Did Rose eat a lot of carrots when she was 4 years old?"); (2) future behavior ("Will Rose eat a lot of carrots when she is grown up?"); (3) behavior with no family support ("Would Rose eat a lot of carrots if she grew up in a family where no one liked carrots?"); and (4) behavior with family opposition ("Would Rose stop eating a lot of carrots if her family tried to stop her from eating carrots?"). Responses were scored as 1 for each stable response ("yes" to the questions regarding prior behavior, future behavior, and no family support; "no" to the question regarding family opposition), 0 for each nonstable response ("no" to the questions regarding prior behavior, future behavior, and no family support; "yes" to the question regarding family opposition), and 0.5 for each "don't know" response. As hypothesized, children predicted significantly greater stability in the Label condition than in the Verbal Predicate condition (see Table V).
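The scoring rule just described can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the authors' analysis code: the function names, the question keys, and the sample responses are all hypothetical. Summing one question type across the four item sets yields a score out of 4, which appears to be the quantity reported in Table V.

```python
# Illustrative sketch of the Study 2 scoring rule (names are hypothetical).
# The answer that counts as "stable" differs by question type: "yes" for
# past, future, and no-family-support; "no" for family opposition.
STABLE_ANSWER = {
    "past": "yes",
    "future": "yes",
    "no_family_support": "yes",
    "family_opposition": "no",
}


def score_response(question: str, answer: str) -> float:
    """Return 1 for a stable response, 0 for nonstable, 0.5 for "don't know"."""
    if answer == "don't know":
        return 0.5
    if answer not in ("yes", "no"):
        raise ValueError(f"unexpected answer: {answer!r}")
    return 1.0 if answer == STABLE_ANSWER[question] else 0.0


def stability_by_question(item_sets: list[dict[str, str]]) -> dict[str, float]:
    """Sum scores per question type across a child's item sets (max 4 each)."""
    totals = {question: 0.0 for question in STABLE_ANSWER}
    for responses in item_sets:
        for question, answer in responses.items():
            totals[question] += score_response(question, answer)
    return totals


# Made-up responses from one hypothetical child to the four item sets.
child = [
    {"past": "yes", "future": "yes", "no_family_support": "no", "family_opposition": "no"},
    {"past": "yes", "future": "don't know", "no_family_support": "yes", "family_opposition": "no"},
    {"past": "no", "future": "yes", "no_family_support": "yes", "family_opposition": "yes"},
    {"past": "yes", "future": "yes", "no_family_support": "yes", "family_opposition": "no"},
]
print(stability_by_question(child))
```

Because a child who answers at random would be expected to score 2 out of 4 on each question type, 2 is presumably the chance value against which the t-tests in Table V were run.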
TABLE V
STUDY 2: MEAN NUMBER OF PREDICTIONS THAT THE PROPERTY WOULD BE STABLE (OUT OF 4 POSSIBLE) AS A FUNCTION OF AGE, CONDITION, AND PROPERTY TYPE (SDs ARE IN PARENTHESES)

                                    Condition
                         Label               Verbal predicate
5-year-olds
  Past                   2.82 (1.46)**       2.50 (1.62) n.s.
  Future                 3.28 (0.98)***      2.52 (1.47) n.s.
  No family support      2.78 (1.44)**       2.18 (1.56) n.s.
  Family opposition      3.15 (1.03)***      2.86 (1.27)**
7-year-olds
  Past                   2.92 (1.54)**       2.83 (1.26)**
  Future                 3.08 (1.00)***      2.59 (0.90)**
  No family support      2.65 (1.37)*        1.93 (1.45) n.s.
  Family opposition      3.03 (1.38)***      2.63 (1.42)*

n.s. = non-significant. * Greater than chance, t-test, p < .05. ** Greater than chance, t-test, p < .01. *** Greater than chance, t-test, p < .001.
From Gelman, S. A., & Heyman, D. G. (in press). Carrot-eaters and creature-believers: The effects of lexicalization on children's inferences about social categories. Psychological Science.
To examine whether these effects hold for each of the four item sets, we examined responses for each item set separately. In every case, the label condition was significantly higher than the verbal predicate condition. To summarize the results of this study: By 5 years of age, children judge personal characteristics as more stable when they are referred to by a noun (e.g., "She is a carrot-eater") than by a verbal predicate (e.g., "She eats carrots whenever she can"). Children in the label condition predict that characteristics are more stable over time (i.e., more likely to be retained in the future) and more stable over adverse environmental conditions (i.e., more likely to be retained even when there is no family support). This finding is consistent with a range of other findings showing that people possess strong stereotypes of social categories encoded in labels (e.g., Darley & Fazio, 1980) and that nouns are particularly important for implying that a category is richly structured (Hall & Moore, 1997; Markman, 1989). These findings also extend beyond previous work in showing that labels make a difference even compared to a condition in which the same information is provided in no-label format. Moreover, the present findings are noteworthy in that all the characteristics were relatively novel (e.g., carrot-eater, creature-believer). This implies that children were not retrieving rote meanings, but rather made use of a general rule that they applied to these novel noun phrases. We thus conclude that lexicalization (in the form of a noun) provides important information to children regarding property stability.

D. SUMMARY AND REMAINING QUESTIONS
Lexicalization meets all three criteria outlined earlier (see section entitled "Evidence for or against language effects"): common nouns are freely available in the input, they are the preferred part of speech when referring to kinds (cf., adjectives or verbal predicates), and they are understood by children as kind-referring. Altogether, this suggests that labeling may be one important mechanism for encouraging children to treat categories as kinds. An important question that remains for future research concerns the scope of the labeling effects on children's reasoning. Does lexicalization narrowly affect children's judgments of characteristic stability (as tested in study 2), or does it have broader implications for how children view certain social descriptions? We hypothesize that use of a label may have a broader effect by serving as one factor that helps children to construe certain social categories as natural kinds. Rothbart and Taylor (1992) note that, "whereas social categories are in reality more like human artifacts than natural kinds, they are often perceived as more like natural kinds than human artifacts" (p. 12). Our work suggests that language may be one factor that changes where social characteristics are perceived to fall on this continuum. In other words, referring to a category with a noun label may foster an essentialist perspective on a category. In future research it would also be important to discover what forms of language have the effects demonstrated here. We have focused on noun labels; however, it is possible that other parts of speech (such as adjectives) may similarly convey essentialist implications--especially for social categories, which often are expressed with adjectives (e.g., smart, athletic, shy). Another open question concerns what kinds of entities are susceptible to language effects. Lexicalization effects may be found across domains. 
Alternatively, it may be that language is especially powerful for affecting social categories because social categories are so variable in structure. A final point is that labeling by itself cannot wholly solve the problem of how children decide which categories are kinds, as not all labels map directly onto theory-rich, or even coherent, categories (e.g., food, chair, pet). We return to this point in the Conclusions, where we speculate about the importance of converging sources of information to children.
VI. Logical Quantifiers

A third way in which children might learn that a category is a kind is by hearing properties predicated explicitly of all members of a category. For example, the statement "All bats sleep during the day" directly conveys that bats constitute a category with coherence and inductive potential. Words that refer to the entire category (universal quantifiers) include all, each, every, and any (Vendler, 1967). A large body of research has examined children's understanding of logical quantifiers, such as all, each, and some (e.g., Brooks & Braine, 1996; Macnamara, 1986). Although initially Piagetian analyses suggested that children below age 6 or 7 years were incapable of understanding these constructions due to intractable cognitive limitations (Inhelder & Piaget, 1964), studies that posed fewer information-processing demands suggested that even 4-year-olds can distinguish all and some (Smith, 1979, 1980). Many difficulties with uses of all and some involve class-inclusion (e.g., "All Xs are Ys"), complex syntactic constructions (e.g., "A boat is being built by all the men"; Brooks & Braine, 1996), or contexts with competing irrelevant cues (Donaldson & McGarrigle, 1974). In contrast, children have relatively few difficulties with uses in simple declarative sentences that involve property predication (e.g., "All Xs have Ys"; Smith, 1979, 1980). Thus, children's relatively mature performance would suggest that logical quantifiers might be an important source of information. However, although the logical quantifier all can convey important and precise information regarding category properties, initial evidence suggests that it is rarely used in speech to young children. Gelman, Coley, Rosengren, Hartman, & Pappas (1998) examined in detail the speech of 46 mother-child dyads, focusing in part on uses of the word all.
The dyads were videotaped looking through picturebooks that were specially designed to elicit talk about categories and category structure. Each utterance containing all was coded into one of three categories: (1) universal quantifier, referring to all members of the category, including those not immediately present (e.g., "I think chic-- roosters all have that thing"); (2) specified context, referring to all members of some specified subset of the category (e.g., "They all go in water, like fish," referring to all the animals on the page); and (3) other, including nonquantification uses (including unanalyzed expressions such as "all done," e.g., "I'm all done looking at the goats"). Results indicated that less than 2% of uses of all were as a universal quantifier (see Table VI). This translates into less than 0.03% of all maternal utterances. Much more typically, all applied instead to a particular subset of objects in context (e.g., "What do you think all these different things are?").
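As a rough check on the two percentages above, the figures in Table VI can be combined as follows. This is a sketch only, and it rests on two assumptions that are not stated in the source: that each study's total utterance count can be back-computed from its rounded "percentage of total utterances," and that the 4% universal-quantifier figure for Study B corresponds to a whole number of utterances. The variable names are illustrative.

```python
# Figures transcribed from Table VI: number of uses of "all", percentage of
# those uses coded as universal quantifiers, and uses of "all" as a
# percentage of total maternal utterances.
TABLE_VI = {
    "A": {"alls": 50, "universal_pct": 0, "pct_of_utterances": 1.65},
    "B": {"alls": 56, "universal_pct": 4, "pct_of_utterances": 2.23},
    "C": {"alls": 14, "universal_pct": 0, "pct_of_utterances": 1.04},
}

total_alls = sum(study["alls"] for study in TABLE_VI.values())

# Assumption: the reported percentages stand for whole utterances, so the
# implied count is rounded to the nearest integer.
universal_uses = sum(
    round(study["alls"] * study["universal_pct"] / 100)
    for study in TABLE_VI.values()
)

# Assumption: total utterances per study are back-computed from rounded
# percentages, so the result is approximate.
total_utterances = sum(
    study["alls"] / (study["pct_of_utterances"] / 100)
    for study in TABLE_VI.values()
)

pct_of_alls = 100 * universal_uses / total_alls
pct_of_utterances = 100 * universal_uses / total_utterances

print(f"Universal-quantifier uses: {universal_uses} of {total_alls} uses of 'all'")
print(f"Share of uses of 'all': {pct_of_alls:.2f}%")
print(f"Share of all maternal utterances: {pct_of_utterances:.3f}%")
```

Under these assumptions, both shares fall below the thresholds quoted in the text (2% and 0.03%), although the second is close enough to its threshold that the comparison depends on the rounding assumption.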
TABLE VI
USES OF THE WORD ALL IN CHILD-DIRECTED SPEECH

                                     Study A       Study B       Study C
                                     35 months     35 months     20 months
                                     (N = 16)      (N = 14)      (N = 16)
Function (as percentages of total uses at each age):
  Universal quantifier               0%            4%            0%
  Specified context                  38%           9%            21%
  Other                              62%           87%           79%
Total number of alls                 50            56            14
Percentage of total utterances       1.65          2.23          1.04

Data from Gelman, S. A., Coley, J. D., Rosengren, K., Hartman, E., & Pappas, T. (1998). Beyond labeling: The role of maternal input in the acquisition of richly-structured categories. Monographs of the Society for Research in Child Development. Serial No. 253, Vol. 63, No. 1.

A. STUDY 3: UNIVERSAL QUANTIFIERS IN NATURAL LANGUAGE
The analyses in the Gelman et al. (1998) studies described focused exclusively on the word all, which is just one of several universal quantifiers found in English. In Study 3, we (Hollander & Gelman, 1999a) provide an analysis of the full set of universal quantifiers: all, each, every, and any. We first examined overall frequency of all instances of these forms in the CHILDES database. We searched the CHILDES database for all instances of these four words, as well as all instances of longer words beginning with these strings (e.g., "everything," "anybody"). We focused on eight of the English-speaking children who had the most extensive longitudinal data, and restricted the age range to 2 to 4 years. As can be seen in Table VII, these quantifiers are considerably more frequent than the word kind and more frequent in natural speech than in the picturebook reading context studied by Gelman et al. (1998). Therefore, it becomes particularly critical to examine the nature of these uses. We predicted that, if parents are using these terms to teach children that categories are inference-promoting kinds, then these words should function frequently as universal quantifiers (referring to entire kinds), and particularly for animate categories.

TABLE VII
STUDY 3: RELATIVE FREQUENCY OF THE QUANTIFIERS ALL, ANY, EACH, AND EVERY IN THE NATURALLY OCCURRING SPEECH OF CHILDREN AND PARENTS IN THE CHILDES DATABASE, AS PERCENTAGE OF TOTAL UTTERANCES

                                                  Children        Parentsᵃ
                                                  (N = 8)         (N = 8)
Mean percentage of utterances containing
  all, any, each, or every                        2.55%           3.58%
Range (in percent) across parent-child dyads      1.11-5.70%      2.68-4.54%
Total number of instances                         4056            4502

ᵃ For each child, these data are from the one parent (mother or father) who provided the most data: the mothers of Adam, Naomi, Nathaniel, Peter, and Sarah; the fathers of Abe, Mark, and Ross.

Again, we focused on the speech of Adam and his mother, supplemented with data sampled from a subset of the other children in the CHILDES database. We searched the CHILDES database for Adam and his mother for all instances of these four words, as well as all instances of longer words beginning with these strings (e.g., "everything," "anybody"). Each use was classified according to function (universal quantifier, specified context, or other) and domain (animate, artifact, or other). A second coder coded a subset of the utterances and obtained agreement of more than 90% on both scope and domain for both speakers. All and any accounted for the bulk of the sample, together making up 77% of Adam's uses and 93% of Adam's mother's uses. Each was the least frequently used of the four words, accounting for 5% or less of the sample for both Adam and his mother. As can be seen in Table VIII, these words rarely functioned as universal quantifiers. More than 90% of the time when one of these words was used, it was to refer to a specific context (e.g., from Adam's mother, "What are all those things behind you there?") or in other ways (e.g., "Oh, I don't know that I'd like that at all"). Altogether, universal quantifier uses of these words occurred in less than 0.3% of Adam's utterances at any age, and less than 0.4% of his mother's utterances at any age. However, although these uses were quite rare, kind-referring universal quantifiers were more frequent for animates than artifacts, especially for Adam's mother, as can be seen in Table IX.

TABLE VIII
STUDY 3: USES OF THE WORDS ALL, EACH, EVERY, AND ANY IN THE SPEECH OF ADAM AND HIS MOTHER (BROWN, 1973)

                                           Adam                      Adam's mother
                                     2 yrs   3 yrs   4 yrs       2 yrs   3 yrs   4 yrs
(1) Function (as percentages of total uses at each age)
    Universal quantifier             2%      6%      15%         9%      4%      10%
    Specified context                26      54      52          61      65      64
    Other                            72      40      33          30      31      27
(2) Domain (as percentages of total target uses at each age)
    Animate                          54%     34%     20%         25%     25%     23%
    Artifact                         21      16      23          37      22      26
    Other                            25      50      57          39      53      51
(3) Total number of uses             86      209     221         163     254     124
(4) Percentage of total utterances   0.53    1.12    1.93        2.13    2.79    3.62

In order to examine the generality of these findings with Adam, we analyzed a sampling of the speech from additional children in the CHILDES database. As with the analysis of kind, we focused on Abe, Mark, Naomi, Ross, and Sarah, selecting the same 20% of the transcripts produced at age 4 years. As can be seen in Table X, the results from this sampling of five additional children support the patterns obtained from the in-depth analysis of Adam and his mother. Once again, the quantifiers all, any, each, and every were rarely used to refer to generic kinds or animate entities.
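The claim that universal-quantifier uses made up less than 0.3% of Adam's utterances and less than 0.4% of his mother's can be checked against Table VIII by multiplying two of its rows. The sketch below is illustrative only: it assumes the reading of Table VIII in which the three function percentages for each speaker and age sum to roughly 100, and the dictionary and function names are hypothetical.

```python
# Row (1) of Table VIII: universal-quantifier uses as a percentage of all
# target-word uses, by speaker and age in years (assumed column assignment).
UNIVERSAL_SHARE = {
    "Adam": {2: 2, 3: 6, 4: 15},
    "Adam's mother": {2: 9, 3: 4, 4: 10},
}

# Row (4) of Table VIII: target-word uses as a percentage of total utterances.
TARGET_RATE = {
    "Adam": {2: 0.53, 3: 1.12, 4: 1.93},
    "Adam's mother": {2: 2.13, 3: 2.79, 4: 3.62},
}


def universal_rate(speaker: str, age: int) -> float:
    """Universal-quantifier uses as a percentage of the speaker's utterances."""
    return UNIVERSAL_SHARE[speaker][age] / 100 * TARGET_RATE[speaker][age]


for speaker in UNIVERSAL_SHARE:
    for age in (2, 3, 4):
        print(f"{speaker}, {age} yrs: {universal_rate(speaker, age):.2f}% of utterances")
```

Because both rows are rounded in the published table, the products are approximate; the highest values occur at age 4 for both speakers.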
1. Conclusions

The words all, any, each, and every are common in parental speech, but are rarely used to refer to generic kinds. Thus, we suggest that it is unlikely that these explicit forms of language play a vital role in the acquisition of kind concepts in children. Nonetheless, it is intriguing that kind-referring uses of these terms are disproportionately found for the animate domain. Given this pattern of results, it would be revealing to examine children's use of these terms in their category-based reasoning and inductive inferences. Studies 7 and 8, reported later in the chapter, present some initial data on this question.

TABLE IX
STUDY 3: USES OF THE WORDS ALL, EACH, EVERY, AND ANY THAT FUNCTION AS KIND-REFERRING UNIVERSAL QUANTIFIERS (ABSOLUTE NUMBER OF SUCH USES)

              Adam      Adam's mother      Parents of Abe, Mark, Naomi, Ross, Sarah
Animate       16        24                 23
Artifact      4         1                  0
Other         50        13                 24

TABLE X
STUDY 3: USES OF THE WORDS ALL, ANY, EACH, AND EVERY, IN PARENTAL SPEECH DIRECTED TOWARD ABE, MARK, NAOMI, ROSS, AND SARAH AT AGE 4ᵃ (AS MEAN PERCENTAGES OF TOTAL TARGET USES AT EACH AGE)

                                          Mean        Range
(1) Function
    Universal quantifier                  12%         5-19%
    Specified context                     50          47-54
    Other                                 38          29-47
(2) Domain
    Animate                               24%         12-39%
    Artifact                              21          11-28
    Other                                 54          41-60
(3) Mean number of target uses            62.4        13-140
(4) Percentage of utterances sampled      3.87%       3.57-4.35%

ᵃ These data are from the parent who provided the most data: the mothers of Naomi and Sarah; the fathers of Abe, Mark, and Ross.

VII. Generic Noun Phrases

The fourth linguistic expression we consider is the generic noun phrase (e.g., "Dogs bark," "A giraffe is an animal," or "The hippo is a four-legged beast"). Generics are potentially important for conveying generalizations about shared properties of category members (Carlson & Pelletier, 1995). They can do so in at least two ways. First, they involve properties that are definitional, recurrent, or lawlike (Dahl, 1975), and true of the prototype. Thus, they are useful for making predictions and may be particularly important for conveying that categories have rich structure. Second, they make reference to objects as a category, rather than objects as individuals (see Lyons, 1977). For example, "Dogs are friendly beasts" refers to the category of dogs rather than any particular dog or group of dogs. Indeed, some properties are true only of the category, and not of any individual, such as, "Kangaroos are numerous in Australia" (no single kangaroo can be numerous). Generic noun phrases in English are expressed with bare plurals (e.g., "Bears hibernate in winter"), definite singulars (e.g., "The elephant is found in Africa and Asia"), or indefinite articles (e.g., "A male goose is called a gander"), and are accompanied by verbs that are typically nonpast and nonprogressive. Because there is no one-to-one relation between form and generic function, meaning and context are required in order to reach a generic interpretation (e.g., "The elephant" may refer to a particular elephant or to the kind). What distinguishes generics is that they refer to a category as an abstract whole, rather than referring to an individual or group of individuals (e.g., Carlson & Pelletier, 1995; Lawler, 1973).
Lyons suggests that generics can often be translated roughly as "generally," "typically," "characteristically," or "normally" (although not as "necessarily"). Unlike statements using some, generics invoke the entire category. Yet unlike statements using universal quantifiers such as all, every, or each, generic statements allow for exceptions (Lawler, 1973, p. 329; McCawley, 1981). The statement "Birds lay eggs," for example, is considered true, even though less than half the bird population does so (e.g., excluding male birds and chicks). In contrast, "All birds lay eggs" is false. As a consequence, generic statements are perhaps more powerful than utterances with universal quantifiers. Whereas even a single counterexample would negate the generalization "All boys play with trucks," the generic statement "Boys play with trucks" can persist in the face of numerous counterexamples. Indeed, some generics make claims for which no evidence is available (e.g., stereotypes of social categories). To put this another way, generic statements refer to kinds (Carlson, 1977): "Birds lay eggs" can be paraphrased as "Birds are a kind of animal such that the mature female lays eggs" (Shipley, 1993). Shipley (1993, p. 278) proposes that a generic statement such as this, "which presupposes the conceptualization of the class of birds as a single entity, should enhance the psychological coherence of the class of birds for that reason." Mayr (1991, p. 42) likewise suggests: "He who speaks of 'the Prussian,' 'the Jew,' 'the intellectual' reveals essentialistic thinking. Such language ignores the fact that every human is unique; no other individual is identical to him." Thus, generics may be a subtle but effective device used by parents to convey that members of a taxonomic category share properties. In the remainder of this section, we present a series of five studies investigating adults' and children's use and interpretation of generics.
We address the three considerations raised earlier in the chapter: (1) Are generics available in the input to young children? (2) Are they used in ways that map onto relevant conceptual distinctions (e.g., to distinguish kinds from other categories)? (3) How are they understood by children? Studies 4, 5, and 6 examine generic use in naturalistic language, finding that generics are indeed available in the input to young children and are used by both children and adults in ways that map onto relevant conceptual distinctions. Studies 7 and 8 examine how children and adults interpret generics. Altogether, these studies suggest that parents convey kind concepts to their young children via generic noun phrases and that preschool children demonstrate sensitivity to the semantic implications of generics.

A. FREQUENCY OF GENERICS IN ORDINARY SPEECH
Until recently, there was little direct psychological study of generics, nor any reports of their distribution in adult or child speech. However, generics
are frequently employed in studies of categorization, perhaps with an implicit recognition of their significance (e.g., Rips, 1975; Waxman, Shipley, & Shepperson, 1991). For example, in a series of nuanced studies of labeling and social attribution, Kanouse (1987; Abelson & Kanouse, 1966; Kanouse & Abelson, 1967) analyzed the semantics of generic statements such as "Committees need bumblebees," although the published reports made no reference to the word "generic," nor to the linguistic literature on generics (e.g., Carlson & Pelletier, 1995). There are also anecdotal reports of generic usage (again, typically not explicitly labeled as such) in studies of children's and/or parents' spontaneous comments. For instance, in her examples of how parents introduced novel categories to their preschool children, Callanan (1990) included generic statements such as, "They [hummingbirds] sort of make a humming sound" or "A mixer is what we use to mix things up in the kitchen." Similarly, Shipley (1989) mentioned that, in her studies, preschool children (some as young as 3 years of age) referred to animal kinds with generic statements including: "Dogs go ruff-ruff and them have long tails" or "Animals can't talk." Likewise, Adams and Bullock (1986) found that parents of 3-year-olds provide generic statements such as, "They [penguins] live at the South Pole and they swim and they catch fish." These informal reports suggest that generics are used in ordinary speech, at least on occasion. We have begun to examine more systematically their frequency and use. We first studied generics as one component of an intensive study of maternal input, examining how parents convey information about category structure, beyond simple labeling, during naturalistic interactions (Gelman, Coley, Rosengren, Hartman, & Pappas, 1998). Forty-six mothers and their 20- or 35-month-old children read picturebooks together.
Sessions were videotaped and coded for explicit and implicit talk and gestures concerning categories. There were a variety of intriguing findings from the study. Here we focus on one finding in particular: mothers used generic noun phrases to convey category-wide information, and did so much more for animals than artifacts (Table XI). Indeed, most of the mothers made at least one statement including a generic noun phrase during the brief (15- to 30-min) session. Thus, the results suggest that generics are relatively frequent in ordinary speech, they are available to young children learning about category structure, and they are used differentially across domains. Although generics occurred in only a small percentage of mothers' speech, this frequency represents a substantial and potentially salient amount of input to children. Nouns can function in many different ways, including generic reference, singular definite reference, general definite reference, nonreferring definite reference, distributive general reference,
TABLE XI
USES OF GENERIC NOUN PHRASES IN CHILD-DIRECTED SPEECH

                                  Study A       Study B       Study C
                                  35 months     35 months     20 months
                                  (N = 16)      (N = 14)      (N = 16)
Domain
  Animal                          82%           87%           90%
  Artifact                        18%           13%           10%
Total number of generics          117           63            52
Percentage of total utterances    3.86          2.51          3.87

Data from Gelman, S. A., Coley, J. D., Rosengren, K., Hartman, E., & Pappas, T. (1998). Beyond labeling: The role of maternal input in the acquisition of richly-structured categories. Monographs of the Society for Research in Child Development. Serial No. 253, Vol. 63, No. 1.
collective general reference, specific indefinite reference, and nonspecific indefinite reference (Lyons, 1977, pp. 177-197). Given this variety of functions, any given noun phrase type will constitute only a small fraction of speech. Accordingly, even the most salient of noun phrase types will occur in less than the majority of utterances. (Analogously, although food is a highly salient and important concept for young children, mention of food appears in much less than half of their utterances, because there are many competing topics of conversation.) In order to determine the relative salience of generics, it is thus misleading to consider the proportion of speech containing generics, and more meaningful to consider the absolute frequency of such speech. For example, in study A, 87% of the mothers produced one or more generics in a 10- to 15-min session. (In contrast, only 56% of the mothers talked about numbers and only 37% of the mothers referred to object shape.) During this brief session, each mother produced on average approximately 189 utterances, nearly 4% of which were generics. By extrapolation, this suggests that children would typically hear more than 30 generics per hour, if placed in a comparable context, or hundreds of generics per day. Indeed, the rate of generics in maternal speech is comparable to the rate that mothers produce causal language (Hickling & Wellman, 1998) and exceeds the rate that children produce genuine psychological references to thoughts and beliefs at 6 years of age (Bartsch & Wellman, 1995). In study A, for example, the rate of generic usage was greater than the rate at which mothers talked about object size (3.09% of utterances), color (1.96% of utterances), number (0.77% of utterances), shape (0.35% of utterances), or texture (0.22% of utterances). By contrast, truly rare linguistic forms, such as the dative passive, would be found much less frequently.
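The extrapolation above can be reproduced with a few lines of arithmetic. The sketch below takes the figures reported for study A (about 189 utterances per mother, 3.86% of them generic per Table XI, in a 10- to 15-min session) and scales them to an hour. The function name and the assumption that the rate of talk would stay constant for a full hour are ours, introduced only for illustration.

```python
def generics_per_hour(utterances_per_session, percent_generic, session_minutes):
    """Scale a per-session count of generics to an hourly rate.

    Assumes (for illustration only) that the rate of talk observed in the
    session would hold constant for a full hour.
    """
    generics_per_session = utterances_per_session * percent_generic / 100
    return generics_per_session * 60 / session_minutes


# Figures reported in the text for study A.
UTTERANCES = 189        # mean utterances per mother per session
PERCENT_GENERIC = 3.86  # Table XI, study A

# The session lasted 10 to 15 min, so compute both ends of the range.
low = generics_per_hour(UTTERANCES, PERCENT_GENERIC, session_minutes=15)
high = generics_per_hour(UTTERANCES, PERCENT_GENERIC, session_minutes=10)
print(f"{low:.1f} to {high:.1f} generics per hour")
```

Under these assumptions the estimate runs from roughly 29 to roughly 44 generics per hour, depending on session length, which is in line with the figure of about 30 or more per hour given in the text.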
The domain differences in generic usage cannot be attributed to familiarity, similarity, or amount of talk, all of which were controlled in these studies. It is also unlikely that the domain differences can be attributed to lack of sufficient knowledge about the artifacts. Mothers certainly knew several category-general properties true of each artifact depicted (including its parts, function, thematic associates, and appearance), and mentioned many of these properties in reference to particular objects and contexts. Importantly, however, mothers typically failed to mention these properties in generic form. Why, then, did animals elicit so many more generics than artifacts? We interpret this result as reflecting conceptual differences between animal vs. artifact categories. Assuming that mothers construe animal kinds as more richly structured than artifact kinds (deeper similarities, greater coherence, etc.), it should be easier for mothers to conceptualize animal categories as abstract wholes, and hence to use generics. What is then interesting for the present discussion is that the domain difference in maternal generic usage is available to young children, and may inform children's acquisition of this very same conceptual distinction.
B. STUDY 4: GENERICS IN CHILD-DIRECTED SPEECH OF MANDARIN CHINESE SPEAKERS
Generics in English are marked with specific formal devices such as bare plurals (e.g., bears) and definite singular noun phrases (e.g., the bear). Yet languages differ in the formal devices employed to express definiteness and plurality (Croft, 1990). What are the implications of these cross-linguistic differences for the expression of generics in languages other than English? Mandarin is a particularly revealing comparison language because it lacks articles and the singular/plural distinction on nouns. Thus, it contains sentences that could be translated into English using either generic or nongeneric forms (Krifka, 1995). For example, the following sentence:

    xiao3    ya1zi   yao2yao2bai3bai3   de   zou3   lu4
    little   duck    waddlingly         DE   walk   road

could be translated into English as: (1) "The duck is waddling," (2) "The ducks are waddling," or (3) "Ducks waddle." Only (3) is generic. This does not mean that Mandarin fails to express generics. In particular, there are subtle semantic and pragmatic cues that help clarify the status of the utterance (Krifka, 1995). However, generics are less transparently marked in Mandarin than in English. A longstanding but untested claim is that these linguistic differences lead to corresponding conceptual differences in how speakers of Mandarin vs.
English think about abstract kinds (Moser, 1996). Bloom (1981, p. 36) stated the linguistic relativity hypothesis clearly: "Perhaps the fact that English has a distinct way of marking the generic concept plays an important role in leading English speakers, by contrast to their Chinese counterparts, to develop schemas specifically designed for creating extracted theoretical entities, such as the theoretical buffalo, and hence for coming to view and use such entities as supplementary elements of their cognitive worlds." However, Bloom's evidence for this position was insufficient by his own admission (p. 36), and he cautioned that further research is needed. Study 4 examined generics cross-linguistically (English and Mandarin) in child-directed speech from caregivers in the United States and China (see Gelman & Tardif, 1998, for a fuller report). Our primary questions were whether generics could be identified in Mandarin, despite the cross-linguistic differences in how transparently they are expressed, and if so, how frequently they appear relative to English. We gathered child-directed speech from 24 English-speaking parents (in Ann Arbor, Michigan) and 24 Mandarin-speaking parents (in Beijing, China) interacting with their 20-month-old children. Each parent-child pair was videotaped for 30 min. We kept the physical contexts (including play materials) identical across languages. Each videotape was transcribed and coded by native speakers in the relevant language, with a bilingual coder for reliability. We did not code pronouns, given that Mandarin is a pro-drop language. All other noun phrases were coded in two ways: (1) as generic or nongeneric, and (2) for domain. Sample generics included: "Baby birds eat worms" (English) and "da4 lao3shu3 yao3 bu4 yao3 ren2?" ("Do big rats bite people or not?") (Mandarin). We found that generic NPs could be reliably identified in both English and Mandarin (with agreement between coders of well more than 90% in each language).
Moreover, despite very different formal devices for expressing generics, patterns were remarkably similar across languages. Generics were frequent in Mandarin as well as English (83% of the Mandarin-speaking mothers and 100% of the English-speaking mothers produced at least one during 30 min of play with their 20-month-olds; average of 1 generic every 4 min). Moreover, the distribution of generic noun phrases differed markedly from that of nongeneric noun phrases in both languages (with generics used significantly more for animals than for artifacts, and nongenerics used significantly more for artifacts than animals; Table XII). Thus, domain differences in generic use cannot be due to differences in the salience of each domain. Interestingly, however, generics were significantly more common in English than Mandarin, suggesting that language-specific differences in how transparently generics are marked may affect frequency of use. (As predicted, there were no language differences in frequency of nongenerics.)
TABLE XII
STUDY 4: RELATIVE FREQUENCY OF GENERIC AND NONGENERIC NOUN PHRASES IN ENGLISH AND MANDARIN, AS MEAN PERCENTAGE OF TOTAL UTTERANCES, WITHIN EACH DOMAIN

                                             English      Mandarin
                                             (N = 24)     (N = 24)
Generic noun phrases as mean number
per 100 total utterances
  Animates                                     2.08         0.83
  Artifacts                                    1.16         0.43
  Other                                        0.34         0.14
  Total                                        3.58         1.40
Nongeneric noun phrases as mean number
per 100 total utterances
  Animates                                    11.68        12.30
  Artifacts                                   19.34        16.73
  Other                                        5.66         5.01
  Total                                       36.68        34.04

From Gelman, S. A., & Tardif, T. Z. (1998). Generic noun phrases in English and Mandarin: An examination of child-directed speech. Cognition, 66, 215-248.
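The domain and language patterns described for Table XII can be checked directly against the table's means. The following sketch copies the values as we read them from the table, confirms that each column sums to its reported total, and computes the share of noun phrases referring to animates. The dictionary layout and function name are our own, and the sketch is descriptive only: it does not reproduce the significance tests reported in Gelman and Tardif (1998).

```python
# Mean number of noun phrases per 100 utterances, copied from Table XII.
TABLE_XII = {
    "generic": {
        "English":  {"animates": 2.08, "artifacts": 1.16, "other": 0.34},
        "Mandarin": {"animates": 0.83, "artifacts": 0.43, "other": 0.14},
    },
    "nongeneric": {
        "English":  {"animates": 11.68, "artifacts": 19.34, "other": 5.66},
        "Mandarin": {"animates": 12.30, "artifacts": 16.73, "other": 5.01},
    },
}


def animate_share(row):
    """Fraction of a row's noun phrases that refer to animates."""
    return row["animates"] / sum(row.values())


for np_type, by_language in TABLE_XII.items():
    for language, row in by_language.items():
        print(f"{np_type:10s} {language:8s} "
              f"total={sum(row.values()):6.2f} "
              f"animate share={animate_share(row):.2f}")

# Ratio of English to Mandarin generic rates (all domains combined).
english_total = sum(TABLE_XII["generic"]["English"].values())
mandarin_total = sum(TABLE_XII["generic"]["Mandarin"].values())
print(f"English/Mandarin generic ratio: {english_total / mandarin_total:.2f}")
```

In both languages animates account for more than half of the generic noun phrases but well under half of the nongeneric ones, and the overall generic rate in English is about two and a half times the Mandarin rate.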
We conducted an additional analysis to make sure that the language differences were not an artifact of the coding system. If the procedure we used to identify generics was more conservative in Mandarin than English, this could explain why we found more generics in English. To look at this issue, we took a subset of the English transcripts, stripped away all linguistic markers that are not typically found in Mandarin (including articles, plural markers, and pronouns), and gave these modified transcripts to the coders. We asked them to use the same criteria that were used for coding Mandarin. On this crude measure, at least, coders identified more generics when the markers were removed than when they were present. Seventy-five percent of generics identified originally in English (with markers present) were still identified when markers were absent, plus an additional set. If we exclude those utterances for which there was agreement across languages, we find that twice as many generics were identified when markers were absent as when markers were present. These results suggest that the coding of Mandarin did not reduce the estimate of the number of generics (and may even have inflated it).
1. Summary
The results of study 4 demonstrate that generic noun phrases are expressed in at least two quite distinct languages (English and Mandarin Chinese)
that make use of formally distinct constructions. Moreover, despite an overall greater frequency of generics in the speech of English-speaking vs. Mandarin-speaking mothers, generics are frequent in the input to young children in both samples. Furthermore, generics in both samples were consistently domain specific in their contexts of use, more often referring to categories of animates than categories of artifacts. These data further support the suggestion that generic noun phrases are an important source of information to children about kinds.
C. STUDY 5: LONGITUDINAL STUDY OF GENERICS IN CHILDREN'S SPEECH
To this point, we have focused exclusively on parental speech. Yet, when do children first use and understand generics? The remaining studies address this question. Study 5 focuses on children's spontaneous production of generics in natural conversations. This study was conducted in collaboration with Jonathan Flukes and Thomas Rodriguez (Gelman, Flukes, & Rodriguez, 1999). Our primary question concerns the age at which children begin to have command of this linguistic form. Although children have acquired many of the basic grammatical devices necessary for expression of generics in English (articles, plurality, tense, and aspect) by 3 years of age, if not earlier, their semantic implications are potentially difficult. Generics refer to concepts that are abstract, not readily depicted (Jackendoff, 1996), and beyond the "here-and-now." Thus, it is not obvious that children will have acquired the semantics of generics. A secondary purpose of the natural language studies is to examine the domain specificity of "kind" concepts in early development. As others have noted, adults treat a broad range of categories as kinds (e.g., including gender and race; Hirschfeld, 1996; Taylor, 1996) but they are also selective (i.e., excluding simple artifacts; Diesendruck, Gelman, & Lebowitz, 1998). Correspondingly, adults show an animacy bias in their use of generic noun phrases. Do children also show an animacy bias, and if so, does it increase or decrease over time? The answer to this question provides insight into the developmental origins of kind concepts. On one view, essentialism is initially specific to biology and later spreads by analogy to other domains (Atran, 1990, 1995). On a second view, essentialism is at first a domain-general assumption, derived from the logic of count nouns (Macnamara, 1986) and applying to them all, which only later gets refined to those domains that best support it (Carey, 1995).
Thus, these accounts lead to two competing developmental predictions: In Atran's (1990, 1995) view, generics should start out domain specific and get broader over time; in Carey's (1995) view, generics should start out domain general
and get more specific over time. Natural language provides a sensitive vehicle for examining these issues, as it enables studying kind concepts in toddlers who are not capable of handling the complex information-processing demands of many experimental tasks (Bartsch & Wellman, 1995). The data in this study were drawn from longitudinal transcripts in the CHILDES database organized by Brian MacWhinney and Catherine Snow (1985, 1990). Subjects were eight children (ages 2-4 years) followed longitudinally. The researchers who contributed the data were Lois Bloom (1970), Roger Brown (1973), Stan Kuczaj (1976), Brian MacWhinney, Jacqueline Sachs (1983), and Catherine Snow. We examined all utterances containing plural nouns, mass nouns, and indefinite singular nouns (totaling nearly 45,000 utterances), and coded each in two ways: (1) as generic or nongeneric, and (2) for domain (person/animal, artifact, other). Intercoder agreement on identification of generics was 97%.
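The text reports intercoder agreement as a percentage but does not say how it was computed. As a minimal sketch, assuming simple percent agreement (the share of utterances given the same code by both coders), the calculation might look like the following. The function name and the ten codes are hypothetical and are not drawn from the study's data.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items on which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same items")
    if not codes_a:
        raise ValueError("No items to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


# Hypothetical codes for ten utterances (G = generic, N = nongeneric).
coder_1 = ["G", "N", "N", "G", "N", "N", "N", "G", "N", "N"]
coder_2 = ["G", "N", "N", "G", "N", "G", "N", "G", "N", "N"]

print(f"Agreement: {percent_agreement(coder_1, coder_2):.0f}%")
```

Percent agreement is easy to interpret but does not correct for agreement expected by chance, which can be substantial when one code (here, nongeneric) is far more common than the other.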
TABLE XIII
STUDY 5: RELATIVE FREQUENCY OF GENERIC NOUN PHRASES IN THE NATURALLY OCCURRING SPEECH OF CHILDREN IN THE CHILDES DATABASE, AS MEAN PERCENTAGE OF TOTAL UTTERANCES AND AS MEAN PERCENTAGE OF SEARCHED UTTERANCES (MASS NOUNS, PLURAL NOUNS, AND INDEFINITE SINGULAR NOUNS ONLY) WITHIN EACH DOMAIN

                                          Age 2       Age 3       Age 4
                                          (N = 7)     (N = 8)     (N = 6)
Generics as mean percent of total
utterances
  Animates                                 0.27%       1.23%       1.82%
  Artifacts                                0.09        0.35        0.59
  Other                                    0.33        0.57        0.77
  Total                                    0.69        2.15        3.18
Generics as mean percent of searched
utterances within each domain
  Animates                                 3.74%       9.62%      13.05%
  Artifacts                                1.39        3.90        7.83
  Other                                    2.95        4.53        5.01
Total number of generics                   378         1564        1172

1. Frequency of Generics in Child Speech
As can be seen in Table XIII, children as young as 2 years of age spontaneously produced generics in everyday conversations. The eight children we studied produced 3114 generic noun phrases during the sessions recorded between ages 2 and 4 years. Examples included the following (with generic
noun phrases in italics): "That shirt's not for girls" (Ross, 2;7); "Animals eat berries and they eat mushrooms" (Abe, 2;9); "Indians live in Africa" (Adam, 3;3); "Bad guys have some guns" (Mark, 3;7); "Don't play with guns" (Sarah, 4;10). The children thus readily made reference to kinds. Although the frequency of generic utterances is only a modest fraction of children's total speech, this amount is high when one considers the high volume of speech produced, the variety of noun phrase types that are possible, and the comparable frequency of other salient and important topics (see discussion of these issues in the section entitled "Frequency of generics in ordinary speech"). The use of generics increased from ages 2 to 4 years. We do not yet know why the developmental increase occurs. It may reflect a conceptual change in the early preschool years. Specifically, a developmental increase may occur in how readily children think about kinds. The change is unlikely to be due entirely to increasing syntactic skills during this age range, as we find the same developmental patterns when we restrict the focus just to those noun phrases with indefinite singular nouns, mass nouns, or plural nouns (the forms used for generics) (see "Generics as mean percent of searched utterances within each domain" in Table XIII). In other words, when we examine the percentage of searched utterances that include generics, once again we find a statistically significant difference between frequency at age 2 years and frequency at each of ages 3 and 4 years (p < .005).
2. Domain Specificity of Generics in Child Speech
When we turn our attention to the domain specificity of children's generics, we find that children at each age provided significantly more generics for animate kinds than for artifacts (p < .02; see Table XIII). Before concluding that children have an animacy bias, however, it is important to conduct an analysis of children's baseline speech. In other words, we need to make sure that children's animacy bias in generics is not simply due to an abundance of animate noun phrases overall. In order to address this question, we computed a proportion score for each domain that was the number of generic noun phrases in that domain divided by the number of total coded noun phrases in that domain. Thus, each subject's data serve as his or her own control. As shown in Table XIII, even controlling for baseline frequencies of speech in each domain, there remained a strong preference for children to use generics for animates--both people and nonhuman animals. This difference was significant even at age 2 years. Furthermore, the data were consistent across subjects: When controlling for the number of searched utterances in each domain, six of the eight children provided more generic nouns for animates than artifacts at every age, one of the children showed the pattern in two of the three age periods,
and the eighth child showed this pattern at one of the two ages for which we had data. To put this another way, out of 21 comparisons (five children with data at all three age periods, and three children with data at two age periods), 19 showed a higher proportion of animate generics than of artifact generics. These patterns are intriguing, given the decided lack of an animacy bias in the earlier studies examining the word kind and logical quantifiers. We found no animacy bias for noun phrases containing kind, all, each, every, or any. In contrast, generic noun phrases were weighted toward the animate categories. Furthermore, instances of quantifiers that were kind-referring also demonstrated an animate bias (see study 3). These results suggest that generic noun phrases--as well as kind-referring instances of all, each, any, and every--function differently from these other words for both children and adults. What do these results imply about the theories of developmental origins outlined earlier? Interestingly, the results support neither the "domain general" nor the "biology module" position. On the one hand, we find no evidence that children possess a domain general essentializing tendency (Carey, 1995). Although children do, of course, learn lexical labels for categories in every domain, they selectively prefer to apply generic noun phrases to people and animals. On the other hand, the data also do not support the notion that children start out with a specifically biological notion of kind that gets extended to other domains (Atran, 1995). Although children's earliest generics are more frequent for animates than artifacts, the animate kinds that receive generic noun phrases are not strictly biological. Children's earliest uses incorporate nonbiological social categories (e.g., bad guys, carpenters, cowboys, strangers, clowns). Instead, we suggest that children may have an early appreciation for animacy (not biology), which gets linked to their concept of kind.
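The baseline-controlled proportion score used in this section (generic noun phrases in a domain divided by all coded noun phrases in that domain) can be sketched as follows. The function name, the data layout, and the counts are hypothetical and chosen only to show why the control matters; they are not the study's data.

```python
from collections import Counter


def proportion_scores(noun_phrases):
    """Return, per domain, generic NPs divided by all coded NPs in that domain.

    `noun_phrases` is an iterable of (domain, is_generic) pairs for one child
    at one age period.
    """
    totals = Counter()
    generics = Counter()
    for domain, is_generic in noun_phrases:
        totals[domain] += 1
        if is_generic:
            generics[domain] += 1
    return {domain: generics[domain] / totals[domain] for domain in totals}


# Hypothetical coded noun phrases for one child at one age (not real data).
sample = (
    [("animate", True)] * 6 + [("animate", False)] * 54
    + [("artifact", True)] * 3 + [("artifact", False)] * 117
)

scores = proportion_scores(sample)
print(scores)
shows_animacy_bias = scores["animate"] > scores["artifact"]
print("Animacy bias for this child and age:", shows_animacy_bias)
```

Because each score is divided by the child's own amount of talk in that domain, a child who simply talks more about one domain does not thereby appear biased toward it. Repeating the comparison for every child at every age for which data exist yields the 21 comparisons tallied in the text.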
3. Summary
The finding that children express generics consistently and in appropriate contexts by 2 years of age suggests that they are understood by this age, if not earlier. Correspondingly, this finding further supports the hypothesis that maternal generics may play a role in children's developing concepts. The next step in determining their role is to look directly at children's interpretation and comprehension of generics (see studies 6-8).
D. STUDY 6: CONCEPTUAL DISTINCTIONS BETWEEN GENERIC AND NONGENERIC NOUN PHRASES IN PARENT-CHILD CONVERSATIONS
Although the research described previously documents that generics are available in ordinary speech from a surprisingly early age, it does not tell
us how this linguistic construction is understood. At a most fundamental level, the work does not tell us whether generics are conceptually distinct from nongeneric utterances. What independent evidence do we have that speakers use generics to refer to kinds as opposed to individual instances? Two findings are suggestive in this regard, although not definitive. First, as discussed previously, both mothers and their children displayed a substantial domain difference, producing significantly more generics for animals vs. artifacts, even when we control for the frequency of talk in the two domains. Taken in conjunction with work suggesting that animal categories are more coherent and richly structured than artifact categories (e.g., Gelman, 1988; Keil, 1989), this domain difference suggests that generics are reserved for talking about categories with particularly rich correlated structure. A second piece of evidence came from the Gelman et al. (1998) study, in which it was noted that mothers showed an occasional mismatch between the number of available category instances and the plurality of the noun phrase used. Specifically, mothers at times used plural generics even when only a single instance was visible in the picture (e.g., "That's a chipmunk. And they eat the acorns"). Similarly, sometimes mothers shifted between singular and plural forms (e.g., "Did you know when a pig gets to be big, they're called hogs?"). This pattern is striking, because on the surface it would appear to be a blatant error: reference to a single individual with a plural noun. However, we suggest that the "error" is in fact not an error at all, but rather reflects the semantics of generic nouns. Specifically, "they" in the chipmunk example refers not to the chipmunk identified in the previous sentence, but rather to chipmunks as an abstract kind.
If our interpretation is correct, then these mismatches suggest that generics are not tied to a particular set of instances present in the immediate context but rather refer to the category as a larger whole. It is unclear, however, how characteristic these mismatches are for generics and whether they differ systematically from the use of nongeneric noun phrases. Study 6 was designed to address whether generics are distinct in function from nongenerics by looking more closely at the phenomenon discussed previously: mismatches between context (one vs. multiple instances) and linguistic form (singular vs. plural) (see Pappas & Gelman, 1998, for a full report). Specifically, preliminary evidence suggests that generics may be used to refer to categories in general (e.g., squirrels as an abstract whole). However, in order to argue that generics refer to categories as distinct from individuals in immediate contexts, two alternative explanations need to be ruled out. First, the use of plural noun phrases in the context of a single instance could simply be an error. Parents may occasionally use the wrong form due to forgetfulness or slips of the tongue. For example, a mother
may have intended to say "It eats the acorns," but came out with "They eat the acorns" instead. A second possibility is that the number mismatch reflects use of "they" as a gender-neutral pronoun. Because it was not possible to detect whether the animals in the picturebook were male or female, perhaps subjects, uncertain of whether to say "he" or "she," opted for "they." If either alternative account is apt (errors or gender-neutral pronouns), then we should find the same mismatch between plural noun phrases and single-exemplar contexts with nongeneric utterances as with generics. For example, if the gender interpretation is correct, then parents should just as often say things like, "See this bat? They came from the cave over there" (i.e., using "they" in a nongeneric sentence) as "See this bat? They live in caves" (i.e., using "they" in a generic sentence). In contrast, if the number mismatch pattern is distinctive to generics, this would provide indirect evidence for a conceptual distinction between generic and nongeneric constructions. To summarize, the present study examines the distribution of generic utterances relative to nongeneric utterances. If generics and nongenerics are semantically and conceptually equivalent, then they should not differ from each other with respect to the distribution of linguistic form (singular vs. plural) across depicted contexts (individual instance vs. multiple instances depicted on a page). However, if generics and nongenerics are semantically and conceptually distinct, then their distributions should differ, with generics eliciting more plural forms in single-instance contexts. We asked mother-child pairs to look through picturebooks about animals. The books were specially created so that each page included either a single instance of a category (e.g., one crab) or many instances of a category (e.g., many crabs), thus manipulating contexts by varying the number of items on a page.
There were 16 pages per book: 8 pages depicted a single animal on each; 8 pages depicted many (12-15) animals of a given category on each. The number of instances was counterbalanced across books (e.g., book A included one crab and many rabbits; book B included many crabs and one rabbit). Subjects were 26 mother-child pairs, with children ranging in age from 23 to 57 months (mean age 38 months). Subjects were seated on chairs at a table and told that they would be given a picturebook for them to look through and talk about as they typically would at home. Sessions were videotaped and later transcribed. A coder identified all noun phrases (proper nouns, common nouns, pronouns, and adjectival noun phrases) referring to the target items--for example, on the fish page, all noun phrases referring to fish, regardless of whether depicted on the page. We refer to these as "coded utterances." Utterances containing the target noun phrases were then coded for number
(singular vs. plural) and generic status (generic vs. nongeneric). See Table XIV for results.

TABLE XIV
STUDY 6: MEAN NUMBER OF GENERIC AND NONGENERIC UTTERANCES AS A FUNCTION OF SPEAKER, AGE, PAGE TYPE, AND LINGUISTIC FORM

                              Generics                                 Nongenerics
                     Single       Multiple      Single vs.    Single       Multiple      Single vs.
                     instance(a)  instances(a)  multiple      instance(a)  instances(a)  multiple
Mothers
  Singular NP(b)      1.42         1.08         n.s.          38.00        21.23         *
  Plural NP(b)        5.42         3.38         n.s.           2.08        26.81         *
  Singular vs. plural +            +                           +           n.s.
Children
  Singular NP(b)      0.04         0.08         n.s.          19.69         9.81         *
  Plural NP(b)        1.04         0.58         n.s.           0.58        12.61         *
  Singular vs. plural +            +                           +           n.s.

Sixteen pairwise comparisons were performed: 8 within each row of the table, for generics and nongenerics separately; and 8 within each column of the table, for mothers and children separately.
* Significant difference between single instance and multiple instances (p < .001).
+ Significant difference between singular and plural NPs (p < .05).
n.s. = non-significant. NP = Noun Phrase.
(a) Indicates number of category instances depicted on a page (one or many).
(b) Indicates form of noun phrase produced by speaker (singular or plural).
From Pappas, A., & Gelman, S. A. (1998). Generic noun phrases in mother-child conversations. Journal of Child Language, 25, 19-33.

1. Maternal Generics
Generics accounted for a small but consistent subset of the noun phrases produced: 92% of the parents (24 of the 26) produced at least one generic noun phrase; overall, this accounted for a mean of 11% of the coded utterances that parents produced. Rates ranged across parents from 0% to 41% of all utterances produced. As expected, nongeneric utterances were more frequent than generic utterances, as indicated by a main effect of generic status (p < .001). There was also a main effect of linguistic form (p < .001), indicating that utterances with singular noun phrases were overall more common than utterances with plural noun phrases. Most interesting for our purposes was the three-way interaction involving generic status, page type, and linguistic form (p < .001). We approach the interaction by considering the patterns for generics and nongenerics
separately. For nongenerics, linguistic form (singular or plural) interacted with page type: utterances containing singular noun phrases were more frequent when the page depicted just a single instance than when it depicted multiple instances; utterances containing plural noun phrases were more than 10 times more frequent when the page depicted multiple instances than when it depicted a single instance. Thus, when producing nongenerics, the form of the language (singular or plural) closely matched what was depicted on the page (one or many instances). The only lack of correspondence was due to the fact that parents also used many singular noun phrases for pages with multiple instances. However, this finding is consistent with the fact that one can focus on an individual animal even when multiple instances are displayed. In contrast, for generics, linguistic form was wholly independent of page type: both singular noun phrases and plural noun phrases were produced as often when the page depicted just one instance as when it depicted multiple instances. There was a slight tendency for parents to produce more utterances containing generic noun phrases (of either singular or plural form) for pages depicting a single instance than for pages depicting multiple instances, but these differences were not significant. Rather, what mattered for generics was linguistic form: utterances containing plural noun phrases were significantly more frequent than utterances containing singular noun phrases. Thus, generics do not appear to be tied closely to the numerical information on the page.
2. Child Generics
Although the overall percentage of generics was rather modest (1% of the coded utterances produced by 2-year-olds and 5% of the coded utterances produced by 3- and 4-year-olds), more than half the subjects produced at least one generic noun phrase during the book-reading session (50% of the 2-year-olds and 79% of the 3- to 4-year-olds). The reading sessions were rather brief, averaging approximately 10-15 minutes apiece; thus, by extrapolation, children were producing more than six generics per hour. The patterns for the children were remarkably similar to those of the adults. Again, nongenerics were more frequent than generics, as indicated by a main effect of generic status (p < .001). Also, singular noun phrases were more frequent than plural noun phrases (p < .001). As with the mothers, the comparisons of primary interest are those involving generic status, linguistic form, and page type, including a significant three-way interaction (p < .001). Here again, to interpret the three-way interaction we consider the patterns for generics and nongenerics separately. For nongenerics, linguistic
form (singular or plural) interacted with page type: singular noun phrases were approximately twice as frequent when the page depicted just a single instance as when it depicted multiple instances; plural noun phrases were more than 20 times more frequent when the page depicted multiple instances than when it depicted a single instance. As with the adults, children frequently used singular noun phrases when talking about pages with multiple instances. Otherwise, for nongenerics, the form of the language (singular or plural) closely matched what was depicted on the page (one or many instances). Once again, for generics, linguistic form was independent of page type: for both singular noun phrases and plural noun phrases, frequency of generics did not differ significantly as a function of how many pictures were displayed on the page. Rather, what mattered for generics was linguistic form: plural noun phrases were much more frequent than singular noun phrases.
3. Summary
The data clearly show that generics are distributed differently from nongenerics for both parents and children. Whereas the linguistic form of nongenerics closely matched the number of pictures in the context (with singular noun phrases typically used for single-instance pages and plural noun phrases typically used for multiple-instance pages), such was not the case for generics. Indeed, generic plurals were used slightly more often in the context of single-exemplar pages than in the context of multiple-exemplar pages, although this difference was not significant. At times this led to the sort of "mismatches" described earlier. For example, in one transcript, the mother referred to an individual ostrich as "ostrich," and the child replied, "They stink," using a plural pronoun following reference to an individual. Although we had predicted that generics would be relatively more independent of context than nongenerics, the size of the effect was rather surprising: for generics, linguistic form was wholly independent of context, as measured by number of items on the page. In other words, subjects were no more likely to access the larger category when presented with many instances than when presented with just one. The fact that even a single instance of the category could serve to trigger a generic utterance suggests that subjects may be thinking about individual animals in two ways, both as individuals and as instantiations of a kind. In summary, we interpret the present data as providing evidence that generic noun phrases differ in their semantics and conceptual organization from nongeneric noun phrases, both in the input to young children and in children's own speech.
E. STUDY 7: SEMANTIC INTERPRETATION OF GENERICS
Study 7 focuses directly on what generics mean to young children. Although studies 5 and 6 demonstrate that preschoolers use generics in different contexts than nongenerics, and therefore that generics are distinguished from nongenerics in some respects, that work did not examine the meaning of these expressions. Study 7 addresses the meaning of generic expressions by examining their scope for young children. As noted earlier, for adults, generics are distinctive in implying broad category scope (e.g., "Birds fly" is generally true of birds) yet allowing for exceptions (e.g., penguins). Thus, generics are distinct from both all (e.g., "All birds fly") and some (e.g., "Some birds fly"). We conducted an experiment to test whether preschool children appreciate this (Hollander & Gelman, 1999b). The study was modeled after an experiment conducted by Smith (1980) that focused exclusively on all and some. In Smith's study, children ages 4;1 to 7;6 received a series of questions regarding properties of categories. One-third of the properties were true of all members of the category in question (what we will call "all-properties"); one-third were true of some members of the category ("some-properties"); and one-third were true of no members of the category ("none-properties"). Children were asked about each category-property pairing with either the word all or the word some (e.g., "Do all elephants have trunks?" vs. "Do some elephants have trunks?"). Smith's results indicated that even 4-year-olds appropriately distinguished all and some under favorable presentation conditions (i.e., first half of the first block of questions). We predicted that, if given the same task with questions presented in generic form, children would treat generics as partly like all and partly like some. In particular, we predicted that children would accept both "all-properties" and (to a lesser extent) "some-properties" as true in generic form.
Also of interest was whether generics would pattern more like all or more like some. Here we had no a priori predictions. In study 7, children were tested on three kinds of expressions: all, some, and generic. Ten children participated, ranging in age from 4;0 to 4;10 (mean age 4;6). We focused on 4-year-olds because this is the youngest age at which children have been shown to distinguish all and some consistently. Each child received three blocks of questions (generic, "all," and "some"), in counterbalanced order. Each block consisted of 12 questions: 4 concerning all-properties, 4 concerning some-properties, and 4 concerning none-properties (Table XV). Each property was rotated through each of the three wording conditions so that the specific content was not confounded with a particular condition (e.g., across children, a given question would
TABLE XV
STUDY 7: SAMPLE ITEMSᵃ

Wording condition    Sample question                        Property type
Generic questions    Are fires hot?                         all-property
                     Do girls have curly hair?              some-property
                     Do fish have branches?                 none-property
"All" questions      Is all candy sweet?                    all-property
                     Do all dogs have brown spots?          some-property
                     Do all saws have toothaches?           none-property
"Some" questions     Do some giraffes have long necks?      all-property
                     Do some books have color pictures?     some-property
                     Do some zebras wear watches?           none-property

ᵃ Each property was rotated through all three wording conditions.
be "Are fires hot?", "Are all fires hot?", or "Are some fires hot?"). Each question was asked in yes/no format. We recorded each response as well as any additional comments children spontaneously provided. Our first analysis examined the number of trials on which children said "yes". These results can be seen in Fig. 1. There was a significant interaction between question and property type (p < .001). With the all-properties, children were more likely to answer "yes" in response to "all" and generic questions than in response to "some" questions (p < .05). There was no
[Figure 1. Title: 4-Year-Olds' Interpretation of Quantified Noun Phrases (Hollander & Gelman). Vertical axis: 0 to 4; horizontal axis: wording condition (Generic, "All," "Some").]
Fig. 1. Study 7: Mean number of trials on which children responded "yes" as a function of wording condition and property type.
significant difference between "all" and generic on these items. In contrast, with some-properties, children were more likely to answer "yes" in response to "some" and generic questions than in response to "all" questions (p < .01). There was no significant difference between "some" and generic on these items. Finally, for both generic and "all" questions considered separately, children were more likely to affirm all-properties than some-properties (p < .05). In contrast, there was no significant difference between all- and some-properties for "some" questions. We then examined how often children qualified their response with "some" (e.g., in response to, "Do girls have curly hair?" a child might say, "Some girls do" or "Yes, some girls do"). Appropriately, nearly all such qualifications were in response to the some-properties (p < .02). Of greater interest is whether this differs by question. Although there was not a significant interaction between question and property type, a planned comparison revealed that for the some-properties, "some" qualifications were significantly more frequent for generic questions (M = 1.08) than for "some" questions (M = 0.44; p < .01). "Some" qualifications were intermediate for "all" questions (M = .83). We suggest that qualifications are provided when the extra information supplied cannot be assumed in the question. Thus, the fact that generic questions are more often qualified with "some" indicates that generics do not themselves imply "some." Finally, we examined how often children qualified their response with "all" (e.g., in response to, "Are some fires hot?" a child might say, "All of them are"). Appropriately, the vast majority of "all" qualifications were in response to the all-properties (p < .01). More interesting for our purposes, they were used significantly more often in response to "some" questions than in response to "all" or generic questions (Ms = 0.52, 0.07, and 0.04, respectively; p < .02).
This result was particularly clear when focusing just on responses to the all-properties (Ms = 1.56, 0.22, and 0.00 for "some," "all," and generic questions, respectively), where "some" was significantly different from each of the other two questions (ps < .01).
1. Conclusions
Children interpret generics as being reducible to neither "all" nor "some." Like "all," generics are appropriate for category-wide generalizations (e.g., "(All) fires are hot"). Yet like "some," generics are appropriate for properties true of a subset (e.g., "(Some) girls have curly hair"). Although generics can be said to be midway between "all" and "some," they are also more "all"-like in two respects. First, generics are endorsed more often for all-properties than some-properties (like "all," but unlike "some"). Second, responses to generic questions are more likely to be qualified with "some" (M = .39) than
with "all" (M = .04). This pattern is also found for questions framed with "all" (Ms = .32 and .07, respectively) and is in contrast to what is found for questions framed with "some" (Ms = .15 and .52, respectively). Overall, these results with 4-year-old children are consistent with a semantic analysis in which generics imply broad generalizations, but also allow for exceptions.
F. STUDY 8: GENERICS AS A BASIS OF INDUCTIVE INFERENCES
The purpose of study 8 is to examine children's use of generics in a category-based induction task. Specifically, we were interested in whether children would make use of generic noun phrases to guide their inductive inferences. Although past evidence demonstrates that children can form inductive inferences on the basis of novel generic properties (Waxman, Shipley, & Shepperson, 1991), such work did not contrast generic information with nongeneric information, and so did not measure the specific contribution of this linguistic form. We predicted that generics would be distinct from nongeneric utterances in two respects. First, we predicted that generics would differ from properties stated specifically (e.g., the generic "Bears have three layers of fur" vs. the specific "These bears have three layers of fur"), because generics imply that a property is broadly true of a category. Second, generics were predicted to differ from properties stated absolutely (e.g., the universal "All bears have three layers of fur"), because generics more readily allow for exceptions. Thus, information stated in generic form would seem particularly powerful in guiding children's developing concepts. Thirty-six children (4;2 to 5;10; mean age 4;10) and 38 undergraduates participated in an induction task (Star & Gelman, 1999). On each of 9 item sets, a subject was first shown two target animal pictures (e.g., two bears), and then learned a novel property in one of three wording conditions (generic, "all," and "these"; described below), for example, "Bears have three layers of fur." Finally, the subject was presented pictures of three other members of the target category (the test pictures; bears, in this example) varying in their similarity to the target pictures. The test pictures were viewed one at a time in random order, and for each the participant was asked to say whether it had the novel property (yes or no).
Adults were also asked to rate how confident they were that their answer was correct on a scale from "1" (not at all confident) to "7" (highly confident). In the generic wording condition, subjects heard the novel property in generic form (e.g., "Bears have three layers of fur"). In the "these" wording condition, subjects heard the novel property in specific form (e.g., "These bears have three layers of fur"). In the "all" wording condition, subjects heard the novel property using the universal quantifier all (e.g., "All bears have three layers of fur"). Each participant received the three wording
conditions in blocks, with the blocks presented in one of three randomized orders: all-these-generic (ATG), these-generic-all (TGA), or generic-all-these (GAT). Which animals/properties were paired with which wording condition was systematically varied across subjects. Thus, no person received the same property in more than one wording condition, and every property appeared equally in each of the three wording conditions.
1. Adult Results
Adults showed a strong condition effect, drawing more inferences in the "all" and generic conditions than in the "these" condition (Ms = 93%, 92%, and 51%, respectively, p < .01).⁶ Furthermore, the condition differences were strongest in the GAT order (p < .001), indicating that inferences were lowest in the "these" condition in this order. Adults appeared to be influenced by the contrast between "all" and "these," thus yielding induction rates for "these" that were even lower than in the other conditions. However, in all three orders, "these" was significantly lower than both generic and "all," with no significant differences between the latter two wording conditions. Adults' confidence ratings converged with their yes/no responses. The adults were more confident in their judgments for "all" and generic trials than in their judgments for "these" trials (Ms = 6.09, 5.81, and 4.41, respectively; p < .001). It is interesting that, although semantically generics do permit exceptions, adults treated generics as being as powerful as "all" statements for promoting inductions.
2. Child Results
There were two significant results of primary interest.⁷ First, as predicted, children made more inferences in the "all" condition (M = 61%) than in the "these" condition (M = 51%; p < .05). The generic condition was intermediate and not significantly different from either of the other two conditions (M = 54%). Second, there was a main effect for order of presentation (p < .05). The mean induction rates for experimental sessions beginning with "all" (ATG) or generic nouns (GAT) were significantly higher than for those beginning with "these" (TGA; Ms = 66%, 60%, and 40%, respectively). This last result suggests that children may have been influenced by the first block of trials when responding to the remaining blocks. That the order beginning with "these" yielded the lowest induction rates is consistent with the interpretation that children view "these" as less powerful for induction than "all" or generic.

⁶ There were also significant effects due to degree of perceptual similarity between the target pictures and the test pictures.
⁷ As with the adults, there was also a significant main effect for degree of perceptual similarity between the target pictures and the test pictures.

In order to follow up the order effect, we conducted a secondary analysis focusing just on the first block of trials for each experimental session. For example, for children who were presented the pictures in the ATG order, we only analyzed the first block of trials (the "all" condition only). Similarly, we used only the "these" trials for the TGA order and only the generic trials for the GAT order. This analysis allows us to eliminate any contamination from the first block of trials to subsequent trials. On this analysis, we found a main effect of condition (p < .01). Post-hoc analyses indicated that induction rates for the "these" condition were significantly lower than those for each of the generic and "all" conditions (Ms = 35%, 56%, and 74%, respectively). The latter two conditions did not differ significantly from each other.
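The logic of the first-block analysis can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code used in the study: the record layout, the helper name `first_block_rates`, and the toy data are all hypothetical and invented to exercise the function. The only details taken from the text are the three block orders (ATG, TGA, GAT) and the rule that each session contributes only its first block.

```python
from collections import defaultdict

# Block-order letters from the text: A = "all", T = "these", G = generic.
CONDITION = {"A": "all", "T": "these", "G": "generic"}


def first_block_rates(trials):
    """Return the proportion of "yes" inferences per wording condition,
    using only trials from the first block of each session.

    `trials` is an iterable of (order, block_index, said_yes) tuples
    (a hypothetical layout): `order` is "ATG", "TGA", or "GAT",
    `block_index` is 0, 1, or 2, and `said_yes` is a bool.
    """
    yes = defaultdict(int)
    total = defaultdict(int)
    for order, block_index, said_yes in trials:
        if block_index != 0:
            continue  # later blocks may be influenced by the first block
        condition = CONDITION[order[0]]  # first letter names the first block
        total[condition] += 1
        yes[condition] += int(said_yes)
    return {condition: yes[condition] / total[condition] for condition in total}


# Invented toy data, not the study's results.
toy_trials = [
    ("ATG", 0, True), ("ATG", 0, True), ("ATG", 1, False),
    ("TGA", 0, False), ("TGA", 0, True), ("TGA", 2, True),
    ("GAT", 0, True), ("GAT", 0, False), ("GAT", 1, True),
]
rates = first_block_rates(toy_trials)
```

Because each session contributes data for only one wording condition, the comparison in such an analysis is between subjects rather than within subjects, which is the price paid for removing carryover from the first block.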
3. Summary
Study 8 indicates that generic language affects both children's and adults' inferences. In some ways, Study 8 provides a particularly strong test. First, the linguistic manipulation was subtle, consisting of adding or deleting a single word. Second, the initial presence of the two target pictures provides a strong non-linguistic context that participants could use to answer the questions. Pragmatically, it would be reasonable for a child to assume that the questions refer to the pictures presented initially. This is particularly true, given children's well-known reliance on perceptual information when present in context (e.g., Jones & Smith, 1993). That children systematically overcame this perceptual information to answer differently based on subtle differences in wording condition demonstrates that language plays a role in directing children's inferences.
G. SUMMARY OF GENERICS
Altogether, studies 4-8 suggest that generic noun phrases may be a mechanism by which parents and others convey to young children that a category is a richly structured kind. We draw this conclusion on the basis of the following pieces of evidence: (1) Generics are frequent in ordinary speech addressed to young children. (2) Generics map onto conceptual structure in interesting ways, with much greater frequency for animates (both people and animals) than for artifacts. This finding holds up for both adults and children. (3) For both children and adults, the distribution of generics (over domain, and with respect to object number) differs from the distribution of nongenerics. (4) The patterns obtained in English (frequency in parental input, domain specificity, and differences between generics and nongenerics) replicate even in a language with very different formal means of expressing generics; namely, Mandarin Chinese. (5) Children sensibly interpret the semantics of generics, treating them as broader in scope than "some," but narrower in scope than "all." (6) Initial evidence suggests that children draw broader inferences from generic statements (e.g., "Bears have three layers of fur") than nongeneric statements (e.g., "These bears have three layers of fur"). These last two findings regarding children's interpretation of generics only begin to address the cognitive implications of generic use in children. More research is needed to replicate and explore the effects that generics have on children's inductive inferences. Nonetheless, we speculate that generics may serve two distinct functions for young children. First and most obviously, generics may serve to teach children particular category-wide generalizations. From maternal generics, children can learn particular facts concerning animal vocalizations, habitat, diet, behaviors, and so on. Because these properties are predicated of the kind as a whole, they may become more central to children's conceptual representations than if they had been stated nongenerically. Furthermore, because these facts are stated generically (rather than as universal quantifiers), they may be particularly robust against counterevidence (e.g., "Birds fly" allows for penguins, whereas "All birds fly" does not). Thus, even erroneous properties stated generically, such as stereotypes concerning gender or race, may be more difficult to counter and erase than erroneous properties stated absolutely. The second potential function of maternal generics may be to indicate to children that a category as a whole is an inference-promoting entity, even beyond the particular properties mentioned in the generic statements.
In other words, hearing numerous generic statements about a category may lead children to treat this category as a "kind" of which indefinitely many category-wide generalizations could be made. In short, we suggest that hearing generics may lead children to make inferences regarding the structure of the category. If this is true, then generics may serve this function even when the information is relatively superficial (e.g., "Little rabbits are called kits"), or when little or no new information is provided (e.g., with questions, such as "How do they [bats] sleep?"), because the generic form itself implies that category members are importantly alike.
VIII. Summary and Conclusions
In this chapter we investigated four forms of language: lexicalization, generic noun phrases, the word kind, and logical quantifiers. A logical or semantic analysis would suggest that all four are kind-referring expressions
TABLE XVI
A COMPARISON OF THREE LINGUISTIC DEVICESᵃ

                                                Gelman et al. (1998)
                               Adam's mother    Study A       Study C
Children's age                 2;3-5;2          M = 2;11      M = 1;8
N                              1                16            16
Kind
  Kind-referring in scope      0.19%            0.33%         0.07%
  Total                        1.01%            1.68%         0.52%
All, any, each, and every
  Kind-referring in scope      0.18%            0.13%         0.00%
  Total                        2.69%            2.11%         1.41%
Generic noun phrases
  Kind-referring in scope      2.79%            3.86%         3.87%
Total number of utterances     20,168           3,027         1,345

ᵃ Percentages are based on the total number of utterances produced in each study.
that could potentially help shape children's concepts. However, the results of the studies we have reviewed indicate a divide among these four devices. Considering first availability in the input, generics and lexicalization appear to be considerably more frequent than kind or universal quantifiers. To assess the relative frequencies of these various devices, we present in tabular form a direct comparison among them (excluding lexicalization, which is unquestionably frequent), using two different sorts of databases: the longitudinal study of "Adam" and his mother (Brown, 1973), densely sampled over the period from 2;3 to 5;2, and a cross-sectional study of 32 mother-child dyads who participated in a picturebook reading task in a lab setting, with books designed to elicit talk about categories and kinds (studies A and C from Gelman et al., 1998; Table XVI).⁸ We focus on maternal speech only, for this comparison. These two data sources have complementary strengths: Adam's mother's speech consists entirely of natural conversations in the home, and thus should generalize reasonably well to spontaneous language that children are likely to hear in noncontrived settings. Furthermore, it provides an extremely large sample of utterances. In contrast, studies A and C from Gelman et al. (1998) are useful for including more subjects and for revealing what devices mothers use when in a context that is maximally designed to elicit talk about categories and kinds (see Gelman et al., 1998, for more discussion of the study design).

⁸ The analyses of Adam's mother's use of kind and universal quantifiers were previously presented in studies 1 and 3; Adam's mother's generics are reported here for the first time. Maternal use of all and of generics from Gelman et al. (1998) was reported earlier in this chapter; the use of kind and of the other universal quantifiers from that study is presented here for the first time.

As can be seen, when parents use a kind-referring expression, they are much more likely to use a generic noun phrase than any of the other expressions that were studied (kind, all, any, each, or every). Thus, it is questionable whether these latter expressions are sufficiently available in the input to provide a substantive amount of information to children.

We turn next to conceptual distinctions. To some extent, all the words studied except kind map onto relevant conceptual distinctions. Common nouns, although fully applicable to any domain, were used to imply kind membership, at least with the social categories presented in study 2. More research is needed to determine the extent of this effect. Generic noun phrases were found to map onto conceptual distinctions in interesting ways, showing a highly consistent bias toward referring to animate kinds. Universal quantifiers overall showed no animacy bias, nor were they often used to refer to kinds. However, on those occasions that universal quantifiers did refer to an entire category, they more commonly applied to animates than artifacts. Finally, we found no evidence that the word kind was used in ways that map onto interesting conceptual distinctions. It was typically used to express subordinate-level categories, and rarely to refer to either animates or kinds.

Finally, how are these different expressions understood by children? This issue is not completely understood at present, although initial data suggest that children do indeed understand common nouns, generic noun phrases, and the universal quantifier all as kind-referring expressions. The work we have reviewed has focused primarily on kinds as inference promoting. There are other possible implications, however, that would be interesting to investigate.
For example, essentialism implies relative emphasis on within-group similarity and between-group differences. It also implies that a category has a nonobvious basis, that it is real (discovered) rather than invented, that it is biological in origins rather than social, and that it is inherent in an individual rather than the product of social interaction. Overall, language may help turn an arbitrary characteristic into a kind and may provide clues about how to carve up the social and nonsocial world. To summarize, then, lexicalization and generic noun phrases are frequently used by parents and preschool children, typically refer to kinds, and are interpreted as such by young children. In contrast, the other two forms (the word kind and logical quantifiers) appear infrequently and rarely with reference to basic-level kinds, thus suggesting that they are less likely to influence children's kind concepts. Table XVII illustrates in schematic form how each of these linguistic devices functions in the speech of children and parents.
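Returning to the frequency comparison in Table XVI, the claim that generics dominate parents' kind-referring expressions can be made concrete with a little arithmetic. The following Python sketch uses only the kind-referring percentages reported in that table; the dictionary layout and the function name `generic_share` are illustrative choices, and the calculation assumes that the three devices were counted in non-overlapping utterances, which the text does not state.

```python
# Kind-referring uses as a percentage of all maternal utterances,
# copied from Table XVI. Keys and structure are illustrative only.
KIND_REFERRING = {
    "Adam's mother": {"kind": 0.19, "quantifiers": 0.18, "generics": 2.79},
    "Study A": {"kind": 0.33, "quantifiers": 0.13, "generics": 3.86},
    "Study C": {"kind": 0.07, "quantifiers": 0.00, "generics": 3.87},
}


def generic_share(row):
    """Fraction of kind-referring uses that are generics, assuming the
    three devices never co-occur in the same utterance (an assumption)."""
    return row["generics"] / sum(row.values())


shares = {source: generic_share(row) for source, row in KIND_REFERRING.items()}
for source, share in shares.items():
    print(f"{source}: {share:.0%} of kind-referring uses are generics")
```

Under that assumption, generics account for roughly 88% to 98% of the kind-referring expressions in the three samples, which is the quantitative sense in which the word kind and the universal quantifiers are sparse in the input.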
TABLE XVII
A COMPARISON OF FOUR LINGUISTIC DEVICES: GENERIC NOUN PHRASES, COMMON NOUNS, UNIVERSAL QUANTIFIERS, AND THE WORD KIND

                                    Generics   Common nouns   Quantifiers   "Kind"
Available in the input              Yes        Yes            Yes           Somewhat
Conceptual distinctions:
  Typically refer to kinds          Yes        Yes            No            No
  Typically refer to basic level    ?          Yes            ?             No
  Typically domain specific         Yes        No             Yesᵃ          No
Understood by children              Yes        Yes            Yes           ?

ᵃ Only when used in kind-referring ways.
A. UNIVERSALITY AND LANGUAGE SPECIFICITY
Although we argue that language affects children's conceptual understanding, our position is not a Whorfian claim of radical language differences. Languages universally have the capacity to express important concepts (Au, 1988), including membership in a category and scope of quantification. The available evidence suggests indeed that the distinction between nouns and verbal predicates is universal (Gentner, 1982), as is the capacity to express generics (Carlson & Pelletier, 1995). We expect that the use of these linguistic expressions to foster kind concepts is not limited to English. Nonetheless, there are cross-linguistic variations in the expression of these concepts that could conceivably affect acquisition and use of these expressions. Regarding lexicalization, languages vary regarding the relative primacy of nouns vs. verbs (Choi & Gopnik, 1995; Tardif, 1996; Tardif, Gelman, & Xu, 1999; but see Au, Dapretto, & Song, 1994), and there may be cognitive consequences of these differences (Gopnik & Choi, 1990). The variations in generic expression are particularly interesting, given the results of study 4. Recall that Tardif et al. (1999) found that generic noun phrases were more frequent in English than Mandarin, despite the lack of differences between the languages in the production of nongeneric noun phrases. Thus, faced with an identical context, mothers of English- vs. Mandarin-speaking toddlers produce generic utterances at different rates. We speculate that formal properties of the language may prompt speakers to notice and use generics relatively more (as with English) or less (as with Mandarin). Although the generic/nongeneric distinction itself is not obligatorily marked in either language, in English it is conveyed by means of obligatory cues (including number and determiners). The use of obligatory markers for conveying generics in English may make generic expressions more salient and so more frequently used. In other words, the morphosyntactic system may have a subtle effect on the frequency with which speakers consider abstract kinds. If this is so, then frequency effects should also appear in other languages that are structurally similar to Mandarin in their nominal and verbal systems. Furthermore, there should be measurable cognitive consequences that can be found on nonlinguistic tasks.
B. HOW FUNDAMENTAL A ROLE DOES LANGUAGE PLAY?
Can we characterize more precisely the effects of language in children's kind concepts? One fundamental question that arises is whether common nouns and generics simply reflect preexisting conceptual structures or whether they play any causal role whatsoever. Furthermore, the answer to this question may depend on the category in question. For example, animal kinds are undoubtedly supported by an extensive nonlinguistic, perceptual basis (Rosch et al., 1976), whereas some of the social categories being considered may be more susceptible to language effects. It is highly likely that preexisting conceptual structures are in place by the time children are learning these constructions. Certainly the assumption that categories serve as a basis of induction is untaught (Baldwin, Markman, & Melartin, 1993; Hayne, Rovee-Collier, & Perris, 1987). The finding that children without exposure to a conventional language spontaneously create their own communicative system, complete with nouns and the capacity for displaced reference (Goldin-Meadow & Mylander, 1990; Morford & Goldin-Meadow, 1997), would also suggest that a rich conceptual system is in place prior to the cultural transmission of a conventional language. Furthermore, the linguistic devices we are talking about are, at best, oblique and sketchy. Common nouns and generics only implicitly refer to kinds and inductive potential, and are, in fact, far less explicit in this way than either the word kind or universal quantifiers. We infer from this characterization that children must be filling in gaps based on their own extralinguistic understanding. Nonetheless, the evidence suggests that language is doing more than simply reflecting children's preexisting concepts. Language has direct effects on children's inductive inferences, in experimental scenarios that contrasted different forms of input (studies 2 and 8).
Therefore, if we grant that generics and lexicalization do affect thought, at what level do they exert an effect? Do certain forms of language allow new conceptual understandings to arise, or do they modify existing concepts? To present a somewhat simplified view of the range of possibilities, we propose three potential levels of effects, from narrowest to broadest:
1. Content of kinds: On this view, language helps fill in the details of the kinds that children have already established through nonlinguistic means.
For example, generics may tell children which properties are true of "dogs" (as a kind), or lexicalization may increase a child's confidence that a particular trait is stable over time.
2. Which categories are kinds: On this view, language helps children sort out which categories are relatively stable and inference-rich, and which categories are more arbitrary or impermanent. For example, generics may tell children that "robbers" are a stable kind, and not simply a group of individuals who engage(d) in a particular behavior.
3. That there are kinds: On this view, language can exaggerate any essentializing tendencies that are already present. For example, an individual (or a culture) that engages in an extensive amount of essentializing talk may foster a higher degree of essentializing.
At present, we have evidence only of content effects, but in future research, it will be important to explore the possibility of effects at the other two levels.
We end by acknowledging that language is just one of many cues available regarding category structure. There are multiple sources of information for children to consider, including, but not limited to, language, perceptual similarity, functions and behaviors, similarity of context, feature correlations, feature entrenchment, and other factual knowledge. One challenging set of issues concerns how people coordinate these cues as well as the degree of concord vs. competition among these cues in the input to children. To complicate things even further, it may be that some of these cues have different strengths at different points in development. For example, the role of language may be particularly strong early in development, when children have relatively less world knowledge and information regarding specific features to guide their reasoning. We hope that the present studies strengthen the case for examining these questions in greater detail.
ACKNOWLEDGMENTS
Support for this research was provided by NICHD grant HD36043 to Gelman and NICHD grant HD08006 to Heyman and Gelman. The data from Studies 2, 4, and 6 are presented in greater detail in Gelman & Heyman (in press), Gelman & Tardif (1998), and Pappas & Gelman (1998).
REFERENCES
Abelson, R. P., & Kanouse, D. E. (1966). The subjective acceptance of verbal generalizations. In S. Feldman (Ed.), Cognitive consistency: Motivational antecedents and behavioral consequents (pp. 171-197). New York: Academic Press.
Adams, A. K., & Bullock, D. (1986). Apprenticeship in word use: Social convergence processes in learning categorically related nouns. In S. A. Kuczaj & M. D. Barrett (Eds.), The development of word meaning (pp. 155-197). New York: Springer-Verlag.
Atran, S. (1990). Cognitive foundations of natural history: Towards an anthropology of science. Cambridge: Cambridge University Press.
Atran, S. (1995). Causal constraints on categories and categorical constraints on biological reasoning across cultures. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition: A multidisciplinary debate (pp. 205-233). Oxford: Clarendon Press.
Atran, S., Estin, P., Coley, J., & Medin, D. (1997). Generic species and basic levels: Essence and appearance in folk biology. Journal of Ethnobiology, 17, 17-43.
Au, T. K. (1988). Language and cognition. In R. L. Schiefelbusch & L. L. Lloyd (Eds.), Language perspectives: Acquisition, retardation, and intervention (2nd ed.). Austin, TX: PRO-ED.
Au, T. K., Dapretto, M., & Song, Y.-K. (1994). Input vs constraints: Early word acquisition in Korean and English. Journal of Memory & Language, 33, 567-582.
Au, T. K., Sidle, A. L., & Rollins, K. B. (1993). Developing an intuitive understanding of conservation and contamination: Invisible particles as a plausible mechanism. Developmental Psychology, 29, 286-299.
Backscheider, A. B., Shatz, M., & Gelman, S. A. (1993). Preschoolers' ability to distinguish living kinds as a function of regrowth. Child Development, 64, 1242-1257.
Balaban, M. T., & Waxman, S. R. (1997). Do words facilitate object categorization in 9-month-old infants? Journal of Experimental Child Psychology, 64, 3-26.
Baldwin, D. A., & Markman, E. M. (1989). Establishing word-object relations: A first step. Child Development, 60, 381-398.
Baldwin, D. A., Markman, E. M., & Melartin, R. L. (1993). Infants' ability to draw inferences about nonobvious object properties: Evidence from exploratory play. Child Development, 64, 711-728.
Barsalou, L. W. (1991). Deriving categories to achieve goals. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 27, pp. 1-64). New York: Academic Press.
Bartsch, K., & Wellman, H. M. (1995). Children talk about the mind. Cambridge: Oxford University Press.
Berlin, B. (1992). Ethnobiological classification: Principles of categorization of plants and animals in traditional societies. Princeton, NJ: Princeton University Press.
Bloom, A. H. (1981). The linguistic shaping of thought. Hillsdale, NJ: Erlbaum.
Bloom, L. (1970). Language development: Form and function in emerging grammars. Cambridge, MA: MIT Press.
Bloom, P. (1990). Syntactic distinctions in child language. Journal of Child Language, 17, 343-355.
Bloom, P. (1996). Intention, history, and artifact concepts. Cognition, 60, 1-29.
Brooks, P. J., & Braine, M. D. S. (1996). What do children know about the universal quantifiers all and each? Cognition, 60, 235-268.
Brown, R. (1957). Linguistic determinism and the part of speech. The Journal of Abnormal and Social Psychology, 55, 1-5.
Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.
Brown, R., & Fish, D. (1983). The psychological causality implicit in language. Cognition, 14, 237-273.
Callanan, M. (1989). Development of object categories and inclusion relations: Preschoolers' hypotheses about word meanings. Child Development, 56, 508-523.
Callanan, M. A. (1990). Parents' descriptions of objects: Potential data for children's inferences about category principles. Cognitive Development, 5, 101-122.
Carey, S. (1995). On the origins of causal understanding. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition: A multidisciplinary debate (pp. 268-302). Oxford: Clarendon Press.
Carlson, G. N. (1977). A unified analysis of the English bare plural. Linguistics and Philosophy, 1, 413-457.
Carlson, G. N., & Pelletier, F. J. (1995). The generic book. Chicago: Chicago University Press.
Choi, S., & Gopnik, A. (1995). Early acquisition of verbs in Korean: A cross-linguistic study. Journal of Child Language, 22, 497-529.
Choi, I., Nisbett, R. E., & Smith, E. E. (1997). Culture, category salience, and inductive reasoning. Cognition, 65, 15-32.
Croft, W. (1990). Typology and universals. New York: Cambridge University Press.
Dahl, O. (1975). On generics. In E. L. Keenan (Ed.), Formal semantics of natural language (pp. 99-111). Cambridge University Press.
Darley, J. M., & Fazio, R. H. (1980). Expectancy-confirmation processes arising in the social interaction sequence. American Psychologist, 35, 867-881.
Davidson, N. S., & Gelman, S. A. (1990). Inductions from novel categories: The role of language and conceptual structure. Cognitive Development, 5, 151-176.
DeVries, R. (1969). Constancy of generic identity in the years three to six. Society for Research in Child Development Monographs, 34 (No. 127).
Diesendruck, G., Gelman, S. A., & Lebowitz, K. (1998). Conceptual and linguistic biases in children's word learning. Developmental Psychology, 34, 823-839.
Diesendruck, G., & Shatz, M. (1997). The effect of perceptual similarity and linguistic input on children's acquisition of object labels. Journal of Child Language, 24, 695-717.
Donaldson, M., & McGarrigle, J. (1974). Some clues to the nature of semantic development. Journal of Child Language, 1, 185-194.
Fiedler, K., Semin, G. R., & Bolten, S. (1989). Language use and reification of social information: Top-down and bottom-up processing in person cognition. European Journal of Social Psychology, 19, 271-295.
Fiske, S. T., & Neuberg, S. L. (1990). A continuum of impression formation, from category-based to individuating processes: Influences of information and motivation on attention and interpretation. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 23, pp. 1-74). New York: Academic Press.
Flavell, J. H., Flavell, E. R., & Green, F. L. (1983). Development of the appearance-reality distinction. Cognitive Psychology, 15, 95-120.
Gelman, R. (1990). First principles organize attention to and learning about relevant data: Number and the animate-inanimate distinction as examples. Cognitive Science, 14, 79-106.
Gelman, R., Durgin, F., & Kaufman, L. (1995). Distinguishing between animates and inanimates: Not by motion alone. In D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal cognition: A multidisciplinary debate (pp. 150-184). Oxford: Clarendon Press.
Gelman, R., Spelke, E., & Meck, E. (1983). What preschoolers know about animate and inanimate objects. In D. Rogers & J. Sloboda (Eds.), The acquisition of symbolic skills. New York: Plenum.
Gelman, S. A. (1988). The development of induction within natural kind and artifact categories. Cognitive Psychology, 20, 65-95.
Gelman, S. A., & Coley, J. D. (1990). The importance of knowing a dodo is a bird: Categories and inferences in 2-year-old children. Developmental Psychology, 26, 796-804.
Gelman, S. A., Coley, J. D., & Gottfried, G. M. (1994). Essentialist beliefs in children: The acquisition of concepts and theories. In L. Hirschfeld & S. Gelman (Eds.), Mapping the mind: Domain-specificity in cognition and culture (pp. 341-365). New York: Cambridge University Press.
Gelman, S. A., Coley, J. D., Rosengren, K., Hartman, E., & Pappas, T. (1998). Beyond labeling: The role of maternal input in the acquisition of richly-structured categories. Monographs of the Society for Research in Child Development. Serial No. 253, Vol. 63, No. 1.
Gelman, S. A., Collman, P., & Maccoby, E. E. (1986). Inferring properties from categories versus inferring categories from properties: The case of gender. Child Development, 57, 396-404.
Gelman, S. A., & Diesendruck, G. (1999). What's in a concept? Context, variability, and psychological essentialism. In I. E. Sigel (Ed.), Development of mental representation: Theories and applications (pp. 87-111). Mahwah, NJ: Erlbaum.
Gelman, S. A., Flukes, S., & Rodriguez, T. (1999). Children's talk about generic categories: A longitudinal analysis. Unpublished raw data.
Gelman, S. A., & Gottfried, G. (1996). Causal explanations of animate and inanimate motion. Child Development, 67, 1970-1987.
Gelman, S. A., & Heyman, D. G. (in press). Carrot-eaters and creature-believers: The effects of lexicalization on children's inferences about social categories. Psychological Science.
Gelman, S. A., & Kremer, K. E. (1991). Understanding natural cause: Children's explanations of how objects and their properties originate. Child Development, 62, 396-414.
Gelman, S. A., & Markman, E. M. (1986). Categories and induction in young children. Cognition, 23, 183-209.
Gelman, S. A., & Markman, E. M. (1987). Young children's inductions from natural kinds: The role of categories and appearances. Child Development, 58, 1532-1541.
Gelman, S. A., & Medin, D. (1993). What's so essential about essentialism? A different perspective on the interaction of perception, language, and conceptual knowledge. Cognitive Development, 5, 157-168.
Gelman, S. A., & O'Reilly, A. W. (1988). Children's inductive inferences within superordinate categories: The role of language and category structure. Child Development, 59, 876-887.
Gelman, S. A., & Tardif, T. Z. (1998). Generic noun phrases in English and Mandarin: An examination of child-directed speech. Cognition, 66, 215-248.
Gelman, S. A., & Wellman, H. M. (1991). Insides and essences: Early understandings of the non-obvious. Cognition, 38, 213-244.
Gentner, D. (1982). Why nouns are learned before verbs: Linguistic relativity vs. natural partitioning. In S. A. Kuczaj II (Ed.), Language development: Syntax and semantics. Hillsdale, NJ: Erlbaum.
Ghiselin, M. T. (1969). The triumph of the Darwinian method. Chicago: University of Chicago Press.
Goldin-Meadow, S., & Mylander, C. (1990). Beyond the input given: The child's role in the acquisition of language. Language, 66, 323-55.
Gopnik, A., & Choi, S. (1990). Do linguistic differences lead to cognitive differences? A cross-linguistic study of semantic and cognitive development. First Language, 10, 199-215.
Gottfried, G. M., & Tonks, S. J. M. (1996). Specifying the relation between novel and known: Input affects the acquisition of novel color terms. Child Development, 67, 850-866.
Gutheil, G., & Rosengren, K. S. (1996). A rose by any other name: Preschoolers' concept of identity across name and appearance changes. British Journal of Developmental Psychology, 14, 477-498.
Hall, D. G. (1994). Semantic constraints on word learning: Proper names and adjectives. Child Development, 65, 1299-1317.
Hall, D. G., & Moore, C. E. (1997). Red bluebirds and black greenflies: Preschoolers' understanding of the semantics of adjectives and count nouns. Journal of Experimental Child Psychology, 67, 236-267.
Hall, D. G., Waxman, S. R., & Hurwitz, W. M. (1993). How two- and four-year-old children interpret adjectives and count nouns. Child Development, 64, 1651-1664.
Hamilton, D., Sherman, S. J., & Ruvolo, C. M. (1990). Stereotype-based expectancies: Effects on information-processing and social behavior. Journal of Social Issues, 46, 35-60.
Hayne, H., Rovee-Collier, C., & Perris, E. E. (1987). Categorization and memory retrieval by three-month-olds. Child Development, 58, 750-767.
Hickling, A. K., & Wellman, H. M. (1998). The emergence of everyday causal explanation in foundational knowledge domains. Unpublished ms., University of North Carolina-Greensboro.
Hirschfeld, L. A. (1995). Do children have a theory of race? Cognition, 54, 209-252.
Hirschfeld, L. A. (1996). Race in the making. Cambridge, MA: MIT Press.
Hirschfeld, L. A., & Gelman, S. A. (1997). What young children think about the relation between language variation and social difference. Cognitive Development, 12, 213-238.
Hollander, M., & Gelman, S. A. (1999a). Natural language analyses of parent-child conversations about kinds and quantification. Unpublished raw data.
Hollander, M., & Gelman, S. A. (1999b). Semantic interpretations of generics, all, and some. Unpublished raw data.
Inhelder, B., & Piaget, J. (1964). The early growth of logic in the child. New York: Norton.
Jackendoff, R. (1996). Semantics and cognition. In S. Lappin (Ed.), The handbook of contemporary semantic theory (pp. 539-559). Cambridge, MA: Blackwell.
James, W. (1890). The principles of psychology (Vol. 2). New York: Dover.
Johnson, C. N., & Wellman, H. M. (1982). Children's developing conceptions of the mind and brain. Child Development, 53, 222-234.
Jones, S., & Smith, L. B. (1993). The place of perception in children's concepts. Cognitive Development, 8, 113-140.
Jussim, L., Nelson, T. E., Manis, M., & Soffin, S. (1995). Prejudice, stereotypes, and labeling effects: Sources of bias in person perception. Journal of Personality and Social Psychology, 68, 228-246.
Kalish, C. W. (1996). Preschoolers' understanding of germs as invisible mechanisms. Cognitive Development, 11, 83-106.
Kanouse, D. E. (1987). Language, labeling, and attribution. In E. E. Jones, D. E. Kanouse, et al. (Eds.), Attribution: Perceiving the causes of behavior (pp. 121-135). Hillsdale, NJ: Erlbaum.
Kanouse, D. E., & Abelson, R. P. (1967). Language variables affecting the persuasiveness of simple communications. Journal of Personality & Social Psychology, 7, 158-163.
Katz, N., Baker, E., & Macnamara, J. (1974). What's in a name? A study of how children learn common and proper names. Child Development, 45, 469-473.
Keil, F. C. (1989). Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press.
Kohlberg, L. (1966). A cognitive-developmental analysis of children's sex-role concepts and attitudes. In E. E. Maccoby (Ed.), The development of sex differences. Stanford, CA: Stanford University Press.
Krifka, M. (1995). Common nouns: A contrastive analysis of Chinese and English. In G. N. Carlson & F. J. Pelletier (Eds.), The generic book (pp. 398-411). Chicago: Chicago University Press.
Kuczaj, S. (1976). -ing, -s and -ed: A study of the acquisition of certain verb inflections. Unpublished doctoral dissertation, University of Minnesota.
Lawler, J. M. (1973). Tracking the generic toad. Papers from the Ninth Regional Meeting of the Chicago Linguistic Society (pp. 320-331). Chicago: Chicago Linguistic Society.
Locke, J. (1894/1959). An essay concerning human understanding, Vol. 2. New York: Dover.
Lyons, J. (1977). Semantics: Vol. 1. Cambridge, MA: Cambridge University Press.
Macnamara, J. (1986). A border dispute. Cambridge, MA: MIT Press.
MacWhinney, B., & Snow, C. (1985). The child language data exchange system. Journal of Child Language, 12, 271-295.
MacWhinney, B., & Snow, C. (1990). The Child Language Data Exchange System: An update. Journal of Child Language, 17, 457-472.
Mahalingam, R. (1998). Essentialism, power and representation of caste: A developmental study. Ph.D. Dissertation, University of Pittsburgh.
Malt, B. C. (1994). Water is not H2O. Cognitive Psychology, 27, 41-70.
Marcus, G. F., Pinker, S., Ullman, M., Hollander, M., Rosen, T. J., & Xu, F. (1992). Overregularization in language acquisition. Monographs of the Society for Research in Child Development. Serial No. 228, Vol. 57, No. 4.
Markman, E. M. (1989). Categorization and naming in children. Cambridge, MA: MIT Press.
Markman, E. M., & Hutchinson, J. E. (1984). Children's sensitivity to constraints on word meaning: Taxonomic versus thematic relations. Cognitive Psychology, 16, 1-27.
Massey, C., & Gelman, R. (1988). Preschoolers' ability to decide whether a photographed unfamiliar object can move itself. Developmental Psychology, 24, 307-317.
Mayr, E. (1988). Toward a new philosophy of biology: Observations of an evolutionist. Cambridge, MA: Harvard University Press.
Mayr, E. (1991). One long argument: Charles Darwin and the genesis of modern evolutionary thought. Cambridge, MA: Harvard University Press.
McCawley, J. D. (1981). Everything that linguists have always wanted to know about logic. Chicago: University of Chicago Press.
Medin, D. L. (1989). Concepts and conceptual structure. American Psychologist, 44, 1469-1481.
Medin, D., & Ortony, A. (1989). Comments on Part I: Psychological essentialism. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 179-195). Cambridge: Cambridge University Press.
Mehler, J., & Fox, R. (Eds.) (1985). Neonate cognition. Hillsdale, NJ: Erlbaum.
Milich, R., McAninch, C. B., & Harris, M. J. (1992). Effects of stigmatizing information on children's peer relations: Believing is seeing. School Psychology Review, 21, 400-409.
Mill, J. S. (1843). A system of logic, ratiocinative and inductive. London: Longman Group.
Miller, D. T., & Turnbull, W. (1986). Expectancies and interpersonal processes. In M. R. Rosenzweig & L. W. Porter (Eds.), Annual review of psychology (Vol. 37, pp. 233-256). Palo Alto, CA: Annual Reviews.
Morford, J. P., & Goldin-Meadow, S. (1997). From here and now to there and then: The development of displaced reference in homesign and English. Child Development, 68, 420-435.
Moser, D. J. (1996). Abstract thinking and thought in ancient Chinese and early Greek. Unpublished doctoral dissertation, University of Michigan, Ann Arbor.
Murphy, J. W. (1990). Giftedness as a limited episteme: A postmodern exposition. Early Child Development and Care, 63, 153-160.
Osherson, D. N., Smith, E. E., Wilkie, O., Lopez, A., et al. (1990). Category-based induction. Psychological Review, 97, 185-200.
Pappas, A., & Gelman, S. A. (1998). Generic noun phrases in mother-child conversations. Journal of Child Language, 25, 19-33.
Parker, D. S. (1998). The idea of the middle class: White-collar workers and Peruvian society, 1900-1950. University Park, PA: Pennsylvania State University Press.
Pinker, S. (1994). The language instinct. New York: W. Morrow.
Rips, L. J. (1975). Inductive judgments about natural categories. Journal of Verbal Learning & Verbal Behavior, 14, 665-681.
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382-439.
Rosen, A. B., & Rozin, P. (1993). Now you see it, now you don't: The preschool child's conception of invisible particles in the context of dissolving. Developmental Psychology, 29, 300-311.
Rosenfield, S. (1997). Labeling mental illness: The effects of received services and perceived stigma on life satisfaction. American Sociological Review, 62, 660-672.
Rosengren, K. S., Gelman, S. A., Kalish, C. W., & McCormick, M. (1991). As time goes by: Children's early understanding of growth in animals. Child Development, 62, 1302-1320.
Rosenthal, R., & Jacobson, L. (1968). Pygmalion in the classroom: Teacher expectation and pupils' intellectual development. New York: Holt, Rinehart, and Winston.
Rothbart, M., & Taylor, M. (1992). Category labels and social reality: Do we view social categories as natural kinds? In G. R. Semin & K. Fiedler (Eds.), Language, interaction, and social cognition (pp. 11-36). London: Sage Publications.
Sachs, J. (1983). Talking about the there and then: The emergence of displaced reference in parent-child discourse. In K. E. Nelson (Ed.), Children's language, Vol. 4. Hillsdale, NJ: Erlbaum.
Schwartz, S. P. (Ed.) (1977). Naming, necessity, and natural kinds. Ithaca, NY: Cornell University Press.
Schwartz, S. P. (1979). Natural kind terms. Cognition, 7, 301-315.
Semin, G. R., & Fiedler, K. (1988). The cognitive functions of linguistic categories in describing persons: Social cognition and language. Journal of Personality and Social Psychology, 54, 558-568.
Shipley, E. F. (1989). Two kinds of hierarchies: Class inclusion hierarchies and kind hierarchies. Genetic Epistemologist, 17, 31-39.
Shipley, E. F. (1993). Categories, hierarchies, and induction. In D. Medin (Ed.), The psychology of learning and motivation (Vol. 30, pp. 265-301). New York: Academic Press.
Siegal, M. (1988). Children's knowledge of contagion and contamination as causes of illness. Child Development, 59, 1353-1359.
Siegal, M., & Robinson, J. (1987). Order effects in children's gender-constancy responses. Developmental Psychology, 23, 283-286.
Simons, D. J., & Keil, F. C. (1995). An abstract to concrete shift in the development of biological thought: The insides story. Cognition, 56, 129-163.
Smith, C. L. (1979). Children's understanding of natural language hierarchies. Journal of Experimental Child Psychology, 27, 437-458.
Smith, C. L. (1980). Quantifiers and question answering in young children. Journal of Experimental Child Psychology, 30, 191-205.
Solomon, G. E. A., Johnson, S. C., Zaitchik, D., & Carey, S. (1996). Like father, like son: Young children's understanding of how and why offspring resemble their parents. Child Development, 67, 151-171.
Springer, K. (1992). Children's awareness of the biological implications of kinship. Child Development, 63, 950-959.
Springer, K. (1996). Young children's understanding of a biological basis for parent-offspring relations. Child Development, 67, 2841-2856.
Springer, K., & Keil, F. (1989). On the development of biologically specific beliefs: The case of inheritance. Child Development, 60, 637-648.
Star, J., & Gelman, S. A. (1999). The effects of generic noun phrases in inductive inferences. Unpublished raw data.
Tardif, T. (1996). Nouns are not always learned before verbs: Evidence from Mandarin speakers' early vocabularies. Developmental Psychology, 32, 492-504.
Tardif, T. Z., Gelman, S. A., & Xu, F. (1999). Putting the 'noun bias' in context: A comparison of Mandarin and English. Child Development, 70, 620-635.
Taylor, M. (1996). The development of children's beliefs about social and biological aspects of gender differences. Child Development, 67, 1555-71.
Thompson, E. P. (1963). The making of the English working class. New York: Pantheon Books.
Vendler, Z. (1967). Linguistics in philosophy. Ithaca, NY: Cornell University Press.
Waxman, S. R., & Balaban, M. T. (1997). Do words facilitate object categorization in 9-month-old infants? Journal of Experimental Child Psychology, 64, 3-26.
Waxman, S. R., & Hall, D. G. (1993). The development of a linkage between count nouns and object categories: Evidence from fifteen- to twenty-one-month-old infants. Child Development, 64, 1224-1241.
Waxman, S. R., & Markow, D. B. (1995). Words as invitations to form categories: Evidence from 12- to 13-month-old infants. Cognitive Psychology, 29, 257-302.
Waxman, S. R., Shipley, E. F., & Shepperson, B. (1991). Establishing new subcategories: The role of category labels and existing knowledge. Child Development, 62, 127-138.
Wellman, H. M. (1990). The child's theory of mind. Cambridge: MIT Press, A Bradford Book.
Wierzbicka, A. (1994). The universality of taxonomic categorization and the indispensability of the concept 'kind.' Rivista di Linguistica, 6, 347-364.
Wood, M., & Valdez-Menchaca, M. C. (1996). The effect of a diagnostic label of language delay on adults' perceptions of preschool children. Journal of Learning Disabilities, 29, 582-588.
Yamauchi, T., & Markman, A. (1998). Category learning by inference and classification. Journal of Memory and Language, 39, 124-148.
INDEX
A
Categorization artifacts, 210-211, 240 background knowledge input functions, 166-168 selection function, 163-171, 173 benefits, 163 as cognitive process, 22-23 cross-cultural variation, 202-203 definition, 204-205 experiments buildings, 175-179 discussion, 180-183 vehicles, 179 function, 201 goals, 99 inductive inferences, 201-202 kinds conceptualization, 212-213 content, 255-256 conversational use, 213-217 definition, 204-205 evidence, 205-207 generics and, 230 lexicalization, 208, 217-219 mechanisms, 212 knowledge-first, 175 language devices, 208-212 generic noun phrases Chinese speakers, 233-236 conceptual distinctions, 239-244 domain specificity, 238-239 frequency, 230-233, 237-238 function, 228-230
Aging amnesia and, 12 Korsakoff syndrome, 12 memory depression, comparison, 61 prospective, 61 Amnesia age-associated, 12 infantile, 40 Anthropometry, 74-75 Artifacts, category, 210-211, 240 Artificial intelligence, 134 Associations category-to-goal, 99 feature-to-goal, 99, 125 Attention to target, 9-10 Attributes, choice and, 104-106 Avoidance goals, 101-103
B
Bayesian statistics, 173 Baywatch model description, 186-187 evaluation, 191-196 simulations, 188-191 technical details, 187-188 Biases, anthropomorphic, 74-75
C
Car choice study, 104-106 CARIN model, see Competition among relations in nominals model
Categorization (continued) inductive inferences, 248-251 longitudinal study, 236-244 semantic interpretation, 245-248 kind usage, 212-217 lexicalization application, 217-219 cognition, 219-220 comprehension, 220-221 effects, 217-219 research issues, 224 quantifiers logical, 225 universal, 226-228 role, fundamental, 255-256 studies conclusions, 251-253 universality, 254-255 and language, relation, 202 learning difficulty, 164 input functions, 171-172 memory correlated attributes, 29-32 deferred imitation tasks, 34-35 description, 23-26 exemplar similarity, 26-28 procedures, 32-33 serial-probe recognition, 35-39 structure detection, 39-40 time windows, 28-29 observations, 164-166 properties, nonobvious, 205-206 social effects, 218-219 CHILDES database, 226-228 Children, see also Infants comprehension, 211 essentialist reasoning, 202 generic noun phrases Chinese speakers, 233-236 conceptual distinctions, 239-244 domain specificity, 238-239 frequency, 230-233, 237-238 inductive inferences, 248-251 longitudinal study, 236-244 semantic interpretation, 245-248 inference drawing, 205 kinds expression, 212-213 generation, 204 lexicalization comprehension, 220-221
research issues, 224 social effects, 221-224 nonobvious properties, 205 quantifiers logical, 225 universal, 226-228 structures categories, 211 linguistic, 209 Chinese, generic use, 233-236 Choices aspects, 99 predicting, 111-112 processing goals, 103-106 trials, 77 Christmas clubs, 123 Cigarette smoking deprivation study, 100 need study, 110-111 Cognition categorization as, 22-23 copying machine metaphor, 130-132 as cybernetic system, 98-99 essentialism and, 207-208 lexicalization, 219-220 Compatibility goals self-regulation, 122-124 and valued, 106-108 objects goals, 120-122 values, 120-122 Competition among relations in nominals model, 132 Conceptualizations combinations alignment, 149-150 kinds, 210 knowledge construction, 151-155 models, 131 knowledge, 129-130 noun phrases, 239-244 primitives, 132 Construction, knowledge conceptual combination, 151-155 induction, 150-151 metaphors, 155-157 Constructive integrative processing concerns, 157-159
mechanisms, 145-150 principles, 143-145 relational interpretations, 154-155 representation, 137-143 theory, 135-137 Contexts definition, 13 initiative, 47-48 memory infant, 14-18 processing time, 18-19 novel, 19 Conversations kind use, 213-217 noun use, 239-244 Copying machine metaphor in cognitive psychology, 130-132 integrative/constructive processing vs., 135-137 prevalence, 132-135 Cues definition, 13 memory effects, 14-16 processing time, 18-19 novel, 19 Currency coin size, 99-100 standardized, 119-120
D
Data-driven processing, 171 Deferred imitation tasks, 34-35 Delayed-matching-to-sample, 73-74 Delayed recognition, 2-3 Depression control function beneficial, 52-61 disruption, 49-51 irrelevancy, 51-52 metaphors, 65-68 motivation role, 61-65 Development cognitive, 207-208 memory age-related retention, 4-11 retrieval speed, 11-12
contextual specificity, 16 cue specificity, 14 multiple systems, 12-13 Discrimination complex, 78 relational, 76 Distance conceptualization, 113-114 Distortion, memory, 19, 21-22 DMTS, see Delayed-matching-to-sample Duration, ordinal comparisons, 77-81
E
Economic research car choice study, 104-106 cigarette smoking deprivation study, 100 need study, 110-111 coin size study, 99-100 electric guitar study, 103-104 foundations, 97-98 gambling study, 120-121 jacket study, 118-119, 122-124 lottery study, 108-110 Electric guitar study, 103-104 Emotions, recognizing, 57-60 Essences definition, 204-205 evidence, 205-207 Essentialism development, 236-237 influence, 218 psychological cognition and, 207-208 description, 206-207 reasoning, 202
F
Feature-to-goal associations, 99, 125 Feedforward network, see Baywatch model Formalism, representational, 133-134 Free recall, 53-54
G
Gambling study, 120-121 Generalizations, 229 Gift money, 119-120 Goals activation cycles, 107
Goals (continued) avoidance, 101-103 choice processing, 103-106 feature-to, associations, 125 gradient study, 113-115 measures, 100 and objects, relations, 98-108 prevention, 101-103 promotion, 101-103 utility, 98 and values activation level, 108-113 segregation, 117-120 temporal aspects, 113-117 compatibility, 106-108 object, 120-122 self-regulation, 122-124
H
House-building metaphor, 130
I
Induction model, 131 Infants, see also Children amnesia, 40 categorization skills correlated attributes, 29-32 description, 23-26 exemplar similarity, 26-28 lists deferred imitation tasks, 34-35 procedures, 32-33 serial-probe recognition, 35-39 structure detection, 39-40 time windows, 28-29 memory context distortion, 19, 21-22 effects, 14-18 processing, 18-19 cues
distortion, 19, 21-22 effects, 14, 16-18 processing, 18-19 ontogeny multiple systems, 12-13 retention, 4-11 retrieval speed, 11-12
reminders, 6, 8 studies history, 1 procedures, 2-3 Inferences drawing, 205 inductive, 248-251 Information processing, 103-106 Initiation context, 47-48 Integration model knowledge selection, 168, 172 mechanisms, 145-150 principles, 143-145 Integrative constructive processing concerns, 157-159 mechanisms, 145-150 principles, 143-145 relational interpretations, 154-155 representation, 137-143 theory, 135-137 Interference, retroactive, 19 Interpretations generics, 245-248 relational, 154-155
J
Jacket study, 118-119, 122-124 Jacksonian principle, 13 Judgments automatic influences, 54 transfer, 166-168
K
Kinds conceptualization, 212-213 content, 255-256 conversational use, 213-217 definition, 204-205 evidence, 205-207 generics and, 230 lexicalization, 208, 217-219 mechanisms, 212 Knowledge conceptualizations, 129-130 construction conceptual combination, 151-155 induction, 150-151 metaphors, 155-157 -driven processing, 171
first categorization, 175 input functions, 166-168, 171-172 integration mechanisms, 145-150 principles, 143-145 selection Baywatch model description, 186-187 evaluation, 191-196 simulations, 188-191 technical details, 187-188 experiments buildings, 175-179 discussion, 180-183 vehicles, 179 function, 163-171, 173 neural networks, 183-185 Korsakoff syndrome, 12
L
Labeling, see Lexicalization Language categorization devices, 208-212 formation, 202 generic noun phrases Chinese speakers, 233-236 conceptual distinctions, 239-244 domain specificity, 238-239 frequency, 230-233, 237-238 function, 228-230 inductive inferences, 248-251 longitudinal study, 236-244 semantic interpretation, 245-248 quantifiers logical, 225 universal, 226-228 role, fundamental, 255-256 studies conclusions, 251-253 universality, 254-255 expressive functions, 203 formalism, 133-134 kinds conceptualization, 212-213 conversational use, 213-217 definition, 204-205 evidence, 205-207 generics and, 230 lexicalization, 208, 217-219
mechanisms, 212 usage, 212-217 lexicalization application, 217-219 cognition, 219-220 comprehension, 220-221 research issues, 224 social effects, 221-224 word meaning, 219 Learning background knowledge input functions, 166-168, 171-172 selection function, 163-171, 173 experiments buildings, 175-179 discussion, 180-183 vehicles, 179 observations, 164-166 Levels-of-processing effect, 10 Lexicalization application, 217-219 cognition, 219-220 comprehension, 220-221 research issues, 224 social effects, 221-224 Lists deferred imitation tasks, 34-35 procedures, 32-33 serial-probe recognition, 35-39 structure detection, 39-40 Location, spatial, 147-148 Logical quantifiers, 225 LOP, see Levels-of-processing effect Lottery study, 108-110 M
Memory categorization correlated attributes, 29-32 description, 23-26 exemplar similarity, 26-28 lists deferred imitation tasks, 34-35 procedures, 32-33 serial-probe recognition, 35-39 structure detection, 39-40 time windows, 28-29 changes, age-related, 1-2
context distortion, 19, 21-22 effects, 14-18 processing, 18-19 cues distortion, 19, 21-22 effects, 14, 16-18 processing, 18-19 delayed recognition, 2-3 depression control function beneficial, 52-61 disruption, 49-51 irrelevancy, 51-52 metaphors, 61-65 motivation role, 61-65 ontogeny age-related retention, 4-11 retrieval speed, 11-12 multiple systems, 12-13 prospective, 60-61 reactive paradigm, 2-3 reminders, 8, 14 studies history, 1 procedures, 2-3 Metaphors capacity-based, 66-67 copying machine in cognitive psychology, 130-132 integrative/constructive processing vs., 135-137 prevalence, 132-135 house-building, 130 knowledge constructs, 155-157 Models, see specific models Money, gift, 119-120 Motivation, memory impairments, 61-65
N Negative time-order error, 77-78 Neural networks models, 183-185 Nouns generic phrases Chinese speakers, 233-236 conceptual distinctions, 239-244 domain specificity, 238-239 frequency, 230-233, 237-238
inductive inferences, 248-251 longitudinal study, 236-244 semantic interpretation, 245-248 head, 153-154 -noun combinations, 138-141
O Objects compatibility goals, 120-122 values, 120-122 and goals, relations, 98-103, 107 representation, 138 Observations frequency, 163 new concepts, 164-166 select, 168-171 Orientation, spatial, 148
P Prediction, choice, 111-112 Prevention goals, 101-103 Procedures, see Tasks Promotion goals, 101-103 Property, naming, 138 Prospective memory, 60-61 Psychological essentialism cognition and, 207-208 description, 206-207
Q Quantifiers logical, 225 universal, 226-228
R Reasoning, essentialist, 202 Recognition delayed, 2-3 emotional, 57-60 judgments, 54 serial-probe, 35-39 tests, 54-57 Reflection, self-initiated, 50 Relations interpretations, 154-155
time, species comparisons extraexperimental experience, 86-93 general method, 75-77 instruction role, 86-93 ordinal, 77-81 ratio, 82-84 Representation knowledge, 145-146 local, 137-143 Representational formalism, 133-134 Retention cue effects, 14-16 duration, 4-11 Retrieval speed, 11-12 Retroactive interference, 19
S Schedule-related tasks, 76 Segregation values, backfiring, 123-124 values, description, 117-120 Self-regulation, 122-124 Serial-probe recognition tasks, 35-39 Smoking, see Cigarette smoking Social effects categorization, 218-219 lexicalization, 221-224 Spontaneous analogical transfer, 49 Structure detection, 39-40 Systematicity, 156
T Tasks 2AFC description, 75-77 species comparisons ordinal, 77-81 ratio, 82-84 deferred imitation, 34-35 DMTS, 73-74
schedule-related, 76 serial-probe recognition, 35-39 Time and goals, 113-117 integration process, 168 processing, 18-19 relational, species comparison extraexperimental experience, 86-93 general method, 75-77 instruction role, 86-93 ordinal differences, 77-81 ratio differences, 82-84 windows, 28-29 Transfer judgments, 166-168 spontaneous analogical, 49 Two-alternative forced-choice descriptions, 75-77 species comparisons ordinal differences, 77-81 ratio differences, 82-84
U Universal quantifiers, 226-228 Utility function, 98
V Values activation level, 108-113 segregation, 117-120 temporal aspects, 113-117 goal compatibility framework, 106-108 object, 120-122 self-regulation, 122-124
W Weber's law, 76
CONTENTS OF RECENT VOLUMES
Volume 29
Concept Structure and Category Boundaries Barbara C. Malt Non-Predicating Conceptual Combinations Edward J. Shoben Exploring Information about Concepts by Asking Questions Arthur C. Graesser, Mark C. Langston, and William B. Baggett Hidden Kind Classifications Edward Wilson Averill Is Cognition Categorization? Timothy J. van Gelder What Are Concepts? Issues of Representation and Ontology William F. Brewer
Introduction: A Coupling of Disciplines in Categorization Research Roman Taraban Models of Categorization and Category Learning W. K. Estes Three Principles for Models of Category Learning John K. Kruschke Exemplar Models and Weighted Cue Models in Category Learning Roman Taraban and Joaquin Marcos Palacios The Acquisition of Categories Marked by Multiple Probabilistic Cues Janet L. McDonald The Evolution of a Case-Based Computational Approach to Knowledge Representation, Classification, and Learning Ray Bareiss and Brian M. Slator Integrating Theory and Data in Category Learning Raymond J. Mooney Categorization, Concept Learning, and Problem-Solving: A Unifying View Douglas Fisher and Jungsoon Park Yoo Processing Biases, Knowledge, and Context in Category Formation Thomas B. Ward Categorization and Rule Induction in Clinical Diagnosis and Assessment Gregory H. Mumma A Rational Theory of Concepts Gregory L. Murphy
Index
Volume 30 Perceptual Learning Felice Bedford A Rational-Constructivist Account of Early Learning about Numbers and Objects Rochel Gelman Remembering, Knowing, and Reconstructing the Past Henry L. Roediger III, Mark A. Wheeler, and Suparna Rajaram The Long-Term Retention of Knowledge and Skills Alice F. Healy, Deborah M. Clawson, Danielle S. McNamara, William R. Marmie, Vivian I. Schneider, Timothy C. Rickard, Robert J. Crutcher, Cheri L. King, K. Anders Ericsson, and Lyle E. Bourne, Jr.
A Comprehension-Based Approach to Learning and Understanding Walter Kintsch, Bruce K. Britton, Charles R. Fletcher, Eileen Kintsch, Suzanne M. Mannes, and Mitchell J. Nathan Separating Causal Laws from Causal Facts: Pressing the Limits of Statistical Relevance Patricia W. Cheng Categories, Hierarchies, and Induction Elizabeth F. Shipley Index
Volume 31 Associative Representations of Instrumental Contingencies Ruth M. Colwill A Behavioral Analysis of Concepts: Its Application to Pigeons and Children Edward A. Wasserman and Suzette L. Astley The Child's Representation of Human Groups Lawrence A. Hirschfeld Diagnostic Reasoning and Medical Expertise Vimla L. Patel, José F. Arocha, and David R. Kaufman Object Shape, Object Name, and Object Kind: Representation and Development Barbara Landau The Ontogeny of Part Representation in Object Concepts Philippe G. Schyns and Gregory L. Murphy Index
Volume 32 Cognitive Approaches to Judgment and Decision Making Reid Hastie and Nancy Pennington And Let Us Not Forget Memory: The Role of Memory Processes and Techniques in the Study of Judgment and Choice Elke U. Weber, William M. Goldstein, and Sema Barlas
Content and Discontent: Indications and Implications of Domain Specificity in Preferential Decision Making William M. Goldstein and Elke U. Weber An Information Processing Perspective on Choice John W. Payne, James R. Bettman, Eric J. Johnson, and Mary Frances Luce Algebra and Process in the Modeling of Risky Choice Lola L. Lopes Utility Invariance Despite Labile Preferences Barbara A. Mellers, Elke U. Weber, Lisa D. Ordóñez, and Alan D. J. Cooke Compatibility in Cognition and Decision Eldar Shafir Processing Linguistic Probabilities: General Principles and Empirical Evidence David V. Budescu and Thomas S. Wallsten Compositional Anomalies in the Semantics of Evidence John M. Miyamoto, Richard Gonzalez, and Shihfen Tu Varieties of Confirmation Bias Joshua Klayman Index
Volume 33 Landmark-Based Spatial Memory in the Pigeon Ken Cheng The Acquisition and Structure of Emotional Response Categories Paula M. Niedenthal and Jamin B. Halberstadt Early Symbol Understanding and Use Judy S. DeLoache Mechanisms of Transition: Learning with a Helping Hand Susan Goldin-Meadow and Martha Wagner Alibali The Universal Word Identification Reflex Charles A. Perfetti and Sulan Zhang
Prospective Memory: Progress and Processes Mark A. McDaniel Looking for Transfer and Interference Nancy Pennington and Bob Rehder Index
Volume 34
Associative and Normative Models of Causal Induction: Reacting to versus Understanding Cause A. G. Baker, Robin A. Murphy, and Frédéric Vallée-Tourangeau Knowledge-Based Causal Induction Michael R. Waldmann A Comparative Analysis of Negative Contingency Learning in Humans and Nonhumans Douglas A. Williams Animal Analogues of Causal Judgment Ralph R. Miller and Helena Matute Conditionalizing Causality Barbara A. Spellman Causation and Association Edward A. Wasserman, Shu-Fang Kao, Linda J. Van Hamme, Masayoshi Katagiri, and Michael E. Young Distinguishing Associative and Probabilistic Contrast Theories of Human Contingency Judgment David R. Shanks, Francisco J. Lopez, Richard J. Darby, and Anthony Dickinson A Causal-Power Theory of Focal Sets Patricia W. Cheng, Jooyong Park, Aaron S. Yarlas, and Keith J. Holyoak The Use of Intervening Variables in Causal Learning Jerome R. Busemeyer, Mark A. McDaniel, and Eunhee Byun Structural and Probabilistic Causality Judea Pearl Index
Volume 35
Distance and Location Processes in Memory for the Times of Past Events William J. Friedman
Verbal and Spatial Working Memory in Humans John Jonides, Patricia A. Reuter-Lorenz, Edward E. Smith, Edward Awh, Lisa L. Barnes, Maxwell Drain, Jennifer Glass, Erick J. Lauber, Andrea L. Patalano, and Eric H. Schumacher Memory for Asymmetric Events John T. Wixted and Deirdra H. Dougherty The Maintenance of a Complex Knowledge Base After Seventeen Years Marigold Linton Category Learning As Problem Solving Brian H. Ross Building A Coherent Conception of HIV Transmission: A New Approach to AIDS Education Terry Kit-fong Au and Laura F. Romo Spatial Effects in the Partial Report Paradigm: A Challenge for Theories of Visual Spatial Attention Gordon D. Logan and Claus Bundesen Structural Biases in Concept Learning: Influences from Multiple Functions Dorrit Billman Index
Volume 36
Learning to Bridge Between Perception and Cognition Robert L. Goldstone, Philippe G. Schyns, and Douglas L. Medin The Affordances of Perceptual Inquiry: Pictures Are Learned From the World, and What That Fact Might Mean About Perception Quite Generally Julian Hochberg Perceptual Learning of Alphanumeric-Like Characters Richard M. Shiffrin and Nancy Lightfoot Expertise in Object and Face Recognition James Tanaka and Isabel Gauthier Infant Speech Perception: Processing Characteristics, Representational Units, and the Learning of Words Peter D. Eimas Constraints on the Learning of Spatial Terms: A Computational Investigation Terry Regier
Learning to Talk About the Properties of Objects: A Network Model of the Development of Dimensions Linda B. Smith, Michael Gasser, and Catherine M. Sandhofer Self-Organization, Plasticity, and Low-Level Visual Phenomena in a Laterally Connected Map Model of the Primary Visual Cortex Risto Miikkulainen, James A. Bednar, Yoonsuck Choe, and Joseph Sirosh Perceptual Learning From Cross-Modal Feedback Virginia R. de Sa and Dana H. Ballard Learning As Extraction of Low-Dimensional Representations Shimon Edelman and Nathan Intrator Index
Volume 37 Object-Based Reasoning Miriam Bassok Encoding Spatial Representation Through Nonvisually Guided Locomotion: Tests of Human Path Integration Roberta L. Klatzky, Jack M. Loomis, and Reginald G. Golledge Production, Evaluation, and Preservation of Experiences: Constructive Processing in Remembering and Performance Tasks Bruce W. A. Whittlesea Goals, Representations, and Strategies in a Concept Attainment Task: The EPAM Model Fernand Gobet, Howard Richman, Jim Staszewski, and Herbert A. Simon Attenuating Interference During Comprehension: The Role of Suppression Morton Ann Gernsbacher Cognitive Processes in Counterfactual Thinking About What Might Have Been Ruth M. J. Byrne
Episodic Enhancement of Processing Fluency Michael E. J. Masson and Colin M. MacLeod At a Loss From Words: Verbal Overshadowing of Perceptual Memories Jonathan W. Schooler, Stephen M. Fiore, and Maria A. Brandimonte Index
Volume 38 Transfer-Inappropriate Processing: Negative Priming and Related Phenomena W. Trammell Neill and Katherine M. Mathis Cue Competition in the Absence of Compound Training: Its Relation to Paradigms of Interference Between Outcomes Helena Matute and Oskar Pineño Sooner or Later: The Psychology of Intertemporal Choice Gretchen B. Chapman Strategy Adaptivity and Individual Differences Christian D. Schunn and Lynne M. Reder Going Wild in the Laboratory: Learning About Species Typical Cues Michael Domjan Emotional Memory: The Effects of Stress on "Cool" and "Hot" Memory Systems Janet Metcalfe and W. Jake Jacobs Metacomprehension of Text: Influence of Absolute Confidence Level on Bias and Accuracy Ruth H. Maki Linking Object Categorization and Naming: Early Expectations and the Shaping Role of Language Sandra R. Waxman Index