Cognition, 8 (1980) 369-387 © Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands
What young children think you see when their eyes are closed*

JOHN H. FLAVELL
SUSAN G. SHIPSTEAD
KAREN CROFT
Stanford University

Abstract

The common assumption that young children egocentrically believe you cannot see them when their own eyes are closed was investigated in two studies. It was found that 2.5-4-year-olds, but not 5-year-olds and adults, would indeed often give a negative reply to the experimenter’s question “Do I see you?” when their eyes were closed and covered with their hands. However, they would also correctly reply that the experimenter did see their arm and an object placed in front of them and did not see their eyes and back, indicating that they were making veridical, nonegocentric inferences about the experimenter’s visual experience. In addition, their eyes being visible to the experimenter did not prove to be either a necessary or a sufficient condition for their judgment that the experimenter could see “them” (“you”). It was concluded that, in this context, adults take “you” to mean their whole body while young children take it to mean primarily their face region. Speculations were made as to how young children could have acquired this meaning, and about possible similarities and differences between the self conceptions of young children and adults.
*This research was supported by National Science Foundation Grant no. BNS 76-16830. We wish to express our gratitude to the nursery school children and teachers whose cooperation made these studies possible; to Eleanor Flavell for her assistance in testing subjects; to Eleanor Flavell, Rachel Gelman, and Ellen Markman for their critical reading of the manuscript; and to Barbara Abrahams, Eleanor Flavell, Eleanor Maccoby, Ellen Markman, Sandra Starr, James Speer, and numerous others for their valuable suggestions about tasks and interpretations of results. Portions of this paper were presented at the meeting of the American Psychological Association, Toronto, August 1978. Requests for reprints should be sent to John H. Flavell, Department of Psychology, Stanford University, Stanford, Calif. 94305.

Knowledge concerning visual perception constitutes one form of social or psychological cognition (Shantz, 1975). Flavell and his co-workers have hypothesized that there are at least two developmental levels of such knowledge (Flavell, 1974, 1978; Lempers, Flavell, and Flavell, 1977; Masangkay,
McCluskey, McIntyre, Sims-Knight, Vaughn and Flavell, 1974). At earlier-developing Level 1, the child can nonegocentrically infer what objects another person does and does not see, given adequate cues. At later-developing Level 2, the child further knows that an object simultaneously visible to both the self and the other person may nonetheless elicit different visual impressions or experiences in the two if their viewing circumstances differ (cf., Hughes, 1975).

A recent study by Flavell, Shipstead, and Croft (1978) illustrates how surprisingly nonegocentric and skillful Level 1 children can be at inferring whether an object is or is not visible to another person under various perceptual conditions (see also Hughes, 1975). Children of ages 2.5, 3, and 3.5 years were tested for their understanding of object hiding. Even the youngest subjects nonegocentrically hid an object from another person’s sight by placing it on the opposite side of a screen from that person, even though placing it there necessarily left it unconcealed from themselves. Most of them also correctly recognized that the other person could see the object when the screen was interposed between them and the object (thereby blocking their own view of it), but that the other person could not see it when the screen was interposed between that person and the object. In sum, they did not seem to mistake what they themselves did and did not see for what the other person did and did not see.

Thus, previous research would lead us to expect that children of this age would also do well on the following unusual type of Level 1 task: (a) the child and another person face one another, (b) the child’s eyes are closed and/or covered, (c) the child is told that the other person’s eyes are open and directed at the child’s face, (d) the other person then says, “Do I see you?” The child’s total lack of visual input or experience in this situation should provide an unusually powerful temptation to respond egocentrically.
Consistent with this, there seems to be a popular assumption that young children do often egocentrically assume that others cannot see them when their own eyes are closed. For instance, they are sometimes observed to merely close or cover their eyes rather than conceal their whole body when playing hide-and-seek. On the other hand, if it should turn out that young children respond nonegocentrically rather than egocentrically in this putatively egocentrism-tempting situation, it would suggest that their Level 1 knowledge is very solid indeed. It would also lead us to question what appears to be a folk belief about what young children think you see when their eyes are closed. The major purpose of Study 1 was therefore to test the solidity of 2- and 3-year-olds’ Level 1 knowledge.
Study 1

Method

Subjects

The subjects were 64 children from middle-class nursery schools and kindergartens, plus nine Stanford University students and staff. The age groups were categorized as 2.5 years (mean age 32.9 months, range 30-35 months), 3 years (mean age 39.4 months, range 36-41 months), 3.5 years (mean age 44.7 months, range 42-47 months), 5 years (mean age 63.3 months, range 60-67 months), and adult (mean age 23.0 years, range 21.2-26.0 years). There were eight girls and eight boys in each child group, four women and five men in the adult group.

Procedure

The experimenter and subject sat facing each other across a low table with a Snoopy dog toy on it. The adult subjects were told that the tasks were designed for young children and were therefore very simple. The adults were also told to answer each question quickly, giving their first, “gut-level” reaction; they were not to think before answering, just answer. The tasks described below were presented in random order, with the exception that the task Two Eyes Closed or Covered,¹ administered twice, was always the first (A) and the last (B) task given.

¹To aid discrimination, Study 1 task names will be italicized and Study 2 ones will not.

1. Two Eyes Closed or Covered A

After the child closed or covered both eyes, the experimenter said, “Now your eyes are closed, and my eyes are open.” Then she asked, “Do I see Snoopy?”, and then, “Do I see you?”. If the child indicated that the experimenter did not see him, the experimenter proceeded to ask, “Do I see your head?”, and then again, “Do I see you?”. (A number of children had difficulty keeping their eyes closed and so were asked to cover them with their hands instead. Regrettably, we did not record which or how many children covered rather than closed their eyes.)

2. One Eye Covered

The same procedure (minus the initial statement and the Snoopy question) was repeated while the child covered one eye with his hand.
3. Mouth Covered

The child was asked to cover his mouth with a hand. The experimenter asked, “Do I see your mouth?”, and then, “Do I see you?”.

4. Two Eyes Exposed

The child stepped away from the table and stood behind a long piece of material and looked at the experimenter through a small rectangular slot. The experimenter could thus see only the child’s eyes and the bridge of his nose. The experimenter asked, “Do I see your eyes?”, and then, “Do I see you?”.

5. Turn 180°

The child sat facing away from the experimenter with both eyes open. The experimenter said, “Your eyes are open, and I’m going to keep my eyes open too.” She then asked, “Do I see you?”. If the child responded in the negative, the experimenter followed with, “Do I see your head?”, and “Do I see you?”.

6. Experimenter Eyes Covered

A second experimenter faced the child, closed both eyes, and covered them with her hands. The first experimenter then asked the child, “Do you see ___ (name of second experimenter)?”.

7. Reflective Glasses

The child and experimenter took turns putting on the “special” glasses (silvered ski sunglasses) to show that the wearer of the glasses could see the other but the other could not see the wearer’s eyes. The experimenter verified the child’s understanding of these features of the glasses before posing the questions. The child put on the reflective glasses and was then asked, “Do I see your eyes?”, and “Do I see you?”.

8. Two Eyes Closed or Covered B

Same as 1.
Rationale

A curious and wholly unanticipated pattern of responding was observed during the pilot testing for this study. With only their eyes closed or covered, some young children would say that the experimenter did not see “them” (“you”), just as the popular assumption would predict. However, they would also correctly reply that she did see their head, arm, or other objects in her field of vision. The Two Eyes Closed or Covered task was included to find out how frequently this pattern would be observed in children of different ages.

More generally, the set of seven tasks was designed to identify the visual conditions of observed and observer which influence young children’s judgments about what the observer sees. One possibility is that young children egocentrically assume the other person cannot see anything at all when they themselves cannot. As suggested above, recognizing that others can see when the experience of not seeing anything is filling one’s own field of awareness may require more Level 1 ability to decenter from one’s own perspective than young children possess. Negative answers to all Two Eyes Closed or Covered questions would support this possibility; negative answers to the “you” questions only (the response pattern seen in pilot testing) would clearly rule it out. A second easily tested possibility is that they believe the other cannot see “them” unless both their eyes are open (One Eye Covered). A third is that she cannot see “them” if any important part of the face is concealed from her view, or if they engage in any sort of self-hiding gesture (Mouth Covered). The other tasks, together with Two Eyes Closed or Covered, could provide at least tentative evidence for other possibilities that will be considered below.
Results and discussion

Table 1 shows the percentages of correct answers to each task question in each of the five age groups. Recall that the questions which are most indented in Table 1 were asked only of subjects who had given an incorrect (negative) answer to the “you” question immediately preceding them. We shall first describe and discuss the adult response pattern, then the developmental trends leading to it, and finally the nature and possible meaning of immature patterns.

Table 1. Percentage of correct answers to each question in each age group (correct answers given in parentheses)

Tasks and Questions                        Age (a)
                                           2.5     3       3.5     5       Adults
1. Two Eyes Closed or Covered A
   Snoopy? (yes)                           100     100     100     100     100
   You? (yes)                              37      37      50      87      100
      Your head? (yes) (b)                 80      90      75      100
      You? (yes)                           10      30      12      0
2. One Eye Covered
   You? (yes)                              81      100     94      100     100
3. Mouth Covered
   Your mouth? (no)                        100     94      100     100     100
   You? (yes)                              81      94      100     100     100
4. Two Eyes Exposed
   Your eyes? (yes)                        100     100     100     100     100
   You? (yes) (c)                          81      75      81      31      33
5. Turn 180°
   You? (yes)                              50      31      50      69      100
      Your head? (yes)                     88      91      100     100
      You? (yes)                           25      18      12      20
6. Experimenter Eyes Covered
   Do you see (experimenter)? (yes)        44      50      56      87      100
7. Reflective Glasses
   Your eyes? (no)                         94      100     100     100     100
   You? (yes)                              44      31      62      63      100
8. Two Eyes Closed or Covered B
   Snoopy? (yes)                           100     100     100     100     100
   You? (yes)                              31      44      50      87      100
      Your head? (yes)                     91      89      100     100
      You? (yes)                           9       11      12      0

(a) N = 16 for each child group and 9 for the adult group.
(b) Indented questions were only asked of subjects who had responded incorrectly (i.e., negatively) to the preceding “you” question. The percentages in these rows are thus based on Ns of less than 16 in all cases. For example, 6 of the 16 2.5-year-olds answered the initial “you” question of Task 1 correctly (37%). Of the remaining 10, 8 (80%) correctly answered the subsequent “head” question and 1 (10%) correctly answered the subsequent “you” question.
(c) The “correct” answer to this question is somewhat arbitrarily set as yes here.

Adult pattern

The adults answered all object and body part questions correctly. They also seemed to construe “you” and “___” (experimenter’s name) as referring
to each individual’s physical body taken more or less as a whole. Like body parts and external objects, “you” the body-as-a-whole was apparently experienced as “seen” to the extent that it was unconcealed and visible to the observer: definitely and unambiguously seen when most of it was visible; not seen or less certainly seen when only the eyes (or, for all one knows, any small portion of the body) were exposed to view, as in Two Eyes Exposed.

Developmental trends

The data in Table 1 suggest that there is considerable development towards the adult pattern between three and five years of age. Significant or near-significant decreases in negative answers across this age range were obtained for task 1, χ²(3) = 10.79, p < 0.05, task 6, χ²(3) = 7.51, p < 0.10, and task 8, χ²(3) = 11.29, p < 0.05; the apparent decrease for task 5 is not significant. The age increase in negative answers to the “you” question of the Two Eyes Exposed task was also significant, χ²(3) = 12.69, p < 0.01. While not all 5-year-olds responded like the adults, a good many did: eight responded affirmatively to each of the five “you” questions on tasks 1, 5, 6, 7, and 8 and five more responded affirmatively to four of the five; the corresponding figures for the three younger groups were, from youngest to oldest, 3 and 1, 2 and 0, and 6 and 0.

Immature patterns

There is no suggestion whatever in the data that even the youngest subjects egocentrically assumed that the experimenter could not see anything when they themselves could not see anything. As Table 1 indicates, all subjects said the experimenter saw the Snoopy doll on both administrations of Two Eyes Closed or Covered. In addition, of those children who were asked the “head” question on Tasks 1, 5, and 8 (by virtue of having just said no to the “you” question), the percentages responding correctly were 83%, 94%, and 93% respectively. These affirmative answers are significantly more numerous than would be expected by chance (all are p < 0.001 by Sign Test) and therefore, of course, also far more numerous than would be predicted by any total-inability-to-decenter hypothesis. The data in Table 1 also indicate that almost all the children believed the other could see “them” when only one rather than both of their eyes was covered (One Eye Covered). The possibility that they would say no to the “you” question no matter what facial part was covered was also ruled out by the finding that most of the children also said yes to the “you” question when their mouth was concealed (Mouth Covered).

Only 10 subjects consistently gave incorrect answers to the five “you” questions of tasks 1, 5, 6, 7, and 8. We analyzed children’s patterns of yes and no answers to these “you” questions plus the “you” question of Two Eyes Exposed (task 4) to see if these patterns might at least provide clues about underlying beliefs in this area. We first excluded from these pattern analyses the 18 children who gave correct answers to the task 1, 5, 6, 7, and 8 “you” questions, plus two others who responded incorrectly to several “head” questions and may therefore have had unusual attention or comprehension problems. This left a sample of 44 subjects.

One imaginable childish belief is that you “see me” if and only if I see, i.e., am sighted. A child who believed this should say no to at least one of the two “you” questions where he is unsighted (tasks 1 and 8), but should say yes to all the “you” questions where he is sighted (tasks 4, 5, and 7). A second possible belief is that you “see me” if and only if I see you. A child who believed that should also respond as above, except to say no rather than yes on the task where he is sighted but cannot see the experimenter (task 5). A third possible belief is that you “see me” if and only if you see my eye(s). The response pattern consistent with this belief is a yes on the task where his eyes are visible to the experimenter (task 4) and no on those where they are not (tasks 1 or 8, 5, and 7).

Of the 44 subjects considered, one showed the first pattern, eight the second and 14 the third. Moreover, 11 of the latter 14 also said they could not see the experimenter in Experimenter Eyes Covered (task 6), a pattern consistent with the more general belief that anyone can be seen by an observer if and only if the observer can see the person’s eyes. The third belief differs from the first two in that it takes as the relevant consideration what the observer sees rather than what the observed person sees. The young subjects in this study obviously took what the observer sees as the relevant consideration when answering “head” and “Snoopy” questions. It is therefore reasonable to suppose that the same was also true when they answered the “you” questions.
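The test statistics reported in this section can be rechecked from Table 1. The sketch below is only an illustration and not the authors' analysis code: it assumes the chi-square tests were ordinary Pearson tests on 4 (child age group) × 2 (correct/incorrect) tables and that the Sign Test compared yes and no answers against chance. The raw counts are reconstructed from the rounded Table 1 percentages with N = 16 per child group, and all function and variable names are hypothetical.

```python
from math import comb


def pearson_chi_square(correct, n_per_group):
    """Pearson chi-square for a groups x 2 (correct/incorrect) table.

    Assumes equal group sizes; df = len(correct) - 1 (3 for four groups).
    """
    groups = len(correct)
    total_correct = sum(correct)
    expected_correct = total_correct / groups
    expected_incorrect = (n_per_group * groups - total_correct) / groups
    chi2 = 0.0
    for c in correct:
        chi2 += (c - expected_correct) ** 2 / expected_correct
        chi2 += ((n_per_group - c) - expected_incorrect) ** 2 / expected_incorrect
    return chi2


def sign_test_p(successes, n):
    """Two-sided sign test p-value against chance (p = 0.5)."""
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Assumption: counts of correct "you" answers for the 2.5-, 3-, 3.5- and
# 5-year-olds, reconstructed from the Table 1 percentages (N = 16 each).
YOU_CORRECT = {
    "task 1": [6, 6, 8, 14],
    "task 4": [13, 12, 13, 5],
    "task 6": [7, 8, 9, 14],
    "task 8": [5, 7, 8, 14],
}

# Assumption: (correct, asked) totals for the "head" questions, pooled over
# the child groups and reconstructed from Table 1 in the same way.
HEAD_CORRECT = {"task 1": (25, 30), "task 5": (30, 32), "task 8": (28, 30)}

for task, counts in YOU_CORRECT.items():
    print(f"{task}: chi2(3) = {pearson_chi_square(counts, 16):.2f}")

for task, (yes, asked) in HEAD_CORRECT.items():
    print(f"{task}: {yes}/{asked} correct, sign test p = {sign_test_p(yes, asked):.6f}")
```

Under these assumptions the chi-square values come out at 10.79, 12.69, 7.51 and 11.29 for tasks 1, 4, 6 and 8, matching the values reported above, and each sign test p-value falls below 0.001.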
The overall pattern of results in Study 1 led us to the following conclusions and speculations. Consistent with their performance on other Level 1 tasks, 3-year-olds are quite capable of accurately and nonegocentrically inferring what physical objects the other does and does not see, even in the extreme condition when they themselves do not see anything. This suggests that their Level 1 knowledge is very robust and well consolidated, and thereby answers the question that originally motivated this study. If this is true, however, it implies that their negative answers to “you” questions were not usually caused by incorrect inferences concerning what or how much of their physical bodies were actually visible to the other. We are thus left with an intriguing puzzle that had not been anticipated when we undertook this study. The most likely alternative cause of these answers seemed to be that “you” or “see you” means something different to young
children than it does to adults; the fact that some children deny that they can see the experimenter when her eyes are closed (Experimenter Eyes Covered) clearly suggests that semantic rather than perceptual considerations must be important here. It looks as if the rumored tendency of young children to think that others do not see them when they avert or cover their eyes does have a factual basis, although its meaning appears to be very different from what most of us would have suspected. Perhaps young children really do believe you “see them” in some special, nonadult meaning of these terms if and only if you see at least one of their eyes (see the above pattern analysis). And if so, could it conceivably be because they (a) take “you” to refer to their inner-psychological rather than outer-physical self in these task settings, and (b) believe that their inner-psychological self is somehow visible to others through their eyes? A search through thesauruses revealed that many writers from Cicero on have spoken metaphorically of the eyes as “the windows of the soul” or the equivalent. Implausible as it may appear, perhaps young children entertain some literal version of this idea, especially when eyeball to eyeball with large, seemingly all-knowing and all-seeing grownups. Study 2 was undertaken to obtain more and better evidence relevant to these possibilities than Study 1 afforded, as well as to see if the basic Study 1 results could be replicated.

Study 2

Method

Subjects

The subjects were 6 boys and 16 girls from middle-class nursery schools (mean age 46.6 months, range 39-52 months).
Procedure

The tasks described below were presented to the children in random order, with the exception that the Cognitive Self Interview was always administered last. Their rationales will become apparent in the Results and Discussion section.

1. Two Eyes Covered

The child and experimenter sat facing one another. The child closed his eyes and covered them with his hands. After making sure he could not see anything, the experimenter said: “My eyes are open and I’m looking. Do I see ___ right now?”. This question was asked five times in succession, with the blank filled by “you”, “you, ___” (child’s first name), “your eyes”, “your back” (a nonvisible body part), and “your arm” (a visible body part). These five questions were asked in a random order that was variable across subjects, with the constraint that the two “you” questions were always separated by at least one other question. This same questioning procedure was used in the next four tasks as well, except that the visible body part queried was not always the child’s arm.
2. Card

The experimenter held a 5 × 8 inch white card perpendicularly about 20 cm in front of the child’s face, such that neither could see the other’s face. The visible body part was “your foot” in this task.

3. Turn 135°

A second experimenter sat 135° to the subject’s right rear, holding a puppet. The child continued to look at it over his shoulder while being questioned, turning his upper torso a greater or lesser amount in order to do so. The visible body part was “your arm.” “Your back” continued to be used as the supposedly nonvisible body part, although at least a portion of the child’s back was in fact usually visible to the first experimenter while the child looked at the puppet.

4. One Eye Exposed

The child stood behind a long piece of material with one eye pressed against a hole about the same size and shape as his eye. The questioning procedure was identical to that used in task 1, except of course that “your eye” was substituted for “your eyes”.
5. Reflective Glasses and Mirror

The properties of the silvered reflective glasses were demonstrated to the child much as in Study 1. The child then put the glasses on and the experimenter knelt behind him, holding a 33.5 × 23 cm mirror in front of them. The child could thus see in the mirror both his own face and that of the experimenter looking at him, but of course could not see his own eyes. Unlike in the Reflective Glasses task of Study 1, however, the child also “saw” that the experimenter could not see his eyes either. The visible body part queried was “the top of your head” and a sixth question followed the usual five, namely “Do you see yourself?”.

6. Where Experimenter Looks

The experimenter faced about 180° away from the child and said: “My eyes are open and I’m looking right here (pointing to an object across the room). Do I see you right now?”. The statement and question were then repeated with the experimenter pointing successively at (but not naming) the child’s shin, stomach, eyes, chin, and finally shin again. The order of eyes, stomach, and chin was randomized, however, thus making the order of the entire set of subtasks as follows: Away-Shin-(Chin, Eyes, Stomach)-Shin.
7. Experimenter and Doll Eyes Closed

The experimenter said “Now ___’s eyes are closed. Do you see ___ right now?”. The child’s visual targets were, in random order, the second experimenter who had just closed her eyes and a small doll whose eyes automatically closed when it was placed in a horizontal position.
8. Cognitive Self Interview

The interview dealt with the meaning, location, and potential visibility of the “cognitive self”, in that order. Using the above-mentioned doll, the experimenter first explained that dolls are like people in some ways, namely, both have legs, arms, heads, etc. (pointing to corresponding body parts on the doll and on the child and experimenters). The experimenter then asked how dolls are different from people, and whether dolls know their names and think about things, as the child and other people do. The inflection of the questions and the nature of accompanying remarks suggested that people are in fact different from dolls in just these ways. The location questions then were: “Where is the part of you that knows your name and thinks about things? Where do you do your thinking and knowing?”. Every effort was made to get the child to listen very attentively to these questions and comprehend them as best she could. If the child did not indicate a location in response to these questions the experimenter gestured randomly and imprecisely towards different areas of the child’s body, asking “is it here, here, here... where?” The visibility questions came next: “If I look here (points), at (in) your ___, do I see the part of you that knows things and thinks?”. Four body parts were named and inquired about in random order: stomach, foot, nose (but with the experimenter actually staring at the child’s eyes), and eyes. Ad lib follow up questions about the location and visibility of the cognitive self were also asked in many cases, depending upon the child’s previous responses and responsivity to the standard questions.

Rationale
The subjects used in Study 2 were selected with the hope that they would still be young enough to give some immature responses to key “you” questions but also old enough to comprehend the ideas and questions presented in the Cognitive Self Interview. The questioning procedure of tasks 1-5 in this study was intended to be a methodological improvement over that used in Study 1.

As for specific tasks, Two Eyes Covered provides a replication of Study 1’s Two Eyes Closed or Covered, but with all subjects hiding their eyes in the same fashion. The Card task presents a condition in which the child can “see” that his face is not visible to the experimenter. If the visibility of his eyes to the observer is critical for the young child, this task should elicit a great many negative answers to its “you” questions.

In Turn 135°, most of the front of the child’s body and a bit of the side of his face remains visible to the experimenter; as in Turn 180°, however, the child cannot see the experimenter. Moreover, in contrast to tasks like Two Eyes Covered, Two Eyes Closed or Covered, Two Eyes Exposed, One Eye Exposed, Card, and perhaps even Turn 180°, the child’s turning to look at the puppet in Turn 135° does not closely resemble any hiding-of-self action one could imagine young children of any culture performing in everyday life, e.g., in hiding games with parents. If young children also say that the experimenter does not see “them” in this task, therefore, it probably means they are not merely assimilating all our task conditions to culturally-acquired, stereotyped hiding games.

One Eye Exposed provides a more stringent test than Two Eyes Exposed of the hypothesis that, for young children, eye visibility is a sufficient condition for a judgment that they are seen. Similarly, Reflective Glasses and Mirror should be a better test than Reflective Glasses of the possibility that eye visibility is a necessary condition.
Once the reflective glasses were put on them, a number of the younger children in Study 1 seemed to have trouble maintaining their just-established recognition that others cannot see the wearer’s eyes through the glasses. For such children, then, the “you” question came immediately after a hard won and perhaps merely token negative answer to the “eyes” question; this could have led to a similarly shallow negative answer to the “you” question. In contrast, the children in Study 2 had no difficulty in believing that the experimenter did not see their eyes. The reason is that they could not perceive their own eyes and could also “see” that the experimenter could not perceive them either.

The subtasks of Where Experimenter Looks might answer several questions. Would young children adopt the adult, whole-body-as-visual-target interpretation of “see you” in a situation designed to highlight it? The Away-Shin sequence should highlight it, since the experimenter first looks away from the child, then turns to look at a part of his body. When the experimenter does look at the child, will the child tend to say the experimenter sees him only when the experimenter looks at his eyes? Or might the tendency to reply affirmatively instead increase more or less continuously as the experimenter’s gaze approaches the eyes, for example from Shin to Stomach to Chin to Eyes? Will there be less tendency to say yes to the second Shin question than to the first one, since the immediate context is now being looked in the eye rather than not being looked at at all? Finally, as in Turn 135°, negative answers to, for example, Shin cannot be easily dismissed as generalizations from previous experience with hiding games.

The Experimenter and Doll Eyes Closed subtasks follow up the Study 1 Experimenter Eyes Covered task. The child sees no hands-over-eyes actions that could be assimilated to gamelike hiding rituals in the former, however. The Doll subtask was included simply to find out whether any tendency to say that one cannot “see” other people when their eyes are closed applies only to real people.
The principal motivation for appending the Cognitive Self Interview was to provide evidence for or against the windows-of-the-soul speculations advanced at the conclusion of Study 1. If the child localizes at least the cognitive part of the inner, psychological self (“soul”) in the head and also harbors this “windows” intuition, she ought to say the experimenter can see that part when he looks into her eyes. If this intuition depends upon actual eye contact as against the experimenter’s verbal and gestural specification of what he is looking at, the response to Nose and Eyes should be the same; if not, the two responses should differ. Finally, we were simply interested in finding an effective, methodologically adequate method for assessing whether and where young children locate at least one, fairly clearly specifiable part of the psychological self, namely, the thinking and knowing part. A search of the literature suggests that no such method has yet been devised (cf., Horowitz, 1935).
Results and Discussion Table 2 shows the percentage of subjects giving correct answers to body part and “you” questions. As in Study 1, the children did well on the body part questions. The one apparent exception (task 3, nonvisible part) is readily explained: as indicated earlier, part of the child’s back was in fact usually visible to the experimenter when the child turned to look at the puppet. Of the 14 other body part questions in tasks l-5, the mean number correctly answered was 12.9 1. The sturdiness of 3-yearolds’ Level 1 percept inference skills in the face of probable temptations to egocentrism is again demonstrated. The Two Eyes Covered “you” questions seem to have elicited roughly the same proportion of no answers in this study as the Two Eyes Closed or Covered “you” questions did for the Study 1 group most similar in age to the present sample, namely, the Study 1 3.5year-olds. The somewhat similar Card task “you” questions also elicited substantial percentages of negative answers. The curious tendency for this sort of task situation to elicit “you don’t see me” judgments in many young children thus appears to be quite
Table 2. Percentages of subjects giving correct answers to each question

                                       Body Part Questions        “You” Questions
Tasks                                Eye(s)  Visible  Nonvisible  “You”  “You, _”  Both
1. Two Eyes Covered                   100      86       100        54      45      36
2. Card                                91      82       100        23      36      23
3. Turn 135°                           86      86        23        64      64      46
4. One Eye Exposed(a)                 100     100       100        41      59      32
5. Reflective Glasses and Mirror      100      82        77        86      82      77
6. Where Experimenter Looks
   a. Away                              -       -         -       100       -       -
   b. Shin                              -       -         -        45       -       -
   c. Stomach                           -       -         -        73       -       -
   d. Chin                              -       -         -        86       -       -
   e. Eyes                              -       -         -        86       -       -
   f. Shin                              -       -         -        50       -       -
7. Experimenter and Doll Eyes Closed
   a. Experimenter                      -       -         -         -       -       -
   b. Doll                              -       -         -         -       -       -

(a) As in Study 1’s Two Eyes Exposed task, the “correct” answer to “you” questions is arbitrarily set as yes here.
robust. There were nine no answers (41%) to whichever “you” question was asked first in Turn 135°, compared to 50% for Turn 180° in Study 1. More than is true for Turn 180°, negative answers to Turn 135° “you” questions cannot easily be explained as simple generalizations from self-hiding actions or games learned at home.

The data from One Eye Exposed suggest that eye visibility is not in fact a sufficient condition for judged “you” visibility for at least a number of 3.5-year-olds. Although all the subjects said their eye was visible, 32% said no to both “you” questions and 68% said no to at least one. Notice that these results could hardly reflect a belief that the experimenter had to see both of their eyes in order to see “them.” That belief would generate negative answers to the Study 1 One Eye Covered “you” question, and such answers were very rare (Table 1).

The data from Reflective Glasses and Mirror very strongly indicate that eye visibility is not usually a necessary condition either. Although all subjects said their eyes were not visible in this task, 77% said yes to both “you” questions and 91% said yes to at least one (all subjects also said that they could see themselves). These percentages of yes answers are similar to those for Where Experimenter Looks: Eyes, where the experimenter actually looks at the child’s completely visible eyes. Finally, a child who consistently believed eye visibility to be both a sufficient and a necessary condition for “you” visibility should give two yes answers in One Eye Exposed and two no answers in Reflective Glasses and Mirror. Not one child showed this response pattern, however. It is of course possible that eye visibility might be a sufficient and/or necessary condition for judged “you” visibility in children younger than 3.5 years, although we frankly doubt it.
It is apparent from the Where Experimenter Looks data that the Away-Shin sequence did not seem to lead most of the 3.5-year-olds to adopt the adult, whole-body-as-physical-target interpretation of “see you”, as we thought it might: only 45% gave a yes response to the “you” question when the experimenter looked at their shins after having just looked away from them (first Shin question), with a similar (50%) rather than a significantly lower number giving the same response to the second Shin question. It is also clear from the Where Experimenter Looks data that yes answers were not given solely when the experimenter looked at the children’s eyes: they were as common or nearly so when their chins and stomachs were the visual targets. These, together with the yes answers to the Shin questions, constitute further evidence against the eye-visibility-as-necessary-condition hypothesis. The frequent no answers to the Shin questions, like those in Turn 135°, once again seem to argue against the supposition that children were merely assimilating our tasks to familiar hiding games. Finally, five of the 22 subjects
(23%) said they did not see the second experimenter when her eyes were closed (task 7a), but none said this when the doll’s eyes were closed (task 7b). Fourteen of the subjects met the following criteria in their responses to the Cognitive Self Interview. First, they unequivocally localized the “part of you that knows your name and thinks about things” in one specific place. Second, they did not do or say anything in addition that was inconsistent with that unique localization, such as later indicating that the experimenter could see that part of them at a location other than the one initially specified. Of these 14 subjects, 10 localized it in the head, three in the mouth, and one in the shoulders. Among the other eight subjects, there were three mentions of stomach, one each of face, foot, hand, and knee, and one failure to specify any location. Significantly, no subject in either group mentioned or pointed to her eyes as a location. Our general impression was that the 14 who met these criteria understood our questions quite well and that most of the remaining eight probably did not. The two subgroups did not differ consistently in their performance on other tasks. Of the 14 who met these criteria, one said that the experimenter saw the part in question only when he indicated he was looking into the child’s eyes, one only when he indicated that he was looking at the child’s nose (but, according to procedure, was actually looking at her eyes), three answered affirmatively to both questions, and the remaining nine answered negatively to both questions. However, an examination of the interview protocols of even the five subjects who responded affirmatively here revealed no evidence whatever that they entertained any “windows-of-the-soul” conception of their eyes. The subsequent interchange with one went like this: “Can I see you thinking? No. Even if I look in your eyes, do I see you thinking? No. Why not? Cause I don’t have any big holes. 
You mean there would have to be a hole there for me to see you thinking?” The child nods. Two others also subsequently denied that the experimenter could see them thinking (“Cause the skin’s over it”, said one), while the remaining two localized the thinking part in the mouth and shoulders, respectively. From the children’s responses to standard and follow-up questions, the modal intuition seemed roughly to be that thinking and knowing go on inside the head and are therefore not visible to others; in particular, others cannot see these activities or the part of the self that does them by looking into one’s eyes. Although the main purpose of the Cognitive Self Interview was to settle the visibility-of-the-inner-you question, it also appears to be a more promising method for learning about very young children’s concept of the self than previous ones of its kind (Horowitz, 1935).
How, then, to explain the results of the two studies? It is possible that the young child’s tendency to say yes in response to a given task’s “you” question partly depends upon what he thinks the observer sees in that task condition. What he may think the observer sees is characterized below in the form of an ordered series of categories. The Study 2 tasks that seem to belong in each category are also given, together with justifications where needed:
1. None of body - Where Experimenter Looks: Away.
2. None of face but some of body - Where Experimenter Looks: Shin, and Card. In pilot work with the Card Task, we found that a number of children did not think the experimenter saw their arm when he held the card between their faces. Many did think he could see their foot, however, and that was consequently selected as the visible body part. This explains the present classification of this task under “some of body.”
3. None of face but most of body - Where Experimenter Looks: Stomach.
4. Some of face - Two Eyes Covered, Turn 135°, and One Eye Exposed. In Two Eyes Covered, the child’s hands covered most of the rest of her face as well as her eyes.
5. Most of face - Reflective Glasses and Mirror.
6. All of face - Where Experimenter Looks: Eyes and Where Experimenter Looks: Chin.
Let us make the post hoc hypothesis that the child’s inclination to say yes to “you” questions increases as task conditions progress from category 1 to category 6. We can then compare the rank order of the 10 tasks based on their category membership with the rank order of these same tasks based upon children’s percentages of yes answers, as shown in Table 2 (where a task had two “you” questions, the average of the two percentages was used for the rank-ordering). The rank-order correlation between the two sets of ranks is 0.92, suggesting that the “dimensions” underlying this ordered categorization may in fact have affected the children’s judgments in the hypothesized way.
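The rank-order comparison can be retraced with a short script. What follows is a minimal sketch, not the authors' computation. The category assignments are those listed above. The percentages of yes answers are read from Table 2, averaging the two “you” questions where a task has two and treating the two Shin questions as a single task, which is an assumption about how the 10 tasks were counted. The names `TASKS`, `average_ranks` and `spearman` are hypothetical. Tied values receive average ranks, and the coefficient is taken as the Pearson correlation of the two rank vectors; the text does not say how ties were handled, so the result need not reproduce the reported 0.92 exactly.

```python
from math import sqrt

# (task, hypothesized category 1-6, mean percentage of "yes" answers).
# Percentages are readings of Table 2 and are assumptions of this sketch.
# For Away the correct answer is "no", so 100% correct is entered as 0% yes.
TASKS = [
    ("Where Experimenter Looks: Away", 1, 0.0),
    ("Where Experimenter Looks: Shin", 2, (45 + 50) / 2),
    ("Card", 2, (23 + 36) / 2),
    ("Where Experimenter Looks: Stomach", 3, 73.0),
    ("Two Eyes Covered", 4, (54 + 45) / 2),
    ("Turn 135", 4, (64 + 64) / 2),
    ("One Eye Exposed", 4, (41 + 59) / 2),
    ("Reflective Glasses and Mirror", 5, (86 + 82) / 2),
    ("Where Experimenter Looks: Eyes", 6, 86.0),
    ("Where Experimenter Looks: Chin", 6, 86.0),
]


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Pearson correlation of the average ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


rho = spearman([c for _, c, _ in TASKS], [p for _, _, p in TASKS])
print(round(rho, 2))
```

With these readings, the only sizable disagreement between the two orderings is Stomach, which draws more yes answers than its category rank would predict.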
These and other findings in the two studies suggest the following speculations about the nature and development of the young child’s reactions to our “Do I see you?” questions. When adults (and children) refer to the child, to themselves, or to other people present by the appropriate personal pronoun, they are apt to look at or otherwise direct attention to the face of the person referred to. “Look at me” is usually correctly understood by the child listener to mean “Look at my face.” “I want to tell you something” is normally accompanied by looking at the child’s face, a co-occurrence he can readily observe. Moreover, should he fail to turn his face to meet the adult’s gaze under these circumstances for any reason (inattention, apprehension, etc.), the adult may effectively get each pronoun associated with its appropriate face by saying
something like “Look at me when I talk to you”, perhaps manually turning the child’s face towards her for good measure. When adults refer to “your arm”, “your leg”, etc., while speaking to the young child, the child usually sees them look at those parts of his body. On the other hand, when they refer to “you” while speaking to the child, he usually sees them look at his face. Such experiences might lead a child to think that the “you” that is sometimes visible to another and sometimes not (thus, precisely the “you” that our tasks must make salient) is roughly coextensive with his face. It might thus seem sensible to the child, although not to an adult, to say that he does not “see” another person whose hands cover her eyes and most of her face (Experimenter Eyes Covered task in Study 1). This “You, your face”, like the adult’s “You, your body”, is a wholly external, physical affair; like the adult, the child has learned that only external, physical entities normally vary in visibility from one observer circumstance to another. The Cognitive Self Interview data suggest that many young children may also have inklings of another “you”, one that knows and thinks. Interestingly, this “you” is situated quite close to the other one. However, it is wholly internal rather than external, and has no ocular windows through which it can be seen (although it might be conceived as material by some young children, and hence visible if one could only see inside somehow). Part of self concept development may therefore take the following form, at least in the subculture from which our subjects were drawn: Both adults and young children (circa 3-4 years of age) have intuitions about at least one kind of inner, psychological self, a cognitive one, and they both probably localize it in the same place: the head.
Both have also developed intuitions about at least one kind of outer, physical self, the self that is visible to others, but they probably localize it in different places: the entire body surface, in the case of the adults; largely the facial surface, in the case of the children. By age 5 or so, these differences in the conceived extension of this kind of physical self have largely disappeared.
References

Flavell, J. H. (1974) The development of inferences about others. In T. Mischel (Ed.), Understanding other persons. Oxford, England, Blackwell, Basil & Mott.
Flavell, J. H. (1978) The development of knowledge about visual perception. Nebraska Symposium on Motivation, 25, 43-76.
Flavell, J. H., Shipstead, S. G., and Croft, K. ...

... gift would involve the revision of this structure by the introduction of an extra NP node to dominate the whole of the complex noun phrase my birthday gift for Susan. This is a violation of both MA and RALR. The parser is not in fact in a last resort situation when it makes this revision; it could have ploughed on with the original analysis, and found a simpler attachment for for Susan as sister to one of the verbs. Clearly, it will not do to abandon RALR and MA in order to account for these examples, for then there would be no explanation for the observed preference in cases like (20). Instead, we must look for some principled division of cases into those in which MA and RALR do apply, and those in which they do not apply and allow RA to win. In the SM, this division follows automatically from the limited capacity of the first stage parser (the PPP). We can hold to MA and RALR as absolute constraints with no exceptions, governing the operations of both the PPP and the SSS. The appearance of violations stems from the fact that the PPP can only abide by these principles within its own very limited view of the sentence. Because it cannot ‘see’ much of the phrase marker, it will not detect certain attachment possibilities for incoming constituents. It will therefore sometimes have to give up on an analysis it has attempted because it cannot find any way of continuing it; it will conclude that it has reached its last resort, and will therefore quite properly revise the phrase marker that it has formed.
But sometimes when the PPP concludes correctly that it is in a last resort situation, this will be false from the point of view of the phrase marker as a whole: there will be legitimate ways of continuing the initial analysis by attaching incoming words to nodes in the phrase marker that the PPP cannot see. In other words, even though the PPP honestly tries to abide by MA and RALR, parsing as a whole will not invariably abide by them. This account predicts that in cases of conflict, MA will take precedence over RA if the MA attachment associates incoming words with nearby words in the sentence (since the PPP will be able to see all of these words simultaneously), but RA will take precedence over MA if the MA attachment associates incoming words with distant words in the sentence (since the PPP will be unable to see the distant words). This is exactly what is observed in the contrast between sentences like (20) in which MA wins, and sentences like (27) in which RA wins. And it is this tendency to associate nearby words together, even at the expense of other principles, that we called Local Association. (We would emphasize that in the SM model LA does not have to be stipulated as an explicit principle that guides the parser’s decisions. It is an automatic consequence, as we have just indicated, of the limited capacity of the PPP.)
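The prediction stated in this paragraph can be written down as a toy decision rule. The sketch below is an illustration under stated assumptions and not an implementation of the SM: the window of six words, the name `preferred_attachment`, the reduction of each attachment site to a single word position, and the "defer" outcome (standing in for work left to the second stage) are all hypothetical simplifications.

```python
PPP_WINDOW = 6  # assumed first-stage capacity in words (hypothetical value)


def preferred_attachment(incoming, ma_site, ra_site, window=PPP_WINDOW):
    """Return the attachment a window-limited first stage would make.

    Arguments are 0-based word positions. A site counts as visible when it
    lies fewer than `window` words back from the incoming word.
    """
    def visible(site):
        return incoming - site < window

    if visible(ma_site):
        return "MA"      # the minimal attachment is in view, so it wins
    if visible(ra_site):
        return "RA"      # the MA site is out of view: attach low and locally
    return "defer"       # nothing in view: assumed to be left to the second stage


# MA site three words back: both sites are in view, MA takes precedence.
print(preferred_attachment(incoming=4, ma_site=1, ra_site=3))
# MA site eleven words back: only the local site is in view, RA takes over.
print(preferred_attachment(incoming=12, ma_site=1, ra_site=11))
```

Note that no rule in the function mentions Local Association; the switch from MA to RA falls out of the visibility test alone, which is the point the parenthetical remark makes about the SM.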
Is the human sentence parsing mechanism an ATN?
We do not see any way of incorporating Local Association into an ATN other than by making RALR sensitive to the number of words within a constituent. One way (the best way?) of achieving this would be to stipulate that after five or six words had been processed within a subnetwork, the processor would ‘forget’ that this subroutine had been called by some higher network. For sentence (26), for example, the processor would be operating within the NP network in order to attach the note, the memo and the letter. By the time it reached to Mary it would have forgotten that it had gotten into the NP network as the result of a SEEK NP instruction from the VP network. The SEND arc out of the NP network would therefore lead nowhere. Taking this SEND arc before all the words of the input had been accommodated would thus constitute failure. So the processor would have to look for a way of incorporating to Mary within the noun phrase. As in the SM, this would not constitute a violation of RALR, but rather a mistake (due to the forgetting of earlier structure) about when a last resort situation has arisen. Of course, a parser that completely forgot the SEEK instructions that called its subroutines would simply be unable to parse most sentences of the language. So it would also be necessary to assume that some part of the processor does remember that the NP network was entered because of a SEEK arc in the VP network. But this now begins to look very much like an ATN emulation of the SM. Some part of the ATN processor, just like the PPP, loses access to higher structure within the space of half a dozen words or so; another part of the processor, just like the SSS, keeps track of this higher structure. To summarize: Wanner’s ATN, in which RALR alone governs the interplay of MA and RA, is not rich enough to account for all of the data. The reversal of the interaction of MA and RA must be explained. It appears to be triggered by constituent length.
Thus RALR must somehow be rendered inoperative for long distance attachments. We, at least, have been unable to think up any plausible way of achieving this within an ATN other than by isolating lower level parsing from higher level parsing by some sort of ‘forgetting’, during lower level operations, of what higher level operations have been performed, just as in the SM. Notice in particular that the isolation involved is not a matter of making a single cut across an ATN network such that subnetworks on one side of the cut have no access to subnetworks on the other. Whether or not there is access between subnetworks has to do with the properties of the sentence being parsed (the lengths of its constituents). Thus the loss of access that puts RALR out of action must be the result of some on-line phenomenon, rather than being built into the permanent data structure.
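The ‘forgetting’ proposal can be made concrete with a toy call stack. This is a sketch under assumptions, not Wanner’s ATN or any published ATN interpreter: the class name `ForgetfulStack`, the default threshold of five words, and the representation of a SEEK as a stack frame that counts words are all hypothetical.

```python
class ForgetfulStack:
    """Call stack in which a subnetwork loses its return address over time."""

    def __init__(self, forget_after=5):
        self.forget_after = forget_after
        self.frames = []  # each frame is [caller, callee, words_seen]

    def seek(self, caller, callee):
        """Enter `callee` because of a SEEK arc in `caller`."""
        self.frames.append([caller, callee, 0])

    def consume_word(self):
        """Process one input word inside the innermost active subnetwork."""
        if self.frames:
            self.frames[-1][2] += 1

    def send(self):
        """Leave the innermost subnetwork.

        Returns the caller to resume, or None when enough words have been
        processed for the caller to have been forgotten.
        """
        caller, _callee, words_seen = self.frames.pop()
        return caller if words_seen < self.forget_after else None


# Long noun phrase: by the end, the SEND arc leads nowhere.
long_np = ForgetfulStack()
long_np.seek("VP", "NP")
for _word in ["the", "note", "the", "memo", "and", "the", "letter"]:
    long_np.consume_word()
long_result = long_np.send()

# Short noun phrase: the calling VP network is still remembered.
short_np = ForgetfulStack()
short_np.seek("VP", "NP")
for _word in ["the", "note"]:
    short_np.consume_word()
short_result = short_np.send()

print(long_result, short_result)
```

Because the loss of the return address depends on a running word count rather than on which networks are involved, the isolation is an on-line effect of the sentence being parsed, as the text requires, and not a fixed cut in the network.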
J. D. Fodor and L. Frazier
The length sensitive discontinuity in the interaction of MA and RA is by no means the only indication that there is a low level parsing device which has no access to the structure of the sentence beyond its most recent six or seven words. In FF, we cited three other discontinuities in parsing that can be explained in the same way. Wanner does not refer to these arguments, though they are the heart of the motivation for the SM. In each case, his single stage ATN parser makes the wrong predictions. We will not discuss these phenomena in great detail here, for they are all quite explicitly described in the earlier paper. But it may be worth reminding readers of the points we made. First, there is a discontinuity in the correlation between height of attachment and unnaturalness of the analysis. A low and local right attachment of an incoming constituent is strongly favored, and all high and distant attachments are more or less equally unfavored. (See the discussion of sentence (16) in FF.) We criticized Kimball’s Right Association principle for predicting a steady increase in unnaturalness as a function of attachment height. Readers can trace through Wanner’s ATN and see that it makes the same incorrect prediction. The SM predicts the discontinuity, on the grounds that all distant attachments fall outside the PPP’s limited view of the sentence. Second, there is a discontinuity in the preference for attachments as a right sister to constituents already present in the phrase marker, over attachments as a left sister to subsequent items in the input. Where the right attachment to prior constituents is local, it is preferred over a left attachment to subsequent constituents. But when the right attachment would be a distant one, the parser prefers to make a local left attachment. (See the discussion of example (17) in FF.) 
This is so even when the local left attachment results in a higher attachment in the phrase marker as a whole than the distant right attachment would have resulted in. Like Kimball’s Right Association principle, Wanner’s ATN incorrectly predicts that a higher attachment is always less favored than a lower one, regardless of how local it is. In the SM, the priority of local attachment over low attachment follows from the fact that the PPP must group each word with others either immediately to its left or immediately to its right, and that it cannot ‘see’ what eventual effect these local attachments will have on the height of that word in the total phrase marker. (Note that this presupposes that bottom-up parsing is permitted. In current ATNs, by contrast, all parsing is top-down.) Third, there is a discontinuity in the strength of the tendency towards local association depending on the length of the constituent to be attached. A short constituent strongly attaches itself to neighboring words, even though the result is often a nonsensical interpretation of the sentence. A longer phrase or clause can quite readily be attached to more distant constituents. (See the discussion of examples (25) - (28) in FF.) The parsing preferences of Wanner’s ATN are fixed and independent of the length of the constituent to be attached. In the SM, short constituents are attached by the PPP and hence can only be attached locally, while long constituents are attached by the SSS which has access to the whole phrase marker and is thus not subject to any pressure towards local attachment. These four discontinuities, all sensitive to at least roughly the same length parameter, still seem to us to add up to very strong evidence for a fundamental discontinuity in the parsing program, with structure assigned first on a very local basis to clumps of neighboring words in the input sentence. We have sketched here the beginnings of an account of these phenomena within an ATN. We don’t know whether the ‘forgetting’ mechanism that we have suggested for ATNs will prove to be the best approach. But we are prepared to bet that any ATN that could cope with all of these phenomena (in anything less than a totally ad hoc fashion) would in effect incorporate two-stage parsing.
5. Right Association

The one observation in Wanner’s critique that does seem to demand some addition to the SM model is that RA apparently applies within the PPP. As Wanner notes, the preferred interpretation of a sentence like John said Bill died yesterday has the adverb attached within the lower clause rather than within the higher clause. On the assumption that the PPP can take account of six or seven words simultaneously, this preference for low right attachment cannot be due to LA; it must be due to some structural principle guiding the PPP’s attachment operations, rather than to the discontinuity between the PPP’s operations and those of the SSS. We might as well concede, while we are about it, that there is also some tendency towards RA within the SSS. For example, in sentence (29) the adverbial clause after he had finished his chores seems to attach more naturally as a modifier to the lower verb taken than as a modifier to the higher verb said. (Cf. example (24) in FF.)
(29) Grandfather said that he had taken a long hot bath after he had finished his chores.
In the heat of our demonstration that Kimball’s Right Association principle is insufficient to account for all attachment preferences, we seem to have overlooked the evidence that RA - as well as LA - does govern the parser’s decisions.
The fact that RA holds within both the PPP and the SSS makes things easier for us, because we can assume that it is a general property of the parsing routines, just like MA and RALR. As noted in FF (p. 314), there seems to be no need to postulate any differences between the PPP and the SSS with respect to the kinds of operations they can perform or how they perform them, other than differences which follow inevitably from the basic division of labor between them: differences concerning the size of the sentence chunks they work on, the rate at which attachment decisions must be made, and so on. But if we are to meet our original goal of attributing all parsing tendencies to the fundamental structure and operating characteristics of the parser, rather than to explicit ‘strategies’ which tell the parser what to do at choice points, then we need to provide some sort of plausible story about how and why RA constrains the parser’s operations. We will argue that RA attachments are very natural in the SM, in view of the way in which it builds phrase markers for sentences, and also that RA attachments lead to computational savings. In this latter respect, RA is much better motivated in the SM than in current ATNs. Let us consider first how and why RA applies in Wanner’s ATN. The answer to the how question is quite straightforward. Wanner observes that postponing SEND and JUMP arcs as long as possible (within the tolerance of RALR) will ensure that an incoming constituent is attached into the lowest phrase that is currently open that contains a legitimate attachment position for it. Thus there does exist a simple general characterization of RA in ATN terms. And hence it is at least conceivable that RA is innately encoded into the human sentence parsing mechanisms in the form of a constraint on the order of arcs in the network that the language learner constructs.
The only further question to be answered, then, is whether there is any reason why evolution should have favored a sentence parsing mechanism that is innately constrained in this way, rather than one with the opposite constraint on arc orderings, or no constraint at all. Wanner’s suggestion is that in a parser with RA, “shifts between constituents will be minimized”. We will return to this claim shortly, but first we should consider why minimizing shifts between constituents might be advantageous. Wanner claims that minimizing shifts between levels of the phrase marker would minimize garden paths (which are extremely wasteful of computational effort). The idea is that “syntactic structure is generally more predictable within constituents than across constituent boundaries”. As it happens, this is true only for parsers resembling the SM, and not for Wanner’s ATN. Suppose, for example, that the initial portion of a sentence contains two incomplete verb phrases, one subordinated to the other, and that the incoming constituent is a prepositional phrase. A parser that abides
by RA will attach this prepositional phrase within the lower of the two verb phrases. But the existence of a prepositional phrase will be equally predictable (or unpredictable) at both levels. Indeed, in this case, it is the very same subnetwork of the ATN (the VP network) that will be the source of both predictions. There is only one respect in which future constituents are more predictable within a level of the phrase marker than across levels. This is that the immediately preceding words of the sentence, which have just been processed, will be a better basis for predicting what might appear next at the same level than for predicting what might appear next at some higher level. Thus in sentence (30), the words that immediately precede the final prepositional phrase are Mary had been reading, and these do establish that a to-phrase is possible and even likely in the lower clause.
(30) John was reading a book that Mary had been reading to Susan.
But these same words contain no indication at all about what, if anything, is likely to occur next in the higher clause. It is the earlier words John was reading a book that are a useful basis for predicting what will appear at the higher level, but a parser that was unable to take distant words into account in making its predictions could not benefit from this information. Thus, predictability considerations would provide a motive for avoiding level shifts only in a parser that ‘forgets’ earlier parts of the sentence as it processes later parts. The PPP of the SM has this property, but no current ATN does. Unless we have overlooked some other source of motivation, it looks as if the only reason why an ATN would be structured so as to avoid level shifting must be that level shifting is inherently costly. This is plausible enough, since each shift constitutes a redirection of attention, and requires the parser to keep track of where it is coming from and where it can legitimately go to next.
Shifting thus increases the housekeeping chores and also perhaps increases the chances of error. So now we must return to the question of whether RA does in fact minimize shifts between levels in an ATN. The answer is that it does not. One way in which RA might minimize level shifts is by favoring the correct analysis of a sentence more often than not; then the processor would not have to shift up and down between levels of the phrase marker correcting the positions at which it had attached incoming constituents. For this to be true, it would have to be the case that sentences with RA attachments occur more frequently than sentences with other attachments. This probably is the case. But since there is no grammatical reason for it, the reason probably has to do with the greater processing complexity of other
attachments. Hence, this presumed frequency difference provides no independent motivation for the parser’s avoidance of non-RA attachments. Another way in which RA might minimize level shifts is by favoring errors whose correction requires fewer shifts than other kinds of errors would. We won’t go through a detailed demonstration, but it does seem clear that RA errors are no less costly to correct than errors of the opposite kind. An RA error would be an error of shifting up to a higher level of the phrase marker too late; correction would require shifting down to detach a constituent from the lower level, and then shifting up again to attach it at the higher level. The opposite kind of error would be an error of shifting up to a higher level too soon; correction would require shifting down to insert an extra constituent at the lower level, and then shifting up again to continue the analysis of the sentence. The only desirable outcome of a preference for the lowest possible attachment would be that this permits a uniform correction strategy: if the current attachment proves untenable, try the next highest attachment. However, a preference for the highest possible attachment would have permitted an equally orderly correction strategy. A third way in which RA might minimize level shifts is much more direct, and it deserves more detailed attention. It turns on the principle: a shift postponed (by an RA attachment) is a shift saved. As it happens, this explanation of why a parser should abide by RA is not available in current ATNs, though it is in the SM - and, of course, in ATNs that resemble the SM in relevant respects. The issue is whether a parser that has attached the final word of a sentence low down on the right of the phrase marker, in accord with RA, can terminate its computations there, or whether it still has to shift up level by level through the phrase marker to the top S node even though there are no more lexical items to be attached at those levels. 
If it need not shift up to the top level, then the level shift that was avoided by the RA attachment of the last word will never have to be made at all; hence RA will result in computational economies. But if the parser must shift up to the top level, regardless of whether there are any attachments to be made on the way, then the total number of level shifts will be exactly the same for the RA attachment of the last word as for any higher attachment of it; hence RA will not result in any computational economies. (We should point out that this question of level shifts after the last word is attached is just a special case of a much more general question, viz., whether the parser can move in one fell swoop from a level at which it has just attached an item to a higher level at which it will attach the next item, or whether it must shift from one to the other passing through all intermediate levels on the way. To simplify the exposition, we will restrict attention, in what follows, to the special case of
sentence-final shifts. But we would like it to be clear that the advantage of the SM, which does avoid shifts to levels where there is nothing to be attached, is not restricted to sentence endings.) ATNs are so structured that each SEEK action is paired with a SEND action. If, for example, the VP subroutine is interrupted by a SEEK instruction activating the NP subroutine, this NP subroutine must end with a SEND action which switches control back to the VP subroutine. If the VP subroutine was called by the S subroutine, the VP subroutine must end in a SEND action back to the S subroutine. In current ATNs, there is no way of combining these two SEND actions so that the NP subroutine can feed back directly to the S subroutine, even if there are no more constituents to be attached at the intermediate VP level. It is because of this that an ATN must complete its analysis of a sentence with a sequence of SEND actions leading, in effect, up the right hand side of the phrase marker from the node at which the last lexical item was attached to the level of the topmost S. And it is because of this that, in a sentence such as (31), attaching to Susan at the lower VP level, in accord with RA, results in no saving of computational effort compared with attaching it at the higher VP level. (31)
[Tree diagram: phrase marker for a sentence of the form ... said ... lied to Susan, with a lower and a higher VP level]
This characteristic of current ATNs may seem to be a quite superficial one, which could easily be modified. But this is not so. One problem in modifying it is that the interplay of SEEK and SEND actions is part of the ATN ‘language’ itself, rather than part of the program formulated in that language; in Pylyshyn’s terms (op. cit.), it is buried within the ‘virtual machine’. (This is why SEND arcs do not have to be tagged with explicit
specifications of the destinations of their SEND actions.) Thus the pattern of SEND actions is not readily accessible for alteration. But in any case, this step-by-step level shifting is deeply entrenched in current ATNs, since it is essential to the treatment of syntactic prediction. When an ATN subroutine is interrupted by another, there is no looking ahead to see what will be involved in the completion of the first subroutine once the interruption is over. In particular, there is no mechanism for noting the presence of obligatory attachment arcs (i.e., SEEK, CAT or WORD arcs that cannot be by-passed by JUMP or SEND arcs) following the arc at which the interruption occurred. The parser therefore must transfer control back to the interrupted subroutine before terminating its analysis of the sentence, because otherwise it would have no way of establishing that all the constituents that the grammar requires to be present in the sentence have actually been identified in it. In other words, even if there are no lexical items to be added at some intermediate level of the phrase marker, there must be a SEND action up to that level in order to establish that there are no more constituents needed at that level. If the parser did not do this, it would have no way of determining that the input sentence was ungrammatical for lack of an obligatory constituent (e.g., *John put the book); it would have no way of determining that it had been garden-pathed into an incorrect analysis of the input that makes it appear to be ungrammatical for lack of an obligatory constituent (e.g., *John put the book [that Mary had been reading in the study]); and it would have no way of recognizing the ‘gaps’ in sentences created by transformational movement or deletion of constituents (e.g., Where did John put the book -?).
We will now show how the SM can avoid these redundant level shifts, i.e., shifts to levels at which there is no action to be performed other than that of checking that there is no action to be performed. We have argued (in connection with MA) that the grammatical rules for the language should be stored separately from the specification of computational operations involved in the application of those rules. They will reside in a special ‘rule library’, and will be accessed by the executive component of the parser as needed. This means that the SM cannot rely on a built-in pattern of SEEK and SEND actions, as in an ATN, to ensure that the rules are accessed and applied in the proper sequence in building up a phrase marker. In FF we pointed out that rule applications can be properly sequenced in this model if we assume that “the human parsing mechanism not only processes what it does receive but also makes predictions concerning what it is about to receive.” We elaborated this (pp. 316-317) as follows:
We propose to permit both the PPP and the SSS to postulate obligatory nodes in the phrase marker as soon as they become predictable, even if their lexical realizations have not yet been received... If these predicted nodes should continue to dangle for lack of any corresponding lexical items in the sentence, they will signal ungrammaticalities of omission [or ‘gaps’ from which constituents were moved or deleted by transformations, cf. footnote 15] ... They will also sometimes serve to resolve what would otherwise be temporary ambiguities in sentences.
The role of the word was in the sentence fragment That the youngest of the children was proved to... is unambiguous. The complement clause must contain a verb phrase, and its position in the phrase marker requires that in the lexical string this verb phrase should precede the verb phrase of the main clause. Therefore was proved to... can only be attached within the subordinate clause, not within the main clause. But either attachment would appear to be legitimate to a parser which did not enter the predictable subordinate VP node before attempting to connect was into the phrase marker. This is only one of the innumerable examples in which node prediction can save a parser from the danger of being garden pathed by potential attachment ambiguities.
The idea, then, was that in parsing a sentence such as this, the SM would compute the structure (33) rather than (32).
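(32) [Tree diagram: partial phrase marker for that the youngest of the children]
(33) [Tree diagram: partial phrase marker for that the youngest of the children, with a VP node entered at each clausal level]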
Partial phrase marker (33) makes explicit the need for a VP at each clausal level, and also explicitly indicates the relative ordering of these VPs. All that is needed, therefore, to ensure that the words are attached under the nonterminal nodes in the right places is that the parser should be constrained to move around the bottom of the partial phrase marker in an orderly fashion, expanding nonterminal nodes in sequence from left to right, without skipping any. (Or, as noted, if there is no alternative to skipping a node, the parser will recognize a transformational gap or an ungrammaticality of omission.) The important point here is that in the SM model, predictable nodes are prefigured in the phrase marker. Like the paired SEEK and SEND actions in the ATN, this ensures that rule applications are properly scheduled; and it is also independently motivated in any parser whose rules are stored separately from its action plans, since it minimizes rule accessing. Each rule is accessed just once, and all the information it contains is entered into the phrase marker simultaneously. In building the structure (33), for example, the parser must access the rule S → NP VP in order to attach the subject noun phrase; and as it does so, it enters not only the NP node but also the VP node into the phrase marker. The alternative would be for the parser to enter just the NP node, and then subsequently re-access the rule to extract the information about the VP node. This re-accessing of rules (as many times as there are nodes to the right of the arrow) would presumably be costly (especially as the parser would have to keep track of how much of the rule it had already entered into the phrase marker). And it would also call for a considerable amount of record-keeping in order to ensure that the rules were re-accessed in the right order - e.g., that the lower application of S → NP VP in (33) was completed before the higher application.
Efficiency considerations in the SM (though not in current ATNs) therefore favor the node prediction alternative. It is the fact that the SM enters obligatory nodes into the partial phrase marker that guarantees that it does not need to terminate its analysis of a sentence by shifting up level by level through the tree checking the rules at each step to ensure that all obligatory constituents have indeed been found in the lexical string. Instead, it can simply look to see whether there are any dangling nodes in the phrase marker it has constructed so far. If there are none, it can safely terminate its computations in the knowledge that the analysis it has computed for the sentence is complete. The SM model therefore does explain how RA is advantageous for the human sentence parsing mechanism: an RA attachment can avoid a shift of attention from one level of the tree to another, and the level shift avoided by an RA attachment has a good chance of being avoided altogether.
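The node prediction idea can be sketched in a few lines. This is a minimal illustration rather than the SM itself, and it rests on assumptions that are not in the original: the names Node, expand and dangling are hypothetical, a phrase node is treated as filled once a word is entered on it directly, and no distinction is drawn between obligatory and optional nodes. The sketch shows a rule being accessed once, all of its daughters being entered together, and the completeness check reducing to a search for nodes that have neither a word nor daughters.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """A phrase-marker node; `word` is set once a lexical item is attached (simplification)."""
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

def expand(node: Node, rhs: List[str]) -> List[Node]:
    """Access a rule once and enter every daughter it mentions into the phrase marker."""
    node.children = [Node(label) for label in rhs]
    return node.children

def dangling(node: Node) -> List[Node]:
    """Left-to-right list of predicted nodes that have neither a word nor daughters."""
    if node.word is None and not node.children:
        return [node]
    found: List[Node] = []
    for child in node.children:
        found.extend(dangling(child))
    return found

def demo() -> List[List[str]]:
    """Record the labels of the dangling nodes after each step of a toy parse."""
    history = []
    s = Node("S")
    subject, predicate = expand(s, ["NP", "VP"])   # S -> NP VP, accessed once
    subject.word = "John"
    history.append([n.label for n in dangling(s)])  # VP is predicted but unfilled
    (verb,) = expand(predicate, ["V"])              # VP -> V
    verb.word = "lied"
    history.append([n.label for n in dangling(s)])  # nothing left dangling
    return history

if __name__ == "__main__":
    print(demo())
```

The check in dangling consults only the phrase marker, never the rules, which is the property the argument above depends on.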
A brief digression: we have described the SM as simply ‘looking to see’ whether the phrase marker contains any dangling nodes, but clearly this metaphor must be cashed out in the form of some explicit mechanism. The kinds of representation and computation generally made use of in the modelling of sentence parsing may not lend themselves to an appropriate implementation of this idea;⁹ we suspect that models of visual skills (e.g., Kosslyn and Shwartz, 1977) may come closer to what we have in mind. What we want to capture is the notion of scanning the phrase marker, spotting the next node, and jumping across to it. This is surely the appropriate way to characterize subjects’ performance in the corresponding visual task - for example, to focus on node A in diagram (34) and then to shift attention as rapidly as possible to the first node that is not connected to the string of words at the bottom.
(34) [Diagram: a tree with nodes labelled A, B, C, X and Y above a string of words]
⁹Given the computational devices in standard use for sentence parsing at present, the easiest way to implement this in practice might well be to have the parser track through the tree structure itself, passing from the node at which the last attachment was made to the node at which the next attachment is to be made via the nodes at intermediate levels. This would be like the step-by-step level shifting that we have criticized in ATNs. Nevertheless, it could still save the SM some computational effort when there is no material to be attached at a given level. Because the SM explicitly encodes its predictions into the phrase marker in the form of dangling nodes, the device that searches the tree for the next dangling node to be dealt with (or any dangling node left at the end of the parse) could be a very superficial look-ahead device, distinct from the routines which actually do the work of parsing the sentence. It would not have to have access to any grammatical information about sentence structure, and it would have no parsing decisions to make; its only job would be to distinguish between the presence of a node and the absence of a node. Thus, unlike current ATNs, the SM would save the effort of checking the grammar against the phrase marker at every level up the right hand side of the tree. Our argument about the benefits of RA to the SM therefore goes through even on this assumption. As noted, however, we hold out some hope of finding a more direct implementation of the idea of ‘looking to see’ whether a dangling node is present. Indeed this is essential to us if the SSS can receive a package from the PPP that contains a dangling obligatory node. If the SSS were to attach this package and then select the next node to work on by moving up the tree from the point at which it made its most recent attachment, it would overlook this dangling node within the package it just attached. What it must do instead is ‘stand back’ from the partial phrase marker it has constructed so far, and identify the leftmost dangling node in the whole structure.
We would certainly not expect subjects to find node B by searching up through nodes X and Y and then down the branch to B. Until a better understanding of the mechanism of visual scanning has been achieved, the visual metaphors in our description of how the parser scans the mental representation that it has constructed for the sentence are likely to remain no more than metaphors. But we would suggest that they are at least the right metaphors, and that the human sentence parsing mechanism can shift its attention directly to the next node at which some action is called for without having to compute its way through all intervening nodes. To summarize: because the SM extracts information from its grammatical rules and encodes its syntactic predictions explicitly into the phrase marker that it is constructing, it needs only a simple node detection device to determine whether the analysis it is computing meets all the syntactic obligations imposed by the grammar. It therefore does not have to check the phrase marker against the rules level by level all the way to the top; if there is nothing to be attached at some level, it does not have to shift to that level. Therefore, RA attachments will minimize the number of computations involved in parsing sentences. The SM thus provides an answer to the why question about RA, and it remains only to show that the how question can also be answered. The argument from economy of rule accessing for entering nonterminal nodes into the phrase marker before their lexical realizations have been encountered in the input implies that optional nodes as well as obligatory nodes should be established in the tree as soon as the relevant rule is accessed, i.e., as soon as the leftmost daughter node introduced by the rule is needed for the attachment of some lexical item.¹⁰ For example, the optional PP node introduced by the rule VP → V NP (PP) would be entered into the
¹⁰The insertion of optional nodes into the phrase marker before the input word string has provided any evidence that they are needed might seem to be hopelessly inefficient since there are such a vast number of options defined by the grammar. Winograd (1972) has argued that building options such as conjunction into an ATN network is extremely inefficient, but building them into the phrase marker that is constructed for each sentence is surely even worse. However, MA severely limits the number of optional nodes that will in fact be entered by the SM. It is true that anywhere that there is a noun phrase there is the possibility that the noun phrase will consist of a conjunction of smaller noun phrases, the possibility that it will consist of a head noun phrase followed by a relative clause or a prepositional phrase, and so on. But MA guarantees that these possibilities will not be contemplated by the SM except in response to lexical items in the input; they will never be prefigured in the phrase marker being constructed but will arise only as a consequence of a revision, enforced by the input, of a simpler phrase marker. Only the options represented by parentheses and curly brackets within a rule will be prefigured within the phrase marker; the options stemming from the optional application of a rule will not be. The appearance of massive inefficiency is therefore an illusion.
phrase marker as soon as the verb was encountered in the lexical string. This optional node would obviously have to be distinguished from obligatory nodes, since the absence of any lexical realization for an optional node does not constitute an ungrammaticality (or a ‘gap’) in the sentence; unlike an obligatory node, an optional node can be by-passed in the course of fitting the words around the bottom of the phrase marker, without this being an indication that something is missing from the sentence. (How optional nodes are distinguished in mental representations from obligatory ones is of no concern to our argument. We could suppose that they are entered with parentheses around them, or in pink rather than in blue.) At the point at which the final prepositional phrase is to be attached in the sentence (31), the partial phrase marker that has been constructed will therefore be (35). (35)
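[Tree diagram: partial phrase marker for (31) up to lied, with an optional (PP) node at the lower and at the higher VP level]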
The parser will, as argued, scan the phrase marker and attend to its dangling nodes in sequence as it moves around the bottom of the structure. It will therefore encounter the lower PP node in (35) before it encounters the higher one. Since this node is marked as optional, the parser could in principle skip it and move on around the phrase marker to the next one. But it seems reasonable to suppose that the parser will take advantage of the opportunity that this node affords for the attachment of the words to Susan.¹¹,¹² (This situation is quite different, of course, from when the parser
¹¹What we have been arguing is that within the SM, RA could be favored by selection pressures because it minimizes computational effort. But the question arises whether RA would, in any case, result automatically from the natural functioning of the SM. There are reasons for thinking that it would. First, RA will tend to increase the size of the phrasal packages composed by the PPP, thus reducing the total number of packages in the sentence and hence the rate at which the SSS has to make its decisions. As noted in FF, the SM operates most efficiently if the PPP does its fair share of the work. Second, the SM is not restricted to top-down parsing. Therefore, in deciding whether or not to by-pass an optional node in the phrase marker it can take account of the nodes over the word or phrase that needs to be attached - for example, of the P node that the lexicon supplies over a preposition, and even of the inevitable PP node above that. Since the parser is under pressure to attach incoming elements into the phrase marker as quickly as possible, we would perhaps expect that whenever it detects a match between the incoming phrase and the attachment possibility offered by the optional node already in the tree, it will take advantage of it. (As noted in FF, the SSS is under considerably less time pressure in making its attachments than the PPP. So it might have the leisure to scan ahead and notice that it can afford to relinquish a low attachment opportunity because there is another equally good one coming along. Together with the fact that the SSS has more information about how constituent meanings fit together, this would explain why the RA tendency is apparently somewhat weaker within the SSS than within the PPP.) In other words, although it is conceivable that an SM parser could be explicitly programmed to systematically by-pass optional nodes prefigured in the phrase marker, this does not look to be a natural way for it to function.
¹²It should be noted that the SM’s favoring of the nearest optional node that is prefigured in the phrase marker does not predict violations of MA in sentences such as Joe bought the book for Susan. It is true that the less preferred attachment point for for Susan in this sentence (as modifier within the object noun phrase) appears earlier on a path around the bottom of the phrase marker than the preferred attachment point (as daughter to the VP). But MA entails that before for Susan is encountered in the input, the parser will have constructed the partial phrase marker (i) rather than (ii).
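(i) [Tree diagram: partial phrase marker for Joe bought the book, with an optional (PP) node at the VP level only]
(ii) [Tree diagram: partial phrase marker for Joe bought the book, with optional (PP) nodes both at the VP level and within the object noun phrase]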
In other words, the optional PP at the verb phrase level will be prefigured in the phrase marker, because it is introduced by a rule that has already had to be accessed for the attachment of prior words; but the optional PP within the object noun phrase will not be prefigured in the phrase marker, because it would have had to be introduced by the unmotivated application of an optional rule. Thus our account of how RA applies in the SM is quite compatible with the dominance of MA over the preference for low right attachments (in cases where this is not reversed by the limited view of the PPP).
has run out of words in the input sentence and needs to know whether there remain any dangling nodes that would invalidate the analysis. In the latter case, optional nodes would be ignored as the tree is scanned.) Our argument has been that RA can be imposed just as easily within the SM as within an ATN, and furthermore that the SM model makes it at least comprehensible that RA should have been favored by evolutionary selection. We should now consider whether there is any way in which ATNs could be modified to incorporate some such explanation of RA. Several ideas come to mind. One way would be to annotate SEEK instructions with information about obligatory arcs that must still be traversed after the SEEK action is complete. The conventional S network, for example, could be modified as in (36), with a tag on the SEEK NP arc reminding the parser of the need to return to the SEEK VP arc. (36)
SEEK NP [SEEK VP]
SEEK VP
SEND
In the normal course of events, the tag would be returned to the S network by the SEND action that terminates the SEEK NP subroutine, and would be cancelled as soon as the SEEK VP action was initiated. But at the end of the sentence, the parser could stop its computations without engaging in the usual sequence of vacuous SEND actions, as long as no current SEEK action was tagged for an obligatory constituent. Alternatively, since obligatory constituents are, in a perfectly good sense, anticipated in the structure of the network, there could be a device for tracking ahead through the network, recording in a special memory store the existence of obligatory arcs that have yet to be traversed, and cancelling them as and when they are in fact traversed. As far as we can see, there are no real objections to such mechanisms. But they are certainly complications in an ATN system. The tags on SEEK actions, or the arcs listed in the special memory store, would duplicate information that is already in the network without putting it where it must eventually end up, i.e., in the phrase marker being constructed for the sentence. In the SM, all this record-keeping is done in the phrase marker itself; information is taken from the rules and entered into the phrase marker, without being duplicated anywhere else on the way. An ATN could be designed to do the same, of course. But we would emphasize that such an ATN would differ from current ATNs in just the way that we pointed out in FF: it would have to have access to the phrase marker it has constructed while making its decisions (e.g., the decision whether or not to terminate its
parse of the sentence). We must confess that our way of expressing this general point in FF was misleading, in a way that Wanner (p. 223) has quite properly drawn attention to. Though we didn’t quite say so, we did imply that the parser needs to be able to view the phrase marker it is constructing in order to be able to detect several alternative attachment possibilities and compare their merits with respect to some general geometric principle. This is actually inconsistent with our goal of dispensing with overt ‘strategies’ which tell the parser what to do when there is a choice between alternative actions. Instead, as Wanner points out, the general structural preferences of the parser result precisely from its not looking at alternative attachment possibilities and deliberately choosing between them, but simply doing the first (or only) thing that presents itself as a possibility. We hope we have made it clear in the present discussion that, despite this expository imprecision, it is still true that the most plausible explanations for some of these general preferences (specifically, RA and LA) presuppose that the current partial phrase marker is at least partially ‘visible’ to the executive component of the parser.
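Returning to the tagged SEEK arcs of (36), the proposed stopping condition can be sketched as follows. This is an illustration under assumptions that are not in the original: the OpenSeek record and the function may_terminate are hypothetical names, and each SEEK action still open at the end of the input is represented simply as a pair of its arc name and its pending tag, if any.

```python
from typing import List, Optional, Tuple

# Hypothetical record of one open SEEK action:
# (arc name, tag naming an obligatory arc that must still be traversed, or None).
OpenSeek = Tuple[str, Optional[str]]

def may_terminate(open_seeks: List[OpenSeek]) -> bool:
    """True if no open SEEK action is tagged for an obligatory constituent."""
    return all(tag is None for _arc, tag in open_seeks)

# Input runs out while the subject NP is being parsed: the tag [SEEK VP] is still pending.
DURING_SUBJECT: List[OpenSeek] = [("SEEK NP", "SEEK VP")]

# Input runs out inside the VP: the tag was cancelled when SEEK VP was initiated.
DURING_PREDICATE: List[OpenSeek] = [("SEEK VP", None)]

if __name__ == "__main__":
    print(may_terminate(DURING_SUBJECT), may_terminate(DURING_PREDICATE))
```

The list of open SEEK actions here is exactly the duplicated record-keeping discussed above: it repeats information that the network already contains and that must in the end be represented in the phrase marker.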
6. Conclusion
The most striking (and, we think, the best motivated) property of the SM model of the HSPM is its two-stage structure. But there are two other properties that we also regard as important which are not shared by current ATNs. One is the separation of grammatical information about the language from the action plans that determine how the grammatical information is to be used. The other is the accessibility of the current partial phrase marker to the decision making routines. In FF we argued for the equation:
HSPM = SM ≠ current ATNs
We have argued here that this equation still holds, even if the revised ATN that Wanner has proposed is included among current ATNs. Our own attempts to devise ATNs that do simulate the SM and hence the HSPM suggest further that the functional architecture of ATNs in general does not match that of the human sentence parsing mechanism. This conclusion must of course be a tentative one, for it is always possible that those who are more experienced than we are in working within the ATN framework will be able to construct an ATN which ranks high on the implicit evaluation metric, has the same performance characteristics as the SM, and yet does not share these three basic properties of the SM. We think
that it is at least worthwhile, however, to get these issues out into the open, for ATNs seem to have a considerable appeal in psycholinguistics - perhaps because they look to be so well-tailored to the demands of natural language sentences with their recursive embedding of phrases within phrases. But it may be that what this amounts to is nothing more than that an ATN embodies a phrase structure grammar and an efficient means of applying it to word strings; it thus avoids all the complications of analysis-by-synthesis routines, or ‘backwards’ transformational derivations, and also the imprecision of ‘detective’ models such as that proposed by Fodor, Bever and Garrett (1974). But it must be borne in mind that these advantages are not exclusive to ATNs.