OXFORD UNIVERSITY PRESS
Great Clarendon Street, Oxford OX2 6DP Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Auckland Bangkok Buenos Aires Cape Town Chennai Dar es Salaam Delhi Hong Kong Istanbul Karachi Kolkata Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Sao Paulo Shanghai Taipei Tokyo Toronto Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries Published in the United States by Oxford University Press Inc., New York ©
the several contributors 2003
The moral rights of the author have been asserted Database right Oxford University Press (maker) First published 2003 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer British Library Cataloguing in Publication Data Data available Library of Congress Cataloging in Publication Data Data available ISBN 0-19-924561-4 (hbk.) ISBN 0-19-924562-2 (pbk.) 1 3 5 7 9 10 8 6 4 2 Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India Printed in Great Britain on acid-free paper by Biddles Ltd, Guildford and King's Lynn
1 Agency and Self-Awareness: Mechanisms and Epistemology

Naomi Eilan and Johannes Roessler
CONTENTS
1. Introduction
2. Mechanisms of control
3. Epistemological issues
4. Action and perceptual consciousness: dissociations and connections
   4.1 Awareness of bodily movements in intentional action
   4.2 The role of perceptual consciousness in the control of intentional action
5. The sense of ownership
1. INTRODUCTION

In the mid-1980s Libet conducted a series of experiments which, he argued, showed that we are aware of intentions to act only after the brain area responsible for initiating action has already been activated. He, and others, took these experiments to show that our brain 'decides' on courses of action independently of our consciousness, where this in turn threatens the idea of free will, the idea that our actions are controlled by our conscious intentions. For immediate purposes the main point to note here is the intuitive connection highlighted by the debate surrounding Libet's claims, between the kind of control we think we have of our actions, on the one hand, and our awareness of our intentions and actions, on the other. For the suggestion underpinning Libet's claims is that unless the initiation of the action is something we are aware of, the action itself is not under the kind of control we think we have as agents, the kind of control in virtue of which we speak of freedom of the will. Though there is undoubtedly a powerful intuition about the existence of some kind of link between agency and self-awareness, it turns out to be extremely difficult to articulate and defend. A pathological case that brings out both the power of the intuition and the complexities of the problems involved in articulating and explaining it is the neurological syndrome labelled 'Anarchic Hand'. Patients with
Anarchic Hand syndrome sometimes find one of their hands performing complex, apparently goal-directed movements they are unable to suppress (except by using their 'good' hand). Sometimes the anarchic hand interferes unhelpfully with intentional actions performed with the use of the other hand (it may unbutton a shirt the patient is trying to button up). Sometimes it performs movements apparently unrelated to any of the agent's intentions, such as (in one notorious example) a movement resulting in picking up some leftovers from somebody else's plate in a restaurant. In most of these cases patients go on to claim that it feels as if the actions performed by the anarchic hand are not theirs, and that the hand is doing something they did not intend or want, and cannot control. Perhaps the most immediate question that might come to mind about the anarchic hand's movements is: are they actions? In the light of patients' reaction to the movements, the first impulse may be to deny this. On the face of it, the patients are not in control of, or responsible for, the movements at all. Yet there does seem to be a sense in which the activities of the anarchic hand are skilfully controlled: they are not pure reflexes, but clearly devoted to a particular goal, and, relative to the goal, well-executed. Then the question is: what is the nature of the control that is absent in these cases, and whose absence fuels the intuition that they are not to be regarded as actions? A similar question arises when we look at patients' awareness of the movements. Just as there is evidently some kind of control in the anarchic hand case, there is also some kind of awareness on the subject's part of what is going on. It is not as if the subject does things unknowingly. Rather, there is a sense in which she has knowledge of what is going on of the same kind one might have as an observer of someone else's actions.
The question is: how should we characterize the normal 'inside' knowledge, relative to which we can describe the anarchic hand case as deviant? It is here that we should hope to find some help in characterizing the sense of agency, and ownership, that accompanies most actions. So there is a question about the causal explanation of actions-the mechanisms of control-in the normal case. And there is a question about the source of our knowledge of actions (again, in the normal case). And, of course, there is the question of how these two sets of issues are related to each other. In his tutorial paper 'Control of Mental Processes', Stephen Monsell says there are two distinct mysteries that come up when we attempt to explain the nature of control. One is the question of how, in information-processing terms, we should explain the kind of control that seems to break down in cases like Anarchic Hand syndrome. The other is how we should explain consciousness. His strategy is to divide and conquer, leaving 'with some relief' the phenomenal commitments of control to others, while focusing instead on the first question only, namely, what observations of behaviour can tell us about how we control our minds and behaviour (Monsell, 1996: 108). Methodologically, as the title of the volume suggests, we have taken the opposite tack. The explicit idea here is that we may be able to make progress with each of the 'mysteries' only by examining the interconnections between them. However, to take
this methodological line is not the same as delivering a positive substantive answer to the question: is the correct account of control one on which there is a constitutive connection between the kind of control we exert over our everyday normal actions, and self-awareness? Implicit in many of the papers collected here we find support for both negative and positive answers to this question. In taking this methodological line we are also simultaneously inviting readers to draw on both psychological and philosophical resources in addressing these questions. The central challenge here, as we hope to bring out in these introductory comments, is to get right the relation between common-sense psychology and various branches of professional psychology (e.g. neuropsychology, cognitive psychology, and developmental psychology). In what follows we will not attempt a comprehensive summary of the concerns of each contribution-the papers speak eloquently for themselves, each one raising fascinating issues that we will not be touching on. Rather, our aims are twofold. First, in Sections 2 and 3 we sketch a map of some of the background issues that inform the psychological and philosophical discussions of (a) control mechanisms and (b) epistemology, where the main aim will be to facilitate cross-disciplinary reading of the papers. Second, in Sections 4 and 5, we consider how the epistemological and mechanism questions interact in specific cases, drawing on the papers in this volume.
2. MECHANISMS OF CONTROL
Failures of Control

What if anything can we learn about normal intentional action control by considering abnormal actions such as that exhibited in Anarchic Hand syndrome? There are in fact two kinds of questions they raise. One kind is: what does our bafflement about such cases show about our everyday understanding of the way actions are controlled, relative to which anarchic hands count as deviant? What are the everyday expectations that are thwarted in such cases? The other question is: what do such cases tell us about the mechanisms involved in control of everyday action? How can we learn from such failures about the proper functioning of action control? In this section we want to lay out some of the basic ideas that inform the psychological and philosophical papers most directly concerned with control. We do so by putting the initial bafflement we feel about the deviant cases in terms of a crude paradox, to which these basic ideas can be read as a response (though they are not explicitly presented as such). To give a flavour of the kinds of cases at issue here, it will be helpful to have before us a brief list of the kinds of failure of control that are considered in the psychological literature concerned with supplying an information-processing account of normal action control. There are basically three kinds.
1. Pathological cases. Earlier we mentioned Anarchic Hand syndrome. This is one of many pathological cases, which psychologists use to try and gain insights into the mechanisms and anatomical basis for the normal kind of control. Other pathological examples of failure of control, all of which tend to be classed as 'frontal lobe' problems, include:
(a) utilization behaviour, where subjects apparently lose the ability to overcome the power of a stimulus to invoke a habitual action. So for example, such a patient may be found making a cup of tea whenever a teabag comes into view, or start dealing cards whenever they come into view, and so forth;
(b) perseveration, where subjects seem to be unable to stop performing a particular sequence of actions; for example, a patient will go on sorting cards according to one dimension (colour), even though he has explicitly been given the instruction to switch to another (e.g. shape);
(c) dissociations between verbal announcement of intentions and actions; for example, a patient may announce an intention to do one thing while doing the opposite and simultaneously describe what he is doing as the opposite of what he intends.
And many others (for a good summary of these see Monsell, 1996).

2. 'Slips of action'. The second kind of case psychologists use for formulating theories about normal control of actions comprises 'normal' failures of control, that is, non-pathological failures of the kind that happen to all of us, most of which we tend to classify as cases of absent-mindedness. The locus classicus for this is Reason's categorization of such slips of action, which takes as its point of departure James's description of going upstairs to change for dinner and finding himself getting into his nightclothes and climbing into bed instead. Reason labels this a 'double capture error', where attention is captured by some internal preoccupation, allowing the action to be captured by a stimulus associated with a strong habit. A double capture of omission rather than commission is failing to stop off for the paper on the way home, despite the intention to do so. Another kind of case he labels 'lost intentions', as in the familiar cases of finding oneself in a room and not remembering why on earth one wanted to be there in the first place. Yet another case is labelled one of 'detached intentions'-as when you cross a room in order to shut the window but shut the cupboard door instead. (For a comprehensive survey and description of these and many other types of slips of actions see Reason, 1984; and Monsell, 1996.)

3. Infant failures of control. The third kind of deviation from the norm is found in the developmental literature. Here, the concern is not to investigate the adult norm on the basis of deviations, but rather to gain insights into the developmental stages involved in achieving the mature human norm. An example of a failure of control here is the failure of children until the age of about 5 to be able to play Simon Says games, in which players must listen to commands to perform simple actions but are required to carry out only those commands that
are preceded by the phrase 'Simon says'. They evince every sign of understanding the instruction verbally, but as soon as play begins seem unable to switch from one kind of instruction to another. This feature has been replicated in a number of experiments, discussed in Frye and Zelazo's chapter, where children seem to be unable to switch from one task to another, in some kinds of case but not others, while at the same time being able to repeat the instructions and manifesting some kind of understanding of them. Now a background question is, of course: how are these different kinds of deviations from the norm related to each other? The question of the connection between developmental findings, on the one hand, and slips of action and pathological cases, on the other, is raised explicitly by Frye and Zelazo, and we return soon to their paper, and Jennifer Hornsby's comment on it. Let us begin with the first two kinds of deviation, pathologies and slips of action, which are normally considered together in the psychological literature, and are thus treated by Perner and by Humphreys and Riddoch. Here too there are, of course, fascinating and difficult questions about the connections between different kinds of slips and different pathologies. But for the purposes of introducing the issues we are interested in, let us focus on the two kinds of cases that figure most prominently in these two papers.

Double Capture Errors. Suppose you are driving along a familiar route (say, on your way home from work), intending to take an unfamiliar turn (say, to go to the supermarket). It is a common experience that such intentions are sometimes ineffective: when you reach the critical junction there is a chance that you will absent-mindedly drive on along the familiar route.

The 'cups' experiment. Humphreys and Riddoch report on their experimental work with a patient, ES, who had suffered brain damage to the supplementary motor area in both hemispheres.
ES sometimes showed the kinds of symptoms associated with Anarchic Hand syndrome. But under experimental conditions, she also showed a more specific impairment. In one experiment, ES was presented with a cup placed on the table in front of her, aligned with either the right or the left side of her body. She was asked to pick up the cup, using the left hand when the cup was placed on the left, and the right when it was placed on the right. ES had no difficulty complying with the instruction to pick up the cup, but she often used the wrong hand. Strikingly, a key factor was the direction in which the cup's handle was facing. If the cup was placed on the left, with its handle pointing to the right, ES would tend to pick it up, conveniently, with her right hand. She would usually claim to be unaware of not having complied with the instruction.
The Paradox

Our first question was: what is it about our everyday conception of intentional action relative to which both cases are surprising? An immediate response to the
very question might be that we do not find the slips of action surprising in anything like the way we do the pathological cases. That may be so, but the explanation for the lack of surprise might be simple amused familiarity in the slips of action case. If there is a substantive difference between the two, it has to be argued for, and we will return to such an argument later on. In the meantime, there is at least prima-facie value, when our interest is in our everyday expectations, in being forced to consider these two kinds of cases together. And when we do, the initial bafflement about all these cases can be stated as follows. In many of them an intention appears to be thwarted, not, familiarly and unproblematically, by a physical obstacle, or physical inability, nor, on the face of it, by any failure of reasoning capacities, or by other agents. Rather many of these cases seem to be ones in which the agent herself thwarts her own intentions by acting with a conflicting intention. And the air of paradox here turns on the potential threat to the unity of agency implicit in such a description. To spell this prima-facie paradox out in more detail, it will help to have before us the bare bones of current philosophical understanding of the nature of intentional action. Very roughly, on this conception, it is a constitutive feature of intentional actions (as opposed to a reflex, say) that they are explained in terms of the agent's intentions. Such explanations appeal to a good reason, from the agent's perspective, for performing the action, where the agent's perspective will normally involve desires for particular outcomes and beliefs about how to achieve them. Though on some accounts this precludes the explanation from being causal, most philosophers today would accept Davidson's point that the only way an intention can rationally explain an action is if it causes it. So in citing the agent's reason for performing an action we are citing its cause.
Note that actions tend to be construed here as particulars with numerous properties or descriptions. For example, 'reaching for a drink' can be a description of the same action as 'spilling the drink'. The notable difference, of course, is that when it comes to explaining the action in terms of your intention, only one of the descriptions will be suitable and relevant. A common way of putting the point is that actions are intentional only under descriptions; an action may be intentional under one description, yet unintended under another. Consider now our example of a slip of action. If the action of turning left at the light is an intentional action, then it must be done for a reason. Such a reason conflicts with the reason I have for not turning left, namely, my intention to stop off at the supermarket. Such conflicts often occur, of course, and we resolve them by deciding on one course of action rather than another, which in this case we would describe as discarding the previous intention in favour of the one that informs my turning left. We change our mind about what to do. But on the face of it no such change of mind has occurred, that is what makes us baffled, amused, and irritated. How can it be that the very same agent who decides on one course of action performs an action that conflicts with it without having discarded the previous intention?
The bafflement here, and the problem, are, of course, very similar to those that get discussed under the heading of 'weakness of the will' or 'akrasia'. How can it be that I decide to stop smoking forever and a second later reach out for a cigarette without having discarded the intention to stop? We seem to have before us two competing sources of agency within the same agent. But a feature that prima facie distinguishes the cases we are considering from those that fall under the heading of akrasia is that we do not, even on the face of it, have a case of desires competing against reason. There appears to be no sense in which a reason-dodging powerful desire is at play in driving me to turn left. We appear to have cases here of unmotivated, cold irrationality. The moves one might make in resolving our paradox can be represented as falling into two camps according to how they develop a distinction drawn by James between environmentally driven actions, which he called 'ideomotor actions', and what he called 'willed actions'. The former are defined as bodily movements caused solely by the 'bare idea' of their perceivable effects. (James's examples include spotting a pin on the floor and picking it up while simultaneously carrying on with a conversation.) The latter are said to be preceded by 'an additional conscious element, in the shape of a fiat, mandate, or express consent' (1890b: 522). The first camp says that while there is an important distinction between environmentally driven actions and those preceded by deliberation, it is a distinction between types of intentional action. Correlatively, our everyday notion of intentional action can, if not given misconceived philosophical glosses, dissolve the paradox. The second camp says that amongst ideomotor actions we must recognize a class of non-intentional actions, actions that cannot be explained in terms of personal-level intentions but only by an information-processing psychology.
Correlatively, it is this information-processing notion of action that we need for dissolving the paradox. Both the chapters by Perner and by Humphreys and Riddoch fall into the second camp, though not in explicit response to the paradox, and we begin with them, before returning to the first kind of response.
The Two-Level Theory

Much work in the psychology of action is informed by what is known as the ideomotor conception of action. On this conception, actions, in contrast to movements that can be explained by appeal to SR psychology or pure physiology, are controlled by representations of their perceivable results (James's 'bare idea of a movement's sensible effects': see 1890b: 522), in the sense that the representation is causally responsible for the movement and may serve to correct errors during execution. (Note that the ideomotor conception of action is to be distinguished from the notion of ideomotor actions, alluded to a moment ago. The former is intended to subsume all actions, ideomotor as well as 'willed' actions. For detailed discussion of both the ideomotor conception of action and ideomotor actions, see Wolfgang Prinz's chapter in this volume.)
Now, explanations of events by appeal to the goal-representation that causes them are more inclusive than intentional explanations as characterized above, in at least two respects. First, on the face of it, goal-directedness can apply to the behaviour of subjects where appeal to reasons for acting is for some reason inappropriate, but which is none the less of a kind to which SR explanations do not apply. Second, consider the mechanisms involved in executing even simple actions, like lifting a cup. Information-processing accounts of how we succeed in implementing our intentions will appeal to the operations of a variety of subsystems, the operation of which will need to appeal to represented goals. And here, too, the ascription of reasons for action is otiose and inappropriate. (We will discuss some of the issues raised by such accounts in Section 4.) The sense of paradox in the absent-minded case was derived from an assumption that intentional action is action for a reason, from the subject's perspective, and that the subject should be credited with two conflicting intentions, and acts on one of them only, without having abandoned the other. But once a weaker notion of action is introduced, there is more room for manoeuvre. In the absent-minded case, and in pathological cases, we can re-describe either or both of the intentions in terms of goal-representations, and then go on to give a purely causal, mechanistic (i.e. non-rational) account of how it comes about that, in particular circumstances, the goal informing a person's behaviour may be overpowered, causally, by a different goal, giving rise to an action incompatible with the first goal. On this kind of account, our bafflement is due not to a sense of paradox, but to unfulfilled causal expectations based on generalizations from the usual case.
More specifically, the following two claims are common ground between Perner and Humphreys & Riddoch, though, as will emerge below, they develop the second claim in rather different ways. First, intention-thwarting behaviour, such as in the examples just given, is controlled by a low-level information-processing mechanism, whereby stimuli automatically activate well-learned, habitual activities. (The familiar sight of the junction activates the action of turning left; the sight of the right-facing handle prompts the right hand to be selected for grasping the cup.) Second, intentions do not control behaviour 'directly' but by modulating the lower-level mechanism. To illustrate, Humphreys and Riddoch suggest that if you form a prior intention, say, to go to the supermarket before driving home, you thereby prime, or pre-activate, a particular response to a particular stimulus (e.g. the response of turning right at the junction where you normally turn left). Whether or not your intention will be effective depends on whether, when you reach the junction, the excitation of the primed response is greater than that of the response habitually activated by the stimulus. So the low-level mechanism serves two explanatory purposes in this picture: it is responsible for unintended habitual actions; and it is the mechanism by which intentions get a purchase on behaviour. Humphreys and Riddoch note an interesting parallel between this approach and late selection theories of perceptual attention. According to the latter, perceptual
analysis of stimuli, including categorization, operates in parallel, independently of deliberate attention. So processing, not only of physical features such as spatial position, but also, crucially, of the identity of stimuli, is unselective. Similarly, the two-level approach to action control maintains that habitual stimulus-response associations are processed independently of deliberate attention. Accordingly, when an agent fails to modulate the responses automatically activated by stimuli, whether due to absent-mindedness or to brain damage, her behaviour will be controlled by stimuli rather than by her intentions. An action such as lifting a cup, then, may be caused directly by the perception of the cup, or by the perception being primed by a prior intention. And both chapters suggest that in the first case we should describe the action as 'unintentional', not just in the familiar sense in which we talk about unintended aspects of intentional actions, but in the radical sense that it is an action that has no intended aspect at all-an action that is intentional under no description. So, as Perner argues explicitly, we have to abandon the view, which has been influential in the philosophy of action, that for an event to be an action just is for it to be intentional under some description. Now, that view has been challenged before. For example, according to O'Shaughnessy (1980), unthinkingly tapping a rhythm is a 'sub-intentional' action. But one important difference between the two challenges is this. Sub-intentional actions, according to O'Shaughnessy, are causally explained by desires-he is not suggesting that such behaviours fall outside the purview of common-sense psychological explanation. But on Perner's view, common-sense psychology simply has nothing illuminating to say about double capture errors. Neither intentions nor desires are relevant.
So we need an information-processing notion of habitual action to fill the explanatory gap in the common-sense scheme. Moreover, the sense of paradox we set out with is, on this account, the consequence of a mistaken over-extension, and, thereby, mistaken application, of our notion of intentional action.
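The two-level account can be caricatured computationally. In the sketch below (our own illustration, not a model proposed by Perner or by Humphreys and Riddoch; the names and activation values are invented), a stimulus automatically activates its habitual responses, a prior intention merely primes a response, and behaviour goes to whichever response has the greatest total excitation:

```python
# Toy sketch of the two-level account of action control.
# Stimuli automatically activate habitual responses; a prior intention
# does not trigger behaviour directly but primes (pre-activates) a
# particular response. The response actually produced is whichever has
# the highest total excitation. All values are illustrative.

# Learned stimulus -> response activation strengths (hypothetical).
HABITS = {
    "junction": {"turn_left": 0.8, "turn_right": 0.2},
}

def select_response(stimulus, priming):
    """Return the response with the highest combined excitation."""
    excitation = dict(HABITS.get(stimulus, {}))
    for response, boost in priming.items():
        excitation[response] = excitation.get(response, 0.0) + boost
    return max(excitation, key=excitation.get)

# No prior intention: the familiar stimulus captures behaviour.
assert select_response("junction", {}) == "turn_left"

# A prior intention ('go to the supermarket') primes the unusual turn;
# strong enough priming makes the intention effective ...
assert select_response("junction", {"turn_right": 0.7}) == "turn_right"

# ... but weak priming loses to habit: a 'double capture' slip.
assert select_response("junction", {"turn_right": 0.3}) == "turn_left"
```

On this toy picture, a double capture error is simply the case where the excitation supplied by the prior intention fails to outweigh the habitually activated response.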
Common Sense and Mechanism

It is precisely at this point that a defender of the first line of response to the paradox will want to step in. The first line, recall, was that our notion of intentional action, so long as it is not given an incorrect theoretical gloss, can cope with dissolving the paradox generated by slips of action. A defender of this approach will insist that, contra Perner, such actions are intentional under some description. They might be compared with cases where we act without any deliberation, but clearly intentionally, in response to perceived affordances (e.g. accelerating when perceiving a traffic sign signalling the end of a speed restriction). Recall our earlier example of a double capture error: you turn left despite the fact that this thwarts your intention to go to the supermarket. On Perner's view, when we say that you took the left turn 'automatically', we show some awareness that the
explanation of your action is to be given not in terms of your practical reasons, the aims you were pursuing, but in terms of a mechanism that (on this occasion) operated independently of your objectives. On the alternative interpretation, the term 'automatic', in this context, should be read as spelling out how you acquired the intention to turn left: the intention was not the result of active deliberation; rather, you let yourself be 'saddled with' it by uncritically taking at face value the perceived practical significance of a stimulus.1 Of course, this interpretation does not make the impairment to the unity of agency disappear-it suggests, precisely, that in the cases under consideration agents thwart their own intentions by acting with a different intention. But it may help to understand why there is nothing paradoxical about such impairments. The sense of paradox stems from the assumption that agents are active all the way down, as it were-active, that is, not just in executing but also in acquiring intentions. Once we acknowledge that intentions can be acquired unintentionally, we will no longer be baffled by breaches of the unity of agency. As James put it, 'not only is it the right thing at the right time that we thus involuntarily do, but the wrong thing also, if it be an habitual thing' (1890a: 114). On the other hand, given that rational agents are certainly able actively to form and revise their intentions, if they put their mind to it, we may still think of such cases as exceptions to the norm, where the norm is a matter, not of causal generalizations from the usual case, but of normative principles, such as coherence, that are constitutive of rational agency. Clearly even on this alternative interpretation, the resources of common-sense psychology in explaining your absent-minded left turn remain limited.
We can explain your action in terms of your intention, but will find little to say, other than referring to habit and absent-mindedness, about the explanation of your intention. At this point an information-processing theory might yield further illumination: for example, the unselective processing of habitual stimulus-response associations might be invoked to explain why the stimulus caught your attention. So the debate is between, on the one hand, theories on which information-processing mechanisms are held to give an exhaustive causal account of an action, and, on the other hand, theories on which information-processing explanations work in tandem with common-sense rational explanations. A similar (though far bigger) issue arises with respect to the explanation of actions informed by prior intentions. As noted earlier, Perner and Humphreys & Riddoch agree that the 'lower level' of control is also highly relevant here: intentions

1 This kind of view might also be motivated by philosophical concerns with the nature of rule-following. Crudely, the thought would be that we must make room for the notion of habitual intentional actions, where one finds oneself saddled with an intention, on pain of getting wrong various clearly intentional aspects of our actions. Compare, for example, John McDowell's paraphrase of Wittgenstein's account of rule-following: 'When one follows an ordinary sign-post, one is not acting on an interpretation. This gives an overly cerebral cast to such routine behaviour. Ordinary cases of following a sign-post involve simply acting in the way that comes naturally to one in such circumstances, in consequence of some training that one underwent in one's upbringing' (McDowell, 1984).
Mechanisms and Epistemology
11
get executed by priming the activity of the lower-level mechanism. But how should we conceive of the interaction between the 'higher' and the 'lower' level? On this issue the chapters by Perner and Humphreys & Riddoch present somewhat different perspectives. Humphreys and Riddoch speak of an information-processing system that is sensitive to environmental factors as well as intentions, without proposing an analysis of the latter in information-processing terms. On a natural interpretation, the 'higher level' here simply is the level of common-sense psychology, or the 'personal level'. So the claim might be put by saying that when you form a prior intention you thereby prime the appropriate stimulus-response associations (though of course this is not something you are aware of doing). We might think of this top-down influence on the lower-level system as a mechanism of 'delegating control to the environment' (Gollwitzer, 1996). It enables you to carry out your intentions effortlessly, as it were: when the time comes, the appropriate stimuli will select the right actions for you. In Perner's account, in contrast, the distinction between two levels is explicitly understood as a distinction between two information-processing systems. An influential version of this idea is the distinction between a system of 'contention scheduling' and the 'supervisory attentional system' (Norman and Shallice, 1986). Briefly, these are two different kinds of mechanisms for resolving conflicts amongst action schemas (action programmes activated by stimuli). Contention scheduling consists in the mutual inhibition of competing schemas and the mutual activation of supporting schemas, with the most strongly activated schema winning out. Alternatively, a schema may be selected because its activation value has been strengthened by the 'supervisory attentional system', an information-processing system devoted to the planning and execution of non-routine actions. 
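Contention scheduling, as just described, can be sketched as a simple mutual-inhibition competition. The following toy model is only an illustration: the schema names, activation values, and update rule are invented, not taken from Norman and Shallice.

```python
# Toy sketch of contention scheduling: competing schemas mutually
# inhibit one another, and the most strongly activated schema wins out.
# All names and numbers here are invented for illustration.

def contention_scheduling(activations, inhibition=0.1, steps=50):
    acts = dict(activations)
    for _ in range(steps):
        total = sum(acts.values())
        # each schema loses activation in proportion to its competitors'
        acts = {s: max(0.0, a - inhibition * (total - a))
                for s, a in acts.items()}
    return max(acts, key=acts.get)

# Under routine conditions the habitual schema is more strongly activated.
schemas = {'turn-left-to-work': 0.8, 'drive-straight-to-shops': 0.6}
winner = contention_scheduling(schemas)          # habitual schema wins

# The 'supervisory attentional system' is modelled here only as a
# top-down boost to the activation value of a non-routine schema.
boosted = dict(schemas)
boosted['drive-straight-to-shops'] += 0.5
winner_supervised = contention_scheduling(boosted)
```

On this picture, everyday slips of action correspond to occasions on which the supervisory boost is absent and the habitual schema wins by default.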
Now, a common complaint against the idea of a 'supervisory attentional system' is that it credits an information-processing system with the causal powers of a rational agent. For example, Dennett ridicules the system as 'an ominously wise overseer-homunculus who handles the hard cases in the workshop of consciousness' (1998: 288). On one reading, the complaint here is that the theory lacks explanatory force-it does not tell us how the operation of the supervisory attentional system carries out the kinds of functions we associate with rational agency. Perner aims to meet this objection by offering a substantive account of the 'higher level'. He suggests that the distinction between the two levels corresponds to a distinction between two kinds of representations. Put very crudely, the lower level involves representations that are strongly context-dependent and (connectedly, as Perner argues) lack explicit semantic structure. At the higher level, in contrast, representations not only have explicit semantic structure; they are also 'fact-explicit'-they make explicit whether they aim to represent fact or fiction. In other words, they involve a simple form of metarepresentation. Perner goes on to put this distinction to work in presenting a detailed taxonomy of the phenomenology of action, as well as in defending a version of a 'higher-order thought' theory of consciousness.
12
Naomi Eilan and Johannes Roessler
Without considering these further developments here, and indeed without going into the details of Perner's proposal, it is worth highlighting a methodological issue raised by it. What should count as evidence for the existence of two distinct information-processing systems underpinning 'automatic' and 'deliberate' behaviour? Norman and Shallice (1986) put considerable weight on neurological disorders which may be taken to show that the two systems can be selectively impaired. For example, in their view, utilization behaviour provides an illustration of contention scheduling operating on its own, unmodulated by the supervisory attentional system. By the same token, though, it may be argued that the functions associated with the 'higher level' are subserved by distinct, dissociable systems. Just this argument is made in Humphreys and Riddoch's chapter, entitled 'Fractionating the Intentional Control of Behaviour'. Their point is not just that the processes underpinning intentional control cannot be located in a single anatomical area. Rather, what ES's performance in the 'cups' experiment, as well as other data, suggest is that there is a functional distinction between intentional control of the selection of targets (e.g. which cup to pick up) and intentional control of responses (e.g. which hand to use). Humphreys and Riddoch argue that there are two distinct information-processing systems that underpin these functions. Damage to one of the systems can leave the other intact. Of course, one might insist that there is still an important sense in which these systems are components of a single 'higher-level' system. As for evidence for the existence of such a system, Perner would argue that there is compelling evidence with which we are all familiar. As he puts it, 'dual control theory helps avoid the conclusion that conscious will is an illusion'. We are all committed to thinking of intentions formed on the basis of deliberation as explanatory of many of our actions. 
The two-level information-processing theory, Perner argues, vindicates that commitment. The dialectical situation here is far from clear, though. One response to Perner might be that the empirical evidence, as discussed, for example, by Humphreys and Riddoch, tells against the hypothesis of a single higher-level system, and that, if the latter is required to make good our common-sense picture, then so much the worse for common sense. Again, one might be sceptical on philosophical grounds about the possibility of vindicating the common-sense conception of 'conscious will'. The worry might be that the price for vindication is reductive explanation, and that the latter is not a feasible project, given the prominent role of normative notions, and/or conscious experience, in common-sense explanations of action. And of course there is the background question of whether such vindication by a particular kind of information-processing theory is required to avoid the conclusion that conscious will is an illusion. (The issues arising from this last question are familiar from other areas, such as debates as to what should count as evidence for the 'language of thought' hypothesis. See, for example, Davies, 2000.)
Control and Knowledge

To return to our paradox, it is worth drawing attention to the possibility that the different kinds of example may not all be amenable to the same kind of treatment. We have focused on the case of everyday slips of action, where there is clearly room for debate as to whether there is a description under which a double capture error may be an intentional action. But consider the standard symptoms of Anarchic Hand syndrome. The case against thinking of these movements as intentional actions is stronger than with non-pathological absent-mindedness, as patients manifest both their awareness of what is going on and their determination to stop it by trying to arrest the movement using their good hand. So perhaps in this kind of case it is undeniable that we need an information-processing conception of action. (Compare Peacocke's suggestion that the movements constitute actions that do not have an agent-that are not someone's actions.) Again, the case of ES may call for yet another kind of explanation. There is no doubt that ES's movements in the 'cups' experiment are intentional actions-they are clearly intentional under the description 'picking up the cup'. But how about the description 'using the right hand', in a case where this violates the instructions (and her intention to comply with them)? Humphreys and Riddoch suggest that the source of the problem is that the mechanism by which prior intentions prime the lower-level system is damaged; the intentional specification of the target (the cup) is effective, but the representation that should modulate response selection is 'degraded'. It might be said that this amounts to being unable to act intentionally as far as response selection is concerned. 
On the other hand, one might insist that perceiving the cup's handle as pointing to the right gives ES a good reason to use the right hand (it 'affords' picking up with the right), which would suggest that the response selection is intentional, albeit, owing to her pathological absent-mindedness, at variance with her prior intention. Consider now an example described by Frye and Zelazo of 3-year-olds' inability to play the following kind of game. The children are presented with two cards, for example, a red triangle and a blue circle. The task is to match a series of cards to these targets by some specified dimension (e.g. by colour). After a few rounds they are asked to switch to a different dimension (e.g. shape). It appears that 3-year-olds have great difficulty with this game: they tend to continue to sort by the first dimension, despite grasping the rules of both the shape game and the colour game, and being able to play each one separately, and despite being invited to switch to the shape game. The question, then, is: given the intention to cooperate with the examiner, why do 3-year-olds persistently refuse to switch dimensions? Initially, one might suppose that when asked to switch from the colour game to the shape game children form the right intention, but then fail to inhibit an incorrect habitual response, the kind of explanation we may want to give in both pathological and absent-minded cases. In fact, however, 3-year-olds' behaviour
seems to be perfectly deliberate, as can be seen, for example, from their comments on others' performance on the task, in which they insist that others too should persist in sorting by colour rather than shape. This is what they think, on the face of it, they should be doing, and it is relative to that that they judge their own and others' success and failure. Yet on separate occasions, they exhibit everything one would expect if they do know what should be done when the instruction is to sort by shape. So the developmental cases of failures of control seem to raise specific issues, and a special kind of challenge. While in the case of ES, for example, we may resort to a combination of common-sense psychology (to explain the use of the right hand in picking up a cup that affords picking up with the right) and neuropsychology (to explain ES's inability to select responses in line with her prior intentions), neither of these kinds of explanations seems capable of casting much light on 3-year-olds' performance in the card-sorting experiment. Crudely, their behaviour seems to resist explanations in terms of some kind of (perhaps pathological) absent-mindedness; rather, it seems to force us to revise our conception of what it is for a mind to be present. In her commentary on Frye and Zelazo's chapter, Jennifer Hornsby argues that from the point of view of common-sense psychology the 3-year-olds' behaviour is inexplicable. The common-sense principle that renders it such is, she suggests: 'If you know what you should do and are able to do it then (in the absence of any tendency not to do it), you will do it.' There is good evidence that they do know what they should do, that they have no tendency not to play the game, and yet they don't do it. But if we take their behaviour to be governed by the common-sense generalization, we must say that they do not know what they should do. 
So, at the post-switch stage we must treat the child as both knowing and not knowing what she should do. This, in turn, suggests that, if their behaviour is to be at all explicable, we must turn to something other than common-sense psychology, an empirical theory that will explain these results. In effect, what Frye and Zelazo propose is a resolution of this paradox by appeal to a developmental theory on which the children are unable to work out what they should do, relative to the intention to cooperate with the examiner. What they cannot do is formulate the kind of switching rule required for success in the task. In more detail, on the developmental account Frye and Zelazo propose, the middle of the second year is a crucial milestone in the sense that children for the first time control their behaviour by the use of 'conditional rules' linking means and ends. Put differently, they engage in practical reasoning. Frye and Zelazo place this change in the context of a general developmental progression from domination by the 'exigencies of environmental stimulation' to a more autonomous mode of control, which they characterize in terms of the progressively more complex rules children are capable of employing in forming intentions. With respect to the developmental change manifested in the card-sorting task, they argue that
children's problem is one of adopting a new intention in situations where the response to a given kind of stimulus that is correct relative to the new intention would have been wrong relative to the previous intention. A central claim of the chapter is that the problem is a general, domain-unspecific limitation in 3-year-olds' practical reasoning abilities, namely, the inability to employ a higher-order rule to determine which of two rules to apply. In this sense, if the knowledge that should be guiding them is knowledge of the switching rule, they do not know what to do. Independently of whether or not the account works as a theory that supplements in some sense our common-sense theory (for critical discussion, see Hornsby), one important ingredient highlighted in their account is the intuitive connection between intentional action and knowledge. Hornsby is raising the question of whether our everyday notion of intentional action applies to the children's behaviour in these experiments, when certain kinds of connection with knowledge are not met. More specifically, underpinning our puzzlement about the children is an intuitive connection between (a) acting intentionally; (b) knowing what one ought to be doing in thus acting (grasping the rule); and (c) knowing whether or not one is doing what one should be doing. That is, we assume that if an action is intentional then, in normal circumstances, and when all is open to view, the subject should be able to tell whether or not the action she has performed complies with what she ought to be doing, given her intention. This connection with knowledge has, in fact, been implicit all along, from the first formulation of the paradox generated by failures of control. There is puzzle-raising competition between two reasons for action only if we assume that subjects are aware of how they ought to be acting, given both intentions, and aware of whether or not their actions conform to these intentions. 
(This intuitive connection can be brought out by considering one way of resolving the paradox in both cases, namely, driving the offending action underground. If we assume the second action is unconscious, the paradox dissolves.) If this is right then, whatever story we tell about the mechanisms of action control, it will have to take account, in some way, of this intuitive connection in our common-sense psychology. This is the issue to which we now turn.
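Frye and Zelazo's diagnosis of the card-sorting failure can be put in quasi-computational terms: success after the switch requires embedding the two sorting rules under a higher-order rule that first settles which game is in force. The sketch below is purely illustrative; the card representation and rule names are invented.

```python
# Illustrative sketch of the higher-order rule that, on Frye and Zelazo's
# account, 3-year-olds cannot yet formulate. Representations are invented.

def colour_rule(card):
    return card['colour']      # the colour game: sort by colour

def shape_rule(card):
    return card['shape']       # the shape game: sort by shape

def higher_order_rule(game, card):
    # First decide WHICH game is in force, then apply that game's rule.
    return colour_rule(card) if game == 'colour' else shape_rule(card)

card = {'colour': 'red', 'shape': 'circle'}

# Pre-switch the card is sorted by colour; post-switch the very same
# card calls for a response that was wrong under the previous rule.
pre_switch = higher_order_rule('colour', card)
post_switch = higher_order_rule('shape', card)
```

On this account, 3-year-olds can apply each first-order rule separately but cannot yet use the embedding rule to switch between them.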
3. EPISTEMOLOGICAL ISSUES

The problem of agents' knowledge was put on the philosophical agenda by Elizabeth Anscombe in her classic Intention (1957). According to Anscombe, it is a defining characteristic of intentional action that the agent has non-observational knowledge of what she is doing (not just of what she is intending, or trying, to do). Anscombe's account of the source of this knowledge is best approached by looking at the case of knowledge of future actions. Suppose you are planning a visit to the
supermarket and someone asks you what you are going to buy. The obvious way to answer that question is to disclose the content of your shopping-list-to make a prediction of your future behaviour on the basis of deciding, or having decided, what to do. Put differently, you respond to a theoretical question about the future by answering a specific practical, or deliberative, question. 2 This contrasts sharply with an answer based on theoretical evidence, as when you predict that you will probably end up with a box or two of Turkish delight because you always do. Anscombe's central claim is that non-observational knowledge of our own current actions is available for the same reason, and in the same way, as knowledge of our own future actions. Roughly, I know about my current (intentional) actions without any need for observation or other kinds of evidence because they are the upshot of my practical reasoning. We can think of Anscombe's account as an attempt to chart a middle course between what, in her view, are two unattractive extremes. Traditionally it has been held that we have knowledge of (certain aspects of) our own actions 'from within', or from the first-person perspective, on a basis that differs from the kinds of evidence we use in finding out about others' actions. One extreme position would be to reject the very idea of a first-person/third-person asymmetry in relation to knowledge of agency-to assimilate the agents' perspective to that of an outside observer. At the other extreme lies the traditional account of that asymmetry. On this view, knowledge 'from within' is to be equated with introspective knowledge, the only possible objects of which are thought to be 'inner' conscious states or events. Knowledge 'from within' of bodily actions turns out to be introspective knowledge of such things as acts of the will or tryings. (Anscombe ridicules this as an appeal to 'a very queer and special sort of seeing eye in the middle of acting' (p. 57).) 
The general idea behind her middle course is that agents' knowledge is sui generis in the following sense: Agents' Knowledge. The agent of an action is aware of what she is doing in virtue of controlling her action, rather than on the basis of observation or introspection. The relevant notion of control, for Anscombe, is that of rational control, where this is described as a matter of practical reasoning 'leading to action' (p. 60). The problem, then, is to understand how practical reasoning can constitute a source of knowledge of agency. On the face of it, there is a kind of category mistake here. Practical reasoning yields a grasp of practical reasons, an understanding of why it is desirable to perform a particular action. But what we need when we are trying to elucidate the source of agents' knowledge is an account of the epistemic, or theoretical, reasons on which such knowledge is based-reasons for believing a proposition to be true. Put crudely, what is not clear is why a decision to do
2 See Moran (2001) for an illuminating discussion of the relation between these kinds of questions.
something should constitute any kind of knowledge. Anscombe's radical move in response to this worry is to characterize agents' knowledge as 'practical' in the following sense. We tend to think of propositional knowledge 'as something that is judged as such by being in accordance with the facts. The facts, reality, are prior, and dictate what is to be said, if it is knowledge' (p. 57). Anscombe calls this 'speculative' knowledge. In contrast, 'practical knowledge' aims for the facts to fit with it. If a claim to practical knowledge turns out to be incorrect, this is because reality has refused to comply with the dictate of knowledge: 'the mistake is in the performance, not in the judgement' (p. 82).3 In effect, Anscombe maintains that intentions that have been correctly executed constitute knowledge of the intended action. As intentions are based on practical reasons, so is agents' knowledge. Many contributors to the present volume share Anscombe's view that there are important connections between action control and knowledge of actions 'from within', but it is common ground between them that the notion of practical knowledge provides little help in unpacking the connections. For one thing, Anscombe's account does not hold in full generality. Intention plus success does not in general equal knowledge of succeeding. You may intend to buy a nice melon, and succeed in doing so (there is no 'mistake in the performance'), while being, at best, hopeful of having succeeded. Of course, it is plausible that there are descriptions of your action under which you do know what you are doing, for example, 'I am buying "this" melon' (said while pointing at a particular melon) or 'I am trying to buy a nice melon'. What is theoretically unsatisfactory about Anscombe's account is that it does not tell us how it is that you know what you are doing under these latter descriptions, but not under the first description. A natural suggestion at this point is the following. 
What is missing from Anscombe's account is any appeal to the experience of agency. To determine under which descriptions agents have knowledge 'from within' of what they are doing, we need to give an account of the content of that experience. As the papers in this collection testify, there are numerous ways in which this general suggestion can be developed; and numerous substantive issues, both epistemological and phenomenological, which need to be resolved to make it good (some of which are discussed in the next section). Here we confine ourselves to sketching some fundamental theoretical choices to be made. 3 For an early manifesto of what Hintikka calls the tradition of 'maker's knowledge', compare the following passage from Maimonides: 'There is a great difference between the knowledge which the producer of a thing possesses concerning it, and the knowledge which other persons possess concerning the same thing. Suppose a thing is produced in accordance with the knowledge of the producer, the producer was then guided by his knowledge in the act of producing the thing. Other people, however, who examine this work and acquire a knowledge of the whole of it, depend for that knowledge on the work itself. For instance, an artisan makes a box in which weights move with the running of water, and thus indicate how many hours have passed ... His knowledge is not the result of observing the movements as they are actually going on; but, on the contrary, the movements are produced in accordance with his knowledge' (Guide for the Perplexed, part iii, ch. xxi, quoted in Hintikka, 1974: 84).
One question is what to make of Anscombe's project of charting a middle course between the two extremes. Suppose we adopt the general line that agents' knowledge is made available, in a sense to be explained, by the experience associated with agency. Then we have three options. We might take the view that doing proper justice to experience requires adopting the first extreme (abandoning the traditional idea of a first-person/third-person asymmetry relative to knowledge of bodily actions). Or again, we may favour the second extreme (equating knowledge of actions 'from within' with introspective knowledge of inner states or events). Finally, we might argue that getting right the role of experience will enable us to find the middle course Anscombe was contemplating. The first two options are defended in the chapters by Joelle Proust and Brian O'Shaughnessy, respectively. Proust takes issue with the orthodox view that knowledge of one's own intentional action is non-observational. At least one reason for the attraction of that view, she argues, is a traditional, but misconceived, account of the third-person case. On this account, observation merely furnishes knowledge of bodily movements. To establish that a movement constitutes an intentional action we need to rely on an inference, invoking a particular intention as the best explanation of the observed movement. In opposition to this, Proust contends that intentions can be directly manifest in the 'dynamic pattern' of observed goal-directed movements. A second corrective to the orthodox view is her focus on unpremeditated, spontaneous actions-actions that have an intention-in-action but no 'prior intention', in Searle's sense (Searle, 1983). Proust claims that, at least in these cases, we acquire knowledge of the intentional content of our own actions by observing our movements. 
This leads her to deny any difference in principle, relative to such cases, between the bases of knowledge of one's own and knowledge of others' actions, though with the following important qualification. Proust draws a distinction between the source of knowledge of the intentional content of an action (what is being done?) and the source of knowledge of ownership of an action (who is doing it?). Appeal to observation is intended only to address the first issue. Knowledge of ownership, in her view, arises from a conscious 'sense of effort'. (We will return to this in Section 5.) The other (by Anscombe's lights) extreme position is occupied by Brian O'Shaughnessy. O'Shaughnessy argues that (a) where there is physical action, there is a mental event of trying, or willing, which is identical with the action, and (b) events of the type willing are conscious experiences-part of the content of the 'stream of consciousness'. In his view, we have immediate experiential knowledge only of tryings; put differently, we are introspectively aware of our actions only under descriptions of the form 'I am trying to ... '. Knowledge of successful bodily actions, even actions as humble as moving one's arm, is always based on inference. The route by which O'Shaughnessy reaches these conclusions might be called the argument from total failure (a close relative of the argument from
illusion). Consider the case of total failure, where, due to sudden paralysis, an attempt at bodily action does not even involve a bodily movement. In this situation, O'Shaughnessy reasons, the subject is still aware of trying to act (not just of the intention of doing so). He concludes that a phenomenologically salient mental event of trying also occurs in the case of attempts that are crowned with a measure of success; here the trying constitutes the 'inner', introspectively accessible aspect of bodily action. Suppose, finally, that we are sympathetic to Anscombe's idea of a middle ground, while remaining firmly committed to the view that agents' knowledge is grounded on the experience of acting. Then our project might be described as that of transcending the divide between the inner and the outer: we are seeking an account on which the perspective of conscious agency yields direct knowledge of bodily actions. However, the basic issue between Proust and O'Shaughnessy (does agents' knowledge arise from an 'outer', sensory awareness or from an 'inner', introspectively accessible experience?) remains relevant. For there are two strategies we may adopt in pursuing the middle course. Roughly, one is to bring the model of introspective knowledge to bear on bodily actions; the other is to argue that agency involves a distinctive kind of perceptual experience, a kind of experience that differs from that of an outside observer and constitutes a first-person perspective on bodily actions. We find versions of the first strategy in the contributions by Christopher Peacocke and Lucy O'Brien (Chapters 3 and 17). Peacocke defends the view that awareness of trying is a sound basis for non-inferential knowledge of successful bodily action. His argument turns on the causal role of trying. Not only is it the case that trying, say, to move one's hand normally causes one's hand to move. 
Rather, Peacocke suggests, it is partly constitutive of trying to move one's hand that it is an event of a kind which normally generates the appropriate movement. Peacocke concludes that an agent who is aware of trying to move her hand is entitled to judge that she is in fact moving her hand. (One aspect of the disagreement between Peacocke and O'Shaughnessy concerns the relation between trying and moving: O'Shaughnessy rejects the idea that tryings cause movements; in his view, successful tryings are identical with bodily actions, and the latter 'incorporate' (rather than cause) bodily movements.) Lucy O'Brien's account makes no appeal to tryings at all. Part of her project is to make a case for the view that bodily actions are 'as primitive a psychological phenomenon as beliefs and perceptions'. In other words, she rejects the assumption that actions can be analysed into a mental and a physical element (of which only the former would qualify as a possible object of introspective awareness). In her view, getting right the epistemology of action requires getting right what it is for bodily actions (rather than mere tryings) to be conscious. The central notion of her account is that of actions occupying attention. Briefly, her suggestion is that conscious actions-those that occupy attention-are actions performed on the basis of an assessment of possible options, hence with
a 'sense of control'. Such actions are immediately accessible to the subject in the sense that engaging in the action provides a reason for the belief that one is doing so. The second, perceptual strategy is explored in Chapters 15 (by Jerome Dokic) and 18 (by Johannes Roessler). Dokic's central claim is that experiential knowledge of agency arises from proprioceptive bodily experience. In acting intentionally, he argues, we are proprioceptively presented with our movements as controlled by ourselves. This is a kind of knowledge respectively). In response to this stimulus one of two response keys in front of the subject has to be operated, either the left
Experimental Approaches to Action
169
one or the right one (Rl and Rr, respectively). A set-up like this allows for two tasks, differing in how responses are assigned to stimuli:
compatible:   Sl → Rl   Sr → Rr
incompatible: Sl → Rr   Sr → Rl
When the assignment is spatially compatible, stimuli and responses share a common spatial feature (both left or both right), whereas they will always exhibit two different spatial features in the incompatible assignment (right and left or left and right). As has been shown in a large number of experimental studies, response performance for compatible assignments is clearly superior to incompatible assignments, and this holds for both response times and error rates (e.g. Fitts and Biederman, 1965). It is not easy to account for an effect like this in terms of practice and contiguity. Rather, the effect suggests that responses can become prespecified by stimuli on the basis of shared features. This is not far from claiming that perception induces action by virtue of similarity. What exactly could these shared features refer to and what precisely could be the basis of the correspondence between stimuli and responses that generates the effect? This has been studied for the Simon task, a close relative of the standard compatibility task (Simon, 1969; Simon and Rudell, 1967). In this task, two response keys are assigned to the two hands, and two possible stimuli can occur, for example, a high-pitched and a low-pitched tone (H and L, respectively). The task requires the participant to press the left of the two keys in response to one stimulus (say, L) and the right-hand key in response to the other (say, H). Stimulus identity (H vs. L) is therefore the relevant stimulus dimension. Yet, in addition to this, the position of the stimulus is varied as a further dimension (left vs. right of a fixation mark). This dimension is irrelevant in the sense that the participant is instructed to focus on stimulus identity and completely ignore stimulus position. Hence, the four resulting pairings of stimulus identity and position can be classified according to correspondence between stimulus and response positions. 
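The four pairings and their classification can be made concrete with a short sketch, assuming the assignment described in the text (L to the left key, H to the right key); the trial representation is invented for illustration.

```python
# Classifying Simon-task trials by stimulus-response correspondence.
# The assignment (L -> left key, H -> right key) follows the example in
# the text; the trial representation is invented.

RESPONSE_KEY = {'L': 'left', 'H': 'right'}   # relevant dimension: identity

def classify_trial(identity, position):
    # Stimulus position is task-irrelevant, but on 'corresponding'
    # trials it matches the position of the required response.
    response = RESPONSE_KEY[identity]
    return 'corresponding' if response == position else 'non-corresponding'

trials = [('L', 'left'), ('L', 'right'), ('H', 'left'), ('H', 'right')]
labels = [classify_trial(i, p) for i, p in trials]
```

The Simon effect is then the finding that mean reaction time on the 'corresponding' trials is reliably shorter than on the 'non-corresponding' ones.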
Positions do correspond when stimulus L is presented on the left-hand side or stimulus H on the right-hand side. They do not correspond in the other two cases. In this task, too, reaction times are short in the case of correspondence and long in the case of non-correspondence. This is the Simon effect, and again one can think of it in terms of shared features in perception and action: responses are fast when stimuli and responses share the same spatial position but slow when the positions are different. Yet, there are several levels at which the correspondence could be effective. First, when, for example, the left-hand key is pressed in response to a left-hand stimulus the motor command for the hand movement is generated in the same brain hemisphere in which the sensory code for the stimulus has been generated
170
Wolfgang Prinz
immediately before. Also, the response is generated by the left hand, that is, by an effector attached to the body hemisphere that corresponds to the stimulus position. This may be summarized under the notion of anatomical correspondence. Second, correspondence may be claimed between the location at which the stimulus is presented and the location at which the action is performed. This may be called locational correspondence. In the standard Simon task, locational correspondence is always confounded with anatomical correspondence. There is, however, an easy way to unconfound them: have subjects perform the task with their hands crossed (Simon et al., 1970; Wallace, 1971). If anatomical correspondence is the critical factor, the effect should be completely reversed in this condition. In contrast, it should be unaltered if locational correspondence counts. The results of the experiments are clearly in favour of locational correspondence, suggesting that the functional origin of action induction resides at a high level of representation at which environmental events and actions (= stimuli and responses) are localized in extracorporal space. One may even go one step further and distinguish between the location of the action itself and the location of the action's goal, provided one can separate the two. In the standard paradigm, where the action is a simple key press, there is no way to separate them, because the action (= pressing down the key) and the goal (= the key being pressed down) share the same location. What happens when one dissociates actions and goals? In this case, in addition to locational correspondence, intentional correspondence might be effective as well: correspondence between stimulus location and goal location, irrespective of response location. Hommel (1993) ran an experiment to test this view. In his experiment, the hands were left uncrossed.
What was crossed instead were the links between the two response keys and two additional lights that were triggered by operating the keys. As a result, when the left-hand response key was pressed, a feedback light on the right-hand side would go on, whereas a feedback light on the left-hand side would go on when the right-hand response key was pressed. The task was a Simon task, with stimulus identity (pitch) as the relevant and stimulus location as the irrelevant dimension (high- vs. low-pitched tones, coming from loudspeakers mounted on the left- vs. right-hand side). There were two groups of subjects. In the control group, participants were instructed to press the left key in response to low-pitched tones and the right key in response to high-pitched tones, and to ignore the lights altogether. In the experimental group the same task was administered under a different instruction. This time participants were instructed to switch on, as fast as possible, the light on the right-hand side in response to low-pitched tones and the light on the left-hand side in response to high-pitched tones. The purpose of the design was to implement two different intentional sets: one in the control group that refers to a goal residing in the movement itself (= key press) and one in the experimental group that refers to a goal beyond the movement (= light onset). In the control group, where the lights served no particular function,
Experimental Approaches to Action
171
the standard Simon effect was obtained. However, in the experimental group, where the lights served the function of action goals, the location of the movements was virtually irrelevant. Instead, the critical factor was the correspondence between stimulus location and goal location: responses were fast when the stimulus and the target light shared the same location (despite the fact that the response was always located on the opposite side). Conversely, responses were slow when the stimulus and the target appeared at different locations (despite the fact that the location of the response corresponded to that of the stimulus). Two major conclusions seem to emerge from this research. First, action induction seems to be based on representations which code stimuli and responses as environmental events in extracorporal space. Second, action induction seems to depend on shared features between stimuli and goals more than on shared features between stimuli and movements.

1.3 Common Coding

In summary, then, we see the classical sensorimotor framework faced with two major challenges: to provide functional roles for similarity and for goals. In order to meet these challenges, we need to come up with a novel framework that blends elements from the sensorimotor and the ideomotor stance. The framework of common coding is meant to offer solutions to both of these challenges (cf. Hommel et al., 2001; Prinz, 1984, 1987, 1990, 1997a, b).

Similarity: Commensurate Coding
How can a functional account be given of the fact that certain patterns of stimulation may, under certain conditions, induce certain patterns of action by virtue of similarity? Basically, there can be no role for similarity without commensurate coding. The notion of common coding suggests that, somewhere in the chain of operations that lead from perception to action, the system generates certain derivatives of stimulation and certain antecedents of action that are commensurate in the sense that they share the same system of representational dimensions. Such shared dimensionality is a prerequisite for any form of similarity-based processing. Commensurate coding is a strange notion for the sensorimotor framework for perception and action. This framework relies entirely on separate coding. Stimuli and responses are, by definition, distinct and incommensurate, and this applies to their internal representations as well: stimulus codes stand for patterns of stimulation arising in receptor systems, and movement codes stand for patterns of excitation travelling to effector systems, and there is no way these codes could be compared with each other, or matched. What is required, instead, is a mapping device that translates stimulus codes into movement codes. The metaphor of translation, which is in fact abundantly used in the reaction time literature (Welford, 1968, 1980; Sanders, 1998; Massaro, 1990), stresses the incommensurability between stimulus codes and movement codes: there is one
representational language for stimulus information, organized in terms of sensory dimensions, and there is another language for movement information, organized in terms of motor dimensions. The operation mediating between the two translates from one language into the other: it bridges the gap between perception and action by creating links between incommensurate entities. The claim I want to defend is not that the sensorimotor framework is mistaken. Instead, I maintain the weaker claim that it is incomplete. Quite obviously, there must be routes from perception to action where incommensurate codes are translated across separate representational domains. Yet, to account for action induction, there must also be routes where stimuli and responses meet in a common representational domain, represented through commensurate codes and speaking the same language, so that they can be matched to each other without translation. In this domain, action codes can be the same as (or similar to) perceptual codes with respect to some of the representational dimensions and different or dissimilar with respect to others. Therefore, perception and action codes will overlap to various degrees and, as a consequence, they may induce each other by virtue of their overlap, or similarity. What codes could be functional in the supposed common domain? A number of options can be considered, varying on two dimensions (cf. Prinz, 1984, 1990, 1992). One dimension refers to the functional locus at which perception and action meet. The other refers to the nature of the information contained in the supposed commensurate codes. As to the first dimension, the functional locus at which perception and action meet may be placed either on the perceptual or on the action side. On the one hand, one can think of the meeting place as being located in a sensory or perceptual coding domain where stimuli and responses are both represented in terms of stimulus-related coding dimensions.
This view posits a perceptual basis for action representation. For instance, one may assume that action and perception can talk to each other directly because actions are represented through their sensory or perceptual consequences. On the other hand, one can also think of the meeting place as being located in a motor- or action-related domain of coding where stimuli and responses are both represented in terms of response-related coding dimensions. This view posits an action basis for perceptual representation. For instance, one may assume that perception and action can talk to each other directly because perceptual events are represented through the movements or actions that lead to them.5

5 For instance, according to motor theories of speech and motion perception, the perception of speech and motion is based on mandatory entries in motor modules for the generation of speech gestures and body movements, respectively (cf. e.g. Liberman et al., 1967; Viviani et al., 1997). Note that the notion of a perceptual basis for action representation is much broader in scope than the notion of an action basis for perceptual representation. An action basis for perceptual representation can only be claimed for the limited range of perceptual events that can be directly generated through corresponding action (as in speech, movements, etc.). Conversely, since all action goes along with perceptual consequences, a perceptual basis for action representation can be claimed for any action.
The principle of common coding, as I conceive it, adopts the first of these two views. It implies the notion that actions are represented at the common meeting place in terms of their sensory or perceptual consequences. This has important implications for both learning (i.e. the formation of linkages between actions and their consequences) and performance (i.e. the control of goal-directed action). As to the second dimension, the information contained in the supposed commensurate codes can be represented at various levels of coding. At the one extreme, one can think of low-level codes of elementary sensory features of stimuli and responses. At the other extreme, one can think of high-level codes of complex semantic features of environmental events and actions. Since we must assume that coding at all of these levels is involved in all kinds of sensory and motor activity, the question of interest is not what codes and what levels of coding exist, but rather at which of these levels the meeting place for commensurate codes is established. Empirical evidence and theoretical considerations both argue for high-level representation and abstract semantic codes. As was indicated above, evidence from the Simon task suggests that action induction relies on correspondence between events and actions in distal, extracorporal space and not on correspondence between stimulus and response patterns defined in coordinates of proximal body anatomy. This suggests a strong role for abstract semantic codes. Theoretical considerations argue for abstract coding and distal reference, too. In a way, abstraction must be considered a prerequisite for creating commensurability among otherwise incommensurate entities. For instance, the sensory representation of an action that a person performs him or herself will be entirely incommensurate with a perceptual representation of the same action he or she observes in another person. 
Since one representation is in terms of kinesthetic features and the other in terms of visual features, there is no obvious way to match them. Yet, at a more abstract level of representation, the same two events may be commensurate, for instance, with respect to their kinematic structure (spatio-temporal pattern) or their semantic content (meaning or goal). It is only at this level that one could induce the other by virtue of similarity.6 Thus, we can now narrow down the functional locus of the meeting place for commensurate codes for perception and action as follows. It needs to be a representational domain where stimulus-related and response-related information are both coded as environmental events. The difference between the two is that events are actor-independent whereas actions are actor-dependent. Otherwise, they are completely commensurate. They share a common space for semantic, or conceptual, representation and a common reference system for spatial and temporal localization. As a result, they are represented as two interrelated components of a single, coherent stream of meaningful happenings.

6 One could argue that codes at low and high levels of representation differ from each other in two independent respects, both referring to the information contained in them, namely, degree of abstraction and reference. Degree of abstraction concerns the difference between concrete (sensory) and abstract (semantic) features. Reference concerns the difference between proximal and distal representation. The two factors may be logically independent. Yet, in functional terms, they are correlated in the sense that abstract codes with distal reference offer optimal conditions for commensurate coding.
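The core of commensurate coding can be caricatured in a few lines. This is a toy sketch, not a model: all feature labels are invented, and representing codes as plain feature sets is an assumption made for illustration. The point is only that, once event and action codes are expressed over one shared dimension system, their similarity can be computed directly, without translation between separate stimulus and response "languages".

```python
# Toy sketch of commensurate coding (all feature labels invented): event
# codes and action codes are sets of features drawn from one shared system
# of representational dimensions, so their overlap, i.e. their similarity,
# can be read off directly rather than established by translation.

def shared_features(code_a: set, code_b: set) -> set:
    """Features common to two codes in the common representational domain."""
    return code_a & code_b

event_code  = {"location:right", "identity:tone-H", "time:now"}
action_code = {"location:right", "effector:right-hand", "goal:key-down"}

print(shared_features(event_code, action_code))  # the shared spatial feature
```

The two codes overlap on the spatial dimension only, which is exactly the kind of partial overlap that the Simon findings above exploit.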
Goals: Action Codes

Commensurate coding of events and actions notwithstanding, our approach is still sensorimotor in its basic functional logic as long as we do not provide a functional role for goals in action control. Therefore, we need to add to it an ideomotor component, yielding an extended framework with functional roles for both external and internal causes of action. Goals may vary along a number of dimensions. For instance, they can be concrete (like, e.g., catching a rolling ball with a particular grip) or abstract (like, e.g., winning a game of chess). Further, they can refer to resident versus remote effects of action. For example, one can raise one's arm just in order to perform this particular gesture (for instance, in dancing), or in order to open a window, or in order to say hello to somebody else. In these three cases, the same movement is used to achieve three entirely different goals. In the first case, the goal is resident in the movement itself, whereas it is remote in the other two cases. Remote goals can in many cases be realized through a number of different movements. For instance, one can open a window either by raising one's arm or just by asking somebody else in the room to open the window. Since the same movement can realize several goals and the same goal can be realized through several movements, it is reasonable to assume that goals and movements are represented independently of each other. This leads us to propose a distinction between two basic constituents of action codes, namely, goal codes and movement codes. Further, we propose that action codes subserve two basic functions, evaluation and control. Outcome evaluation is readily explained if one assumes that movement code activation tends to go along with goal code activation and that goal codes serve as a reference for the evaluation of the outcome actually observed. More delicate is the issue of control, that is, how goal codes can play a causal role in movement control.
As stated above, this requires the reversal of the links established between movements and goals, that is, working back from goals to movements suited to realize them. The standard solution to this problem has been to postulate anticipatory goal codes and assume that they are furnished with the power to control the execution of movements suited to realizing these goals. In the literature, there have been various attempts to solve this seeming puzzle, like Ach's 'determining tendencies' (Ach, 1905), Hull's 'fractional antedating goal response' (Hull, 1943), or Greenwald's (1970) 'ideomotor mechanism'. These mechanisms share the notion that actions are controlled by anticipatory goal codes which have
Experimental Approaches to Action
175
emerged from previous learning of relationships between movements and their outcomes. Without going into the details of the mechanisms proposed in the literature, the nature of this learning may be illustrated by a simple thought experiment (cf. Prinz, 1997b). Suppose that whenever a certain movement is performed, the performing system is capable of keeping track of the movement's resident and remote effects up to a certain level of remoteness. When the same movement is repeated several times, some of these repetitions will yield similar effects, others will yield different effects. Close, or resident, effects will tend to be more similar to each other across repetitions than remote effects. As a result, the total pattern of movement effects will take the shape of a divergent fan. That fan specifies the variety of effects that have been observed to follow from performing this particular movement, including relatively constant close effects and more variable remote effects. Suppose, furthermore, that divergent fans are available for a large variety of movements, each of them representing possible effects of the respective movement. Naturally, these fans will overlap to a considerable extent. This is because a large number of events can be effectuated by a variety of movements. This is particularly true of more remote effects. For example, as mentioned above, the opening of a window can be effectuated through a large variety of movements, including, for example, the raising of one's arm in a particular way, but also, for example, through asking somebody else in the room to open the window. As a result, one can account for the pattern of linkages between movements and their effects in two ways. One way is to conceive of it as an assembly of strongly overlapping divergent fans, individuated on the basis of the movements and specifying the variety of possible effects.
The other way is to conceive of it as an assembly of convergent fans, individuated on the basis of potential action effects and specifying the variety of possible movements that may lead to these effects. These action fans emerge and develop irrespective of how the movements come about. The sole requirement is a powerful network, capable of learning contingencies between movements and effects. Our framework proposes that action codes are built upon knowledge about relationships between movements and effects, as stored in convergent fans. The critical requirement to add is that the linkages built into action fans can be used either way, that is, not only in accordance with temporal order but in the backward direction as well. If such backward computation is warranted, goal codes are furnished with the power to activate their movement codes, that is, to elicit movements through which the goals can be attained.7

Since action codes are made up of two basic constituents, action induction can take two different forms, namely, goal induction and movement induction. As indicated before, there is often no way to distinguish them. This is because most of the evidence relies on examples where the goal is resident in the movement, so that the theoretical distinction cannot be mapped to a corresponding difference in empirical observations (except for Hommel's dissociation experiment, which strongly supports goal induction). At this point, we may leave it as an empirical question to which extent action induction relies on goal or on movement induction. The sole constraint derived from theoretical considerations is that, as a rule, environmental events cannot directly induce the execution of movements unless corresponding goals are activated. This assumption is required in recognition of the trivial fact that, at least in humans, action does not come about as a mandatory consequence of perception. Even induced action does not follow from perception without intention, that is, without goal codes switched in.

Matching Codes

The theory of common coding suggests that action induction relies on matching.8 I use the term matching to refer to computational operations suited to link codes to each other within the boundaries of a common representational domain. This view implies that any two codes that belong to the same representational system can get matched to each other, irrespective of their contents. Still, the fact that the matching is always performed within the boundaries of a closed representational system has important functional implications for both performance and learning.

7 The structure of action codes may be functional not only in action control but in ordinary perception as well. Action codes may account for the enactive aspects of perceptual functioning as claimed by motor theories of perception (cf. n. 5). As goal codes refer to certain environmental events (those that can occur as effects of one's own movements), they will not only be activated when these movements are performed but also when pertinent events are perceived independent of one's own movements. The perception of such events will then imply two activations: first, activation of corresponding event codes, and second, by virtue of the links inherent in action codes, activation of codes for movements suited to effectuate these events. Interestingly, this notion can also be phrased in an entirely different theoretical jargon. Ecological approaches claim that there can be no perception of events that does not at the same time specify (more or less explicitly) certain movements afforded by these events. Action codes could thus be used for building functional architectures for motor theories as well as Gibsonian (affordance-based) theories of perception.

8 Note that common coding theory does not suggest that matching replaces mapping throughout. Quite the contrary, mapping operations must be ubiquitous in the system. First, as indicated above, there is no reason to believe that the supposed meeting place for commensurate codes is the sole route that leads from perception to action (and vice versa). There must be a number of other routes as well, most of them presumably located at lower levels of coding, where mapping is the only way to link codes to each other.
Second, we must not forget that, in order to allow for matching in one place, the system needs to invest in additional mappings in a number of other places. For instance, the claim that events and actions can only be commensurate when movements are represented through their effects requires efficient mapping relationships between movements and effects, and vice versa. In the same vein, the claim that common coding applies to abstract codes implies complex transformations from low- to high-level coding, which all require the mapping of codes across differently structured and, hence, incommensurate domains.
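The thought experiment of the preceding subsection, divergent fans read forward and convergent fans read backward, might be sketched as follows. All entries are invented for illustration; a realistic account would of course involve graded, learned contingencies rather than a fixed table.

```python
# Sketch of the fan structure (entries invented): a divergent fan maps each
# movement onto the effects observed to follow it; inverting those same
# linkages yields convergent fans, which support backward computation from
# an intended effect (a goal) to movements suited to attain it.

from collections import defaultdict

divergent = {  # movement -> observed resident and remote effects
    "raise-arm":  {"arm-raised", "window-open", "greeting-given"},
    "ask-helper": {"request-uttered", "window-open"},
}

convergent = defaultdict(set)  # effect -> movements that can produce it
for movement, effects in divergent.items():
    for effect in effects:
        convergent[effect].add(movement)

# Backward computation: an activated goal code retrieves suitable movements.
print(sorted(convergent["window-open"]))
```

The remote effect "window-open" converges on two different movements, whereas the resident effect "arm-raised" belongs to one movement only, mirroring the claim that remote effects are shared across fans far more than resident ones.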
First, codes that overlap will prime each other. For instance, when a red stimulus which requires a right-hand response is presented on the right-hand side of the stimulus screen, the codes representing the event on the screen and the required action in response to it will overlap with respect to the feature of spatial location (right-hand side). In a case like this, the event code will partially pre-activate, or prime, the action code, with the strength of priming depending on the degree of overlap. This priming accounts for action induction, with obvious implications for performance. Second, when the codes do not overlap, the matching operation will still help to specify their mutual relationships within the common representational domain. This specification is derived from the functional architecture of the common representational system. On the basis of its inherent representational dimensionality, this architecture allows us to compute how far the two codes are away from each other (functional distance) and on which particular dimensions they differ (functional difference). This has important implications for learning and automatization.9 Consider first the implications for performance. The notion of overlap-based priming implies that, upon activation of a particular event code, action codes may get primed or partially primed, depending on code overlap. This priming, since it is due to the fact that the codes share strictly identical components in the same representational domain, will be mandatory and automatic, and the system has no way to suppress or escape from it. Obviously, such mandatory action priming can either support or act against the required mapping, depending on the structure of the task.
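Overlap-based priming admits a very simple sketch. The feature labels and the raw-count measure of priming are invented for illustration; the only claim carried over from the text is that an activated event code pre-activates each action code in proportion to the features the two share.

```python
# Sketch of overlap-based priming (labels and the count measure invented):
# the event code primes each candidate action code to the degree that the
# two codes share features in the common representational domain.

def priming_strength(event_code: set, action_code: set) -> int:
    return len(event_code & action_code)

event_code = {"colour:red", "location:right"}    # red light, right-hand side
action_codes = {
    "press-right": {"location:right", "effector:right-hand"},
    "press-left":  {"location:left", "effector:left-hand"},
}

for name, code in action_codes.items():
    print(name, priming_strength(event_code, code))
```

Here the right-hand response receives priming from the shared spatial feature while its competitor receives none, so the priming supports the instructed mapping; a red light on the left would do the reverse.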
Consider, for the sake of illustration, three different trials from the above-mentioned task where the onset of a coloured light is the imperative stimulus for pressing one of two keys: a left-hand key in the case of a green light and a right-hand key in the case of a red light. In the first trial, a red light appears at the centre of the screen. This trial is neutral in the sense that no obvious overlap and, hence, no priming between the stimulus code and the response code can be claimed. In the second trial, a red light appears on the right-hand side. In this case, where stimulus and response share the same (relative) location, the stimulus code will partially prime the response code for the correct response, thereby supporting the mapping required by the instruction. Conversely, in the third trial, where the red light is flashed on the left-hand side of the screen, it will prime the action code for the incorrect response, thereby acting against the required mapping by creating conflict
9 Neither of these implications applies to the mapping of codes (i.e. linking codes across representational domains). First, if there is no common representational system shared by the codes to be linked, there can be no overlap and, hence, no priming. Second, mapping across domains has no way to compute any meaningful relationships between the codes involved and is therefore blind to their functional difference and distance.
between the response required by the instructions (right-hand) and the response induced by the stimulus (left-hand). As a result, in any task, action induction will emerge to the extent that a given stimulus primes certain responses that compete within the task.10 The neutral status of the first trial derives from the fact that no such priming occurs at all. The difference between the second and the third trial is that the priming pertains either to the required response or to one of its competitors, with beneficial and detrimental effects on performance, respectively. Consider now the implications for learning. On the one hand, event codes and action codes are built on the basis of sensory codes representing physical characteristics of stimulus events and action effects. On the other hand, they represent objects and events with distal reference and in terms of their semantic properties. As a result, the information contained in these codes goes far beyond the sensory basis it is built on. In fact, we must assume that the system learns to represent (physical) events like stimuli and responses through appropriate (semantic) event codes, both actor-dependent and actor-independent. This view implies that the way in which stimuli and responses are represented in this domain is not fixed at all. Instead, it is subject to learning, that is, to operations resulting in structural alterations of the event codes involved. For instance, it is reasonable to assume that, in the course of practising a specific task, the structure of the codes involved gets altered so as to optimize the conditions for matching, that is, for both exploiting the benefits and avoiding the detrimental effects of priming as much as possible.
This goal can be achieved by two means: by creating new overlap (where priming helps) or by erasing old overlap (where priming hurts).11 Task performance will then become automatic to the degree that the required mappings become supported by priming and, hence, take the form of matches. There will certainly be limitations to the structural alterations that can be achieved through learning. For example, the fact that the Stroop effect does not go away after extensive specific practice seems to suggest that code components that have been established in life-long learning cannot be ignored, or deleted, for the sake of a particular task. Whether the same kind of limitation holds for the creation of new overlap remains to be explored.

10 A different picture may emerge with code overlap across tasks, for instance, in a situation where two tasks (S1 → R1; S2 → R2) are arranged such that the stimulus for the second task (S2) is presented at the time when the response for the first task (R1) is being selected and prepared. In a situation like this, where two independent tasks address the same codes at the same time, there should be a way to protect and isolate these activations from each other. Therefore, we need to distinguish between two basic cases: we may expect to see induction when stimulus codes overlap with response codes within tasks. However, when the same overlap arises across tasks, we may expect to see interference arising from that overlap (cf. e.g. Müsseler, 1999; Müsseler and Hommel, 1997; Stoet and Hommel, 1999; Schubö et al., 2001).

11 New overlap may be generated in various ways. For instance, one of the two codes may expand to overlap with (part of) the other, or both may expand and create a new zone of overlap. Conversely, overlap may become deleted by erasing overlapping features from one of the two codes or from both. There is a noteworthy parallel between the distinction between creation and deletion of overlap and the distinction between enrichment and differentiation in perceptual learning (cf. Gibson and Gibson, 1955; Postman, 1955). However, unlike Gibson and Postman, who each defended one of them, we hold that both can be operating at the same time: creating code overlap requires associative enrichment, whereas selective differentiation is required for deleting code overlap.
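The two learning operations just discussed, creating overlap where priming helps and erasing it where it hurts, can be caricatured in a few lines. The codes and features below are invented; the sketch shows only how such restructuring would change the pattern of priming in favour of the instructed response.

```python
# Caricature (codes invented) of structural learning on codes: create new
# overlap between stimulus code and the correct action code, and erase the
# old overlap that primes a competing response.

stimulus_code = {"colour:red", "location:left"}
correct_code  = {"location:right"}   # the instructed response's code
competitor    = {"location:left"}    # the response primed by mistake

# Create overlap: the correct action code acquires a stimulus feature.
correct_code.add("colour:red")
# Erase overlap: the feature priming the competitor is deleted.
stimulus_code.discard("location:left")

print(len(stimulus_code & correct_code), len(stimulus_code & competitor))
```

After both operations the stimulus primes only the instructed response, which is one way to picture how practice could turn an effortful mapping into an automatic match.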
2. EXPERIMENTAL EVIDENCE

In the following, I will go through some recent experiments from our lab in order to indicate ways of studying action induction, that is, similarity-based relationships in the interaction between stimuli, goals, and actions, as well as the putative operations on the underlying codes.

2.1 Instructed Reactions

In most experimental tasks, actions come about as re-actions to certain stimuli, and they do so by virtue of assignment rules as fixed in the experimental instructions. For instance, in a typical response selection task, participants are instructed to respond, for example, with one of two key presses, R1 and R2, in response to one of two stimuli, S1 and S2, respectively. Once the task set has been implemented this way, individual stimuli, S1 or S2, are then presented in an unpredictable order and participants are required to deliver the appropriate responses as fast as possible. The task thus requires the selection of responses to given stimuli on the basis of a prespecified rule.
Response Selection

In one of our tasks we wanted to study the relative contributions of symbolic and iconic response specification in a response selection paradigm (Brass, 1999; Brass et al., 2000). In this task, participants were always required to select between two responses, namely to lift, as fast as possible, either the index finger or the middle finger of their right hands in response to the stimulus presented. In other words, the choice was between two fingers that could be used to perform the same manual gesture, that is, lifting. The stimulus was always provided by a hand on the screen which was the mirror image of the participant's right hand. We used two different instructions. One was iconic, or imitative, requiring the participant to lift the same finger as was being lifted by the hand on the display. The other instruction was symbolic, requiring the participant to lift the same finger as was marked by a cross on the display. With each of these two instructions, three different classes of stimuli could be presented: baseline, congruent, and incongruent. For baseline stimuli, only one of the two features was shown (under iconic instructions one finger was lifted and no cross was shown; under symbolic instructions one finger was
180
Wolfgang Prinz
marked by a cross, and no lifting was shown). For congruent stimuli, the same finger that was lifted was also marked by a cross, whereas for incongruent stimuli one finger was lifted and the other one was marked by a cross. The major results from this experiment can be summarized as follows. First, when the finger to be lifted was cued by a stimulus finger performing the same gesture (= baseline/iconic), response times were much shorter than when it was cued by a stationary stimulus finger marked by a cross (= baseline/symbolic). This seems to reflect a difference in the degree to which the selection of a given response can be supported by iconic versus symbolic information, suggesting a pronounced advantage of iconic matching over symbolic mapping. Second, there was strong iconic interference with symbolic instructions, and it was observed in both directions: iconic congruency helped and iconic incongruency hurt (relative to baseline). Third, there was also an (albeit weaker) symbolic interference effect with iconic instructions, this time only in the sense that symbolic incongruency hurt (relative to baseline). In sum, the results suggest that iconic cueing of response gestures is much more powerful than symbolic cueing. In a further experiment we decided to weaken the iconic similarity between stimulus and response gestures, with everything else completely unchanged. In this experiment the same two instructions and the same three types of stimuli were combined, but a different response gesture was used throughout: instead of an upward lift, the task this time required a downward tap of the finger indicated by the stimulus. Under these conditions, the baseline difference virtually disappeared. Under symbolic instructions iconic incongruency was still effective but under iconic instructions only a slight effect of symbolic incongruency was preserved. 
In sum, though the strong advantage of iconic over symbolic response specification had now gone, a substantial impact of iconic incongruency was still preserved. Clearly, this finding suggests that weakening gesture similarity also weakens the impact of iconic response specification-without, however, deleting it completely.
Response Initiation

In another set of experiments, we addressed the extent to which iconic cueing can be effective even under conditions of full response certainty, that is, when the response to be generated is kept constant over a number of trials (Brass, 1999; Brass et al., 2001). The issue of whether or not compatibility effects are obtained in simple reaction tasks is controversial in the literature. If any such effects are observed they tend to be weak and not very robust. This has often been taken to support the claim that stimulus-response compatibility effects arise in response selection-an operation that, by definition, is involved in response selection tasks but not in simple initiation tasks (see Hommel, 1997, for a discussion and overview). Therefore, if one could show that substantial compatibility effects arise
in a task involving no choices and, hence, no response selection at all, this would challenge the notion that this particular operation is the functional locus where compatibility effects emerge. In the experiments participants were presented with a randomized sequence of two different stimulus gestures, namely, an index finger that would move either up or down at an unpredictable point within a time window of a few seconds. Participants had to respond with one of the same two gestures with their index finger. This time, however, response gestures were always kept constant within blocks, requiring the same, pre-known response gesture over and over again (say, moving up), irrespective of the stimulus gestures shown (moving up or down). Therefore, since stimulus gestures and stimulus onset times were randomized within blocks, the task exhibited (1) some degree of stimulus uncertainty, (2) some degree of temporal uncertainty, but at the same time (3) full response certainty. Over a number of experiments we observed huge compatibility effects for both response gestures (somewhat more pronounced for downward than for upward movements): response gestures were much faster when prompted by corresponding stimulus gestures as compared to non-corresponding gestures. Further, in one of our experiments we made an attempt at separating two factors that were confounded in the first experiment, namely, direction compatibility (upward vs. downward movement of the finger) and movement compatibility (flexion vs. extension of the finger). In this experiment, participants ran through four blocks. Two blocks were an exact replication of the previous experiment. For the two remaining blocks, we turned the display upside down and mirrored the resulting image. Under these conditions, two new stimulus gestures emerge: tapping (= flexion) goes up and lifting (= extension) goes down. 
Using the same two response gestures throughout (= tapping down and lifting up), these four blocks should allow us to decouple the effects of direction and movement compatibility. Results showed that both of these factors were effective and added to each other: response gestures were particularly fast when prompted by stimulus gestures with the same direction and the same movement, and they were particularly slow when stimulus gestures were different on both dimensions. Performance was intermediate with similarity on one and dissimilarity on the other dimension. These observations appear to rule out the classical view that similarity-based compatibility effects can only arise in the process of response selection proper. Instead, they support the notion that iconic, or imitative, response specification may be highly effective even under conditions in which the reaction to be performed is completely prespecified and predetermined. Moreover, they suggest a functional role for similarity on (at least) two independent dimensions, one referring to the direction of movements in spatial terms and the other to the pattern of movements in bodily terms.
2.3 Spontaneous Action
In a further set of experiments we made an attempt to study action induction under conditions where there is no explicit task set that demands certain actions in response to certain stimuli. What we wanted to study instead was the spontaneous occurrence of movements and their relation to the events going on in the actor's environment. Such spontaneously occurring movements have sometimes been called ideomotor movements (cf. Prinz, 1987). As the term suggests, these movements have often been discussed in close relationship to ideomotor approaches to action. Still, the descriptive term (to which I refer here) must not be confounded with the theoretical concepts addressed above. Ideomotor movements may, under certain conditions, arise in a person who is observing the course of certain events. Classical examples of ideomotor action are body movements induced by watching other people's actions. For instance, while watching, in a slapstick movie, an actor who walks along the edge of a plunging precipice, observers may often be unable to sit still and watch quietly. They will move their legs and their arms or displace their body weight to one side or the other. Further, as mentioned above, ideomotor movements may also be induced by watching physical events resulting from actions. For instance, while following the course of a bowling ball, people are frequently unable to resist such movement tendencies. The involuntary, or even countervoluntary, nature of ideomotor actions has placed them among the curious phenomena of mental life. Moreover, the fact that they are instrumentally completely ineffective makes them even more mysterious (cf. Prinz, 1987). Quite obviously, ideomotor movements fall under the rubric of action induction phenomena. So far, however, it has been an open question how the pattern of body movements that is induced in the observer is related to the course of events that induce them. Basically, two answers to this question have been suggested.
The classical answer believes in perceptual induction, that is, induction based on similarity between the events perceived and the movements induced. This answer was already inherent in James's Ideomotor Principle, according to which the mental act of representing certain movements (like perceiving them) will always induce a tendency to perform the same or similar movements. Accordingly, perceptual induction implies that the observer tends to repeat in her actions what she sees happening in the scene. It considers ideomotor actions a special class of imitative actions-special in the sense of lacking an underlying intention to imitate. A competing answer is offered by intentional induction. This principle relies on intended rather than perceived events. It holds that the observer tends to perform actions that are suited to realizing what he wants to see happening. In other words, he is believed to act in a way that would be suited to reaching certain intended goals if his movements were effective. In a way, then, this principle considers ideomotor actions a special class of goal-directed, instrumental actions-special in the sense of being instrumentally ineffective.
We developed a paradigm that should allow us to study the relative contributions of perceptual and intentional induction (Knuf, 1998; Knuf et al., 2001). The task was modelled after the logic of the bowling-ball example. Participants saw a ball moving towards a target on a screen, either hitting or missing it. At the beginning of a trial, the ball was shown at its starting position at the bottom, and the target position was shown at the top. Starting positions and target positions were always chosen such that the ball had either to travel in a north-eastern or north-western direction in order to hit the target. Participants triggered the ball's computer-controlled travel and observed its course. The ball's travel was divided into two periods, instrumental and induction. During the instrumental period (which lasted about one second) participants could manipulate one of the two objects by corresponding joystick movements, depending on condition. In the ball condition, joystick movements acted to shift the ball to the left or the right (after which it would continue travelling in the same direction as before). By this means participants could shift the ball's trajectory and have a chance of hitting the target. In the target condition, joystick movements acted to shift the target to the left or the right in an attempt to give it a chance of getting hit. (Initial motion directions were chosen such that the ball would never hit the target without correction.) We reasoned that this task should allow us to study ideomotor movements occurring during the induction period (which followed the instrumental period and lasted for about two seconds). We examined how joystick movements occurring during this period (where they were no longer effective) were related to the happenings on the screen.
Perceptual induction predicts the same pattern of joystick movements for both conditions: they should always point in the same direction as the ball motion (leftwards with the ball travelling north-east, rightwards with the ball travelling north-west). Intentional induction predicts a more complex pattern. First, it leads one to expect that systematic joystick movements should only occur on trials with upcoming misses but not with upcoming hits. On upcoming hits, participants should be able to see, or extrapolate, that the ball would eventually hit the target, so that no further instrumental activity was required to achieve the goal. On upcoming misses, participants should likewise be able to extrapolate that the ball would eventually miss the target-which should then induce ideomotor movements performed in a (futile) attempt to affect the further course of events. More specifically, the details of these attempts should depend on two factors: the object under initial instrumental control (ball vs. target) and the side on which the ball is expected to miss the target (left vs. right misses). In the ball condition (where the ball is under initial control), joystick movements should act to push the ball towards the target (i.e. rightward in the case of a left miss, and leftward in the case of a right miss). In the target condition (where the target is under initial control), joystick movements should act to push the target towards the ball (leftward in the case of a left miss and rightward in the case of a right miss).
The results of our experiments lent strong support to intentional induction but not to perceptual induction. First, the direction of ball movement (north-west vs. north-east) did not appear to be a major determinant of the direction of induced movements. Second, on trials with upcoming hits, induced movements were virtually absent. Third, on trials with upcoming misses, pronounced induced movements emerged, whose directions were dependent on both the object under initial control (ball vs. target) and the side of the upcoming target miss (left vs. right), exactly in line with the pattern predicted by intentional induction. These findings suggest that, at least in our paradigm, ideomotor movements are much more strongly governed by representations of intended than of perceived events. Further experiments have shown that perceptual induction may in some cases be effective, too. For instance, when one looks at ideomotor movements induced in effectors that are not instrumentally involved in joystick control (like head and foot movements), one sometimes sees perceptual induction, too, suggesting that non-instrumental effectors tend to follow the ball's travelling direction. Intentional induction was, however, also effective in head and foot movements. Accordingly, a more comprehensive view will need to encompass both (weak) perceptual induction and (strong) intentional induction.
3. TO CONCLUDE

What can we learn from our experiments about the functional underpinnings of action control? Two major lessons appear to emerge, supporting the basic claims made in the first section. The first one is that watching certain events may automatically induce a tendency to produce actions that resemble these events. Interestingly, this does not only apply to observing other people's actions but may also extend to physical events that follow from one's own or other people's actions. Second, and perhaps even more importantly, when people watch actions or their effects they seem to represent them not only in physical terms (i.e. their spatio-temporal pattern) but in semantic terms as well (i.e. their underlying goals and the extent to which they are being achieved). In summary, theoretical considerations and experimental observations both suggest a novel conceptual framework for action control that provides roles for similarity and for goals in the underlying functional machinery. In order to develop such a novel framework, we need to combine sensorimotor with ideomotor strands of thought. This is what common coding theory is about.

REFERENCES

ACH, N. (1905), Über die Willenstätigkeit und das Denken. Göttingen: Vandenhoeck & Ruprecht.
ANDERSON, J. R. (1983), The Architecture of Cognition. Cambridge, Mass.: Harvard University Press.
BRASS, M. (1999), 'Imitation and ideomotor compatibility'. Dissertation, University of Munich.
-- BEKKERING, H., WOHLSCHLÄGER, A., and PRINZ, W. (2000), 'Compatibility between observed and executed finger movements: comparing symbolic, spatial, and imitative cues', Brain and Cognition, 44: 124-43.
---- (2001), 'Movement observation affects movement execution in a simple response task', Acta Psychologica, 106: 3-22.
DESCARTES, R. (1664), Traité de l'Homme. Paris: Girard.
DONDERS, F. C. (1862), 'Über die Schnelligkeit psychischer Processe', Archiv für Anatomie und Physiologie, 657-81.
FITTS, P. M., and BIEDERMAN, I. (1965), 'S-R compatibility and information reduction', Journal of Experimental Psychology, 69: 408-12.
FRESE, M., and SABINI, J. (eds.) (1985), Goal-Directed Behavior: The Concept of Action in Psychology. Hillsdale, NJ: Erlbaum.
GIBSON, J. J., and GIBSON, E. J. (1955), 'What is learned in perceptual learning? A reply to Professor Postman', Psychological Review, 62: 447-50.
GOLLWITZER, P. M., and BARGH, J. A. (eds.) (1996), The Psychology of Action: Linking Cognition and Motivation to Behavior. New York: Guilford Press.
GREENWALD, A. (1970), 'Sensory feedback mechanisms in performance control: with special reference to the ideomotor mechanism', Psychological Review, 77: 73-99.
HELMHOLTZ, H. (1852), 'Messungen über Fortpflanzungsgeschwindigkeit der Reizung in den Nerven', Archiv für Anatomie, Physiologie und wissenschaftliche Medizin, 199-216.
HERSHBERGER, W. A. (ed.) (1989), Volitional Action: Conation and Control. Amsterdam: North-Holland.
HOMMEL, B. (1993), 'Inverting the Simon effect by intention: determinants of direction and extent of effects of irrelevant spatial information', Psychological Research/Psychologische Forschung, 55: 270-9.
--(1997), 'Toward an action concept model of stimulus-response compatibility', in B. Hommel and W. Prinz (eds.), Theoretical Issues in Stimulus-Response Compatibility. Amsterdam: North-Holland, 281-320.
-- MÜSSELER, J., ASCHERSLEBEN, G., and PRINZ, W. (2001), 'The theory of event coding (TEC): a framework for perception and action', Behavioral and Brain Sciences, 24: 849-937.
HULL, C. L. (1943), Principles of Behavior. New York: Appleton-Century-Crofts.
JAMES, W. (1890), The Principles of Psychology. New York: Macmillan.
JEANNEROD, M. (1997), The Cognitive Neuroscience of Action. Oxford, UK: Blackwell.
KNUF, L. (1998), Ideomotorische Phänomene: Neue Fakten für ein altes Problem. Aachen: Shaker.
-- ASCHERSLEBEN, G., and PRINZ, W. (2001), 'An analysis of ideomotor action', Journal of Experimental Psychology: General, 130 (4): 779-98.
LIBERMAN, A. M., COOPER, F. S., SHANKWEILER, D. P., and STUDDERT-KENNEDY, M. (1967), 'Perception of the speech code', Psychological Review, 74: 431-61.
LOTZE, R. H. (1852), Medicinische Psychologie oder die Physiologie der Seele. Leipzig: Weidmann.
MASSARO, D. W. (1990), 'An information-processing analysis of perception and action', in O. Neumann and W. Prinz (eds.), Relationships between Perception and Action: Current Approaches. Berlin: Springer-Verlag, 133-66.
MEYER, D. E., and KIERAS, D. E. (1999), 'Précis to a practical unified theory of cognition and action: some lessons from EPIC computational models of human multiple-task performance', in D. Gopher and A. Koriat (eds.), Attention and Performance, XVII. Cambridge, Mass.: MIT Press, 17-88.
MILLER, G. A., GALANTER, E., and PRIBRAM, K. H. (1960), Plans and the Structure of Behavior. New York: Holt, Rinehart & Winston.
MÜSSELER, J. (1999), 'How independent from action control is perception?', in G. Aschersleben, T. Bachmann, and J. Müsseler (eds.), Cognitive Contributions to the Perception of Spatial and Temporal Events. Amsterdam: Elsevier, 121-47.
-- and HOMMEL, B. (1997), 'Blindness to response-compatible stimuli', Journal of Experimental Psychology: Human Perception and Performance, 23: 861-72.
NEWELL, A. (1990), Unified Theories of Cognition. Cambridge, Mass.: Harvard University Press.
POSTMAN, L. (1955), 'Association theory and perceptual learning', Psychological Review, 62: 438-46.
PRIBRAM, K. H. (1971), Languages of the Brain: Experimental Paradoxes and Principles in Neuropsychology. Englewood Cliffs, NJ: Prentice Hall.
PRINZ, W. (1984), 'Modes of linkage between perception and action', in W. Prinz and A. F. Sanders (eds.), Cognition and Motor Processes. Berlin, Heidelberg: Springer-Verlag, 185-93.
--(1987), 'Ideomotor action', in H. Heuer and A. F. Sanders (eds.), Perspectives on Perception and Action. Hillsdale, NJ: Erlbaum, 47-76.
--(1990), 'A common coding approach to perception and action', in O. Neumann and W. Prinz (eds.), Relationships between Perception and Action: Current Approaches. Berlin, New York: Springer, 167-201.
--(1992), 'Why don't we perceive our brain states?', European Journal of Cognitive Psychology, 4: 1-20.
--(1997a), 'Perception and action planning', European Journal of Cognitive Psychology, 9 (2): 129-54.
--(1997b), 'Why Donders has led us astray', in B. Hommel and W. Prinz (eds.), Theoretical Issues in Stimulus-Response Compatibility. Amsterdam: North-Holland, 247-67.
ROSENBAUM, D. A. (1991), Human Motor Control. San Diego: Academic Press.
SANDERS, A. F. (1980), 'Stage analysis of reaction processes', in G. E. Stelmach and J. Requin (eds.), Tutorials in Motor Behavior. Amsterdam: Elsevier, 331-54.
SCHUBÖ, A., ASCHERSLEBEN, G., and PRINZ, W. (2001), 'Interactions between perception and action in a reaction task with overlapping S-R assignments', Psychological Research, 65 (3): 145-57.
SIMON, J. R. (1969), 'Reactions towards the source of stimulation', Journal of Experimental Psychology, 81: 174-6.
-- and RUDELL, A. P. (1967), 'Auditory S-R compatibility: the effect of an irrelevant cue on information processing', Journal of Applied Psychology, 51: 300-4.
-- HINRICHS, J. V., and CRAFT, J. L. (1970), 'Auditory S-R compatibility: reaction time as a function of ear-hand correspondence and ear-response-location correspondence', Journal of Experimental Psychology, 86: 97-102.
STERNBERG, S. (1969), 'The discovery of processing stages: extensions of Donders' method', Acta Psychologica, 30: 276-315.
STOET, G., and HOMMEL, B. (1999), 'Action planning and the temporal binding of response codes', Journal of Experimental Psychology: Human Perception and Performance, 25: 1625-40.
VIVIANI, P., BAUD-BOVY, G., and REDOLFI, M. (1997), 'Perceiving and tracking kinesthetic stimuli: further evidence of motor-perceptual interactions', Journal of Experimental Psychology: Human Perception and Performance, 23: 1232-52.
WALLACE, R. A. (1971), 'S-R compatibility and the idea of a response code', Journal of Experimental Psychology, 88: 354-60.
WATSON, J. B. (1913), 'Psychology as the behaviorist views it', Psychological Review, 20: 158-78.
WELFORD, A. T. (1968), Fundamentals of Skill. London: Methuen.
--(ed.) (1980), Reaction Times. London: Academic Press.
8
Perception and Agency
Thomas Baldwin
Within the empiricist tradition perception is conceived as essentially passive in comparison with exercise of the active will: Locke writes 'For in bare naked Perception, the Mind is, for the most part, only passive'.1 According to pragmatists this contrast is overdone and leads to the mistakes of 'the spectator theory of knowledge' which are to be corrected by giving due weight to the connections between perception and agency: Dewey writes that 'Experience, in other words, is a matter of simultaneous doings and sufferings'.2 As ever within philosophy there are truths on both sides of this dispute; the difficult task is to strike the mean.
I want to start, not with perception itself, but with belief, and in particular with Moore's paradox, since it illustrates well the ambivalent relationship between belief and the will which is itself one aspect of the complex relationship between perception and the will. Moore's paradox is that I cannot coherently affirm such things as 'The sun is shining, but I don't believe that it is' or 'I believe that the sun is not shining, but it is' despite the fact that I know that there are many truths which I do not believe and that many of my beliefs are false. At one level, the explanation for this paradox is that a rational thinker who affirms that the sun is shining is committed to accepting that he believes that the sun is shining, so that he cannot coherently combine the simple affirmation that the sun is shining with the denial that he has that very belief or with the affirmation that he has the contradictory belief. In his discussion of Moore's paradox, however, Wittgenstein comes at the matter from a slightly different direction.3 He suggests that Moore's paradox shows that for each of us the self-ascription of present belief is redundant in the sense that there is no difference from a first-person perspective between affirming 'I believe that the sun is shining' and just affirming 'The sun is shining'.

An earlier version of this chapter was presented at Oxford and I am much indebted to the discussion on that occasion, in particular to Bill Brewer, Naomi Eilan, and Martin Davies.
1 J. Locke, An Essay concerning Human Understanding, ed. P. Nidditch (Oxford: Clarendon Press, 1975; 1st pub. 1689), II. IX. 1. Emphasis in original.
2 J. Dewey, 'The Need for a Recovery of Philosophy' (1917), reprinted in The Essential Dewey, i, ed. L. Hickman and T. Alexander (Bloomington, Ind.: Indiana University Press, 1998), 49. Emphasis in original.
3 L. Wittgenstein, Philosophical Investigations (Oxford: Blackwell, 1953), ii, p. x.
Once I have described what I believe the facts concerning some situation to be, I do not need to add a further description of the situation itself; my description of that was given by my description of my beliefs about it. This way of explaining Moore's paradox reverses the direction of that suggested above while confirming the underlying point. For the redundancy of our self-ascriptions of belief rests on our commitment to the truth of what we believe. This commitment is intimately linked to the role of beliefs as reasons for beliefs and action. Even though what we assume, we assume to be true, we do not take our assumptions to provide us with reasons for belief or action, and this is reflected in the fact that in making an assumption we do not commit ourselves to its truth. Furthermore, although the connection between rationality and truth is most obvious in the case of theoretical reasoning, it is the role of beliefs as reasons for action which is more fundamental to our commitment to their truth. For since the conclusions of theoretical reasonings are just further beliefs, our commitment to the truth of the beliefs we employ as reasons for them remains dependent on our commitment to the truth of the conclusions we draw from them. Thus an explanation of our commitment to the truth of beliefs is deferred rather than grounded by concentrating on their role in theoretical reasoning. By contrast, in the case of practical reasoning we employ our beliefs as premises in reasonings whose conclusions we attempt to put into action. So in this case, our commitment to the truth of these beliefs is not dependent on our commitment to the truth of other beliefs inferred from them, and we can readily understand it as manifested in our willingness to act. This sets up a connection between belief and agency (I shall return to it at the end of this chapter). Equally, however, the redundancy of belief shows that the connection here has to be at arm's length only. 
For where a thinker regards the truth of a proposition p (that, e.g., the sun will shine tomorrow) as not up to him, he is committed by the redundancy of belief to regarding the truth of the proposition I believe that p as similarly not up to him. For suppose, on the contrary, he could envisage himself deciding to believe that the sun will shine tomorrow while recognizing that his decision will not affect the actual state of the weather. Only a little reflection is required to exhibit such a decision as inherently irrational: the redundancy of belief implies that when deliberately bringing it about that he believes that the sun will shine tomorrow he is doing something which will commit him to the truth of the proposition that the sun will shine tomorrow; but his recognition that his decision can make no difference to whether the sun will shine tomorrow implies that he has no reason for making this commitment. It only follows from this that rational thinkers cannot regard their own beliefs as subject to their will. It does not follow that these beliefs may not in fact be in some respects subject to their will; indeed we are surely all of us prone to wishful thinking. But this fact about ourselves cannot show itself to us. We may acknowledge that it often would suit the internal economy of our feelings to be able to maintain our belief in something-for example, the belief that our children are honest. But faced with the question as to whether we believe this, the redundancy of
first-person ascriptions of belief forces us to confront the simple question 'are my children honest?', which is a question whose answer is not to be found by appealing to our feelings about the matter, but by reference to the behaviour and dispositions of our children. Thus we have to look to the world to find answers not only to questions about the world, but also to questions about our beliefs.
2
So far I have used Moore's paradox to discuss the ambivalent relationship between belief and the will. In turning to perception, there can be no immediate extension of exactly the same line of argument: for there is no incoherence in thinking, while I gaze at a waterfall, 'it appears as though the rocks are moving, but they are not'. As Wittgenstein observed when discussing Moore's paradox: 'One can mistrust one's senses but not one's own belief'.4 None the less, the intimate connections between perception and belief lead one to expect that if belief is not subject to the will, then the content of perception must be similarly independent of it. For although perceptions are not beliefs, their content provides us with evidence and thereby reasons for beliefs: the fact that it appears to me that p is a reason for me to believe that p. Hence if we were to suppose that the content of perception was subject to the will, we would have to suppose that such reasons for belief can be created at will: for in deciding how things appear to us, we would be creating evidence for ourselves. But since reasons for belief are reasons for the truth of what is believed, evidence cannot be in this way subject to the will. Evidence which is created at will can provide no rational support for hypotheses whose truth is independent of the will. The argument so far is structural: perception could not provide us with evidence if we regarded it as entirely subject to the will. If one now adds the familiar empiricist thesis that meaning is dependent upon perceptual evidence, it will follow that a conception of perception as subject to the will would be destructive of meaning and mental content in general. This conclusion is confirmed by an argument of Wittgenstein's which is not as well known as it deserves to be.
The argument occurs in his Remarks on the Philosophy of Psychology, ii, and begins with a series of questions: 79 Isn't it conceivable that there should be a man for whom ordinary seeing was subject to the will? Would seeing then teach him about the external world? Would things have colours if we could see them as we wished?
Wittgenstein does not answer these questions at once; initially he discusses the sense in which imagery is subject to the will. But then he returns to his questions: 91 Is it conceivable that visual impressions could be banished or called back? What is more, isn't it really possible? If I look at my hand and then move it out of my visual field,
L. Wittgenstein, Philosophical Investigations, 190.
Perception and Agency
haven't I voluntarily broken off the visual impression of it?-But I will be told that that sort of thing isn't called 'banishing the picture of the hand'! Certainly not; but where does the difference lie? One would like to say that the will affects images directly. For if I voluntarily change my visual impression, then things obey my will.
The point here seems to be that although my ability to modify the 'things' I perceive, for instance, my hand, is a way of voluntarily changing what I perceive, this is not what the hypothesis of §79 was supposed to concern. Instead we are supposed to imagine a case in which someone can change their 'visual impressions' without doing so by altering the things they see. In the next three sections Wittgenstein argues that this supposition is incoherent: 92 But what if visual impressions could be controlled directly? Should I say, 'Then there wouldn't be any impressions, but only images'? And what would that be like? How would I find out, for instance, that another person has a certain image? He would tell me.-But how would he learn the necessary words, let us say 'red' and 'round'? For surely I couldn't teach them to him by pointing to something red and round. I could only evoke within myself the image of my pointing to something of the sort. And furthermore I couldn't test whether he was understanding me. Why, I could of course not even see him; no, I could only form an image of him. Isn't this hypothesis really like the one that there is only fiction in the world and no truth? 93 And of course I myself couldn't learn or invent a description of my images. For what would it mean to say, e.g., that I was forming an image of a red cross on a white background? What does a red cross look like? Like this??-But couldn't a higher being know intuitively what images I am forming, and describe them in his language, even though I couldn't understand it? Suppose that this higher being were to say, 'I know what image this man is now forming; it is this: ... '-But how was I able to call that 'knowing'? It is completely different from what we call 'knowing what someone else is imaging'. How can the normal case be compared with the one we have invented?
If I think of myself in this case as a third person, then I would have absolutely no idea what the higher being means when it says, with regard to someone who has only images and no impressions, that it knows which images that man has. 94 'But nevertheless can't I still imagine such a case?' The first thing to say is, you can talk about it. But that doesn't show that you have thought it through completely. (5 o'clock on the sun).
This long passage is, in effect, an application of the famous Private Language Argument. The basic thought is that the hypothesis that the content of perception is subject to the will turns out to be incoherent-like the thought that the time is 5 o'clock on the sun-because it undermines the possibility of perceptual-, or image-, content. The argument has two stages. In the first (§92), Wittgenstein points out that the hypothesis undermines the possibility of any shared language for describing the appearances of things (e.g. as red and round). For pointing at red round things in order to check that there is agreement in judgement concerning their colour and shape provides, under this hypothesis, no way of grounding a common understanding of the language: for who is to know how such things look to another
Thomas Baldwin
person if their appearance to him is subject to his will. But without any basis for agreement in judgement on such obvious matters as shape and colour there can be no shared language concerning them. This then undermines my confidence that my own experience is experience of a real world; indeed, once I recognize that my own experience is subject to my will, I should acknowledge that I have no reason to take it as experience of a real world at all. Instead the world as I experience it will be just my own 'fiction' and it will, for example, be for me an open question whether things themselves are coloured in anything like the ways in which I have decided they are to appear to me to be (as Wittgenstein suggested in the last sentence of §79). The second stage of the argument (§93) then takes the point further by criticizing the assumption that there is any conceptual content at all in the experiences and images of this subject isolated within their own fictional world. For without any perceptual contents independent of the subject's will and thus justifiable by reference to observed features of the world itself, the thinker's judgements as to what its images are images of (e.g. a red cross) are vacuous-not for simple verificationist reasons, but because there is no basis for a distinction between 'is right' and 'seems right' in the thinker's putative application of these concepts. We are straightforwardly back on the territory of the Private Language Argument here; and in this case, as ever, appeal to a God who 'knows' what images I have simply begs the question as to whether there is anything for such a 'higher being' to know in the first place. Wittgenstein's argument shows nicely how the hypothesis that perception is subject to the will undermines not only serious epistemology (the possibility of learning about the external world), but also, and thereby, the possibility of any conceptualization of perceptual content. 
Furthermore, although the argument does draw on the thought that voluntary perceptions cannot be regarded as providing evidence concerning a real world, the argument does not depend on a simple-minded empiricist verificationism concerning meaning and content; instead, as with the original Private Language Argument, it draws only on the need for some basis for the 'is right'/'seems right' distinction, and the absence of any such basis within the purely voluntary experience of the subject envisaged in §§92-3. Hence, recoiling from Wittgenstein's reductio we have to accept that, in this fundamental sense in which imaging is subject to the will, perception is not. Thus any account of perception which seeks to show how agency, and thus the will, is in some respects implicated in perception has to respect this conclusion. There are limits to the involvement of the will in perception.

3
Having explored these limits, however, I now want to see how much involvement of the will remains possible in the case of perception. In the case of belief, it will be recalled, I argued that despite the fact that belief is not subject to the will, the involvement of beliefs in practical reasonings, and thus indirectly in agency, is an
essential condition of the fact that beliefs incorporate a commitment to truth. So there is here a precedent for supposing that some involvement of perception in agency is not only possible, but essential. An initial point to make is that it certainly does not follow from the considerations advanced so far that our epistemology has to be based on passive observation. As Collingwood famously maintained,5 an important part of scientific inquiry is working out the right questions to ask, and this is not a matter of simply waiting for them to emerge from the data. So there is plenty of space here for mental and practical activity in constructing hypotheses and experimental methods to test them. All that follows from the conclusion that perception is not subject to the will is that at the heart of empirical inquiry there has to be a moment when the investigator has to stand back in order to wait and see what the results are-whether the light bends, what proportion of peas are smooth or wrinkled, or whatever. Empirical inquiry is in this respect a curious enterprise: preparations, often lengthy and massively expensive, are undertaken-but always and inescapably in such a way as to construct a space (Heidegger's 'clearing') within which the results of inquiry become apparent to investigators who cannot dictate what they are to be. Duhem's famous thesis was that the significance of the results of empirical inquiries is always open to question; but it does not follow that we are free to decide what these results were. This conception of the practical question and answer dialectic ('Man proposes; nature disposes') suffices, I think, to show that empiricism is not tied to a merely passive 'spectator theory of knowledge'. But the activity involved in the arrangement of empirical inquiries is always prior to, and essentially separate from, observation itself.
The issue I now want to pursue is whether there are any more intimate relationships between perception and agency whereby, somehow, agency contributes to the structure of perception despite the limits on this involvement explored earlier. One kind of involvement is in fact discussed by Wittgenstein himself, namely that which occurs in cases of 'seeing as' where a figure can be seen in two ways: for example, as a drawing of a duck or of a rabbit (Jastrow's duck-rabbit).6 For in a case of this kind, once we are familiar with it, we can switch from one way of seeing it to the other more or less at will. But the role of the will in this context is restricted to one of selection between alternatives that perception informed by past experience makes available to us. It is not that we can see things however we choose-as if we could see the duck-rabbit as a drawing of a cow if we wanted. Rather, as with an ambiguous sentence, the figure admits of more than one perceptual interpretation; and once we are familiar with two or more of them, we can choose between them. These interpretations are not, however, created at will: even with advice from others one needs to find for oneself the requisite perceptual
5 R. G. Collingwood, The Idea of History (Oxford: Clarendon, 1946), 269.
6 Wittgenstein, Philosophical Investigations, 194.
organization, which, once achieved, leads to the experience of the 'dawning of an aspect'. In these cases, then, the will can only switch us between ways of seeing or hearing which it has not itself created. Cases of this kind involve a form of perceptual attention, and this is a more general feature of perceptual consciousness in which the will is often engaged. Again, the basic activity of the will in this area is one of selecting among a variety of perceptual inputs that are independently available to the subject. In simple cases the selection is of one sense-modality, for example, hearing, at the expense of others; but usually it is a matter of concentrating on one situation that is currently perceived in preference to other currently perceptible situations (e.g. attending to a lecture instead of noticing the people passing outside the window). The role of perceptual attention in this respect is central to the perceptual process. Like most organisms we are assailed all the time by sensory inputs which make available to us potential information about our environment and ourselves that far exceeds our limited capacity to extract the implicit information or do anything with it. A perceptual system which lacked a means of concentrating its limited resources on just some of its sensory inputs would be unable to respond coherently in a complex environment; instead of keeping track of a prey or learning from a spoken message it would be continuously distracted by other sights, sounds, and smells. No doubt there are simple organisms which lack this capacity. But I think Naomi Eilan is right to argue that perceptual consciousness comes only with perceptual attention,7 that a subject has a 'point of view' only where it is able to attend selectively to its sensory inputs, and thus develop a coherent and ordered representation of those aspects of its current environment in which it locates itself.
In many cases, of course, perceptual attention is involuntary: movements at the periphery of the visual field typically 'catch one's eye'; and we all know the infuriating way in which the attempt to conduct sustained conversation at a party can be disrupted by involuntary switches in auditory attention to a conversation that is just audible in the distance. Is it then wrong to think that the will makes a central contribution to perception through its exercise in directing perceptual attention? These cases show that the will is not omnipotent in this area-we can try, but fail, to attend to something. Perceptual attention remains in some respects involuntary, doubtless for evolutionary reasons: our ancestors needed the ability to respond promptly to potential dangers that arise outside the situation currently attended to. None the less, as rational beings we need to be able to harness our capacity for perceptual attention to our own ends, and we do this precisely by developing the ability to direct it ourselves. Learning to take control of our capacity for perceptual attention, in so far as we can, is a bit like learning to control our breathing, in so far as we can.
7 N. Eilan, 'Consciousness and the Self', in The Body and the Self, ed. J. Bermúdez, A. Marcel, and N. Eilan (Cambridge, Mass.: MIT Press, 1995), 337-57.
If we think of sight and touch, perceptual attention is closely related to voluntary control of the relevant sense-organs, and this then provides another way in which the will is involved in perception. But what remains debatable is how deep this involvement runs-how far, first, attention does involve a capacity for adjustment of the relevant sense-organs; and second, how far, as a consequence, the existence of these bodily sense-organs is implicated in perceptual consciousness itself. One way to think about these questions is to connect them with Quassim Cassam's thesis that, as self-conscious subjects, we possess an 'intuitive' awareness of ourselves as embodied.8 For if perceptual consciousness brings with it perceptual attention, and if attention involves a capacity to direct one's sense-organs, then Cassam's conclusion that we possess an intuitive awareness of ourselves as embodied follows rather more straightforwardly from the hypothesis of self-consciousness than he supposes. A case which shows the need for caution here, however, is hearing; for, intuitively, our auditory system is one over which we exercise little voluntary control and which, for that very reason, is of all our senses the most apparently 'disembodied'. It is no accident that Strawson chose to write about a 'sound' world in order to explore his Kantian fantasies about objectivity and space;9 for in this sense-modality we are, phenomenologically, as unlocated in space as we can be. Though of course we often turn our head in the direction of a sound, shifts in auditory attention do not require movement of the ears-when listening to an orchestra we do not have to physically 'focus' our ears on different sound sources in order to attend to one type of instrument rather than another; we can do the selection at a later stage in the auditory processing.
It is less clear that we can altogether detach our conception of ourselves as hearing sounds from an awareness of our ears: Russell was at one time very scornful of Dewey's suggestion that we can only speak of sounds as 'heard' where we imply that our ears were involved,10 though he later changed sides on this issue and was, characteristically, equally scornful about his earlier view.11 On balance I think a version of Russell's early view was right: we can imagine some primitive people thinking that we use our noses to hear things. Admittedly, this belief might lead to some odd habits, and it would not take much to correct it; but there does not seem to me anything incoherent in the supposition. The case of hearing shows, then, that the argument suggested earlier, that perceptual attention involves voluntary control of a physical sense-organ and thus brings with it an intuitive awareness of ourselves as embodied subjects, is too hasty. Except in the case of touch, perceptual consciousness does not need to be informed by an awareness of the role of the sense-organs involved. This does not
8 Q. Cassam, Self and World (Oxford: Clarendon, 1997).
9 P. F. Strawson, Individuals (London: Methuen, 1959), ch. 2.
10 B. Russell, Collected Papers, viii, ed. J. Slater (London: George Allen & Unwin, 1986), 150.
11 Ibid. 255.
show that it is altogether independent of an intuitive awareness of our embodiment. My own view is that some such intuitive awareness is implicit in perceptual consciousness-even auditory consciousness-because of the way in which perceptual consciousness incorporates an implicit recognition of the causal role of the physical objects of perception. But spelling this out is a complex and contentious matter and independent of my concerns here.12

4
So far I have suggested that, despite my earlier conclusion that the content of perception cannot be subject to the will, the will can have an important part to play in the development and control of perception, particularly in the direction of perceptual attention. What I now want to discuss is how far a connection with the will is a condition for the possibility of perceptual content, even though the will cannot determine this content. This is of course an issue which connects with many disputed questions concerning the determination of mental content in general, but I hope to be able to discuss it without too many contentious commitments. A good place to start is at the subpersonal level, at the level of the sensory representations to be found in simple organisms. Whatever one's favoured theory of content, it seems plausible to hold that it is only where properties of these representations are such as to enable them to contribute to the determination of behaviour which contributes to meeting an organism's needs or fulfilling its goals that these representations have any content at all. For without a connection of this kind with behaviour, a putative sensory representation is no more than a state of the organism with a structure systematically determined by significant features of its environment; but a state which is in this way an effect of the environment is not thereby a representation of it. Exposure to radiation may well have systematic effects within an organism; but in the absence of any connection between these effects and a capacity for appropriate behaviour, these effects are not states which represent for the organism the level of radioactivity to which it has been exposed. The content of a sensory representation in a case of this kind is, therefore, a function both of its distinctive environmental causes and of the way in which it is apt to contribute to the determination of appropriate behaviour.
Different accounts of 'naturalized intentionality' will fill out in different ways details of the requisite causation, the appropriate behaviour, and the relationship between the two. But these need not concern us. What is important here is just that there is an essential role for the determination of appropriate behaviour. For this sets the stage for the thesis with which I am primarily concerned, that the possibility of perceptual content is dependent upon a role for agency. This thesis is not established
12 Cf. T. Baldwin, 'Objectivity, Causality and Agency', in The Body and the Self, ed. J. Bermúdez, A. Marcel, and N. Eilan (Cambridge, Mass.: MIT Press, 1995), 107-26.
simply by the conclusion already reached; for that concerns subpersonal systems, whereas agency requires an agent, and thus at least a subject of action, if not a person. But it will be clear that in thinking about cases of this latter kind we can build upon the simpler one. The first case to think about is that in which perceptual consciousness is 'subjective' in the sense that the perceiver is present within it as its subject. In this case perceptual consciousness involves the organization of sensory fields in such a way that the perceiver's 'point of view' emerges as a point of origin within these fields. This point of origin need not be conceptualized as such; indeed conceptual thought need not be presupposed at all. What is primarily implied is an egocentric organization of perceptual content which gives it coherence and order from the perceiver's point of view, including sufficient integration of short-term memory with present experience to enable the perceiver to keep track of their relationship to objects within the perceived environment. But is it also implied that this subject be an agent, something which we can think of as capable of setting itself to do something, and thus of forming intentions or giving itself goals which it is also capable of abandoning when it fails to achieve them? We certainly connect the subjective structure of what we perceive with its potential for purposive activity: for example, what is near is what it would be easy to reach. It may well be argued, however, that this connection just reflects the truth of the converse of the thesis that is under consideration here, that is, the truth of the thesis that agency requires subjectivity. The plausibility of this thesis derives from the fact that agency is essentially first-person (as agents, we set ourselves to do things) and therefore requires beliefs, and thus perceptions, which represent the world from a first-person point of view. 
One cannot set oneself to chase a cat unless one can locate oneself in relation to the cat; and this latter capacity involves perceptions with an egocentric structure. But what of the original thesis, that the subjectivity of perception depends upon agency? One line of thought starts from the assumption that a subjective perceptual consciousness is one in which we are able to keep track of our location within a changing environment in which we are ourselves altering our location as we move around. This ability requires the ability to distinguish changes in the perceived world which are consequences of one's own movements from other changes in the environment, and, the suggestion is, our ability to make this distinction draws on the fact that it is where our movements are purposive that we are immediately aware of them as our own and are therefore able to distinguish their consequences from other changes in the perceived world. The objection to this suggestion, however, is that one can just as readily envisage this distinction being accomplished through the operation of a subpersonal system which monitors bodily movements and feeds information about them into the processing of perceptual input. The appeal to agency here achieves nothing that cannot be more straightforwardly achieved without it.
This objection seems right to me, and to be of a type which can be readily replicated to shoot down other suggestions which invoke agency as a condition of other distinctions inherent in subjective perceptual consciousness. But there remains one more general consideration in favour of the thesis that subjectivity requires agency, namely by treating it as an extension of the earlier thesis that sensory states have representational content only in so far as they are capable of contributing systematically to the determination of appropriate behaviour. For if we apply this thesis specifically to the hypothesis that sensory states have subjective representational content, then, the suggestion is, this supposed subjective content should be such that it makes a distinctive contribution to the determination of appropriate behaviour, and the only obvious way in which such a distinctive contribution can be made is precisely where the behaviour has the first-person structure that comes with agency. The difficulty with this argument is that it appears to involve a verificationist assumption: put crudely, the thought seems to be that the way to verify that something has a subjective perceptual consciousness is to find that this consciousness is manifested in the first-person structure of its behaviour as an agent. But I think the argument's rationale can be set at a deeper level than that of straightforward verificationism: instead, it rests on the Kantian thought that the first-person structure of subjective perceptual consciousness is such a fundamental feature of it (an 'a priori' feature of it, in Kant's terminology) that it is not something that can be grounded within the empirical structure of consciousness alone. Hence it requires vindication through considerations that look beyond consciousness to its place in human life.
For if this is granted then a connection with agency does look to be the only plausible 'transcendental condition' for the possibility of subjective perceptual consciousness, since agency is so deeply implicated in first-person modes of thought. This line of argument remains somewhat speculative, as well as bringing with it the contentious presumptions of Kantian transcendental arguments, which I shall not seek to discharge here. No doubt, therefore, the argument needs refinement, but rather than attempt that task here I want to consider, finally, what difference it makes if one assumes that the subject of perception is also a person capable of rational thought. Clearly, if the preceding argument is a good one, it applies to persons in particular: personal subjects of perception must be agents. But there is a separate question as to how far their rationality also brings with it a commitment to agency. One general argument for this thesis rests on the claim made earlier in this chapter concerning the fundamental role of practical reasoning in explaining the commitment to truth inherent in belief. For in taking the subject of perception to be a person, we take it that their perceptual consciousness provides them with evidence which gives them reasons for belief. Hence if it is correct to suppose that the role of such reasons rests on the involvement of beliefs in practical reasoning, it will follow that a personal subject of perception must also be a person capable of rational action.
This general line of thought is rather abstract, but it can be filled out by considering what is involved in the mastery of some of the most familiar concepts which an ordinary person employs on the basis of their experience, namely practical concepts such as house, pen, bus, and so on. Since these are concepts which characterize things by the use that is made of them, it is obvious that someone who grasps these concepts must have some understanding of these uses; the substantive thesis is then that this latter understanding requires the ability to use things of this kind, which is a form of rational action. At its simplest, the thesis is that to recognize something as a pen one must be able to write with a pen. The argument for this is that without the ability to use a pen, one will not be able to grasp the basis for distinguishing between pens and other apparently similar things; where the basis for a classification is essentially practical, the ability to classify correctly requires the practical ability in question. Admittedly, in the case of some sophisticated but distinctive pieces of equipment, such as an electron microscope, one might have the ability to identify the equipment without the ability to use it; but one's understanding in such cases is limited and derivative: without the ability to use it oneself, one cannot tell when it is working, and one can only identify it because those who can use it have taught one its characteristic appearance. Thus in general, to use Ryle's idioms, in the case of these practical concepts, 'knowing-that' presupposes 'knowing-how'. It may be objected that this case is too easy to be properly exemplary, since a connection with practice is all too obviously built in to a grasp of practical concepts. 
But even if the argument in this case is straightforward, the case is none the less important because these concepts are characteristic both of our everyday 'common sense' understanding of the world and of a scientific understanding which acknowledges the role of experimental equipment in its justifications. In both cases practical concepts make an ineliminable contribution to our epistemology, and with them comes a necessary connection between rationality and agency. Anyway, the general thesis here can also be supported by considering what is involved in mastery of natural kind concepts. These are certainly not practical concepts; instead they are concepts which purport to identify fundamental kinds within a theoretical framework that unifies and explains a domain of inquiry such as chemistry. None the less there is considerable intuitive plausibility in the pragmatist hypothesis that mastery of these concepts requires the ability to undertake the kinds of empirical inquiry which justify their application. As it stands, this is clearly too strong a demand, since one can have a reasonable understanding of the basic concepts of chemistry, and in particular of the table of elements, while lacking any serious experimental experience and expertise. But the way to reinstate the pragmatist thesis is by distinguishing a first-hand from a derivative understanding of the concepts in question.13 The derivative understanding of chemistry which
13 Cf. H. Putnam, 'The Meaning of "Meaning"', reprinted in his Philosophical Papers, ii (Cambridge: Cambridge University Press, 1975), 215-71, esp. 227-8.
most of us possess is based upon familiarity with textbooks and so on; but those who write these textbooks rely on the authority of investigators with a first-hand understanding to which the pragmatist thesis does plausibly apply. A simple argument for this pragmatist thesis would be that mastery of the theoretical concepts involved requires a capacity for knowledge of the relevant features of the world which can be obtained, first hand, only by empirical inquiry. The difficulty with this argument, however, is that the demand that knowledge be possible is, on the face of it, verificationist. But this demand can, I think, be circumvented by substituting the demand that the concepts involved be employed in such a way that there can be agreement in judgement concerning their application. For this will be possible only where first-hand investigators apply these concepts in situations where others can exercise their own judgment concerning the case in hand-and these will necessarily be situations in which empirical inquiries are conducted. Thus even where one is dealing with theoretical concepts such as natural kind concepts, there is good reason to hold that the ability to apply these concepts when reasoning on the basis of experience requires rational action, albeit, for most of us, that exemplified by the investigations of those to whom we defer. In this final argument I have substituted the requirement that agreement in judgement be possible for the requirement that knowledge be possible. Such a requirement obviously belongs within a Wittgensteinian account of concepts and language-games, and in setting out a brief review of the position put forward in this chapter this final line of thought can be placed alongside the considerations which I took earlier from Wittgenstein himself. 
The earlier claim was that if perception is imagined to be entirely subject to the will, its content will be entirely idiosyncratic and there will then be no possibility of agreement in judgement even concerning apparently obvious features of a shared situation; and where there is no such possibility of agreement in judgement, Wittgenstein argued, there is no basis for supposing that experience has any conceptual content at all. The final argument can be reconstructed as starting from the opposite hypothesis: that the will, so far from being almost omnipotent in our lives, is largely impotent, so that we are unable to conduct rational activities such as empirical inquiry. It is then argued that without this capacity we cannot construct situations within which agreement in judgement is possible; but without such agreement, again, conceptual thought is not possible. Hence the two arguments frame the constraints which govern the relationship between the will and perception: on the one hand, the contents of perception cannot be dictated by the will; on the other hand, a person who is a subject of perceptual consciousness needs to be someone who is capable of rational action.
9
Fractionating the Intentional Control of Behaviour: A Neuropsychological Analysis

Glyn W. Humphreys and M. Jane Riddoch
1. INTRODUCTION
Neuropsychological studies have played an important role in guiding our understanding of cognitive function. For example, studies of selective deficits in particular tasks, following brain lesions, provide constraints on models of cognition, demonstrating functional modularity between processes underlying the tasks. In many instances, the deficits reflect impairments to specific forms of stored representation or in gaining access to those stored representations-even when the representations are normally accessed automatically (examples here would include deficits in object recognition and in the recognition of printed and auditory words; see Humphreys, 1999; Coltheart et al., 1980; Hall and Riddoch, 1997). In other cases, though, the deficits do not reside in automatic processes but rather in control processes that either modulate access to stored representations or that modulate the consequences of the access process. A classic illustration of this is so-called 'utilization behaviour', following damage to the frontal lobes. Patients displaying utilization behaviour may act upon stimuli present in their environment irrespective of the tasks they are asked to perform or even irrespective of the physical constraints of the situation. Lhermitte, for example, placed pairs of spectacles in front of one patient, who proceeded to place each pair on his face-even though this involved placing one pair on top of the others (see Lhermitte, 1983)! Such behaviours appear to represent extreme examples of loss of task-based or instructional control over action. Task constraints appear neither to prevent irrelevant stimuli accessing learned associations nor to halt the learned associations from being enacted. These deficits in control processes can provide important information about the nature of intentional behaviour. In this chapter, we will review neuropsychological evidence indicating that the intentional control of behaviour can break down in a number of ways.
The dissociations between the different forms of intentional control show that self-agency is not unitary but instead involves a number of separable components. We suggest that it is only by identifying these components that we will begin to understand the nature of intentional behaviour. This work was supported by grants from the Medical Research Council and the Wellcome Trust (UK).
We begin by discussing frameworks for the intentional control of action, derived from studies of normal subjects. Such frameworks are then useful for reviewing neuropsychological evidence on the fractionation of intentional behaviour.
2. ACTION SLIPS AND THE INTENTIONAL CONTROL OF ACTION

In everyday life, our actions usually correspond to aspects of our general intentional will. The intentional control of action occurs even in instances where psychologists have shown that the behaviour is targeted at a stimulus that the subject appears to have no awareness of-an example being patients with 'blindsight', who may reach with accuracy to a stimulus that they self-report as not being able to 'see' (e.g. Weiskrantz, 1986). Although targets in such instances appear not to be coded consciously, we would hold that the reaching action itself is intentional and based on the instructions given to the patients by the experimenters ('just guess where the object is'). One might even suggest that actions such as reaching and grasping need to be elicited intentionally, by instruction, in order to be directed to targets that are not themselves available for intentional report. Initiation of the motor response itself is contingent on intentional control, even if stimulus processing, and the linkage of the stimulus to the response, is not. Thus, once recruited by intention, the action can be made to a stimulus that the actor seems unaware of. But, one might counter, this is surely a matter of degree, for some responses do appear to be generated unintentionally-an example being 'leakages' in non-verbal behaviour when an actor is telling a falsehood. There is certainly good evidence that electrophysiological measures, such as evoked potentials or galvanic skin responses, can be generated unintentionally. In neuropsychology, such measures have been used to demonstrate unconscious coding of stimuli in a variety of disorders, from visual neglect (Vallar et al., 1991), to achromatopsia (Humphreys et al., 1992) and prosopagnosia (Bauer, 1984), where patients cannot make intentional discrimination responses to the stimuli involved. Further examples are balance reactions and eye movements.
Balance reactions can be based in part on variations in patterns of 'optical flow' as we move around the environment (or as objects move around us), even though people are typically unaware of the information that they are responding to (or even that some form of balance reaction has taken place; see Lee and Aaronson, 1974). Similarly eye movements may be elicited and directed to the locations where stimuli appear even when we are directed to delay the movement and to make it in the direction opposite to where the stimulus appears (as in so-called 'anti-saccade' tasks; see Rafal et al., 2000). Here eye movements are made contrary to the intended behaviour. Is it the case then that some behaviours may only be initiated intentionally (e.g. reaching and grasping), whilst other-perhaps more 'primitive' actions-can be
evoked even without intention? Galvanic skin responses, balance reactions, and stimulus-elicited eye movements, for example, may operate through special-purpose, hard-wired processes, perhaps represented sub-cortically. Other behaviours, not hard-wired in, are based on activation within the cortex, and subject to intentional control. This point, on whether some behaviours are critically dependent on intentional control, is one that we will return to at the close when we discuss dissociations between actions such as pointing and grasping. For now, we wish to contrast this dichotomous view (some behaviours are dependent on intention, others are not), with a more continuous approach. The continuous approach states that all motor behaviours can be generated unintentionally under some circumstances (including reaching and grasping, our starting examples); what is critical is to define the circumstances where intentional control does and does not operate, and how intentional control of behaviour is effected (when it is). Support for this continuous approach comes from the study of 'slips of action', where we make errors that to some degree transgress an intentional goal (see Reason, 1984, for documentation of such slips, based on diary studies). An example might be taking a familiar turning on a route instead of following a less familiar road to a new destination. Such errors can involve actions that are not only learned (e.g. turning a wheel and changing gear in a car), but that also involve a whole sequence of motor behaviours (avoiding pedestrians, responding to traffic lights). It seems unlikely that these actions are generated by special-purpose (sub-cortical) mechanisms, and, indeed on the surface, these action errors seem little different in kind from intended actions. What seems more critical here than the nature of the actions is the overlap between the stimuli being processed and the specification of the intention. 
In particular, action slips often arise when there is some degree of partial match between stimuli and the intention. In our example of driving, the intention may involve components such as 'starting along a familiar route', 'taking a new turn at a particular junction', 'taking particular care when changing gear in this car', and so forth. The general intention, then, can specify a set of component actions, any of which can be released by appropriate stimuli. We can speak of the intention providing a response set that is activated by stimuli (cf. Broadbent, 1971). The control of action itself, though, is not directly based on the intention but on whether the excitation of representations in the intended response set is greater than that of other actions that are activated by the stimuli. If the intended actions are activated more weakly than actions not in the intended response set, but which are nevertheless strongly linked to the stimulus, then these other actions may be elicited and an action error results. Figure 9.1 illustrates this idea. The continuous approach we have outlined makes two points: (i) that there are not classes of behaviour that are intentional and others that are not; all behaviours may be generated unintentionally, and (ii) that intention does not directly generate behaviour, rather it modulates response activation within a system that is sensitive to environmental (bottom-up) factors as well as intention (top-down control).
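The competitive account just described can be rendered as a toy model: intention primes a response set, but what is actually performed is whichever response is most strongly activated. The following sketch is ours, not the authors'; the response names, activation values, and size of the intentional bias are all hypothetical illustrations.

```python
# Illustrative sketch of response selection under intentional modulation,
# following the 'continuous' account in the text: intention biases
# activation within a response set, but the most active response wins
# regardless of whether it was intended. All names/numbers hypothetical.

def select_response(stimulus_activation, intended_set, intention_bias=0.3):
    """Return the most active response after adding a fixed intentional
    bias to every response in the intended response set."""
    scores = {
        response: activation + (intention_bias if response in intended_set else 0.0)
        for response, activation in stimulus_activation.items()
    }
    return max(scores, key=scores.get)

# Driving example from the text: the familiar turning is strongly
# activated bottom-up (0.8), the intended new route more weakly (0.4).
activation = {"take_familiar_turning": 0.8, "continue_to_new_road": 0.4}

# A modest intentional bias fails to overcome the overlearned response:
# an action slip (0.4 + 0.3 < 0.8).
slip = select_response(activation, intended_set={"continue_to_new_road"})

# A stronger bias lets the intended action win (0.4 + 0.5 > 0.8).
intended = select_response(activation, {"continue_to_new_road"}, intention_bias=0.5)

print(slip, intended)
```

On this rendering, intention never generates behaviour directly; it only shifts the competition, which is the second point made above.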
[Figure 9.1. Outline framework for how intention may modulate our taking a novel turn on a familiar road. In the figure, stimuli (familiar landmarks, the junction) activate responses within a response set; each response is marked as either primed or not primed by the current intention.]
[Figure 9.2. Illustration of the Norman and Shallice (1986) framework. A stimulus activates a hierarchy of action schemas in the contention scheduling system, which generates the response; the supervisory attentional system modulates activity in the contention scheduling system.]
The approach is incorporated into models such as that proposed by Norman and Shallice (1986). Norman and Shallice distinguish between a 'contention scheduling system' (CSS) and a 'supervisory attentional system' (SAS). The CSS contains response programs that are activated partly by stimuli, in a bottom-up manner, and partly by the SAS, which modulates activity in the CSS. Within the Norman and Shallice framework, an action slip would arise when a response is strongly activated by a stimulus and there is insufficient modulation of other responses from the SAS. The framework is presented in Figure 9.2.
This approach is also similar to what have traditionally been labelled 'late-selection' theories of selection (cf. Deutsch and Deutsch, 1963). Late-selection theories propose that all stimuli are processed to a high level, with stored representations being accessed without any limits apart from those imposed by sensory constraints (e.g. stimuli falling on the periphery of the retina will be less likely to activate stored representations than those falling on the fovea, due to differences in acuity). On this view, all possible response associations are accessed by stimuli, and behaviours are selected from the competition between the activated responses. Response activation will be modulated both by associated stimulus-response contingencies and by top-down intention. In experimental psychology, arguments about whether selection operates 'late' (after stimuli access stored representations) or 'early' (prior to access to stored representations) have been waged for the past forty years (see Broadbent, 1958; Treisman, 1960, for some initial views; see Mack and Rock, 1999, for a later exposition). Recently, behavioural experiments have been added to by studies using techniques from cognitive neuroscience (including physiological recordings of single-cell responses, neuropsychological and electrophysiological studies of evoked potentials, and studies of functional brain imaging), and, we suggest, some movement has been made towards resolution of the long-standing question. There is clear evidence that selection does not only operate at a late stage of processing but also on early stages concerned with perceptual analysis of stimuli. To give but one example, Rees, Frith, and Lavie (1997) used functional brain imaging to measure activation in a brain region (MT) known to be responsible for coding motion in the stimulus (see Zeki, 1993).
They presented a central word against a background of moving dots and found that activation of MT by dots was affected by the degree to which subjects attended to the central word. When subjects had to perform a 'low-level' task on the word (detect a particular letter) there was more activation of MT by the background dots than when subjects had to identify the word. This demonstrates that perceptual processing of stimuli is affected by ongoing task activity. When we are engaged in a more difficult task there is less perceptual analysis of irrelevant stimuli than when we undertake an easier task on a relevant stimulus. It seems that selection of the relevant over the irrelevant stimulus can be 'early' (at a perceptual stage) or 'late' (even at a response stage), depending on factors such as the difficulty of the task at hand. Lavie (1995) proposes that task difficulty influences the attentional resources available to process irrelevant stimuli. When the primary task is difficult there are fewer resources available to process irrelevant stimuli to high levels than when the primary task is easy. There is 'early selection' in the former case and 'late selection' in the latter. We return to this point concerning task difficulty when we consider the factors governing the control of behaviour in neuropsychological patients. This empirical work suggests that intentional processes not only influence selection between competing responses but also selection at earlier, perceptual
stages of stimulus processing. Theories need to be elaborated to account for how intentions are implemented at these earlier stages of processing, as well as at the level of response selection. The neuropsychological evidence we present below supports a view in which processes of stimulus selection and response selection are functionally separable and even dependent on different neural systems. Intentional control of behaviour operates through separable stimulus and response-based mechanisms of selection.
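Lavie's (1995) load-based proposal discussed above can be caricatured in a few lines. This sketch is ours, not the authors'; the capacity and load values are arbitrary numbers illustrating the idea that spare perceptual resources determine how far irrelevant stimuli are processed.

```python
# Toy rendering of a perceptual-load account (cf. Lavie, 1995): a fixed
# resource pool is consumed by the primary task, and only the remainder
# 'spills over' to irrelevant stimuli. All numbers are arbitrary.

def spare_resources(task_load, capacity=1.0):
    """Resources left over for irrelevant stimuli after the primary task."""
    return max(0.0, capacity - task_load)

low_load = spare_resources(0.3)   # easy primary task: much spillover
high_load = spare_resources(0.9)  # hard primary task: little spillover

# More distractor processing under low load implies 'late' selection
# there, and 'early' selection under high load.
print(low_load, high_load)
```

On this picture, whether selection looks 'early' or 'late' is not a fixed architectural fact but a consequence of how much of the resource pool the current task consumes.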
3. NEUROPSYCHOLOGICAL DISORDERS OF INTENTIONAL CONTROL OF ACTION

As we have noted, slips of action by normal subjects can provide important insights into the nature of intentional (and unintentional) behaviour. However (and probably fortunately, as far as our survival is concerned!), these slips of action occur infrequently; also their occurrence is opportunistic rather than being under experimental control. For both of these reasons, slips of action do not provide an ideal database for theory development. In contrast, following brain injury, patients can have profound deficits in the intentional control of action. The deficits may even occur with sufficient frequency and systematicity to make them amenable to experimental investigation. One form of impairment in intentional action can be found in patients with what has been termed 'anarchic hand syndrome' (see e.g. Della Sala et al., 1991). In this syndrome, patients make hand movements that are not under their volitional control. As one hand goes to unlock a door, the anarchic hand may move to lock it again-though the patient intends that the door be opened. In one early case, Goldstein (1908) described a patient making spontaneous movements of the left hand that were unwilled and that could not be inhibited, and stated that 'the hand does what it likes'-as if the hand had one form of intention that was distinct from the intention consciously expressed by the patient. Della Sala et al. (1991) argued that a distinction be drawn between anarchic hand actions and behaviourally similar responses labelled as 'alien hand' symptoms (e.g. Brion and Jedynak, 1972). In alien hand behaviours, patients fail to report ownership of the wayward limb and they may not acknowledge when an inappropriate action has been made. Anarchic hand behaviours, however, are acknowledged by the patient as being their own, even though not under the control of intention.
Our own view is that whether a behaviour is acknowledged as being a patient's own may be dependent on several factors, including the perceived mismatch between the inappropriate action and the goal of the patient, as we discuss below. The distinction may not necessarily reflect some qualitative difference between patients. We will use the term anarchic hand simply to describe actions that are not under willed control by the patient.
For the most part, descriptions of anarchic hand behaviours have been anecdotal and have concentrated on providing an anatomical account of the syndrome. Della Sala et al. (1994), for example, discuss these behaviours by one hand in terms of patients having damage within one hemisphere to the neural region involved in the internal control of action (the supplementary motor area, or SMA), along with lesions of the corpus callosum that disrupt communication from the unaffected hemisphere. As a consequence, hand actions made by the damaged hemisphere are driven by environmental factors rather than the patient's intention. In our own work, we have attempted to study the nature of these environmental factors experimentally, to provide a more articulated functional account of how unintended actions arise. We have asked: are the unintended actions based on learned stimulus-response associations? Do different factors determine the actions by each hand? And, can there be partial rather than complete loss of intentional control? We consider the first two questions to begin with. Riddoch et al. (1998) conducted an experimental analysis of one patient, ES, who had cortico-basal degeneration, which could have prevented the SMA within each hemisphere from modulating action. As a consequence, ES showed aspects of anarchic behaviour with both hands in everyday life. For example, ES described how her left hand once struck her aunt at a dinner party, though ES was mortified when this happened! Indeed, in many circumstances, ES sat on her hands to prevent inappropriate actions from occurring. Hence we conclude that she was aware when gross errors of action arose. Our experimental analysis used a very simple task. ES was presented with a cup on a table in front of her, with the cup positioned in line with either the left or right side of her body.
The position of the handle of the cup varied orthogonally with the position of the cup with respect to ES, so that a left-side cup could have its handle on the left or right (across different trials). ES was required to pick up the cup using the hand aligned with the position of the cup with respect to her body, and to ignore the position of the handle. However, despite the simplicity of the task, ES made many errors. These errors typically involved her making cross-body hand movements to pick up the cup, particularly when the handle of the cup was aligned with the opposite (and task-inappropriate) hand. For instance, her right hand might pick up a cup on her left side when the handle of the cup faced right. Interestingly, ES did not seem to be aware that she made errors on these trials-when asked whether she was doing the task correctly she replied, 'I think so!' Was there then a failure to grasp the task rule? We think not. ES was able to repeat back the instructions, she showed generally good comprehension, and she could discriminate her left from her right side. Moreover, as we elaborate in the next paragraph, her performance could be altered systematically by varying the task and the nature and orientation of the stimulus. There were circumstances when the task rule could be followed. In subsequent experiments, Riddoch et al. demonstrated that the frequency of inappropriate hand actions made by ES varied with the task and the stimulus.
Incorrect responses were reduced when she had to point to the position of the handle of the cup on each trial. They were also reduced when we replaced the cup with a cup-like non-object, formed by glueing together two cylinders (a small one on the side of a larger one, to act as the handle),1 and when we turned the real cups upside-down so that they were no longer in a familiar orientation. All of these manipulations affected inappropriate responses by ES's right hand more than those made by her left hand. In contrast, her inappropriate left-hand errors were reduced when there was less spatial uncertainty in the task (e.g. when responses were made to a constant position, but with the hand of response being cued randomly by the word left or right). The fact that ES could make appropriate responses when the task or the orientation of the stimulus changed (to become less familiar) indicates that ES could comprehend the task rules-simply she found it difficult to implement the rules when the stimulus was strongly associated with the inappropriate response (e.g. a cup with its handle on the right being strongly associated with a right-hand grasp response). ES's right-hand responses were modulated by the task, and by both the familiarity of the object (the cup vs. the cup-like non-object) and the familiarity of its orientation (the upright vs. the inverted cup). The effects of object familiarity indicate that responses were activated on the basis of learned stimulus-response associations. 
The effects of object orientation, though, suggest that associations were probably not formed between actions and the semantic representations of the stimuli (that would mediate recognition of the cup as a familiar drinking vessel), since semantic representations are likely to be indifferent to object orientation (so the cup is recognized as the same object across different orientations; see Biederman and Cooper, 1991, for evidence of the lack of an effect of left-right orientation on object recognition). Rather, responses were activated from learned associations between visual representations and actions. These learned associations may be particularly strong for the right hand, given that ES was pre-morbidly right-hand dominant. The effect of the task is also of interest, since it demonstrates that intentional processes could be implemented under some circumstances. For example, ES may have been able to specify a response set from the task instructions (e.g. for either pointing or grasping), but she had difficulty in modulating the excitation of actions within the response set. Due to a lack of top-down, intentional modulation, she was liable to make errors by selecting the overlearned (and highly activated) response rather than the task-based response. ES's left hand was affected less by variations in the strength of stimulus-response associations, and more by manipulations of positional uncertainty.

1 Since ES tended to make more correct responses to the handle of the cup-like non-object than to a real cup, it cannot be argued that her errors with the cup were due to the difficulty of picking up the cup when its handle was incongruent with the hand required by the task rule (e.g. the difficulty of picking up a left-side cup with her left hand when its handle faced right). This difficulty would have occurred with both the real cup and the cup-like non-object.

This
can be accounted for if the right and left hemispheres are influenced by contrasting factors. For instance, the left hemisphere (controlling the right hand) seems affected by learned stimulus-response associations (see above). The right hemisphere (controlling left-hand responses) may be dominant under conditions of spatial uncertainty; hence inappropriate left-hand responses were made frequently unless the spatial uncertainty of the target was reduced. In summary, the evidence from Riddoch et al. indicated that unintentional grasp responses in anarchic hand syndrome could be studied systematically. These responses were determined by learned relations between actions and visual representations of stimuli (for the right hand) and by the spatial uncertainty of the target (for the left hand). Some effects of intention remained apparent, though, since changing the response reduced the frequency of unintentional grasp actions, but such intentional effects were not sufficient to overcome the influence of overlearned stimulus-response associations on behaviour. Riddoch et al. also tested whether the inappropriate responses generated in their experimental set-up were indeed unintentional. In this test, ES was asked to make grasp responses across her mid-line to blocks aligned to the left and right sides of her body: make a right-hand response to a block on the left and a left-hand response to a block on the right-exactly the errors she made frequently in the experiment with cups. Despite ES being physically able to make these reaching responses (shown by the errors with the cups), we were in fact unable to teach her the cross mid-line task rule-she always made grasp responses using the hand on the side where the target object fell. Now she was unable to make cross mid-line responses intentionally! This result is not too surprising.
The cross mid-line responses to blocks had to compete with overlearned tendencies to respond using the hand nearer to the stimulus, and she then had difficulty in preventing this overlearned response from being made. This again demonstrates lack of intentional control of action. The results were also not confined to hand responses. Riddoch et al. (2001) found that the pattern of performance generalized to foot responses (ES in fact had anarchic feet!). We performed the equivalent experiment to the study with cups, but this time ES had to place either her left or right foot into a shoe placed on her left or right side, irrespective of whether the shoe was for her left or right foot. She again had great difficulty in preventing an overlearned response to the stimulus. For example, she often moved her right foot into a right shoe placed on her left side.

4. THE DISTINCTION BETWEEN OBJECT SELECTION AND RESPONSE SELECTION FOR ACTION

The errors made by ES are consistent with her having an impairment in task-based, intentional control of behaviour. In fact, the responses are reminiscent of utilization behaviours made by patients with frontal lobe damage (see above),
though the work on utilization behaviours has been advanced by showing some of the circumstances under which such behaviours arise. In addition, and unlike some patients manifesting utilization behaviour, ES demonstrated some effects of task instructions on performance. This is evidenced by the contrasting errors found when pointing rather than grasping responses were required. It was also clear in other studies that examined ES's ability to select which of two objects to make an action to. Instead of presenting just one object, Riddoch et al. (2000a, b) used two objects and required grasping responses to just one. As before, the task was to respond to this target using the hand aligned with the side where the object appeared. The target was cued by its colour or its shape. To test the effects of response activation, Riddoch et al. sometimes used a target that was less associated with the response than the distractor-for example, the target might be an inverted cup and the distractor an upright cup. Even under these circumstances, ES made very few errors in which a response was made to the distractor. However, she did make many errors in which the wrong response was made to the target. As before, these responses reflected learned stimulus-response associations (e.g. to an inverted cup on the left, with its handle facing right, ES was liable to make a right-hand grasping response). These results reveal that ES was relatively good at selecting the object to which she had to respond, even though she was subsequently impaired at selecting the correct response. Also, relative response activation, from the distractor compared with the target, had little influence on selection of the target stimulus (though it clearly influenced response selection). This suggests a dissociation between the intentional selection of the appropriate stimulus and that of the appropriate response; intentional response selection was more impaired than intentional stimulus selection.
This contrast, between stimulus and response selection, is consistent with a neuroanatomical distinction drawn by Posner and Petersen (1990). They propose that there exist different neural networks supporting stimulus- and response-based selection. Stimulus selection is contingent on a 'posterior' network centred in the parietal lobes, which prioritizes processing of a target object over other, distractor objects present in the environment. Prioritization is based on perceptual properties that characterize the target rather than any distractors. Target features may be primed top-down, by intention (see Chelazzi et al., 1993, for physiological evidence for top-down priming). Response selection, however, is contingent on an 'anterior' network involving the frontal lobes and the anterior cingulate. Here we propose that selection is based on the relative activation of items in a response set specified by intention, with this activation also modulated by intention (biasing selection towards task-appropriate actions). A framework for these ideas is presented in Figure 9.3. Let us apply this framework to account for our behaviour in situations with multiple objects-for example, reaching to pick up a cup of coffee when there are other breakfast objects on a table. According to the framework, there would first need to be selection of a target based on specification of its perceptual attributes.
[Figure 9.3. Illustration of a framework in which there are separable processes for object selection and action selection, each drawing on its own resources; responses in the response set are marked as either primed or not primed by intention.]
Notes: Object selection is modulated by templates that specify attributes of the target. Action selection is modulated by intentional activation of behaviours in a response set. Different levels of stimulus analysis (perceptual, meaning-based, and so forth) are supplied by independent resources.
Following this, responses associated with the selected object are activated, with the reaching response (hopefully!) being selected for action. The other objects present would not necessarily activate their associated responses, so that effects of relative response strength do not have a major impact on stimulus selection.2 Other neuropsychological studies are informative about the kinds of templates that can be established for target objects. We conducted a case study with a patient with unilateral neglect who was impaired at finding a target specified by its name or even by a salient perceptual property, such as its colour ('find the red object') (Humphreys and Riddoch, 2001). However, the patient could find a target if we indicated the action that could be performed with it (e.g. if the examiner gestured an action before target and distractor stimuli were presented). This intriguing result suggests that templates can be relatively abstract, even being specified in terms of an intended action which could be matched by perceptual properties that 'afford' that behaviour. These action-based templates may normally be represented along with templates denoting particular perceptual properties of targets (colour, size, etc.). In the patient we examined, the effect of the brain lesion was to disrupt use of specific perceptual templates for search.

2 Although we argue that stimulus selection precedes response selection, it may be that there is some partial activation of the response set before stimulus selection is completed. Nevertheless, effects of response competition on stimulus selection should be small compared with effects of competition between stimuli having similar stimulus attributes.

5. DISTRACTOR CAPTURE

The overwhelming effects of learned stimulus-response relations on behaviour can be observed not only in a patient such as ES, with cortico-basal degeneration,
Glyn W Humphreys and M. Jane Riddoch
but also in patients with frontal lobe damage and impaired intentional control of response selection. Riddoch et al. (2000a), for example, replicated the 'cups' experiments with a patient, FK, with bilateral damage to medial regions of the frontal lobes. Like ES, patient FK made errors by selecting the overlearned rather than the task-based response when a target cup had its handle oriented towards the usual but task-inappropriate hand (e.g. reaching with his right hand to a cup on his left that had its handle facing to the right). FK was also good at rejecting a distractor and at selecting a target specified by its colour, even when the distractor had a stronger overlearned response than the target (again, with an inverted target cup and an upright distractor cup). This supports our distinction between stimulus and response selection (Fig. 9.3). There was also one circumstance in which FK made errors by responding to the distractor rather than the target. This was when the distractor fell in the line of the trajectory that would be followed by FK's hand as he reached to the target. Under this circumstance, FK made errors by picking up the distractor rather than the target, even when the stimuli had different colours (see also Humphreys and Riddoch, 2000). Does this mean that distractors activated associated responses along with targets? We think not. This is because the hand that FK used to pick up the distractor was determined by the orientation of the target not the distractor. Consider a circumstance in which the target was a right-facing cup and the distractor, falling in the path of FK's right hand, had its handle facing to the left (Fig. 9.4). FK made errors by picking up the left-facing distractor with his right hand. This type of mistake was striking because FK never made errors in which he responded to a target using a hand incongruent with its handle. It seems that these distractor errors occurred because distractor objects in the reach trajectory were
FIGURE 9.4. Illustration of 'distractor capture'
Notes: Task requires reaching to a target cup (in black) where the distractor (in white) falls in the reach trajectory.
Source: After Humphreys and Riddoch (2000).
Fractionating Intentional Control
selected following correct selection of the target (and activation of the associated response, based on the orientation of the target's handle). If the response to the target was already programmed then the distractor, selected concurrently with this, simply took over the programmed action; distractor errors then arose. These results on both target and distractor errors remain consistent with the distinction we have drawn between, on the one hand, an intentional set for the properties of stimuli and, on the other, an intentional set for possible responses. These two means of implementing intentional behaviour are functionally isolatable following brain damage.

6. ON THE RECRUITMENT OF RESOURCES FOR STIMULUS AND RESPONSE SELECTION

We have already discussed the case of patient FK, who had abnormal difficulty in overcoming learned associations between stimuli and actions when required to make relatively novel, task-based responses. This difficulty was apparent not only in the 'cups experiments' but also in a range of other tasks; for example, FK showed large interference effects in the so-called 'reverse-Simon' task in which, ignoring the meaning of a target word (LEFT or RIGHT), he had to press a right or left button according to the word's location on the left or right of a computer screen. In this task his responses were markedly slow when the required response was incompatible with the irrelevant dimension of the stimulus (e.g. making a left button press to the word RIGHT presented on the left of the screen). This is to be expected due to FK's difficulty in intentionally overriding learned response information (the word's meaning) activated by a selected object (the target). Now, earlier we discussed Lavie's (1995) notion that selection can be biased towards early (perceptual) or late (response-based) processes depending upon the difficulty of the task. There is a bias towards early selection under conditions of high task load.
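The reverse-Simon contingencies can be made concrete in a few lines of code (our own illustrative sketch; the function names and trial format are invented, not taken from the original studies):

```python
# Illustrative sketch of the 'reverse-Simon' task (hypothetical code,
# not from Kumada and Humphreys, 2002). The required response follows
# the word's LOCATION; the word's MEANING (LEFT or RIGHT) is the
# irrelevant, overlearned dimension that must be overridden.

def required_response(word: str, location: str) -> str:
    """Respond to where the word appears, ignoring what it says."""
    return location  # press the 'left' or 'right' button

def is_incompatible(word: str, location: str) -> bool:
    """Trials where meaning and location conflict, e.g. the word
    RIGHT shown on the left: these produced FK's marked slowing."""
    return word.lower() != location.lower()

# Example trials: (word shown, screen location)
trials = [("RIGHT", "left"), ("LEFT", "left"), ("RIGHT", "right")]
conflict_trials = [t for t in trials if is_incompatible(*t)]
```

On this sketch, the first trial is the incompatible case described in the text: the correct response is a left button press even though the word reads RIGHT.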
Kumada and Humphreys (2002) examined the effects of task load on FK's ability to ignore (select out) irrelevant information in the reverse-Simon task. They contrasted FK's performance when the target word was presented alone with his performance when the target was presented in a cluttered visual display with irrelevant X's present. The results were counter-intuitive. Rather than worsening FK's performance, the irrelevant X's improved it! FK was better able to ignore the meaning of the target, and to respond selectively to its location, when the X's were present than when they were absent. How can we account for this result? The reverse-Simon task requires that (i) the target is selected, and (ii) responses are assigned according to the target's location and not its associated meaning. We have argued that FK was able to select a target stimulus (part (i)), but he was impaired at applying an intentional bias to favour responses to the target's location rather than its meaning. Despite this, his performance could be biased towards location information by increasing the task load. One way of conceptualizing this is in terms of resources being allocated to
enhance different levels of processing of a stimulus: its perceptual properties, its meaning, and so forth (see Fig. 9.3). Increasing the task load recruits resources available for perceptual processing, and consequently there is stronger activation of responses linked to these perceptual representations under load conditions. Our point in raising this result here, though, is to contrast this successful effect of task load on perceptual processing with FK's poor intentional control of response selection. Effects of task load occur in a patient unable to perform intentional selection processes normally. From this we infer that increasing the task load recruits resources for perceptual processing in a relatively automatic manner, irrespective of whether a subject can bias processing by intention.

7. INTENTIONS AND ERROR MONITORING

When FK made errors in the 'cups' experiments, he, like ES, failed to acknowledge that a mistake had been made; this held even when he picked up a distractor that lay in the reach-path to the target. ES, however, did acknowledge other anarchic hand actions, such as when her hand struck her aunt! She did not have a general problem in detecting errors in action. We speculate that this discrepancy between her report of errors in the 'cups' experiment and in everyday life is due to the size of the disparity between the action made and ES's intentional goals. Patients such as ES and FK are impaired at implementing intentional control of responses. At best the patients are able to determine which actions are part of a response set (e.g. grasping or pointing), but they are poor at modulating activation of the response set. This may reflect generally poor specification of intentional goals for action. These intentional goals may be important not only for modulating the activation of responses, but also for matching against behaviour so that errors are detected. We suggest that error monitoring involves matching behaviour against specified intentions.
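This matching account can be given a toy illustration (our own sketch; the attribute names are invented and nothing here is a model from the experiments): a monitor that compares an executed action against whatever attributes the intentional goal still specifies will flag flagrant transgressions, but will miss fine-grained slips, such as use of the wrong hand, once the detailed specification is degraded.

```python
# Hypothetical sketch of error monitoring as matching behaviour
# against a specified intentional goal. A degraded (coarse) goal
# checks only the action type, so a wrong-hand error passes
# undetected, while a flagrant transgression still mismatches.

def monitor(goal: dict, action: dict) -> bool:
    """Return True if an error is detected: the action fails to
    match the goal on some attribute the goal specifies."""
    return any(action.get(k) != v for k, v in goal.items())

detailed_goal = {"act": "grasp", "object": "cup", "hand": "left"}
degraded_goal = {"act": "grasp"}  # fine detail lost

wrong_hand = {"act": "grasp", "object": "cup", "hand": "right"}
flagrant = {"act": "strike", "object": "aunt", "hand": "right"}

# An intact monitor flags the wrong-hand slip...
assert monitor(detailed_goal, wrong_hand)
# ...but a monitor holding only the degraded goal does not,
assert not monitor(degraded_goal, wrong_hand)
# while even the degraded goal catches the flagrant action.
assert monitor(degraded_goal, flagrant)
```

The asymmetry between the last two checks is the pattern the text attributes to ES: cup-handle errors went unreported, while her hand striking her aunt did not.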
Impoverished representations of intended actions, then, will tend to be associated with impaired error monitoring. For instance, patients such as ES and FK may be able to match their action (pick up a cup) with a general intentional goal, but the precise details of the action (which hand was used) cannot be assessed because the more detailed intentional information is degraded. ES, however, remains able to detect a flagrant anarchic hand movement because this transgresses even a degraded intentional goal.

8. ARE SOME ACTIONS NECESSARILY INTENTIONAL?

We have argued, and presented neuropsychological evidence, that many actions can be effected unintentionally, including, for example, grasp responses to target stimuli. Indeed, under some circumstances, actions can be elicited unintentionally even when a patient cannot generate the same action intentionally (ES's cross-mid-line reaches being a case in point). In presenting a framework for understanding
intentional behaviour, though, we initially discussed the possibility that some actions are necessarily contingent on intention: they cannot be performed unintentionally. Now, when discussing evidence from the 'cups' experiments, we noted that task-inappropriate responses made by ES's right hand reduced when pointing rather than grasping actions were required (for similar data with FK, see Riddoch et al., 2000a). Is it possible that right-hand responses differ qualitatively from other forms of responses, in that they may only be generated intentionally? ES may have generated few cross-mid-line pointing errors because pointing was under intentional control. Pointing responses made with the right hand are based on activation of the left hemisphere and they may serve a unique communicative purpose, linked to language (see Edwards and Humphreys, 1999; Robertson et al., 1995, for some discussion of this). To the extent that language production is under intentional control, so the same may hold for these pointing responses. Of course, these last proposals are highly speculative. Nevertheless, there are neuropsychological data indicating that pointing and grasping dissociate in a number of ways besides the differences we have highlighted here. For example, patients can show unilateral neglect in pointing tasks: when asked to point to the centre of an object they may be biased towards their non-neglected side. However, when asked to grasp the same object, their hand may go to the true centre (e.g. Edwards and Humphreys, 1999; Robertson et al., 1995). In such patients, pointing may be based on a conscious representation of the world, which is spatially distorted. Grasping, in contrast, may be dependent on different representations that are not necessarily available for conscious report, and that remain unimpaired in neglect. These speculations on the differences between pointing and grasping require further exploration of the relations between the two behaviours.
9. SOME CONCLUSIONS

We have discussed findings from neuropsychological patients that indicate that intentional control of behaviour can be effected through dissociable neural systems: a posterior system for intentional selection of objects, and an anterior system for intentional selection of actions. Patients can have poor intentional selection of action but relatively preserved intentional selection of target objects. The data show that unintentional actions can be complex (including reaching and grasping), and that they can be based on learned associations between stimuli and actions. In addition, responses can be biased towards perceptual or higher-level properties of stimuli (e.g. associated word meaning) by varying the task load. However, these effects of task load appear to occur automatically and do not reflect intentional processes. We have also proposed that an error-monitoring operation can be effected, involving the matching of action against a specified intentional goal. If there is poor specification of the intentional goal, error monitoring will be inaccurate.
The general tenet of our argument has been that intentional behaviours are not unitary. Self-agency does not depend on a single, indivisible process; rather there are distinct ways in which different forms of intention are implemented. The implementation of different intentions takes place through separable neural systems.
REFERENCES

BAUER, R. M. (1984), 'Autonomic recognition of names and faces in prosopagnosia: a neuropsychological application of the Guilty Knowledge Test', Neuropsychologia, 22: 457-69.
BIEDERMAN, I., and COOPER, E. E. (1991), 'Object recognition and laterality: null effects', Neuropsychologia, 29: 685-94.
BROADBENT, D. E. (1958), Perception and Communication. London: Pergamon Press.
--(1971), Decision and Stress. Oxford: Oxford University Press.
BRION, S., and JEDYNAK, C. P. (1972), 'Trouble du transfert interhémisphérique à propos de trois observations de tumeurs du corps calleux: le signe de la main étrangère', Revue Neurologique, 126: 257-66.
CHELAZZI, L., MILLER, E. K., DUNCAN, J., and DESIMONE, R. (1993), 'A neural basis for visual search in inferior temporal cortex', Nature, 363: 345-47.
COLTHEART, M., PATTERSON, K. E., and MARSHALL, J. C. (1980), Deep Dyslexia. London: Routledge and Kegan Paul.
DELLA SALA, S., MARCHETTI, C., and SPINNLER, H. (1991), 'Right-sided anarchic (alien) hand: a longitudinal study', Neuropsychologia, 29: 1113-27.
--(1994), 'The anarchic hand: a fronto-mesial sign', in F. Boller and J. Grafman (eds.), Handbook of Neuropsychology, ix. Amsterdam: Elsevier.
DEUTSCH, J. A., and DEUTSCH, D. (1963), 'Attention: some theoretical considerations', Psychological Review, 70: 80-90.
EDWARDS, M. G., and HUMPHREYS, G. W. (1999), 'Pointing and grasping in unilateral visual neglect: effect of on-line visual feedback in grasping', Neuropsychologia, 37: 959-73.
GOLDSTEIN, K. (1908), 'Zur Lehre der motorischen Apraxie', Journal für Psychologie und Neurologie, 11: 169-87.
HALL, D. A., and RIDDOCH, M. J. (1997), 'Word meaning deafness: spelling words that are not understood', Cognitive Neuropsychology, 14: 1131-64.
HUMPHREYS, G. W. (1999), 'Integrative agnosia', in G. W. Humphreys (ed.), Case Studies in the Cognitive Neuropsychology of Vision. London: Psychology Press.
--and RIDDOCH, M. J. (2000), 'One more cup of coffee for the road: object-action assemblies, response blocking and response capture after frontal lobe damage', Experimental Brain Research, 133: 81-93.
--(2001), 'Detection by action: neuropsychological evidence for action-defined templates in visual search', Nature Neuroscience, 4: 84-8.
--TROSCIANKO, T., RIDDOCH, M. J., BOUCART, M., DONNELLY, N., and HARDING, G. (1992), 'Covert processing in different visual recognition systems', in D. Milner and M. Rugg (eds.), The Neuropsychology of Consciousness. London: Academic Press.
KUMADA, T., and HUMPHREYS, G. W. (2002), 'Internal vs. external control of visual selection following frontal lobe damage', Cognitive Neuropsychology, 19: 49-65.
LAVIE, N. (1995), 'Perceptual load as a necessary condition for selective attention', Journal of Experimental Psychology: Human Perception and Performance, 21: 451-68.
LEE, D. N., and ARONSON, E. (1974), 'Visual proprioceptive control of standing in infants', Perception & Psychophysics, 15: 529-32.
LHERMITTE, F. (1983), 'Utilisation behaviour and its relation to lesions of the frontal lobes', Brain, 106: 237-55.
MACK, A., and ROCK, I. (1999), Inattentional Blindness. Cambridge, Mass.: MIT Press.
NORMAN, D., and SHALLICE, T. (1986), 'Attention to action: willed and automatic control of behavior', in R. Davidson, R. Schwartz, and D. Shapiro (eds.), Consciousness and Self-Regulation: Advances in Research and Theory, iv. New York: Plenum Press, 1-18.
POSNER, M. I., and PETERSEN, S. E. (1990), 'The attention system of the human brain', Annual Review of Neuroscience, 13: 25-42.
RAFAL, R. D., MACHADO, L., RO, T., and INGLE, H. (2000), 'Looking forward to looking: saccade preparation and control of the visual grasp reflex', in S. Monsell and J. Driver (eds.), Attention and Performance XVIII. Cambridge, Mass.: MIT Press, 155-74.
REASON, J. T. (1984), 'Lapses of attention in everyday life', in R. Parasuraman and D. R. Davies (eds.), Varieties of Attention. Orlando, Fla.: Academic Press, ch. 14: 515-49.
REES, G., FRITH, C., and LAVIE, N. (1997), 'Modulating irrelevant motion perception by varying attentional load in an unrelated task', Science, 278: 1616-19.
RIDDOCH, M. J., EDWARDS, M. G., HUMPHREYS, G. W., WEST, R., and HEAFIELD, T. (1998), 'Visual affordances direct action: neuropsychological evidence from manual interference', Cognitive Neuropsychology, 15: 645-84.
--HUMPHREYS, G. W., and EDWARDS, M. G. (2000a), 'Visual affordance and object selection', in S. Monsell and J. Driver (eds.), Attention and Performance XVIII. Cambridge, Mass.: MIT Press, 603-26.
--(2000b), 'Neuropsychological evidence distinguishing object selection from action (effector) selection', Cognitive Neuropsychology, 17: 547-62.
--(2001), 'An experimental analysis of anarchic lower limb action', Neuropsychologia, 39: 574-9.
ROBERTSON, I. H., NICO, D., and HOOD, B. M. (1995), 'The intention to act improves unilateral neglect: two demonstrations', NeuroReport, 7: 246-8.
TREISMAN, A. (1960), 'Contextual cues in selective listening', Quarterly Journal of Experimental Psychology, 12: 242-8.
VALLAR, G., SANDRONI, P., RUSCONI, M. L., and BARBIERI, S. (1991), 'Hemianopia, hemianesthesia, and spatial attention', Neurology, 41: 1918-22.
WEISKRANTZ, L. (1986), Blindsight. Oxford: Oxford University Press.
ZEKI, S. (1993), A Vision of the Brain. Oxford: Blackwell Scientific Publications.
10
Dual Control and the Causal Theory of Action: The Case of Non-intentional Action

Josef Perner
1. INTRODUCTION

This chapter is concerned with how the causal theory of action (Davidson, 1963) in its extended version (Searle, 1983) can be combined with a model of dual control (Norman and Shallice, 1986) in order to distinguish intended actions from non-intended actions and those from mere movements or happenings. My specific focus is on accounting for the fact that one variety of intentional action, willed (controlled or voluntary) action, is part of a cluster of empirical phenomena while automatic actions belong to a different cluster. The former cluster comprises conscious awareness and attention, verbal justification, moral responsibility, executive control (tasks on which frontal lobe patients seem specifically impaired) with the self in charge, and so on, while the latter, surrounding automatic action, includes potential lack of conscious awareness, lack of verbal expressibility, reduced responsibility, and a bypassing of control through the self.

1.1 Causal Theory of Action

The classical proposal of a causal theory (Davidson, 1963; Brand, 1984) was that a person's doing something is an action if it is caused by a mental event, the content of which relates appropriately to the performance of the action. Three shortcomings of this theory have characteristically been noted (e.g. Searle, 1983). (1) For many actions there does not seem to be (a consciously aware) intention preceding them. (2) The theory cannot explain the lack of intentional action in cases of causal deviancy. (3) Since the mental state is not part of the action but just a prior causal trigger, the theory cannot account for the phenomenological difference between actions and other bodily movements. To solve these problems Searle (1983) has proposed a distinction between prior intentions and intentions-in-action. However, the critical point is that the intentional component identified by intentions-in-action is integral and synchronous to the execution of an action.
In fact, there is no real need for two different types of intentions. We only need one

I would like to thank Naomi Eilan, Johannes Roessler, Elisabeth Pacherie, and Zoltan Dienes for their most perspicacious comments on earlier drafts of this chapter.
that can vary in its temporal relationship to the bodily movement, as O'Shaughnessy (1991) suggested.1 With this move two of the named problems can be solved. The intentional component as integral part of the action (Searle's intention in action) enables differentiation of automatic actions without apparent causally prior intention (which the classic proposal seemed to assume) from mere movements. It also explains the phenomenology of action. Since there is a mental component inherent in actions but missing in mere movements, actions feel different from mere movements. The third problem of causal deviancy can be illustrated with the young man who intends to kill his uncle. When driving on his way home in foggy conditions the preoccupation with his intention to kill his uncle causes him to fatally hit a pedestrian who happens to be his uncle. Although his intention to kill his uncle was (part of) the cause of the action that led to the uncle's death (and so by the traditional theory would count as an intentional act of killing his uncle), he did not kill his uncle intentionally; that is, it was manslaughter, not murder.2 This example can be accounted for by requiring that the intention stipulates in its content not just the goal to kill the uncle but also that this particular instance of an action is to achieve the goal. The example then falls short of being intentional because the young man did not have the intention to kill his uncle with this particular act of running him over.
2. THE CASE FOR DUAL CONTROL

2.1 Non-intended Actions

A problem that remains even with the refined causal theory, and which Wakefield and Dreyfus (1991) commented on in relation to Searle's version of it, is that it distinguishes intentional actions from mere movement, but it leaves no room for non-intentional action. Under the similar heading of 'unintentional3 action' Searle (1983: 101-3) does discuss the problem of alternative descriptions of one and the same act. Oedipus intentionally married Jocasta and thereby ended up marrying his mother. He did not intend to marry his mother. He married his mother unintentionally.

1 The intentional part can start well ahead of the movement and even be aborted before the movement begins, giving the impression of a pure 'prior intention'. Or the intentional part can commence simultaneously with the movement, giving the impression of a pure 'intention in action'. In many cases, though, the intention begins prior to the movement and then lasts and exerts its influence through to the completion of the action.

2 Since the preoccupation with the intention to kill his uncle was also an integral part of the chaotic action that led to the pedestrian's death, the intention to kill could, without further criteria, not be differentiated from an intention in action. I emphasize this to show that Searle's distinction between prior intention and intentions in action on its own is not sufficient for cleanly solving the causal deviancy problems.

3 On Naomi Eilan's suggestion I adopted the expression 'non-intended' for what I have in mind in order to differentiate it from Searle's case of 'unintended' action.
But that is not the case I have in mind. My concern is with 'action slips' (Reason, 1979). One of the many examples is the following (p. 73): 'I meant to get my car out, but as I passed through the back porch on my way to the garage I stopped to put on my Wellington boots and gardening jacket as if to work in the garden.' Clearly, putting on one's Wellington boots and gardening jacket is more than mere movement. It is an action, but one that was not intended; not intended at all, rather than not intended under a particular description. Non-intended actions of this kind have no place in Searle's analysis. Moreover, if we now contrast this case with the case where the very same action is carried out just as absent-mindedly except that the larger (prior) intention was to go into the garden, then the same action counts as intentional. It seems that the experience of action while acting is the same in both cases. If, according to Searle, the experience of acting is determined by the intention in action, then we have the curious situation of having to say that the non-intended action slip has as much intention in action as when I intend to go into the garden. To drive this point home, imagine yourself driving home fully engaged in a conversation with your passenger. You absent-mindedly switch gears, steer your car and take the left turn, direction home, at the point where, to the right, you'd turn to the supermarket. Despite doing all these actions quite absent-mindedly (and some would say without conscious awareness) we cannot say that they were done non-intentionally, for you will claim that you meant to do everything the way you actually did. By contrast, on another occasion everything is exactly the same except that you are supposed to stop at the supermarket before going home. However, being deeply engrossed in your conversation, you take the usual left turn, direction home, instead of turning right to the supermarket.
I think we can agree that at the time of taking this left turn there was no difference in terms of intention in action, and in terms of phenomenal experience of how it felt (if anything at all) 'to take that left turn', between the earlier occasion when you meant to drive straight home and the occasion when you were supposed to turn right to the supermarket but failed to do so. Only when your spouse asks you for the groceries do you discover your mistake and you will say something like, 'Darn it, I didn't mean to turn left, I wanted to go to the supermarket'. The important point here is that in both cases the behaviour has the same intention in action and, therefore, qualifies as action rather than mere movement, on several grounds. Phenomenally the two actions appear to have the same feel and both seem to be coordinated by the same goal representation as part of motor representations (à la Jeannerod).4 However, there is nothing in Searle's theory that helps distinguish the one case as an intended action and the other as non-intended. This is one reason why a dual control model is needed.

4 It is natural to assume that 'non-intentional' actions are as much governed by some motor representation, including goal representations in the sense of Jeannerod, in order to assure the coherent sequence of such complicated action. That is, the changing driving situation must implant local goals that direct the next sequence of driving actions. It is unlikely that a whole drive home can be done purely by forward modelling based on the goal specification of 'drive home' at the beginning of the journey. An interesting question, though, is whether these local goals would pass the test for goal representation in animal learning: sensitivity to goal devaluation (Dickinson, 1994). I think it would. If you have learned since the last drive home to be afraid of a certain traffic situation then this will make a difference, but at the same time it will jolt you out of being absent-minded, as this newly devalued situation will break the old routine.

3. A MODEL OF DUAL CONTROL

3.1 Objectives

A satisfactory theory needs to account for several terminological and phenomenal distinctions, and do justice to empirical data. Here is a list of desiderata to be addressed in my proposal:

Conceptual and phenomenal distinctions. There are several relevant distinctions used in different research traditions to be integrated in a common framework. For instance, as mentioned, philosophers distinguish mere movements (or events) from actions, which tend to be taken as intentional. But then there is the obvious need to distinguish intentional from non-intentional actions. Animal psychologists (Heyes and Dickinson, 1993) distinguish responses (or habits) from actions. Experimental psychologists contrast automatic with controlled processes (Shiffrin and Schneider, 1977). Neuropsychologists (Shallice, 1988; Frith, 1992; Jahanshahi and Frith, 1998) make a similar contrast between involuntary and willed, controlled or voluntary actions, and social/clinical psychologists (e.g. Kirsch and Lynn, 1999) make a distinction between automatic or non-volitional responses and volitional responses under voluntary5 control.

5 Thanks to Zoltan Dienes who reminded me to include the term 'voluntary'.

Consciousness and verbal report. The higher level of control requires conscious awareness and often enables verbal reportability. We think we are conscious of, and typically able to talk about, our intentional, in particular our willed, voluntary actions in order to justify them, but we are often not able to do this for automatic, involuntary actions, which often go without conscious recognition. An account of intentional action should give an explanation of these clusters of type of action, conscious awareness, and effability.

Tasks requiring executive control. The theory should explain why a series of tasks have been claimed to need higher-level executive control by Norman and Shallice (1986) in their SAS model (supervisory attentional system) and by Baddeley (1996) in his model of the central executive within working memory.

Slack in intention judgements. Why are we often unclear and wrong about whether we are or are not authors of actions and events? Contrary to our deepest
convictions, increasing evidence of a schism between actual control and consciously perceived control leads to an unfortunate tendency to see consciousness as an epiphenomenon: our conscious intentions are but perceptions based on internal sensations of innervations (Jeannerod, 1997: 181 ff.), or, even worse, post hoc concoctions of what we sense ourselves initiating and of what we see ourselves actually doing (e.g. Nielsen, 1963: 'alien hand' experiment).6 Consequently there are theories (e.g. Wegner and Wheatley, 1999) that see conscious will not as an initiator but as an attribution after the fact. These have gained strength on the basis of recent experimental evidence. Action-organizing goals can be subconsciously implanted (Bargh and Chartrand, 1999). There is over-attribution of will (Wegner and Wheatley, 1999) if a thought about an event (e.g. making the cursor stop at a particular object) occurs temporally close to that event (the cursor stops at the object), even when it is objectively clear that some other person has caused the event (a confederate made the cursor stop). There is action projection to external agents of one's own actions, as in the case of 'facilitated communication' (Burgess et al., 1998). That is, it can be demonstrated that people emit unconscious behaviours (guide the handicapped person's attempts at communication) that further some goal (give the correct answer to a question) that one doesn't think one is pursuing (one doesn't want to give the answer, just help the handicapped to express it). One objective of the dual control theory is to help avoid the conclusion that conscious will is an illusion. It does so by giving enough slack between our conscious intentions (will) and what our action-producing system (automatic control) actually does, and yet at the same time preserving the sense that our conscious will sets (is causally responsible for) the frame within which our actions are produced.
A complete theory should also account for the sense of self, and of agency (the self in charge of action), cognitive limitation of conscious/willed action, the attribution of moral responsibility for action, and other concerns, which I will not address.

3.2 My Proposal-In Brief

The basic idea is that there are two levels of control which are distinguished by how they represent action schemata, using a distinction developed by Dienes and Perner (1999; Perner, 1998). At the lower level, action schemata (environmental conditions and the actions to be performed under these conditions) are represented predication implicitly (a more precise definition of the kind of representation typically associated with procedural representation), and the control within and between schemata is limited to control guided by properties of the representational vehicle (vehicle control). At the higher level, action schemata are represented predication and fact explicitly (related to the notion of declarative representation), and control can be guided in terms of the representational content (content control) of action schemata.

6 This is to be distinguished from the neuropsychological 'alien/anarchic hand' syndrome (Della Sala et al., 1994; Parkin and Barry, 1991), where one of the patient's hands behaves in an unruly way in conflict with what the other hand does.
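The vehicle/content contrast might be made concrete with a deliberately simple sketch (our own, with invented data structures; Dienes and Perner offer no implementation): the lower level is an opaque procedure whose condition-action link cannot be addressed by its content, only modulated via properties of its vehicle such as activation strength, while the higher level describes the same schema explicitly, so control can operate on its content.

```python
# Illustrative sketch (hypothetical, not from Dienes and Perner, 1999):
# lower level = predication-implicit procedure; higher level =
# fact-explicit description of the same action schema.

# Lower level: the condition-action link is buried in the procedure;
# the system can only modulate the vehicle (e.g. its activation).
def schema_green_left(card):          # opaque: content not inspectable
    if card == "green":
        return "put in left box"

lower_level = {"schema": schema_green_left, "activation": 0.9}

def vehicle_control(level, boost):
    """Control via vehicle properties only: strengthen or weaken."""
    level["activation"] += boost

# Higher level: the same schema represented as an explicit fact, so
# control can address its CONTENT (condition, action, goal).
higher_level = [
    {"condition": "green card", "action": "put in left box",
     "goal": "sort cards"},
]

def content_control(facts, condition, new_action):
    """Re-programme a schema by its content, e.g. reversing the rule."""
    for fact in facts:
        if fact["condition"] == condition:
            fact["action"] = new_action

# Content control can flexibly rewrite the rule; the lower-level
# procedure itself cannot be edited this way, only boosted or damped.
content_control(higher_level, "green card", "put in right box")
```

The design point of the sketch is the asymmetry: `content_control` needs the explicit description to find the rule it changes, whereas `vehicle_control` never looks inside the procedure at all.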
The Case of Non-intentional Action
The functioning of the higher and lower systems and their interaction are characterized by the following. (1) The higher level implements action schemata at the lower level. (2) The action schemata of the lower level autonomously initiate and control the execution of actions. (3) The execution of action is checked against the representations at the higher level (monitoring). This model owes much to the dual control system proposed by Norman and Shallice (1986), with the important difference that it overcomes one of the critical shortcomings in specifying the higher-order system (SAS: supervisory attentional system). As Baddeley (1996), who sees his notion of a central executive in working memory as a close relative of the SAS, has pointed out, the SAS is theoretically void and tacitly assumes the existence of some homunculus. My proposal tries to fill that gap in terms of a difference in explicitness of representation (predication-implicit vs. fact-explicit) and in type of control (vehicle vs. content control).
The Lower Level

Actions are governed by action schemata. As a concrete example, I take a card sorting test for children (Zelazo et al., 1996). For instance, the child has been told that cards with something red on them have to be filed into the left box, cards with something green on them into the right box. Once this action schema has been established, the child will sort cards correctly if he or she has the goal to sort cards. Importantly, action schemata are not just stimulus-response associations. They also depend on the activation of some goal. In fact, even control of actual hand movements to a target shows this complexity. As Jeannerod (1997) suggested, schemata (motor representations) represent not only the external conditions but also the goal of the action and the bodily movements to achieve that goal. There must be monitoring of the action by computing deviations of the unfolding movement from the path predicted by a forward model of how bodily force translates into movement (Wolpert et al., 1995; as cited by Jeannerod, 1997, fig. 6.3). One characteristic of action schemata is the way in which they represent the environmental conditions, goals, motor responses, and the link between these parts. They represent them predication-implicitly, a notion that Zoltan Dienes and I (Dienes and Perner, 1999) have introduced to capture a particularly important facet of how knowledge can leave various aspects of a known fact implicit. I present here a refinement of the basic idea applied to the simple action schema, 'green card: put it into left box'. This schema can be in two states. It can be tacit (non-activated) or occurrent, when it is activated by a green card being presented. Let me at first concentrate just on its antecedent condition of a green card being presented. In its tacit state the representation 'green card' can be said to explicitly represent the property of being a green card, since there is a clear distinction between the representation 'green card' and, say, 'red card'.
There is no representation of any instances that have the property of being green cards. The fact that there are instances of something being green cards, one might suggest, is represented when the antecedent condition
Josef Perner
gets activated by the presentation of an actual green card. And one might suggest that the fact that 'green card' gets activated, rather than 'red card', represents the fact that being a green card is predicated of the presented instance. However, if one takes that view one has to realize that representing the instance of something being presented, and the predication of the property of being a green card to this instance, remains indistinguishable from just thinking about a green card without one being presented. In other words, the predication of being a green card of a presented instance remains implicit in the causal effects of the presented instance on the mind-brain (representational vehicle). We call it predication-implicit because there is no separate distinction in the representational medium that serves the function of representing the fact that a predication is being made. Similarly, the entire action schema 'green card: put it into left box' can be said to represent the complex event property of a green card being put inside the left box. However, the fact that there are particular instances of which this property is being predicated is left implicit in the causal effects that presentation of a green card has on the activation of the schema, and of the schema on the execution of the desired action. Also, the fact that these actions are desired events is left implicit in the causal history of reasons for why this schema was set up. Another feature of action schemata is the kind of control they use. An established action schema passes control from the antecedent condition to the consequent, that is, the action representation. It does so purely on the grounds of an associative link that has been formed in the representational vehicle.
Although one can say that this link has representational content, that is, it represents the existing action demands on the organism, and that this content is responsible for the existence of the link, this content need not be evoked for routing control once the link has been established in an existing action schema. This kind of control I have called vehicle control (Perner, 1998). And I define the lower level of control as being restricted to predication-implicit representations and to vehicle control. Norman and Shallice (1986) in their dual control model include at the lower level not only control within each schema but also control among schemata (contention scheduling). Schemata are connected by inhibitory links to competing schemata and by excitatory links to supportive schemata. The inhibitory links are designed to make sure that the dominant, most activated schema is not interfered with by other schemata that might get some activation. The important point here is that once these across schema connections are in place, it is the connections among the schemata as representational vehicles that suffice for implementing this contention scheduling. Now let us look at the learning through conditioning of new schemata and what kind of control that requires. A general learning mechanism typically functions in such a way that if a haphazard response leads to success, then the learning algorithm strengthens those associations that were responsible for the successful response, that is, the link that associated the green card with the action of putting it into the left
box. Other links not exercised on this last successful trial, like associating the green card with putting it into the right box, become diminished by the learning algorithm. Such a learning algorithm exerts control of a higher order in that it changes the control structure implemented by the individual action schemata. It directs its control of strengthening or weakening schemata on the basis of whether these schemata had been active or not at the time of the last, successful or unsuccessful, action. Having been activated recently is a feature of the representational vehicle independently of its content. Now, again, one could argue that the learning mechanism chooses on the basis of recent activation because that vehicle feature represents that the action represented by the schema is most likely to be responsible for the successful outcome. Hence it directs its control on the basis of the representational content of having been recently activated. And again, although this is true for evolution that developed learning algorithms, it is irrelevant once the algorithm is in place. Once in place it directs its control on the basis of the vehicle feature of recent activation regardless of what that feature might indicate about the world. In sum, the control exercised by established action schemata, the contention scheduling among competing schemata, and the acquisition of new schemata through feedback learning are based on predication-implicit knowledge and vehicle control.
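To make the contrast with the higher level concrete, the vehicle-controlled lower level might be caricatured in code. This is only an illustrative sketch in Python; the class and function names, the numeric strengths, and the learning rule are my own simplifications, not anything Perner specifies:

```python
# Toy model of lower-level schemata: established condition-action links are
# selected by activation strength (contention scheduling) and adjusted by
# feedback learning that looks only at the vehicle feature of recent
# activation, never at what the schema is about.

class Schema:
    def __init__(self, condition, action, strength=0.5):
        self.condition = condition       # e.g. 'green card'
        self.action = action             # e.g. 'left box'
        self.strength = strength         # associative link (vehicle feature)
        self.recently_active = False     # vehicle feature used by the learner

def act(schemata, stimulus):
    """Contention scheduling: the most strongly activated matching schema wins."""
    candidates = [s for s in schemata if s.condition == stimulus]
    if not candidates:
        return None
    winner = max(candidates, key=lambda s: s.strength)
    for s in schemata:
        s.recently_active = (s is winner)
    return winner.action

def learn(schemata, success, rate=0.1):
    """Strengthen or weaken schemata purely by whether they were recently
    active, a property of the vehicle, regardless of their content."""
    for s in schemata:
        if s.recently_active:
            s.strength += rate if success else -rate

schemata = [Schema('green card', 'left box', 0.8),
            Schema('green card', 'right box', 0.3)]
action = act(schemata, 'green card')     # 'left box' wins on strength alone
learn(schemata, success=True)            # the winning link is strengthened
```

The point of the sketch is that neither `act` nor `learn` ever consults what a schema represents; selection and reinforcement run entirely on vehicle features (link strength, recent activation).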
The Higher Level
The higher level is defined by the necessity to entertain predication- and fact-explicit representations and to exercise content control over the lower level. Paradigm examples that require the higher level of control are the acquisition of a novel schema through verbal instruction, planning, reasoning, or hypothesis testing. For instance, when given verbal instructions, 'Put the green cards into the left box', the child has to represent the meaning of these instructions. This cannot be done in a predication-implicit way that makes the predication of represented properties (e.g. 'green card') dependent on the causal effect of a presented instance, since there is no instance being presented. Hence the representation of these instructions needs to be predication-explicit, making reference to hypothetical presentations of instances of green cards. That also means that not only predication to instances but also the fact that these are not real but only hypothetically considered instances needs to be made explicit. Making the reality status (fictive vs. factual) explicit is what we have called fact-explicit (Dienes and Perner, 1999). The same explicitness is, of course, also required for planning, reasoning, and entertaining hypotheses before one can come to a conclusion about which action sequence is best to employ. Implementing verbal instructions, implementing actions as a result of reasoning and planning, and testing hypotheses all require content control. For instance, when given verbal instructions to put every green card into the left box, one has
to install a new (or sufficiently strengthen an existing weak) action schema by strengthening the association between the input condition 'green card' and the action 'put card into left box'. As in the case of learning by conditioning, there needs to be higher-order control. In the case of conditioning, the higher-order control problem consists in translating the feedback information (+, -) into strengthening certain schemata and weakening others. As discussed above, the appropriate schemata can be found on the basis of vehicle features, for example, being strongly or weakly activated during the production of the last action. In the case of verbal instructions, however, the higher-order control problem consists of translating the verbal instructions into installing or strengthening a particular action schema. The important difference is that, unlike in the case of conditioning, there are no pure vehicle features that would allow the higher-order control process to find the correct schema. The only target specification is the schema with the content specified by the verbal instructions. Control that needs to identify its target in terms of content I call content control. Since content in itself has no causal power, it cannot direct any control. What is required is a provision in the representational vehicle that allows identification of different representational vehicles in terms of sameness of their content. Content control is also required for implementing new schemata on the basis of reasoning, planning, or hypothesis testing, since the fact-explicit representation of hypothetical events needs to be translated into corresponding action schemata, of which the parts are only identified by their representational content.
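The difference between finding a schema by a vehicle feature and finding it by content can be caricatured in the same toy notation (again purely illustrative; the names and the matching rule are hypothetical simplifications of mine):

```python
# Content control: a verbal instruction specifies its target schema only by
# content ('green card' -> 'left box'), so the higher level must locate, or
# install, the schema whose content matches; there is no vehicle feature
# such as recent activation to go by.

class Schema:
    def __init__(self, condition, action, strength=0.0):
        self.condition = condition
        self.action = action
        self.strength = strength

def implement_instruction(schemata, condition, action, boost=1.0):
    """Install a new schema, or strengthen an existing one, identified
    solely by sameness of content with the instruction."""
    for s in schemata:
        if s.condition == condition and s.action == action:
            s.strength += boost          # strengthen the matching schema
            return s
    new = Schema(condition, action, strength=boost)
    schemata.append(new)                 # or install it if none exists
    return new

schemata = []
implement_instruction(schemata, 'green card', 'left box')  # installs schema
implement_instruction(schemata, 'green card', 'left box')  # strengthens it
```

Here the lookup in `implement_instruction` is the 'provision in the representational vehicle' the text calls for: a mechanism that identifies distinct vehicles by sameness of their content.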
Interplay of Levels

The relationship between the two levels needs to be specified in more detail. I have mentioned so far that the higher level exerts control over the lower level by changing connections at the lower level. The actual production of action, however, remains the duty of the lower level. In other words, the higher level represents what should be done and modifies the lower level accordingly. The lower level then executes the wanted action when the occasion arises. Moreover, the higher level can not only represent what ought to be done but can also provide the triggering stimuli for the relevant action schema and monitor what is being done. Out of this interplay of levels several interesting possibilities can be distinguished that help meet the different objectives.
4. MEETING THE OBJECTIVES

4.1 Conceptual and Phenomenal Distinctions

Different types of movement and action can be distinguished depending on whether the actual action produced by the lower level does or does not conform
to the higher level's prescriptions, and whether the higher level does or does not monitor the lower level:

Movement: A bodily movement not due to an action schema, for example, I slip with my elbow off the table.

Response/Habit:7 Movement governed by a schema that is goal oriented but where there is no goal representation involved. For instance, if a rat learns to turn right at the T-maze junction because it gets rewarded with food at the end of that branch, then that behaviour is goal oriented (McFarland, 1989, used the term 'goal seeking') in the sense that the rat does turn right because it leads to a reward. However, the rat could turn right at this junction without any representation of the food waiting at the end (though there are data from goal devaluation studies (Heyes and Dickinson, 1993) suggesting that rats do have goal representations for instrumentally conditioned responses but not for innate ones, e.g. scratching when it itches).

Action: Movement due to action schemata governed by a represented goal.

Non-intentional (automatic) action: The lower system produces an action that is not envisaged by the higher-order system, for example, driving straight home when one wants to go to the supermarket before going home.

Intentional action: The higher system sets up the lower system. When the external conditions arise, the lower system executes the appropriate action.

(a) Absent-minded (automatic) intentional action: The higher system is not involved in triggering the action or in monitoring its execution, for example, driving home as planned but deeply engrossed in conversation, thereby not consciously registering any of the driving actions.

(b) Conscious (automatic) intentional action: The higher system is not involved in triggering the action but it does monitor its execution, for example, driving home and taking quick appropriate action that registers after the fact.
(c) Willed/controlled/voluntary action: The higher system represents for the particular case what ought to be done, triggers the schema, and monitors the execution. This constitutes a case of 'willed or controlled action' (e.g. Jahanshahi and Frith, 1998), for example, I have to lift my arm at some self-determined time; or I say to myself in my mind, 'this is the crossing to the supermarket, I have to turn right', and I do turn right.

7 In animal psychology (e.g. Dickinson, 1994) the term 'habit' is used for a stimulus-response association that is not dependent on a representation of the reinforcing result (goal) of the response, as illustrated by the habit of scratching oneself where it itches. Elisabeth Pacherie (personal communication) pointed out that this usage of the term does not easily generalize to the case of human habits like smoking. There needs to be some goal representation (as part of a motor representation in the sense of Jeannerod) without which no coordinated cigarette scrounging and lighting behaviour would be possible. A possible way to make these usages consistent would be to suggest that smoking is called a habit because it is only the motivational state of nicotine deprivation and the perception of the social situation, without any goal of wanting to smoke a cigarette, that trigger the motor representation (including motor goal) of finding and lighting a cigarette.
4.2 Conscious Awareness: A 1½-Order-Thought Theory

Voluntary control is often practically identified with conscious awareness. So much so, that Jacoby's (1991) process dissociation procedure, a prevalent method of estimating the relative contributions of conscious versus unconscious knowledge to task completion, is based on the assumption that voluntary control requires conscious awareness (e.g. to refrain from using a subliminally perceived word to complete a word stem). In contrast, knowledge whose influence cannot be controlled in this way is assumed to be unconscious. My prime objective here is to explain to what degree this quasi-identification is justified. To address this issue I first lay out some of my basic intuitions about conscious awareness. Ned Block (1994, 1995) distinguished different types or aspects of consciousness, two of which are now widely used: phenomenal consciousness and access consciousness. I first want to address access consciousness, how it (and the related aspects of monitoring and self-consciousness) captures (my) natural intuitions about conscious awareness, and how it relates to higher-level control. Then I will address the issue of how phenomenal consciousness might be related to access consciousness and higher-level control.
Access Consciousness

Access consciousness captures the intuition that conscious contents are promiscuously accessible to a variety of inferential processes, in particular to the rational control of action and speech (Block, 1994: 214). One should add accessibility to higher-order thoughts (often distinguished as monitoring consciousness) and self-ascription of these thoughts (self-consciousness). To me, only some of these aspects seem critical, namely some form of accessibility to higher-order thoughts. Inferential promiscuity and access to rational action and speech are not defining of conscious awareness. They are features whose typical linkage with conscious awareness is in need of explanation. Accessibility to higher-order thought, one might think, necessitates self-ascription. However, as we are exclusively concerned with how subjects see the world from their own perspective, there is no need to differentiate one's own thoughts from those of others. There is no need for a concept of self in order to ascribe the experienced thoughts to oneself; that is, the self as the focal point of conscious experiences can remain representationally tacit (Eilan, 1995: 63). Here are some of my intuitions about why access to higher-order thoughts, a position currently defended by Rosenthal (1986, 2000) and Carruthers (1996, 2000), is critical for consciousness. To me it seems incoherent to claim that I am consciously aware of, say, the monitor in front of me without being able to specify the mental attitude with which I behold the fact of the monitor being in front of me, that is, whether I am seeing it, or just thinking about it, and so on. This intuition is also reflected in most experimental investigations of the distinction between
implicit/unconscious as opposed to explicit/conscious perception, memory, or learning in healthy people and in patients with neurological deficits such as blindsight. Unconscious knowledge is typically inferred by demonstrating existing knowledge about some state of affairs (e.g. the location of a spot of light) to the person's (blindsight patient's) surprise. On this intuition, a higher-order thought that one knows where the spot of light is seems to distinguish conscious knowledge from the blindsight patient's knowledge. It is often objected that it remains a puzzle how simply making something second order could produce consciousness. This objection, however, ignores two critical points. One point is that not every first-order representation may be accessible to higher-order thought formation as, perhaps, Carruthers's (1996) position suggests. We (Dienes and Perner, 1999), for instance, suggest that higher-order thoughts can only be formed of knowledge with predication- and fact-explicit content. I will give a brief justification of this claim in my examples below. The second, related point is that going from having a first-order state to forming a higher-order thought (HOT) about a first-order state requires representing the first-order state. This is far from being a trivial step, whose magnitude has, perhaps, not been sufficiently emphasized by higher-order thought theorists. Concentrating on this point brings into focus the question of how precisely the HOT has to represent the first-order attitude to create conscious awareness. In an attempt to answer critics who suggest that consciousness is not dependent on HOTs but that consciousness makes them possible, let me develop a minimalist higher-order account that satisfies the intuition that conscious knowledge implies knowing that one knows.
The intuition that conscious awareness implies that we know what we know minimally entails (even for an organism lacking a concept of knowledge) that one should not be surprised by (at least not gain any new knowledge from) renewed information about a known fact. That is lacking in blindsight. When a blindsight patient, having 'seen' the spot in his blind field, turns his head so that he now sees the spot in his healthy field of vision, he will subjectively gain 'new' information.8 In contrast, a person with unimpaired vision will gain no new information. What makes normal sight different from blindsight is not (so much) the ability to attribute knowledge to oneself; it is the ability to identify the fact in question (the location of the spot) across two different instances of informational events, so that these two information events can be related to the fact in question and can be appropriately integrated in a coherent body of knowledge. The integration is only possible if the fact in question is represented predication-explicitly. Only then can the information coming from the two visual fields be taken as pertaining to a single fact.9

8 With this example I want to avoid the task demands that arise in the typical case where the blindsight patient is told that he had pointed correctly to the spot of light in his blind field. To be surprised, the patient needs the higher-order understanding that his correct behaviour must be mediated by 'knowledge' about the spot.

9 Campbell (1997a) developed a similar position about the nature of selective attention, a close associate of conscious awareness.
On this minimal intuition about consciously known facts, predication-explicitness is a prerequisite for conscious awareness. Moreover, if we plausibly require that conscious awareness should keep factual states of affairs distinct from fiction, then conscious awareness also requires fact-explicitness as a necessary condition. Are predication- and fact-explicitness sufficient for conscious awareness? Higher-order thought theorists will argue that this falls short of accounting for our intuitions. However, we may be closer to higher-order thoughts than it seems. This hinges on the question of what constitutes our concept of knowledge. As Gordon (1995) pointed out, when we stick exclusively to our own present perspective then there is an isomorphism between what we know and what is a fact for us. It is only when we want to understand what others know or don't know in contrast to ourselves (or what we didn't know but know now) that we have to expand by simulation in terms of perspective switching (as Gordon would argue; see also Perner, 1999). Our concerns about consciousness, however, only pertain to the subject's own perspective. Hence, fact-explicit representation minimally constitutes some meta-awareness of what one knows, which satisfies the basic intuition behind the higher-order thought theory that being consciously aware of a state of affairs entails knowing (assertively thinking) that one knows. So, to claim that predication- and fact-explicit representation is sufficient for consciousness, admittedly, falls short of being a typical HOT theory, since there is no first-order mental state represented by a HOT independent of the explicit representation of factuality. Looked at in this way, we are dealing with a first-order theory of consciousness. Yet, the reason why fact-explicitness accounts for consciousness is that it provides the distinction between what I know and what I don't know, although this distinction remains limited to one's own present perspective.
So, knowing/assertively thinking what is a fact and what isn't amounts to (second-order) knowing/assertively thinking what one knows and what one doesn't. In this sense, it is a higher-order thought theory. Caught between the camps, let it be known as the 1½-order-thought theory of consciousness. More needs to be said about the limitations of this kind of consciousness. Fact-explicitness only provides meta-knowledge within one's present perspective. It works for the here and now. It might work for the past (as seen from the present), depending on whether a proper understanding of the past isn't itself dependent on an understanding of perspective and its causal relations to the remembering mind (Campbell, 1997b; McCormack and Hoerl, 1999; Perner and Ruffman, 1995). What it doesn't allow is to distinguish what one knew then from what one knows now about then. One's rudimentary HOTs are tied to present knowledge about the present or past. To know about past knowledge (or thoughts) one needs to separate one's present perspective from one's past perspective, and for this one needs a notion of knowledge independent of factuality. Perspective problems of this kind occur not only with the past but also with fiction, where a possible world is created in distinction to the real world. One
could say that 1½-order consciousness is world specific, as in the case of dreaming. On the 1½-order model, when I operate in the real world then I am consciously aware of what I represent fact-explicitly about the real world. When I am dreaming I am completely immersed in my dream world. I can be consciously aware of the fact-explicitly represented dream facts, but I would not be able to represent that I am only dreaming10 and not be able to remember my dreams when back in the real world of waking life. To remember dreams, a full second-order HOT of knowing that one dreamt these scenes is required. This, it is said, is only possible for a few of the many dreams we have (those close to being woken up). But those we do remember, we remember as 'conscious experiences' of the dream world, yet we are reluctant to say that we are really conscious while dreaming. This ambivalence can be accounted for by the 1½-order theory: we are conscious during dreaming in the basic sense of fact-explicitness, but not 'fully' conscious as an integral part of our waking consciousness, because the rudimentary dream consciousness is tied to the dream world and cannot be integrated with our perspective from waking life. It can only be integrated if an explicit mental state concept of, for example, dreaming becomes available, so that the experienced dream facts can be put in place when seen from the waking perspective outside the dream. In sum, on the 1½-order-thought theory, conscious awareness becomes possible by representing not just features of the world but by representing that there is a world that has these features. Predication- and fact-explicitness allows representation of what features the world has, and with that one has an incipient meta-knowledge of what one knows, as the higher-order thought theories of consciousness require.
It implies that predication- and fact-explicit representation is not only necessary but also sufficient for, at least, this rudimentary form of access consciousness. Since predication- and fact-explicitness is, as I have argued, necessary for higher-order control, we have gained a justification for the intuition that voluntary control implies consciousness as, for instance, Jacoby (1991) has formalized in his procedure to separate unconscious from conscious influences. The question remains what role phenomenal consciousness plays in this picture.

Phenomenal Consciousness
Phenomenal consciousness (Block, 1994) means the 'subjective feel' of our conscious experiences, or 'What it is like to be a bat', to follow Nagel's (1974) famous title. A first question for us concerns the relation between subjective feel and access consciousness. Block has argued that they are independent. However, I find it difficult to intuit how one can have feel without access. Carruthers (2000) and Rosenthal (2000b) have argued that access (in particular to higher-order thoughts) is necessary for phenomenal consciousness and even sufficient.

10 Potential exceptions to this view are reports of lucid dreaming (LaBerge, 1985) where one is conscious of dreaming during one's dream.
Carruthers, for instance, starts by pointing out that the specification of phenomenal consciousness as 'what it is like to be us' is in need of clarification by distinguishing 'what the world is like for us' (worldly subjectivity) from 'what our experiences of the world are like for us' (experiential subjectivity). First-order theories (Dretske, 1995; Tye, 1995) can account for worldly subjectivity due to the fact that mental representation presents the world under a certain mode of presentation. However, this only explains what the world is like for the organism (that the world takes on a subjective aspect by being presented). It does not account for what we really need, namely experiential subjectivity: what an experience is like (that the organism's experience takes on a subjective aspect). This only follows from a higher-order representation of the experience of the world, because the higher-order representation presents the experience under a certain mode of presentation and thereby confers a subjective aspect upon the experience. Here again, we can ask the question of how the HOT has to represent the first-order state, how that state has to be conceived, so that it provides experiential subjectivity. Rosenthal and Carruthers provide diametrically opposed answers. Rosenthal (2000a: 207) says that it need not be conceived as mental, just as a state. This strikes me as too loose a requirement, because if the HOT represents the mental state not as mental but just as a state, then why is the HOT higher order and not just a first-order mental state? In contrast, Carruthers (2000: 195) requires that the first-order state be minimally conceived of as giving a perspective, an appearance of what it is about. This is an extremely strong requirement with extremely counter-intuitive developmental consequences that Dretske (1995) has noted before.
Existing developmental data suggest that children do not acquire the notion of appearance much before 4 years (Flavell et al., 1983), and similarly for visual perspective (Masangkay et al., 1974; Flavell et al., 1981) and false belief as a mistaken perspective (Wimmer and Perner, 1983). Hence, according to Carruthers's theory, children younger than 4 years would not be phenomenally conscious in our adult sense. The counter-intuitive nature of this consequence is underlined when we consider that a host of abilities that are usually closely associated with conscious awareness develop much earlier, at around 9 to 18 months. At this age children start to communicate verbally (Fenson et al., 1994), follow instructions (Shatz, 1978; Babelot and Marcos, 1999), plan novel action sequences (Piaget, 1937; Haake and Somerville, 1985; Willatts, 1997), and show persistency in focal attention (Gardner et al., 2000). They also engage in delayed imitation (Meltzoff, 1988; Bauer, 1996), which amnesic patients lacking explicit/conscious memory do not do (McDonough et al., 1995). This is also the age at which children show various signs of fact-explicit representation (for review, see Perner, 1991), in particular pretend play (typically also requiring conscious awareness). So, if the 1½-order theory, where conscious awareness comes with predication- and fact-explicitness, can also serve as a basis for phenomenal consciousness, then we arrive at a much more coherent
The Case of Non-intentional Action
developmental picture and a much simpler general theory (see Perner and Dienes, 2003). To see whether this is possible, let me trace the levels of subjectivity again and see how they relate to levels of explicitness. Worldly subjectivity is shown even by predication-implicit representations. If an object's colour registers as 'red', the world is different for the subject than if this colour does not register or registers as 'rose'. However, there is no possibility of an internal appreciation that that is what the world is like. This appreciation only becomes possible, according to Carruthers, with a second-order representation of how red seems to me. The subject can think of an experience of green that is distinct from a concurrent experience of red (Carruthers, 2000: 195). However, a very similar contrastive appreciation also becomes possible with predication- and fact-explicit representation, i.e., that the world is red rather than green. The difference from Carruthers's proposal is the following. His proposal requires the ability to freely contrast an experience of red with an experience of green pertaining to a single state of affairs, whereas fact-explicitness leaves each experience tied to a particular state of affairs. However, that still leaves the ability to contrast a concurrent experience of red (pertaining to the real world) with a hypothetical experience of green (by considering a counterfactual world). Predication- and fact-explicitness enables an understanding that the world can be this way or that way and thereby creates different experiences, providing a practical understanding of different experiences of the world. So we see that the 1½-order-thought theory suggests that with predication- and fact-explicitness one can enjoy not only a minimalist kind of (albeit perspective-bound) access to higher-order thoughts but also a minimalist kind of phenomenal consciousness in terms of a practical appreciation of experiential subjectivity.
Summary

By arguing for a 1½-order theory of consciousness we get less demanding requirements for conscious awareness than from the full-blown higher (second) order theories (Carruthers, 2000; Rosenthal, 1986). Nevertheless, we still capture the intuitions behind these theories, in particular the notion that conscious knowledge of a fact implies some knowledge that one knows that fact. Following on from Carruthers's argument about experiential subjectivity, the theory can also account for a rudimentary, practical (but not conceptual) appreciation of the subjectivity of our experiences. It also enables the more coherent developmental picture that infants become capable of conscious awareness at around 9 to 18 months (Perner and Dienes, 2003), when they show signs of predication- and fact-explicit representation (e.g. pretend play) and correspondingly start to engage in activities that are typically not possible without conscious awareness in adults (e.g. verbal communication).
Josef Perner

4.3 Verbal Report
There is the entrenched intuition that verbal communication requires conscious awareness. Dennett (1978) even made verbal reportability the hallmark of consciousness. Why should there be this intricate link? If I can drive a car through dense traffic absent-mindedly while talking to my passenger, why can't I talk absent-mindedly to my passenger while concentrating on the driving? Perhaps sometimes this can happen, but then the conversation tends to consist of empty phrases, similar to my experience of singing bedtime songs to my children. I kept singing while thinking about more important matters. When interrupted I had no idea where I was in the song and had to start again from the beginning. My impression is that this mindless singing is possible only because the song is known by heart and the control is purely at the level of the representational vehicle: one word follows another regardless of its content. In contrast, in 'intelligent' conversation one needs to control one's flow of words according to the meaning that one wants to convey and cannot rely on an overlearned standard sequence.11 These observations suggest that for 'intelligent' conversation I need content control. For content control I need predication- and fact-explicitness, that is, conscious awareness on the 1½-order-thought theory of consciousness.

4.4 Executive Control

Dual control systems are suggested by neuropsychological research. Starting with Bianchi (1922) and Luria (1966), it was found that patients with frontal lobe insult experience a loss of control over their actions (Milner, 1964) and have difficulty with certain tasks. Norman and Shallice (1986) interpreted these difficulties as resulting from impairment of the higher level of control within their dual control model, in which automatic control by contention scheduling is contrasted with willed control by the supervisory attentional system (SAS).
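The division of labour in the dual control model can be sketched computationally. The following toy model is my own illustration, not Norman and Shallice's implementation: the schema set, activation values, and bias mechanism are assumptions chosen to make the contrast vivid. Contention scheduling lets the most activated schema run; the SAS does not act itself but biases activations so that a weaker, task-relevant schema can win.

```python
# Toy sketch of the Norman-Shallice dual control model (illustrative only;
# schemas, activation values, and the bias mechanism are my assumptions).

def contention_scheduling(schemas):
    """Automatic control: the most activated schema wins and runs."""
    return max(schemas, key=lambda s: s["activation"])

def supervisory_attention(schemas, intended_action, bias=1.0):
    """Willed control: the SAS biases schema activations so a weaker,
    task-relevant schema can win the competition."""
    for s in schemas:
        if s["action"] == intended_action:
            s["activation"] += bias
    return schemas

# A Stroop-like situation: reading the word is habitual (high activation),
# naming the ink colour is the instructed but weaker schema.
schemas = [
    {"action": "read word", "activation": 0.9},
    {"action": "name ink colour", "activation": 0.4},
]

habitual = contention_scheduling(schemas)["action"]   # the habit wins
supervisory_attention(schemas, "name ink colour", bias=1.0)
willed = contention_scheduling(schemas)["action"]     # the instructed task wins
print(habitual, "->", willed)
```

On this sketch, frontal impairment corresponds to losing the biasing function: behaviour then defaults to whatever contention scheduling selects, which is exactly the profile of difficulties listed below.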
I therefore refer to the kinds of tasks that Norman and Shallice list as requiring the higher level as 'SAS tasks'. In fact, it is this list of impairments that gives empirical meat to their claim about the existence of the SAS.

SAS 1. planning/decision-making
SAS 2. troubleshooting
SAS 3. novel/ill-learned action sequences
SAS 4. dangerous or technically difficult actions
11 Perhaps I should not be so sanguine about my intuitions. It could be that experienced sports reporters are able to work absent-mindedly even though they do not rehearse a set story but make their words contingent on what they see. This could be possible without content control if sports reporters, due to their extensive practice, have developed pathways where the visually observed automatically activates verbal descriptions of the same content. This would be similar to reading, where the visual input of a word automatically activates the phonetic realization of that word, as demonstrated by the Stroop effect.
SAS 5. overcoming strong habitual response tendencies or temptation (e.g. Stroop)

A similar approach has been taken in the working memory tradition (Baddeley and Hitch, 1974), where a central executive (CE) is identified as the coordinating agent within working memory and is considered a close relative of the SAS (Baddeley, 1996). The nature of the CE, too, has been characterized by a list of tasks, to which I refer as the 'CE tasks' (Baddeley, 1996; Baddeley and Della Sala, 1998):

CE 1. coordination of two independent tasks (dual-task performance)
CE 2. generation of random numbers
CE 3. attention switching and focusing
CE 4. retrieval and manipulation of information from long-term memory
I have specified the two levels of control independently on theoretical grounds. In particular, the higher level was specified as that level which requires fact-explicit representation and content control. I now need to show that fact-explicit representation and content control are required for SAS and CE tasks, in order to explain why these tasks become problematic when the higher-level control system is impaired. To this end SAS and CE tasks can be taken together and regrouped into three categories for easier discussion of relevant features.

1. Some tasks require that new action schemata be established, or that established tasks be given additional support because they are not well enough established (SAS 3: novel or ill-learned) or because the established schema is not strong and precise enough for the purpose at hand (SAS 4: dangerous or technically difficult). Vehicle control succeeds only once the relevant connections have been firmly established for each task and the appropriate relationship among tasks has been established. Hence, in all cases where the established relationship among tasks is not adequate, additional content control is needed. This also applies to the first CE problem, coordination of independent tasks. Although each component task is established and could run by itself on vehicle control, the task of coordinating the two established tasks is novel (or else it would be a single task) and requires content control.

2. When established tasks do not function correctly (SAS 2: troubleshooting) or interfere with what one tries to do (SAS 5: overcoming habits), then one needs additional, content-directed control. The same is required for CE 2 and CE 3, random number generation and attention switching. Both these cases require content control in order to stay away from the established:
randomness requires avoidance of following any regularities, and switching attention consists in overcoming the forces that would naturally maintain attention on the established task.

3. Planning and decision-making (SAS 1) consist of entertaining and generating predication- and fact-explicit representations of possibilities and then picking one possibility. Moreover, the generation of possibilities is supposed to be content-governed, that is, projection of unusual future states that can be reached from the present state by possible actions. It needs to be more creative and go beyond generating the usual train of thought (which can be vehicle controlled).

As we see, the tasks that are deemed to require executive control are tasks that need content control and, consequently, according to my argument, they need fact- and predication-explicit representations, which are accessible to conscious awareness and verbal report.

4.5 Slack in Intention Judgements

Why are we often unclear and wrong about whether we are or are not the authors of actions and events? Searle's intention-in-action theory leaves little room for such errors. In his theory, every intentional action has its intention as an integral part, and so it leaves little room for the existence of non-intended actions or for misattribution of agency to ourselves when we were not the authors of an act (e.g. Wegner and Wheatley, 1999). The dual control model gives more room for non-intended actions and more room for misattribution, since the lower system is given leave to operate on its own within the specifications stipulated by the higher system. The attribution of agency, whether it was I who intentionally instigated and executed this action, must be based on a judgement of whether the action and its results are in line with the higher level's stipulations. The relevant information may be a mix of direct innervation sensations (as reviewed by Jeannerod, 1997) and proprioceptive and visual sensory feedback about the action. So, for a willed/controlled action, where the higher system gives the impetus for action and closely supervises the action execution, misattribution errors are rare. And indeed, such errors can be experimentally demonstrated only with the help of most unusual manipulations of visual feedback, as in Nielsen's (1963) 'alien hand' experiment, or by drawing attention away from the action execution by various tricks, as in Wegner and Wheatley's (1999) Ouija board experiment.
5. NOTES ON SEARLE

5.1 Causal Self-Referentiality

Searle (1983) has made the widely known claim that the content of intentions (of prior intentions as well as intentions-in-action) needs to specify self-referentially that the intention be the cause of the action. This claim, also made for perception, has come under strong criticism (Armstrong, 1991; Burge, 1991) as over-intellectualizing basic functions. The aim here is to see to what degree Searle's claim can be partly substantiated without excessive over-intellectualization.
Searle resists the suggestion that causal reference need only exist in the objective satisfaction conditions: the intention of an action must specify the action which it causes, or else it won't be an intentional action. He insists that causal self-referentiality is internal to content. Pacherie (2000) took up the challenge of asking about the psychological implementation of causal self-referentiality within Jeannerod's theory of motor representations, but her answer to the question of how causal self-referentiality is internal to content remained rather hedged. My task differs from Pacherie's because I do not look for causal self-reference within the (intentions-in-action of the) lower action-producing system but in the interplay of the higher-level system with the lower system, as briefly outlined in Perner (1998). The higher level controls the lower level by modifying it or supporting particular schemata. The higher level thus implements (or helps implement) concrete action tendencies. The mechanism (an implementation schema)12 that implements the prescriptions given by the higher level in the lower level as action schemata procedurally represents the causal responsibility of producing the desired action, just as an ordinary action schema procedurally represents stimulus-action-outcome sequences.

Let me work out these issues in more detail by focusing on the analogy with the perception case. When looking at a cat causes the tokening of 'cat' (for instance, by a CAT node being activated), then that token represents predication-implicitly the fact that the individual which caused the tokening is a cat. It does so because it represents the fact that the individual in front of me is a cat, not just cat-ness. It does so, according to consumer semantics (Millikan, 1984; Dretske, 1988), because the token 'cat' has the function of covarying with the presence of a cat and of directing behaviour in relation to the presence of a cat. It does not represent the fact that the 'cat' token was caused by the presence of a cat (although it covaries with being a cause, it does not direct behaviour in relation to being a cause).

In the case of an S-A-O action schema, stimulus S elicits the action A in order to reach the outcome (goal) O. So when S occurs and the schema gets activated, we can say that the 's' token explicitly represents the type of situation S and predication-implicitly represents the presence of a concrete occurrence of S (just as in the case of the cat). Corresponding tokens 'a' and 'o' and the 's-a-o' connection explicitly represent the action types, outcome (goal) types, and the transformation of S into O by applying A. Predication-implicitly they represent the particular action A produced and the particular change thereby effected of transforming S into O. But, again, there is no implicit representation of the fact that the token 's-a-o' causally leads to the emission of action A.

Now consider the implementation schema. It is like an action schema except that it takes as its stimulus condition a declarative representation at the higher level of control (e.g. when S occurs then A needs to be taken in order to achieve O) and takes as its action the installation of a functional procedure (S-A-O) with the goal that this procedure then produces the action A at the appropriate time. Now this implementation schema has as part of its content that the implemented action schema be causally responsible for actions. If not causally self-referential, it is causally referential.

12 An implementation schema could be considered a procedural (predication-implicit) counterpart to the implementation actions and implementation intentions studied by Gollwitzer (1999).
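The distinction between an ordinary S-A-O schema and an implementation schema can be made concrete in code. The sketch below is my own illustration, not Perner's formalism: the names, the representation of schemata as functions, and the example stimulus are assumptions. An S-A-O schema procedurally maps a stimulus to an action aimed at an outcome; an implementation schema takes a higher-level declarative prescription and installs such a procedure in the lower-level repertoire.

```python
# Illustrative sketch (my assumptions, not the author's formalism): action
# schemata as procedures, and an implementation schema that installs them.

def make_action_schema(stimulus, action, outcome):
    """An S-A-O schema: when S occurs, emit A in order to achieve O."""
    def schema(current_stimulus):
        if current_stimulus == stimulus:
            return action  # the schema itself, not a thought about it, causes A
        return None
    schema.outcome = outcome
    return schema

def implementation_schema(prescription, repertoire):
    """Takes a declarative higher-level representation ('when S, do A to
    achieve O') and installs the corresponding S-A-O procedure, thereby
    procedurally representing responsibility for producing the action."""
    repertoire.append(make_action_schema(*prescription))

repertoire = []
implementation_schema(("traffic light red", "brake", "car stops"), repertoire)

# Later, the lower level runs on its own: the stimulus elicits the action
# without any further involvement of the higher level.
responses = [s("traffic light red") for s in repertoire]
print(responses)
```

The point the sketch makes is the one in the text: the installed procedure covaries with and produces the action, but nowhere does it represent, as part of its own content, the fact that it is the cause; only the installing schema has causal responsibility as part of what it is for.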
5.2 Intentions in Action and Causal Deviancy

I want briefly to elaborate the example of causal deviancy given by Searle to show that it provides a difficulty for Searle's theory of intentions-in-action but can be accommodated by the dual control model. The example was of the young man who wants to kill his uncle and does so unintentionally, even though it was the preoccupation with his desire to kill that was causally responsible for his running over the pedestrian who happened to be his uncle. The intention-in-action theory can account for the lack of intentionality because there is no intention-in-action that has as its content that the particular driving action is to achieve the desired goal. The dual control model can account for the lack of intentionality because the driving actions were not governed by an action schema that was implemented by the young man's higher-level desire to kill his uncle. Now, to distinguish between the theories, I elaborate the example. Assume our young, uncle-hating man is very methodical. He knows how difficult it is to cold-bloodedly kill a relative. So he enrols in a course for professional killers. Following course wisdom, he then practises for months, on foggy nights in front of his uncle's favourite pub, how to run over his uncle with his car. After reaching perfection, he stalks his uncle to the pub with the intention of carrying out his well-practised routine on the way home. While in the pub he has a complete change of heart, sees the error of his ways, and, in my terminology, disavows his well-practised killing schema. Then on the way home he spots his uncle and, despite its having been disavowed, the by now automatized killing routine takes over, and the uncle is dead before the young man fully realizes what he has done. This was not an intentional killing.
This lack of intentionality is difficult to explain on the basis of intentions-in-action, since the very same act would count as intentional if the young man had not had a change of heart in the pub. It is not intentional on the dual control model, since the uncontrolled action schema has been disavowed by the higher-level control system. Well-practised action often precedes our thoughts, as nicely documented by Wakefield and Dreyfus (1991: 265), quoting the Boston Celtics basketball player Larry Bird describing the experience of passing the ball in the heat of a game: 'A lot of times, I've passed the basketball and not realized I've passed it until a moment or so later.' The difference from our unfortunate, hypothetical young
man is that Larry Bird's passing throws conform to his declarative intentions and, therefore, they count as intentional.
6. SUMMARY
My main concern was to account for the existence of non-intended actions within a causal theory of action. A dual control model can satisfy this concern. Intentional action is defined by the match between what the lower level produces and what the higher level stipulates should be done. The higher level of control I defined as requiring predication- and fact-explicit representations and as exercising content control (changes in the flow of control are aimed at vehicles with a particular representational content), and the lower level as control by representational vehicle. The need for predication- and fact-explicit representations provides an account of why intentional action is seen as conscious and verbally reportable. To get a tighter fit between consciousness, verbal report, and content control, I argued for a minimalist 1½-order-thought theory of consciousness, which makes the perspective-bound meta-knowledge provided by predication- and fact-explicit representations (instead of a full-blown second-order thought) sufficient for consciousness.
REFERENCES

ARMSTRONG, D. M. (1991), 'Intentionality, perception, and causality: reflections on John Searle's Intentionality', in E. Lepore and R. Van Gulick (eds.), John Searle and his Critics. Oxford: Blackwell, 149-58.
BABELOT, G., and MARCOS, H. (1999), 'Comprehension of directives in young children: influences of social situation and linguistic form', First Language, 19: 165-86.
BADDELEY, A. (1996), 'Exploring the central executive', The Quarterly Journal of Experimental Psychology, 49A: 5-28.
--and DELLA SALA, S. (1998), 'Working memory and executive control', in A. C. Roberts, T. W. Robbins, and L. Weiskrantz (eds.), The Prefrontal Cortex: Executive and Cognitive Functions. Oxford: Oxford University Press, 9-21.
--and HITCH, G. J. (1974), 'Working memory', in G. H. Bower (ed.), The Psychology of Learning and Motivation. New York: Academic Press, 47-89.
BARGH, J. A., and CHARTRAND, T. L. (1999), 'The unbearable automaticity of being', American Psychologist, 54: 462-79.
BAUER, P. J. (1996), 'What do infants recall of their lives?', American Psychologist, 51: 29-41.
BIANCHI, L. (1922), The Mechanism of the Brain and the Function of the Frontal Lobes. Edinburgh: Livingstone.
BLOCK, N. (1994), 'Consciousness', in S. Guttenplan (ed.), A Companion to the Philosophy of Mind. Oxford: Blackwell, 210-19.
BLOCK, N. (1995), 'On a confusion about a function of consciousness', Behavioral and Brain Sciences, 18 (2): 227-87.
BRAND, M. (1984), Intending and Acting. Cambridge, Mass.: MIT Press. A Bradford book.
BURGE, T. (1991), 'Vision and intentional content', in E. Lepore and R. Van Gulick (eds.), John Searle and his Critics. Oxford: Blackwell, 195-213.
BURGESS, C. A., KIRSCH, I., SHANE, H., NIEDERAUER, K. L., GRAHAM, S. M., and BACON, A. (1998), 'Facilitated communication as an ideomotor response', Psychological Science, 9: 71-4.
CAMPBELL, J. (1997a), 'Sense, reference and selective attention', Aristotelian Society, Suppl., 71: 55-74.
--(1997b), 'The structure of time in autobiographical memory', European Journal of Philosophy, 5 (2): 105-18.
CARRUTHERS, P. (1996), Language, Thought and Consciousness: An Essay in Philosophical Psychology. Cambridge: Cambridge University Press.
--(2000), Phenomenal Consciousness: A Naturalistic Theory. Cambridge: Cambridge University Press.
DAVIDSON, D. (1963), 'Actions, reasons, and causes', Journal of Philosophy, 60: 685-700.
DELLA SALA, S., MARCHETTI, C., and SPINNLER, H. (1994), 'The anarchic hand: a fronto-mesial sign', in F. Boller and J. Grafman (eds.), Handbook of Neuropsychology, 9: 233-55.
DENNETT, D. (1978), 'Toward a cognitive theory of consciousness', in C. Savage (ed.), Minnesota Studies in the Philosophy of Science, 9.
DICKINSON, A. (1994), 'Instrumental conditioning', in N. J. Mackintosh (ed.), Animal Cognition and Learning. London: Academic Press, 45-79.
DIENES, Z., and PERNER, J. (1999), 'A theory of implicit and explicit knowledge (target article)', Behavioral and Brain Sciences, 22: 735-55.
DRETSKE, F. (1988), Explaining Behavior: Reasons in a World of Causes. Cambridge, Mass.: MIT Press. A Bradford book.
--(1995), Naturalizing the Mind. Cambridge, Mass. and London: MIT Press.
EILAN, N. (1995), 'IV*-The first person perspective', Proceedings of the Aristotelian Society, 95: 51-66.
FENSON, L., DALE, P. S., REZNICK, J. S., BATES, E., THAL, D. J., and PETHICK, S. J. (1994), Variability in Early Communicative Development: Monographs of the Society for Research in Child Development, serial no. 242, vol. 59, no. 5. Chicago: Society for Research in Child Development.
FLAVELL, J. H., EVERETT, B. A., CROFT, K., and FLAVELL, E. R. (1981), 'Young children's knowledge about visual perception: further evidence for the Level 1-Level 2 distinction', Developmental Psychology, 17: 99-103.
--FLAVELL, E. R., and GREEN, F. L. (1983), 'Development of the appearance-reality distinction', Cognitive Psychology, 15: 95-120.
FRITH, C. D. (1992), The Cognitive Neuropsychology of Schizophrenia. Hillsdale, NJ: Erlbaum.
GARDNER, J. M., FELDMAN, I. J., KARMEL, B. Z., and FREEDLAND, R. L. (2000), 'Development of focused attention from 10 to 16 months: effects of CNS pathology and intra-uterine cocaine exposure'. Poster presented at the European research conference on brain development and cognition in human infants: Development and functional specialization of the cortex, Agelonde, France, September 2000.
GOLLWITZER, P. W. (1999), 'Implementation intentions: strong effects of simple plans', American Psychologist, 54: 493-503.
GORDON, R. M. (1995), 'Simulation without introspection or inference from me to you', in M. Davies and T. Stone (eds.), Mental Simulation: Evaluations and Applications. Oxford: Blackwell, 53-67.
HAAKE, R. J., and SOMERVILLE, S. S. (1985), 'Development of logical search skills in infancy', Developmental Psychology, 21: 176-86.
HEYES, C., and DICKINSON, A. (1993), 'The intentionality of animal action', in M. Davies and G. W. Humphreys (eds.), Consciousness: Psychological and Philosophical Essays. Oxford: Blackwell, 105-20.
JACOBY, L. L. (1991), 'A process dissociation framework: separating automatic from intentional uses of memory', Journal of Memory and Language, 30: 513-41.
JAHANSHAHI, M., and FRITH, C. D. (1998), 'Willed action and its impairments', Cognitive Neuropsychology, 15: 483-533.
JEANNEROD, M. (1997), The Cognitive Neuroscience of Action. Oxford: Blackwell.
KIRSCH, I., and LYNN, S. J. (1999), 'Automaticity in clinical psychology', American Psychologist, 54: 504-15.
LABERGE, S. (1985), Lucid Dreaming. Los Angeles: Tarcher.
LURIA, A. (1966), Higher Cortical Functions in Man. New York: Basic Books.
MCCORMACK, T., and HOERL, C. (1999), 'Memory and temporal perspective: the role of temporal frameworks in memory development', Developmental Review, 19: 154-82.
MCDONOUGH, L., MANDLER, J. M., MCKEE, R. D., and SQUIRE, L. R. (1995), 'The deferred imitation task as a nonverbal measure of declarative memory', Proceedings of the National Academy of Sciences USA, 92: 7580-4.
MCFARLAND, D. (1989), 'Goals, no-goals and own goals', in A. Montefiore and D. Noble (eds.), Goals, No-goals and Own Goals: A Debate on Goal-Directed and Intentional Behaviour. London: Unwin Hyman, 39-57.
MASANGKAY, Z. S., MCCLUSKEY, K. A., MCINTYRE, C. W., SIMS-KNIGHT, J., VAUGHN, B. E., and FLAVELL, J. H. (1974), 'The early development of inferences about the visual percepts of others', Child Development, 45: 357-66.
MELTZOFF, A. N. (1988), 'Infant imitation after a 1-week delay: long-term memory for novel acts and multiple stimuli', Developmental Psychology, 24: 470-6.
MILLIKAN, R. G. (1984), Language, Thought and Other Biological Categories. Cambridge, Mass.: MIT Press.
MILNER, B. (1964), 'Some effects of frontal lobectomy in man', in J. M. Warren and K. Akert (eds.), The Frontal Granular Cortex and Behaviour. New York: McGraw-Hill, 313-31.
NAGEL, T. (1974), 'What is it like to be a bat?', Philosophical Review, 83: 435-50.
NIELSEN, T. (1963), 'Volition: a new experimental approach', Scandinavian Journal of Psychology, 4: 225-30.
NORMAN, D. A., and SHALLICE, T. (1986), 'Attention to action: willed and automatic control of behavior'. Center for Human Information Processing Technical Report No. 99. Reprinted in revised form in R. J. Davidson, G. E. Schwartz, and D. Shapiro (eds.), Consciousness and Self-Regulation, iv. New York: Plenum, 1-18.
O'SHAUGHNESSY, B. (1991), 'Searle's theory of action', in E. Lepore and R. Van Gulick (eds.), John Searle and his Critics. Oxford: Blackwell, 271-87.
PACHERIE, E. (2000), 'The content of intentions', Mind and Language, 15: 400-32.
PARKIN, A. J., and BARRY, C. (1991), 'Alien hand sign and other cognitive deficits following ruptured aneurysm of the anterior communicating artery', Behavioural Neurology, 4: 167-79.
PERNER, J. (1991), Understanding the Representational Mind. Cambridge, Mass.: MIT Press. A Bradford book.
--(1998), 'The meta-intentional nature of executive functions and theory of mind', in P. Carruthers and J. Boucher (eds.), Language and Thought. Cambridge: Cambridge University Press, 270-83.
--(1999), 'Metakognition und Introspektion in entwicklungspsychologischer Sicht: Studien zur "Theory of Mind" und "Simulation"', in W. Janke and W. Schneider (eds.), 100 Jahre Institut für Psychologie und Würzburger Schule der Denkpsychologie. Göttingen: Hogrefe, 411-31. English version on the internet: http://www.sbg.ac.at/psy/people/perner/docs/wuerzburg.doc.
--and DIENES, Z. (2003), 'Developmental aspects of consciousness: how much theory of mind do you need to be consciously aware?', Consciousness and Cognition, 12: 63-82.
--and RUFFMAN, T. (1995), 'Episodic memory and autonoetic consciousness: developmental evidence and a theory of childhood amnesia'. Special issue: Early memory. Journal of Experimental Child Psychology, 59 (3): 516-48.
PIAGET, J. (1937), The Construction of Reality in the Child. New York: Basic Books.
REASON, J. (1979), 'Actions not as planned: the price of automatization', in G. Underwood and R. Stevens (eds.), Aspects of Consciousness. London, New York, Toronto, Sydney, and San Francisco: Academic Press, 67-89.
ROSENTHAL, D. M. (1986), 'Two concepts of consciousness', Philosophical Studies, 49: 329-59.
--(2000a), 'Consciousness, content, and metacognitive judgments', Consciousness and Cognition, 9: 203-14.
--(2000b), 'Sensory qualities, consciousness, and perception', unpublished manuscript, City University of New York.
SEARLE, J. (1983), Intentionality. Cambridge: Cambridge University Press.
SHALLICE, T. (1988), From Neuropsychology to Mental Structure. Cambridge: Cambridge University Press.
SHATZ, M. (1978), 'Children's comprehension of their mothers' directive questions', Journal of Child Language, 5: 39-46.
SHIFFRIN, R. M., and SCHNEIDER, W. (1977), 'Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory', Psychological Review, 84: 127-90.
TYE, M. (1995), Ten Problems of Consciousness: A Representational Theory of the Phenomenal Mind. Cambridge, Mass. and London: MIT Press.
WAKEFIELD, J., and DREYFUS, H. (1991), 'Intentionality and the phenomenology of action', in E. Lepore and R. Van Gulick (eds.), John Searle and his Critics. Oxford: Blackwell, 259-70.
WEGNER, D. M., and WHEATLEY, T. (1999), 'Apparent mental causation: sources of the experience of will', American Psychologist, 54: 480-92.
WILLATTS, P. (1997), 'Beyond the "couch potato" infant: how infants use their knowledge to regulate action, solve problems, and achieve goals', in G. Bremner, A. Slater, and G. Butterworth (eds.), Infant Development: Recent Advances. Hove, East Sussex: Psychology Press, 103-35.
WIMMER, H., and PERNER, J. (1983), 'Beliefs about beliefs: representation and constraining function of wrong beliefs in young children's understanding of deception', Cognition, 13: 103-28.
WOLPERT, D. M., GHAHRAMANI, Z., and JORDAN, M. I. (1995), 'An internal model for sensorimotor integration', Science, 269: 1880-2.
ZELAZO, P. D., FRYE, D., and RAPUS, T. (1996), 'An age-related dissociation between knowing rules and using them', Cognitive Development, 11: 37-63.
11
The Development of Young Children's Action Control and Awareness

Douglas Frye and Philip David Zelazo

During the first half-dozen years of life, there are regular, and often remarkable, changes in children's ability to control their own actions (see Zelazo et al., 1997, for a review). For example, consider the familiar game of 'Simon Says', in which players must listen to commands to perform simple actions but are required to carry out only those commands that are preceded by the phrase 'Simon says'. Young children have a particularly difficult time complying with these instructions even though they seem to understand them. Usually, they perform all of the actions, including those they should ignore. Indeed, 4-year-olds may err on almost every occasion, and consistent success is typically achieved only by children several years older (Strommen, 1973). What might account for these age-related changes in action control? One possibility is that young children know perfectly well what they are trying to do, and simply have trouble stopping themselves from acting. This view suggests that the problem is one of response inhibition, and that development in this instance reflects an increase in inhibitory control. Another possibility is that with age, children become increasingly aware of the constituents of their actions. According to this view, developments in awareness permit changes in the way that action plans or intentions are formulated, which in turn permit children to act selectively in a wider range of situations. The proposal that there are developmental changes in the formulation of intentions has to be distinguished from at least one other proposal, namely, that the changes are simply in children's skills. Changes in skill clearly alter the intentional actions we can carry out, and perhaps even the intentions we can have. For example, without the requisite training, none of us can perform a splenectomy, no matter how much a medical emergency may demand one.
Furthermore, when there is a total lack of knowledge of the skills involved, and hence no chance of accomplishing the act, it seems wrong for someone to say, 'I intend to do a splenectomy today'. There must be innumerable changes in skills, from riding bicycles to tying shoes, that expand the range of actions that children can perform, and consequently, the range of intentions that they are able to entertain. Developmental changes in intention will have more extensive implications for action than changes in specific skills. The Tower of Hanoi, a well-studied measure
Children's Action Control and Awareness
245
of planning, illustrates one such change. At least up until the age of 12 years, the Tower of Hanoi reveals an increase in the number of subgoals that children can plan in advance, and this increase then leads to new success on more complicated, multi-step problems (Welsh, 1991). Increases in the number of subgoals children can plan and execute enable them to perform more complex actions, extended over longer periods of time. Indeed, being able to conjoin subgoals has the potential of affecting any complex action. Thus, changes in planning alter what children can intend to do, and unlike the acquisition of specific skills, they will have implications for children's intentional action across a wide variety of situations.
1. AN EXAMPLE OF AN EARLY AGE-RELATED CHANGE IN INTENTIONAL ACTION

A task that requires a small set of well-defined responses can help to expose how young children's action control and awareness develop, because changes in skill can be held constant. One task devised for this purpose is the dimensional change card sort (DCCS). The DCCS is an executive function task that bears some resemblance to the Wisconsin Card Sorting Task (WCST), which has been used extensively with adults and older children to detect differences in functioning associated with damage to areas of the prefrontal cortex. Unlike the WCST, however, the DCCS is a rule-use task rather than a rule-learning task. As such, children are explicitly told which rules to follow; thus, to succeed, they must simply keep the rules in mind and act upon them.

The DCCS presents children with two target cards-for example, a red triangle and a blue circle-each of which is affixed to a tray. The children are asked to match a series of test cards to the targets. The test cards-for example, blue triangles and red circles-are designed to match a different target when they are sorted by one dimension (e.g. colour) than when they are sorted by the other (e.g. shape). Children are told the correct rule for sorting on every trial. During a pre-switch phase, they are first encouraged to match a small number of cards according to one dimension. Then, during a post-switch phase, they are asked to switch to the other dimension. So, for instance, they may be introduced to the colour game first and told on each trial, 'If it's red, put it here; if it's blue, put it there'. Each time they are then presented with a test card that they are allowed to sort. After several cards, they would be instructed to switch to the shape game and told each time, 'If it's a triangle, put it here; if it's a circle, put it there'. The primary developmental change found in this task occurs between 3 and 5 years of age (Frye et al., 1995).
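The mechanics of the task can be sketched in a few lines of code. This is an illustrative model only; the tray and card representations are ours, not part of the published procedure:

```python
# Illustrative model of the DCCS (representations are ours, not the authors'):
# two target trays and a rule that matches a test card to a tray on one dimension.

# Target cards: a red triangle on tray 1, a blue circle on tray 2.
TARGETS = {1: {"colour": "red", "shape": "triangle"},
           2: {"colour": "blue", "shape": "circle"}}

def sort_card(card, game):
    """Place `card` on the tray whose target matches it on the dimension `game`."""
    for tray, target in TARGETS.items():
        if target[game] == card[game]:
            return tray
    raise ValueError("card matches no target on this dimension")

# A blue-triangle test card lands on a different tray in each game:
card = {"colour": "blue", "shape": "triangle"}
sort_card(card, "colour")  # tray 2 (matches the blue circle)
sort_card(card, "shape")   # tray 1 (matches the red triangle)
```

Because the same test card belongs on opposite trays depending on the game in force, the post-switch phase tests rule selection rather than rule knowledge.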
Whereas 5-year-olds switch their sorting, 3-year-olds typically continue to sort by the first dimension even when they are told the rules for sorting by the second dimension on every trial. This pattern occurs regardless of which dimension is presented first, and it is not limited to the dimensions of
colour and shape. Nor does it seem to depend on a specific wording of the rules. Finally, the developmental pattern agrees with a variety of other executive function tasks that, like Simon Says, do not involve sorting cards (Carlson and Moses, 2001). At first glance, the limits in action control exhibited by the young children in the DCCS seem to be compatible with the inhibition account mentioned previously. The 3-year-olds could know what they want to do but be unable to inhibit the outdated, prepotent response (Dempster, 1992; Diamond and Taylor, 1996). In Simon Says, the prepotent response is to comply with the instructions being given. It could then be difficult to inhibit that prepotent response tendency when an instruction is not preceded by the authorizing phrase 'Simon says'. The preswitch responses in the DCCS are not prepotent prior to the task, but they become so during the procedure: whichever game children are asked to play first (i.e. sorting by colour or shape) becomes prepotent and then dominates the child's actions during the post-switch phase. There are other findings with the DCCS that seem to conform with an inhibition account. The DCCS reveals a distinct abulic or knowledge-action dissociation (Zelazo et al., 1996). After 3-year-olds have perseverated on the first dimension of the card sort, they are none the less able to demonstrate an understanding of the rules that they fail to use. For example, when children persist in sorting by colour even though they have been told the correct rules to sort by shape, they can be asked to point to answer the questions, 'Where do the triangles go in the shape game? And where do the circles go?' Three-year-olds almost invariably respond to these knowledge questions correctly. Nevertheless, they continue to perform the incorrect sorting action. When they are told to go ahead and sort a given card according to the shape game ('Okay, good, now play the shape game. 
Where does this triangle go?'), they still perseverate and incorrectly match according to colour. This dissociation appears to provide the most compelling support yet for the hypothesis that young children understand what they are meant to do on the DCCS but are unable to control their actions accordingly. Specifically, the dissociation data seem to indicate that children know the response they should make but instead carry out the old, prepotent one, as if they know exactly what they are supposed to do, and even try to do it, but simply cannot inhibit their prepotent sorting responses. According to this interpretation, the ability to overcome the prepotent response would not depend on changes in children's intentions (i.e. their understanding of what they should do). Rather, it would depend on the strengthening, or even the establishment, of an inhibitory mechanism (e.g. White, 1965). Despite the prima-facie appeal of a response-inhibition interpretation of the dissociation effect, there are several other findings from research with the DCCS that question its validity, or at least suggest that inhibition cannot apply at the level of the response. First, notice that 3-year-olds switch flexibly between the two pre-switch rules. For example, when sorting in the initial game, they may sort
several red cards into one box on one side and then have no difficulty switching to sort a blue card into the box on the other side. In fact, the positions of the target cards on the trays can be exchanged during the first game, and 3-year-olds will continue to sort without missing a step. These observations indicate that however the 3-year-olds are construing their responses, it is not simply on the level of 'this card goes on the left', because they are not bound to that response when either the test card or the position of the initial target cards changes. These observations also show that a simple response-inhibition account cannot be correct. Sorting several cards to one side ought to establish a prepotent response, and a simple inhibition account would predict that 3-year-olds should perseverate, putting all of the cards into the same box. However, 3-year-olds do not perseverate on a single response or a single rule. Instead, from pre-switch to post-switch, they perseverate on a pair of rules for sorting according to a dimension (e.g. colour or shape).

There are additional findings regarding the abulic dissociation in the DCCS that further constrain any interpretation of the phenomenon. Although verbal responses may be intrinsically more flexible than manual ones (Luria, 1961), so that it could be suggested that the dissociation depends on a change in response modality, in fact it is not necessary for the sorting responses and the question answers to be in different modalities for the dissociation effect to occur. Rather, the DCCS can be modified so that the children only make verbal responses indicating where each card should be placed. In these circumstances, 3-year-olds still perseverate after the game changes, and they still give correct verbal answers to the dissociation questions (Zelazo et al., 1996).
This pattern rules out the simple inhibition explanation that the dissociation effect only occurs because the change in modalities allows the child to bypass the original prepotent responses. One further set of observations that employs an error-detection procedure with the DCCS raises even more serious worries about the adequacy of children's understanding of what to do in the task. Instead of having 3-year-olds sort the cards themselves, Jacques et al. (1999) had them observe a puppet playing versions of the DCCS, and asked them to evaluate the puppet's performance. Thus, it was possible to determine if the children knew what should be done without requiring that they execute the necessary actions. The experiment was arranged so that when the rules were changed to the new game, the puppet either perseverated on the old dimension or switched correctly to the new one. In a third condition, the puppet switched on its own even though the new rules had not yet been issued. The results showed that when the puppet perseverated, 3-year-olds incorrectly said it was doing the right thing. When the puppet switched correctly, they incorrectly said the puppet was doing the wrong thing. However, when the puppet switched on its own (i.e. in the absence of a rule change), they correctly said that the puppet was wrong. An additional comparison showed that individual children's evaluation of the puppet's performance was related to their own proficiency on the usual DCCS.
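One way to summarize this pattern of judgments is that the 3-year-olds' evaluations track the first game's rules rather than the current ones. A speculative sketch (the representations and function name are ours):

```python
# Speculative sketch: a judge who never updates the satisfaction conditions
# evaluates every placement against the FIRST game's rules.

RULES = {"colour": {"red": "tray1", "blue": "tray2"},
         "shape": {"triangle": "tray1", "circle": "tray2"}}

def judge(placement, card, first_game):
    """Say 'right' iff the placement satisfies the original (pre-switch) goal."""
    return "right" if placement == RULES[first_game][card[first_game]] else "wrong"

card = {"colour": "red", "shape": "circle"}
# The rules have changed to the shape game, but the judgment has not:
judge("tray1", card, "colour")  # 'right' - the puppet perseverates on colour
judge("tray2", card, "colour")  # 'wrong' - the puppet switches correctly
```

Note that the same function also reproduces the third condition: switching in the absence of a rule change violates the original satisfaction conditions too, so the child's 'wrong' judgment there comes out correct.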
The results of the error-detection study make it unlikely that 3-year-olds know perfectly well what to do, but just have difficulty stopping themselves from executing a prepotent response in the DCCS. If they knew what to do, then they should have been able to detect the puppet's errors. Further, if young children fail the usual DCCS because their tendency to repeat well-established actions interferes with the formulation or execution of new ones, then eliminating the requirement that children sort cards on the pre-switch phase should have allowed them to judge successfully on the post-switch phase. Thus, these results are a strong sign that inhibitory control of responses is not what is at issue in the task. Whatever change occurs that allows children to switch their actions in these circumstances, it does not seem to be simply a matter of acquiring the ability to execute appropriately formulated intentions.

These results lead us back to the possibility that the 3-year-olds do not understand what they should be trying to do in the DCCS, despite their answers to the knowledge questions in the dissociation study. Indeed, the error-detection pattern suggests that the 3-year-olds' own sorting errors were in accord with what they thought was right. For both the puppet's and their own actions, they seemed to think that continuing to match by the old dimension was the appropriate thing to do. Five-year-olds, in contrast, recognize for both the puppet and themselves that the new rules require a change in action. If this interpretation is correct, then it raises two questions: (1) What are the characteristics of 3-year-olds' intentions that lead them to think they are performing correctly in the DCCS? And (2) how can their mis-actions be reconciled with their accurate question answering in the dissociation experiment?

2. A COGNITIVE COMPLEXITY AND CONTROL ACCOUNT

In the broadest of terms, Simon Says and the DCCS ask or instruct the child to carry out an intentional action.
In other words, they set a goal for the child or specify the satisfaction conditions that the child's action should fulfil (Searle, 1983). The child must provide a means to the goal or formulate an intention to act in a way that meets the satisfaction conditions that have been set. These characteristics are what make it useful to analyse executive function tasks, including Simon Says and the DCCS, as instances of problem solving (Zelazo et al., 1997). The specific characteristics of the DCCS will be used to explore what sort of change in intentional action is occurring during this period of development. Subsequently, a particular explanation of the change, taken from Cognitive Complexity and Control theory (Frye et al., 1995, 1998; Zelazo and Frye, 1997, 1998), will be given. The large variety of findings with the DCCS can help to determine what it is about the young children's intentional action that changes. The empirical findings are essential because even with a well-defined task like the DCCS there are many ways in which children could be construing their intentions in it. Their goal in the
task could be as general as 'match the cards' or as narrow as 'get this card in this tray'. Similarly, they could be relying on descriptions of actions that vary from 'indicate this choice' to 'use these two fingers to move the card from here to there'. These specifications are likely to be critical for understanding 3-year-olds' characteristic action patterns.

It can reasonably be assumed, probably without contention in developmental psychology, that if adults act intentionally, then 3-year-olds do so as well. Intentional action appears to emerge during infancy (see Sect. 3). Thus, the explanation of 3-year-olds' characteristic intentional action will not depend on whether or not they can employ means and goals (they can), but rather on whether they can adequately specify the appropriate means and goals in particular problem-solving situations.

One possibility is that 3-year-olds in the DCCS are able to adopt exactly the goals specified by the experimenter's instructions. In the colour game, then, they would know something like 'match by red and blue'. When the rules change to shape, they would need to recast the goal to 'match by triangle and circle'. There is strong evidence that 3-year-olds can formulate either of these goals because they can easily play either game if it is presented first. However, to account for the basic findings of the DCCS, this approach would have to assume that the 3-year-olds are also 'captured' by the old means. In other words, they have the correct goal after the change, but they employ the wrong means. This explanation can be seen to be a variant of the 'correctly know what they want to do but cannot accomplish it' approach. The error-detection findings that 3-year-olds think the puppet is acting correctly when it perseverates after the change would seem to rule out this account.
Although 3-year-olds may not adopt the correct goals for both games, it may nevertheless be reasonable to assume that they adopt the right ones for the first game. After all, they sort correctly in the first game, using the features given in the instructions (e.g. red and blue). So, it could be that after they have successfully acted in the situation, they are unable to recast the goal when the instructions change. According to this view, after the change, they would actually be making a good choice of means, but in pursuit of the old goal. This possibility accords well with the error-detection results. The approach supposes that children's understanding of the satisfaction conditions for the actions in the task continues to incorporate the first instructions even after these instructions have changed. If so, then the puppet's perseverative responses after the change would continue to meet those satisfaction conditions, and thus appear to be right. The puppet's correct responses when the instructions change would appear to be wrong because these responses violate the original satisfaction conditions. And, of course, the comparison condition of the puppet switching without the change in instructions would also appear to be wrong because the satisfaction conditions really are violated.

Even though the incorrectly specified goal account helps to explain the error-detection findings, it leaves the dissociation results untouched. In fact, the dissociation effect seems prima facie to contradict this account. The dissociation results seem to demonstrate that 3-year-olds know how to achieve the second game's goal
even after playing the first. For example, when asked where the triangles and circles go in the shape game, they answer correctly. The most straightforward interpretation of this finding is that the knowledge questions indeed show that the child knows how to achieve success in the second game. Still, knowing how to achieve a goal and adopting that goal are two different things. The child could truly know how to play the second game and yet not change the specification of the goal when the instructions change. An example may help to illustrate this possibility. Adults can usually be counted on to accomplish their daily drive to work. However, if there is an early morning meeting at a different campus, there is a chance that the average academic will drive to the office rather than appear at the meeting. The mistake can occur even though the person was informed of the meeting and knows the directions to the other campus. In fact, it should be possible to demonstrate the person's knowledge of the correct route by asking where the given roads lead. Nevertheless, the outcome of the drive is likely to be the office car park. This example indicates that, although the person can formulate either goal (go to the office, go to the meeting) and can have knowledge of how to accomplish the out-of-the-routine goal, this knowledge does not ensure success. When thinking about going to work, the person must adopt the goal of getting to the meeting and keep that goal in mind at the time of acting, in order to select the existing information about how to reach that destination. Similarly, when trying to match the cards in the DCCS, 3-year-olds are capable of forming the goal of playing either game (they sort correctly in whichever game is presented first), and they have the knowledge needed for the new one (they demonstrate this knowledge on the knowledge questions). However, they may fail to switch because they fail to adopt the new goal and select the appropriate means. 
The dissociation results show that 3-year-olds have the relevant knowledge to accomplish either goal. The error-detection results show that they nevertheless fail to adopt the new goal when the situation changes. With adults, it seems likely that if they were reminded of the goal of attending the meeting, they would then have no trouble with the change.

Cognitive Complexity and Control theory (CCC) suggests that the 3-year-olds' action pattern in the DCCS is not the result of a momentary memory lapse. Instead, it proposes that there is a developmental improvement in young children's ability to change goals deliberately in specific situations, and that the complexity of the action plans and goal formulations required by the situation will matter. Obviously, 3-year-olds can shift their goals in a relatively simple situation such as sorting a red one and then sorting a blue one in the colour game. It is only when the situation becomes sufficiently complex-two different games with two choices each-that 3-year-olds' pattern of acting fails. Furthermore, the difficulty will only arise when there is an inherent conflict in the situation. Switching goals in the driving example is challenging because the goals are related in that they
both involve getting to work, and yet they are also conflicting in that the satisfaction of one goal precludes the satisfaction of the other. The DCCS is similar. Both games involve matching the same test cards, but because of the way the task is structured, matching a card in one game mismatches it in the other.

FIGURE 11.1. Tree structures depicting the unintegrated rules of 3-year-olds (left) and integrated rules of 5-year-olds (right) in the dimensional change card sort.
Notes. In the current example, a1 and a2 refer to aspects of the test cards (blue triangle, red circle), c1 and c2 refer to aspects of the target cards (red triangle, blue circle), and s1 and s2 are the setting conditions of shape and colour.

The left panel in Figure 11.1 depicts the CCC characterization of the 3-year-olds' goal formulations in the DCCS. The two games are shown with the two choices each. Any given choice on one dimension (match red) results in a mis-sorting on the other (mismatch triangle). It has been established that 3-year-olds have knowledge of how to play both games but are unable to select the new rules flexibly when the instructions change. The account proposes that 5-year-olds can govern their actions in the task because they are able to add setting conditions, as shown on the right side of the figure, that allow them to select between the sets of rules. It is assumed that the setting conditions further specify the goal. Thus, the satisfaction conditions of the action in the DCCS become something like 'match the cards for the colour game' and 'match the cards for the shape game'. The further specification explicitly differentiates the goals, and it allows the correct selection of rules because when reasoning about what action to take, children can choose on the basis of an adequately specified goal. In other words, 5-year-olds are able to change their actions in these situations not because they gain more control over their responses per se, but because they are better able to specify the goal that they are attempting to reach. Recent results by Munakata and Yerys (2001) lend support to this account.
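The contrast drawn in Figure 11.1 can be rendered as a toy program (the notation is ours; this sketches the theory's claim, not the authors' implementation): without a setting condition the first rule pair stays in control, whereas a setting condition lets the current game select between the pairs.

```python
# Toy rendering of Figure 11.1 (notation ours). One rule pair per game:
RULES = {"colour": {"red": "tray1", "blue": "tray2"},
         "shape": {"triangle": "tray1", "circle": "tray2"}}

def sort_unintegrated(card, first_game, current_game):
    # No setting condition: current_game is never consulted, so the
    # first rule pair keeps control - perseveration after the switch.
    return RULES[first_game][card[first_game]]

def sort_integrated(card, first_game, current_game):
    # Setting condition (s1/s2): the current game selects the rule pair.
    return RULES[current_game][card[current_game]]

card = {"colour": "red", "shape": "circle"}
# Post-switch trial: colour game was played first, shape game is now in force.
sort_unintegrated(card, "colour", "shape")  # 'tray1' - still matching by colour
sort_integrated(card, "colour", "shape")    # 'tray2' - matching by shape
```

On pre-switch trials the two sorters agree, which is why the difference only shows itself once the instructions change.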
These authors found that when the knowledge questions are made more complex (where do the blue flowers go in the shape game?), so that they, like the post-switch sorting questions, require recognition of the more closely specified goal, 3-year-olds often have difficulty on the knowledge questions too. This new finding
shows that the complexity of the inferences required is what is important, and again reinforces the point that it is not the modality of the questions versus the sorting response that matters (Zelazo et al., 1996, exp. 4).

Given the explanation that characteristic problems in intentional action may occur because of the underspecification of goals, it is worth asking whether there are other cases in which this difficulty might arise. One possibility is the 'anarchic hand' behaviours exhibited by patients who have suffered various forms of cortical damage (see Humphreys and Riddoch, this volume). One of the patients discussed by Humphreys and Riddoch, referred to as ES, made characteristic errors when asked to use whichever hand was aligned with a cup to pick it up. If the cup was on the left, but its handle was pointed to the right, then ES tended to violate the instructions and reach with the right hand. As with 3-year-olds in the DCCS, this pattern might be the result of an habitual action that is difficult to control. Alternatively, it might be that ES's difficulty lies in the specification of the goal. The instruction that the cup must be picked up with the aligned hand has to be maintained as a part of the goal, but there were indications that it was not. For example, when the hand to be used was explicitly cued by the word left or right, errors were reduced. And, as with 3-year-olds in the error-detection version of the DCCS, ES seemed to judge the cross-reach to be correct, so the failure did not seem to be a case of doing the wrong thing in spite of recognizing it was wrong.

Another important case that might be understood in terms of CCC theory is the development of children's theory of mind (see Wellman et al., 2001, for a recent meta-analysis). Theory of mind has been taken to mean children's understanding of their own and others' mental states.
One primary change is the understanding of false belief, or the appreciation that someone else can have a view of the world that is out of accord with reality or the child's own view. Wimmer and Perner (1983) first demonstrated the effect with a story of a character who placed a desirable object in one location, and then was absent when the object happened to be moved. Young pre-schoolers, when asked, predicted that the character would look for the object where it was currently located. This outcome seems similar to the ones that have previously been discussed in that the child might be underspecifying the intent of the question just as they underspecify the instructions in the DCCS. That is, instead of attempting to determine 'where the character should look from the character's perspective to find the object', they attempt to determine 'where the character should look to find the object' (Frye, 1992).

The CCC theory can be generalized to provide an explanation of the various theory of mind findings (Frye, 1999, 2000). As such, it can account for the fact that the development of theory of mind shows the same improvement between the ages of 3 and 5, and for the findings that tests of theory of mind performance correlate with measures of executive function, including the DCCS (e.g. Carlson and Moses, 2001; Frye et al., 1995; Perner and Lang, 1999; Zelazo, Burack, Benedetto, and Frye, 1996; Zelazo et al., 2002). Together, theory of mind
and executive function span a substantial portion of cognitive development, so the generality of the CCC theory is apparent.
3. AN OUTLINE OF THE DEVELOPMENTAL SEQUENCE

The present proposal that there are age-related changes in the complexity of children's action plans and goal specifications may help to explain changes in children's action control in the period from 3 to 5 years; however, it is also important to situate these changes within a developmental sequence. This sequence should establish the constituents of intentional action that result in the characteristic patterns of responding shown by the 3-year-olds. To that end, the CCC account has described a sequence of changes in intentional action that occur over the first five or six years of life. A separate model, Levels of Consciousness (Zelazo, 1999, 2000; Zelazo and Zelazo, 1998), outlines a corresponding series of changes in awareness, each of which is associated with an improvement in intentional action. Because action control has been emphasized thus far, the Levels of Consciousness (LOC) approach will be employed to sketch the changes preceding those observed on the DCCS.

According to the LOC model, during infancy and the pre-school years, there are four major age-related increases in the highest level of consciousness that children are able to muster in response to situational demands. These increases are brought about by a functional process of recursion or re-entrant signalling (cf. Edelman, 1989; Elman, 1990) that allows the contents of consciousness at one psychological moment to be reprocessed and considered together with other contents of consciousness. Each degree of recursion has important consequences for the quality of subjective experience, the potential for recall, the complexity of knowledge structures, and the possibility of action control. As regards action control, each degree of recursion permits an increase in the complexity of children's action plans and goal specifications, thereby extending the range of what the child can do.
In general, the progression moves consciousness further away from the exigencies of environmental stimulation in what might be called psychological distance (cf. DeLoache, 1993; Dewey, 1985; Sigel, 1993), and allows for the formulation of increasingly complex, and more decontextualized discursive patterns of reasoning and acting. According to this model, abulic dissociations occur (under conditions of interference) until incompatible pieces of knowledge are integrated into a single, more complex action plan via another degree of reflection. In the absence of integration, the particular piece of conscious knowledge that controls manual or verbal responding is determined by relatively local associations. In the LOC model, infants are assumed to be endowed initially with minimal consciousness (Zelazo, 1996), which is meant to be the simplest, but still conceptually coherent, kind of consciousness that can be imagined. It permits the baby to be aware of the world-say of a specific item in the world, a rattle-but not to
reflect on that awareness. Or, in other words, the very young baby will have a conscious experience of the world, but will not be conscious of that experience as an experience, and will not be conscious of him- or herself as the subject of that experience. Minimal consciousness supports behaviour, but this behaviour will be relatively stereotypical and rooted in pre-existing behavioural routines or reflexes that are associated with broad classes of stimuli. Hence, contact with a rattle will be likely to elicit sucking as the rattle is assimilated to the broad class of 'suckable things'. Although experience with particular stimuli can change the behaviours associated with these stimuli-so that, for example, the pattern of sucking certain stimuli can change over time and be coordinated into a higher-order routine, as Baldwin (1968) and Piaget (1952) described in detail (see also Cohen, 1998)-the action will not be intentional. From the baby's point of view, the behaviour does not have satisfaction conditions. Indeed, there is no goal specification at all. The object simply triggers the behaviour, and the baby is only aware, in succession, of the trigger and then of the behaviour itself.

The first major change in the developmental sequence occurs between 9 and 12 months of age. The change is modelled by feeding the contents of minimal consciousness back into minimal consciousness via a single recursive loop, which results in a higher level of consciousness referred to as recursive consciousness. The content of this level of consciousness is a description or label for an experience of an item in the world. The label, in turn, can trigger an action pattern apart from the item itself. The most obvious example is that babies can first manually search for a completely out-of-sight object at this point (Piaget, 1954). The label serves as a proxy for the object itself, and when it guides behaviour, it can be considered a goal.
Note, however, that the child's awareness here is only of the desired object, and not of the relevant action pattern (i.e. the means for obtaining the goal) until that action pattern is triggered. Consequently, the action pattern can be wrong when the situation changes, as in the case of the A-not-B error, which occurs when babies search in the wrong place for a toy they have seen moved from one hiding place to another. There is also direct evidence of this distribution of awareness, in that infants who have been bringing about a particular outcome will be surprised if the outcome unexpectedly changes, but will not be surprised if some unrelated action they perform begins unexpectedly to produce that outcome (Frye, 1991). In Searle's (1983: 85) terms, these are intentional actions with satisfaction conditions that are not self-referential. The infant counts success as enough, and does not yet recognize that the particular action must be the one that brings about, or is directed towards, the outcome. The next significant change occurs about a year later. An additional cycle of recursion allows an awareness of both a goal and also one of the child's actions that can stand in relation to the goal. Essentially, the 18-month- or 2-year-old can form a conditional rule which represents that the performance of an antecedent action (a means) will result in a given outcome (a goal). There is empirical evidence that
Children's Action Control and Awareness
255
the child's awareness of the constituents of intentional action changes at this age. Children at the end of infancy are not only surprised when the outcome of an action unexpectedly changes, but they are also surprised when an action that should not bring about an outcome nevertheless does (Frye, 1991). Moreover, when they simply see a novel action but do not witness the intended outcome, they are still able to attempt to bring about the outcome themselves (Meltzoff, 1995). These findings, which show that children expect that only certain actions will produce particular outcomes, indicate that the children are now aware of the action, the outcome, and the relation between them. As such, their own intentions can be considered self-referential in Searle's (1983) sense. Although the child cannot yet perform complex actions that involve more than a single action and outcome, awareness of the three fundamental constituents of intentional action (a means, a goal, and the conditional relation between them) is present.

The next change occurs between 2 and 3 years of age, and elaborates upon the preceding achievement. Even though 2-year-olds can routinely formulate simple intentional actions, those actions remain largely independent of each other. That is, children do not yet explicitly consider the relations or similarities among different actions. The LOC model stipulates that the next cycle of recursion allows children to relate pairs of actions, or form a pair of rules for acting, that can then be considered in contradistinction to each other. Being able to sort by red or blue in the DCCS is an example of such a rule pair. Zelazo and Reznick (1991) demonstrated this development by having young children sort pictures into two categories (e.g. things that are found inside a house and things found outside). Three-year-olds had no difficulty with the task.
In contrast, in spite of being able to label the pictures correctly for inside versus outside, 2.5-year-olds typically only sorted by one of the two rules. Usually, they started to sort correctly, for example, by placing a snowman in the correct box, but then tended to perseverate on putting test cards into that first box (Zelazo et al., 1995). According to the current approach, they were acting to comply with the instructions, but were underspecifying the goal. Being able to sort correctly requires explicitly integrating 'inside' and 'outside' into the satisfaction conditions of the actions. Forming the rule pair of 'if found inside, put here; if found outside, put there' is one way of representing the needed specification, and in order to form that rule pair, children would have to reflect on the rules and explicitly consider the relation between them.

The characterization of the change between 2 and 3 years helps to illustrate what else is involved in the subsequent change between 3 and 5 years that leads to success on the DCCS. Both changes are hypothesized to require a further specification of the goal of the actions. At 3 years, children are able to form a rule pair for sorting by red or blue and another rule pair (considered independently of the first) for sorting by triangle or circle. The recursive step that is theorized to occur between 3 and 5 years, however, makes it possible for the child to step back from these two rule pairs, consider them in contradistinction, and select between them. That is, the additional degree of recursion in awareness allows them to appreciate
256
Douglas Frye and Philip David Zelazo
the hierarchical structure of the task and formulate a corresponding hierarchical system of rules for sorting. As a result, whereas 3-year-olds can only switch between values (e.g. red, blue), 5-year-olds can switch between dimensions (e.g. colour, shape). In line with this interpretation, the setting conditions that come to be added to the specification of the goal are general (e.g. 'in the colour game', and 'in the shape game'). These setting conditions make it possible for 5-year-olds to select the correct actions (and judge someone else's selection of them as being correct) when presented with the change in instructions, but they also have other implications. Because they are general, they extend the range of action beyond the situation. For example, they make it possible for 5-year-olds to recognize perspectives in theory of mind tasks, just as they recognize dimensions in the DCCS. Thus, in the change-of-location false belief task, the addition of setting conditions not only allows children to predict where Maxi will look upon his return, but it also allows the understanding that any inference that involves where Maxi thinks the object is will be similarly affected. This example illustrates how the change to a higher level of reflective awareness (i.e. a higher level of consciousness) could increase the child's understanding of both more complex and also more widely decontextualized actions.
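The embedded structure of such a hierarchical rule system can be made concrete with a small sketch. This is purely illustrative and not from the chapter: the particular card features, box labels, and the `game` parameter are invented for the example; the point is only that the setting condition ('in the colour game ...', 'in the shape game ...') sits at a higher level than the value-level rule pairs it selects between.

```python
def sort_card(card, game):
    """Return the target box for a DCCS test card, given the announced game.

    The top-level branch on `game` models the general setting condition;
    each inner branch is one value-level rule pair of the kind a
    3-year-old can already use in isolation.
    """
    if game == "colour":                 # setting condition 1
        if card["colour"] == "red":      # value-level rule pair for colour
            return "box with red rabbit"
        else:
            return "box with blue boat"
    else:                                # game == "shape", setting condition 2
        if card["shape"] == "rabbit":    # value-level rule pair for shape
            return "box with red rabbit"
        else:
            return "box with blue boat"

# A single card is sorted differently once the game switches,
# which is exactly what switching between dimensions requires:
blue_rabbit = {"colour": "blue", "shape": "rabbit"}
print(sort_card(blue_rabbit, "colour"))  # box with blue boat
print(sort_card(blue_rabbit, "shape"))   # box with red rabbit
```

On this way of putting it, the 3-year-old's difficulty corresponds to having only one of the inner rule pairs in play at a time, with no top-level branch from which to select between them.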
4. IMPLICATIONS FOR WHAT THE CHILD CAN DO INTENTIONALLY

The age-related changes in children's awareness of the constituents of action expand the range of actions that children can perform intentionally. This developmental claim is similar to, but distinct from, claims that can be made regarding the consequences of knowledge acquisition and skill learning. When children acquire knowledge, such as knowledge of the rules of chess, or learn a new skill, such as how to ride a bicycle, there will of course be changes in children's ability to act intentionally. Unlike these changes, however, the developmental changes in action that are made possible by age-related increases in children's level of awareness will not be restricted to a single domain. The shift in children's awareness of action between the ages of 3 and 5 will be used to illustrate this possibility, and the effects of the shift will be considered for two domains-physical causality and moral reasoning. In both cases, the shift allows children to perform counterstereotypic as well as stereotypic actions.

Physical causality. To examine the consequences of the shift between 3 and 5 on children's ability to act upon the physical world, Frye et al. (1996) gave pre-schoolers a simple physical device and asked them to make it work in different ways. The device was a covered ramp with two input holes at the top of the ramp and two outlet holes at the bottom. These holes were connected by two sets of channels.
One or the other set of channels could be blocked off so that marbles inserted into an input hole either rolled straight down the ramp from top to bottom, or rolled
diagonally across the ramp from the top on one side to the bottom on the other. The children were asked to place the marble in the ramp so that the marble would come out of a given outlet hole. Because the ramp operated in two distinct configurations, it could be made to be formally equivalent to the DCCS with there being a 'straight down' condition and an 'across' condition. That is, in the straight condition, to make a marble exit on the left one must insert it on the left, and to make it exit on the right one must insert it on the right. In contrast, in the across condition, the rules are reversed: if out left, then in right; but if out right, then in left. A light was used to signal what configuration the ramp was in. Children operated the ramp in one of the configurations (e.g. straight, light on) and then it was switched to the other (e.g. across, light off).

Pre-schoolers' performance on the task was comparable to their performance on the DCCS. Whereas 5-year-olds could act to produce the desired outcome in either configuration, the 3-year-olds only succeeded in the stereotypical circumstance of the marble rolling straight down. When the marble rolled across the ramp, they still insisted on trying to insert it directly above the place where they wanted it to emerge. To assess whether this problem was really one of intentional action, it was important to determine whether children had the knowledge necessary to understand the workings of the ramp. In the task, children were initially given a demonstration of the ramp in both configurations, and on every trial they were told the rules for how the ramp would operate, just as they were told the rules in the DCCS. The best indication that children understood the workings of the ramp, however, came from an experiment in which one of the input holes was covered over and children were asked to predict where the marble would emerge when it was put into the one available input hole.
In this situation, 3-year-olds could make accurate predictions about the marble's path. They could predict when the marble would roll straight down or across depending on whether the light was on or off. Thus, in the one-input version when 3-year-olds had to use a pair of rules involving two goals (e.g. if light on, emerge left; if light off, emerge right), they were able to do so. However, in the two-input version, the physical situation reached the complexity of two pairs of rules involving two pairs of actions that have overlapping outcomes (if light on, insert left, emerge left; if light off, insert right, emerge left). In this instance, 3-year-olds were no longer able to choose where to insert the marble to make the device operate as they wished. And, when the situation became this complex, they failed in the characteristic way of expecting the more typical outcome to occur-that the marble would always roll straight down.

The findings suggest that with situations of this complexity 3-year-olds will have difficulty with any causal sequence in which there is an exception or change from the way things usually work (cf. Das Gupta and Bryant, 1989). Because of this difficulty, they will not be able to plan the actions needed to initiate these
sequences. From our perspective, the developmental change between 3 and 5 years described by the CCC account allows 5-year-olds to extend their intentional actions to appreciate and manipulate non-canonical causal sequences.

Morality and action. Moral action furnishes a good example of how the developmental changes in children's ability to specify the constituents of action will enlarge the range of socially relevant actions they can consider. Zelazo, Helwig, and Lau (1996) required 3- and 5-year-olds to use information about an action in order to predict an agent's behaviour towards an animal and then make moral judgements about that behaviour. In the normal situation, the animal experienced pleasure from being petted and pain from being hit. In the non-canonical situation, these causal relations were reversed. Children were further told that the actor was either a nice or a mean person. Following a series of confirmation questions that ensured that children understood each scenario, participants were first asked a behavioural prediction question. For example, in one story involving Sally, who was said always to be nice, the question was phrased as follows: 'Now, Sally knows that mugwumps are weird. She knows that a mugwump cries when you pet it and smiles when you hit it. What is Sally going to do?' This problem is analogous to the DCCS or the ramp task because there were two setting conditions of causal system (normal or non-canonical) and within each setting condition, children had to consider the agent's disposition (nice or mean) in order to predict the agent's action (pet or hit). Although the 5-year-olds did well, the 3-year-olds performed poorly on behavioural predictions with the non-canonical sequences. In the normal situation, children could perhaps take the causal system for granted and ignore it, focusing only on a pair of relations between disposition and action (if nice then pet, if mean then hit).
In the non-canonical situation, however, the causal system could not be ignored. In order to make accurate behavioural predictions in this situation, children needed first to consider the appropriate action-outcome relation (e.g. hitting produces smiling), and then to consider the agent's disposition in order to predict the agent's act. Even though 3-year-olds demonstrated that they understood the non-canonical causal system (namely, that hitting the mugwump made the mugwump happy), they did not select this information when determining what the actor would do; instead they reasoned from the usual perspective of a normal causal relation, as if hitting caused harm. This pattern mirrors what has been found in other studies with other tasks, such as the DCCS and the ramp task, and it indicates that the constraint on 3-year-olds' ability to specify actions limits their ability to predict other people's behaviour. By 5 years of age, however, children are able to make accurate predictions even in unusual situations.

Similar age differences emerged for children's moral judgements. After the agent acted, either hitting the animal or petting it, children were asked about the act's acceptability, 'Was it okay to X?', and about possible punishment, 'Should [the agent] get in trouble?' Although act-acceptability judgements at all ages (and among
college students) tended to be based on outcome, there was an age-related increase in the complexity of children's punishment judgements. Three-year-olds' punishment judgements tended to focus either on the act's consequences or the agent's disposition, whereas older pre-schoolers and college students tended to use integrative rules, such as a conjunction rule, in which punishment is given only if both disposition and consequences are negative. As shown in previous studies, the use of integrative rules requiring the simultaneous consideration of two dimensions of judgement first emerges at about 5 years of age or later. Thus, for both a non-social domain, physical causality, and a social one, moral understanding, children's level of awareness and specification of action plans and goals can be seen to make a difference, both for the intentional actions that children themselves can carry out and for the actions they understand. These results demonstrate that the changes in children's understanding of action control will have effects across a broad range of content domains.
5. CONCLUSION

It has been argued that developmental changes in children's awareness of actions
expand the range of actions that children can perform intentionally. Beginning in infancy, developmental changes in action control depend on shifts in children's awareness of the constituents of action (means and goals and their relations) and their specification within increasingly complex action plans. Young pre-schoolers' characteristic pattern of perseverative responding when they must follow altered instructions does not appear to be the result of being unable to stop themselves from acting (i.e., a lack of inhibition). In these situations, 3-year-olds display knowledge-action dissociations in which they show an understanding of what should be done, even though they do not perform the correct actions. At the same time, however, their performance on tests of error detection indicates that their difficulty consists in a failure to select the appropriate actions for execution. That is, when they see the correct actions carried out by another person, 3-year-olds judge that person to be mistaken. Our explanation of this pattern is that 3-year-olds underspecify the goals or satisfaction conditions of the actions indicated in the instructions. Although they have knowledge of the correct actions, they do not select them because they do not adequately characterize the goal they are trying to reach. Characterization of the appropriate goal in this situation requires a degree of reflection that is generally beyond them. In contrast, 5-year-olds are able to specify the satisfaction conditions sufficiently to contend with the altered instructions. This improvement allows 5-year-olds to perform the correct intentional actions in more complex situations in which there has been a change or an exception to the way things typically occur. Importantly, this improvement has consequences for children's
intentional actions across a wide variety of domains, including instances in which children must act to manipulate atypical causal sequences in the physical world and those in which children must understand that at times it can be praiseworthy to perform social actions that in other circumstances have harmful consequences.
REFERENCES

BALDWIN, J. M. (1968), Mental Development in the Child and the Race, 3rd edn. New York: Augustus M. Kelley. Original work published in 1894.
CARLSON, S. M., and MOSES, L. J. (2001), 'Individual differences in inhibitory control and children's theory of mind', Child Development, 72: 1032-53.
COHEN, L. B. (1998), 'An information processing approach to infant perception and cognition', in F. Simion and G. Butterworth (eds.), Development of Sensory, Motor, and Cognitive Capabilities in Early Infancy: From Sensation to Cognition. East Sussex, UK: Psychology Press.
DAS GUPTA, P., and BRYANT, P. E. (1989), 'Young children's causal inferences', Child Development, 60: 1138-46.
DELOACHE, J. (1993), 'Distancing and dual representation', in R. R. Cocking and K. A. Renninger (eds.), The Development and Meaning of Psychological Distance. Hillsdale, NJ: Erlbaum, 91-107.
DEMPSTER, F. N. (1992), 'The rise and fall of the inhibitory mechanism: toward a unified theory of cognitive development and aging', Developmental Review, 12: 45-75.
DEWEY, J. (1985), 'Context and thought', in J. A. Boydston (ed.) and A. Sharpe (textual ed.), John Dewey: The Later Works, 1925-1953, vol. vi, 1931-2. Carbondale, Ill.: Southern Illinois University Press, 3-21. Original work published in 1931.
DIAMOND, A., and TAYLOR, C. (1996), 'Development of an aspect of executive control: development of the abilities to remember what I said and to "Do as I say, not as I do"', Developmental Psychobiology, 29: 315-34.
EDELMAN, G. (1989), The Remembered Present. New York: Basic Books.
ELMAN, J. (1990), 'Finding structure in time', Cognitive Science, 14: 179-211.
FRYE, D. (1991), 'The origins of intention in infancy', in D. Frye and C. Moore (eds.), Children's Theories of Mind: The Development of Social Understanding of Others. Hillsdale, NJ: Erlbaum, 15-38.
--(1992), 'Causes and precursors of children's theories of mind', in D. Hay and A. Angold (eds.), Precursors, Causes, and Psychopathology.
West Sussex, England: Wiley.
--(1999), 'Development of intention: the relation of executive function to theory of mind', in P. D. Zelazo, J. W. Astington, and D. R. Olson (eds.), Developing Theories of Intention: Social Understanding and Self-Control. Mahwah, NJ: Erlbaum, 119-32.
--(2000), 'Theory of mind, domain specificity, and reasoning', in P. Mitchell and K. Riggs (eds.), Children's Reasoning and the Mind. Hove, UK: Psychology Press, 149-67.
--ZELAZO, P. D., BROOKS, P. J., and SAMUELS, M. C. (1996), 'Inference and action in early causal reasoning', Developmental Psychology, 32: 120-31.
-- -- and BURACK, J. (1998), 'Cognitive complexity and control: I. Theory of mind in typical and atypical development', Current Directions in Psychological Science, 7: 116-21.
-- -- and PALFAI, T. (1995), 'Theory of mind and rule-based reasoning', Cognitive Development, 10: 483-527.
JACQUES, S., ZELAZO, P. D., KIRKHAM, N. Z., and SEMCESEN, T. K. (1999), 'Rule selection versus rule execution in preschoolers: an error-detection approach', Developmental Psychology, 35: 770-80.
LURIA, A. R. (1961), The Role of Speech in the Regulation of Normal and Abnormal Behaviour, ed. J. Tizard. New York: Pergamon Press.
MELTZOFF, A. N. (1995), 'Understanding the intentions of others: re-enactment of intended acts by 18-month-old children', Developmental Psychology, 31: 838-50.
MUNAKATA, Y., and YERYS, B. E. (2001), 'All together now: when dissociations between knowledge and action disappear', Psychological Science, 12: 335-7.
PERNER, J., and LANG, B. (1999), 'Development of theory of mind and executive control', Trends in Cognitive Sciences, 3: 337-44.
PIAGET, J. (1952), The Origins of Intelligence in Children, trans. M. Cook. New York: Vintage. Original work published in 1936.
--(1954), The Construction of Reality in the Child, trans. M. Cook. New York: Basic Books. Original work published in 1937.
SEARLE, J. R. (1983), Intentionality: An Essay in the Philosophy of Mind. Cambridge: Cambridge University Press.
SIGEL, I. (1993), 'The centrality of a distancing model for the development of representational competence', in R. R. Cocking and K. A. Renninger (eds.), The Development and Meaning of Psychological Distance. Hillsdale, NJ: Erlbaum, 91-107.
STROMMEN, E. A. (1973), 'Verbal self-regulation in a children's game: impulsive errors on "Simon Says"', Child Development, 44: 849-53.
WELLMAN, H., CROSS, D., and WATSON, J. (2001), 'Meta-analysis of theory of mind development: the truth about false-belief', Child Development, 72: 655-84.
WELSH, M. C. (1991), 'Rule-guided behavior and self-monitoring on the Tower of Hanoi disk-transfer task', Cognitive Development, 6: 59-76.
WHITE, S. H. (1965), 'Evidence for a hierarchical arrangement of learning processes', in L. P.
Lipsitt and C. C. Spiker (eds.), Advances in Child Development and Behavior. New York: Academic Press, 187-220.
WIMMER, H., and PERNER, J. (1983), 'Beliefs about beliefs: representation and constraining function of wrong beliefs in young children's understanding of deception', Cognition, 13: 103-28.
ZELAZO, P. D. (1996), 'Towards a characterization of minimal consciousness', New Ideas in Psychology, 14: 63-80.
--(1999), 'Language, levels of consciousness, and the development of intentional action', in P. D. Zelazo, J. W. Astington, and D. R. Olson (eds.), Developing Theories of Intention: Social Understanding and Self-Control. Mahwah, NJ: Erlbaum, 95-117.
--(2000), 'Self-reflection and the development of consciously controlled processing', in P. Mitchell and K. Riggs (eds.), Children's Reasoning and the Mind. Hove, UK: Psychology Press, 169-89.
--BURACK, J. A., BENEDETTO, E., and FRYE, D. (1996), 'Theory of mind and rule use in people with Down Syndrome', Journal of Child Psychology and Psychiatry, 37: 479-84.
--CARTER, A., REZNICK, J. S., and FRYE, D. (1997), 'Early development of executive function: a problem-solving framework', Review of General Psychology, 1: 1-29.
ZELAZO, P. D., and FRYE, D. (1997), 'Cognitive complexity and control: a theory of the development of deliberate reasoning and intentional action', in M. Stamenov (ed.), Language Structure, Discourse and the Access to Consciousness. Amsterdam and Philadelphia: John Benjamins, 113-53.
-- --(1998), 'Cognitive complexity and control: II. The development of executive function in childhood', Current Directions in Psychological Science, 7: 121-6.
-- -- and RAPUS, T. (1996), 'An age-related dissociation between knowing rules and using them', Cognitive Development, 11: 37-63.
--HELWIG, C. C., and LAU, A. (1996), 'Intention, act, and outcome in behavioral prediction and moral judgement', Child Development, 67: 2478-92.
--JACQUES, S., BURACK, J. A., and FRYE, D. (2002), 'The relation between theory of mind and rule use: evidence from persons with autism-spectrum disorders', Infant and Child Development (Special Issue: Executive Function and its Development), 11: 171-95.
--and REZNICK, J. S. (1991), 'Age-related asynchrony of knowledge and action', Child Development, 62: 719-35.
-- -- and PINON, D. E. (1995), 'Response control and the execution of verbal rules', Developmental Psychology, 31: 508-17.
ZELAZO, P. R., and ZELAZO, P. D. (1998), 'The emergence of consciousness', in H. H. Jasper, L. Descarries, V. F. Castellucci, and S. Rossignol (eds.), Consciousness: At the Frontiers of Neuroscience: Advances in Neurology, lxxvii. New York: Lippincott-Raven Press, 149-65.
12
Children's Action Control and Awareness: Comment on Frye and Zelazo

Jennifer Hornsby

Frye and Zelazo present findings that uncover a particular developmental change in young children's capacities for intentional action. They bring these findings within the purview of two theoretical models-the Cognitive Complexity and Control account (CCC), and the Levels of Consciousness account (LOC). The two models have been claimed to help to explain not only the change discussed in detail in their chapter but a wide range of other findings besides. The special interest of the authors' present contribution, signalled in its conjunctive title 'Control' and 'Awareness', resides in its claim to harmonize the two models. LOC postulates four major developmental changes, occurring at around 9 to 12, 21, 30, and 48 months; CCC focuses on changes between 3 and 5 years of age. Frye and Zelazo suggest now that the LOC's account of a child's development through infancy meshes with the CCC's account of development at the pre-school stage.

I want to approach the theoretical models by way of a more everyday perspective on the findings. There is a very general question about how our ordinary, so-called common-sense psychological perspective relates to that of experimentalists and theoreticians of psychology. An instance of this general question seems bound to arise in evaluating the claims that Frye and Zelazo make in developmental psychology.
1. COMMON-SENSE PSYCHOLOGY
'Theory of mind' is often used as if it was more or less equivalent to 'common-sense psychology'. To explain now why I avoid the term 'theory of mind' may help both to indicate what can be meant by 'common-sense psychology' and to emphasize the difference I think there is in this area between commonsensical and theoretical accounts (such as CCC and LOC). Frye and Zelazo say about 'theory of mind' that it 'has been taken to mean children's understanding of their own and others' mental states'. But 'a theory of mind' has not usually been supposed to stand for something that children distinctively possess. It is usually meant as a name for what people putatively acquire in childhood in

I am grateful to the editors for helpful comments on a draft of this chapter.
virtue of which, beyond childhood, they possess and are able to gain psychological knowledge about others.1 Theory of mind, so understood, is evidently something in whose acquisition and evolvement developmental psychologists will be interested. Yet I suggest that the term 'theory of mind' can be prejudicial to debate-in three different ways. In the first place, 'theory of mind' can seem contentious as a label for what one needs in order to have understanding of others' mental states. For it is a controversial question whether it is correct to assimilate the capacity to deploy psychological concepts to the possession of a theory. As is very well known to workers in the field, it is often argued that this is not correct.2 Second, there is a good question about the extent to which someone's being able to find others psychologically intelligible is a counterpart of their being psychologically intelligible themselves. Some philosophers argue that these two things are bound to go hand in hand. Thus capacities to think things, to know things, and to want things run alongside capacities to know what others think, or know, or want. If this is so, then it will be misleading to talk about psychological development as if it was simply the acquisition of a capacity to understand others; but this is what is sometimes suggested by speaking of the acquisition of a theory of mind. Third, there is a good question about the extent to which agency and mental states are separable. Adult human beings possess mental states (as well as being capable of knowing others' mental states); adult human beings do things intentionally (as well as being capable of knowing what others do intentionally). These things seem to be related.
Indeed, capacities to acquire mental states on the basis of interactions with the world, capacities to acquire mental states on the basis of interactions with others, capacities to reason theoretically, capacities to reason practically, capacities to act in the world on the basis of reasons seem all to be interconnected. The idea that there are interdependencies between them all is what a thesis of the holism of the mental can convey. When 'mind' is glossed as 'mental states' (cf. n. 1), 'theory of mind' suggests a divorce between agency and mentality. Speaking of 'theory of mind' can then seem to disallow the holism.3
1 The term 'theory of mind' has also come to be used sometimes to stand for a field of research within developmental psychology, rather than for anything that anyone, child or adult, putatively possesses or knows. But this evidently is not what Frye and Zelazo mean by it. Whether, as Frye and Zelazo along with many others assume, psychological knowledge is always knowledge 'of mental states' is something I put in question below.

2 I allude to the debate between simulationists and theory theorists. For a useful discussion, see Davies (1994) and Heal (1994). Use of the term 'theory of mind' seems to me to reinforce a picture that sometimes lies behind the theory theory. But I agree with Davies that the debate has many strands, and is much less clear-cut than is often supposed.

3 See further Hornsby (1997b), where I suggest that, developmentally speaking, not only does understanding of 'mind' go hand in hand with understanding of 'mind-world' relations, but understandings of both of these go hand in hand with understanding of 'world'.
So I prefer the label 'common-sense psychology' to 'theory of mind' for what we (human adults) are governed by and all use all the time, and what children gradually come to be governed by and to use. Prejudices would be introduced if the term 'theory of mind' was substituted for 'common-sense psychology', because the answers to all three of the questions I have just raised would then seem to have been settled. In my own view, all three questions would be settled wrongly if one followed through the implications of speaking of 'theory of mind'. For I want to affirm all of the following: (1) Someone's ability to apply psychological concepts to others is not a matter of their possessing a theory (not even a tacit theory). (2) The appropriateness of applying psychological concepts to someone is inseparable from that someone's being able to apply those concepts appropriately to others.4 (3) Our everyday common-sense psychological understanding involves not 'mind' (mental states) specifically but interactions between 'world' (surroundings) and 'mind'.

Now common-sense psychology, on this view of it, is a massive and unwieldy subject matter. (1) It is not theoretically tractable; (2) it encompasses individual psychological subjects only in so far as they are mutually intelligible; and (3) it encompasses mind and world. Very evidently, empirical psychologists have to narrow their sights and to study something much more delimited. Perhaps the term 'theory of mind' stands for a field of studies whose concern is specifically understanding of others' mental states. And perhaps the term 'executive function' has grown up in developmental psychology to stand for a field of studies whose concern is specifically action-related capacities-capacities to do things at will and for reasons. If that is so, then common-sense psychology will subsume what is brought under the heads of theory of mind and of executive function, and very much else besides.

4 This is a philosophical claim about the conditions for common-sense psychological predications to be in place. It is not that we can never understand someone by making use of a concept that she doesn't apply to others. When it comes to children, we obviously do use concepts that they haven't yet acquired, as I acknowledge below. The claim may be put in terms that Daniel Dennett has used: someone who takes the intentional stance towards people is someone towards whom that stance can appropriately be taken. In Hornsby (1997a), I attempt to make it seem reasonable that a third-personal point of view (i.e. Dennett's 'intentional stance') and first-personal points of view go hand in hand. The philosophical claim provokes high-level metaphysical questions, so that my treatment here is inevitably cursory.

2. COMMON-SENSE PSYCHOLOGY AND DEVELOPMENTAL PSYCHOLOGY

In his (1999) Frye sets out by saying that 'theory of mind and executive function ... seem very different'. He tells us that 'it is not apparent why the development of mental state understanding would be connected to the control of one's own actions'. But I hope it is apparent now why someone's understanding of others'
266
Jennifer Hornsby
mental states and that person's action-related capacities might be supposed to be connected. A connection is made as soon as both are located within common-sense psychology. It is made then by way of two other connections: (a) between common-sense psychological concepts being applicable to X and X's having a capacity to apply those concepts to others (cf. (2) above); and (b) between understanding mental states and understanding of mind-world relations (cf. (3) above). These connections, which link theory of mind with executive function, are not developmental of course. From an everyday perspective, the process by which an infant comes to be a common-sense psychological subject is surely a gradual one to which the acquisition of language (which of course is itself gradual) is crucial. The question, or at least one large question, for developmental psychology is whether one can distinguish steps in the process, and discern definite developmental stages that children pass through. What seems certain, before any theorizing is undertaken, is that such steps will not map neatly onto distinctions that we make commonsensically. It is not as if the child first became intelligible using a batch of concepts and then in a separate step became able to apply those concepts to others. Nor is it as if the various psychological concepts deployed commonsensically might be mastered one at a time. The thesis of the holism of the mental counts against this. And even before philosophical theses are on the scene, it seems right to say that coming to have a grasp of any particular concept is itself a gradual matter, often caught up with the acquisition of other, related concepts. Children engage in the kind of human interaction for which common-sense psychology provides long before they become mature common-sense psychological subjects themselves.
Consider the phenomena that psychologists study under the various heads of imitation, affective exchange, pointing behaviours, pretend play, make-believe. Many of us have first-hand experience of all of these. In early joint visual attention, for instance, an adult turns to look at an object, and a baby responds by looking in the same sort of direction; when the baby comes to focus on the object, there is, as it were, a meeting of minds. It is natural to think of such sharing of mental states as part of a process of initiation into a community of persons. Joining in a range of activities partaking of intersubjectivity precedes the achievement that has been labelled acquisition of a theory of mind. Thus although the child's achievement is a gradual one, it cannot seem right to think that children gradually come to 'have minds'. Here we start to see some of the reasons for resisting the picture that can lie behind talk of acquisition of a 'theory of mind'. In the first place, we see why one might resist assimilating the capacity to deploy psychological concepts to the possession of a theory. The child's accomplishment in early development is a sort of socialization into full personhood, and that fits ill with thinking of it as the learning of any theory. Second, suppose that it is correct to connect being a full-fledged user of common-sense psychology with being a full-fledged subject of common-sense psychology. Then if 'theory of mind' is supposed to stand for what is
Comment on Frye and Zelazo
267
acquired in becoming a user of common-sense psychology, it might seem as if becoming a subject of common-sense psychology had to be equated with coming to be a subject of mentality. But the case of young children shows that our idea of a participant in phenomena of intersubjectivity needs to be distinguished from our idea of being a full-fledged common-sense psychological subject. (In this connection, it is worth noticing that, in most people's view, non-human animals, like very young children, are much stronger candidates for being subjects of experience than they are for being common-sense psychological subjects.5 But the present point is only that psychological terms-in a sense of 'psychological' that relates to 'having a mind' as opposed to 'being an automaton'-apply to infants, but that infants are not yet subjects of common-sense psychology.)

3. COMMON-SENSE PSYCHOLOGY AND 3-YEAR-OLDS' DCCS PERFORMANCE

I suggest that from the point of view of common-sense psychology, the 3-year-olds' performance on the dimensional change card sort (DCCS) tests is inexplicable. Here is a generalization which I think we take to be true of commonsensically explicable psychological beings, and which the 3-year-olds, in the studies described by Frye and Zelazo, flout: 'If you know what you should do and are able to do it, then (in the absence of any tendency not to do it), you will do it.' (No doubt this needs more qualification;6 and with its hand-waving talk of 'tendencies', it can hardly be thought of as a basic principle of common-sense psychology. But my present concern is only to elicit what is perplexing in 3-year-olds' performance, so that my claim now need only be that common-sense psychology commits us to something a bit like this generalization.) Now consider the 3-year-old who has just sorted by shape, and is told to sort by colour but continues to sort by shape. She knows what she should be doing, because she will tell you if she is asked.
She is able to sort by colour, because if she had initially been told to do so, then that is what she would have done. Moreover, since children are cooperative on the task, there is nothing to suggest that she has any tendency not to sort by colour. So why does she not now sort by colour? Commonsensically there seems to be nothing to say-except that she is only 3. And this of course is just to admit that 3-year-olds are not yet full-fledged psychological subjects. One is led to inconsistency if one attempts an adult common-sense psychological treatment of the 3-year-old. For we want to say that the child knows what she should be doing on the DCCS task. But if we take her performance to be governed
5 Evidently much here is controversial; and no doubt my own view that only creatures with language are common-sense psychological subjects is very controversial. 6 One might wish to add, for example, 'and if one is not forgetful of what one should be doing'. But, as Frye and Zelazo say, it can't be made plausible that the 3-year-olds suffer memory lapses.
by the generalization, then we have to say she does not know what she should be doing. (The fact that she has the wrong view of what a puppet who is supposed to be following the same instructions should do is more evidence that the child does not know what she should be doing.) It seems in short that at the post-switch stage the child, if she were treated as properly common-sense psychologically intelligible, would both know and not know what she should be doing. A similarly surd description could be offered of 3-year-old performance on the ramp task, and again on the 'morality and action' tasks where the children's judgements are not informed by what they seem really to know. Three-year-olds are simply not the sort of rational beings that common-sense psychology takes for granted when it is used in understanding adults. This is not to say that we cannot apply any of common-sense psychology's concepts to 3-year-olds. Indeed, in formulating any view of what is going on with 3-year-olds, we might find common-sense psychological concepts very useful: they might help us to understand our children better. In the light of the findings that Frye and Zelazo present, for example, we might decide that we had sometimes wrongly seen young children as obstinate, having mistaken what is actually a maturational deficit for a defect of the will. There need not then be any difficulty with Frye and Zelazo's claim that a distinction between what the child does intentionally and what she does unintentionally can perfectly well be made during infancy: it is not as if common-sense psychology had no purchase at all on children. It is only that such a distinction cannot have exactly the same significance in young children's case as it does for adults. If one describes the child as doing something intentionally, then one describes her by reference to a network of interconnected concepts, not all the implications of which are yet in place in application to her.
We can use common-sense psychological descriptions, but we may be led into absurdities if we follow through on what they would entail as applied to normal adults. The reason, I think, why common-sense psychology can lead to inconsistencies if young children are taken to fall in its domain is that it takes so much for granted: common-sense presupposes an enormous amount about what common-sense psychological subjects are capable of. In the normal case, of normal adults, its presuppositions are warranted. But not all of common-sense psychology's presuppositions are in place where young children are concerned. Nor are they in place in the case of patients with a range of neurological deficits, including the example of the anarchic hand phenomenon to which Frye and Zelazo refer. And its presuppositions let us down again when, for instance, people have extraordinarily outlandish beliefs.7 Perhaps part of the explanation of why we can none the

7 See Stich (1983). Stich is among those who view common-sense psychology's breakdowns as part of a case for its elimination: examples of outlandish beliefs feature in an argument which concludes 'so much the worse for common-sense psychology'. It puts a new slant on the debate about eliminativism, I think, to consider child development. This helps one to see some of the problems with the supposition, made by Stich and many others, that common-sense psychology gives a protoscientific account and uses categories that might fix onto well-defined states in some science.
less rely on common-sense psychology is that we normally encounter mature psychological beings whose brains and nervous systems subserve the range of capacities which it presupposes; and when we encounter others-whether children, or patients with neuropathologies, or people suffering from one or another kind of insanity-we still find beings whose brains and nervous systems are such as to subserve some large range of these capacities. At the points at which common-sense psychology breaks down, we find ourselves with very little to say. We can only try to get at what capacity is lacking. About our 3-year-olds, we might say that their knowledge does not impinge as it should upon their actions because they are not yet able to keep track of what should be done. Such an account evidently does not cast much light. It avoids paradox simply by acknowledging the absence of a capacity present in mature psychological subjects, generalizations about whom can then be supposed to be inapplicable. (Similarly unenlighteningly, one could say of children who fail the 'false belief test' that they do not yet have a workable notion of 'what he (another) will think'.)
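The inconsistency described above — a child who can state the post-switch rule when asked, yet acts on the pre-switch rule — can be put in a deliberately crude toy sketch. This is illustrative only: it is not a model proposed by Frye and Zelazo, and the function names and the knowing/acting split are invented for the example.

```python
# Toy caricature of DCCS performance (illustrative only, not Frye and
# Zelazo's model). A card is a (colour, shape) pair.

def report_rule(instructed_rule):
    """Both 3- and 5-year-olds can state the currently instructed rule."""
    return instructed_rule

def sort_card(card, instructed_rule, age, previous_rule=None):
    """Return the dimension value the child actually sorts the card by.

    The caricatured 3-year-old perseverates: after a rule switch she acts
    on the old rule, even though report_rule shows she can state the new one.
    """
    colour, shape = card
    rule = instructed_rule if (age >= 5 or previous_rule is None) else previous_rule
    return colour if rule == "colour" else shape

# Pre-switch: everyone sorts by shape.
assert sort_card(("red", "star"), "shape", age=3) == "star"
# Post-switch: told to sort by colour after having sorted by shape.
assert report_rule("colour") == "colour"                                   # she can state the rule
assert sort_card(("red", "star"), "colour", age=5, previous_rule="shape") == "red"
assert sort_card(("red", "star"), "colour", age=3, previous_rule="shape") == "star"  # but perseverates
```

The point of the sketch is only that 'knows the rule' (report_rule) and 'acts on the rule' (sort_card) come apart for the caricatured 3-year-old, which is just the surd combination that the generalization cannot accommodate.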
4. THEORY IN DEVELOPMENTAL PSYCHOLOGY

Thus I suggest that there are features of young children that common-sense psychology, so far from helping us to understand, finds paradoxical in its own terms. It is here evidently that theory may come in. And it seems now that two sorts of theory might be given. First, theory might attempt exactly to pinpoint specific capacities that young children lack. Then childhood deficits could be described in a much less rough and ready way than they are when it is allowed simply that some relevant capacity or other is not developed. (Theory of this sort begins to encroach upon common-sense if we decide, as I suggested above that we might, that 3-year-olds are not as straightforwardly obstinate as we might have supposed.) Second, theory might postulate capacities attributable specifically to children, treating them as having a psychology proper to them. Theory of this second sort is more ambitious: it attempts to do more than to isolate respects in which children are deficient as compared with adults. It might introduce a stock of concepts to tell a developmental story, which provides, at any stage, a psychological description of children that achieves a consistency missing from a common-sense psychological description of them. For example, it might be postulated that children at a certain developmental stage have 'proto-intentions'-states of mind which do not conform to those of common-sense psychology's generalizations about intentions that land us in inconsistency, but which none the less play a role related to that of intentions proper. The division between these two sorts of theory is evidently not a very clear-cut one. Indeed, since the two kinds of theoretical intervention I am imagining would ideally mesh with one another, the separation cannot be sharp. Theories of the first sort work back from common-sense psychology, as it were. Theories of the
second sort work up towards common-sense psychology. If theories of the two sorts were available, and could be made to dovetail, then we should have a story to tell about how a child comes to be a common-sense psychological subject. According to this division, it seems that CCC attempts a theory of the first sort, and LOC a theory of the second sort. The interest of looking at CCC and LOC together, then, is not merely that CCC deals with a stage of development that is recognized in the more developmentally comprehensive LOC. It seems as if the two accounts might together yield an informative account of how a child comes to be an intentional agent. But despite the potential promise that Frye's and Zelazo's accounts hold out, I fear that I have difficulties about reconstructing from them the kind of overall account I think we want. I shall now take the CCC and LOC accounts in turn, and explain some of my difficulties.
5. THE COGNITIVE COMPLEXITY AND CONTROL ACCOUNT

CCC proposes that younger children's faulty performance on the DCCS and other tasks should be thought of as a kind of underspecification.

[T]here is a developmental improvement in young children's ability to change goals deliberately in specific situations. ... 5-year-olds are able to change their actions [after a switch in the DCCS test] ... because they are better able to specify the goal that they are attempting to reach. (Ch. 11)
Frye and Zelazo use the idea of underspecification also in explaining failures in the 'false belief' test. They say that the 3-year-old 'attempts to determine where the character should look to find the object' but fails to 'attempt to determine where the character should look from the character's perspective'. CCC initially seems hard to fault. Certainly, in some sense or other the 3-year-old acts on an inadequately specified goal. But we need to know more exactly what it is for a goal to be 'underspecified'. One suggestion might be this: a complete specification of a goal is the content of an intention of someone who performs the task adequately; and there is underspecification of the goal where an intention with a similar content but with something omitted is present and explains the deficient performance on the task. According to this suggestion, what is lacking in completeness, where there is underspecification, is the content of an intention. This suggestion fits with the most natural way of understanding what Frye and Zelazo say about the false belief test: they think that the concept of 'the character's perspective' goes missing from the thoughts of 3-year-olds who fail the test. But there is a difficulty with the view we reach if we construe underspecification in line with this suggestion. We have to say that the difference between the 3-year-old and the 5-year-old is a difference of thought content. The 5-year-old who passes the
false belief test and is capable of realizing that another person will get something wrong has a thought whose content, unlike that of the 3-year-old, includes a concept of 'the perspective [of the other]'. And this is questionable. For we can distinguish between having the ability to appreciate another's perspective, and exercising the concept of another's perspective in one's thoughts. And while it may be very plausible that in a range of circumstances, 5-year-olds can and 3-year-olds cannot appreciate another's perspective, it is surely not so plausible that 5-year-olds actually exercise the concept of another's perspective in their thoughts. (To me it does not seem plausible that adults exercise the concept of 'another's perspective' in thought very often, although of course they often manifest an ability to know what someone else will think about something.) There is not necessarily a criticism of Frye and Zelazo here, because underspecification might be understood differently. But questions are bound to arise about what exactly is added to an account of development when claims are made about specifications of goals. We need to be told exactly which states of mind suffer from underspecification in 3-year-olds, and exactly what is absent from the underspecified states. (If we are indeed meant to understand underspecification as a matter of something's being missing from a state's content, then we must supply an analogue of 'another's perspective' to do the work in accounting for the deficit in the DCCS case that 'another's perspective' does in accounting for the deficit when there are failures on the false belief test.) It may be that the states that CCC postulates (whose objects Frye and Zelazo call goals) are not states of intention as these are ascribed in common-sense psychology, so that we should not think of them as having the kinds of content that adults' thoughts and intentions have. 
But if this is so, then CCC will emerge as a more ambitious theory (with the pretensions of theory of the second sort I distinguished above). Then we should need to know more about the postulated states, and about how those of 3-year-olds differ from those of 5-year-olds. I have raised these questions about what an underspecification of goals amounts to in order to draw attention to how little in the way of actual explanation CCC appears to provide. It is not clear that, as it stands, it offers us much more than the unenlightening account we looked at earlier, which recognizes that young children cannot keep track of what they should do. The interest of CCC would seem to lie principally in its bringing together a range of findings, and thus in its suggestion of a developmental step which makes a crucial difference across a variety of task domains. Its chief implication seems to be that what is lacking in the 3-year-old in the DCCS task is not some specialized capacity.

6. THE LEVELS OF CONSCIOUSNESS ACCOUNT

LOC proposes that the developmental change addressed by CCC is the final change in a series of changes in 'recursive awareness'. LOC, moreover, promises to be more specific about underspecification: it appears to introduce a tangible claim
about what a greater degree of specification involves-namely, a specification at a higher level of consciousness. Perhaps it was wrong to turn to common-sense psychological concepts in seeking to discover what underspecification amounts to: perhaps we needed further theory, of a more ambitious kind, to put the relevant notion of underspecification into service. Prima facie, then, LOC might not only add to CCC by speaking also to earlier developmental history, it might also add shape and definition to CCC itself. But LOC is hard to judge. How are we meant to understand 'level of consciousness'? And how is the idea of such levels related to an information-processing story? With consciousness in the picture, one expects phenomenology to be some sort of touchstone. But thinking of what one can be aware of oneself doesn't appear to reveal what is meant by a different 'level' of consciousness in the LOC model. Certainly, there can be more to being a conscious subject than enjoying the 'minimal consciousness' that corresponds to the initial level of consciousness. But why should a subject who enjoys more than minimal consciousness be thought to be conscious at a new and different level? Perhaps we secure an idea of a second level of consciousness, above the minimal, by thinking of consciously reflecting on our own conscious states. It doesn't seem credible, however, that such thinking about our own states (consciously reflecting on consciously reflecting on ...) can provide us with as many levels as five. Yet, according to LOC, a fifth level of consciousness is what 5-year-olds reach, having passed from minimal consciousness through four developmental steps. Despite the name that he gives to his account, Zelazo is perhaps not really committed to as many levels of consciousness as there are stages in the account. It might be suggested that what increases is actually the number of the model's layers that are relevant to what can be present to consciousness.
On this construal, consciousness itself does not assume new levels, but only the recursions in a process by which information feeds into the contents of consciousness. But if one assumes that the different outputs of greater levels of processing are what really constitute differences in 'levels of consciousness', then one wants to know how the model's different layers connect with the child's awareness, or connect with the contents of states that the child's performance warrants the ascription of. The idea would have to be that each developmental stage brings a different layer of information processing, and at any stage the whole processing system existing at that stage somehow determines what is available to the conscious subject at that stage, or determines states which explain the child's performance. If this is indeed the idea, then something will need to be said about the nature of the determination of the contents of consciousness, or of mental states, by information processors. Does Zelazo want us to think of a special centre in the brain where the deliverances of the various recursions (their number depending upon the child's stage of development) somehow arrive? If so, his model seems to be a version of the Cartesian Theatre, which Dennett criticized in Consciousness Explained (1991).
But unless we are told more about the relationship between the outputs of the model and capacities of the child,8 the model's claims about the inner workings of the child's brain will not seem even to start to explain the different degrees of specification of goals which CCC tells us that the child goes in for.
7. OVERVIEW

In the ordinary way we make use of common-sense psychological concepts in understanding mind and agency. Developmental psychology, like any other branch of human psychology, seems bound to take off from, and ultimately be answerable to, a common-sense psychological conception of mind. It begins with one because it treats children who are experimental subjects as we all do; and we treat children as, as it were, en route to full personhood. It is answerable to common-sense psychology, because common-sense psychology provides the categories that we have eventually to rely upon to assess the value of any theories that are put forward. I have made a suggestion about how developmental theory (coming in two roughly distinguishable sorts) might play a role such that it could both take off from and be answerable to common-sense psychological conceptions of children. But I have also suggested that CCC fails to take us much beyond what common-sense already suggests we might say about 3-year-olds' performance on the DCCS, and that LOC departs from common-sense in ways that make it hard to assess.
REFERENCES

DAVIES, M. (1994), 'The mental simulation debate', in C. Peacocke (ed.), Objectivity, Simulation and the Unity of Consciousness. Proceedings of the British Academy 83. New York: Oxford University Press, 99-127.
DENNETT, D. (1991), Consciousness Explained. London: Little, Brown and Co.
FRYE, D. (1999), 'Development of intention: the relation of executive function to theory of mind', in P. D. Zelazo, J. W. Astington, and D. R. Olson (eds.), Developing Theories of Intention: Social Understanding and Self-Control. Mahwah, NJ: Erlbaum, 119-32.
HEAL, J. (1994), 'Simulation vs Theory Theory: what is at issue?', in C. Peacocke (ed.), Objectivity, Simulation and the Unity of Consciousness. Proceedings of the British Academy 83. New York: Oxford University Press, 129-44.
8 No doubt one needs to look at the data on younger children that are adduced in support of LOC in order to learn more about the relationship between the outputs of the model and capacities of the child. But in the present chapter, I confine myself to the question of what light has been cast on the particular findings described in Frye and Zelazo's contribution.
HORNSBY, J. (1997a), 'Dennett's Naturalism', in her Simple Mindedness: In Defense of Naive Naturalism in the Philosophy of Mind. Cambridge, Mass.: Harvard University Press, 168-84.
-- (1997b), 'Causation in intuitive physics and in commonsense psychology', in her Simple Mindedness: In Defense of Naive Naturalism in the Philosophy of Mind. Cambridge, Mass.: Harvard University Press, 185-94.
STICH, S. P. (1983), From Folk Psychology to Cognitive Science. Bradford Books. Cambridge, Mass.: MIT Press.
13
The Development of Self-Consciousness Michael Lewis
1. THE DEVELOPMENT OF SELF-CONSCIOUSNESS
The concept of agency implies an active organism, one who desires, one who makes plans, carries out actions, and compares their actions to their desires. Such a concept of agency requires consciousness. There is little question that in adults, such conscious processes exist. The question is, from a developmental perspective, when do they arise? In our work on the development of self-awareness, we have developed measures which are related to this concept. They include self-recognition, personal pronoun usage (me and mine), and pretend play. In this chapter, we will first develop the idea that the mental state of the idea of me needs to be distinguished from the system aspects of the self, and that a theory of mind-that is, the child's knowledge of these internal states, such as thinking and intending-requires self-consciousness. Second, we will show the developmental course of self-recognition and personal pronoun usage, and how they are related to the onset of pretend play; and third, we will show how the onset of these capacities affects the onset of self-conscious emotions, such as embarrassment, shame, guilt, and pride.
2. THE NATURE OF A REFLECTING SELF The adult self is made up of a variety of different aspects, functions, and structures which occasionally work in harmony (see Wylie, 1961, for an historical review of this idea). For example, I certainly know, as I sit here writing, that I have a plan to write this chapter and an outline which I have made to help formulate my thoughts. It is clear that I have intentions and desires and presumably the ability to carry out the task of thinking and writing. Yet the very acts themselves seem to emerge from me almost effortlessly. Indeed, if I focus my attention on them, I find that doing so interrupts the very act that I am performing. It is clear, then, that this self of mine-the body and the mind-that carries out this task does not need and, in fact, may be hindered by my paying attention to myself. A self is necessary to formulate, at least sometimes, what it is that I wish to think about, but does not appear to be involved in the process that actually carries out the task of thinking.
Consider the example: We give a subject the problem of adding a 7 to a sum of 7s that preceded it (e.g. 7 + 7 = 14 + 7 = 21 + 7 = 28, etc.). It is clear that as we carry out this task, we cannot watch ourselves do the arithmetic. It would seem that one aspect of the self has set up the problem and another will solve it; and it is likely that the first will evaluate the result of what the second did. Let us consider self-deception. How is it possible for a self to deceive its self? It would appear to be a logical impossibility, but only if we believe that a self is a single thing. A self as a single thing could not deceive its self. If, however, we conceive of a self in the manner that Freud (1959) did, one that consists of several aspects or features, then we would be able to argue that one part of the self can deceive another part. The idea of self-deception suggests that the one way to understand the self is to assume the position that there are multiple aspects to the self, which may mean that the self is a modular system-an idea applied to brain structure and process (Gazzaniga, 1988). It is clear that whatever the self may be, it is a complex multi-aspect sort of 'thing' or 'process'. This multi-aspect self has been considered in many different ways. In early writing, I have referred to it in terms of 'subjective' versus 'objective' self-awareness (Lewis and Brooks-Gunn, 1979a, b; see also Duval and Wicklund, 1972) or the machinery of self versus the idea of 'me' (Lewis, 1994, 1995a). Given this idea of a multi-aspect adult self, how are we to treat the idea of the development of self? From a developmental perspective, not all these aspects exist at birth or even develop at the same time. If they did, there would be little that develops. Thus, it is essential when studying the development of the self that we first agree to the general principle that the term 'self' in and of itself imparts little meaning since it does not specify particular aspects of a self.
If investigators talk about the existence of a self at birth or even at 3 months (Gopnik and Meltzoff, 1994; Watson, 1994), they may mean something very different than what others might mean when they talk about the self as evolving in the middle of the second year of life (Lewis, 1992, 1995b; Lewis and Brooks-Gunn, 1979a). What I should like to do is to make a distinction between different aspects of the self and to argue that the aspect that we, adult humans, refer to as ourselves is, in fact, a rather unique aspect of self, one that we share with few other species (the exceptions being the great apes and, perhaps, porpoises and whales). This aspect of the self develops somewhere towards the middle of the second year of life (Lewis, 1994). It may grow out of other aspects of the self which appear earlier or it may have little connection to them-being related only as part of a developmental function of emerging skills associated with maturational processes. More important, however, is the need to make clear, in both our conceptions and our language, that the functions of this late maturing aspect of self not be assigned to earlier aspects of the self.
2.1 Features of Selves

Let me give you an example. In a study of infant learning and emotion, infants of 8 weeks of age are placed in a situation in which their arm pulls result in a reward: in this case a picture appears accompanied by some music for 2 seconds. Each experimental session included a 2-minute baseline during which we were able to determine the baseline or ongoing rate of arm movement. Infants then received a learning phase of contingent stimulation in which the audio-visual stimuli were activated by each arm pull. All infants learned the task within the first 3 minutes of the learning period. When learning was achieved, a 2-minute extinction phase occurred, followed by a second 3-minute learning phase. During extinction, no event was presented after an arm pull. Rates of arm pulling throughout the session were computed as the total number of arm pulls per minute. Facial movements were coded from videotapes of the infants using the Maximally Discriminative Facial Movement Coding System (MAX; Izard, 1979). Coders sampled the videotape segments of each subject using a frame-by-frame analysis of the videotape for each of three facial regions: brows, eyes, and mouth. After coding each component, facial expressions were identified by MAX formulas and their frequency tabulated for each minute of the session. We describe only two-joy and anger faces-which could be coded with over 90 per cent agreement between judges. Infants were assigned to the experimental and yoked-control conditions. The experimental subjects' arm pulls resulted in the event occurring whereas the control subjects received the same amount of the event as did the experimental subjects, but it was not related to their arm-pull behaviour. For them, there was no possibility of associating a cause and effect. Look first at the arm-pull data for each age group (see Fig. 13.1a). Notice that control subjects showed no change from the base period to the learning, extinction, and learning phases.
Not so for the experimental subjects: to begin with, the infants who could cause the event to go on significantly increased their arm-pull behaviour. Of particular interest are the subjects' responses once the association between arm pull and event ceased to work (extinction). Notice that when the arm pull no longer caused the event, arm-pulling behaviour significantly increased rather than declined over the period of disassociation. Once the extinction phase was over, the infants returned to the rate of arm pulling they showed during the first learning phase. These differences were all highly significant. Now let us turn to the emotional behaviour. There was little joy during the base phase and no change for the control subjects (see Fig. 13.1b). The subjects who learned showed increases in joy during the initial learning phase, a total decline during extinction, and renewed joy once the second learning phase began. Angry expressions follow a reverse pattern (see Fig. 13.1c). There is little anger during the base or during the initial learning
[Fig. 13.1. Data for the 2-month-olds across the phases of the session (B = baseline, L1 = first learning phase, EX = extinction, L2 = second learning phase), for the contingent (experimental) and control groups: (a) arm pulls per minute; (b) joy expressions; (c) anger expressions.]
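The session structure just described — a baseline, two learning phases, and an extinction phase, with arm pulls tallied per minute and a yoked control receiving the same stimulation uncorrelated with pulling — can be mocked up as follows. This is an illustrative sketch with invented numbers, not the authors' analysis code.

```python
# Illustrative mock-up of the session phases and per-minute rate computation
# (invented data, not the authors' code). Each phase is (label, duration in minutes).
phases = [("B", 2), ("L1", 3), ("EX", 2), ("L2", 3)]

def rates_per_minute(pull_counts):
    """pull_counts maps a phase label to the total arm pulls in that phase."""
    return {label: pull_counts[label] / minutes for label, minutes in phases}

# Contingent (experimental) infant: pulling rises in learning and rises
# further during extinction, then returns to the first-learning rate.
experimental = rates_per_minute({"B": 10, "L1": 45, "EX": 40, "L2": 44})
# Yoked control: same stimulation, unrelated to pulling, so a flat rate.
control = rates_per_minute({"B": 10, "L1": 16, "EX": 10, "L2": 15})

assert experimental["L1"] > experimental["B"]            # learning: rate increases
assert experimental["EX"] > experimental["L1"]           # extinction: rate increases further
assert abs(experimental["L2"] - experimental["L1"]) < 1  # return to first-learning rate
assert abs(control["EX"] - control["B"]) < 1             # control: no systematic change
```

The assertions encode the qualitative pattern the text reports for Fig. 13.1a: a rise with contingent stimulation, a further rise under extinction, recovery in the second learning phase, and a flat profile for the yoked controls.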