Synthese (2011) 183:283–311 DOI 10.1007/s11229-011-9898-4
Integrating psychology and neuroscience: functional analyses as mechanism sketches

Gualtiero Piccinini · Carl Craver
Received: 22 December 2010 / Accepted: 26 February 2011 / Published online: 11 March 2011 © Springer Science+Business Media B.V. 2011
Abstract We sketch a framework for building a unified science of cognition. This unification is achieved by showing how functional analyses of cognitive capacities can be integrated with the multilevel mechanistic explanations of neural systems. The core idea is that functional analyses are sketches of mechanisms, in which some structural aspects of a mechanistic explanation are omitted. Once the missing aspects are filled in, a functional analysis turns into a full-blown mechanistic explanation. By this process, functional analyses are seamlessly integrated with multilevel mechanistic explanations.

Keywords Mechanistic explanation · Functional analysis · Neuroscience · Psychology · Mechanism sketch

1 Integrating psychology and neuroscience via multi-level mechanistic explanation

When psychologists explain behavior, the explanations typically make reference to causes that precede the behavior and make a difference to whether and how it occurs. For instance, they explain that Anna ducked because she saw a looming ball. By contrast, when psychologists explain psychological capacities such as stereopsis or working memory, they typically do so by showing that these complex capacities are
G. Piccinini (B)
University of Missouri, St. Louis, MO, USA
e-mail: [email protected]

C. Craver
Washington University in St. Louis, St. Louis, MO, USA
e-mail: [email protected]
made up of more basic capacities organized together. In this paper, we focus exclusively on the latter sort of explanation, which is usually referred to as functional analysis. We argue that such decompositional, constitutive explanations gain their explanatory force by describing mechanisms (even approximately and with idealization) and, conversely, that they lack explanatory force to the extent that they fail to describe mechanisms.

In arguing for this point, we sketch a framework for building a unified science of cognition. This unification is achieved by showing how functional analyses of cognitive capacities can be and in some cases have been integrated with the multilevel mechanistic explanations of neural systems. The core idea is that functional analyses are sketches of mechanisms, in which some structural aspects of a mechanistic explanation are omitted. Once the missing aspects are filled in, a functional analysis turns into a full-blown mechanistic explanation. By this process, functional analyses are seamlessly integrated with multilevel mechanistic explanations.

The conclusion that functional analyses are mechanism sketches leads to a simple argument that psychological explanation is mechanistic. It is generally assumed that psychological explanation is functional—that it proceeds via the functional analysis of cognitive capacities (Fodor 1968a,b; Dennett 1978, Chaps. 5, 7; Cummins 1983, 2000; Block and Segal 1998). If psychological explanation is functional and functional analyses are mechanism sketches, then psychological explanations are mechanism sketches. Mechanism sketches are elliptical or incomplete mechanistic explanations. Therefore, psychological explanations are mechanistic.

This further conclusion conflicts with the way psychological explanation is traditionally understood. The received view is that functional analysis is autonomous and thus distinct from mechanistic explanation (Fodor 1965, 1968a, 1974; Cummins 1983, 2000).1 If psychological explanation is functional, neuroscientific explanation is mechanistic (Craver 2007), and functional analysis is distinct and autonomous from mechanistic explanation, then psychological explanation is distinct and autonomous from neuroscientific explanation. As we argue, though, functional and mechanistic explanations are not distinct and autonomous from one another precisely because functional analysis, properly constrained, is a kind of mechanistic explanation—an elliptical mechanistic explanation. Thus, a correct understanding of functional analysis undermines the influential claim that explanation in psychology is distinct and autonomous from explanation in neuroscience.

Our argument against the autonomy thesis is not an argument for reductionism, either as it has been classically conceived (as the derivation of one theory from another) or as it is now commonly conceived (as the idea that lower-level mechanisms are explanatorily privileged).2 Instead, our argument leads to a new understanding of how

1 E.g.: “[V]is-à-vis explanations of behavior, neurological theories specify mechanisms and psychological theories do not” (Fodor 1965, p. 177).

2 We do endorse reductionism in the sense that every concrete thing is made out of physical components and the organized activities of a system’s components explain the activities of the whole. Setting aside dualism and spooky versions of emergentism, we take these theses to be uncontroversial.
But our rejection of the autonomy of psychological explanation should not be confused with reductionism as usually understood, which entails the rejection of multiple realizability. Several authors argue that functional or psychological properties are not multiply realizable—or that if functional or psychological
psychology and neuroscience should be integrated—explanatory unification will be achieved through the integration of findings from different areas of neuroscience and psychology into a description of multilevel mechanisms. There are still psychologists who pursue explanations of cognition without concerning themselves with how nervous systems work and philosophers who question whether explanations that incorporate neuroscientific evidence (such as the localization of cognitive functions in the brain) are any better than explanations that ignore neuroscientific evidence. We disagree with each. Insofar as psychologists pursue constitutive explanations, they ought to acknowledge that psychological explanations describe aspects of the same multilevel neural mechanism that neuroscientists study. Thus, psychologists ought to let knowledge of neural mechanisms constrain their hypotheses just like neuroscientists ought to let knowledge of psychological functions constrain theirs.3
Footnote 2 continued properties are multiply realizable, then they are not natural kinds (Bechtel 2008; Bechtel and Mundale 1999; Bickle 2003, 2006; Churchland 2007; Couch 2005; Kim 1992; Polger 2004, 2009; Shagrir 1998; Shapiro 2000, 2004). Many of these authors conclude that psychological explanations either reduce to or ought to be replaced by neuroscientific explanations (Bickle 2003; Churchland 1989, 2007, 1986; Clark 1980; Kim 1992). In response, others have defended the multiple realizability of functional or psychological properties, usually in conjunction with a defense of the autonomy of psychology (Aizawa and Gillett 2009, 2011, forthcoming; Block 1997; Figdor 2010; Fodor 1997; Gold and Stoljar 1999). Our rejection of the autonomy thesis is orthogonal to traditional debates about whether psychological properties or functions are multiply realizable. Even if functional properties are multiply realizable, functional analysis is still a kind of mechanistic explanation; a fortiori, functional analysis is not autonomous from mechanistic explanation, and psychological explanation is not autonomous from neuroscientific explanation. In fact, there is a kind of multiple realizability—multiple functional decompositions of the same capacity—that militates against the autonomy of psychology. Autonomist psychology—the search for functional analysis without direct constraints from neural structures—usually goes hand in hand with the assumption that each psychological capacity has a unique functional decomposition (which in turn may have multiple realizers). But there is evidence that the same psychological capacity is fulfilled at different times by entirely different neural structures, or different configurations of neural structures, even within the same organism (Edelman and Gally 2001; Price and Friston 2002; Noppeney et al. 2004; Figdor 2010). Plausibly each such configuration of neural structures corresponds to a somewhat different functional decomposition. So several functional decompositions may all be correct across different species, different members of the same species, and even different time-slices of an individual organism. Yet the typical outcome of autonomist psychology is a single functional analysis of a given capacity. Even assuming for the sake of the argument that autonomist psychology stumbles on one among the correct functional analyses, autonomist psychology is bound to miss the other functional analyses that are also correct. The way around this problem is to let functional analysis be constrained by neural structures—that is, to abandon autonomist psychology in favor of integrating psychology and neuroscience. Thus, the multiplicity of functional decompositions—a kind of multiple realizability—does not support autonomy but rather undermines it. This being said, we will set multiple realizability aside. 3 The most immediate consequence of the present integrationist program is that theorists of cognition ought to learn how nervous systems work and use that information to constrain their investigations. Of course, many theorists already do this. Indeed, recent decades have witnessed an increasing integration of psychology and neuroscience roughly along the lines advocated here. Many psychologists have been moving away from less mechanistically constrained models towards models that take more and more neuroscientific constraints into account (e.g., Gazzaniga 2009; Kalat 2008; Kosslyn et al. 2006; O’Reilly and Munakata 2000; Posner 2004). 
Our goal is to express one rationale for that trend: there is no kind of constitutive explanation of psychological phenomena that is distinct from mechanistic explanation, properly conceived.
In the next section, we outline the received view that functional analysis is distinct and autonomous from mechanistic explanation. After that, Sect. 3 sketches the basic elements of mechanistic explanation, emphasizing that functional properties are an integral part of mechanisms. Sections 4–6 discuss the three main types of functional analysis, arguing that each is a sketch of a mechanism. Section 7 rounds out our argument by going over an example of how identifying a functional kind (neurotransmitter) within a system requires fitting it into a mechanism.

2 The received view: functional analysis as distinct and autonomous from mechanistic explanation

There is consensus that psychological capacities are explained functionally, that is, by means of what is often called functional analysis. Functional analysis is the analysis of a capacity in terms of the functional properties of a system and their organization. Three main types of functional analysis may be distinguished, depending on which functional properties they invoke. One type is task analysis: the decomposition of a capacity into subcapacities and their organization (Cummins 1975, 1983, 2000; cf. also Fodor 1968b and Dennett 1978). A second type is functional analysis by internal states: an account of how a capacity is produced in terms of a set of internal states and their mutual interaction (Fodor 1965, 1968a; Stich 1983; cf. also Putnam 1960, 1967a,b). A third type is boxology: the decomposition of a system into a set of functionally individuated components (black boxes), the processes they go through, and their organization (Fodor 1965, 1968a).

We focus on these three types of functional analysis because they figure prominently in the literature on psychological explanation. We address each kind of functional analysis separately and argue that each amounts to a mechanism sketch. If there are yet other types of functional analysis, our argument can be extended to cover them. For we argue that a complete constitutive explanation of a phenomenon in terms of functional properties must respect constraints imposed by the structures that possess those functional properties—that is, it requires fitting the functional properties within a mechanism.

The received view of the relationship between functional analysis and mechanistic explanation may be summarized as follows:

Distinctness: Functional analysis and mechanistic explanation are distinct kinds of explanation.

Autonomy: Functional analysis and mechanistic explanation are autonomous from one another.

One way to see that proponents of the received view endorse distinctness is that they often claim that a complete explanation of a capacity includes both a functional analysis and a matching mechanistic explanation.4 This presupposes that functional analyses are distinct from mechanistic explanation.

4 E.g.:
Explanation in psychology consists of a functional analysis and a mechanistic analysis: a phase one theory and a determination of which model of the theory the nervous system of the organism represents (Fodor 1965, p. 177).
Distinctness is defended in slightly different ways depending on which form of functional analysis is at issue. With respect to task analysis, distinctness has been defended as follows. Unlike mechanistic explanation, which attributes subcapacities to the components of a mechanism, task analysis need not attribute subcapacities to the components of the system because the subcapacities may all belong to the whole system.5 While talking about components simpliciter may be enough to distinguish between task analysis and mechanistic explanation, it is insufficient to distinguish between functional analysis in general and mechanistic explanation. This is because some analyses—specifically, functional analyses that appeal to black boxes—do appeal to components, although such components are supposed to be individuated purely functionally, by what they do. To avoid ambiguity, we use the term functional components for components as individuated by their functional properties and structural components for components as individuated by their structural properties. Paradigmatic structural properties include the size, shape, location, and orientation of extended objects. Anatomists tend to study structures in this sense, as do X-ray crystallographers. Paradigmatic functional properties include being a neurotransmitter, encoding an episodic memory, or generating shape from shading. Physiologists tend to study function. Nothing in our argument turns on there being a metaphysically fundamental divide between functional and structural properties; indeed, if we are right, one cannot characterize functions without committing oneself to structures and vice versa. Those who think of functional analysis in terms of internal states and boxology defend distinctness on the grounds that internal states or black boxes are individuated solely by their functional relations with each other as well as with inputs and outputs, and not by their structural properties. While mechanistic explanation appeals to the structural features of the components of the mechanism, functional analysis allegedly does not.6 Distinctness is a necessary condition for autonomy: if functional analysis is a kind of mechanistic explanation, as we argue, then functional analysis cannot be autonomous Footnote 4 continued Functional and mechanistic explanations must be matched to have a complete explanation of a capacity (Cummins 1983, p. 31). 5 Cf. Cummins:
Form-function correlation is certainly absent in many cases … and it is therefore important to keep functional analysis and componential analysis [i.e., mechanistic explanation] conceptually distinct. Componential analysis of computers, and probably brains, will typically yield components with capacities that do not figure in the analysis of capacities of the whole system (Cummins 1983, 2000, p. 125). 6 Cf. Fodor:
If I speak of a device as a “camshaft,” I am implicitly identifying it by reference to its physical structure, and so I am committed to the view that it exhibits a characteristic and specifiable decomposition into physical parts. But if I speak of the device as a “valve lifter,” I am identifying it by reference to its function and I therefore undertake no such commitment (Fodor 1968a, p. 113).
from mechanistic explanation. But distinctness is not sufficient for autonomy: two explanations may be distinct from each other and yet mutually dependent. Nevertheless, those who endorse distinctness typically endorse autonomy as well—in fact, we suspect that defending autonomy is the primary motivation for endorsing distinctness.7

What is autonomy (in the relevant sense)? There are many kinds of autonomy. One scientific enterprise may be called autonomous from another if the former can choose (i) which phenomena to explain, (ii) which observational and experimental techniques to use, (iii) which vocabulary to adopt, and (iv) the precise way in which evidence from the other field constrains its explanations. The term “autonomy” is sometimes used for one or more of the above (e.g., Aizawa and Gillett forthcoming argue that psychology is autonomous from neuroscience in sense (iv)). We have no issue with these forms of autonomy. Psychology may well be autonomous from neuroscience in these four ways.

In another sense, a scientific explanation may be said to be autonomous from another just in case the former refers to properties that are distinct from and irreducible to the properties referred to by the latter. This form of autonomy is sometimes claimed to obtain between psychology and neuroscience. For instance, psychological properties are sometimes claimed to be distinct from and irreducible to the neural properties that realize them. This latter form of autonomy thesis suggests that properties are stacked into levels of being. It is not clear how levels of being can be distinct from one another without being ontologically redundant (Kim 1992; Heil 2003). Doing justice to this topic would take us beyond the scope of this paper and is orthogonal to our argument. When we talk about mechanistic levels and levels of mechanistic explanation, we are officially neutral on whether mechanistic levels correspond to levels of being that are ontologically autonomous from one another.

In yet another sense, autonomy may be said to obtain between either laws or theories when they are irreducible to one another (cf. Fodor 1997, p. 149; Block 1997), regardless of whether such laws or theories refer to ontologically distinct levels of being. As we have pointed out before, we are not defending reductionism. Nevertheless, we reject this kind of autonomy. The dichotomy between reduction and autonomy is a false one. Neither psychology nor neuroscience discovers the kind of law or theory for which talk of reduction makes the most sense (cf. Cummins 2000). What they discover, we argue, are aspects of mechanisms to be combined in full-blown multilevel mechanistic explanations. Psychological explanations are not distinct from neuroscientific ones; each describes aspects of the same multilevel mechanisms. Therefore,

7 E.g.:
The conventional wisdom in the philosophy of mind is that psychological states are functional and the laws and theories that figure in psychological explanations are autonomous (Fodor 1997, p. 149). Why … should not the kind predicates of the special sciences cross-classify the physical natural kinds? (Fodor 1975, p. 25; see also Fodor 1997, pp. 161–162) We could be made of Swiss cheese and it wouldn’t matter (Putnam 1975, p. 291). It is worth noting that Fodor’s writings on psychological explanation from the 1960s were less sanguine about autonomy than his later writings, although he was already defending distinctness (cf. Aizawa and Gillett forthcoming).
we reject autonomy as irreducibility of laws or theories in favor not of reduction but of explanatory integration. A final kind of autonomy thesis maintains that two explanations are autonomous just in case there are no direct constraints between them. Specifically, some authors maintain that the functional analysis and the mechanistic explanation of one and the same phenomenon put no direct constraints on each other.8 While proponents of this kind of autonomy are not very explicit in what they mean by “direct constraints,” the following seems to capture their usage: a functional analysis directly constrains a mechanistic explanation if and only if the functional properties described by a functional analysis restrict the range of structural components and component organizations that might exhibit those capacities; a mechanistic explanation directly constrains a functional analysis if and only if the structural components and component organization described by the mechanistic explanation restrict the range of functional properties exhibited by those components thus organized. Of course, every participant in this debate agrees on one important (if obvious) indirect constraint: the mechanism postulated by a true mechanistic explanation must realize the functional system postulated by a true functional analysis.9 Aside from that, autonomists suggest that the two explanatory enterprises proceed independently of one another.10 As this “no-direct-constraints” kind of autonomy is generally interpreted, it entails, or at least strongly suggests, that those who are engaged in functional analysis need not know or pay attention to what mechanisms are present in the system; by the same token, those who are engaged in mechanistic explanation need not know or pay attention to how a system is functionally analyzed. Thus, according to this kind of 8 E.g.:
Phase one explanations [i.e., functional analyses by internal states] purport to account for behaviour in terms of internal states, but they give no information whatever about the mechanisms underlying these states (Fodor 1965, p. 173). [F]unctional analysis puts very indirect constraints on componential analysis (Cummins 1983, p. 29; 2000, p. 126). While these specific statements by Cummins and Fodor suggest the autonomy of mechanistic explanation from functional analysis rather than the converse, the rest of what they write makes clear that they also maintain that functional analysis is autonomous from mechanistic explanation. For an even stronger formulation of the “no constraint principle” that is pervasive in the literature on functional analysis, see Aizawa and Gillett forthcoming. 9 Cf. Cummins:
Ultimately, of course, a complete theory for a capacity must exhibit the details of the target capacity’s realization in the system (or system type) that has it. Functional analysis of a capacity must eventually terminate in dispositions whose realizations are explicable via analysis of the target system. Failing this, we have no reason to suppose we have analyzed the capacity as it is realized in that system (Cummins 1983, p. 31; Cummins 2000, p. 126). Although we along with every other participant in this debate assume that functional systems are realized by mechanisms, some dualists disagree; they maintain that a functional system may be a non-physical, non-mechanistically-implemented system. We disregard this possibility on the usual grounds of causal closure of the physical and lack of an adequate account of the interaction between physical and non-physical properties. In any case, dualism is not our present target. 10 Michael Strevens suggested to one of us in conversation that psychology and neuroscience may only
constrain each other in this indirect way.
autonomy, psychologists and neuroscientists need not pay attention to what the other group is doing—except that in the end, of course, their explanations ought to match. The assumption of autonomy as lack of direct constraints is appealing. It neatly divides the explanatory labor along traditional disciplinary lines and thus relieves members of each discipline of learning overly much about the other discipline. On one hand, psychologists are given the task of uncovering the functional organization of the mind without worrying about what neuroscientists do. On the other hand, neuroscientists are given the task of discovering neural mechanisms without having to think too hard about how the mind works. Everybody can do their job without getting in each other’s way. Someday, if everything goes right, the functional analyses discovered by psychologists will turn out to be realized by the neural mechanisms discovered by neuroscientists. And yet, according to this autonomy thesis, neither the functional properties nor the structures place direct constraints on one another. A number of philosophers have resisted autonomy understood as lack of direct constraints. Sometimes defenders of mechanistic explanation maintain that functional analysis—or “functional decomposition,” as Bechtel and Richardson call it—is just a step towards mechanistic explanation (Bechtel and Richardson 1993, pp. 89–90; Bechtel 2008, p. 136; Feest 2003). Furthermore, many argue that explanations at different mechanistic levels directly constrain one another (Bechtel and Mundale 1999; Churchland 1986; Craver 2007; Feest 2003, Keeley 2000; Shapiro 2004). We agree with both of these points. But they do not go deep enough. The same authors who question autonomy seem to underestimate the role that distinctness plays in defenses of autonomy. Sometimes proponents of mechanistic explanation even vaguely hint or assume that functional analysis is the same as mechanistic explanation (e.g., Bechtel 2008, p. 140; Feest 2003; Glennan 2005), but they don’t articulate and defend that thesis. So long as distinctness remains in place, defenders of autonomy have room to resist the mechanists’ objections. Autonomists may insist that functional analysis, properly so called, is autonomous from mechanistic explanation after all. By arguing that functional analysis is actually a kind of mechanistic explanation, we get closer to the bottom of this dialectic. Functional analysis cannot be autonomous from mechanistic explanation because the former is just an elliptical form of the latter. In the rest of this paper, we argue—along with others—that functional analysis and mechanistic explanation are not autonomous because they constrain each other; in addition, we argue that they can’t possibly be autonomous in this sense because functional analysis is just a kind of mechanistic explanation. Functional properties are an undetachable aspect of mechanistic explanations. Any given explanatory text might accentuate the functional properties at the expense of the structural properties, but this is a difference of emphasis rather than difference in kind. The target of the description in each case is a mechanism. In the next section, we introduce contemporary views of mechanistic explanation and show that mechanistic explanation is rich enough to incorporate the kinds of functional properties postulated by functional analysis. Then we argue that each kind of functional analysis is a mechanism sketch—an elliptical description of a mechanism.
3 Mechanistic explanation

Mechanistic explanation is the explanation of the capacities (functions, behaviors, activities) of a system as a whole in terms of some of its components, their properties and capacities (including their functions, behaviors, or activities), and the way they are organized together (Bechtel and Richardson 1993; Machamer et al. 2000; Glennan 2002). Components have both functional properties—their activities or manifestations of their causal powers, dispositions, or capacities—and structural properties—including their location, shape, orientation, and the organization of their sub-components. Both functional and structural properties of components are aspects of mechanistic explanation. Mechanistic explanation has also been called “system analysis,” “componential analysis” (Cummins 1983, pp. 28–29; 2000, p. 126), and “mechanistic analysis” (Fodor 1965).

Constructing a mechanistic explanation requires decomposing the capacities of the whole mechanism into subcapacities (Bechtel and Richardson 1993). This is similar to task analysis, except that the subcapacities are assigned to structural components of a mechanism. The term “structural” does not imply that the components involved are neatly spatially localizable, have only one function, are stable and unchanging, or lack complex or dynamic feedback relations with other components. Indeed, a structural component might be so distributed and diffuse as to defy tidy structural description, though it would no doubt have one if we had the time, knowledge, and patience to formulate it.

Mechanistic explanation relies on the identification of relevant components in the target mechanism (Craver 2007). Components are sometimes identified by their structural properties. For instance, some anatomical techniques are used to characterize different parts of the nervous system on the basis of the different kinds of neurons they contain and how such neurons are connected to one another. Brodmann decomposed the cortex into distinct structural regions by characterizing cyto-architectonic differences in different layers of the cortical parenchyma. Geneticists characterize the primary sequence of a gene. Such investigations are primarily directed at uncovering structural features rather than functional ones. But anatomy alone cannot yield a mechanistic explanation—mechanistic explanation requires identifying the functional properties of the components. For example, case studies of brain-damaged patients and functional magnetic resonance imaging are used to identify regions in the brain that contribute to the performance of some cognitive tasks. Such methods are crucial in part because they help to identify regions of the brain in which relevant structures for different cognitive functions might be found. They are also crucial to assigning functions to the different components of the mechanism. Each of these discoveries is a kind of progress in the search for neural mechanisms.

Functional properties are specified in terms of effects on some medium or component under certain conditions. Different structures and structure configurations have different functional properties. As a consequence, the presence of certain functional properties within a mechanism constrains the possible structures and configurations that might exhibit those properties. Likewise, the presence of certain structures and
configurations within a mechanism constrains the possible functions that might be exhibited by those structures and configurations (cf. Sporns et al. 2005). Mechanistic explanation is hierarchical in the sense that the functional properties (functions, capacities, activities) of components can also often be mechanistically explained. Each iteration in such a hierarchical decomposition adds another level of mechanisms to the hierarchy, with levels arranged in component/sub-component relationships and ultimately (if ever) bottoming out in components whose behavior has no mechanistic explanation. Descriptions of mechanisms—mechanism schemas (Machamer et al. 2000) or models (Glennan 2005; Craver 2006)—can be more or less complete. Incomplete models—with gaps, question-marks, filler-terms, or hand-waving boxes and arrows—are mechanism sketches. Mechanism sketches are incomplete because they leave out crucial details about how the mechanism works. Sometimes a sketch provides just the right amount of explanatory information for a given context (classroom, courtroom, lab meeting, etc.). Furthermore, sketches are often useful guides to the future development of a mechanistic explanation. Yet there remains a sense in which mechanism sketches are incomplete or elliptical. The common crux of mechanistic explanation, both in its current form and in forms stretching back through Descartes to Aristotle, is to reveal the causal structure of a system. Explanatory models are evaluated as good or bad to the extent that they capture, even dimly at times, aspects of that causal structure. Our argument is that the motivations guiding prominent accounts of functional analysis commit its defenders to precisely the same norms of explanation that mechanists embrace. One can embrace non-mechanistic forms of functional analysis only to the extent that one turns a blind eye to the very normative commitments that make functional analysis explanatory in the first place (for a different argument to this effect, see Kaplan and Craver forthcoming).
4 Task analysis

A task analysis breaks a capacity of a system into a set of sub-capacities and specifies how the sub-capacities are (or may be) organized to yield the capacity to be explained. Cummins calls the specification of the way sub-capacities are organized into capacities a “program” or “flow-chart” (Cummins 1975, 1983, 2000). The organization of capacities may also be specified by, say, a system of differential equations linking variables representing various sub-capacities. What matters is that one specifies how the various sub-capacities are combined or interact so that they give rise to the capacity of the system as a whole.

Cummins remains explicitly ambiguous about whether the analyzing sub-capacities are assigned to the whole mechanism or to its components. The reason is that at least in some cases, the functional analysis of a capacity “seems to put no constraints at all on … componential analysis [i.e., mechanistic explanation]” (1983, p. 30). We discuss the different types of functional analysis separately. First, “functional analyses” that assign sub-capacities to structural components are mechanistic explanations (Craver 2001). Second, functional analyses that assign sub-capacities to functional
components (black boxes) are boxological models (see Sect. 6). Finally, functional analyses that assign sub-capacities to the whole system rather than its components are task analyses. Cummins’ examples of capacities subject to task analysis are assembly line production, multiplication, and cooking (Cummins 1975, 1983, 2000). Task analysis is the topic of this section. Task analysis of a capacity is one step in its mechanistic explanation in the sense that it partitions the phenomenon to be explained into intelligible units or paragraphs of activity. For example, contemporary memory researchers partition semantic memory into encoding, storage, and retrieval processes, appeal to each of which will be required to explain performance on any memory task. Contra Cummins, the partition of the phenomenon places direct constraints on components, their functions, and their organization. For if a sub-capacity is a genuinely explanatory part of the whole capacity, as opposed to an arbitrary partition (a mere piece or temporal slice), it must be exhibited by specific components or specific configurations of components. In the systems with which psychologists and neuroscientists are concerned, the sub-capacities are not ontologically primitive; they belong to structures and their configurations. The systems have the capacities they have in virtue of their components and organization. Consider Cummins’s examples: stirring (a sub-capacity needed in cooking) is the manifestation of (parts of) the cook’s locomotive system coupled with an appropriate stirring tool as they are driven by a specific motor program; multiplying single-digit numbers (a sub-capacity needed in multiplying multiple digit numbers) is the manifestation of a memorized look-up table in cooperation with other parts of the cognitive system; and assembly line production requires different machines and operators with different skills at different stages of production. Likewise, the correctness of the abovementioned task analysis of memory into encoding, storage, and retrieval depends on whether there are components that encode, store, and retrieve memories. Task analysis is constrained, in turn, by the available components and modes of organization. If the study of brain mechanisms forces us to lump, split, eliminate or otherwise rethink any of these sub-capacities, the functional analysis of memory will have to change (cf. Craver 2004). If the cook lacks a stirring tool but still manages to combine ingredients thoroughly, we should expect that the mixture has been achieved by other means, such as shaking. If someone doesn’t remember the product of two single digit numbers, she may have to do successive sums to figure it out. The moral is that if the components predicted by a given task analysis are not there or are not functioning properly, you must rule out that task analysis in favor of another for the case in question (cf. Craver and Darden 2001). In summary, a task analysis is a mechanism sketch in which the capacity to be explained is articulated into sub-capacities, and most of the information about components is omitted. Nevertheless, the sub-capacities do place direct constraints on which components can engage in those capacities. For each sub-capacity, we expect a structure or configuration of structures that has that capacity. This guiding image underlies the very idea that these are explanations, that they reveal the causal structure of the system. 
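To make the “program” idiom concrete, here is a toy sketch of our own (it is not drawn from Cummins’s texts, and all names are illustrative): a task analysis of multi-digit multiplication written as an explicit program over sub-capacities—single-digit recall from a memorized table, shifting by place value, and summing partial products. Each named sub-capacity is exactly the sort of thing for which, on our view, one expects some component or configuration of components that exhibits it.

```python
# A toy task analysis of multi-digit multiplication in the "program" style:
# the capacity is decomposed into sub-capacities (single-digit recall,
# shifting, summing), and the program specifies how they are organized
# to yield the overall capacity.

# Sub-capacity 1: recall of single-digit products (a memorized look-up table).
TIMES_TABLE = {(a, b): a * b for a in range(10) for b in range(10)}

def recall_single_digit_product(a, b):
    """Look up a single-digit product rather than computing it afresh."""
    return TIMES_TABLE[(a, b)]

# Sub-capacity 2: shifting a partial product by a place value.
def shift(n, places):
    return n * (10 ** places)

# Sub-capacity 3: summing the partial products.
def accumulate(partials):
    return sum(partials)

# The "program": how the sub-capacities are organized into the capacity
# to multiply multi-digit numbers.
def multiply(x, y):
    x_digits = [int(d) for d in reversed(str(x))]
    y_digits = [int(d) for d in reversed(str(y))]
    partials = []
    for i, xd in enumerate(x_digits):
        for j, yd in enumerate(y_digits):
            partials.append(shift(recall_single_digit_product(xd, yd), i + j))
    return accumulate(partials)

assert multiply(37, 46) == 1702
```

Written this way, the analysis is silent about which structures carry out the look-up, the shifting, and the summing; but it explains the performance of a real system—human or artificial—only if some structures or configurations of structures in that system exhibit those sub-capacities.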
If the connection between analyzing tasks and components is severed completely, then there is no clear sense in which the analyzing sub-capacities are aspects of the actual causal structure of the system as opposed to arbitrary partitions of the system’s capacities or merely possible causal structures. Indeed, components
often call attention to themselves as components only once it is possible to see them as performing what can be taken as a unified function. So task analysis specifies sub-capacities to be explained mechanistically, places direct constraints on the kinds of components and modes of organization that might be employed in the mechanistic explanation, and is in turn constrained by the components, functional properties, and kinds of organization that are available.11 At this point, a defender of the distinctness and autonomy of functional analysis may object that although many task analyses are first-stage mechanistic theories as we maintain, the really interesting cases, including the ones most pertinent to cognitive phenomena, are not. The most influential putative example of such is the general purpose computer. General purpose computers can do an indefinite number of things depending on how they are programmed. Thus, one might think, a task analysis of a general purpose computer’s capacities places no direct constraints on its structural components and the structural components place no direct constraints on task analysis, because the components are the same regardless of which capacity a computer exhibits. The only constraint is indirect: the computer must be general purpose, so it must have general purpose components. By extension, if the same kind of autonomous task analysis applies to human behavior, then the type of task analysis with which the received view is concerned is not a mechanism sketch. This objection makes an important point but draws the wrong conclusion. True, general purpose computers are different from most other systems precisely because they can do so many things depending on how they are programmed. But general purpose computers are still mechanisms, and the explanation of their behavior is still mechanistic (Piccinini 2007, 2008). Furthermore, the task analysis of a general purpose computer does place direct constraints on its mechanistic explanation and vice versa; in fact, even the task analysis of a general purpose computer is just an elliptical mechanistic explanation. To begin with, the explanation of the capacity to execute programs is mechanistic: a processor executes instructions over data that are fed to it and returns results to other components (Fig. 1). In addition, the explanation of any given computer behavior is also mechanistic: the processor executes these particular instructions, and it is their execution that results in the behavior. Because the program execution feature is always the same, in many contexts it is appropriate to omit that part of the explanation. But the result is not a non-mechanistic task analysis; it is, again, an elliptical one. It is a task analysis in which most of the mechanism is left implicit—the only part that is made explicit is the executed program. 11 This account of task analysis is borne out by the historical evolution of task analysis techniques in psychology. Psychologists developed several techniques for analyzing complex behavioral tasks and determine how they can be accomplished efficiently. Over time, these techniques evolved from purely behavioral techniques, in which the sole purpose is to analyze a task or behavior into a series of operations, into cognitive task analysis, which also aims at capturing the agents’ cognitive states and their role in guiding their behaviors. To capture the role of cognitive states, agents ought to be decomposed into their components (Crandall et al. 2006, p. 98). 
The motivation for such a shift is precisely to capture more accurately the way agents solve problems. Focusing solely on sequences of operations has proved less effective than analyzing also the agents’ underlying cognitive systems, including their components. As the perspective we are defending would predict, task analysis in psychology has evolved from a technique of behavioral analysis, closer to task analysis as conceived by Cummins, towards a more mechanistic enterprise.
[Fig. 1 Functional organization of a general purpose digital computer. The figure depicts a computer receiving environmental input through input devices and producing environmental output through output devices, with a memory holding data and instructions and a processor (control plus datapath) exchanging instructions, commands, data, intermediate results, and results with the memory and the devices.]
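To make the mechanistic character of program execution concrete, here is a minimal toy sketch of our own (it is not a description of any real instruction set or architecture, and the instruction names are illustrative): the program is a stable state of a memory component, and the processor produces behavior by fetching and executing one instruction at a time from that memory.

```python
# A toy stored-program machine. The program is nothing over and above a
# stable state of the memory component; the processor's behavior is the
# mechanistic execution of whatever instructions that state encodes.

memory = {
    "program": [("load", 2), ("add", 3), ("output", None), ("halt", None)],
    "accumulator": 0,
    "program_counter": 0,
}

def step(memory, outputs):
    """Fetch one instruction from memory, execute it, and update the state."""
    op, arg = memory["program"][memory["program_counter"]]
    memory["program_counter"] += 1
    if op == "load":
        memory["accumulator"] = arg
    elif op == "add":
        memory["accumulator"] += arg
    elif op == "output":
        outputs.append(memory["accumulator"])
    elif op == "halt":
        return False
    return True

def run(memory):
    outputs = []
    while step(memory, outputs):
        pass
    return outputs

print(run(memory))  # [5] -- the behavior depends on the stored program
```

Describing only the stored program (load 2, add 3, output) is an elliptical description of this mechanism: it makes the executed program explicit while leaving the fetch-and-execute machinery implicit.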
Notice that for this type of explanation to be appropriate for human beings, they must turn out to contain general purpose computers; thus, they must turn out to contain processors capable of executing the right kind of instructions plus memory components to store data, instructions, and results. Whether human brains contain this kind of mechanism is an empirical question, and it can only be resolved by investigating whether brains have this kind of organization. If the analogy between brains and general purpose computers is more than just a metaphor, it has to make substantive claims about brain mechanisms. Those claims might turn out to be false, in which case we need to revise our understanding of brains and how they work. A defender of the received view may reply as follows. Granted, whether a system is a general purpose computer is a matter of which mechanisms it contains. But if we can assume that a system is a general purpose computer, then we can give task analyses of its behavior that say nothing about its mechanisms. For when we describe the program executed by the computer, we are not describing any components or aspects of the components.12 We are abstracting away from the mechanisms. This objection misconstrues the role programs play in the explanation of computer behavior. There are two cases. First, if a computer is hardwired to perform a certain computation, the computer’s behavior may still be described by a flow-chart or program.13 In the case of hardwired computations, it is true that a program is not a component of the computer. But it is false that the program description is independent of the description of the machine’s components. In fact, the program describes precisely the activity of the special purpose circuit that is hardwired to generate the relevant computation. A circuit is a structural component. Therefore, a task analysis of a computer that is hardwired to perform a computation is a precise description of a specific structural component. Incidentally, an analogous point applies to connectionist systems that come to implement a task analysis through the adjustment of weights between nodes. 12 Cf. Cummins on programs: “programs aren’t causes but abstract objects or play-by-play accounts” (Cummins 1983, p. 34). 13 In these cases, we find it misleading to say, as Cummins and others do, that the computer is executing the program, because the program is not an entity in its own right that plays a causal role. The same point applies to computers that are programmed by rearranging the connections between their components.
The second case involves those computers that produce a behavior by executing a program—that is, by being caused to produce the behavior by the presence of the program within them. Here, the program itself is a stable state of one or more computer components. Programs are typically stored in memory components and sent to processors one instruction at a time. If they weren’t physically present within computers, digital computers could not execute them and would not generate the relevant behaviors. Thus programs are a necessary feature of the mechanistic explanation of computers’ behaviors; any task analysis of a computer in terms of its program is an elliptical mechanistic explanation that provides only the program and elides the rest of the mechanism (including, typically, the subsidiary programs that translate high level code into machine executable code). A final objection might be that some computational models focus on the flow of information through a system rather than the mechanisms that process the information (cf. Shagrir 2006, 2010). In such cases, nothing is added to the explanation by fleshing out the details of how the information is represented and processed. Certainly, many computational explanations in psychology and neuroscience have this form. Our point, however, is that such descriptions of a system place direct constraints on any structures that can possibly process such information—on how the different states of the system can be constructed, combined, and manipulated—and are in turn constrained by the structures to be found in the system. It is, after all, an empirical matter whether the brain has structural components that satisfy a given informational description, that is, whether the neuronal structures in question can sustain the information processing that the model posits (under ecologically and physiologically relevant conditions). If they cannot, then the model is a how-possibly model and does not describe how the system actually works. Computational or informational explanations are still tethered to structural facts about the implementing system. Once we know the information processing task, we might think that details about how the information is encoded and manipulated are no longer relevant, and in some explanatory contexts this is true, but details about how the information is encoded and manipulated are, in fact, essential to confirm our hypotheses about information processing. Our reason for thinking that task analysis boils down to (elliptical) mechanistic explanation might be summarized as follows. Either our opponent accepts that there is a unique correct task analysis for a given system capacity at a given time or she does not. If explanations must be true, the idea of a unique correct task analysis for a given system capacity amounts to the idea that there are units of activity between inputs and outputs that transform the relevant inputs (and internal states) into outputs. In real psychobiological systems, flesh and blood lie between the input and the output, and it has been organized through evolution and development to exhibit that behavior. Whether the flesh and blood satisfies the given task analysis depends on whether the flesh and blood contains units of structure that complete each of the tasks. 
If no such units of structure can be identified in the system, then the task analysis cannot be correct.14 14 Lest there be some confusion on this point, we emphasize again that components need not be neatly localizable, visible, or spatially contained within a well defined area. Any robustly detectable configuration of structural properties might count as a component.
Our imagined opponent might thus prefer to deny that there is a unique correct task analysis for a given system behavior. Cummins sometimes seems to embrace this view (see Cummins 1983, p. 43). Perhaps our opponent maintains that the beauty of functional analysis is that it allows us to rise above the gory details, details that often vary from species to species, individual to individual, and time to time, and thereby to “capture” features of the causal structure of the system that are invisible when we focus on the microstructural details. She might think that any model that describes correctly and compactly the behavior of the system is equally explanatory of the system’s behavior. Or she might think that higher-level descriptions can render the phenomenon intelligible even when they do not have any tidy echo in microstructural details.

We agree that models can be predictively adequate and intellectually satisfying even when they fail to describe mechanisms. Our opponent, however, maintains that predictive adequacy and/or intelligibility are enough for explanation. But this view of explanation is hard to defend. In fact, most philosophers of science agree that neither predictive adequacy nor intelligibility is sufficient for explanation. One can predict an empty gas tank without knowing how it came to be empty. And anyone who has ever misunderstood how something works is familiar with the idea that intelligible explanations can be terrible explanations. Some explanations make phenomena intelligible, but not all intelligible models are explanatory.

In short, to give up on the idea that there is a uniquely correct explanation, and to allow that any predictively adequate and/or intelligible model is explanatory, is essentially to give up on the idea that there is something distinctive about explanatory knowledge (see Kaplan and Craver forthcoming). No advocate of functional analysis should want to give that up. Either there is a unique correct task analysis, in which case the underlying mechanism must have components corresponding to the sub-tasks in the analysis, or there is not a unique correct task analysis, in which case task analysis has been severed from the actual causal structure of the system and does not count as explanation. In short, either task analysis is an elliptical form of mechanistic explanation or it is no explanation at all.

5 Functional analysis by internal states

The capacities of a system, especially cognitive capacities, are sometimes said to be explained by the system’s internal states (and internal processes, defined as changes of internal states). Common examples of internal states include propositional attitudes, such as beliefs and desires, and sensations, such as pain. In an important strand of the philosophy of psychology, internal states are held to be functional states, namely, states that are individuated by their relations to inputs, outputs, and other internal states (Putnam 1960, 1967a,b; Fodor 1965, 1968a; Block and Fodor 1972). Within this tradition, the functional relations between inputs, outputs, and internal states are said to constitute the functional organization of the system, on which the functional analysis of the capacities of the system is based.15

15 Although this functionalist account is often said to be neutral between physicalism and dualism,
this is an oversimplification. The internal states interact with the inputs and outputs of the system,
This account of mental states as internal functional states has been challenged. But our concern here is not whether the account is adequate. Our concern is, rather, how functional analysis by internal functional states relates to mechanistic explanation.16 In order to assess functional analysis by internal states, it is necessary to ask the further question: In what sense are such functional states internal? The notion of state may be taken as primitive or analyzed as the possession of a property at a time. Either way, in general, there is no obvious sense in which a state of a system, per se, is internal to the system. For example, consider the distinction between the solid, liquid, and gaseous states of substances such as water. There is no interesting sense in which the state of being liquid, solid, or gaseous is internal to samples of water or any other substance. Of course, the constitutive explanation for why, at certain temperatures, water is solid, liquid, or gaseous involves the components of water (H2 O molecules) and their temperature-related state, which, together with other properties of the molecules (such as their shape), generates the molecular organization that constitutes the relevant global state of water samples. In this explanation we find a useful notion of internal state because individual water molecules are contained within the admittedly imprecise spatial boundaries of populations of water molecules. The moral is this: in general, a system’s states (simpliciter) are not internal in any interesting sense; they are global, system-level states. But there is something interestingly internal to the state of the components of a system, which play a role in explaining the global state of a system (as well as its behavior). This is because the components are inside the system. Are the internal states invoked in functional analysis system-level states or states of components? The qualifier “internal” suggests the latter.17 Here is why the internal states postulated by a functional analysis must be states of the system’s components. Functional analysis by internal states postulates a system of multiple states. Such states are capable of interacting with inputs, outputs, and each other, in order to produce novel states and outputs from previous states and inputs. Notice three features of these systems. First, it must be possible for many states to occur at the same time. Second, inputs and outputs enter and exit through specific components of the system. Third, inputs and outputs are complex configurations of Footnote 15 continued which are physical. Thus, if the internal states were non-physical, they must still be able to interact with the physical inputs and outputs. This violates the causal closure of the physical and requires an account of the interaction between physical and non-physical states. So, while a version of functionalism may be defined to be compatible with dualism, any metaphysically respectable version of functionalism should be physicalistic. 16 Notice that insofar as we are dealing with explanations of cognitive capacities, we are focusing on states that are internal to the cognitive system, whether or not the system’s boundaries coincide with the body or nervous system of the organism. 17 In any case, etymology does support our conclusion. Putnam imported the notion of an internal state into the philosophy of psychology as part of his (1960) analogy between mental states and Turing machine states. 
Turing described digital computers as having “internal states” (Turing 1950) that belong to the read-write head, which is the active component of Turing machines. In the case of digital computers, internal states are states of an internal component of the computer, such as the memory or processor. Initially, Putnam did not call states “internal” but “logical” (1960, 1967a), and then he called them “functional” states (1967b). But Fodor called them “internal” states (1965; cf. also 1968a; Block and Fodor 1972, and Piccinini 2004). Via Putnam and Fodor, Turing’s phrase “internal state” has become the established way to talk about functional states. As we have seen, it originally referred to the states of a component (of a computing machine).
different physical media, such as light waves, sound waves, chemical substances, and bodily motions. The only known way to construct a system of states that can occur at the same time and mediate between such inputs and outputs is to transduce the different kinds of input into a common medium, different configurations of which are the different states (e.g., configurations of letters from the same alphabet or patterns of activation of mutually connected neurons). For different configurations of the same medium to exist at the same time and yet be active independently of one another, they must be possessed by different components.18 The agents in a causal interaction have to be distinct from one another. Furthermore, such components must be able to create the relevant configurations of the medium, distinguish between them, and interact with each other so as to produce the relevant subsequent states and outputs from the previous states and inputs. Thus, functional analysis by internal states requires that the states belong to some of the system’s components and constrains the properties of the components. A system can surely possess different global states at the same time, e.g., a color, a speed, and a temperature. Such global states can affect each other as well as the behavior of the system. For instance, a system’s color influences heat absorption, which affects temperature, which in turn affects heat dissipation. But global states can only influence global variables—they cannot mediate between complex configurations of different physical media coming through different sensory systems and generate the specialized configurations of outputs coming from a motor system. Heat absorption and dissipation are not complex configurations of a physical medium—they are global variables themselves. Global variables such as color and temperature can affect other global variables such as heat absorption and dissipation—they cannot transform, say, a specific pattern of retinal stimulation into a specific pattern of muscle contractions. Or at any rate, no one has ever begun to show that they can. One might insist, however, that functional analysis in terms of functional states makes no reference (directly or indirectly) to components, and so need not be a mechanism sketch. The goal of an explanation is to capture in a model how a system behaves; models need not describe components in order to capture how a system behaves; models need not describe components in order to explain. The problem with this view, as with the analogous conception of task analysis above, is that it confuses explaining with modeling. A resounding lesson of 50 years of sustained discussion of the nature of scientific explanation is that not all phenomenally and predictively adequate models are explanations. One can construct models that predict phenomena on the basis of their correlations (as barometers predict but do not explain storms), regular temporal successions (national anthems precede but do not explain kickoffs), and effects (as fevers predict but do not explain infections). Furthermore, there is a fundamental distinction between redescribing a phenomenon (even in law-like statements) and explaining the phenomenon. Snell’s law predicts how light will bend as it passes from one medium to another, but it does not explain 18 In some neural networks, a pattern of neural activation may be taken to store multiple superposed representations. 
Notice how in this case the different representations cannot operate independently of one another.
why light bends as it does. One might explain that the light bent because it passed from one medium to another, of course. But that is an etiological explanation of some light-bending events, not a constitutive explanation of why light bends when it passes between different media. Finally, one can build predictively adequate models that contain arbitrarily large amounts of superfluous (i.e., nonexplanatory) detail. Explanations are framed by considerations of explanatory relevance. If functional analysis by internal states is watered down to the point that it no longer makes any commitments to the behavior of components, then it is no longer possible to distinguish explanations from merely predictively adequate models and phenomenal descriptions of the system’s behavior. Nor does such a modeling strategy tell us how to eliminate irrelevant information from our explanations—a crucial explanatory endeavor. In short, “explanation” of this sort is not worthy of the name. In conclusion, “internal” states either are not really internal, in which case they constitute a system-level explanandum for a mechanistic explanation, or they are internal in the sense of being states of components. As we have seen, there are two ways to think of components. On one hand are structural components. In this case, functional analysis by internal states is a promissory note on (a sketch of) a mechanistic explanation. The analysis postulates states of some structural components, to be identified by a full-blown mechanistic explanation.19 On the other hand, components may also be functionally individuated components or black boxes. (For instance, the read-write head of Turing machines is a paradigmatic example of a black box.) When components are construed as black boxes, functional analysis by internal states becomes boxology, to which we now turn our attention.
6 Boxology

Black boxes are components individuated by the outputs they produce under certain input conditions. In this sense, they are functionally individuated components. In another important strand of the philosophy of psychology, the capacities of a system are said to be functionally explained by appropriately connected black boxes. For example, Fodor distinguishes the functional identification of components from their structural identification (Fodor 1965, 1968a; for a similar distinction, see also Harman 1988, 235).20

19 As we have seen, Fodor says that functional analyses give no information about the mechanism underlying these states (1965, p. 177), but at least, they entail that there are components capable of bearing those states and capable of affecting each other so as to generate the relevant changes of states:
[I]t is sufficient to disconfirm a functional account of the behaviour of an organism to show that its nervous system is incapable of assuming states manifesting the functional characteristics that account requires… it is clearly good strategy for the psychologist to construct such theories in awareness of the best estimates of what the neurological facts are likely to be (Fodor 1965, p. 176). Much of what Fodor says in his early works on functional analysis is in line with our present argument and goes against the autonomy assumption that Fodor later defended. For simplicity, in the main text we are assimilating Fodor’s early (anti-autonomy) writings to his later (pro-autonomy) writings. In any event, even Fodor’s early writings fall short of pointing out that functional analyses are mechanism sketches.
Black boxes are explicitly internal—spatially contained within the system. Proponents of boxology appear to believe that capacities can be satisfactorily explained in terms of black boxes, without identifying the structural components that implement the black boxes. What proponents of this type of functional analysis fail to notice is that functional and structural properties of components are interdependent: both are necessary, mutually constraining aspects of a mechanistic explanation. On one hand, the functional properties of a black box constrain the range of structural components that can exhibit those functional properties. On the other hand, a set of structural components can only exhibit certain functional properties and not others.

Consider Fodor’s example of the camshaft. Internal combustion engines, Fodor reminds us, contain valves that let fuel into the cylinders. For fuel to be let in, the valves need to be lifted, and for valves to be lifted, there must be something that lifts the valves. So here is a job description—valve lifting—that can be used to specify what a component of an engine must do for the engine to function. It is a functional description, not a structural one, because it says nothing about the structural properties of the components that fulfill that function, or how they manage to fulfill it. There may be indefinitely many ways to lift valves; as long as something does, it qualifies as a valve lifter. (Hence, multiple realizability.) What are the components that normally function as valve lifters in internal combustion engines? Camshafts. This is now a structural description, referring to a kind of component individuated by its shape and other structural properties. So there are two independent kinds of description, Fodor concludes, in terms of which the capacities of a system can be explained. Some descriptions are functional and some are structural. Since Fodor maintains that these descriptions are independent, there is a kind of explanation—boxology—that is autonomous from structural descriptions. Functional descriptions belong in boxological models, whereas structural descriptions belong in mechanistic explanations. Or so Fodor maintains.

But is it really so? In the actual “functional analysis” of a mechanism, such as an engine, the functions are specified in terms of physical effects either on a physical medium or on other components, both of which are structurally individuated. The functional description “valve lifter” contains two terms. The first, “valve,” refers to a kind of component, whereas the second, “lifter,” refers to a capacity. Neither of these appears to be independent of structural considerations. Lifting is a physical activity: for x to lift y, x must exert an appropriate physical force on y in the relevant direction. The notion of valve is both functional and structural. In this context, the relevant sense is at least partially structural, for nothing could be a valve in the sense relevant to valve lifting unless it had weight that needs to be lifted for it to act as a valve. As a consequence, the “valve lifter” job description puts three mechanistic constraints on explanation: first, there must be valves (a type of structural component) to be lifted; second, lifting (a type of structurally individuated capacity) must be exerted on the valves; and third, there must be valve lifters (another type of component) to do the lifting.
For something to be a valve lifter in the relevant respect, it must be able to exert an appropriate physical force on a component with certain structural characteristics in the relevant direction.

20 E.g.: “In functional analysis, one asks about a part of a mechanism what role it plays in the activities characteristic of the mechanism as a whole” (Fodor 1965, p. 177).
This is not to say that only camshafts can act as valve lifters. Multiple realizability stands. But it is to say that all valve lifters suitable to be used in an internal combustion engine share certain structural properties with camshafts. This point generalizes. There is no such thing as a purely functional analysis of the capacity of an engine to generate motive power. Any attempt to specify the nature of a component purely functionally, in terms of what it is for, runs into the fact that the component’s function is to interact with other components to exhibit certain physical capacities, and the specification of the other components and activities is inevitably infected by structural considerations. Here is why. The inputs (flammable fuel, igniting sparks) and the outputs (motive power, exhaust) of an internal combustion engine are concrete physical media that are structurally individuated. Anything that turns structurally individuated inputs into structurally individuated outputs must possess appropriate physical causal powers—powers that turn those inputs into those outputs. (This, by the way, does not erase multiple realizability: there are still many ways to build an internal combustion engine.) What about boxological models in psychology and neuroscience? A boxologist committed to autonomy may suggest that while our assimilation of boxological models to mechanism sketches is viable in most domains, including internal combustion engines, it does not apply to computing systems. In this special domain, our boxologist continues, the inputs and outputs can be specified independently of their physical implementation, and black boxes need not even correspond to concrete components. In particular, Marr (1982) is often interpreted as arguing that there are three autonomous levels of explanation in cognitive science: a computational level, an algorithmic level, and an implementational level. According to Marr, the computational level describes the computational task, the algorithmic level describes the representations and representational manipulations by which the task is solved, and the implementational level describes the mechanism that carries out the algorithm. Marr’s computational and algorithmic levels may be seen as describing black boxes independently of their implementation. Because the functional properties of black boxes are specified in terms of their inputs and outputs (plus the algorithm, perhaps), the black boxes can be specified independently of their physical implementation. Thus, at least in the case of computing systems, which presumably include cognitive systems, boxology does not reduce to mechanistic explanation. This reply draws an incorrect conclusion from two correct observations. The correct observations are that black boxes may not correspond one-to-one to structural components and that the inputs and outputs (and algorithms) of a computing system can be specified independently of the physical medium in which the inputs and outputs are implemented. As a result, the same computations defined over the same computational vehicles can be implemented in various—mechanical, electro-mechanical, electronic, etc.—physical media. Indeed, some physical properties of the media are irrelevant to whether they implement a certain computation or another. But it doesn’t follow that computational inputs and outputs put no direct constraints on their physical implementation. 
In fact, any physical medium that implements a certain computation must possess appropriate physical degrees of freedom that result in the differentiation between the relevant computational vehicles. Furthermore, any component that processes computational vehicles must be able to reliably discriminate
between tokens of the relevant types so as to process them correctly. Finally, any components that implement a particular algorithm must exhibit the relevant kinds of operations in the appropriate sequence. The operations in question are different depending on how black boxes map onto structural components.

Consider a staple of functionalist philosophy of psychology: belief and desire boxes. Anyone who approves of such talk (unlike at least one of us) knows that belief and desire boxes need not be two separate memory components. True, but this doesn’t entail lack of direct constraints between functional properties and structural components. For the alternative means of implementing belief and desire boxes is to store beliefs and desires in one and the same memory component, while setting up the memory and processor(s) so that they can keep track of which representations are beliefs and which are desires. This may be done by adding an attitude-relative index to the representations or by keeping lists of memory registers. However it is done, the distinction between beliefs and desires constrains the mechanism: the mechanism must distinguish between the two types of representation and process them accordingly (if the organism is to exhibit relevant behavior). And the mechanism constrains the functional analysis: the representational format and algorithm will vary depending on how the distinction between beliefs and desires is implemented in the mechanism, including whether each type of representation has a dedicated memory component. Thus, even though the decomposition into black boxes may not correspond one-to-one to the decomposition into concrete components, it still constrains the properties of concrete components.

The moral is that computing systems are indeed different from most other functionally organized systems, in that their computational behavior can be mechanistically explained without specifying the physical medium of implementation other than by specifying which degrees of freedom it must possess. But such an explanation is still mechanistic: it specifies the type of vehicle being processed (digital, analog, or what have you) as well as the structural components that do the processing, their organization, and the functions they compute. So computational explanations are mechanistic too (Piccinini 2007; Piccinini and Scarantino 2011).

What about Marr’s computational and algorithmic “levels”? We should not be misled by Marr’s terminological choices. His “levels” are not levels of mechanism because they do not describe component/sub-component relations. (The algorithm is not a component of the computation, and the implementation is not a component of the algorithm.) The computational and algorithmic levels are mechanism sketches. The computational level is a description of the mechanism’s task, possibly including a task analysis, whereas the algorithmic level is a description of the computational vehicles and processes that manipulate the vehicles. All of the above—task, vehicles, and computational processes—constrain the range of components that can be in play and are constrained in turn by the available components. Contrary to the autonomist interpretation of Marr, his levels are just different aspects of the same mechanistic explanation.21

21 Marr recognized as much at least implicitly, judging from the way that he intermingled considerations from different levels in his theory of visual shape detection.
For example, facts about the response properties of retinal ganglion cells influenced his understanding of what the visual system does and the tools that it has to work with in order to do it.
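Returning to the belief and desire boxes discussed above, the following toy sketch is purely illustrative and is not drawn from any of the works cited here; the class names, the attitude tag, and the behavior rule are invented assumptions. It shows the two implementation options side by side: either way, the mechanism must keep beliefs and desires distinct and process them differently, which is the constraint at issue.

# Option 1: two dedicated stores, one "box" per attitude.
class TwoBoxStore:
    def __init__(self):
        self.beliefs, self.desires = [], []

    def add(self, attitude, content):
        (self.beliefs if attitude == "belief" else self.desires).append(content)

# Option 2: one memory component plus an attitude-relative index on each representation.
class TaggedStore:
    def __init__(self):
        self.memory = []                         # a single memory component

    def add(self, attitude, content):
        self.memory.append((attitude, content))  # the tag keeps the attitudes distinct

    def beliefs(self):
        return [c for a, c in self.memory if a == "belief"]

    def desires(self):
        return [c for a, c in self.memory if a == "desire"]

# A crude processing rule that must treat the two attitudes differently:
# act on a desire only if no stored belief marks it as already satisfied.
def actions(desires, beliefs):
    return [d for d in desires if ("satisfied: " + d) not in beliefs]

store = TaggedStore()
store.add("desire", "drink water")
store.add("belief", "satisfied: drink water")
print(actions(store.desires(), store.beliefs()))   # prints []: the belief screens off the desire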
Table 1 Six traditional criteria for identifying a neurotransmitter
1. The chemical must be present in the presynaptic terminal
2. The chemical must be released by the presynaptic terminal in amounts sufficient to exert its supposed action on the post-synaptic neuron (or organ). Release should be dependent upon inward calcium current and the degree of depolarization of the axon terminal during the action potential
3. Exogenous application of the chemical substance in concentrations reasonably close to those found endogenously must mimic exactly the effect of endogenously released neurotransmitter
4. The chemical must be synthesized in the presynaptic cell
5. There must be some means of removing the chemical from the site of action (the synaptic cleft)
6. The effects of the putative neurotransmitter should be mimicked by known pharmacological agonists and should be blocked by known antagonists for that neurotransmitter
So black boxes are placeholders for structural components (with arrows indicating input–output relations between components) or sub-capacities (with arrows indicating causal relations between processes) in a mechanism. Boxology is not distinct from mechanistic explanation. Rather it is a first step toward the decomposition of a system into its structural components. Computational and information processing explanations often work by abstracting away from many of the implementing details. But that’s how mechanistic explanation generally works; it focuses on the mechanistic level most relevant to explaining a behavior while abstracting away from the mechanistic levels below. Whether a system implements a given computation still depends on its structural features.
7 An example of functional analysis

“Neurotransmitter” is a functional term, and it is a well-defined one. When someone asks whether a molecule acts as a neurotransmitter at a synapse, there is no ambiguity as to what question is being asked or what kind of evidence would be required to answer it. Table 1 lists six criteria—repeated in most if not all standard neuroscience texts (see, e.g., Kandel et al. 2000, pp. 280–281; Shepherd 1994, p. 160; Cooper et al. 1996, p. 4)—for establishing that a molecule is a neurotransmitter. Although each of these criteria is violated for some known neurotransmitters (especially amino acid transmitters like glutamate), they are nonetheless prototypical. These criteria are clearly designed to show that the putative neurotransmitter is organized within the mechanisms of chemical neurotransmission shown in Fig. 2. We will see that attributing a function amounts to showing how some item fits into a mechanism. In order to understand how criteria 1–6 achieve that objective, we need first to review some of the basic mechanisms of chemical neurotransmission.

Neurons communicate with one another by passing neurotransmitters across synapses, gaps between a pre-synaptic axon (shown at the top of Fig. 2) and a post-synaptic neuron (shown at the bottom of Fig. 2). The model for a standard form of chemical neurotransmission begins with the depolarization of the axon terminal following the arrival of an action potential (1). When the axon depolarizes, voltage-sensitive calcium (Ca2+) channels open (2), allowing Ca2+ to diffuse into the cell. This sudden rise in intracellular calcium concentrations activates Ca2+/calmodulin protein kinase, which
can then prime the vesicles containing neurotransmitter (4). This neurotransmitter may have been synthesized (A) in the cell body and transported (B) to the axon terminal, or it may have been synthesized and stored in the terminal itself (C). Once primed, vesicles can then dock to the axon wall (5) in such a way that the vesicles can fuse (6) to the membrane and spill their transmitters into the synaptic cleft. These transmitters diffuse (7) across the synapse, where they bind to post-synaptic receptors, initiating either chemical or electrical effects in the post-synaptic cell.

Fig. 2 The mechanisms of chemical neurotransmission

This cartoon of chemical neurotransmission is a mechanistic model, involving entities (ions, neurotransmitters, vesicles, membranes) and activities (depolarizing, diffusing, priming, docking, fusing) organized together so that they do something—in this case, reliably preserve a signal across the space between cells. Identifying something as a neurotransmitter involves showing that it is organized within this set of mechanisms. Each of the criteria in Table 1 is designed to show that the putative neurotransmitter is organized—spatially, temporally, actively, and quantitatively—into the mechanisms of chemical transmission. Spatially, the transmitter has to be located in the presynaptic neuron and contained within vesicles. Temporally, if the synapse has to convey successive signals carried by action potentials that arrive only hundreds of milliseconds apart, there needs to be some means for removing the transmitter from the
cleft after each episode of release. The transmitter has to be actively integrated within the mechanisms of synthesis of the presynaptic cell (again this is sometimes violated) and with the receptor mechanisms of the post-synaptic cell (namely, by showing that the post-synaptic cell responds to the presence of the molecule or by showing that the putative post-synaptic effect of the chemical substance can be mimicked by applying pharmacological relatives of the transmitter and should be blocked by pharmacological antagonists for the receptors). And quantitatively the molecule’s release must be correlated with activation of the pre-synaptic cell, and it must be released in sufficient quantity to affect the post-synaptic cell. Some of these criteria (such as localization, vesicular packaging, and concentration) make direct appeal to structural features of the synapse. Others involve existential claims about inactivating molecules at the same location and causal claims about inhibitors and agonists. None of these features is independent of the causal details concerning the rest of the mechanism. The example thus illustrates how the functional language at lower-levels of neuroscience involves implicit commitments to structural facts about the mechanism in question. The same, we claim, extends all the way to the highest levels of the neuroscience hierarchy. Functional analysis borrows its explanatory legitimacy from the idea that functional explanations (functional explanatory texts) capture something of the causal structure of a system. It has been understood as a doctrine according to which mental states (and other “high-level properties”) are ultimately to be analyzed in terms of the commerce between them (as well as inputs and outputs). Haugeland (1998) likens the situation to recent changes in the structure of departments within a company. As communication technologies continue to expand, physical locations have become much less significant than are facts about how the lines of communication flow: of who reports to whom, of what tasks are divided among the nodes in this network, of the sorts of communication passed along the different interfaces. There are, to be sure, facts about the locations, structures, etc. of these components. But those facts turn out not to be so important if you want to understand how the department works. Is Larry’s office next to Bob’s, or is it in Saigon? It just doesn’t matter so long as their communication can happen in the time required. Do they communicate by speaking, by telephone, by email, by instant messaging? Again, that just doesn’t matter. Or so we might think. Given that psychological systems are in fact implemented in biological systems, and that such systems are more or less precisely replicated through reproduction, evolution, and development, there are frozen structural constraints on the mechanisms that do, as a matter of fact, implement behavioral and cognitive functions. Learning about components allows one to get the right functional decomposition by ruling out functional decompositions that are incompatible with the known structural details (just as Larry and Bob could not run a tech firm between St. Louis and Saigon by Pony Express). Explanations that can accommodate these details about underlying mechanisms have faced severe tests not faced by models that merely accommodate the phenomenon, and learning about such details is clearly of central importance in the discovery process. 
In the case of the neurotransmitter, all of the criteria seem to address exactly this point. So the search for mechanistic details is crucial to the process of sorting correct from incorrect functional explanations. If functional analysis is treated as different in kind
from mechanistic explanation, it is hard to see how research into mechanistic details can play this explanatory role.

A second point is that explanations that capture these mechanistic details are deeper than those that do not. To accept as an explanation something that need not correspond with how the system is in fact implemented at lower levels is to accept that the explanations simply end at that point. Psychology ought to construct its explanations in a way that affords deeper filling in. Progress in explanatory depth has two virtues beyond the joy of understanding (which, as noted above, sometimes accompanies knowledge of a mechanism). First, it allows one to expand the range of phenomena that the model can save. Compare two models, one that characterizes things without mentioning the structural details, and one that includes, in addition, structural facts about component parts. The latter allows us to make accurate predictions about how the mechanism will behave under a wider variety of variations in background conditions: the effective functioning of a company in a blackout does depend on whether Larry and Bob communicate in person or by instant messaging. Second, and related, knowledge of the underlying components and the structural constraints on their activities affords more opportunities for the restoration of function and the prevention of calamity or disease.

Our point is thus conceptual and pragmatic. On the conceptual side, we emphasize that functional analysis and mechanistic explanation are inextricably linked: structural descriptions constrain the space of plausible functional descriptions, and functional descriptions are elliptical mechanistic descriptions. The idea that functional description is somehow autonomous from details about mechanisms involves a fundamental misunderstanding of the nature of functional attribution in sciences like cognitive neuroscience. On the pragmatic side, the demand that explanations satisfy mechanistic constraints leads us to produce better, deeper, better confirmed, and more useful descriptions of the system at hand than we would produce if we allowed ourselves to be satisfied with any empirically adequate boxological models. For these reasons, full-blown mechanistic models are to be preferred.

In conclusion, there is no functional analysis that is distinct and autonomous from mechanistic explanation because to describe an item functionally is, ipso facto, to describe its contribution to a mechanism. Furthermore, a full-blown mechanistic explanation describes both the functional and the structural properties of the mechanism; any constitutive explanation that omits the structural properties in favor of the functional ones is not a non-mechanistic explanation but an elliptical mechanistic explanation, at least if it is worthy of the title “explanation” in this domain.22

22 This is consistent with the obvious point that in many contexts, it is useful to omit many details of a mechanistic explanation, whether functional or structural.

8 Conclusion

Functional analysis of a system’s capacities provides a sketch of a mechanistic explanation. If the functional analysis is just the explanation of the capacities of the system in terms of the system’s sub-capacities, this is an articulation of the phenomenon to be mechanistically explained that points in the direction of components possessing the
sub-capacities. If the functional analysis appeals to internal states, these are states of internal components, which need to be identified by a complete mechanistic explanation. Finally, a functional analysis may appeal to black boxes. But black boxes are placeholders for structural components or capacities thereof, to be identified by a complete mechanistic explanation of the capacities of the system. Thus, if psychological explanation is functional, as so many people assume, and psychological explanation is worthy of its name, then psychological explanation is mechanistic. Once the structural aspects that are missing from a functional analysis are filled in, functional analysis turns into a more complete mechanistic explanation. By this process, functional analyses can be seamlessly integrated with mechanistic explanations, and psychology can be seamlessly integrated with neuroscience.

Most contemporary psychologists are unlikely to be surprised by our conclusion. A cursory look at current mainstream psychology shows that the boxological models that were mainstream in the 1970s and 1980s are becoming less popular. What is increasingly replacing them are structurally constrained models—mechanistic models. Much excitement surrounds methods such as neural imaging and biologically realistic neural network models, which allow psychologists to integrate their findings with those of neuroscientists. But some resistance to the integration of psychology and neuroscience remains, in both psychology and philosophy circles. We hope our argument will help soften such resistance.

For that purpose, it may also help to see the important role that autonomism about psychological explanation played in establishing cognitive psychology as a legitimate scientific enterprise. Autonomists hope to make room for an understanding of cognitive mechanisms that can proceed on its own, independently of neuroscience. Such a position was useful in the 1950s and 1960s, when knowledge of brain function was less advanced and we had only begun to think about how an integrated cognitive neuroscience would proceed. Before cognitive neuroscience could get off the ground, somebody had to characterize the cognitive phenomena for which neural explanations would be sought. The autonomist vision allowed experimental and theoretical psychologists to proceed with that task without having to wait for neuroscience to catch up. Now the discipline has advanced to the point that these pursuits can meaningfully come together, and there are tremendous potential benefits from effecting such integration. Psychology should not content itself with the discovery of merely phenomenally adequate functional descriptions that fail to correspond to the structural components to be found in the brain. It should aim to discover mechanisms. To explain in cognitive psychology and neuroscience is to know the mechanisms, and explanation is what functional analysis has always been all about.

Acknowledgments This paper was refereed in two rounds. John Symons graciously arranged for a round of blind refereeing; thanks are due to him and the anonymous referees. In addition, thanks to those who refereed the paper non-blindly: Ken Aizawa, Robert Cummins, and Dan Weiskopf. Many thanks to audiences at the 2010 APA Pacific Division, Saint Louis University, and University of Missouri—Columbia. This material is based upon work supported by the National Science Foundation under Grant No. SES-0924527 to Gualtiero Piccinini.
References Aizawa, K., Gillett, C. (2009). The (multiple) realization of psychological and other properties in the sciences. Mind & Language, 24(2), 181–208. Aizawa, K., & Gillett, C. (2011). The autonomy of psychology in the age of neuroscience. In P. M. Illari, F. Russo, & J. Williamson (Eds.), Causality in the sciences. Oxford: Oxford University Press. Aizawa, K., & Gillett, C. (forthcoming). Multiple realization and methodology in neuroscience and psychology. British Journal for the Philosophy of Science. Bechtel, W. (2008). Mental mechanisms: Philosophical perspectives on cognitive neuroscience. London: Routledge. Bechtel, W., & Mundale, J. (1999). Multiple realizability revisited: Linking cognitive and neural states. Philosophy of Science, 66, 175–207. Bechtel, W., & Richardson, R. C. (1993). Discovering complexity: Decomposition and localization as scientific research strategies. Princeton: Princeton University Press. Bickle, J. (2003). Philosophy and neuroscience: A ruthlessly reductive approach. Dordrecht: Kluwer. Bickle, J. (2006). Reducing mind to molecular pathways: Explicating the reductionism implicit in current cellular and molecular neuroscience. Synthese, 151, 411–434. Block, N. (1997). Anti-reductionism slaps back. Mind, Causation, World, Philosophical Perspectives, 11, 107–133. Block, N., & Fodor, J. A. (1972). What psychological states are not. Philosophical Review, 81(2), 159–181. Block, N., & Segal, G. (1998). The philosophy of psychology. In A. C. Grayling (Ed.), Philosophy 2: Further through the subject (pp. 64–69). Oxford: Oxford University Press. Clark, A. (1980). Psychological models and neural mechanisms: An examination of reductionism in psychology. Oxford: Clarendon Press. Churchland, P. S. (1986). Neurophilosophy. Cambridge, MA: MIT Press. Churchland, P. M. (1989). A neurocomputational perspective. Cambridge, MA: MIT Press. Churchland, P. M. (2007). Neurophilosophy at work. Cambridge: Cambridge University Press. Cooper, J. R., Bloom, F. E., & Roth, R. H. (1996). The biochemical basis of neuropharmacology. Oxford: Oxford University Press. Couch, M. (2005). Functional properties and convergence in biology. Philosophy of Science, 72, 1041– 1051. Crandall, B., & Klein, G., et al. (2006). Working minds: A practitioner’s guide to task analysis. Cambridge, MA: MIT Press. Craver, C. (2001). Role functions, mechanisms, and hierarchy. Philosophy of Science, 68, 53–74. Craver, C. F. (2004). Dissociable realization and kind splitting. Philosophy of Science, 71(4), 960–971. Craver, C. F. (2006). When mechanistic models explain. Synthese, 153(3), 355–376. Craver, C. F. (2007). Explaining the brain. Oxford: Oxford University Press. Craver, C. F., & Darden, L. (2001). Discovering mechanisms in neurobiology. In P. Machamer, R. Grush, & P. McLaughlin (Eds.), Theory and method in the neurosciences (pp. 112–137). Pittsburgh, PA: University of Pittsburgh Press. Cummins, R. (1975). Functional analysis. Journal of Philosophy, 72(20), 741–765. Cummins, R. (1983). The nature of psychological explanation. Cambridge, MA: MIT Press. Cummins, R. (2000). “How does it work?” vs. “What are the laws?” Two conceptions of psychological explanation. In F. Keil & R. Wilson (Eds.), Explanation and Cognition. Cambridge: Cambridge University Press. Dennett, D. C. (1978). Brainstorms. Cambridge, MA: MIT Press. Edelman, G. M., & Gally, J. A. (2001). Degeneracy and complexity in biological systems. 
Proceedings of the National Academy of the Sciences of the United States of America, 98(24), 13763–13768. Feest, U. (2003). Functional analysis and the autonomy of psychology. Philosophy of Science, 70, 937–948. Figdor, C. (2010). Neuroscience and the multiple realization of cognitive functions. Philosophy of Science, 77(3), 419–456. Fodor, J. A. (1965). Explanations in psychology. In M. Black (Ed.), Philosophy in America. London: Routledge and Kegan Paul. Fodor, J. A. (1968a). Psychological explanation. New York: Random House. Fodor, J. A. (1968b). The appeal to tacit knowledge in psychological explanation. Journal of Philosophy, 65, 627–640.
Fodor, J. A. (1974). Special sciences. Synthese, 28, 77–115. Fodor, J. A. (1975). The language of thought. Cambridge, MA: Harvard University Press. Fodor, J. A. (1997). Special sciences: Still autonomous after all these years. Philosophical Perspectives, Mind, Causation, and the World, 11, 149–163. Gazzaniga, M. (Ed.). (2009). The cognitive neurosciences. Cambridge, MA: MIT Press. Glennan, S. (2002). Rethinking mechanistic explanation. Philosophy of Science, 69, S342–S353. Glennan, S. (2005). Modeling mechanisms. Studies in History and Philosophy of Biological and Biomedical Sciences, 36(2), 443–464. Gold, I., & Stoljar, D. (1999). A neuron doctrine in the philosophy of neuroscience. Behavioral and Brain Sciences, 22, 809–869. Harman, G. (1988). Wide functionalism. In S. Schiffer & S. Steele (Eds.), Cognition and representation (pp. 11–20). Boulder: Westview. Haugeland, J. (1998). Having thought: Essays in the metaphysics of mind. Cambridge, MA: Harvard University Press. Heil, J. (2003). From an ontological point of view. Oxford: Clarendon Press. Kalat, J. W. (2008). Biological psychology. Belmont, CA: Wadsworth Publishing. Kandel, E. R., Schwartz, J. H., & Jessell, T. M. (2000). Principles of neural science (4th ed.). New York: McGraw-Hill. Kaplan, D., & Craver, C. F. (forthcoming). The explanatory force of dynamical and mathematical models in neuroscience: A mechanistic perspective. Philosophy of Science. Keeley, B. (2000). Shocking lessons from electric fish: The theory and practice of multiple realizability. Philosophy of Science, 67, 444–465. Kim, J. (1992). Multiple realization and the metaphysics of reduction. Philosophy and Phenomenological Research, 52, 1–26. Kosslyn, S. M., Thompson, W. I., & Ganis, G. (2006). The case for mental imagery. New York, NY: Oxford University Press. Machamer, P. K., Darden, L., & Craver, C. F. (2000). Thinking about mechanisms. Philosophy of Science, 67, 1–25. Marr, D. (1982). Vision. New York: Freeman. Noppeney, U., Friston, K. J., & Price, C. (2004). Degenerate neuronal systems sustaining cognitive functions. Journal of Anatomy, 205(6), 433–442. O’Reilly, R. C., & Munakata, Y. (2000). Computational explorations in cognitive neuroscience: Understanding the mind by simulating the brain. Cambridge, MA: MIT Press. Piccinini, G. (2004). Functionalism, computationalism, and mental states. Studies in the History and Philosophy of Science, 35(4), 811–833. Piccinini, G. (2007). Computing mechanisms. Philosophy of Science, 74(4), 501–526. Piccinini, G. (2008). Computers. Pacific Philosophical Quarterly, 89(1), 32–73. Piccinini, G., & Scarantino, A. (2011). Information processing, computation, and cognition. Journal of Biological Physics, 37, 1–38. Polger, T. W. (2004). Natural minds. Cambridge, MA: MIT Press. Polger, T. (2009). Evaluating the evidence for multiple realization. Synthese, 167(3), 457–472. Posner, M. I. (2004). Cognitive neuroscience of attention. New York, NY: Guilford Press. Price, C. J., & Friston, K. J. (2002). Degeneracy and cognitive anatomy. Trends in Cognitive Science, 6(10), 416–421. Putnam, H. (1960). Minds and machines. In S. Hook (Ed.), Dimensions of mind: A symposium (pp. 138– 164). New York: Collier. Putnam, H. (1967a). The mental life of some machines. In H. Castañeda (Ed.), Intentionality, minds, and perception (pp. 177–200). Detroit: Wayne State University Press. Putnam, H. (1967b). Psychological predicates. Art, philosophy, and religion. Pittsburgh, PA: University of Pittsburgh Press. Putnam, H. (1975). 
Philosophy and our mental life. In H. Putnam (Ed.), Mind, language and reality: Philosophical papers (Vol. 2, pp. 291–303). Cambridge: Cambridge University Press. Shagrir, O. (1998). Multiple realization, computation and the taxonomy of psychological states. Synthese, 114, 445–461. Shagrir, O. (2006). Why we view the brain as a computer. Synthese, 153(3), 393–416. Shagrir, O. (2010). Brains as analog-model computers. Studies in the History and Philosophy of Science, 41(3), 271–279.
Shapiro, L. A. (2000). Multiple realizations. The Journal of Philosophy, XCVII(12), 635–654. Shapiro, L. A. (2004). The mind incarnate. Cambridge, MA: MIT Press. Shepherd, G. M. (1994). Neurobiology (3rd ed.). London: Oxford University Press. Sporns, O., Tononi, G., & Kötter, R. (2005). The human connectome: A structural description of the human brain. PLoS Computational Biology, 1(4), 245–250. Stich, S. (1983). From folk psychology to cognitive science. Cambridge, MA: MIT Press. Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59, 433–460.
Synthese (2011) 183:313–338 DOI 10.1007/s11229-011-9958-9
Models and mechanisms in psychological explanation Daniel A. Weiskopf
Received: 31 October 2010 / Accepted: 7 May 2011 / Published online: 23 June 2011 © Springer Science+Business Media B.V. 2011

D. A. Weiskopf (B) Department of Philosophy, Georgia State University, Atlanta, GA, USA; e-mail: [email protected]
Abstract Mechanistic explanation has an impressive track record of advancing our understanding of complex, hierarchically organized physical systems, particularly biological and neural systems. But not every complex system can be understood mechanistically. Psychological capacities are often understood by providing cognitive models of the systems that underlie them. I argue that these models, while superficially similar to mechanistic models, in fact have a substantially more complex relation to the real underlying system. They are typically constructed using a range of techniques for abstracting the functional properties of the system, which may not coincide with its mechanistic organization. I describe these techniques and show that despite being non-mechanistic, these cognitive models can satisfy the normative constraints on good explanations. Keywords Psychological explanation · Models · Mechanisms · Cognition · Realization 1 Introduction We are in the midst of a mania for mechanisms. In the wake of the collapse of the deductive-nomological account of explanation, philosophers of science have cast about for alternative ways of describing the structure of actual explanations in science and the normative properties that good explanations ought to have. Mechanisms and mechanistic explanation have promised to fill both of these roles, particularly in the ‘fragile sciences’ (Wilson 2004): biology (Bechtel 2006; Bechtel and Abrahamson 2005), neuroscience (Craver 2006, 2007), and, increasingly, psychology (Bechtel 2008, 2009;
Glennan 2005). Besides these benefits, mechanisms have also promised to illuminate other problematic scientific notions such as capacities, causation, and causal laws (Glennan 1996, 1997; Machamer 2004; Woodward 2002). Mechanistic explanation involves isolating a set of phenomena and positing a mechanism that is capable of producing those phenomena (see Craver and Bechtel 2006 for a capsule description). The phenomena in question are an entity or a system’s exercising a certain capacity: an insect’s ability to dead reckon, my ability to tell that this is a lime, a neuron’s capacity to produce an action potential, a plant’s capacity for photosynthesis. What one explains mechanistically, then, is S’s ability, propensity, or capacity to F. The mechanism that does the explaining is composed of some set of entities—the components of the mechanism—and their associated activities that are organized in such a way as to produce the phenomena. Mechanistic explanation involves constructing a model of such mechanisms that correctly depicts the causal interactions among their parts that enable them to produce the phenomena under various conditions.1 Such a model should specify, among other things, the initial and termination conditions for the mechanism, how it behaves under various sorts of interventions, including abnormal inputs and internal disruptions, how it is integrated with its environment, and so on. There is no doubt that explanation in biology and neuroscience often involves describing mechanisms. Here I’m particularly concerned with whether the mechanistic revolution should be extended to psychological explanation. A great deal of explanation in psychology involves giving models of various psychological phenomena.2 These models can be formal (e.g., mathematical or computational) or they may be more informally presented. It can be extremely tempting to cast these models in the mold of mechanistic explanation. I’ll argue that we should not succumb to this temptation, and that cognitive models are not, in general, models of mechanisms. While they have some features in common with mechanistic models, they differ significantly in the way that they relate to the underlying system whose structure they aim to represent. Despite this, they can be evaluated according to the standard norms that govern model construction generally, and can provide perfectly good explanations of psychological phenomena. In the discussion to follow, I first lay out the criteria by which good models of realworld systems are to be assessed (Sect. 2). My starting point is Carl Craver’s discussion of the norms of mechanistic explanation, which I propose should be generalized to cover other types of model-based reasoning. I then show that one type of nonmechanistic explanation, Rob Cummins’ analytic functional explanations, can meet these norms despite being entirely noncomponential (Sect. 3). I describe several different cognitive models of psychological capacities such as object recognition and 1 I will make heavy use of the term ‘model’ throughout this discussion. However, I do not have any very specific conception of models in mind. What I will mean is at least the following. A model is a kind of representation of some aspect of the world. The components of models are organized entities, processes, activities, and structures that can somehow be related to such things in the real world. Models can be picked out linguistically, visuospatially, graphically, mathematically, computationally, and no doubt in many other ways. 
This should be a sufficiently precise conception for present purposes. 2 See, e.g., the papers collected in Polk and Seifert (2002). For a comprehensive history of the construction
of computational models of cognition, see Boden (2006).
categorization (Sect. 4), and I show that despite being non-mechanistic, these models can also meet the normative standards for explanations (Sect. 5). Finally, I rebuff several attempts to either reduce these modeling strategies to some sort of mechanistic approach, or to undermine their explanatory power (Sect. 6). 2 Three dimensions of model assessment In laying out the criteria for a good mechanistic explanation, Craver (2007) usefully distinguishes between two dimensions of normative evaluation that we can use in assessing these explanations. He distinguishes: (1) how-possibly, how-plausibly, and how-actually models; and (2) mechanism sketches, mechanism schemata, and complete mechanistic models. Here I will lay out what I take to be the most useful way to construe these distinctions. Consider the first dimension, in particular the end that centers on how-possibly (HP) models. HP models are “loosely constrained conjectures” about what sort of mechanism might produce the phenomenon (Craver 2007, p. 112). In giving these, one posits parts and operations, but one need not have any idea if they are real parts, or whether they could do what they are posited to do. Examples here include much early work in symbolic computational simulation of cognitive capacities. Computer simulations of vision written in high-level programming languages describe a set of entities (symbolic representations) and activities (concatenation, comparison, etc.) that may produce some fragment of the relevant phenomena, but one need not know or be committed to the idea that the human visual system contains those parts and operations. Similar critiques have been made of linguists’ syntactic theories: the formal sequence of operations posited in generative grammars—from transformational rules to ‘Move α’—is often psychologically hard to detect.3 How-actually (HA) models, on the other hand, “describe real components, activities, and organizational features” of the mechanism (Craver 2007, p. 112). In between these are how-plausibly models that vary in their degree of realism. Clearly, whether a model is nearer to the HP or HA end is not something that can be determined just by looking at its intrinsic structure. This continuum or set of distinctions turns on degrees of evidential support. To see this, notice that producing HP models is often part of the early stages of investigating a mechanism. This initial set of models is then evaluated to determine which one best explains the phenomena or best fits the data, if any of them do. Some rise to the level of plausibility, and eventually we may settle on our best-confirmed hypothesis as to which is the actual mechanism. That the distinction is epistemic is suggested by the way in which models move along this dimension.4 If we are just considering a set of models to account for a phenomenon, then we can regard them all as how-possibly, or merely conjectural. If we have some evidence that favors one or two of them, or some set of constraints 3 There may be ways to detect the presence of representations such as traces or phonologically empty categories such as PRO by comparing speakers’ grammaticality judgments across pairs that differ in these hypothesized elements. But clusters of converging operations focused on these elements are difficult to come by. 4 This is also stated explicitly in Machamer et al. (2000, pp. 21–22).
that rule some of them out, the favored subset moves into the how-plausibly column. This implies that a how-actually model is one that best accommodates the evidence and satisfies the relevant constraints. How much of the evidence is required? It must fit at least all of the available evidence as of the time the model is constructed. But while this is necessary, as a sufficient condition it is too weak, as we may simply not have much evidence yet. A maximal view would maintain that it must fit all possible evidence that could be adduced. While models that fit all of the possible evidence are certainly how-actually, making this a necessary condition would be too strong, since it would effectively mean that we never have had, and never will have, any how-actually models—or at least we could never be confident that we did. A more moderate view would be that how-actually models fit the preponderance of evidence that has been gathered above a certain threshold, where this threshold is the amount of evidence that is accepted as being sufficient, within the discipline, to treat a model as a serious contender for describing the real structure of a system. Normally, disciplines require more than a minimal amount of evidence for acceptance, but less than all of the possible evidence, since it can be unclear what is even meant by all of the possible evidence. Models are properly accepted as how-actually when they meet the appropriate disciplinary threshold of epistemic support: more than merely what is at hand, but less than total as well. This is the sense in which I will be interpreting how-actually models here.

Terminologically, this might seem uncomfortable: whether a model captures how the mechanism actually is doesn’t seem like a matter of evidential support, but a matter of how accurately it models the system in question. This makes it sound as if a how-actually model is just the true or accurate model of the system. But note that any one of a set of how-possibly models might turn out to accurately model the system, so the difference in how they are placed along this dimension cannot just be in terms of accuracy. So it seems that this is fundamentally an epistemic dimension. It represents something like the degree of confirmation of the claim that the model corresponds to the mechanism. Even if your model is in fact the one that accurately represents the mechanism in question, if you take it to be merely a guess or one possibility among many, then it’s a how-possibly or how-plausibly model. More evidence that this is how the mechanism works makes it inch towards being how-actually.

The second dimension of assessment involves the continuum from mechanism sketches to mechanism schemata and complete mechanistic models. A sketch is an “incomplete model of a mechanism”, or one that leaves various gaps or employs filler terms for entities and processes whose nature and functioning is unknown. These terms—‘control’, ‘influence’, ‘regulate’, ‘process’, etc.—constitute promissory notes to be cashed in by further analysis. A schema is a somewhat complete, but less than ideally complete, model. It may contain black boxes or other dummy items, but it incorporates more informative detail than a mere sketch. Finally, an ideally complete model omits nothing, or nothing relevant to understanding the mechanism and its operations in the present context, and uses no terms that can be regarded as ‘filler’. The continuum from sketches to schemata and complete models is not epistemic.
Rather it has to do with representational accuracy—a term which, as I use it,
incorporates both grain and correctness.5 Correctness requires that the model not include elements that are not present in the system, nor omit elements that are present. Grain has to do with the size of the chunks into which one decomposes a mechanism. This is a matter of varying degrees of precision. For example, the hippocampus can be seen as a three-part entity composed of CA1, CA3 and the dentate gyrus, or it can be seen as a more complex structure containing various cell types, layers, and their projections, etc. (Craver 2009). But there can be coarse-grained but correct models, as this example shows. Coarse-grained models merely suppress further mechanistic details concerning their components. Presumably this would be an instance of a schema or sketch. I take it that approaching ideal accuracy involves achieving a more correct model (one that includes more of the relevant structure of the system) and also a more fine-grained model (one that achieves greater precision in its depiction of the system). One question about this distinction is what sorts of failures of accuracy qualify a model as a ‘sketch’. Every model omits something from what it represents, for instance, but not every way of omitting material seems to make for a sketch. For example, one way to omit is just not to include some component that exists in the real system. Sometimes this is innocuous, since the component may not be relevant to understanding the mechanism in the current explanatory context. Many intracellular structures are omitted in modeling neurotransmitter release, for instance. But this can also be a way of having a false or harmfully incomplete model. Alternatively one can include a ‘filler’ term that is known to abbreviate something about the system that we cannot (yet) describe any better. The question then arises what sort of relationship terms and components of models must bear to the underlying system for the model to be a good representation of the system’s parts and organization. In particular, it might be that there are components of an empirically validated model that do not map onto any parts of the modeled system. I will discuss some examples of this in Sect. 5. Some of these accuracy failures are harmful and others are not. It seems permissible to omit detail where it’s irrelevant to our modeling purposes, so being a schema is not in and of itself a bad thing. Moreover, complete models may be intractable in practice in various ways. The larger point to notice is that the simple notion of a sketch/schema continuum runs together the notion of something’s being a false model and its being a merely detail-free model. The most significant failures of models seem to arise from either including components that do not correspond to real parts of the system, or omitting real parts of the system in the model (and, correspondingly, failing to get the operations of those parts correct). These failures are lumped in with the more innocuous practices of abstracting and omitting irrelevant details in the notion of a sketch or a schema. A third way of classifying models is with respect to whether or not they are genuinely explanatory, as mechanistic models are assumed to be. Craver (2006) draws a separate normative distinction between merely phenomenological models and genuine explanations. Phenomenological accuracy is simply capturing what the phenomena 5 Giere (1988) also separates accuracy into two components: similarity in certain respects and accuracy to various degrees in each of these respects. 
My own grain/correctness distinction does not quite correspond to his, but both illustrate the fact that we need to make more distinctions than are allowed by just the notion of undifferentiated ‘accuracy’ that seems to underlie the schema-sketch continuum.
are. An example of this is the Hodgkin–Huxley equation describing the relationship between voltage and conductance for each ion channel in the cell membrane of a neuron. These may be useful for predicting and describing a system, but they do not provide explanations.6 One possibility that Craver considers is that explanatory models are “much more useful than merely phenomenal models for the purposes of control and manipulation” (2006, p. 358). Deeper explanations involve being able to say how things would have been otherwise, how the system would be if various perturbations occurred, how to answer a greater range of questions about the system, etc. Here we should separate the properties of allowing control and manipulation from being able to answer counterfactual questions. Many good explanations do the latter but not the former. Our explanation for why a gaseous disc around a black hole behaves like a viscous fluid does not enable us to control or manipulate that disc in any way, nor do our explanations of how stellar fusion produces neutrinos. Many apparently phenomenological models can also describe certain sorts of counterfactual scenarios. Even the Hodgkin–Huxley model allows us to make certain counterfactual predictions about how action potentials will perform in various perturbed circumstances. But they are silent on other counterfactuals, particularly those having to do with interventions involving the system’s operations. So we can still say in general that models become more explanatory the more they allow us to answer a range of counterfactual questions and the more they allow us to manipulate a system’s behavior (in principle at least). This sort of normative assessment is also neutral on the question of whether the explanations in question are mechanistic or not. What emerges, then, is a classification of models according to (1) whether they are highly confirmed or supported by the evidence and (2) whether they are representationally accurate. So stated, these dimensions of assessment are independent of whether the model is mechanistic. We can ask whether any theory, model, simulation, or other representational device conforms to the norms of accuracy and confirmation. In addition, models may be classified according to (3) whether they are genuinely explanatory, or merely phenomenological, predictive, or descriptive. Finally, there are broad requirements that models cohere with the rest of what we know. Thus we can also assess models with respect to (4) whether they are consistent with and are plausible in light of our general background knowledge and our more local knowledge of the domain as a whole.
3 Noncomponential analysis
Mechanistic models, or many of them, can meet these normative conditions. Strictly descriptive-phenomenological models cannot. But there are effective explanatory strategies besides mechanistic explanation. Cummins (1983) argues that a kind of analytic functional explanation plays a central role in psychology. As with mechanistic explanation, the explanandum phenomenon is the fact that a system S has a capacity to F. In his account, S's capacity to F is analyzed into various further capacities G1, …, Gn, all
6 See Bokulich (2011) for further discussion of explanatory versus phenomenological and fictional models.
of which also belong to S itself. F-ing, then, is explained as having the appropriately organized (spatially and temporally choreographed) capacities to carry out certain other operations whose exercise constitutes F-ing. This is a kind of analytic explanation, since it aims to explain one capacity by analyzing it into subcapacities. However, these are not capacities of subparts of the system. The account doesn't explain S's F-ing in terms of the G-ing of S's parts, but rather in terms of the activity of S itself. Cummins calls this functional analysis; it involves "analyzing a disposition into a number of less problematic dispositions such that programmed manifestation of these analyzing dispositions amounts to a manifestation of the analyzed disposition" (1983, p. 28). In many cases, this analysis will be of one disposition of a subject or system into other dispositions of the same subject or system. In such cases, the "analysis seems to put no constraints at all on [the system's] componential analysis" (1983, p. 30). As an example, Cummins gives an analysis of the disposition to see an afterimage as shrinking if one approaches it while it is projected onto a visible wall. This is analyzed in terms of a flowchart or program specifying the relations among various subdispositions that need to be present: the ability to determine whether an object is visually present, to determine the size of the retinal image and distance to the object, to use these to compute the apparent object size (Cummins 1983, pp. 83–87). He offers analogous functional analyses of grammatical competence, Hull's account of conditioning, and Freudian psychodynamics. Craver seems to reject the explanatory significance of functional analysis of this sort—call it noncomponential analysis, or NCA. By contrast with NCA, mechanistic explanation is "inherently componential" (2007, p. 131). From the mechanistic point of view, NCA essentially faces a dilemma. One possibility is that without appeal to components and their activities, we have no way to distinguish how-possibly from how-actually explanations, and sketches from more complete mechanistic models. In other words, NCA blocks us from making crucially important distinctions in kinds of explanations. Without some way of making these or analogous distinctions we have no way of distinguishing good explanations from non-explanations. So box-and-arrow models that do not correspond to real components are doomed to be either how-possibly or merely phenomenological models, not mechanistic models. We can put this in the form of an argument:
1. Analytic functional explanations are noncomponential.
2. Noncomponential explanations provide only a redescription of the phenomenon or a how-possibly model.
3. Redescriptions and how-possibly models are not explanatory.
4. So analytic functional explanations are not explanatory.
The argument is valid. Premise 1 is true by definition of analytic models (at least those that are not linked with an instantiation theory). With respect to premise 3, we can agree that redescriptive models and some how-possibly models are not explanatory.7 But the question here centers on premise 2. The issue is whether there could be
7 The caveat concerns contexts in which we may want to say that a how-possibly account is a sufficient explanation. In explaining multiple realization, for example, we explicitly consider a range of how-possibly models and treat these as explaining the fact that a capacity is displayed by physically disparate systems. See Weiskopf (2011) for discussion.
genuinely explanatory but non-mechanistic and non-phenomenological models—in particular, in psychology. Returning to our characterization of these distinctions above, we can ask whether NCA models can be assessed along our first two normative dimensions. Are we somehow blocked from confirming NCA models? Evidently not. We might posit one decomposition of a capacity C into subcapacities only to find that, empirically, this is not the decomposition that individuals' exercise of C involves. In fact, even Cummins makes this one of his desiderata: it is a "requirement that attributions of analyzing properties should be justifiable independently of the analysis that features them" (1983, pp. 26–27). If we analyze a child's ability to divide into capacities to copy numbers, multiply, add, etc., we need separate evidence of those capacities to back this attribution. If, as I have suggested, we conceive of moving from a how-possibly model to a how-actually model as acquiring more and stronger evidence in favor of one model over the others, we can see getting this sort of confirmation for an analysis as homing in on a how-actually functional analysis. So we can distinguish how-possibly NCA models from how-actually NCA models. Similarly, we can ask whether this NCA model accurately represents the subcapacities that a creature possesses, whether it does so in great detail or little detail, etc. That we can do this is evident from the fact that the attributed capacities themselves can be fine-grained or coarse-grained, can be organized in different ways to produce their output, can contain different subcapacities nested within them, and so on. Consider two different ways of analyzing an image manipulation capacity: as allowing image rotation to step through 2° at a time versus 5° at a time; or as requiring that rotations be performed before translations in a plane, rather than the reverse; and so on. These ways of filling in the same 'black boxed' capacity correspond to the functional analytic difference between sketches, schemata, and complete models. We can, then, assess NCA models for both epistemic confirmation and for accuracy and granularity. But Craver seems to think that NCA models can't make these distinctions, and he pins this fact on their being noncomponential (2007, p. 131): Box-and-arrow diagrams can depict a program that transforms relevant inputs onto relevant outputs, but if the boxes and arrows do not correspond to component entities and activities, one is providing a redescription of the phenomenon (such as the HH model of conductance change) or a how-possibly model, not a mechanistic explanation. The way we distinguish how-possibly from how-actually models, and sketches from schemata, etc., is by positing components and activities. Thus these facts militate in favor of mechanistic models. Call this the Real Components Constraint (RCC) on mechanistic models: the components described in the model should be real components in the mechanism. This is a specific instantiation of the principle that models are explanatory to the extent that they correspond with real structures. The constraint can be seen to flow from the general idea that models are better to the extent that they are accurate and complete
within the explanatory demands of the context. The difference is that the focus of this principle is on components and their operations or activities rather than on accuracy in general. I discuss the role of the RCC in distinguishing good explanations from merely phenomenal accounts in Sect. 6. A final objection that Craver levels against NCAs is that they do not offer unique explanations of cognitive capacities, and hence must only be giving us how-possibly explanations. Cummins seems to suggest this at times; for example, he says (p. 43): Any way of interpreting the transactions causally mediating the input-output connection as steps in a program for doing ϕ will, provided it is systematic and not ad hoc, make the capacity to do ϕ intelligible. Alternative interpretations, provided they are possible, are not competitors; hence the availability of one in no way undermines the explanatory force of another. This appears to mean that for any system S there will be many equally good explanations for how it is able to do F, which in turn suggests that these explanations are merely how-possibly—since any how-actually explanation would necessarily have to be unique. In fact, I don’t think that we should presuppose that there is a unique how-actually answer to how a system carries out any of its functions. But this point aside, I think the charge rests on a misunderstanding of how the type of explanation Cummins is concerned with here works. Here he is addressing what he calls interpretive functional analysis. This is specifically the attempt to understand the functional organization of a system in semantically interpreted terms—not merely to describe what the system does as, e.g., opening and closing gates and relays, but as adding numbers or computing trajectories. Interpretive analysis differs from descriptive analysis precisely in its appeal to such semantic properties (1983, p. 34). The point that Cummins wants to make about interpretive analyses is that for any system there may be many distinct yet still true explanations of what a system is doing when it exercises the capacity to F. But this fact does not suggest that the set of explanations is merely a how-possibly set. Consider the case of grammatical competence. There may be many predictively equivalent yet interestingly distinct grammars underlying natural language. As Cummins notes, however, “[p]redictively adequate grammars that are not instantiated are, of course, not explanations of linguistic capacities” (p. 44). Here we may see a role for how-possibly explanations; functional analysis can result in programs that are not instantiated, and figuring out what grammar is instantiated is part of telling the how-actually story for linguistic competence. Even if a system instantiates a grammar, though, there may be other grammars that it also instantiates. And this may be the case even once we pin down the details of its internal structure. A decomposition of a system into components does not necessarily, in his view, uniquely fix the semantic interpretation of the components or the system that they are part of: “[i]f the structure is interpretable as a grammar, it may be interpretable as another one too” (p. 44). Cummins’ NCA-style explanations allow for structural constraints to play a role in getting a how-actually story from a how-possibly story about interpretive functional analysis. The residual multiplicity of analyses comes from the fact that these facts do
not pin down a unique semantic interpretation of the system. Hence the same system may be instantiating many programs, all of which are equally good explanations of what it does. We needn't follow Cummins in thinking that it's indeterminate or pluralistic which program a system is executing; that's an idiosyncrasy of his view concerning semantic interpretation. If there are facts that can decide between these interpretive hypotheses, then NCA models can be assessed on our two normative dimensions despite not being mechanistic. Whether this is so depends ultimately on whether there can be an account of the fixation of semantic content that selects one interpretation over another, a question that is definitely beyond the scope of the discussion here. There is a larger moral here which will serve to introduce the theme of our next sections. In trying to understand the behavior of complex systems, we can adopt different strategies. For neurobiological systems and phenomena, it might be that compositional analysis is an obvious first step: figuring out the anatomical, morphological, and physiological properties of cells in a region, their laminar and connectional organization, their response profile to stimulation, the results of lesioning or inhibiting them, etc. But for psychological phenomena, giving an account of what is involved in their production is necessarily more indirect. It is plausible that in many cases, decomposing the target capacity into subcapacities is heuristically indispensable—if for no other reason than, often enough, we have no well-demarcated physical system to decompose, and little idea of the proper parts and operations to use in such a decomposition. The only system we can analyze is the central nervous system, and its structure is notoriously not psychologically transparent. Thus (as Cummins also argues) structural hypotheses usually follow interpretive functional hypotheses: we indirectly specify the structure in question by making a provisional analysis of the psychological capacity, then we look for 'fits' in the structure that can be used to interpret it. These fits obtain between the subcapacities and their flowchart relations and parts of the physical structure and their interconnections. The question, then, is how models in psychology actually operate; in particular, whether their functioning can be wedged into the mold of mechanistic explanation. In the next section I'll lay out a few such models and argue that, as with functional analysis, these are cases of perfectly good explanations that are not mechanistic.
4 The structure of cognitive models
The models I will consider are all psychological models, in the sense that they aim to explain psychological phenomena and capacities. They are models of parts of our psychology. They are also psychological in another sense: like interpretive functional analyses, they explain these capacities in terms of semantic, intentional, or more generally representational states and processes. There can be models of psychological phenomena that are not psychological in this second sense. Examples are purely neurobiological models of memory encoding or attention. These explain psychological phenomena in non-psychological terms. While some models of psychological capacities employ full-blown intentional states such as beliefs, intentions, and desires (think of Freudian psychodynamic explanations and many parts of social psychology), others posit more theoretically-motivated subpersonal representational states. Constructing
models of this sort is characteristic of cognitive psychology and many of its allied fields such as cognitive neuroscience and neuropsychology. Indeed, the very idea of there being a ‘cognitive level’ of description was introduced by appeal to explanations having just this form. I will therefore refer to models that explain psychological capacities in representational terms as cognitive models.8 To be more specific, I will focus on representation-process-resource models. These are models of psychological capacities that aim to explain them in terms of systems of representations, processes that operate over and transform those representations, and resources that are accessed by these processes as they carry out their operations. Specifying such a model involves specifying the set of representations (primitive and complex) that the system can employ, the relevant stock of operations, and the relevant resources available and how they interact with the operations. It also requires showing how they are organized to take the system from its inputs to its outputs in a way that implements the appropriate capacity. This involves describing at least some of the architecture of the system: how information flows through it, whether its operations are serial or parallel, whether it contains subsystems that have restricted access to the rest of the information and processes in the system, and the control structures that determine how these elements work together to mediate the input-output transitions. Thus a cognitive model can be seen as an organized set of elements that depicts how the system takes input representations into output representations in accord with its available processes and operations, as constrained by its available resources. In what follows I briefly describe three models of object recognition and categorization to highlight the features they have in common with mechanistic models and those that set them apart. The first model comes from studies of human object recognition. Object recognition is the capacity to judge that a (usually visually) perceived object is either the same particular one that was perceived earlier, or belongs to the same familiar class as one perceived earlier. This recognitional capacity is robust across perspectives and other viewing conditions. One doesn’t have the capacity to recognize manatees, Boeing 747’s, or Rodin’s ‘Thinker’ unless one can recognize them from a variety of angles, distances, lighting, degrees of occlusion, etc. Sticking to visual object recognition, the relevant capacity takes a visual representation of the object as input and produces as output a decision as to whether the object is recognized or not, and if it is, what it is taken to be. There are many competing models of object recognition, and my goal here is to present one representative model rather than to survey all of them.9 The model in question, presented in Hummel and Biederman (1992), is dubbed John and Irv’s Model (JIM). It draws on assumptions about object recognition developed in earlier work by Biederman (1987). Essentially, Biederman hypothesized that object recognition depends on a set of abstract visual primitives called geons. These geons 8 To be sure, in recent years there have been a number of movements in cognitive science that have proposed doing away with representational models and their attendant assumptions. These include Gibsonian versions of perceptual psychology, dynamical systems theory, behavior-based robotics, and so on. 
I will set these challengers aside here to focus on the properties of models that are based on the core principles of the cognitive revolution.
9 For a range of perspectives, see Biederman (1995), Tarr (2002), Tarr and Bülthoff (1998), and Ullman (1996).
are simple three-dimensional shapes that come in a range of shapes such as blocks, cylinders, and cones, and can be scaled, rotated, conjoined, and otherwise modified to represent the large-scale structure of perceived objects (minus details like color, texture, etc.). Perceived objects are parsed in terms of this underlying geon structure, which is then stored in memory for comparison to new views. Since geons are three-dimensional, they provide a viewpoint independent representation of an object's spatial properties. There need, therefore, to be perceptual systems that can extract this common underlying structure despite degraded and imperfect viewing conditions, in addition to systems that will determine when a match in geon structure is 'good enough' to count as the same object (same type or same token). In JIM this process is decomposed into a sequence of subprocesses each of which takes place in a separate layer (L1–L7). L1 is a simplified retina-like structure that represents the object from a viewpoint as a simple line drawing composed of edges; these can be extracted from, e.g., luminance discontinuities in the ambient light. L2 contains a set of three distinct networks each of which extracts a separate type of feature: vertices (points where multiple edges meet), axes of symmetry, and blobs (coarsely defined filled regions of space). L3 is decomposed into a set of attribute representations: axis shape (straight vs. curved), size (large to small), cross-sectional shape (straight vs. curved), orientation (vertical, diagonal, horizontal), aspect ratio (elongated to flat), etc. These attributes can be uniquely extracted from vertex, axis, and blob information. Each of them takes a unique value, and a set of active values on all attributes uniquely defines a geon; the set of all active values across all attributes at a time uniquely defines all of the geons present in a scene. L4 and L5 take their input from the L3 attributes having to do with size, orientation, and position, and they represent the relations among the geons in a scene, e.g., whether they are above, beside, or below one another. L6 is an array of individual cells each of which represents a single geon and its relations to the other geons in the scene (a geon feature assembly), as determined by the information extracted by L3 and L5; finally, L7 represents the network's best guess as to what the object is, arrived at on the basis of the summed activity over time in the geon feature assembly layer.
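To make the division of labor among these layers easier to keep track of, here is a minimal illustrative sketch of the L1–L7 decomposition as a feed-forward pipeline. Every function name and data structure below is a hypothetical placeholder introduced for exposition; this is not Hummel and Biederman's implementation, only a schematic of the flow of information their model posits.

```python
# A deliberately simplified, illustrative sketch of JIM's layered decomposition
# (L1-L7) as a feed-forward pipeline. All names and data structures are
# hypothetical placeholders, not Hummel and Biederman's actual network.

def extract_edges(image):
    """L1: represent the viewed object as a line drawing of edges."""
    return {"edges": image}

def extract_features(edges):
    """L2: three networks extract vertices, axes of symmetry, and blobs."""
    return {"vertices": [], "axes": [], "blobs": []}

def assign_attributes(features):
    """L3: attribute values (axis shape, size, cross-section, orientation, ...)
    that jointly define the geons present in the scene."""
    return [{"axis": "straight", "size": "large", "aspect": "elongated"}]

def relate_geons(attributes):
    """L4-L5: relations among geons (above, beside, below)."""
    return [("geon-1", "above", "geon-2")]

def build_assemblies(attributes, relations):
    """L6: geon feature assemblies, each geon plus its relations."""
    return [(geon, relations) for geon in attributes]

def recognize(assemblies):
    """L7: the network's best guess, based on activity summed over time."""
    return "best-matching object"

def jim_pipeline(image):
    features = extract_features(extract_edges(image)["edges"])
    attributes = assign_attributes(features)
    relations = relate_geons(attributes)
    return recognize(build_assemblies(attributes, relations))

print(jim_pipeline("line drawing of a mug"))
```

Even at this level of caricature, the sketch makes the componential structure of the model visible: each stage consumes the previous stage's output and feeds the next.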
10 To some extent, the differences between the two may reflect not deep, underlying differences in their cognitive structure, but rather differences in the assumptions and methods of the experimental community that has investigated each one. What is called ‘perceptual categorization’ and object recognition may in fact be two names for the same capacity (or at least partially overlapping capacities). But the literatures on the two problems are, so far at least, substantially distinct.
The second two models come from work on concepts and categorization. Categorization and object recognition are related but distinct tasks.10 In categorizing, one takes some information about an object—perceptual, functional, historical, contextual/ecological, theoretical/causal, etc.—and comes to a judgment about what sort of thing it is. A furry animal with pointy ears that meows is likely a cat; a strong, odorless alcoholic beverage is likely vodka; a meal bought at a fast food restaurant is likely junk food; and so on. Like object recognition, categorization can be viewed as a kind of inference from evidence. But categorization can draw on a wider range of evidence than merely perceptual qualities (politicians are defined by their role, antique tables are defined by their historical properties, etc.), since concepts are representations that can group things together in ways that cross-cut their merely perceptual similarities. As in the case of object recognition, there are far too many different models of categorization to survey here.11 I will focus on two models that share some common assumptions and structure: the ALCOVE model (Kruschke 1992), and the SUSTAIN model (Love et al. 2004; Love and Gureckis 2007). What these models have in common is that they explain how we categorize as a process of comparing new stimuli to stored exemplars (representations of individual category members) in memory. The similarity between the stimulus and the stored exemplars determines which ones it will be classified with. Both of these can be regarded as descendants of the Generalized Context Model (Nosofsky 1986). The GCM assumes that all exemplars (all of the instances of a category that I have encountered and can remember) are represented by points in a large multidimensional space, where the dimensions of the space correspond to various attributes that the exemplars can have (size, color, animacy, having eyes, etc.). Each exemplar is a measurable distance in this space from every other exemplar. The psychological similarity between two exemplars is a function of their distance in the space.12 Finally, categorization decisions for new stimuli are made on the basis of how similar the new stimulus is to exemplars stored in memory. If the stimulus is more similar to members of one category than another, then it will be categorized with those. Since there will usually be several possible alternatives, this is expressed as the probability of assigning a stimulus s to a category C. The GCM gives us a set of equations relating distance, similarity, and categorization. Attention Learning Covering map (ALCOVE) is a cognitive model that instantiates these equations. It is a feed-forward network that takes a representation of a stimulus and maps it onto a representation of a category. The input layer of the network is a set of nodes corresponding to each possible psychologically relevant dimension that a stimulus can have—that is, each property that can be encoded and stored for use in distinguishing that object from every other object. The greater the 'value' of this dimension for the stimulus, the more strongly the corresponding node is activated. The activity of these nodes is modulated by an attentional gate, which corresponds to how important that dimension is in the present categorization task, or for that type of stimulus. The values of these gates for each dimension may change across items or tasks—sometimes color is more important, sometimes shape, etc. The resulting modulated activity for the stimulus is then passed to the stored exemplar layer. In this layer, each exemplar is represented by a node, and the activity level of these nodes at a particular stage in processing is determined by their similarity to the stimulus. So highly similar exemplars will be strongly activated, moderately similar ones less so, and so on. Finally, the exemplar layer is connected to a set of nodes representing categories (cat, table, politician, etc.). The strength of the activated exemplars determines the activity level of the category nodes, with the most strongly activated node corresponding to the system's 'decision' about how the stimulus should be categorized.
11 For a review of theories of concepts and the phenomena they aim to capture generally, see Murphy (2002). For a review of early work in exemplar theories of categorization, see Estes (1994). For a more recent review, see Kruschke (2008).
12 There are various candidate rules for computing similarity as a function of distance. Nosofsky’s original rule took similarity to be a decaying exponential function of distance from an exemplar, but the details won’t concern us here. The same goes for the rules determining how categorization probabilities are derived from similarities.
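Although, as the footnote notes, the details of these rules do not matter for the argument, it may help to see one standard formulation of the GCM equations just mentioned (essentially the formulation that ALCOVE builds on; response-bias parameters are omitted here for simplicity):

d_{sj} = \Big( \sum_k \alpha_k \, |x_{sk} - x_{jk}|^{r} \Big)^{1/r}, \qquad
\eta_{sj} = e^{-c \, d_{sj}}, \qquad
P(C \mid s) = \frac{\sum_{j \in C} \eta_{sj}}{\sum_{K} \sum_{j \in K} \eta_{sj}},

where x_{sk} is the value of stimulus s on psychological dimension k, the weights \alpha_k play the role of the attentional gates, c is a sensitivity parameter, and r fixes the distance metric (city-block or Euclidean). A stimulus is assigned to a category in proportion to its summed similarity to that category's stored exemplars.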
Supervised and Unsupervised Stratified Adaptive Incremental Network (SUSTAIN) is a model not just of categorization but also of category learning. Its architecture is in its initial stages similar to ALCOVE: the input layers consist of a set of separate detector representations for features, including verbally given category labels. Unlike in ALCOVE, however, these features are discrete-valued rather than continuous. Examples given during training and stimuli given during testing are all represented as sets of value assignments to these features (equivalently, as vectors over features). Again, as with ALCOVE, activation of these features is gated by attention before being passed on for processing. The next stage, however, differs significantly. Whereas in ALCOVE the system represented each exemplar individually, in SUSTAIN the system's memory contains a set of clusters, each of which is a single summary representation produced by averaging (or otherwise combining) individual exemplars. These clusters encode average or prototypical values of the relevant category in each feature.13 The clusters in turn are mutually inhibiting, so activity in one tends to damp down the competition. They also pass activation to a final layer of feature nodes. The function of this layer is to infer any properties of the stimulus that were not specified in the input. For instance, if the stimulus best fits the perceptual profile of a cat, but whether it meows is not specified, that feature would be activated or 'filled in' at the output layer. This allows the model to make inferences about properties of a category member that were not directly observed. Most importantly, the verbal label for the category is typically filled in at this stage. The label is then passed to the 'decision system', which produces it as the system's overall best guess as to what the stimulus is. A few aspects of these models are noteworthy: First, unlike NCAs, cognitive models clearly have a componential structure. All three models consist of (1) several distinct stages or layers, each of which (2) represents its own type of information and (3) processes it according to its own rules. Moreover, ALCOVE and SUSTAIN, at least, also (4) implement the idea of cognitive resources, since they make use of attention to modulate their performance. Representations are tokened at 'locations' within the architecture, processed, and then copied elsewhere. The representations themselves, the layers and sublayers, and the connections among them that implement the processes are all components of the model. And there is control over the flow of processing in the system—though in this case the control systems are fairly dumb, given that these are rigid, feed-forward systems. So cognitive models, unlike NCAs, break a system into its components and their interactions. This places them closer to mechanistic models in at least one respect. Representations are the most obvious components of such models.14 While in classical symbolic systems they would include propositional representations, here they
13 Strictly speaking, SUSTAIN can generate exemplars as well as prototypes. Which it ends up working with depends on what distinctions are most useful in reducing errors during categorization.
But this hybrid style of representation still distinguishes it from the pure exemplar-based ALCOVE model. 14 This is not wholly uncontroversial. Some, such as Ramsey (2007), argue that connectionist networks and many other non-classical models do not in fact contain representations. Rather than enter this debate here, I am taking it at face value that the models represent what the modelers claim.
include elements such as nodes representing either discrete-valued features or continuous dimensions, nodes representing parts or properties of a perceived visual scene, higher-level representations of relations among these parts, nodes representing individual exemplars that the system has encountered or prototypes defined over those exemplars, and nodes representing categories themselves or their verbal labels. These models also contain processing elements that regulate the system, such as attentional gates, inhibitory connections between nodes, and ordinary weighted connections that transmit activity to the next layers. Layers or stages themselves are complex elements composed out of these basic building blocks. The properties of both sorts of elements—their functional profiles—are given by their associated sets of equations and parameters, e.g., activation functions, learning rules, and so on. Second, the organization of these elements and operations corresponds to a map of a causal process. Earlier stages of processing in the model correspond to temporally earlier stages in real-world psychological processing, changes propagated through the elements of the model correspond to causal influences in the real system, activating or inhibiting a representation corresponds to different sorts of real changes in the system, and so on. This also distinguishes these models from NCAs, since the flowchart or program of an NCA is not a causal diagram but an analytical one, displaying the logical dependence relations among functions rather than the organization of components in processing. Cognitive models are thus both componential and causal.15 Third, these models have all been empirically confirmed to some degree or other. JIM has been able to match human recognition performance on tasks involving scale changes, mirror image reversal, and image translations. On all of these, both humans and the model evince little performance degradation. By contrast, for images that are rotated in the visual plane, humans show systematic performance degradation, and so does the model. Similarly, both ALCOVE and SUSTAIN have been compared to a substantial number of datasets of human categorization performance. These include supervised and unsupervised learning, inference concerning category members, name learning, and tasks where shifting attention among features/dimensions is needed for accurate performance. Moreover, SUSTAIN has been able to succeed using a single set of parameter values for many different tasks. Fourth, some aspects of these models clearly display black-boxing or 'filler' components. For instance, Hummel & Biederman note that "[c]omputing axes of symmetry is a difficult problem… the solution of which we are admittedly assuming" (1992, p. 487). The JIM model, when it is run, is simply given these values by hand rather than computing them. These components count as black boxes that presumably are intended to be filled in later. To summarize, cognitive models are componentially organized, causally structured, semantically interpretable models of systems that are capable of producing or instantiating psychological capacities. Like mechanistic models, they can be specified at several different grains of analysis and may make use of epistemic short-cuts like black boxes or filler terms. They can also be confirmed or disconfirmed using established empirical methods.
15 This point is also endorsed by Glennan (2005).
While I will disagree with his interpretation of these models as mechanistic, I agree that, as he puts it, in cognitive models “the arrows represent the causal interactions between the different parts” (p. 456).
Despite these similarities, however, they are not mechanistic models. I turn now to the argument for this claim.
5 Model-based explanations without mechanisms
Reflect first on this functionalist truism: the relationship between a functional state or property of a system and the underlying state or property that realizes it is typically highly indirect. By 'indirect', what I mean is that one cannot in any simple or straightforward way read off the presence of the higher level state from the lower level state. The precise nature of the mapping by which functional properties (including psychological properties) are realized is often opaque. While evidence of the lower level structure of a system can inform, constrain, and guide the construction of a theory of its higher level structure, lower level structures are not simple maps of higher level ones. Thus in psychology we have the obvious, if depressing, truth that the mind cannot simply be read off of the brain. Even if brains were less than staggeringly complex, it would still be an open question whether the organization that one discovers in the brain is the same as the one that structures the mind, and vice versa. In attempting to understand the high level dynamics of complex systems like brains, modelers have recourse to many techniques for constructing such indirect accounts. Here I will focus on just three: reification, functional abstraction, and fictionalization. All of these play a role in undermining the claim that cognitive models are mechanistic. Reification is the act of positing something with the characteristics of a more or less stable and enduring object, where in fact no such thing exists. Perhaps the canonical example of reification in cognitive science is the positing of symbolic representations in classical computational systems. Symbolic representations are purportedly akin to words on a page: discrete, able to be concatenated, moved, stored, copied, deleted. They are stable, entity-like constructs. This has given rise to a great deal of anxiety about the legitimacy of symbolic models. Nothing in the brain appears to 'stand still' in the way that symbols do, and nothing appears to have the properties of being manipulable in the way they are. This case is pushed strenuously by theorists like Clark (1992), who argues that the notion of an explicit symbol having these properties should be abandoned in favor of an account that sees explicit representation as a matter of the ease of retrieval of information and the availability of information for use in multiple tasks.16 In fact, from the point of view of neurophysiology, the distinction between representations and the processes that operate over them seems quite illusory. Representations and processes are inextricably entangled at the level of neural realization. In the dynamics of spike trains, excitatory and inhibitory potentials, and other events, there is no obvious separation between the two—indeed, all of the candidate vehicles of these
16 What this amounts to, then, is explicit representation without symbols. For my purposes here I don't endorse the anti-symbolic conclusion—indeed, part of what I'm arguing is that symbolic models (and representational models more generally) can be perfectly correct and justified even if at the level of neurophysiological events there is only an entangled causal process that lacks the distinctive characteristics of localized, movable, concatenatable symbols.
static, entity-like symbols are themselves processes.17 Dynamical systems theorists in particular have seen this as evidence that representational models of the mind, including all of the cognitive models considered here, should be rejected (Chemero 2009; van Gelder 1995). But this is an overreaction. Reification is a respectable, sometimes indispensable, tool for modeling the behavior of complex systems. A further example of reification occurs in Just and Carpenter's (1992) model of working memory in sentence comprehension. The model (4CAPS) is a hybrid connectionist-production rule system, but one important component is a quantity, called 'activation', representing the capacity of the system to carry out various processes at a stage of comprehension. Activation is a limited-quantity property that attaches to representations and rules in the model. But while activation is an entity in the model, it does not correspond to any entity in the brain. Rather, it corresponds to a whole set of resources possessed by neural regions: "neurotransmitter function and various metabolic support systems, as well as the connectivity and structural integrity of the system" (Just et al. 1999, p. 129). Treating this complex set of properties as a singular entity facilitates understanding the dynamics of normal comprehension, impaired comprehension due to injuries, and individual differences in comprehension. Moreover, it seems to offer a causal explanation of the system's functioning: as these resources increase and decrease, the system becomes more or less able to process various representations. Functional abstraction occurs when we decompose a modeled system into subsystems and other components on the basis of what they do, rather than their correspondence with organizations and groupings in the target system. To stave off an immediate objection, this isn't to say that functional groupings in systems are independent of their underlying physical, structural, and other organizational properties. But for many such obvious ways of dividing up the system there can also be crosscutting functional groupings: ways of dividing the system up functionally that do not map onto the other sorts of organizational divisions in the system. Any system that instantiates functions that are not highly localized possesses this feature. An example of this in practice is the division of processing in these three models into layer-like stages. For example, in JIM there are separate stages at which edges are extracted, vertices detected, attributes assigned, and geons represented. Earlier stages provide information to later stages and causally control them. At the same time, there appears to be a hierarchy of representation in visual processing in the brain. On the simplest view, at early stages (e.g., the retina or shortly thereafter), visual images are represented as assignments of luminance values to points in visual space. At progressively 'higher' (causally downstream and more centrally located) stages, more complex and abstract qualities are detected by successive regions that correspond to maps of visual space in terms of their proprietary features. So edges, whole line segments, movement, color, and so on, are detected by these higher level feature maps. The classic map of these areas was given by Felleman and van Essen (1991); for more recent work, see van Essen (2004). Since these have something like the structure of
17 For a careful, thorough look at what would be required for neural systems to implement the symbolic properties of Turing machine-like computational systems, see Gallistel and King (2009).
the layers or stages of JIM, one might expect to be able to map the activity of layers in JIM onto these stages of neural processing. Unfortunately, this is not likely to be possible. At the very least it would entail skipping steps, since there are likely to be several neural intermediaries between, e.g., edge detection and vertex computation. But more importantly, there may not be distinct neural maps whose component features and processes correspond to the layers of JIM. This point has even greater application to ALCOVE and SUSTAIN: since we can categorize entities according to indefinitely many features and along indefinitely many dimensions, even the idea of a structurally fixed and reasonably well-localized neural region that initiates categorization seems questionable. The same could be said of entities such as attentional gates. Positing such entities involves creating spots in the model where a complex capacity like attention can be plugged in and affect the process of categorization. But the notion of a ‘place’ in processing where attention modulates categorization is just the notion of attention’s having a functionally defined effect on the process. There need not be any such literal, localized place. If these models were intended to correspond structurally, anatomically, or in an approximate physiological way to the hierarchical organization of the brain, this would be evidence that they are at best incomplete, and at worst false. However, since these layers are functional layers, all that matters is that there is a stable pattern of organization in the brain that carries out the appropriate processes assigned to each layer, represents the appropriate information, and has the appropriate sort of internal and external causal organization. For example, there may not be three separate maps for vertex, axis, and blob information in visual cortex. This falsifies the model only if one assumes a simple correspondence between neural maps and cognitive maps. There is some evidence that modelers are beginning to orient themselves away from localization assumptions in relating cognition to neural structures. The slogan of this movement is ‘networks, not locations’. Just et al. (1999) comment that “[a]lmost every cognitive task involves the activation of a network of brain regions (say, 4-10 per hemisphere) rather than a single area” (p. 129). No single area “does” any particular cognitive function; rather, responsibility is distributed across regions. Moreover, cortical areas are multifunctional, contributing to the performance of many different tasks (p. 130). In the same spirit, Barrett (2009, pp. 332), argues that psychological primitives are functional abstractions for brain networks that contribute to the formation of neuronal assemblies that make up each brain state. They are psychologically based, network-level descriptions. These networks are distributed across brain areas. They are not necessarily segregated (meaning that they can partially overlap). Each network exists within a context of connections to other networks, all of which run in parallel, each shaping the activity in the others. Finally, van Orden et al. (2001) amass a large amount of empirical evidence that any two putative cognitive functions can be dissociated by some pattern of lesion data, suggesting that any such localization assumptions are likely to fail. Even in cases where there are such correspondences, they are likely to be partial. 
It has been proposed that several of the major functional components of SUSTAIN can
be mapped onto neural regions: for instance, the hippocampus builds and recruits new clusters, the perirhinal cortex detects when a stimulus is familiar, and the prefrontal cortex directs encoding and retrieval of clusters (Love and Gureckis 2007). First, none of these are unique functions of these areas; at best, they are among their many functions. This comports with the general idea that neural regions are reused to carry out many cognitive functions. Second, and importantly, the exemplar storage system itself is not modeled here, a fact most plausibly explained by its being functionally distributed across wide regions of cortex. As Wimsatt (2007, pp. 191–192) remarks, “it is not merely that functionally characterized events and systems are spatially distributed or hard to locate exactly…. The problem is that a number of different functionally characterized systems, each with substantial and different powers to affect (or effect) behavior appear to be interdigitated and intermingled as the infinite regress of qualities-within-qualities of Anaxagoras’ seeds.” Cognitive models are poised to exploit just this sort of organization. Finally, fictionalization involves putting components into a model that are known not to correspond to any element of the modeled system, but which serve an essential role in getting the model to operate correctly.18 In JIM, for instance, binding between two or more representations is achieved by synchronous firing, as is standard in many neural networks. To implement this synchrony, cells are connected by dedicated pathways called ‘Fast Enabling Links’ (FELs) that are distinct from the activation and inhibition-carrying pathways. Of their operation, the authors say (p. 498): In the current model, FELs are assumed to have functionally infinite propagation speed, allowing two cells to fire in synchrony regardless of the number of intervening FELs and active cells. Although this assumption is clearly incorrect, it is also much stronger than the computational task of image parsing requires. The fact that FELs are independent of the usual channels by which cells communicate, and the fact that they possess physically impossible characteristics such as infinite speed suggests that not only can models contain black boxes or filler terms, they can also contain components that cannot be filled in by any ordinary entities having the normal sorts of performance characteristics. Synchronic firing among distributed nodes needs to happen somehow, and FELs are just the devices that are introduced to fill this need. We could think of FELs as a kind of useful fiction introduced by the modelers. Fictions are importantly unlike standard filler terms or black boxes. They are both an essential part of the operation of the model, and not clearly intended to be eliminated by any better construct in later iterations. In fact, Hummel & Biederman spend a great deal of their paper discussing the properties of FELs and their likely indispensability in future versions of the model (e.g., they discuss other ways of implementing binding through synchrony and argue for the superiority of the FEL approach; see p. 510). But like reified entities and functional abstraction, they do not correspond to parts of the modeled system. That does not mean that they are wholly fictional. 
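It may help to see how little machinery the idealization involves. The following sketch is illustrative only: the cell indices, link lists, and grouping routine are hypothetical and bear no relation to Hummel and Biederman's code. It simply treats FELs exactly as the quoted passage describes them: any chain of FEL-connected cells collapses into a single group that fires on the same time step, with no delay accumulating along the chain, which is the physically impossible part.

```python
# Illustrative sketch of the FEL idealization: links with 'functionally infinite
# propagation speed' simply partition cells into groups that fire in perfect
# synchrony, however long the chain of intervening links. Hypothetical code,
# not the JIM implementation.

def synchrony_groups(n_cells, fel_links):
    """Partition cells into the synchronously firing groups induced by FELs."""
    parent = list(range(n_cells))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for i, j in fel_links:
        union(i, j)  # each FEL binds two cells into the same synchrony group

    groups = {}
    for cell in range(n_cells):
        groups.setdefault(find(cell), []).append(cell)
    return list(groups.values())

# A chain 0-1-2-3 and a separate pair 4-5: each group fires as a unit,
# regardless of how many links the synchrony has to 'travel' through.
print(synchrony_groups(6, [(0, 1), (1, 2), (2, 3), (4, 5)]))
```

The point of the sketch is only to make vivid what the fiction buys the modeler: binding by synchrony comes for free, without any story about how real tissue could propagate it.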
18 For an argument that the use of such fictional posits in even highly reliable models is widespread, see Winsberg (2010), Ch. 7.
There is reason to think that distinct neural regions, even at some distance, do synchronize their activity (Canolty et al. 2010). How this happens remains poorly understood, but it is a certainty that there are no dedicated, high-speed fiber connections linking these parts whose sole function is to synchronize firing rates. Alternatively we might say: there is something that does what FELs do, but it isn't an entity or a link or anything of that sort. FELs capture the general characteristic of neural systems that they often fire in synchrony. We can model this with FELs and lose nothing of interest. In modeling, we simply introduce a component that does the needed job—even if we recognize that there is in some sense no such thing. To summarize, we should not be misled into thinking that cognitive models are mechanistic by the fact that they are componential and causal. The reason is that even if the intrinsic structure of cognitive models resembles that of mechanistic models, the way in which they correspond to the underlying modeled system is far less straightforward. These models often posit elements that have no mechanistic 'echo': they do not map onto parts of the realizing system in any obvious or straightforward way. To the extent that they are localizable, they are only coarsely so. In a good mechanistic model, elements appear in a model only when they correspond to a real part of the mechanism itself. This, recall, was the point of the Real Components Constraint (RCC) that was advanced in Sect. 3 to explain why Cummins' NCAs are not explanatory. But when it comes to cognitive models, not everything that counts as a component from the point of view of the model will look like a component in the modeled system itself—at least not if our notion of a component is based on a distinct, relatively localized physical entity like a cortical column, DNA strand, ribosome, or ion channel. In light of this discussion, I would suggest that we need to make at least the following distinctions among types of models:
• Phenomenal models, such as the Hodgkin–Huxley equation;
• Noncomponential analyses, such as Cummins' analytic functional explanations;
• Mechanistic models, of the sort described by Craver, Bechtel, and Glennan;
• Functional models, of which cognitive models as described here are one example.
Phenomenal models obviously differ in their epistemic status, but the latter three types of models seem capable of meeting the general normative constraints on explanatory models perfectly well. In the spirit of explanatory pluralism, we should recognize the characteristic virtues of each modeling strategy rather than attempting to reduce them all to varieties of mechanisms.
6 Objections and replies
I now want to consider several objections to this conception of model-based explanation. First objection: These models are in fact disguised mechanistic models—they're just bad, imperfect, or immature ones. They are models that are suffering from arrested development at the mechanism schema or sketch stage. Once all of the details are adequately filled in and their mechanistic character becomes more obvious, it can be determined to what degree they actually correspond to the underlying system. Reply: Whether this turns out to be true in any particular case depends on the details. Some cognitive models may turn out to have components that can be localized, and to
describe a sequence of events that maps onto a readily identifiable causal pathway in the neurophysiological description of the system. In these cases, the standard assumptions of mechanistic modeling will turn out to be satisfied. The cognitive model would turn out to be an abstraction from the neural system in the traditional sense: it is the result of discarding or ignoring the details of that system in favor of a coarse-grained or black-box description (much as we do with the simple three-stage description of hippocampal function). However, there is no guarantee that this will be possible in all cases. What I am defending is the claim that these models provide legitimate explanations even when they are not sketches of mechanisms. No one should deny, first, that some capacities can be explained in terms of non-localistic or distributed systems. Many multilayer feedforward connectionist networks, as Bechtel and Richardson (1993, pp. 202–229), point out, satisfy this description. The network as a whole carries out a complex function but the subfunctions into which it might be analyzed correspond to no separate part of the network. So there is no way to localize distinct functions, but these network models are still explanatory. Indeed, Bechtel & Richardson argue that network models are themselves mechanistic insofar as their behavior is explicable in terms of the interactions of the simple components, each of which is itself an obviously mechanical unit. They require only that “[i]f the models are well motivated, then component function will at least be consistent with physical constraints” (p. 228). In the case envisaged here, there may be an underlying mechanistic neural system, but this mechanistic structure is not what cognitive models capture. They capture a level of functional abstraction that this mechanistic structure realizes. This is not like the case of mechanism schemata and sketches as described in Sect. 2. There we have what purports to be a model of the real parts, operations, and organization of the mechanism itself—one that may be incomplete in certain respects but which can be sharpened primarily by adding further details. Cognitive models can be refined in this way. But the additional details will themselves be functional abstractions of the same type, and hence will not represent an advance. Glennan (2005) presents a view of cognitive models that is very similar to the one that I advocate here, but he argues that they are in fact mechanistic. Cognitive models—his examples are vowel normalization models—are mechanical, he claims, because they specify a list of parts along with their functional arrangement and the causal relations among them (p. 456); that is, “a set of components whose activities and interactions produce the phenomenon in question” (p. 457). Doing this is not sufficient for being a mechanistic model, in my view. The remaining condition is that the model must actually be a model of a real-world mechanism—that is, there must be the right sort of mapping from model to world. Glennan correctly notes that asking whether a model gets a mechanism “right” is simplistic. Models need not be straightforwardly isomorphic to systems, but may be similar in various respects. The issue is whether the kinds of relationships that I have canvassed count in favor of their similarity or dissimilarity. Cognitive models as I construe them need not map model entities onto real-world entities, or model activities and structures onto real-world activities and structures. 
Entities in models may pick out capacities, processes, distributed structures, or other large-scale functional properties of systems. Glennan shows some willingness to allow these sorts of correspondence:
“[i]n the case of high level cognitive mechanisms, the parts themselves may be complex and highly distributed and may defy our strategies for localization” (2005, p. 459). But if these sorts of correspondences are allowed, and if these sorts of entities are allowed to count as ‘parts’, it is far from clear what content the notion of a mechanism has anymore. It is arguable that the notion of a part of a mechanism should be closely tied to the sort of thing that the localization heuristic counsels us to seek. Mechanisms, after all, involve the coordinated spatial and temporal organization of parts. The heart doesn’t beat unless the ventricles, valves, etc. are spatially arranged in the right sort of way, and long-term potentiation is impossible unless neurotransmitter diffusion across synaptic gaps can occur in the needed time window. Craver (2007, pp. 251–253), emphasizes this fact, noting that compartmentalizing phenomena in spatial locations and determining the spatial structure and orientation of various parts are crucial to confirming mechanistic accounts. If parts are allowed to be smeared-out processes or distributed system-level properties, the spatial organization of mechanisms becomes much more difficult to discern. In the case of ALCOVE and SUSTAIN, the ‘mechanism’ might include large portions of the neocortex, since this may be required for the distributed storage of exemplars. It is more than just a terminological matter whether one wants to count these as parts of mechanisms. Weakening the spatial organization constraint by allowing distributed, nonlocalized parts incurs costs, in the form of greater difficulty in locating the boundaries of mechanisms and stating their individuation conditions.19 Second objection: If these are not mechanism sketches, then they are not describing the real structure of the underlying system at all. So they must be something more like merely phenomenal models: behaviorally adequate, perhaps, but non-explanatory. Reply: This objection gives voice to a view that might be called ‘mechanism imperialism’. It neglects the possibility that a system’s behavior can be explained from many distinct epistemic perspectives, each of which is illuminating. Viewed from one perspective, the brain might be a hierarchical collection of neural mechanisms; viewed from another, it might instantiate a set of cognitive models that classify the system in ways that cut across mechanistic boundaries. This point can be put more sharply. First, recall from Sect. 2 that explanatory models differ from phenomenal models in that they allow for control and manipulation of the system in question, and they allow us to answer various counterfactual questions about the system’s behavior. Cognitive models allow us to do both of these things. Models such as ALCOVE are actually implemented as computer programs, and they can be run on various data sets, under different task demands, with various parameter values systematically permuted, and even artificially lesioned to degrade their performance. Control and manipulation can be achieved because these models depict one aspect of the causal structure of the system. They also describe the ways in which the internal configuration and output performance of the system vary with novel inputs and interventions. So this explanatory demand can be met. 19 The individuation question is particularly important if cognitive functions take place in networks that are reused to implement many different capacities. 
The very same ‘parts’ could then simultaneously be part of many interlocking mechanisms. This too may constitute a reason to use localization as a guide to genuine mechanistic parthood.
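To make the point about control and manipulation concrete, here is a minimal, purely illustrative sketch (in Python) of an exemplar-similarity categorizer of the broad kind discussed above. It is not Kruschke's published ALCOVE; the function names, toy stimuli, and simple choice rule are invented for illustration. The aim is only to show how such a model can be run on data, how its parameters (such as attention weights) can be systematically permuted, and how it can be artificially 'lesioned' to degrade performance.

```python
# Minimal sketch of an exemplar-similarity categorizer with attention weights,
# loosely inspired by models such as ALCOVE. Purely illustrative; not
# Kruschke's published implementation.
import math

def similarity(stimulus, exemplar, attention, c=1.0):
    # Attention weights stretch or compress each stimulus dimension,
    # making stored exemplars more or less similar to the probe.
    distance = sum(a * abs(s - e) for a, s, e in zip(attention, stimulus, exemplar))
    return math.exp(-c * distance)

def categorize(stimulus, exemplars, labels, attention):
    # Each stored exemplar votes for its category in proportion to its similarity.
    votes = {}
    for ex, label in zip(exemplars, labels):
        votes[label] = votes.get(label, 0.0) + similarity(stimulus, ex, attention)
    total = sum(votes.values())
    return {label: v / total for label, v in votes.items()}  # choice probabilities

# Toy training set: two categories defined over two stimulus dimensions.
exemplars = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1), (0.8, 0.2)]
labels = ["A", "A", "B", "B"]

probe = (0.3, 0.7)
intact = categorize(probe, exemplars, labels, attention=(1.0, 1.0))
# "Lesioning" attention to the second dimension removes the model's
# sensitivity to that dimension and changes its choice probabilities.
lesioned = categorize(probe, exemplars, labels, attention=(1.0, 0.0))
print(intact, lesioned)
```

Because the attention weights stretch or compress stimulus dimensions, setting one of them to zero mimics a lesion that removes the model's sensitivity to that dimension, and the resulting change in choice probabilities is the sort of counterfactual-supporting behavior at issue in the reply above.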
Further, these models meet at least one form of the Real Components Constraint (RCC) described at the end of Sect. 2. This may seem somewhat surprising in light of the previous discussion. I have been arguing that model elements need not correspond to parts of mechanisms. How, then, can these models meet the RCC? The answer depends on different ways of interpreting the RCC's demand. Craver, for example, gives a list of criteria that a 'real part' of a mechanism must meet. Real parts, he says (2007, pp. 131–133):
1. Have a stable cluster of properties;
2. Are robust, i.e., detectable with a variety of causally and theoretically independent devices;
3. Are able to be intervened on and manipulated;
4. Are 'physiologically plausible', in the sense of existing only under regular nonpathological conditions.
The constructs posited in cognitive models satisfy these conditions. Take attentional gates as an example. These have a stable set of properties: they function to stretch or compress dimensions along which exemplars can be compared, rendering them more or less similar than they would be otherwise. The effects of attention are detectable by performance in categorization, but also in explicit similarity judgments, in errors in detecting non-attended features of stimuli, etc. Attention can be manipulated both intentionally and implicitly, e.g., by rewarding successful categorizations; it can also be disrupted by masking, presenting distractor stimuli, increasing task demands, etc. Finally, attention has a normal role to play in cognitive functioning. Since attentional gates are model entities that stand in for the functioning of this capacity, they should count as ‘real parts’ by these criteria. The point here is not, of course, that these criteria show that cognitive models are mechanistic. Rather it shows that these conditions, which purport to govern ‘real parts’, are in fact more general. To see this, observe that these criteria could all perfectly well be accepted by Cummins. He requires of a hypothesized analysis of a capacity C that the subcapacities posited be independently attested. This is equivalent to saying that they should be robust (condition 2). Conditions 1 and 3 are also straightforwardly applicable to capacities, and a case can be made for condition 4 as well—the subcapacities in question should characterize normal, not pathological, cognitive functioning. These conditions are not merely ones under which we are warranted in hypothesizing the existence of parts of mechanisms, but general norms of explanations that aspire to transcend the merely phenomenal. Since cognitive models (and non-componential analyses) can satisfy these conditions, and since they provide the possibility of control and manipulation as well as allowing counterfactual predictions, they are not plausibly thought of as phenomenal models. Third objection: If these models need not correspond to the underlying anatomical and physiological organization of the brain, then they are entirely unconstrained. We could in principle pick any model and find a sufficiently strange mapping according to which it would count as being realized by the brain. This approach doesn’t place substantial empirical constraints on what counts as a good cognitive model. Reply: This is a non sequitur, so I will deal with it only briefly. First, cognitive models can be confirmed or disconfirmed independent of neurobiological evidence. Many
such models have been developed and tested solely by appeal to their fit with behavioral data. Where these are sufficiently numerous, they provide strong constraints on acceptable models. For example, ALCOVE and SUSTAIN differ in whether they allow solely exemplar representations to be used or both exemplars and prototypes. Which form of representation to use is a live empirical debate, and the evidence educed for each side is largely behavioral (Malt 1989; Minda and Smith 2002; Smith and Minda 2002). Second, cognitive models are broadly required to be consistent with one another and with our background knowledge. So well-confirmed models can rule out less well-confirmed ones if they generate incompatible predictions or have conflicting assumptions. Models can be mutually constraining both within a single domain (e.g., studies of short-term memory) and across domains (e.g., memory and attention). The ease of integrating models from various cognitive task domains is in part what motivates sweeping attempts to model cognitive architecture in a unified framework, such as ACT-R and SOAR (Anderson 1990; Newell 1990). And third, even models that are realized by non-localized states and processes can be empirically confirmed or disconfirmed. The nature of the evidence required to do so is, however, much more difficult to gather than in cases of simple localization. On the status of localization assumptions in cognitive neuroscience, and empirical techniques for moving beyond localization, see the papers collected in Hanson and Bunzl (2010).
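To make the representational contrast mentioned above more concrete, the following is a minimal, purely illustrative sketch (in Python) of the difference between exemplar-based and prototype-based categorization. It is not an implementation of ALCOVE or SUSTAIN; the toy stimuli, the similarity rule, and the function names are hypothetical, chosen only to show how the two schemes can generate diverging predictions that behavioral data could then adjudicate.

```python
# Illustrative contrast between exemplar-based and prototype-based classification.
# A hypothetical sketch of the representational difference at issue in the
# exemplar/prototype debate; not an implementation of ALCOVE or SUSTAIN.
import math

def exemplar_score(probe, exemplars, c=2.0):
    # Summed similarity to every stored instance of the category.
    return sum(math.exp(-c * sum(abs(p - e) for p, e in zip(probe, ex)))
               for ex in exemplars)

def prototype_score(probe, exemplars, c=2.0):
    # Similarity to the category's central tendency (the average member).
    proto = [sum(dim) / len(exemplars) for dim in zip(*exemplars)]
    return math.exp(-c * sum(abs(p - q) for p, q in zip(probe, proto)))

category = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]  # four studied items
old_item = (0.0, 0.0)       # a previously studied exemplar
central_item = (0.5, 0.5)   # the unstudied prototype

# The two schemes can disagree about which test item looks more typical,
# which is why behavioral data can be used to adjudicate between them.
print(exemplar_score(old_item, category), exemplar_score(central_item, category))
print(prototype_score(old_item, category), prototype_score(central_item, category))
```

In this toy case the exemplar scheme favors the previously studied item, while the prototype scheme favors the unstudied central tendency; this is the kind of divergence that the behavioral studies cited above exploit.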
7 Conclusions Mechanistic explanation is a distinctive and powerful framework for understanding the behavior of complex systems, and it has demonstrated its usefulness in a number of domains. None of the arguments here are intended to cast doubt on these facts. However, we should bear in mind—on pain of falling prey to mechanism imperialism—that there are other tools available for modeling complex systems in ways that give us explanatory traction. Herein I have argued that, first, the norms governing mechanistic explanation are general norms that can be applied to a variety of domains. Second, even noncomponential forms of explanation such as functional analysis can in principle satisfy these norms. And third, cognitive models of the sort that are common in many parts of psychology, cognitive neuroscience, and neuropsychology can also satisfy these normative standards without thereby being mechanistic. While psychological models do not describe mechanisms, they are not defective for failing to do so. They gain their explanatory force from picking out functionally individuated strands in the complex causal web that winds through the brain. We can gain various forms of practical and epistemic leverage by attending to these strands, but we should not expect to be able to descry them in the underlying mechanistic structure of the brain itself. Acknowledgments The author thanks the following people for helpful comments and discussion: Carl Craver, Gualtiero Piccinini, and the participants in the Society for the Metaphysics of Science symposium at the 2010 Pacific APA meeting, where an earlier version of this material was presented. Thanks also to an anonymous reviewer for this journal for very thoughtful comments on a previous draft.
References

Anderson, J. R. (1990). The adaptive character of thought. Hillsdale, NJ: Lawrence Erlbaum.
Barrett, L. F. (2009). The future of psychology: Connecting mind to brain. Perspectives on Psychological Science, 4, 326–339.
Bechtel, W. (2006). Discovering cell mechanisms: The creation of modern cell biology. Cambridge: Cambridge University Press.
Bechtel, W. (2008). Mental mechanisms: Philosophical perspectives on cognitive neuroscience. New York: Routledge.
Bechtel, W. (2009). Looking down, around, and up: Mechanistic explanation in psychology. Philosophical Psychology, 22, 543–564.
Bechtel, W., & Abrahamsen, A. (2005). Explanation: A mechanist alternative. Studies in History and Philosophy of Biological and Biomedical Sciences, 36, 421–441.
Bechtel, W., & Richardson, R. C. (1993). Discovering complexity. Princeton: Princeton University Press.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115–147.
Biederman, I. (1995). Visual object recognition. In S. F. Kosslyn & D. N. Osherson (Eds.), An invitation to cognitive science (2nd ed., pp. 121–165). Cambridge: MIT Press.
Boden, M. (2006). Mind as machine: A history of cognitive science (2 Vols.). Oxford: Oxford University Press.
Bokulich, A. (2011). How scientific models can explain. Synthese, 180, 33–45.
Canolty, R. T., Ganguly, K., Kennerley, S. W., Cadieu, C. F., Koepsell, K., & Wallis, J. D., et al. (2010). Oscillatory phase coupling coordinates anatomically dispersed functional cell assemblies. Proceedings of the National Academy of Sciences, 107, 17356–17361.
Chemero, A. (2009). Radical embodied cognitive science. Cambridge: MIT Press.
Clark, A. (1992). The presence of a symbol. Connection Science, 4, 193–205.
Craver, C. F. (2006). What mechanistic models explain. Synthese, 153, 355–376.
Craver, C. F. (2007). Explaining the brain. Oxford: Oxford University Press.
Craver, C. F. (2009). Mechanisms and natural kinds. Philosophical Psychology, 22, 575–594.
Craver, C. F., & Bechtel, W. (2006). Mechanism. In S. Sarkar & J. Pfeifer (Eds.), Philosophy of science: An encyclopedia (pp. 469–478). New York: Routledge.
Cummins, R. (1983). The nature of psychological explanation. Cambridge: MIT Press.
Estes, W. K. (1994). Classification and cognition. Oxford: Oxford University Press.
Felleman, D. J., & van Essen, D. C. (1991). Distributed hierarchical processing in primate visual cortex. Cerebral Cortex, 1, 1–47.
Gallistel, C. R., & King, A. P. (2009). Memory and the computational brain. Malden, MA: Wiley-Blackwell.
Giere, R. (1988). Explaining science: A cognitive approach. Chicago: University of Chicago Press.
Glennan, S. (1996). Mechanisms and the nature of causation. Erkenntnis, 44, 49–71.
Glennan, S. (1997). Capacities, universality, and singularity. Philosophy of Science, 64, 605–626.
Glennan, S. (2005). Modeling mechanisms. Studies in History and Philosophy of Biological and Biomedical Sciences, 36, 443–464.
Hanson, S. J., & Bunzl, M. (Eds.). (2010). Foundational issues in human brain mapping. Cambridge, MA: MIT Press.
Hummel, J. E., & Biederman, I. (1992). Dynamic binding in a neural network for shape recognition. Psychological Review, 99, 480–517.
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99, 122–149.
Just, M. A., Carpenter, P. A., & Varma, S. (1999). Computational modeling of high-level cognition and brain function. Human Brain Mapping, 8, 128–136.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.
Kruschke, J. K. (2008). Models of categorization. In R. Sun (Ed.), The Cambridge handbook of computational psychology (pp. 267–301). Cambridge: Cambridge University Press.
Love, B. C., & Gureckis, T. M. (2007). Models in search of a brain. Cognitive, Affective, and Behavioral Neuroscience, 7, 90–108.
Love, B. C., Medin, D. L., & Gureckis, T. M. (2004). SUSTAIN: A network model of category learning. Psychological Review, 111, 309–332.
Machamer, P. (2004). Activities and causation: The metaphysics and epistemology of mechanisms. International Studies in the Philosophy of Science, 18, 27–39.
Machamer, P., Darden, L., & Craver, C. (2000). Thinking about mechanisms. Philosophy of Science, 67, 1–25.
Malt, B. C. (1989). An on-line investigation of prototype and exemplar strategies in categorization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 539–555.
Minda, J. P., & Smith, J. D. (2002). Comparing prototype-based and exemplar-based accounts of category learning and attentional allocation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 275–292.
Murphy, G. (2002). The big book of concepts. Cambridge: MIT Press.
Newell, A. (1990). Unified theories of cognition. Cambridge: Harvard University Press.
Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology, 115, 39–57.
Polk, T. A., & Seifert, C. M. (Eds.). (2002). Cognitive modeling. Cambridge: MIT Press.
Ramsey, W. (2007). Rethinking representation. Cambridge: Cambridge University Press.
Smith, J. D., & Minda, J. P. (2002). Distinguishing prototype-based and exemplar-based processes in dot-pattern category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 800–811.
Tarr, M. J. (2002). Object recognition. In L. Nadel (Ed.), Encyclopedia of cognitive science (pp. 490–494). London: Nature Publishing Group.
Tarr, M. J., & Bülthoff, H. H. (1998). Object recognition in man, monkey, and machine. Cambridge: MIT Press.
Ullman, S. (1996). High-level vision: Object recognition and visual cognition. Cambridge: MIT Press.
van Essen, D. C. (2004). Organization of visual areas in macaque and human cerebral cortex. In L. Chalupa & J. S. Werner (Eds.), The visual neurosciences (pp. 507–521). Cambridge: MIT Press.
van Gelder, T. (1995). What might cognition be, if not computation? Journal of Philosophy, 92, 345–381.
van Orden, G. C., Pennington, B. F., & Stone, G. O. (2001). What do double dissociations prove? Cognitive Science, 25, 111–172.
Weiskopf, D. A. (2011). The functional unity of special science kinds. British Journal for the Philosophy of Science, 62, 233–258.
Wilson, R. A. (2004). Boundaries of the mind: The individual in the fragile sciences—Cognition. Cambridge: Cambridge University Press.
Wimsatt, W. C. (2007). Re-engineering philosophy for limited beings: Piecewise approximations to reality. Cambridge: Harvard University Press.
Winsberg, E. (2010). Science in the age of computer simulation. Chicago: University of Chicago Press.
Woodward, J. (2002). What is a mechanism? A counterfactual account. Philosophy of Science, 69, S366–S377.
Synthese (2011) 183:339–373 DOI 10.1007/s11229-011-9970-0
Explanation and description in computational neuroscience David Michael Kaplan
Received: 31 October 2010 / Accepted: 21 June 2011 / Published online: 5 July 2011 © Springer Science+Business Media B.V. 2011
Abstract The central aim of this paper is to shed light on the nature of explanation in computational neuroscience. I argue that computational models in this domain possess explanatory force to the extent that they describe the mechanisms responsible for producing a given phenomenon—paralleling how other mechanistic models explain. Conceiving computational explanation as a species of mechanistic explanation affords an important distinction between computational models that play genuine explanatory roles and those that merely provide accurate descriptions or predictions of phenomena. It also serves to clarify the pattern of model refinement and elaboration undertaken by computational neuroscientists. Keywords Explanation · Mechanism · Computational models · Computational neuroscience
1 Introduction Universal agreement over the correct approach to thinking about explanation in neuroscience remains an elusive goal. Nevertheless, there is a building consensus that many sectors of neuroscience explain phenomena by revealing underlying mechanisms (Bechtel 2008; Craver 2007; Machamer et al. 2000). In spite of this convergence, one persistent source of debate concerns the extensibility of the framework of mechanistic explanation to “higher-level” fields of neuroscience including cognitive and computational neuroscience (Chemero and Silberstein 2008; Revonsuo 2001;
D. M. Kaplan (B) Department of Anatomy and Neurobiology, Washington University School of Medicine, Box 8108, 660 South Euclid Avenue, Saint Louis, MO 63110, USA e-mail:
[email protected]
Rusanen and Lappi 2007; Von Eckart and Poland 2004).1 It is therefore important to examine whether the mechanistic framework accurately expresses the standards for distinguishing adequate from defective explanations in these increasingly important areas of neuroscience. In this paper, I argue that explanation in computational neuroscience is best understood as a species of mechanistic explanation.2 In order to make the case for extending the mechanistic framework to explanation in computational neuroscience, I identify a general constraint on explanatory mechanistic models according to which models are explanatory of a phenomenon if there is a mapping between elements in the model and elements in the mechanism for the phenomenon. This mechanism–model–mapping (3M) constraint embodies widely held commitments about the requirements on mechanistic explanations and provides more precision about those commitments. The idea of being in compliance with the 3M constraint is shown to have considerable utility for understanding the explanatory force of models in computational neuroscience, and for distinguishing models that explain from those merely playing descriptive and/or predictive roles. Conceiving computational explanation in neuroscience as a species of mechanistic explanation also serves to highlight and clarify the pattern of model refinement and elaboration undertaken by computational neuroscientists. The paper is organized as follows. First, I address an important preliminary issue concerning the putative autonomy of computational explanation in psychology and show how various arguments against the possibility of mechanistic computational explanation in neuroscience do not go through (Sect. 2). With this major obstacle removed, I outline the 3M constraint (Sect. 3) and describe how it encapsulates two powerful distinctions for demarcating explanatory from non-explanatory mechanistic models (Sect. 4). I then examine several paradigmatic models in computational neuroscience (Sects. 5–7), showing how each model is subject to, yet differentially satisfies, the 3M constraint. Crucially, all the models considered are in different stages of being transformed into mechanistic explanations. I close by addressing the general relationship between mathematical modeling and mechanistic explanation (Sect. 8). 2 Autonomous computational explanation? Slightly less than three decades ago, Johnson-Laird (1983) expressed what was then a mainstream perspective on computational explanation in cognitive science: “The mind can be studied independently from the brain. Psychology (the study of the programs) can be pursued independently from neurophysiology (the study of the machine and 1 For a related discussion of mechanistic explanation in cognitive neuroscience with specific focus on the
explanatory force of dynamical models, see Kaplan and Craver (forthcoming). 2 Piccinini (2007) defends the closely related claim that computing systems should be understood as
instances of mechanisms of a certain kind. His account thus shares a similar mechanistic orientation, but differs in two important respects from the project undertaken here. First, Piccinini’s goal is to provide an analysis of what computing systems are. His main project is thus ontological. By contrast, the current aim— to provide a mechanistic account of the explanatory force of computational models in neuroscience—is primarily epistemological. Second, when explanation is discussed by Piccinini, focus is placed on how the mechanistic framework applies to ordinary computers and not how it applies to neural systems performing computations (the computing systems of interest in this paper).
the machine code)” (Johnson-Laird 1983, p. 9; also cited in Dawson 1998). According to this perspective, which Piccinini (2006) usefully dubs computational chauvinism, computational explanations of human cognitive capacities can be constructed and confirmed independently of details about how these capacities are implemented in the brain (Johnson-Laird 1983; Pylyshyn 1984). Computational chauvinism can be broken down into three interrelated claims. First, computational explanation in psychology is autonomous from neuroscience (C1). ‘Autonomy’ is to be understood here in the specific sense that evidence about neural organization does not levy substantive constraints on the shape such higher-level explanations can take. Second, computational notions are uniquely appropriate or proprietary to psychological theory (C2). Fodor endorses this claim when he asserts that, in the “language” of neuroscience, “notions like computational state and representation aren’t accessible” (1975, p. 96; also cited in Piccinini 2006). If the theoretical vocabulary of computation is proprietary to the domain of psychology, neuroscience’s prospects for providing its own legitimate computational explanations of cognitive phenomena are dim. Third, computational explanations of cognitive capacities in psychology embody a distinct form of explanation—functional analysis or functional explanation—not to be assimilated to other types (C3). Of particular importance is the fact that this includes mechanistic explanation, the kind of explanation prevalent in the neurosciences and other biological sciences (Bechtel 2008; Bechtel and Richardson 1993; Craver 2007). Clearly, if C1–C3 are correct and computational explanations are outside the reach of neuroscience, then the stated goal of this paper—to provide an analysis of computational explanation in neuroscience—is doomed to failure from the outset. Therefore, a preliminary task is to rebut the challenge from computational chauvinism. The viewpoint of computational chauvinism shares reinforcing connections with a mainstream interpretation of functionalism, once the dominant position in philosophy of mind (Fodor 1974, 1975; Putnam 1967).3 Functionalists have long maintained that psychology can proceed with its explanatory goals independently of evidence from neuroscience about underlying neural mechanisms. Computational functionalism is supposed to justify a theoretically principled neglect of neuroscientific data based on the alleged close analogy between psychological processes and running software (e.g., executing programs or algorithms) on a digital computer, and the multiple realizability of the former on the latter. According to the analogy, the brain merely provides the particular hardware on which the cognitive software happens to run, but the same software could in principle be implemented in indefinitely many other hardware platforms. The brain is thus deemed a mere implementation of the software. If the goal is to understand the structure of the software—what computations are performed— figuring out the hardware is unnecessary. According to the computational functionalist perspective, neuroscience is at best relegated to pursuing a subordinate and 3 See Piccinini (2010) for a dissenting view. Piccinini argues that there is no entailment relation between functionalism and computationalism, despite the fact that they have been historically conjoined. 
In his view, they are independent doctrines embodying different theoretical commitments, with different types of evidence counting for and against them. In light of these concerns, the above discussion can instead be construed in terms of the combined position of computational functionalism (i.e., computationalism combined with functionalism), rather than functionalism simpliciter.
relatively minor role: finding the corresponding neural implementations for computational explanations independently developed in psychology. The consequences of computational chauvinism and functionalism are clearly unfavorable for the explanatory ambitions of computational neuroscience. Fortunately for computational neuroscience, the strong form of autonomy implied by these perspectives has not been borne out in the intervening decades since Johnson-Laird’s bold pronouncement. First, multiple realization and its implications for autonomous psychological explanation have come under serious challenge. As philosophers started to pay closer attention to empirical findings in neuroscience, the evidence for multiple realization began to appear less clear-cut than had been initially assumed (Bechtel and Mundale 1999; Bickle 1998; Shagrir 1998; Shapiro 2000). Second, efforts to construct autonomous computational accounts unconstrained by data from neuroscience, which can then be mapped back on to real neural mechanisms, have been largely unsuccessful (e.g., Newell and Simon 1976). Through no fault of their own, neuroscientists have been unable to discover the implementations for computational processes posited by high-level psychological theories constructed in relative isolation from neuroscience, precisely because of the absence of constraints from evidence about neural structure and function. This pattern of failure undermines autonomy (as expressed in C1), suggesting that the strict division of intellectual labor assumed by computational chauvinism is deeply misguided (Churchland 1986, 2005; Revonsuo 2001). Computational chauvinism is not yet defeated, however, since it finds additional reinforcement from a widespread interpretation of the influential computational framework of Marr (1982).4 Some investigators have taken Marr’s hierarchical framework to imply the possibility of autonomous computational explanations of cognitive capacities insensitive to details about how such capacities are realized in the brain, in line with C1 (Dror and Gallogly 1999; Rusanen and Lappi 2007). Two aspects of this framework have invited this interpretation. First, Marr repeatedly prioritizes the explanatory importance of the computational level.5 This explanatory privileging of the computational level, coupled with the fact that his preferred methodology is top-down, moving from the computational level to the algorithmic, and ultimately, to implementation, has fostered the idea of autonomous computational explanation. Second, in some passages, Marr appears to claim that there are no direct constraints between levels. At best the operative constraints are weak or indirect, flowing downward from the computational level. For instance, he states this: “since the three levels are only rather loosely related, some phenomena may be explained at only one or two of them” (1982, p. 25). If computational explanations were entirely unconstrained by one
4 As is well known, Marr (1982) distinguishes three levels: the computational level (a specification of the input–output function being computed), the algorithmic level (a specification of the representations and computational transformations defined over those representations), and implementation level (a specification of how the other levels are physically realized). 5 For instance, Marr states: “it is the top level, the level of computational theory, which is critically important from the information-processing point of view. The reason for this is that the nature of the computations …depends more upon the computational problems to be solved than upon the particular hardware in which their solutions are implemented” (1982, p. 27).
another in this manner, it would not be difficult to see how this could be used to draw a conclusion about explanatory autonomy.6 When Marr addresses the key explanatory question of whether a given computational model or algorithmic description is appropriate for the specific target system under investigation, his position changes markedly. On the one hand, as indicated above, his general computational framework clearly emphasizes that the same computation might be performed by any number of algorithms and implemented in any number of diverse hardwares. Yet, on the other hand, when the focus is on the explanation of a particular cognitive capacity such as human vision, Marr firmly rejects the idea that any computationally adequate algorithm (i.e., one that produces the same input–output transformation or computes the same function) is equally good as an explanation of how the computation is performed in that particular system. Consider the fact that after outlining their computational proposal for the extraction of zerocrossings in early vision, Marr and Hildreth immediately turn to a determination of “whether the human visual system implements these algorithms or something close to them” (Marr and Hildreth 1980, p. 205; see also Marr et al. 1979; Marr 1982). This is not an ancillary task. They clearly recognize this to be a critical aspect of their explanatory account. For Marr, sensitivity to information about properties of the human visual system (e.g., the response properties of retinal ganglion neurons) is crucial not only for building successful computer vision algorithms, but also for building successful explanations. According to Marr, the ultimate adequacy of these computational and algorithmic specifications as explanations of human vision is to be assessed in terms of how well they can be brought into registration with known details from neuroscience about their biological implementation. Marr thus parts company with computational chauvinists who maintain that computational explanations are autonomous and unconstrained by evidence about neural implementation. Another striking development weighing against the form of explanatory autonomy embodied by computational chauvinism is the recent emergence of computational neuroscience as a mature field of research.7 As we will see in more detail below, computational neuroscientists routinely employ computational notions to interpret and explain neural systems, demonstrating that such notions are in fact “accessible” to neuroscience. By describing neural systems as carrying out computations, computational neuroscientists seriously weaken the plausibility of (C2), the claim that computational constructs are proprietary to the explanatory framework of psychology. Of course, the computational chauvinist can object that the use of these 6 Rusanen and Lappi (2007), for example, assert that “[i]n the Marrian model, the higher levels are con-
sidered largely autonomous with respect to the levels below, and thus the computational problems of the highest level may be formulated independently of the assumptions about the algorithmic or neural mechanisms which perform the computation.” 7 Computational neuroscience is distinguished from other branches of neuroscience such as experimental
neuroscience by its heavy reliance on tools from applied mathematics and computer science to analyze and model neural systems (Churchland et al. 1988; Dayan and Abbott 2001). Sometimes computational neuroscience is distinguished by its theoretical commitment to the idea that target system under investigation—the brain—performs computations. Sejnowski gives clear expression to this idea: “The term ‘computation’ in computational neuroscience refers to the way that brains process information” (2001, p. 2460). However, this commitment is not required.
notions beyond their original domain of application in cognitive science is simply illegitimate. In doing so, however, the computational chauvinist stands exposed to the very same charge—since cognitive scientists arguably imported their “proprietary” notion from computability theory originally developed in the fields of mathematical logic and computer science.8 Moreover, because computational neuroscience employs its own notion of computation specifically tailored to its distinctive theoretical interests and explanatory goals, it cannot reasonably be viewed as trespassing on the privileged conceptual terrain of computational cognitive science. Not only do computational neuroscientists routinely build explanatory models that legitimately invoke computations, their models also serve to impose important constraints on psychological theorizing about the computational processes subserving cognitive functions. The field of computational neuroscience thus stands as a powerful existence proof against the claim that computational constructs are proprietary to psychology. Finally, there is reason to doubt the computational chauvinist’s assumption that explanation in psychology differs in kind (C3), and is therefore autonomous from mechanistic explanation in neuroscience. This particular strain of autonomy thinking is especially prevalent among philosophers of the special sciences. The central idea is that explanations of many complex systems—ranging from electronic circuits to digital computers to human brains—proceed via functional analysis (Cummins 1975; Cummins 1983; Haugeland 1987).9 According to the explanatory strategy of functional analysis, the overall behavioral capacities of a target system (e.g., a computational system) is explained by breaking down or decomposing these capacities into a number of “simpler” subcapacities and their functional organization.10 Cummins is quick to point out that functional analysis should not to be confused with mechanistic forms of explanation (his preferred term for this contrasting approach is “componential analysis”). Unlike mechanistic explanations, which explain phenomena via the attribution of various subcapacities to the physical components in the underlying mechanism (for further details, see Sect. 3), functional analyses explain by attributing subcapacities to the system as a whole (Cummins 1983, p. 29). In addition, the “components” identified in such analyses are functionally specified or individuated, and therefore need not be physical parts of the analyzed system. According to Cummins, “[s]ince we do this sort of analysis without reference to an instantiating system, the analysis is evidently not an analysis of the instantiating system” (Cummins 1983, p. 29). Elsewhere Cummins elaborates that functional explanations can be provided “independent of whatever theory is relevant to describing the details of the realization” (Cummins 1983, p. 100). As a result, functional analyses of human cognitive capacities are supposed to be flexible in the degree to which they abstract away from 8 Arguably, no single notion of computation is theoretically privileged. Although all usage might derive (historically, conceptually) from some more general, Turing-equivalent notion, what the term ‘computation’ means in each different discipline reflects the specific theoretical and explanatory goals of that discipline. For a full defense of this claim, see Aizawa (2010). 
9 I focus exclusively on the account of functional analysis provided by Cummins (1983), as it is the most
developed in the literature. 10 According to Cummins: “Functional analysis consists in analyzing a disposition into a number of less
problematic dispositions such that the programmed manifestation of these analyzing dispositions amounts to a manifestation of the analyzed disposition” (1983, p. 28).
details about how those analyzed capacities and sub-capacities are implemented by components and activities in neural systems. Because evidence from neuroscience exerts little or no direct constraint on the shape such analyses can take, and is deemed irrelevant to the overall success of the analysis, a path is evidently opened up for autonomous psychological explanation.11 The divide separating functional analysis and mechanistic explanation is not, however, as wide as it initially seems. Some advocates of the mechanistic approach have explicitly identified functional analysis (or functional decomposition) as a key stage in the development of mechanistic explanations (Bechtel and Richardson 1993; Bechtel 2002). Needless to say, if functional analysis is a critical aspect of mechanistic explanations, maintaining its distinctness from them is deeply problematic. However, the defender of autonomous functional explanation could continue to hold that functional analysis embodies a non-mechanistic form of explanation, by arguing that implementation details about how functional subcapacities are localized to certain physical components in the mechanism—even if such details are typically included—have no real impact on its quality as an explanation. Paradoxically, the prospects for this particular response on behalf of the defender of autonomy are undercut by considerations raised directly by Cummins. Cummins in fact endorses a much tighter relationship between functional analysis and mechanistic explanation than he initially implies. Shortly after declaring the need to keep sharply separated functional and componential analysis, Cummins argues that a “complete” theory or account of a given cognitive capacity “must exhibit the details of the target property’s instantiation in the system (or system type) that has it. Analysis of the disposition (or any other property) is only a first step: instantiation is the second” (1983, p. 31). Cummins continues: “Functional analysis of a capacity C of a system S must eventually terminate in dispositions whose instantiations are explicable via analysis of S. Failing this, we have no reason to suppose we have analysed C as it is instantiated in S” (1983, p. 31). According to Cummins, then, not every functional analysis of a system is equally well-suited to play an explanatory role. The explanatory success of a proposed functional decomposition instead depends on how well it aligns with a structural decomposition of the system into components parts and their activities. Otherwise, one might reasonably wonder, how is the adequacy of a given functional decomposition to be assessed? Cummins appears to recognize that without this further constraint, the resultant analysis could be one possible but non-actual way the system is functionally organized. Clearly, this kind of functional decomposition recommended by Cummins is not autonomous from details about implementation, but instead critically relies on such details for explanatory traction. In fact, this fusion of functional analysis and instantiation directly parallels the interconnected heuristic strategies of functional decomposition and localization at the core of the mechanistic framework (Bechtel and Richardson 1993). When Cummins combines functional analysis with instantiation as two aspects of a complete explanation, the distinction between functional and mechanistic explanation collapses. Importantly, because mechanistic 11 What proponents of autonomy see as a feature, mechanists see as a bug. 
Craver (2001) criticizes precisely this aspect of functional explanations, claiming that Cummins’ notion of role function is underspecified because it abstracts away from crucial details of how such functions are implemented.
explanations are constitutive explanations, accounting for the behavioral capacities (and subcapacities) of a target system as a whole in terms of activities and interactions among the component parts of that system, they are at base antithetical to claims of higher-level autonomy. This general failure to differentiate functional explanation (supplemented in the way Cummins suggests) from mechanistic explanation powerfully undermines its utility for securing the explanatory autonomy of computational cognitive psychology.12 The mechanistic framework stands diametrically opposed to the kind of explanatory autonomy assumed by computational chauvinism, functionalism, and extreme forms of computational and functional analysis. Each problematically assumes that computational explanations can be provided via autonomous functional decomposition without delineating the underlying mechanisms (neural or otherwise) that implement the computational process. The claims of autonomy canvassed above are tantamount to claims that computational explanation is not to be evaluated in terms of the powerful norms of mechanistic explanation. According to the view defended here, explanations in computational neuroscience are subject to precisely these same norms. The cost imposed by departing from this view, as both Marr and Cummins appear to recognize, is the loss of a clear distinction between computational models that genuinely explain how a given phenomenon is actually produced versus those that merely describe how it might possibly be produced (discussed in more detail below). And it is precisely by adhering to this distinction (along with a distinction between merely describing or saving a phenomenon and explaining it), that one can identify models in computational neuroscience possessing explanatory force. 3 Mechanistic models Given the recent flood of interest in mechanistic explanation, surprisingly little attention has been given to the nature of mechanistic models themselves, especially the issue of precisely which features endow them with explanatory power. In this section, I outline a constraint on explanatory mechanistic models. In the sections to follow, I employ this constraint to identify explanatory models in computational neuroscience. Because the constraint involves a mapping between model and mechanism, it is useful to begin with a brief reminder about how the term ‘mechanism’ should be understood. Bechtel and Abrahamsen (2005) define a mechanism as follows: “A mechanism is a structure performing a function in virtue of its component parts, component operations, and their organization. The orchestrated functioning of the mechanism is responsible for one or more phenomena” (Bechtel and Abrahamsen 2005, p. 423). Machamer et al. (2000) characterize mechanisms as “entities and activities organized such that they are productive of regular changes from start or set-up to finish or termination conditions” (2000, p. 3). The common thread between these definitions is that mechanisms are active structures that perform functions, produce regularities, underlie capacities or exhibit phenomena; doing so in virtue of the organized interaction 12 See Piccinini and Craver (2011) for a much more developed account of functional explanation as a kind
of elliptical mechanistic explanation.
among the mechanism’s component parts, and the processes or activities these parts carry out.13 Next consider mechanistic models. A central tenet of the mechanistic framework is that a model carries explanatory force to the extent it reveals aspects of the causal structure of a mechanism, and lacks explanatory force to the extent it fails to describe this structure.14 This core commitment is captured by the following model–mechanism–mapping (3M) constraint on explanatory mechanistic models: (3M) A model of a target phenomenon explains that phenomenon to the extent that (a) the variables in the model correspond to identifiable components, activities, and organizational features of the target mechanism that produces, maintains, or underlies the phenomenon, and (b) the (perhaps mathematical) dependencies posited among these (perhaps mathematical) variables in the model correspond to causal relations among the components of the target mechanism. By articulating exactly those factors the inclusion of which endows a model with explanatory force, the 3M constraint (hereafter, 3M) provides a clear guideline for the construction of explanatory mechanistic models in computational neuroscience and a standard by which to evaluate them as explanations.15 3M aligns with the highly plausible assumption that the more accurate and detailed the model is for a target system or phenomenon the better it explains that phenomenon, all other things being equal (for a contrasting view, see Batterman 2009). As one incorporates more mechanistically relevant details into the model, for example, by including additional variables to represent additional mechanism components, by changing the relationships between variables to better reflect the causal dependencies among components, or by further adjusting the model parameters to fit more closely what is going on in the target mechanism, one correspondingly improves the quality of the explanation. Importantly, 3M does not entail that only completely detailed, non-idealized models will be explanatorily adequate. 3M is perfectly compatible with elliptical or incomplete mechanistic explanations, in which some of these mechanistic details are omitted either for reasons of computational tractability or simply because these details remain unknown. Far from requiring a perfect correspondence or isomorphic mapping between model and mechanism, 3M requires only that some (at least one) of 13 Although there are important differences between these characterizations, they are irrelevant to the
points made in this paper.
14 To understand the intended import of this claim, especially the embedded sufficiency claim, it is important to distinguish between explaining why a phenomenon (e.g., an effect, event, or outcome) occurs and explaining how it occurs. Explanations of the former type commonly cite the antecedent cause of an event or effect, whereas explanations of the latter type identify properties within the system enabling the causal regularity to hold. The arguments made in this paper pertain only to explanations of this second type. This basic distinction has been drawn repeatedly in the literature. For instance, Salmon (1989) and Craver (2007) distinguish "constitutive" from "etiological" causal-mechanical explanations; Cummins (1983) distinguishes "property theories" from "change theories"; and Haugeland (1987) distinguishes typical D–N style explanations of events from what he calls "systematic explanations".
15 3M thus articulates what Weisberg (2007) calls a representational ideal—a principle serving "to regulate which factors are to be included in models, set up the standards theorists use to evaluate their models, and guide the direction of theoretical inquiry" (2007, p. 648).
the variables in the model correspond to at least some (at least one) identifiable component parts and causal dependencies among components in the mechanism responsible for producing the target phenomenon.16 Insofar as this minimal degree of model–mechanism correspondence obtains, the model will be endowed with explanatory force. Relatedly, 3M allows for explanatory mechanistic models to involve various kinds of idealization or mathematical abstraction. By ‘idealization’ I mean the deliberate introduction of simplifications and distortions into models for the purposes of making them more computationally tractable or to better model the dominant features of the phenomenon of interest (Batterman 2002, 2009; Weisberg 2007). Machamer et al. (2000) implicitly acknowledge this possibility when they countenance mechanism schemata, i.e., “truncated abstract descriptions of a mechanism that can be filled with descriptions of known component parts and activities” (2000, pp. 15–16). Because schemata are elliptical mechanistic explanations in which known details have been intentionally removed, they are idealization-involving by definition. It is important that the mere presence of idealizations and approximations does not jeopardize the explanatory status of a model, according to 3M, since these simplifying devices are ubiquitous in computational neuroscience.17 If 3M entailed that explanatory models could not employ idealizations, very few models (if any) in computational neuroscience would satisfy this overly stringent requirement and count as explanatory. According to 3M, then, a computational model will have explanatory force to the extent that it describes some component parts, activities, and organizational features of the target mechanism (however approximated or idealized).18 4 Two mechanistic distinctions The plausibility for 3M as a general constraint on explanatory mechanistic models is bolstered by the fact that it encapsulates two key distinctions for marking off explanatory mechanistic models from non-explanatory ones. The first is a distinction between 16 3M is not intended as an account of the semantics of mechanistic models (i.e., an analysis of how such
models are capable of representing concrete phenomena). It is not a version of the so-called “semantic view”, which ties the representational capabilities of models to relations of isomorphism, partial isomorphism, or similarity obtaining between model and target (see Downes 1992). Despite common appeals to mapping relations between model and target, such relations are exploited in 3M to account for the explanatory force of mechanistic models, not their representational abilities. I thank an anonymous reviewer for highlighting the possibility of confusion over this point. 17 Examples are easy to find. Continuous time is frequently approximated in computational models of
neural systems with a series of discrete time steps to guarantee tractable simulations on digital computers; synaptic integration is idealized in neural network models by a simple linear weighted sum of a unit’s inputs; cable theory idealizes dendrites and axons as perfectly cylindrical, passive conductors; and compartmental models assume dendritic trees to be branched networks of discrete, isopotential compartments even though at the relevant spatial scale they are continuous structures with a membrane potential that varies continuously and not discretely along its longitudinal axis. 18 Although 3M allows for idealization, it does not prescribe exactly which aspects of a mechanistic model can be idealized or approximated without compromising its explanatory and/or predictive power. This will undoubtedly be dictated in part by which aspects of the target phenomenon the model is being deployed to explain.
phenomenological and mechanistic models (Bunge 1964; Craver 2006; Mauk 2000). The second distinction is between how-possibly and how-actually models (Craver 2006, 2007; Machamer et al. 2000). These two distinctions, along with 3M, will help to shed light on the source of the explanatory power of a range of models in computational neuroscience. 4.1 Phenomenological versus mechanistic models Numerous models across the physical and biological sciences share the following feature: They provide descriptions (often highly abstract, mathematical descriptions) of the phenomena for which explanations are sought. Such models have been dubbed phenomenological models (hereafter, p-models). Philosophers of science have characterized this type of model in roughly similar ways. Bunge (1964) identifies p-models or “black box” models as those that describe the behavioral profile of a target system “as if they were units devoid of structure”, i.e., without positing “unobservables” or intervening variables (1964, p. 235). Craver (2006) follows suit, suggesting that p-models “are complete black boxes; they reveal nothing about the underlying mechanisms and so merely ‘save the phenomenon’ to be explained” (2006, p. 360). Mauk (2000) provides a similar characterization of p-models in neuroscience, defining them as “any physical or mathematical representation of a neural process that produces the correct input/output behavior with no claims about biological validity” (2000, p. 649). Finally, computational neuroscientists reflecting on their own modeling practices have identified an equivalent model type: descriptive model. Descriptive models “summarize data compactly” and are to be distinguished from mechanistic models, which “address the question of how nervous systems operate on the basis of known anatomy, physiology, and circuitry” (Dayan and Abbott 2001, p. xiii). The common thread linking these different construals is that p-models can provide accurate representations of phenomena such as global system behavior or input–output profiles, yet they embody no commitments about the mechanisms responsible for producing those phenomena. Consider the Balmer formula (Eq. 1), which Balmer constructed to describe the four visible lines in the emission spectrum of hydrogen: λ=B
m² / (m² − n²)    (1)
where λ is the wavelength, B is a constant with the value of 364.56 nm, n equals 2, and m is a variable taking integer values greater than 2 (m > n). When m is replaced by any integer from the appropriate range, the model outputs the wavelength for a line in the emission spectrum of hydrogen. The Balmer formula is descriptively adequate. It provides an accurate mathematical description of the target phenomenon and can thus be said to “save” it. The model is also useful for making predictions. Although initially constructed to account for the visible spectral lines for hydrogen, the model successfully predicted the existence of several unobserved spectral lines outside the visible range that were later experimentally confirmed.
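As a purely numerical illustration (a sketch added here, not part of Balmer's own presentation), the formula can be evaluated directly in a few lines of Python; doing so reproduces the four visible lines and yields the further predictions just mentioned, while remaining entirely silent about why the pattern holds.

```python
# A minimal illustration of the Balmer formula as a purely descriptive model:
# plugging in successive integers reproduces the visible hydrogen lines and
# "predicts" further lines, without saying anything about why they occur.
B = 364.56  # nm, Balmer's empirically fitted constant
n = 2

def balmer_wavelength(m):
    return B * m**2 / (m**2 - n**2)

for m in range(3, 8):
    print(m, round(balmer_wavelength(m), 2), "nm")
# m = 3..6 give the four visible lines (about 656, 486, 434, 410 nm);
# larger m values yield predictions that run toward the near-ultraviolet.
```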
Despite its descriptive and predictive successes, the model amounts to little more than ad hoc curve fitting to the empirical data. Balmer settled upon this particular model by trial and error because it provided the best mathematical fit to the four visible spectral lines of hydrogen. None of the model elements support a realistic physical interpretation. For instance, the value Balmer selected for the mathematical constant B was simply a number chosen to fit the data because it stood in the appropriate mathematical relation to every line in the visible emission spectrum for hydrogen. For these reasons, it is widely agreed that the Balmer formula lacks explanatory force and provides no explanation for why the emission spectrum for hydrogen exhibits the characteristic pattern that it does (Cummins 2000; Craver 2006; Hempel 1965).19 As a p-model, it was not constructed to do this. To paraphrase Cummins (2000), it is a mere description of an effect and not an explanation. Neuroscience also has its share of p-models. The Nernst equation provides a mathematically compact representation of a neuron’s equilibrium or resting potential, i.e., the precise point at which ionic fluxes across the membrane due to opposing electrical and chemical forces are balanced. The Nernst equation for the equilibrium potential of potassium (E K+ ) is expressed as: E K+
= (RT / zF) · ln([K+]o / [K+]i)    (2)
where R is the universal gas constant, T is the absolute temperature measured in kelvins, z is the electrical charge carried by a permeant ion, F is the Faraday constant (the charge contained in one mole of a univalent ion), [K+]o is the extracellular concentration of potassium, [K+]i is the intracellular concentration of potassium, and ln denotes the natural logarithm of the concentration ratio. The Nernst equation is descriptively and predictively adequate. It successfully captures the input–output profile implemented by the target system, specifying how different concentration gradients (the inputs) map onto different possible equilibrium potential values for a given ion species (the outputs). It also generates precise predictions about the potential measured across the neuronal membrane at electrochemical equilibrium. Nevertheless, the Nernst equation is explanatorily deficient according to the mechanistic standards outlined above. It does not describe why the mathematical dependencies depicted in the model obtain (e.g., why the equilibrium potential varies linearly with absolute temperature and logarithmically with the ion concentration ratio); it only specifies that they obtain. It does not account for how changes in ion concentrations inside and outside the cell actually result in the diffusion of ions across the cell membrane in the direction of their concentration gradients. Nor does it disclose how the critical transmembrane ion concentration differences are maintained.
19 The explanation had to await Bohr's mechanistic model of the hydrogen atom and developments in quantum mechanics. According to the Bohr model, the Balmer series for hydrogen is produced when photons of a given frequency are emitted (or absorbed) as electrons jump or transition from higher stationary states (their discrete orbital trajectories around the atomic nucleus) to lower ones, and vice versa.
The preceding discussion highlights two basic facts about p-models. First, p-models can be highly effective at describing and predicting phenomena. They are
constructed to account for the observable properties or behavior of a target system and get evaluated according to how well they accomplish this goal. Second, p-models are often constructed via ad hoc fitting of the model to the empirical data, involving mathematical constructs and parameter values that function merely to save the phenomena to be explained. Because p-models do not delimit components, causal dependencies, and interactions within the target system, they are not in compliance with 3M and are to be sharply distinguished from explanatory mechanistic models. The fact that p-models do not explain mechanistically should be relatively uncontroversial. However, it is worth reflecting briefly on what justifies the further conclusion that p-models are incapable of explaining phenomena via other means. If we suppose that p-models are endowed with explanatory force, but not in virtue of describing mechanisms, where else could their explanatory power come from? One possibility is that their explanatory power stems from their descriptive and predictive power alone. This kind of view, once dominant in the philosophy of explanation, we might call predictivism.20 Predictivism holds that any predictively adequate model (i.e., any model that predicts all the relevant features of the phenomenon in the appropriate range of conditions to the desired level of precision) possesses explanatory force with respect to those aspects of the phenomenon. Arch-predictivists assert, for example, that to explain a phenomenon is to show that its occurrence was to be rationally expected, in the sense that it follows from universal or statistical generalizations together with the relevant initial and boundary conditions (Hempel 1965). As indicated above, the main problem with attempting to tie the explanatory power of p-models to their predictive power along the lines suggested by predictivism, is that merely predictively adequate models can fail intuitively to explain how the system behaves or show why the dependencies are as the model depicts them. This was true of both the Balmer and Nernst equations. This specific failure reflects deeper problems confronting predictivism. The general problems facing predictivism are well-known (Salmon 1989). Many of the textbook counterexamples to the covering law account of explanation are best understood as direct challenges to predictivism. Simple examples readily expose how accurate prediction is insufficient for explanation, and so how the predictive import of a given model can and often does vary independently of its explanatory force. One can accurately predict a storm’s occurrence from a dropping barometer, but this does not explain the occurrence of the storm. Rather, a common cause—a drop in atmospheric pressure—explains both the dropping barometer value and the approaching storm. Similarly, a p-model might be predictively adequate, and yet its variables might only represent factors that are mere behavioral correlates of some common or joint cause for the target phenomenon. Just as we reject the claim that the barometer drop explains the storm, we should also resist the claim that p-models of this kind provide explanations. Similarly, resting tremor, slow movement, rigidity, and postural instability reliably predict Parkinson’s disease, but they cannot be used to explain its occurrence. Symptoms undeniably afford accurate predictions of disease. However,
20 See Kaplan and Craver (forthcoming) for a parallel rejection of predictivism as it arises in the contemporary debate concerning the explanatory force of dynamical models of cognition.
However, they are insufficient as explanations, presumably because symptoms are caused by, but do not themselves cause, the disease. Accounts of explanation must respect this fundamental asymmetry. For these reasons, explanatory adequacy is not (merely) predictive adequacy.

In light of these major difficulties, most successor views to predictivism incorporate some variation of the idea that to explain a given phenomenon requires specifying how it was caused. The mechanistic approach embraces a related view, according to which explanatory models must reveal the causal structure of the mechanism responsible for producing, underlying, or maintaining the phenomenon of interest. Those wishing to defend the explanatory force of p-models must therefore either find ways in which p-models can be enriched or supplemented so as to transform them into explanatory mechanistic models in direct compliance with 3M, or develop an alternative account of the explanatory force of these models, based on factors beyond predictive power, that avoids the severe problems facing predictivism.
4.2 How-possibly versus how-actually models

The second important distinction for identifying explanatory mechanistic models is that between how-possibly and how-actually models (Craver 2006, 2007; Machamer et al. 2000). Initially introduced by Dray (1993; also described in Hempel 1965) to clarify a difference between types of historical explanation, the distinction has since been extended beyond its original domain of application. When generalized, how-actually explanations subsume etiological causal explanations, which specify the antecedent causes of some event (Salmon 1984). How-actually explanations operate somewhat differently when the explanations in question are constitutive, decompositional explanations involving the description of mechanisms. In this context, such explanations pick out, from among the entire space of possible mechanisms, the mechanism that is, given suitable empirical support, actually thought to be responsible for producing the target pattern or phenomenon. According to Craver, how-actually models possess explanatory force because they describe the "real components, activities, and organizational features of the mechanism that in fact produces the phenomenon" (2006, p. 361).

By contrast, the goal of how-possibly models is to explain how some event or outcome could possibly occur. According to Dray, the function of a how-possibly model "is to rebut the presumption that what happened was impossible, or at any rate extremely unlikely given the circumstances" (1993, p. 27). As others have noted (e.g., Forber 2010), a how-actually explanation surely would fulfill this function as well. However, a how-possibly model can provide such a rebuttal in the absence of a detailed, actual explanation confirmed against empirical evidence. Again, the etiological formulation requires some tweaking to apply to constitutive, mechanistic explanations. When mechanistic explanation is the focus, how-possibly models are "loosely constrained conjectures about the mechanism that produces the explanandum phenomenon. They describe how a set of parts and activities might be organized such that they exhibit the explanandum phenomenon" (Craver 2006, p. 361). Such models posit one possible way in which a capacity or input–output mapping defining
the explanandum phenomenon might be implemented.21 Unlike merely descriptive p-models, how-possibly models do purport to explain—they would in fact explain the phenomenon if the target mechanism did work as hypothesized. Nonetheless, because how-possibly models are generally offered in the absence of empirical evidence concerning "if the conjectured parts exist and, if they do, whether they engage in the activities attributed to them in the model" (Craver 2006, p. 361), they typically fail to delimit the actual components, activities, and organizational features in the target mechanism for the phenomenon. In short, they do not comply with 3M and therefore fail as explanatory mechanistic models. Describing the behavior of some possible mechanism capable of producing the phenomenon in the target system is thus explanatorily inadequate. To explain a phenomenon mechanistically, the model must describe the real components and activities in the mechanism that are in fact responsible for producing the phenomenon which is the target of explanation. This is precisely what 3M demands.

To redeploy an example from above, building a computer vision system capable of detecting object edges by computing zero-crossings shows this to be one possible mechanism by which edge detection can be performed. However, the successful behavior of this model neither provides nor requires evidence that this conjectured mechanism is the one implemented in biological brains. In order for this model to genuinely explain this aspect of human vision, one needs to establish that neural components actually participate in the zero-crossing computation.

Despite failing as mechanistic explanations, how-possibly models can nevertheless serve other purposes. They can enable refinements of current models and generate new testable hypotheses about the actual target mechanism. How-possibly models can also provide invaluable heuristic information to delimit the space of possible mechanisms by showing whether a hypothetical mechanism is or is not able to produce the phenomenon or behavior of interest (Craver 2006; Machamer et al. 2000). To echo Dray, they can rebut the presumption that it was impossible for a given mechanism to exhibit the target phenomenon.22 However useful this information may be, how-possibly mechanistic models remain insufficient as explanations.
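To make the zero-crossing example concrete, the following is a minimal sketch of a Marr-style edge detector (assuming NumPy and SciPy are available; the function name, parameter values, and test image are illustrative, not drawn from the sources). It reproduces an input–output mapping for edge detection, which is exactly what leaves it a how-possibly model: nothing in the code carries commitments about how biological visual systems detect edges.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def zero_crossing_edges(image, sigma=2.0):
    """Marr-style edge detection: filter the image with a Laplacian-of-Gaussian
    and mark the locations where the filtered image changes sign."""
    log = gaussian_laplace(image.astype(float), sigma=sigma)
    edges = np.zeros(log.shape, dtype=bool)
    # A zero-crossing occurs where horizontally or vertically adjacent
    # pixels of the filtered image have opposite signs.
    edges[:-1, :] |= np.sign(log[:-1, :]) != np.sign(log[1:, :])
    edges[:, :-1] |= np.sign(log[:, :-1]) != np.sign(log[:, 1:])
    return edges

# Example: a synthetic image with a bright square on a dark background.
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0
print(zero_crossing_edges(img).sum(), "edge pixels found")
```

The sketch succeeds or fails purely at the level of input–output performance; whether anything in the visual system computes zero-crossings is a separate, empirical question.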
4.3 The virtues of explanation

Let us end this section by considering briefly one final argument for the appropriateness of the demarcation line 3M draws between explanatory and non-explanatory models. Explanatory mechanistic models—those in compliance with 3M—can be distinguished from p-models and how-possibly models on broadly instrumentalist grounds because they are distinctively useful.

21 Because how-possibly models start with an input–output mapping defining the phenomenon to be explained and posit a mechanism capable of reproducing this mapping, they typically are descriptively adequate with respect to the explanandum phenomenon.

22 Crick (1989) makes a similar point in claiming that neural networks only provide mere "demonstrations", but do not explain. They demonstrate it is not "impossible for any neural net to act in such-and-such a way, but they may not perform in just the way the brain does" (1989, p. 131).
Woodward (2003) has argued that a major difference between descriptive models and explanatory ones is the degree to which the latter provide information that is potentially exploitable for purposes of manipulation and control. He suggests it is precisely this feature that provides the basis for the distinction between explanation and description:

We have at least the beginnings of an explanation when we have identified factors or conditions such that manipulations or changes in those factors or conditions will produce changes in the outcome being explained. Descriptive knowledge, by contrast, is knowledge that, although it may provide a basis for prediction, classification, or more or less unified representation or systematization, does not provide information potentially relevant to manipulation. It is in this that the fundamental contrast between causal explanation and description exists. (Woodward 2003, p. 10)

For instance, since the mathematical construct B in Balmer's formula was just a number selected through a data-fitting procedure and embodies no physical commitments, it affords little in the way of experimental interventions. One cannot intervene to change this value. Mechanistic models, in contrast to both purely descriptive p-models and how-possibly models, delimit variables and parameters that correspond to mechanism components whose states can be experimentally manipulated and measured. Mechanistic models go beyond mere description because they afford markedly greater possibilities for control and manipulation of the phenomena they cover. One important way they do this is by delineating not only how a system in fact behaves, but also how it is expected to behave under a range of actual interventions and counterfactual circumstances (Craver 2006; Woodward 2003).

Along these lines, p-models can and typically do answer some "what-if-things-had-been-different" questions, or w-questions (Woodward 2003). At best, however, p-models support only a very limited range of interventions to answer such w-questions, namely, those involving changes to the inputs to the "black box" to determine how the outputs might change. For example, p-models can potentially answer questions such as: what happens to the membrane potential if extracellular K+ is changed to such-and-such a value? Mechanistic models, by contrast, provide deeper explanations because they describe how a target system is expected to behave under a much wider set of initial conditions, including "non-standard" experimental conditions (see Craver 2006). They also describe how a system is expected to behave under a more diverse range of interventions (crucially, including interventions on components in the target mechanism). In doing so, they answer a far wider set of w-questions than do p-models. Because explanatory mechanistic models describe components and the interactions among components, they also show the underlying reasons why the mathematical relations or dependencies between the component variables hold. As a result, they differ from p-models because they can potentially provide insight into why those dependencies might be expected to remain invariant across a certain range of intervention contexts or conditions, yet change or completely break down across others.
Explanatory mechanistic models thus outperform merely descriptively adequate p-models and how-possibly models when it comes to providing information that is potentially exploitable for purposes of experimental manipulation and control.
An important consequence falling out of the discussion in this section is that p-models and how-possibly models, no matter what other scientific virtues they might exhibit, can still fall short as adequate explanations if they fail to describe mechanisms. Insofar as computational neuroscience attempts to construct genuine mechanistic explanations, rather than simply redescribing mathematically the phenomenon to be explained or providing heuristically useful how-possibly models, it should attempt to bring its models into compliance with 3M.

5 The Hodgkin–Huxley model

The Hodgkin–Huxley (HH) model of the action potential marks a particularly appropriate place to apply the apparatus of mechanistic explanation to evaluate the explanatory force of models in computational neuroscience. First, it is among the best candidates for an explanatory model across all of neuroscience. It has also been used as a pivot point in the debate over the nature of explanation in neuroscience (Bogen 2005, 2008; Craver 2006, 2007, 2008; Schaffner 2008; Weber 2008). Finally, the HH model, along with Hodgkin and Huxley's more general efforts to develop hybrid methods at the interface of experimental neuroscience and theoretical modeling, was largely responsible for launching the field of computational neuroscience (Ermentrout and Terman 2010).

As is well known, Hodgkin and Huxley's objective was to uncover the mechanism responsible for the time- and voltage-dependent conduction of sodium (Na+) and potassium (K+) ions across the neuronal membrane. However, at the time they were developing their model, little was known about the biophysical structure of the membrane; preliminary hypotheses about how these conductances might be regulated by the opening and closing of ion-specific channels remained underspecified and untested (Hille 2001; Hille et al. 1999). In addition, the available recording tools could not resolve the currents arising from individual channels.23 For these reasons, Hodgkin and Huxley instead sought to describe mathematically how the conductances change as a function of membrane voltage and time, in a manner sufficient to predict all the major temporal features of action potential generation and propagation. The core of the model is the total current equation:

I = C_M \frac{dV}{dt} + \bar{g}_K n^4 (V - V_K) + \bar{g}_{Na} m^3 h (V - V_{Na}) + \bar{g}_L (V - V_L)    (3)
where I is the total current passing through the membrane, composed of a capacitive current, C_M dV/dt; a potassium current, \bar{g}_K n^4 (V - V_K); a sodium current, \bar{g}_{Na} m^3 h (V - V_{Na}); and a leakage current, \bar{g}_L (V - V_L), which is a sum of smaller currents for other ions. The terms \bar{g}_K, \bar{g}_{Na}, and \bar{g}_L represent the maximum conductances for the different ions.

23 The voltage clamp, which allows the membrane potential to be held fixed at various voltage levels, only enables the measurement of (what we now know to be) aggregate membrane currents reflecting the flow of ions through thousands of channels. Experimental confirmation of single channels, and the measurement of individual channel currents, had to wait for the development of patch clamp recording techniques in the 1970s (see, e.g., Neher 1992).
V is the displacement of the membrane potential from rest, and V_K, V_{Na}, and V_L represent the equilibrium or reversal potentials for the various individual ions. Finally, the equation includes the rate coefficients n^4 and m^3 h, representing the fractions of the maximum conductances actually expressed, whose individual time courses are each governed by separate differential equations in the full model (Hille 2001; Hodgkin and Huxley 1952).

Besides the formal model, Hodgkin and Huxley also offer a tentative mechanistic interpretation for the key rate variables in their model (for further details, see Craver 2006; Hille 2001). They hypothesize that charged activation and inactivation "particles", moving or reorienting under the influence of an electric field to particular sites in the membrane, could control (permit or prevent) the conduction of Na+ and K+ ions. Because the K+ conductance rises gradually, without inactivation, for as long as the membrane is depolarized, Hodgkin and Huxley interpreted the coefficient n^4 as representing four identical activation particles, each having the same probability n of being in its open or permissive state (where 1 − n is the probability that the particle is in a non-permissive state). Consequently, n^4 represents the joint probability that all four independent particles are correctly configured to allow ion movement. By contrast, the Na+ conductance rises rapidly with membrane depolarization and rapidly inactivates before the depolarization has ended. Therefore, the coefficient m^3 h was interpreted in terms of three identical activation particles and a single inactivation particle. Accordingly, m^3 is the joint probability that all three independent activation particles move into their permissive state, and h is the probability that the inactivation particle is configured in such a way as to impede ion flow.

The model's predictive power is beyond reproach. For example, it generates accurate predictions about the propagation velocity of the action potential in the membrane of the squid giant axon (their experimental system), to within roughly ten percent of its experimentally measured value. However, as indicated earlier, the critical question is not whether the model predicts, but whether it explains. To their credit, Hodgkin and Huxley are openly skeptical about placing too much weight on the predictive power of the model when evaluating its force as an explanation. They explicitly argue that the predictive successes of their model provide no evidential support for the specific hypothesis they entertain regarding the mechanism of conductance change. As we have already seen, this skepticism is appropriate given the challenges raised against predictivism.

Why else might they embrace this deflationary view about the explanatory status of the model? In reflecting on the nature of their model, Hodgkin and Huxley appear to recognize that it is a p-model. Consider how they acknowledge that alternative, predictively equivalent models involving equations of "very different form" could have achieved an "equally satisfactory description of the voltage clamp data" (1952, p. 541). More specifically, they openly admit that equivalent (or even improved) mathematical fits to their data could be obtained by using different powers for the crucial rate coefficients in their model (Hodgkin and Huxley 1952; also discussed in Hille 2001). Cole and Moore (1960) directly confirmed this by showing that a rather different rate coefficient, n^25, provides a better fit to the empirical data on the K+ current.
Mauk (2000) pushes Hodgkin and Huxley’s line of thought to even more radical extremes. He suggests the conductance-voltage dependencies represented using differential equations in the original model could have been equally well (albeit less elegantly) expressed by a
lookup table, in which each experimentally recorded value of the membrane potential is explicitly associated with a pair of conductance values for Na+ and K+. In acknowledging that indefinitely many equations could be used as descriptively and predictively equivalent models of their data, Hodgkin and Huxley undermine the particular mechanistic interpretation they conjecture. This is because, had different (empirically adequate) mathematical representations been used, these alternative equations might have supported altogether different biological interpretations from the one they offer, or they might not readily support any realistic interpretation at all.24 While it might be biologically plausible to interpret the coefficient n^4 in the HH model as representing the activities of four activation particles facilitating the K+ conductance (described in more detail below), it might be difficult if not impossible to assign an equally plausible physical interpretation to the coefficient n^25 in Cole and Moore's model.

As indicated above, given the restricted descriptive goals of p-models, mathematical constructs that function exclusively to save the phenomenon to be explained often replace physically or mechanistically interpretable variables (i.e., those representing underlying mechanism components) in p-models. Because p-models are only constrained to provide a correct specification of the input–output behavior of a target system, it matters little if the structure of the HH model reflects ad hoc curve fitting with arbitrary equations, as long as it accurately reproduces the input–output relationship between the time course of the experimentally set values of the membrane potential and the time course of the measured conductances. By contrast, a mechanistic model of the action potential requires more than a mere specification of the macroscopic behavior of the conductances. As described above, it requires a specification of how these conductances are generated.

Finally, the original HH model is not an explanatory mechanistic model because Hodgkin and Huxley's speculations about the mechanism of conductance change were ultimately incorrect. They were right to place little theoretical or explanatory weight on the mechanistic hypothesis they conjectured. It is now a well-established empirical fact that Hodgkin and Huxley's hypothetical activation and inactivation particles do not exist, and that the two functions they assigned to these particles—ion-selective gating and movement—are separately assigned to distinct membrane structures in the form of the voltage sensors or gates and the conducting pore, respectively. Instead, the modern reinterpretation of the HH equations appeals to specific proteins in the neuronal membrane serving as voltage-sensitive gates on ion-selective channels, which conduct ions depending on whether these gates are in their open or closed state (Dayan and Abbott 2001; Hille 2001).25

24 Craver makes a similar point when he argues that "the choice of a different strategy for building the equation (for example, using a single variable, or three rather than two) might suggest an entirely different physical interpretation (or mechanism) or none at all" (2006, p. 366). Surprisingly, Craver maintains that this representational flexibility provides evidence that the HH model is a how-possibly model rather than a p-model, as I argue here.

25 More specifically, the gating variable m^3 h depicts the states of the voltage-gated Na+ channel, which has two gates: a fast one (the activation gate) and a slow one (the inactivation gate); both gates must be open for the channel to conduct Na+ ions. At rest, the activation gate is closed and the inactivation gate is open. When the membrane is depolarized, the activation gate opens, allowing Na+ into the cell. As the depolarization continues, the inactivation gate closes, so the flow of Na+ is transient. The gating variable n^4 for the K+ channel, by contrast, represents a single type of activation gate. For further details, see Catterall (2000), Hille (2001), and MacKinnon (2004).
Modern findings from X-ray crystallography studies have yielded a much clearer picture of the mechanisms underlying the voltage-dependent gating of ion channels, and they have helped to transform the original HH model into an explanatory mechanistic model. Although, for reasons detailed above, Hodgkin and Huxley's choice of variables and terms in the original model was governed by purely descriptive and predictive aims, it turns out that the critical rate variables in the model do approximately map onto components in the mechanism, along the lines laid down by 3M. MacKinnon and colleagues have shown that four identical structural subunits, each with an inner and an outer helix, assemble to form each voltage-gated K+ channel (Doyle et al. 1998). Accordingly, the coefficient n^4 in the HH model is now reinterpreted as a gating variable corresponding to the probability that all four K+ channel subunits undergo the structural change necessary for the channel to open. In this manner, a mapping in accordance with 3M is effected between this model element and a mechanism component. Although this alignment was entirely fortuitous in the original model (other mathematical constructs could have been employed, since Hodgkin and Huxley's original choice was not guided by evidence about the structure of ion channels), the model now supports a realistic mechanistic interpretation and has been transformed into a mechanistic model with explanatory force.26

Efforts to discover additional details about the underlying mechanism of the action potential have subsequently produced further refinements of the equations themselves. For example, Hodgkin and Huxley's original modeling work was based upon data from the giant axon of the squid, known to contain only two conductance types. More recent models have been developed to include additional conductances known to be present in cortical neurons. For instance, the Connor–Stevens model includes additional terms for a transient K+ conductance (the so-called A-type current) and a calcium (Ca2+) conductance (Dayan and Abbott 2001, p. 197). These elaborations of the HH model are fundamentally guided by the mechanistic goal of achieving increasingly high-fidelity mappings between model and mechanism. Computational neuroscientists have successfully transformed (and continue to transform) the purely descriptive mathematical model Hodgkin and Huxley originally constructed into an explanatory mechanistic model by bringing it into tighter alignment with 3M.

26 Of course, this does not entail that the mechanistic explanation is complete. Many details about the structure and function of ion channels and gating mechanisms remain unknown, including the precise mode of action of the voltage sensors embedded in the membrane. For further details, see Catterall (2000) and MacKinnon (2004).
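To make the formal structure of the HH model concrete, here is a minimal simulation sketch in Python (a rough illustration, not Hodgkin and Huxley's own formulation: it uses the modern textbook parameterization with absolute membrane potential rather than the displacement-from-rest variable of Eq. 3, standard squid-axon rate functions, and simple forward-Euler integration). Each gating variable x in {n, m, h} follows first-order kinetics of the form dx/dt = α_x(V)(1 − x) − β_x(V)x.

```python
import numpy as np

# Standard squid-axon parameter values (modern textbook convention, mV and ms).
C_m = 1.0                            # membrane capacitance, uF/cm^2
g_Na, g_K, g_L = 120.0, 36.0, 0.3    # maximal conductances, mS/cm^2
E_Na, E_K, E_L = 50.0, -77.0, -54.4  # reversal potentials, mV

# Voltage-dependent rate functions for the gating variables n, m, h.
def alpha_n(V): return 0.01 * (V + 55.0) / (1.0 - np.exp(-(V + 55.0) / 10.0))
def beta_n(V):  return 0.125 * np.exp(-(V + 65.0) / 80.0)
def alpha_m(V): return 0.1 * (V + 40.0) / (1.0 - np.exp(-(V + 40.0) / 10.0))
def beta_m(V):  return 4.0 * np.exp(-(V + 65.0) / 18.0)
def alpha_h(V): return 0.07 * np.exp(-(V + 65.0) / 20.0)
def beta_h(V):  return 1.0 / (1.0 + np.exp(-(V + 35.0) / 10.0))

def simulate(I_ext=10.0, T=50.0, dt=0.01):
    """Forward-Euler integration of the HH equations under a constant
    injected current I_ext (uA/cm^2); returns the voltage trace."""
    V, n, m, h = -65.0, 0.317, 0.053, 0.596   # approximate resting values
    trace = []
    for _ in range(int(T / dt)):
        # Ionic currents: the terms of Eq. 3, with the leak lumped into one term.
        I_ion = (g_K * n**4 * (V - E_K)
                 + g_Na * m**3 * h * (V - E_Na)
                 + g_L * (V - E_L))
        V += dt * (I_ext - I_ion) / C_m
        n += dt * (alpha_n(V) * (1.0 - n) - beta_n(V) * n)
        m += dt * (alpha_m(V) * (1.0 - m) - beta_m(V) * m)
        h += dt * (alpha_h(V) * (1.0 - h) - beta_h(V) * h)
        trace.append(V)
    return np.array(trace)

# A sustained depolarizing current elicits spiking; the peak should overshoot 0 mV.
print(simulate().max())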
6 The difference-of-Gaussians model

I want to turn now to another important model in computational neuroscience, this time one concerning the neural activity involved in early vision. First introduced by Rodieck nearly 50 years ago, the difference-of-Gaussians (DOG) model (Eq. 4) describes the spatial receptive-field organization of retinal ganglion cells.
The model is among the most well known in computational neuroscience (Dayan and Abbott 2001), and it has since been extended to account for the spatial receptive-field organization of neurons in the dorsal lateral geniculate nucleus (dLGN). Ganglion cells play a number of important roles in early vision. They combine or pool information from sets of photoreceptors. They implement the first stage of neural coding by transforming analog inputs—the graded membrane potentials of photoreceptors—into digital outputs in the form of trains of action potentials or spikes. They convey these spike trains to the lateral geniculate nucleus (LGN), which in turn projects to visual cortical areas. Among the most important functional properties of individual retinal ganglion cells are their receptive fields—the circumscribed region of the retina (and, by extension, of the visual field) within which illumination changes a given neuron's firing rate.

Using single-unit recording techniques, Kuffler (1953) discovered that retinal ganglion cells have receptive fields spatially organized into two concentric and mutually antagonistic regions. Kuffler also identified two types of retinal ganglion cell. On-center ganglion cells increase their firing rate when illumination is restricted to the central region of the receptive field. Simultaneous illumination of both the receptive field center and the surrounding region elevates an on-center cell's response above baseline levels, but comparatively less than stimulating the center alone. Off-center ganglion cells exhibit the opposite pattern: illuminating the center of an off-center cell reduces its response, and higher firing is induced when the surround alone is illuminated. Kuffler found that the distributions of responses across the receptive field center and surround for both cell types are hill-shaped, with antagonistic or oppositely signed sensitivities peaking at their shared middle and declining gradually towards their respective edges.

Rodieck (1965) proposed that these hill-like activity distributions could be approximated mathematically by modeling them as Gaussian functions. Cells with on-center receptive fields were modeled using Gaussians with peak responses at the absolute center of the receptive field, decaying gradually as a function of distance from this center. Off-center receptive fields were modeled with Gaussians peaked at the edges and declining as proximity to the center region increases. Because the receptive field as a whole exhibits antagonistic center-surround organization, the mathematical relation between activation in the center and surround regions is naturally expressed using the difference operator. For these reasons, Rodieck (1965) introduced the difference of two partially overlapping, circularly symmetric, and concentric Gaussian functions as a mathematical model of the spatial receptive-field organization of retinal ganglion cells. According to the DOG model, the spatial receptive field is expressed as:

F(x, y) = \frac{A_1}{2\pi\sigma_1^2} \exp\left(-\frac{x^2 + y^2}{2\sigma_1^2}\right) - \frac{A_2}{2\pi\sigma_2^2} \exp\left(-\frac{x^2 + y^2}{2\sigma_2^2}\right)    (4)

where the center of the receptive field is positioned at coordinates (x, y) = (0, 0); the first term corresponds to the receptive field center; the second term corresponds to the antagonistic surround; the coefficient A_1 specifies the peak of the center response at (x, y) = (0, 0); A_2 corresponds to the peak of the surround, also centered at (0, 0);
and σ_1 and σ_2 are parameters specifying the widths of the Gaussian distributions for the center and surround, respectively. This model provides a mathematically compact and accurate representation of the phenomenon, accommodating the relevant experimental data obtained by Kuffler concerning the receptive field profiles of neurons in early vision. The model generates precise predictions about the activity of these cells in response to a variety of visual stimuli, including small circular spots of light (used by Kuffler), larger-diameter spots, moving bars, and sinusoidal gratings (Enroth-Cugell and Robson 1966; Einevoll and Plesser 2002, 2005).

Yet, in spite of its descriptive and predictive force, the DOG model lacks explanatory import concerning why ganglion cells respond as they do to visual stimuli. It was not constructed for this purpose. It is a p-model. The DOG model is a descriptively and predictively adequate p-model for reasons that recapitulate those already given for the HH model. The model captures the input–output function implemented by ganglion cells and gets evaluated according to how well it accomplishes this objective. Spatial patterns of visual stimulation within the cell's receptive field (the inputs) are mapped onto neural responses (the outputs). The model provides nothing more than a mathematically couched, empirical description of the receptive field profiles of ganglion (and dLGN) neurons, summarizing the findings obtained from a range of electrophysiological studies. The DOG model was explicitly constructed for curve-fitting purposes. As such, it was not constructed to represent anything about the underlying retinal or geniculate circuit that could suitably explain why the receptive fields of ganglion cells exhibit the pattern of spatial organization they do. Just as the original HH model embodies no commitment to the mechanisms that implement the process of membrane conductance change, the DOG model embodies no commitment concerning the neural mechanisms underlying the observed spatial receptive-field properties.27 Rodieck does not constrain his model to delimit components, causal dependencies, and interactions within the target system (as required by 3M). It is not an explanatory mechanistic model.

Transforming the DOG model into an explanatory mechanistic model involves delimiting some of the components and some of the causal dependencies among components in the mechanism responsible for producing the observed structure of the receptive fields, along the lines indicated by 3M. One way to do this, for instance, would be to supplement the model with additional terms corresponding to various components in the retinal (or geniculate) circuit giving rise to the observed response properties of ganglion (and dLGN) neurons. This is because the receptive field properties of individual neurons (including ganglion and dLGN neurons) are determined in part by their physiological and anatomical properties (e.g., the receptor types they express), and in part by the pattern of connectivity between the target cell and the cells sending inputs to it. Retinal ganglion cells respond as they do to illumination because they receive inputs from specific classes of bipolar cells with distinctive physiological properties. On-center bipolar cells express glutamate receptors that cause the cell to depolarize in response to photoreceptor activation.
27 One important caveat to this claim is that whatever the mechanism is, it must be capable of reproducing the input–output mapping characterized by the DOG model.
Fig. 1 Schematic pattern of connectivity in the dLGN circuit. RC dLGN relay cell; GC retinal ganglion cell; IN intrageniculate interneuron; B1 and B2 : synaptic weights. After Einevoll and Heggelund (2000)
Consequently, on-center ganglion cells forming synapses with these bipolar cells depolarize (and increase their spiking activity) when their receptive field center is illuminated. Off-center ganglion cells, by contrast, receive inputs from off-center bipolar cells. These bipolar cells express different glutamate receptors that induce hyperpolarization in response to photoreceptor activation, which in turn causes off-center ganglion cells to hyperpolarize (decrease their activity) when the receptive field center is illuminated. Consequently, a crucial step towards transforming the DOG model into a mechanistic model with appreciable explanatory weight is to incorporate details about the pattern of synaptic connectivity among neurons in the retinal (or geniculate) circuit underlying the receptive field profiles described by the DOG model.

Computational neuroscientists Einevoll and Heggelund (2000) have sought to refine the DOG model in precisely this fashion. Their supplemented DOG model explicitly incorporates mechanistic details about functionally relevant neural components in the geniculate circuit.28 The model includes additional terms for the excitatory input to a dLGN relay cell from a single retinal ganglion cell and for feedforward inhibition from neighboring ganglion cells routed through inhibitory interneurons in the intrageniculate nucleus (see Fig. 1). The advantage of this model over the original DOG model is that, in addition to providing a mathematical description of the receptive fields of dLGN neurons, it also makes explicit the functional contributions from the geniculate circuit. Because these variables and parameters have direct physiological relevance and correspond to mechanism components and to relationships among components in the geniculate circuit, one can ask how their values might change under direct experimental manipulation. For instance, how might the spatial organization of the receptive field of a dLGN neuron be expected to change when inhibitory inputs come from several intrageniculate interneurons rather than just one? Given only the information captured by the original DOG model, this kind of question does not even come into view. The supplemented DOG model thus supports a broader range of interventions than the phenomenological model.

28 For further discussion of this neural circuit, see Mastronarde (1992).
Fig. 2 Computing target location. +: gaze fixation point; xt : vector from fixation to target t in a gaze-centered reference frame; xee : vector from fixation to end effector in a gaze-centered reference frame; and xdv : difference vector between hand and target. After Shadmehr and Wise (2005)
The efforts of computational neuroscientists to elaborate and refine the DOG model through the incorporation of mechanistic details bring it into tighter alignment with 3M. It has thereby been transformed into a mechanistic explanation.
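To illustrate how the original DOG model operates purely descriptively, here is a minimal sketch (illustrative parameter values; the linear spatial-summation step used to predict a response from the receptive-field profile is a standard modeling assumption, e.g. in Enroth-Cugell and Robson 1966, not something dictated by Eq. 4 itself):

```python
import numpy as np

def dog_receptive_field(x, y, A1=1.0, sigma1=0.5, A2=0.8, sigma2=1.5):
    """Difference-of-Gaussians receptive-field profile F(x, y), as in Eq. 4."""
    center = (A1 / (2 * np.pi * sigma1**2)) * np.exp(-(x**2 + y**2) / (2 * sigma1**2))
    surround = (A2 / (2 * np.pi * sigma2**2)) * np.exp(-(x**2 + y**2) / (2 * sigma2**2))
    return center - surround

def predicted_response(stimulus, xs, ys, **params):
    """Predicted (linear) response to a stimulus s(x, y): the stimulus
    integrated against the receptive-field profile."""
    X, Y = np.meshgrid(xs, ys)
    rf = dog_receptive_field(X, Y, **params)
    dx, dy = xs[1] - xs[0], ys[1] - ys[0]
    return np.sum(rf * stimulus) * dx * dy

# Example: response to a centred circular spot of light of radius 0.5 degrees.
xs = ys = np.linspace(-3, 3, 201)
X, Y = np.meshgrid(xs, ys)
spot = (X**2 + Y**2 <= 0.5**2).astype(float)
print(predicted_response(spot, xs, ys))
```

Nothing in this computation refers to bipolar cells, interneurons, or synaptic connectivity; adding terms that do refer to such components is exactly what the Einevoll–Heggelund extension contributes.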
7 The Zipser–Andersen gain field model

As a final example, I want to consider a computational model of the sensorimotor transformations underlying visually guided reaching movements. In order to reach for a visible object, the brain must specify the required movement vector (x_dv) by computing the difference between the current hand position and the location of the object in space (Shadmehr and Wise 2005). Carrying out this computation requires spatial information about both the hand or end-effector location (x_ee) and the target location (x_t) (see Fig. 2). In what follows, I focus on one widely discussed model for computing target location (x_t)—the Zipser–Andersen model—in which the brain combines retinal information about the target location with eye position information to generate an intermediate representation of the target relative to the head.

An important starting assumption of this model is that sensory information about target locations must be transformed so that it can be used by the motor system to generate appropriate movements. Because visual stimuli are transduced as patterns of activation on the retinas, target locations are initially encoded relative to the eye or point of gaze fixation (in a gaze-centered reference frame). However, gaze-centered representations of reach targets (Fig. 2, x_t) are ill-suited for movement planning, because the muscle activations that move the arm are determined with respect to the insertion points of the arm muscles at the shoulder and elbow. Therefore, information about an object's location must be mapped or transformed from a gaze-centered representation into one more suitable for generating motor commands to move the limb (in a limb- or joint-centered reference frame). This computation to
convert spatial information about target location from one reference frame to another is generally referred to as a reference frame transformation.29

Theoretical and computational investigations have suggested that this reference frame transformation might be accomplished by a series of intermediate remappings or subtransformations. The first stage involves a transformation of the initial visual input specifying target location in a gaze-centered reference frame into a head-centered reference frame (i.e., relative to a fixed point on the head) by incorporating postural signals about orbital eye position. The second stage involves a transformation of this head-centered target representation into a body-centered reference frame by incorporating postural signals about head and neck position. In the final stage, this body-centered representation is converted into a limb-centered reference frame by incorporating postural signals about the joint configuration of the limb and hand. The computational-level theory imposes relatively weak constraints on the underlying neural mechanisms capable of carrying out the computation, requiring only that some components (e.g., single neurons or neuronal populations) can combine visual target information and the relevant postural signals in the appropriate way.

A substantial amount of experimental and theoretical work in neuroscience has been devoted to understanding how the brain performs these reference frame transformations. The prevailing idea, based on experimental findings and computational modeling, is that the brain employs a mechanism of gain modulation, in which the amplitude or gain of a neural response to one input is modified in a nonlinear or multiplicative manner by a second set of inputs (e.g., Andersen et al. 1985; Andersen and Mountcastle 1983; Salinas and Thier 2000; Pouget and Snyder 2000). Gain modulations (or gain fields) were first experimentally characterized by Andersen and Mountcastle (1983) and Andersen et al. (1985) in neurons located within area LIP and area 7a of the parietal cortex. They tested the responses of parietal neurons by presenting a visual stimulus in the center of a cell's receptive field as the monkey fixated different screen locations. They discovered that, for a subset of these parietal neurons, the firing rate increased or decreased as if it were being multiplied by some number or constant associated with a specific angle of gaze. Critically, changes in eye position did not induce changes in spatial tuning (i.e., the retinal locations at which these cells are maximally responsive). Instead, changes in orbital eye position simply scaled the retinal receptive field of a given neuron up or down by some gain factor in a roughly multiplicative manner (see Fig. 3a).30 Since the parietal cortex was already thought, on the basis of prior theoretical work, to play a functional role in spatial transformations, these experimental findings provided additional empirical support for the idea that a mechanism of gain modulation might be the means by which the brain performs computational transformations between different reference frames.

29 To render the computational problem facing the brain more intuitive, imagine trying to plan a reach to
an object in front of you by considering its position relative to your hand. Now try to imagine planning the same movement by considering where the target object is relative to another object in a different room. The computational task of planning a movement vector is thus simplified immensely by selecting the appropriate coordinate frame. This is true both for the intuitive example and for the brain.

30 Gain-modulated responses are commonly modeled mathematically as the product of a Gaussian function of retinal location and a nonlinear sigmoidal function of eye position (see Fig. 3b). For further details, see Pouget and Snyder (2000).
Fig. 3 a Depiction of a typical visual receptive field of a parietal neuron showing gain modulation for three different eye positions, ex . The x-axis indicates the position of the visual stimulus on the retina (retinal location, rx ) in degrees from foveal center. The y-axis indicates the neural response (spikes) in arbitrary units. The depicted neuron has a receptive field center (defined by peak response) approximately −18◦ from the center of the fovea. b 3-Dimensional plot of a gain-modulated response obtained by multiplying a Gaussian receptive field (as in A) function of retinal location of target (rx ) with a sigmoid function of eye position (ex ). Neural activity is similarly plotted on the y-axis in arbitrary units. From Pouget and Sejnowski (1997). Reprinted with permission
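In the notation of Fig. 3, a gain-modulated response of the kind described in footnote 30 is commonly written as the product of a Gaussian of retinal stimulus location and a sigmoid of eye position (following Pouget and Snyder 2000; the symbols r_0, e_0, \sigma, and k are illustrative tuning parameters, not taken from the sources):

f(r_x, e_x) = \exp\left(-\frac{(r_x - r_0)^2}{2\sigma^2}\right) \cdot \frac{1}{1 + \exp\left(-(e_x - e_0)/k\right)}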
Zipser and Andersen (1988) were the first to directly implicate gain fields in reference frame computations. In order to demonstrate a computational role for gain fields, they trained a three-layer neural network to transform two sources of inputs—visual target position in an eye-centered reference frame and eye position signals—into an output representation of the target location in a head-centered reference frame. The trained network successfully performed the desired reference frame transformation. More importantly, as a consequence of training-driven weight adjustment, the network model automatically developed visual receptive fields in its hidden layer units that were gain modulated by eye position in a nearly identical way to what had been experimentally observed in parietal cortex (Andersen et al. 1985). Andersen and colleagues took this to indicate that eye position gain fields in the brain might be performing the same computation to accomplish transformations between different reference frames. They state that the "striking similarity between model and experimental data supports the conjecture that the cortex and the network generated by back propagation compute in similar ways" (Zipser and Andersen 1988, p. 684).

The fact that the Zipser–Andersen model accomplishes the desired input–output mapping, and in this specific sense can be said to implement the reference frame transformation, provides at best indirect evidence for the hypothesis that this is how brains actually carry out the computation. The model does not, however, merely reproduce the input–output transformation. As described above, the trained network also employs hidden units with response properties that closely match the experimentally measured gain fields in parietal cortex to perform the reference frame transformation. Because the network model uses computational elements similar to those implicated in brain-based spatial transformations, a model-to-mechanism mapping is achieved and the model is consequently endowed with some explanatory force with respect to
biological sensorimotor transformations. Nevertheless, the modelers themselves did not think they had provided a fully adequate explanation.31 Why is this?

Their reasons for taking a deflationary stance on the explanatory status of their model are twofold. First, the model contained biologically implausible elements (most of which were idealizations intentionally introduced for convenience or computational tractability). In particular, Zipser and Andersen express concern over the fact that their multilayer feedforward network uses a biologically unrealistic learning rule—the backpropagation algorithm—for adjustment of the synaptic weights.32 Consequently, Andersen and colleagues trained another neural network to perform the same coordinate transformation task, this time implementing a more biologically plausible reinforcement learning rule in place of the backpropagation procedure (Mazzoni et al. 1991). Their goal was to determine whether the desired input–output mapping and the gain-modulated response properties in the hidden units could be obtained with a biologically realistic learning algorithm, or whether their network simulation results depended on idiosyncratic features of the backpropagation learning rule itself. When the new network successfully performed the reference frame transformation and developed gain fields in its hidden units, they took this to provide even stronger evidence that gain modulation is a "plausible model of how area 7a may perform coordinate transformations" (Mazzoni et al. 1991, p. 293).

The way in which Andersen and colleagues refine the model in the direction of a closer correspondence between model components and known entities and activities in the target mechanism provides a clear indication that they hoped to deepen the explanatory force of the model. However, there is reason to think these particular refinements missed their mark and failed to have an appreciable impact on the model's explanatory status. For one thing, the primary target of explanation is not how sensorimotor transformations are learned, but rather how they are implemented once they are learned. Consequently, the finding of interest from the network model is the gain field "solution" the network settles on to solve the reference frame transformation task, rather than the learning process by which this solution is reached. Incorporating a more biologically acceptable learning rule into the model therefore produces no corresponding explanatory gains.33

The real limitation on the explanatory force of the Zipser–Andersen model is that it is difficult if not impossible to effect a mapping between those elements in the model giving rise to gain-modulated hidden unit activity and the neural components in parietal cortex underlying gain-modulated responses (arguably, the core requirement imposed by 3M on explanatory mechanistic models).
31 It should be acknowledged that even though Zipser and Andersen understood the explanatory limitations of their network model, they nevertheless maintained definite explanatory objectives for it. For instance, they begin the paper by stating that the aim of the modeling effort was to investigate "the question of how the brain carries out coordinate transformation which translate sensory inputs to motor outputs" (Zipser and Andersen 1988, p. 679). The Zipser–Andersen model thus differs from other prominent neural network models such as NETtalk, which also generates some desired input–output (grapheme-to-phoneme) transformation but does not purport to explain how human subjects effect the same transformation.
32 Biologically implausible features of the backpropagation algorithm include the retrograde propagation of error signals back along axons and the ponderous speed of the algorithm for connection weight adjustment in networks of realistic size. For further details, see Mazzoni et al. (1991).

33 Paradoxically, Zipser and Andersen tacitly acknowledge the irrelevance of refinements to the network learning rule for the model's explanatory force. They state: "[t]hat the back-propagation method appears to discover the same algorithm that is used by the brain in no way implies that back propagation is actually used by the brain" (Zipser and Andersen 1988, p. 684).
Although Zipser and Andersen do not dwell on the issue, they acknowledge its importance when they remark that the "similarity of the model and test results raises the question of what physiological mechanisms could subserve this equivalence" (1988, p. 684). There are two reasons for this breakdown in the mapping from model onto neural mechanism. First, it is difficult in general to effect a mapping between neural network models and target neural systems. There is typically only a loose and imprecise correspondence between network architecture and neural implementation (see, e.g., Crick 1989; Smolensky 1988). Do network units correspond to single neurons, to neuronal populations, or do they fail to map onto neural components at all? Relatedly, does unit activity correspond to the time-averaged firing rate of a single neuron or to the population-averaged firing rate of many neurons? Concrete answers to questions of this sort are necessary if the network model is to satisfy the strictures imposed by 3M and thereby possess explanatory force.

However, the main source of the explanatory limitations on the Zipser–Andersen model is the fact that the cellular mechanisms underlying gain modulation in the brain remain unknown. Consequently, it is uncertain whether the gain field mechanism posited in the model successfully maps onto the mechanism for gain modulation implicated in sensorimotor transformations in parietal cortex. Establishing this mapping is critical to validate that the mechanism at work in the network model is not just a theoretical construct—a how-possibly model—but in fact is fundamental to how real brains perform gain-field-based reference frame transformations.

In order to understand this mapping failure, a more detailed understanding of how gain modulation arises in the Zipser–Andersen network's hidden units is required. Gain modulation arises in individual hidden units in the Zipser–Andersen network via two separate mathematical functions—one function computes a linear, weighted sum of all the synaptic inputs to the unit, and another function performs a nonlinear, sigmoidal transformation on these inputs to generate the unit's output activation (Brozović et al. 2008; Zipser and Andersen 1988). These two functions jointly produce the gain-modulated responses observed in the hidden layer units. Critically, this nonlinear-additive mechanism of gain modulation represents a mathematical abstraction of one specific cellular mechanism hypothesized to underlie gain modulation in real neural systems (Brozović et al. 2008). An alternative neural network implementation of gain fields involves a direct multiplicative or product operation on the inputs specifying target location and eye position entering the hidden layer units (Pouget and Sejnowski 1997; Pouget and Snyder 2000). This network implementation of gain modulation differs substantially from the mechanism employed in the Zipser–Andersen model and corresponds to an entirely different mechanistic hypothesis about how gain-modulated responses arise in real neural systems.
For example, Mel (1993) has argued for the hypothesis that if the dendrites of a neuron contain voltage-sensitive channels, then nonlinear interactions between dendritic conductances at neighboring synapses that are co-activated can boost responses in a roughly multiplicative fashion. The interaction among synaptic inputs to a neuron thus produces an approximately multiplicative gain modulation at the output level of
the neuron. Chance et al. (2002) hypothesize a different population-level mechanism capable of producing equivalent responses, in which the gain of a neuron's response to excitatory driving inputs can be modulated by varying the overall or "background" level of synaptic input from the network as a whole. By simultaneously elevating both the excitatory and inhibitory background inputs from the network in a balanced manner, gain-modulated responses can be achieved. Each of these competing mechanistic hypotheses enjoys some degree of empirical support, yet none has been decisively confirmed as the mechanism underlying gain-modulation-based sensorimotor transformations in the parietal cortex.

Despite the uncertainty surrounding the cellular mechanisms responsible for gain modulation in parietal neurons, the Zipser–Andersen model embodies determinate commitments to a particular gain modulation mechanism. The model therefore describes how a set of parts and activities might possibly be organized such that they exhibit gain modulation, but it does not (indeed cannot) describe how gain-modulated responses actually arise in neural systems. Because the model posits one possible way gain-modulation-based sensorimotor transformations might be implemented, and is offered in the absence of empirical evidence as to whether parietal neurons engage in the activities attributed to them in the model, this aspect of the network is a how-possibly model. Although a limited, but nonetheless explanatory, model-to-mechanism mapping has been effected between gain-modulated activity in the model and gain-modulated activity in the target mechanism in parietal cortex, an important part of the model remains a mere conjecture about how the same transformation might be implemented via gain modulation in the brain. Despite this explanatory limitation, however, the overall pattern of model elaboration and refinement undertaken by computational neuroscientists seeking to explain reference frame transformations for reach planning is well aligned with 3M and the general aims of the mechanistic framework. These computational neuroscientists are actively engaged in transforming the model in the direction of an adequate mechanistic explanation.

Finally, it is interesting to note that computational neuroscientists such as Andersen and Zipser appear to be more acutely sensitive to the demands of adequate mechanistic explanation (and to where a model's explanatory force comes from) than some philosophical commentators on their model. For example, Shagrir (2006) correctly identifies a number of features of the Zipser–Andersen model as exemplifying aspects of mechanistic explanations, including the fact that the model involves a specification of an intervening computational process between "initial conditions" and "termination conditions", in the sense of Machamer et al. (2000). However, he goes on to state that "[i]n this sense a computational explanation is a sort of mechanistic explanation: it explains how the computational problem is solved by revealing the mechanism, or at least a potential mechanism, that solves the problem" (2006, p. 403, author's emphasis). Shagrir correctly identifies the model as a potential mechanistic explanation. At the same time, he problematically collapses the crucial mechanistic distinction between how-possibly and how-actually models. It is precisely where a given model is positioned with respect to this distinction, as has been examined at length in this paper, that determines whether it is explanatory or not.
Models that only reveal a potential or possible mechanism do not possess explanatory force with respect to the target
phenomenon. Computational models must describe the real structure of the computational mechanisms underlying the input–output mapping in order to explain.
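To make the contrast at issue in this section concrete, here is a minimal sketch of the two ways a model unit can produce gain-modulated responses (parameter values are purely illustrative and not fit to any data): a single sigmoidal nonlinearity applied to the weighted sum of retinal and eye-position inputs, as in the Zipser–Andersen hidden units, and a direct product of a Gaussian of retinal location and a sigmoid of eye position, as in Pouget and Sejnowski (1997). The two produce similar response surfaces while corresponding to different hypotheses about the underlying mechanism.

```python
import numpy as np

def gaussian(r, r0=0.0, sigma=10.0):
    """Gaussian retinal tuning as a function of stimulus location r (degrees)."""
    return np.exp(-(r - r0) ** 2 / (2.0 * sigma ** 2))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def additive_sigmoid_unit(r, e, w_r=6.0, w_e=0.1, bias=-4.0):
    """Zipser-Andersen-style hidden unit: one sigmoid applied to the summed
    retinal drive (Gaussian-tuned via its input weights) and eye position e."""
    return sigmoid(w_r * gaussian(r) + w_e * e + bias)

def multiplicative_unit(r, e, k=0.1, e0=0.0):
    """Pouget-Sejnowski-style unit: Gaussian of retinal location multiplied
    by a sigmoid of eye position."""
    return gaussian(r) * sigmoid(k * (e - e0))

# Both units' retinal tuning curves are scaled up or down by eye position,
# i.e., both exhibit eye-position gain fields.
r = np.linspace(-30.0, 30.0, 121)
for e in (-20.0, 0.0, 20.0):
    print(e, additive_sigmoid_unit(r, e).max(), multiplicative_unit(r, e).max())
```

Behavioral similarity of this sort is exactly why fitting either form to recorded responses cannot, by itself, settle which cellular mechanism the brain uses.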
8 Mathematical modeling and mechanistic explanation

By way of conclusion, I would like to remark briefly on the general relationship between mathematical modeling and mechanistic explanation. As this paper has indicated, an important feature of contemporary neuroscience is its reliance on mathematical modeling techniques and methods for the description, explanation, and exploration of neural systems. An increasingly large proportion of investigations in the neurosciences sit poised at the interface between computational, mathematical, and experimental work. In spite of this, accounts of mechanistic explanation in neuroscience have been slow to incorporate this important feature or have, at worst, provided analyses directly at odds with it.

The anemic effort on the part of mechanists to address the role of mathematical modeling in the construction of mechanistic explanations likely stems from their wholesale rejection of the received view against which the mechanistic approach has been defined—the covering law or D–N model (and its close cousin, predictivism). Accordingly, all aspects of scientific explanation that appear to bear connections (however remote) to this opposing view are summarily discarded. Representative of this trend, Bechtel and Abrahamsen (2005) assert that mechanistic explanations in biological science, including neuroscience, are not represented in terms of universally quantified statements, but rather exhibit an unavoidable "reliance on visuospatial representations" (p. 427). They identify this feature as a point of departure from standard nomological explanations involving propositional representations. Yet, strikingly, nowhere do they mention mathematical representation and its relation to mechanistic explanation. There seems to be a natural tendency to assimilate mathematical modeling to the construction of D–N style explanations. However, as we have now seen, this assimilation would be mistaken and unproductive. Neuroscientists regularly construct models that are both mathematical in character and mechanistic, and such mechanistic mathematical models are genuinely explanatory to the extent that they satisfy 3M.

As the models I have discussed in this paper indicate, neuroscientists frequently represent and reason about mechanisms using mathematical models. They also routinely employ box-and-arrow diagrams, flow charts, and other visual aids when interpreting their data and constructing theoretical hypotheses. Nevertheless, as quantitative modeling comes to play an increasingly indispensable role (see Dayan and Abbott 2001), real theoretical progress will depend ever more intimately on these mathematical forms of representation. While a full defense of this claim lies beyond the scope of this paper, the tight coupling between mechanistic explanation and mathematical modeling in neuroscience can be illustrated by briefly considering one final example of mechanistic mathematical modeling—so-called compartmental modeling of single neurons.

In compartmental modeling, the neuron is modeled as a chain of small compartments, each of which is small enough that the electrical potential can be assumed
Fig. 4 Schematic depiction of compartmental modeling. From Einevoll (2006). Reprinted with permission
constant throughout its volume.34 This model can be visually depicted (Fig. 4), just as the geniculate circuit underlying the spatial receptive field properties of dLGN neurons can (Fig. 3). It too has a corresponding mechanistic mathematical model (Eq. 5):

g_{i,i+1}(V_{i+1} - V_i) - g_{i,i-1}(V_i - V_{i-1}) = c_i \frac{dV_i}{dt} + \sum_j I_i^{s_j} + I_i^{\mathrm{ion}}    (5)

The mathematical representation is an integral part of the mechanistic explanation. The equation describing the neuron contains terms that map onto component parts and activities of the signal propagating mechanism. The two terms on the left hand side of the equation represent currents between compartment i and the neighboring compartments i + 1 and i − 1. The first term on the right hand side represents currents due to capacitive properties of the cell membrane, the second term represents currents due to synaptic inputs from other neurons, and the third term represents currents due to various other ion channels. As I have already discussed above in the context of the HH model, voltage-dependent ion channels are crucial component parts of the neural mechanism for the action potential. The goal of compartmental modeling is thus to better understand the mechanism underlying the signal processing properties of individual neurons, and the primary outcome is a mechanistic mathematical model.

34 This idealization, made for the sake of mathematical tractability, is biologically unrealistic since neuronal dendrites and axons are fully continuous structures. As discussed in Sect. 3, this neither compromises its status as a mechanistic model nor its explanatory force.
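To see how such a compartmental equation is actually used, here is a minimal simulation sketch of the scheme behind Eq. 5 for a purely passive chain of compartments. This is my own illustration, not code from Einevoll (2006): the compartment count, the parameter values, the sealed-end boundary conditions, and the use of a single leak conductance (standing in for the "various other ion channels") together with a steady injected current (standing in for synaptic input) are all simplifying assumptions.

# Minimal sketch of the compartmental scheme behind Eq. 5, for a passive cable,
# integrated with the forward Euler method. Illustrative assumptions: identical
# compartments, a single leak conductance in place of the full set of ionic
# currents, and a steady injected current in place of synaptic input.
# Units: nF, microsiemens, mV, nA, ms.
import numpy as np

n = 20                      # number of compartments
c = 0.1                     # membrane capacitance per compartment (nF)
g_axial = 0.5               # coupling conductance between neighbours (uS)
g_leak = 0.01               # leak conductance per compartment (uS)
E_leak = -65.0              # leak reversal potential (mV)
dt, t_max = 0.01, 50.0      # time step and total duration (ms)

V = np.full(n, E_leak)      # membrane potential of each compartment (mV)
I_inj = np.zeros(n)
I_inj[0] = 0.5              # steady current injected into compartment 0 (nA)

for _ in range(int(t_max / dt)):
    # Axial currents exchanged with neighbouring compartments
    # (the left-hand side of Eq. 5).
    I_axial = np.zeros(n)
    diff = V[1:] - V[:-1]
    I_axial[:-1] += g_axial * diff      # current flowing in from compartment i+1
    I_axial[1:] -= g_axial * diff       # current flowing in from compartment i-1
    # "Other ion channel" current, here just a passive leak.
    I_ion = g_leak * (V - E_leak)
    # Rearranging Eq. 5 for the capacitive term gives dV/dt for each compartment.
    V = V + dt * (I_axial - I_ion + I_inj) / c

# Depolarization decays with distance from the injection site, as expected for
# a passive cable.
print(np.round(V, 2))

Richer compartmental models replace the leak term with the voltage-gated and synaptic conductances discussed earlier, but the per-compartment bookkeeping of currents remains the same.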
The use of mathematical modeling techniques is a ubiquitous feature in all sectors of contemporary neuroscience. This includes the efforts of neuroscientists to construct mechanistic explanations across these domains. Commonly, both the descriptions of
the target phenomena to be explained and the mechanistic models for those phenomena are mathematically couched. Consequently, philosophers wishing to provide more probing analyses of mechanistic explanation in neuroscience need to be more sensitive to this feature. Reflection on the models described in this paper as well as the unifying 3M principle, which is itself unrestrictive about whether a mechanistic model is mathematically expressed or not, should make this fact evident. Because the 3M principle requires that some components described in the model should correspond to some components in the mechanism for the target system behavior, models that satisfy 3M will ipso facto support mechanistic decompositions. We might highlight this feature of 3M by describing the model itself as supporting a kind of decomposition: model decomposition. However, it must be remembered that the explanatory force of the model derives not simply from its having structure that enables any arbitrary decomposition whatsoever (surely both the Balmer and Nernst equations support this), but rather from the fact that the model decomposition supports a high-fidelity mapping onto a mechanistic decomposition of the target system. At this juncture, the 3M constraint is engaged, and identifies the pivot point for models that explain versus those that merely describe.

9 Conclusion

The aim of this paper is to shed light on the nature of explanation in computational neuroscience. I argue that the mechanistic framework is extensible to models in the domain of computational neuroscience, and that computational explanation in neuroscience is best understood as a species of mechanistic explanation. In support of this claim, I identify a general constraint applicable to explanatory mechanistic models and use this constraint (and several related mechanistic distinctions) to show how paradigmatic models in computational neuroscience can and do explain the phenomena they cover. Finally, I demonstrate how the pattern of model refinement and elaboration undertaken by computational neuroscientists is made comprehensible by treating these computational models in neuroscience as instances of mechanistic explanations.

Acknowledgments Thanks to Carl Craver, Gualtiero Piccinini, and two anonymous reviewers for invaluable comments on an earlier draft of this paper. Thanks also to audience members at the APA Pacific Division Meeting in 2010 for helpful feedback and discussion.
References Aizawa, K. (2010). Computation in cognitive science: It is not all about Turing-equivalent computation. Studies in History and Philosophy of Science Part A, 41(3), 227–236. Andersen, R. A., Essick, G. K., & Siegel, R. M. (1985). Encoding of spatial location by posterior parietal neurons. Science, 230, 450–458. Andersen, R. A., & Mountcastle, V. B. (1983). The influence of the angle of gaze upon the excitability of light-sensitive neurons of the posterior parietal cortex. Journal of Neuroscience, 3, 532–548. Batterman, R. (2002). Asymptotics and the role of minimal models. The British Journal for the Philosophy of Science, 53(1), 21–38. Batterman, R. (2009). Idealization and modeling. Synthese, 169, 427–446. Bechtel, W. (2002). Decomposing the brain: A long term pursuit. Brain and Mind, 3, 229–242.
Bechtel, W. (2008). Mental mechanisms: Philosophical perspectives on cognitive neuroscience. London: Routledge. Bechtel, W., & Abrahamsen, A. (2005). Explanation: A mechanistic alternative. Studies in History and Philosophy of the Biological and Biomedical Sciences, 36, 421–441. Bechtel, W., & Mundale, J. (1999). Multiple realizability revisited: Linking cognitive and neural states. Philosophy of Science, 66, 175–207. Bechtel, W., & Richardson, R. C. (1993). Discovering complexity: Decomposition and localization as strategies in scientific research. Princeton, NJ: Princeton University Press. Bickle, J. (1998). Psychoneural reduction: The new wave. Cambridge, MA: Bradford/MIT Press. Bogen, J. (2005). Regularities and causality; generalizations and causal explanations. Studies in History and Philosophy of Science Part C, 36(2), 397–420. Bogen, J. (2008). The Hodgkin–Huxley equations and the concrete model: Comments on Craver, Schaffner, and Weber. Philosophy of Science, 75(5), 1034–1046. Brozovi´c, M., Abbott, L. F., & Andersen, R. A. (2008). Mechanism of gain modulation at single neuron and network levels. Journal of Computational Neuroscience, 25, 158–168. Bunge, M. (1964). Phenomenological theories. In M. Bunge (Ed.), The critical approach: In honor of Karl Popper. New York: Free Press. Catterall, W. A. (2000). From ionic currents to molecular mechanisms: The structure and function of voltage-gated sodium channels. Neuron, 26(1), 13–25. Chance, F. S., Abbott, L. F., & Reyes, A. D. (2002). Gain modulation from background synaptic input. Neuron, 35, 773–782. Chemero, A., & Silberstein, M. (2008). After the philosophy of mind: Replacing scholasticism with science. Philosophy of Science, 75, 1–27. Churchland, P. S. (1986). Neurophilosophy. Cambridge, MA: MIT Press. Churchland, P. M. (2005). Functionalism at forty: A critical retrospective. Journal of Philosophy, 102(1), 33–50. Churchland, P. S., Koch, C., & Sejnowski, T. J. (1988). Computational neuroscience. Science, 241, 1299–1306. Cole, K. S., & Moore, J. W. (1960). Potassium ion current in the squid giant axon: Dynamic characteristic. Biophysical Journal, 1, 1–14. Craver, C. F. (2001). Role functions, mechanisms, and hierarchy. Philosophy of Science, 68(1), 53–74. Craver, C. F. (2006). When mechanistic models explain. Synthese, 153, 355–376. Craver, C. F. (2007). Explaining the brain. Oxford: Oxford University Press. Craver, C. F. (2008). Physical law and mechanistic explanation in the Hodgkin and Huxley model of the action potential. Philosophy of Science, 75, 1022–1033. Crick, F. (1989). The recent excitement about neural networks. Nature, 337, 129–132. Cummins, R. (1975). Functional analysis. Journal of Philosophy, 72, 741–765. Cummins, R. (1983). The nature of psychological explanation. Cambridge, MA: Bradford/MIT Press. Cummins, R. (2000). “How does it work?” vs. “What are the laws?” Two conceptions of psychological explanation. In F. Keil & R. Wilson (Eds.), Explanation and cognition (pp. 117–145). Cambridge, MA: MIT Press. Dawson, M. R. W. (1998). Understanding cognitive science. Oxford: Wiley-Blackwell. Dayan, P., & Abbott, L. F. (2001). Theoretical neuroscience: Computational and mathematical modeling of neural systems. Cambridge, MA: MIT Press. Downes, S. M. (1992) The importance of models in theorizing: A deflationary semantic approach. In D. Hull, M. Forbes, & K. Okrulik (Eds.), Proceedings of the Philosophy of Science Association (Vol. 1, pp. 142–153). East Lansing. Doyle, D. A., Cabral, J., Pfuetzner, R. 
A., Kuo, A., Gulbis, J. M., Cohen, S. L., Chait, B. T., & MacKinnon, R. (1998). The structure of the potassium channel: Molecular basis of K+ conduction and selectivity. Science, 280, 69–77. Dray, W. H. (1993). Philosophy of history (2nd ed.). Englewood, NJ: Prentice Hall. Dror, I. E., & Gallogly, D. P. (1999). Computational analyses in cognitive neuroscience: In defense of biological implausibility. Psychonomic Bulletin & Review, 6(2), 173–182. Einevoll, G. T. (2006). Mathematical modeling of neural activity. In A. Skjeltorp & A. V. Belushkin (Eds.), Dynamics of complex interconnected systems: Networks and bioprocesses. NATO Science Series II: Mathematics, Physics & Chemistry. Amsterdam: Kluwer.
Einevoll, G. T., & Heggelund, P. (2000). Mathematical models for the spatial receptive-field organization of nonlagged X-cells in dorsal lateral geniculate nucleus of cat. Visual Neuroscience, 17(6), 871–885. Einevoll, G. T., & Plesser, H. E. (2002). Linear mechanistic models for the dorsal lateral geniculate nucleus of cat probed using drifting-grating stimuli. Network: Computation in Neural Systems, 13, 503–530. Einevoll, G. T., & Plesser, H. E. (2005). Responses of the difference-of-Gaussians model to circular drifting-grating patches. Visual Neuroscience, 22(4), 437–446. Enroth-Cugell, C., & Robson, J. G. (1966). The contrast sensitivity of retinal ganglion cells of the cat. Journal of Physiology, 187(3), 517–552. Ermentrout, G. B., & Terman, D. H. (2010). Mathematical foundations of neuroscience. New York: Springer. Fodor, J. A. (1974). Special sciences (Or: The disunity of science as a working hypothesis). Synthese, 28, 97–115. Fodor, J. A. (1975). The language of thought. Cambridge, MA: Harvard University Press. Forber, P. (2010). Confirmation and explaining how possible. Studies in the History and Philosophy of Science Part C: Studies in the History and Philosophy of Biological and Biomedical Sciences, 41(1), 32–40. Haugeland, J. (1978). The nature and plausibility of cognitivism. Behavioral and Brain Sciences, 1, 215–226. Hempel, C. G. (1965). Aspects of scientific explanation and other essays in the philosophy of science. New York: Free Press. Hille, B. (2001). Ion channels of excitable membranes (3rd ed.). Sunderland, MA: Sinauer. Hille, B., Armstrong, C. M., & MacKinnon, R. (1999). Ion channels: From idea to reality. Nature Medicine, 5(10), 1105–1109. Hodgkin, A. L., & Huxley, A. F. (1952). A quantitative description of membrane current and its application to conduction and excitation in nerve. Journal of Physiology, 117, 500–544. Johnson-Laird, P. N. (1983). Mental models: Towards a cognitive science of language, inference and consciousness. New York: Cambridge University Press. Kaplan, D. M. & Craver, C. F. (forthcoming): The explanatory force of dynamical and mathematical models in neuroscience: A mechanistic perspective. Philosophy of Science. Kuffler, S. (1953). Discharge patterns and functional organization of mammalian retina. Journal of Neurophysiology, 16(1), 37–68. Machamer, P., Darden, L., & Craver, C. F. (2000). Thinking about mechanisms. Philosophy of Science, 67, 1–25. MacKinnon, R. (2004). Nobel lecture: Potassium channels and the atomic basis of selective ion conduction. Bioscience Reports, 24(2), 75–100. Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W.H.Freeman & Co. Ltd. Marr, D., & Hildreth, E. (1980). Theory of edge detection. Proceedings of the Royal Society of London Series B, Biological Sciences, 207(1167), 187–217. Marr, D., Ullman, S., & Poggio, T. (1979). Bandpass channels, zero-crossings, and early visual information processing. Journal of the Optical Society of America, 69(6), 914–916. Mastronarde, D. N. (1992). Nonlagged relay cells and interneurons in the cat lateral geniculate nucleus: Receptive-field properties and retinal inputs. Visual Neuroscience, 8(5), 407–441. Mauk, M. D. (2000). The potential effectiveness of simulations versus phenomenological models. Nature Neuroscience, 3(7), 649–651. Mazzoni, P., Andersen, R. A., & Jordan, M. I. (1991). A more biologically plausible learning rule applied to a network model of cortical area 7a. 
Cerebral Cortex, 1, 293–307. Mel, B. W. (1993). Synaptic integration in an excitable dendritic tree. Journal of Neurophysiology, 70, 1086–1101. Neher, E. (1992). Nobel lecture: Ion channels for communication between and within cells. Neuron, 8(4), 605–612. Newell, A., & Simon, H. A. (1976). Computer science as an empirical enquiry: Symbols and search. Communications of the ACM, 19, 113–126. Piccinini, G. (2006). Computational explanation in neuroscience. Synthese, 153, 343–353. Piccinini, G. (2007). Computing mechanisms. Philosophy of Science, 74, 501–526. Piccinini, G. (2010). The mind as neural software? Revisiting functionalism, computationalism, and computational functionalism. Philosophy and Phenomenological Research, 81(2), 269–311.
Piccinini, G., & Craver, C. F. (2011). Integrating psychology and neuroscience: Functional analyses as mechanism sketches. Synthese. doi:10.1007/s11229-011-9898-4. Pouget, A., & Sejnowski, T. J. (1997). Spatial transformations in the parietal cortex using basis functions. Journal of Cognitive Neuroscience, 9(2), 222–237. Pouget, A., & Snyder, L. H. (2000). Computational approaches to sensorimotor transformations. Nature Neuroscience, 3(Suppl), 1192–1198. Putnam, H. (1967). The nature of mental states (Reprinted from Mind, language, and reality, by Putnam, Ed., 1975, Cambridge: Cambridge University Press.) Pylyshyn, Z. W. (1984). Computation and cognition. Cambridge, MA: MIT Press. Revonsuo, A. (2001). On the nature of explanation in the neurosciences. In P. Machamer, R. Grush, & P. McLaughlin (Eds.), Theory and method in the neurosciences (pp. 45–69). Pittsburgh, PA: University of Pittsburgh Press. Rodieck, R. W. (1965). Quantitative analysis of cat retinal ganglion cell response to visual stimuli. Vision Research, 5(11), 583–601. Rusanen, A. M., & Lappi, O. (2007). The limits of mechanistic explanation in neurocognitive sciences. In S. Vosniadou, D. Kayser, & A. Protopapas (Eds.), Proceedings of the European cognitive science conference. London: Francis and Taylor. Salinas, E., & Abbott, L. F. (1996). A model of multiplicative neural responses in parietal cortex. Proceedings of the National Academy of Sciences, 90, 11956–11961. Salinas, E., & Thier, P. (2000). Gain modulation: A major computational principle of the central nervous system. Neuron, 27, 15–21. Salmon, W. (1984). Scientific explanation and the causal structure of the world. Princeton, NJ: Princeton University Press. Salmon, W. (1989). Four decades of scientific explanation. Pittsburg, PA: University of Pittsburg Press. Schaffner, K. F. (2008). Theories, models, and equations in biology: The heuristic search for emergent simplifications in neurobiology. Philosophy of Science, 75(5), 1008–1021. Sejnowski, T. J. (2001). Computational neuroscience. In N. J. Smelser & P. B. Baltes (Eds.), Encyclopedia of the social and behavioral sciences (pp. 2460–2465). Oxford: Elsevier. Shadmehr, R., & Wise, S. (2005). Computational neurobiology of reaching and pointing: A foundation for motor learning. Cambridge, MA: MIT Press. Shagrir, O. (1998). Multiple realization, computation and the taxonomy of psychological states. Synthese, 114, 445–461. Shagrir, O. (2006). Why we view the brain as a computer. Synthese, 153, 393–416. Shapiro, L. A. (2000). Multiple realizations. Journal of Philosophy, 97, 635–654. Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1–23. Von Eckart, B., & Poland, J. S. (2004). Mechanism and explanation in cognitive neuroscience. Philosophy of Science, 71, 972–984. Weber, M. (2008). Causes without mechanisms: Experimental regularities, physical laws, and neuroscientific explanation. Philosophy of Science, 75(5), 995–1007. Weisberg, M. (2007). Three kinds of idealization. Journal of Philosophy, 104(12), 639–659. Woodward, J. (2003). Making things happen. New York: Oxford University Press. Zipser, D., & Andersen, R. A. (1988). A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature, 331, 679–684.
Synthese (2011) 183:375–388 DOI 10.1007/s11229-011-9882-z
Mechanisms and constitutive relevance Mark B. Couch
Received: 12 October 2010 / Accepted: 12 January 2011 / Published online: 1 February 2011 © Springer Science+Business Media B.V. 2011
Abstract This paper will examine the nature of mechanisms and the distinction between the relevant and irrelevant parts involved in a mechanism’s operation. I first consider Craver’s account of this distinction in his book on the nature of mechanisms, and explain some problems. I then offer a novel account of the distinction that appeals to some resources from Mackie’s theory of causation. I end by explaining how this account enables us to better understand what mechanisms are and their various features. Keywords
Mechanisms · Constitutive relevance · Capacities · Craver
1 Introduction The notion of a mechanism has become increasingly important in philosophical analyses of the sciences. Many philosophers now accept that explanations that appeal to mechanisms have a fundamental role to play in scientific practice (Bechtel and Richardson 1993; Glennan 1996; Machamer et al. 2000). The notion of a mechanism, however, still remains inadequately understood. There is unclarity about what precisely makes something count as the mechanism for a capacity, and no agreement about the criteria we should use in making this determination. One way this problem has become apparent is that we still lack a clear account of how to define the difference between the relevant and the irrelevant parts a mechanism contains. We recognize that there is a difference between the parts that are relevant and those that are not, but we lack an adequate account of how this works.
M. B. Couch (B) Department of Philosophy, Seton Hall University, 400 South Orange Ave., South Orange, NJ 07079, USA e-mail:
[email protected]
In this paper, I will examine this issue and offer an answer to this particular problem. My concern will be to explain how we should understand the difference between the relevant and irrelevant parts of a mechanism, and the criteria we should use in making this distinction. The paper will be organized in the following way. In Sect. 2, I will explain how the problem of relevance arises in trying to understand mechanisms and why this issue is important in the sciences. In Sect. 3, I will examine Craver's (2007) recent attempt to address this issue in his book on the nature of mechanisms. Section 4 will show that while the account he offers represents one of the first attempts to address this issue at length in the literature, the problem is that it only provides an account of evidence for relevance and not relevance itself. In Sect. 5, I will present a novel account of relevance which makes use of Mackie's discussion of causation and inus conditions, and explain how the account improves over the previous account. Finally, in Sect. 6, I will draw some conclusions about the account and its role in helping us to better understand mechanisms in the sciences.

2 The problem of constitutive relevance

There are a number of concerns researchers in the sciences have in attempting to understand the phenomena in their domain. These include such things as discovering whether there are general laws that apply to the phenomena in question, what kinds of predictions can be made, how the phenomena can be best explained, and so forth. These are issues that arise in many forms of scientific research. But in addition to these concerns we also find that many researchers are interested in understanding the makeup of structures and how they are capable of performing various capacities. For instance, researchers on human vision in psychology are interested in understanding how the capacity for color discrimination depends on the eye's structure. The human eye consists of numerous parts that are involved in producing this sort of capacity (the lens, retinal receptors, vitreous humor, etc.). Understanding the capacity of color discrimination and how it is performed by the eye requires an understanding of the parts of the structure, and their operations. In this way researchers become interested in understanding the mechanisms that serve to underlie this capacity. In many sciences like biology, neuroscience, and psychology there is a keen interest among researchers in investigating the details of such mechanisms and how they operate to produce the phenomenon of interest. As commonly understood, a mechanism is a structure or system that performs some capacity. This can be understood as a particular behavior or activity a mechanism performs.1

1 In my other work I have spoken of capacities and will continue to do so here, although Craver's preference (2001) is to speak in terms of behaviors or activities rather than capacities. I don't intend anything significant to depend on this difference in what follows.

Take the capacity of color discrimination performed by the eye as our example. Researchers have learned that (as far as the processes in the eye are concerned) information from light in the environment is passed through the pupil and impacts the retinal receptors in the posterior of the eye (i.e., rods and cones). The retinal receptors contain different kinds of rhodopsin molecules that are chemically structured to
interact with the light. The interaction of the light with the rhodopsin molecules causes a photoreaction in the receptor cells, which leads to a neural signal that is sent through the optic nerve in the back of the eye. After leaving the eye the neural signal is relayed to the appropriate parts of the brain for further processing. The capacity for color discrimination is something researchers on perception want to understand. Explaining how the capacity is performed leads researchers to be interested in the parts of the eye that make this capacity possible. In attempting to understand a mechanism like this researchers confront a difficulty, though. The human eye is a mechanism that contains many parts that contribute to other capacities in addition to color discrimination, such as brightness discrimination, distance perception, motion detection, etc. It is understood that not every part or component of the eye is involved in each of the capacities that are performed. It is known, for example, that the basic system for light reception in the human eye consists of the rod and cone receptors and their photopigments in the retina, since these are what are needed for transducing the information from light in the environment into neural signals. But understanding how the eye acquires the information from light does not require understanding the focusing system within the eye (e.g., the lens system). This is because the focusing system is really responsible for adjusting the information about the light present in the eye, and not for the mere acquisition of such information (cf. Bock and von Wahlert 1965, p. 122). There is, unfortunately, no agreement how to go about distinguishing the relevant from the irrelevant parts in a mechanism like this. We recognize that there is an intuitive distinction to make among the parts a mechanism contains, but we have only a poor understanding of how to explain this. The problem has been recently noted in Craver’s book (Explaining the Brain (2007)) in the course of his discussion of explanation in the neurosciences. He says that the problem of explaining what makes a part of a mechanism relevant to a capacity the mechanism performs (what he calls “constitutive relevance”) has been largely neglected by philosophers in this area. The failure to address constitutive relevance is a major lacuna not just in Cummins’s model of explanation, but also in the systems tradition generally, in recent discussions of mechanistic explanation … and, in fact, in all discussions of “microreduction” in the philosophy of biology and the philosophy of mind. Considerable philosophical effort has been expended on the topic of etiological (that is, causal) relevance, but almost none has been dedicated to the problem of constitutive relevance. (p. 140) Although a lot has been written about the nature of mechanisms and their role in scientific explanations, it is surprising that we still lack an acceptable answer to this basic problem. There are several reasons why an answer to this problem would be important to have. First, having an account of relevance would help us to better understand mechanisms in general. The task researchers have is one of explaining how various mechanisms perform the capacities researchers are interested in. 
Yet, we have only a poor understanding of how this works if the most we can say is that “the behavior is due to the underlying mechanism in some manner or other.” Ideally, what is wanted is a detailed account of how the different capacities of a system depend on the specific parts present.
Answering this question is, thus, an important part of understanding what is involved in giving a mechanistic explanation. Second, this issue is related to other, well-known issues about reductionism and interlevel relations in the sciences that have received a lot of attention. One issue is the familiar idea that there need not be a one-to-one relation between kinds of mechanisms and kinds of capacities because of the occurrence of multiple realization. The same capacity for color discrimination may be realized one way in a particular individual, say, and in a different way in another individual. This possibility is thought to have important consequences for issues about reductionism in the sciences since it would mean there is cross-classification between kinds at different theoretical levels, and so no way of reducing them to one another (see Fodor 1974; Kitcher 1984; Kim 1992; Bechtel and Mundale 1999). Presumably, deciding whether this is the case requires an understanding of when the mechanisms considered belong to the same kind or not, and, so, requires we have criteria for individuating mechanisms and their parts. This means that classical problems about reductionism and related concerns depend on a clear understanding of this prior issue. It is to be expected, then, that addressing this problem will contribute towards answering other, fundamental questions about the sciences.
3 Craver’s account of relevance This is the problem of relevance we need an answer for. It will help to consider the problem by describing how one might approach the issue to begin with. In explaining a capacity of a mechanism, researchers are concerned with understanding the details of how the mechanism operates. This requires determining the various conditions under which a capacity is exhibited by the system. In that case, a natural thought to have is that researchers should begin by describing the capacity they want to explain, and then look for the parts within the mechanism that are involved in bringing about the capacity. The relevant parts will be the ones involved in bringing about the capacity, and the irrelevant parts are not so involved. In fact, I think a careful reading of Cummins’ work suggests that he has a similar account of relevance in mind, despite what Craver says before about the systems tradition saying little about this problem. If one looks at his writings, Cummins is careful to explain that the relevant parts of a structure are those which are involved in giving what he refers to as a “functional analysis” (see 1975, p. 196, n25). The idea is that a part is relevant just in case it is involved in providing a functional-analytical explanation of the presence of a capacity.2 So it would be wrong to think that the systems tradition has not said anything about this issue. While I think this is important to observe, I do think Craver is right that this issue is of fundamental importance and that it has not generally received the attention it deserves (for some others who have discussed this
2 Here is the appropriate text from Cummins. “Indeed, what makes something part of, e.g., the nervous system is that its capacities figure in an analysis of the capacity to respond to external stimuli, coordinate movement, etc. Thus, there is no question that the glial cells are part of the brain, but there is some question whether they are part of the nervous system or merely auxiliary to it.”
issue see Shapiro (2004) and Couch (2009)).3

3 Here is a passage from Shapiro, for example, who talks in terms of identifying the relevant properties of mechanisms rather than parts. "I propose to use the name R-property as a label for those properties of realizations whose differences suffice to explain why the realizations of which they are properties count as different in kind…[such properties] are identified in the course of functionally analyzing some capacity…[they are] those properties that causally contribute to the production of the capacity of interest" (Shapiro 2004, p. 52).

I also think that the remarks Cummins makes on this point are somewhat vague until we understand precisely how to define the notion of relevance and what this involves. In particular, we need to know what it means for a part to "figure in" providing an explanation of a capacity on the account. One of the virtues of Craver's book is that it helps clarify the issue by examining it in more detail than is common. I want to develop our consideration of this issue by looking at the account Craver provides. The account offered in the book is along the lines of the suggestion just made, only it is more carefully detailed and nuanced. He says we should think of the task of explaining the performance of a behavior or activity (he prefers these notions) in terms of what he calls giving a "mechanistic explanation" (2007, p. 7). Suppose someone is interested in an activity and wants to learn how it depends on the parts of the mechanism that has it (note that Craver's account also permits that one might begin with a given part of a mechanism first, and then later attempt to discover what activity it performs through investigation). The activity is specified in terms of the behavior the mechanism performs, which is typically described by means of verbs. For example, we have seen that one activity of the eye is the activity of color discrimination that involves transducing information from light into neural signals that proceed through the back of the eye. Once researchers have specified this activity, they then seek to identify the particular activities of the parts of the eye that bring the activity about. In this case, the activity of color discrimination will be explained in terms of the activity of the pupil to let the light in, the activity of the photoreceptors to interact with the light on the retina, the activity of the rhodopsin molecules in the receptors to undergo a conformational change in response, etc. The idea is to identify the subactivities that are involved in the perceptual processes within the eye, and to identify the parts with the activities. These are the individual physical parts of the mechanism (the pupil, the receptors, visual pigments) just mentioned. In short, the idea is to decompose the mechanism into the activities and parts involved in explaining why the overall activity of color discrimination occurs. As Craver says, the explanation of an activity F of a mechanism S proceeds by identifying the subactivities [C1, C2, ..., Cm] and the parts [P1, P2, ..., Pn] which have them, and how they are organized to enable the mechanism to perform activity F (2007, p. 7). The organization of the parts within the mechanism is an important element of the explanation, since not any arrangement (temporal, spatial, etc.) of the parts will bring about the activity. So in a complete explanation this is a feature that has to be included in the account. In Craver's view it is the organization of parts and their activities that together account for the overall activity in question. This complex structure serves as what he calls the "mechanism" for F (where the mechanism is distinct from the overall structure itself). In this sense, a mechanism can be understood as a proper part of a structure and serves as what some would call the realization basis for an activity. When we have
completed identifying this mechanism and made clear how it accounts for the activity, we have given what Craver calls a constitutive, or mechanistic, explanation (2007, p. 108). Now, I noted that not all of the parts of a structure are relevant to each activity it performs, since some parts do not play a role in bringing about a given activity. These would include waste products, say, that float through the inner eye and do not contribute to the process of color discrimination. These kinds of features Craver calls mere "parts," and distinguishes them from what are termed "components" (which genuinely contribute to the activity). The question to be answered is how we determine which parts of the structure serve as the relevant components, and which are mere parts. To answer this question Craver appeals to the manipulability account of causation, which has been developed in another context by Woodward (1997, 2003). The idea is to appeal to a view about the nature of causation and causal relevance, and show how this can be extended to the notion of mechanistic explanation. The view Craver presents is that researchers can determine whether a part is relevant to an activity by establishing that they are mutually manipulable. The ability to manipulate a part of a structure to bring about changes in its behavior is evidence the part is relevant to the behavior the structure performs. As he explains, "a part is a component in a mechanism if one can change the behavior of the mechanism as a whole by intervening to change the component and one can change the behavior of the component by intervening to change the behavior of the mechanism as a whole" (2007, p. 141). Put in general terms, the idea is that a part X is relevant to some activity Y if interventions on X bring about changes in Y, and vice versa (the manipulability relation in question is symmetrical). We can illustrate the account in the following way. Suppose researchers are interested in showing that the pupil is a part relevant to the activity of the eye to control for brightness from light in the environment—this concerns the amount of light permitted to fall on the retina. The way researchers can demonstrate this is by showing that there is a mutual manipulability relation between the pupil and the activity in question. To do this, researchers might first try to bring about changes in the pupil by introducing the chemical Cyclogyl into the eye, which impairs the pupil's ability to operate (this is a chemical used in eye exams to dilate the pupil and facilitate retinal examination). They would then examine whether this brings about changes in the eye's behavior of brightness discrimination. On the other hand, researchers might next try to bring about changes in the behavior of brightness discrimination by blocking the light entering the eye directly. They would then observe whether this has an effect on the pupil's behavior. If intervening on the eye in these ways brings about the expected changes, then researchers can say the pupil is relevant to the activity being considered. However, if the interventions do not bring about the changes, then the part does not make a contribution, and is irrelevant to the mechanism. It may be that the part is relevant to the performance of some other activity of the structure, but, insofar as it does not contribute to the specific activity being explained, it does not have a place in the mechanism for that activity. Craver makes some further points about the details of the account worth noting.
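Before turning to those further points, the two-way intervention logic just illustrated can be made concrete in a small toy simulation. This is my own illustrative sketch, not anything drawn from Craver: the crude "eye" model, its parameter values, and the intervention functions are hypothetical stand-ins chosen only to show the shape of the test.

# Toy sketch of the mutual manipulability test (illustrative assumptions only).
# A crude "eye": retinal illumination depends on ambient light and on pupil
# aperture; the pupil constricts as the incoming light level rises. A waste
# product floating in the eye is included as a "mere part" that does no work.

def pupil_aperture(light_level, impaired=False):
    """Behavior of the part: constrict as the light level rises (unless impaired)."""
    if impaired:                        # e.g., dilated by Cyclogyl
        return 8.0                      # fixed wide aperture (mm)
    return max(2.0, 8.0 - 0.5 * light_level)

def brightness_response(ambient_light, impaired_pupil=False, light_blocked=False):
    """Overall behavior: retinal illumination after regulation by the pupil."""
    if light_blocked:                   # intervention on the overall behavior
        ambient_light = 0.0
    aperture = pupil_aperture(ambient_light, impaired=impaired_pupil)
    waste_product = 3.7                 # a mere part: never used below
    return ambient_light * aperture / 8.0

ambient = 10.0

# (1) Intervene on the part (impair the pupil): does the overall behavior change?
part_to_whole = brightness_response(ambient, impaired_pupil=True) != brightness_response(ambient)

# (2) Intervene on the overall behavior (block the light): does the part's behavior change?
whole_to_part = pupil_aperture(0.0) != pupil_aperture(ambient)

print("pupil passes the mutual manipulability test:", part_to_whole and whole_to_part)
# No intervention on `waste_product` could alter the brightness response, so by
# this toy criterion it remains a mere part rather than a component.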
In particular he claims that the interventions have to meet certain conditions to be appropriate. He says the kinds of changes that matter are what he terms “ideal interventions”
(2007, p. 154). Roughly, this sort of intervention occurs when we can show that the change in the behavior of the activity is due solely to the change in the component, and not because of other interactions with features of the mechanisms. This condition is required to rule out certain indirect relationships that can be confused with relevance, but which should be excluded. The interventions researchers make must bring about the changes to the activity by means of the component, but not, for example, by directly affecting the activity itself. Craver notes that the various requirements on what makes an intervention ideal fit together with the experimental procedures that are actually used by researchers in understanding how mechanisms work. Researchers in the sciences commonly seek to manipulate mechanisms experimentally in order to identify the relations among the parts and the activities present. This includes performing various types of interference experiments, stimulation experiments, and others Craver describes (2007, p. 146). Each of these forms of investigation helps researchers in understanding how the parts of a mechanism are related to the activities researchers are interested in, and a complete statement of the account will be sensitive to these further points.4

4 I should note that Craver's account of mechanistic explanation is usefully richer than I am describing it. In Chap. 4, he discusses ways that Cummins' account of functional analysis differs from the full-blooded notion of mechanistic explanation he has in mind, and he discusses several kinds of organization of components that are different from one another. He also discusses several of the criteria on an account of explanatory relevance. These are important details for understanding the features of mechanistic explanations, but I am overlooking them because they do not contribute to the particular difficulty I raise in the following section.

4 A difficulty

While this account is an improvement in understanding mechanisms, there is a difficulty with the account of relevance. In describing how a part of a mechanism is relevant to an activity, we have been told that it is facts about manipulability that are important. It is these facts that enable researchers to determine whether a part of a mechanism is relevant to an activity or not. It is important to observe that Craver frequently describes the account in terms of the methods researchers have for establishing whether a part is relevant. In one place in the book, for example, he discusses how one can manipulate sodium channels to alter the membrane potential of a neural cell, and he says that "the ability to manipulate items in this way is crucial evidence for establishing causal and explanatory relationships among the mechanism's components" (2007, p. 132; italics added). In another place, he describes how the activities of the parts (the X's) in a mechanism (S) can be determined to be relevant to the mechanism's overall activity being explained. He writes, "I conjecture that to establish that X's ϕ-ing is relevant to S's ψ-ing it is sufficient that one be able to manipulate S's ψ-ing by intervening to change X's ϕ-ing…" (2007, p. 159; italics added). The language used here concerns the methods researchers have for establishing whether a part is relevant to an activity. But this way of describing the point makes it seem epistemic; that what we are being offered is a way of deciding whether a part is relevant to an activity. This is an important aspect of understanding the relationship between parts and activities, and, as such, has a role to play in understanding the issues. But this is different from telling
us what the relevance relation is. The relation we are concerned with is a relation in the world between the parts of a mechanism and an activity it performs. To provide an account of the relation I take it one needs to explain what this relevance relation consists in. On the account offered, we are told we can determine if a part X is relevant to an activity Y by seeing whether interventions on X bring about changes in Y, and vice versa. It is by observing whether the changes in the part bring about changes in the activity that we can establish the part is relevant or not. Learning this is information of sorts about the part involved. But it is not the same as providing an account of the relevance relation itself which holds between the part and the activity. So insofar as the account is merely concerned with how to establish a part is relevant it seems we are left seeking something more. The account provided does not explain what the relevance relation is independently of the evidence researchers have for manipulating the parts of a mechanism.5

5 Notice that Craver describes constitutive relevance as "a relationship between a component in the mechanism and the behavior of a mechanism as a whole" (2007, p. 146, n26). Understood this way, it is a relationship between a component and the mechanism's behavior. But when it comes to the account he explains how to ascertain if a component is relevant to the mechanism's behavior (not the relation), which is something different.

It is worth noting that there is a similar worry that has been raised about the account Woodward offered in his discussion of causation. Early on a complaint was made that an account of causal relations in terms of possible manipulations confuses epistemological issues with the nature of the relations themselves. Glennan, for example, is someone who has complained that the account really addresses the epistemic issue of how one goes about identifying when causes and effects are appropriately related. He writes that "the manipulability theorist has made an important point about the epistemology of causation…. Experimental manipulations can provide evidence that variables are connected, even in the absence of mechanical knowledge of how they are connected" (2009, p. 318; for related worries see Psillos 2007). It should be noted, in fairness, that Woodward has replied to this worry about his account (2000, p. 204), and denied that it should be understood in this way. But regardless of how this issue affects Woodward's view, it would be mistaken to interpret Craver's account along similar lines. The language he uses is epistemic and there is no indication in the book that he intends the account to be understood in other terms.6

6 Let me make two observations about the point being made here. First, Craver has informed me that he intends his account to be understood in epistemic terms (personal communication). Second, there is a possible misunderstanding of the point being made that I want to avoid. It is no part of my project to criticize the manipulationist account of causation presented by Woodward itself. My only concern in this paper is with the relation of constitutive relevance and whether this is adequately explained on Craver's account. This issue is not the same as how to evaluate competing accounts of causation since I take it constitutive relations are different from causal relations.

There is an additional piece of evidence to support this way of thinking about the account. In giving his account Craver tells us that it offers a sufficient (but not necessary) condition for relevance. This is an important feature to understand. If the mutual manipulability account only provides us with a sufficient condition for relevance, this would suggest that there are kinds of relevance not covered by the account, and that there is something more to say about the relation. In that case it seems fair to wonder what more there is to the notion that has not been explained (the account offered later
will attempt to give both sufficient and necessary conditions for relevance). In making this point I should note I am not suggesting that the account provided isn’t useful in certain ways, or that it doesn’t supply us with information that can be used to supplement an account of relevance. Understanding how one can go about establishing that a part is relevant to an activity would be something important to have learned. But it does suggest that there is more to understand about the relation than we have been given. Let me respond to a concern someone might have with this way of understanding the account. It might be suggested that Craver is only concerned with scientific explanation, and so he doesn’t have to worry about this issue. While there is something to this idea, I don’t think it is correct to say he is only concerned in the book with scientific explanation. It is true that he spends a good deal of time focusing on this issue, but it seems to me the account offered overlaps with metaphysical issues in certain ways. For instance, notice he says explicitly that the argument he offers relies on the thesis of multiple realizability, and he invokes realization and supervenience relations in various places. These are commonly understood as metaphysical relations and not simply matters of scientific explanation. He also contrasts his account with Kim’s and Gillett’s accounts of realization that are metaphysical. If Craver’s account is intended to be a genuine alternative to these accounts, then it is hard to understand how it cannot have something to say about these and related relations like relevance. So even if it is said the account is focused on scientific explanation this does not mean these issues are as removed as may be suggested.
5 Another account of relevance I now want to offer an alternative that goes beyond the account presented and which doesn’t have this problem. To approach the account, we can begin by recalling how providing a mechanistic explanation is supposed to work in the terms Cummins uses. The idea is that researchers explain the presence of a capacity by identifying the mechanism that realizes the capacity. For a structure, this will consist of a set of components and their organization that underlie the occurrence of the capacity. In this respect the issue concerns identifying the components of the structure that are involved in determining the capacity’s occurrence. The fact that the components serve to determine the presence of the capacity is why they have to be appealed to in giving the explanation. When we consider a particular structure we have seen that there may be numerous parts present that are not all involved in the capacity’s occurrence. We saw that researchers have learned that the waste products in the human eye are not involved in the basic capacity for color discrimination, while the retinal receptors are important components. This means there is a distinction between those parts in the eye that do not determine the presence of color discrimination, and those that do serve to determine the capacity. Those parts whose absence does not affect the presence of the capacity are not needed, and can be ignored in giving the explanation. So what we are looking for is an account that enables researchers to distinguish the necessary parts from the ones that are unnecessary.
This way of describing the issue brings to mind a notion familiar from discussions of causation. I think we can usefully appeal in this setting to the notion of an inus condition from Mackie (1974). Mackie is primarily concerned with explaining the nature of causal relations in terms of necessary and sufficient conditions. He offers us the following example to explain the account. Suppose that an electrical circuit in a house short-circuits and causes a fire, and the house burns down. The short circuit caused the fire to occur in conjunction with a variety of other conditions present. These include the short circuit, oxygen, and combustible material. Each of these is an individual condition that is part of a complex condition that caused the fire to occur on the particular occasion. Without each condition present, the short circuit wouldn’t have led to the fire (e.g., without oxygen present, the fire would not have started). Mackie suggests that we can think of the total cause of the fire as a complex condition made of the individual conditions that are needed for bringing the fire about. In light of this, when we say that the short circuit caused the fire, for Mackie, what we are saying is that the short circuit was a necessary condition for the fire to occur on the occasion. In this sense, the short circuit is an insufficient but nonredundant part of an unnecessary but sufficient condition for the fire (i.e., an inus condition). Notice this account allows that a fire can be started in different ways on different occasions since there can be different causes of a fire. It is permitted that in another house the fire may have been caused by a lantern tipping over instead. In this case, the lantern will be the cause of the fire together with the presence of the other conditions of oxygen, combustible material, etc. In this situation as well, the conditions involved will be inus conditions for the fire. I think we can say something very similar about the relevant components of a mechanism. We said a mechanism consists of several components and their organization that together serve to determine the presence of the capacity. The components are perceived as parts of a complex structure that serves to realize the capacity’s presence. And, in a similar way, the individual components of the mechanism will be necessary for the presence of the capacity on an occasion since each component has to be present. What this suggests is that the components of a mechanism that realize a capacity should be seen as inus components. They are the relevant parts of a structure that researchers should appeal to in giving the explanation. I propose to define a relevant part, then, as an insufficient but nonredundant part of an unnecessary but sufficient mechanism that serves as the realization of some capacity. Under normal conditions the retinal receptors are necessary for the processing of information from light in the human eye on an occasion. The processing of information will not occur without the receptors being present, and so the receptors should be seen as inus components. To make this more precise the situation can be described in the following way. Suppose we have a complex structure ABCD that serves to realize a capacity F. Suppose, further, that, of the parts present, only ABC are needed for the presence of F on the occasion, and the other part D is an extra part that serves no role in the presence of the capacity. 
In this case, the individual components A, B, and C are inus components for the capacity F, and D is an irrelevant part that should be ignored in giving the explanation. This is because the latter part plays no role in explaining why the capacity is present. Furthermore, it should be apparent from this that the complex structure that consists of ABC together with its organization serves as the mechanism for F.
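The proposal can also be put in quasi-algorithmic form. The toy sketch below is my own illustration, not part of the account itself: the parts, the test for the presence of F, and the helper names are all hypothetical. It treats a part as relevant, on an occasion, just in case it is a nonredundant member of some collection of parts that, as organized on that occasion, suffices for the capacity.

# Toy sketch of inus-style relevance (illustrative assumptions only).
# A "structure" is a set of parts; capacity_F reports whether a given
# collection of parts, organized as on this occasion, suffices for F.
from itertools import combinations

def capacity_F(parts):
    """Hypothetical test: F is present whenever A, B, and C are all present."""
    return {"A", "B", "C"}.issubset(parts)

def is_relevant(part, structure, capacity):
    """A part is relevant iff it is a nonredundant member of some sufficient
    subset of the structure, i.e., the subset suffices for the capacity and
    ceases to suffice once the part is removed."""
    others = structure - {part}
    for r in range(len(others) + 1):
        for rest in combinations(others, r):
            candidate = set(rest) | {part}
            if capacity(candidate) and not capacity(candidate - {part}):
                return True
    return False

structure = {"A", "B", "C", "D"}          # D: an extra part (e.g., waste products)

for p in sorted(structure):
    print(p, "relevant to F:", is_relevant(p, structure, capacity_F))

# The mechanism for F on this occasion is the organized collection of relevant parts.
mechanism = sorted(p for p in structure if is_relevant(p, structure, capacity_F))
print("mechanism for F:", mechanism)      # expected: ['A', 'B', 'C']

Nothing here turns on the particular capacity test chosen; the point of the sketch is only that relevance is defined by nonredundant membership in a sufficient collection of parts, not by the evidence one happens to have for it.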
To avoid misunderstandings and deal with potential difficulties, there are several clarifications it will help to make about the account. First, it should be clear there is a difference between Mackie's account and the use I am making of it. As I noted, his concern was with explaining the nature of causal relations between events. Now, it is commonly believed that events that are causally related are metaphysically distinct from one another. We know that by flipping the light switch one can cause the lights to come on, and that the event of the lights coming on is a distinct event from the flipping of the switch. But the situation is different with respect to mechanisms and their capacities. In this case, the presence of a mechanism is thought to necessitate the presence of a capacity in a sense stronger than causal necessitation (cf. Levine 2001, p. 14). This is because the presence of the mechanism is thought to metaphysically ensure that the capacity is present. The relation between them should, therefore, not be viewed as a causal one. There is also support for this point from the fact that the effect of a cause is typically believed to occur after the cause (the lights coming on occurs after the flipping of the switch). In the case of a mechanism, though, the capacity is thought to be present at the same time the mechanism that realizes it is present. So the inus notion I am using needs to be understood in a different way from the account Mackie offered. Second, it may seem that the account doesn't fit well with common ways of describing the relation between mechanisms and capacities. It is often said that a mechanism should be sufficient for the realization of a capacity, but should not be considered necessary. This is related to concerns about multiple realization. If different kinds of mechanisms can realize the same kind of capacity, then no mechanism can be necessary. But this is not something the account intends to deny. The account accepts that the mechanisms that realize a capacity are sufficient (in the sense of realization) for the presence of a capacity; it is merely the components of the mechanisms that are necessary in the circumstances for the capacity to occur. The present approach accepts common ways of viewing the realization relation, and adds the inus notion to the account to improve our understanding of the mechanisms.7

7 There are two points to note about this. First, in fitting together with common ways of thinking about realization relations the account also serves to improve our understanding of what multiple realization involves. This is because it helps one to understand what the mechanisms are that underlie the capacities. This has been a concern in recent debates, where there has been a lack of agreement about how to identify the mechanisms that should be considered in the examples (Bechtel and Mundale 1999; Bickle 2003; Polger 2008; Aizawa 2009). The account can be seen to have implications for these broader issues. Second, notice that Mackie accepts that the inus relation is a symmetrical relation between causes and effects. In the account I am offering this means the realization relation is a nonsymmetric relation, and the relevance relation is symmetrical. So the account is similar in this respect to the view Craver offers.

The third point is that it is important to be clear about the notion of "in the circumstances" used in the account. Mackie makes a distinction it is important to be mindful of (1974, p. 31). He suggests there are different ways of thinking about the notion and only one is right for understanding the account. When we speak about a particular event causing another event in the circumstances, in one sense we might be referring to the broad circumstances in which the causal sequence occurred. For instance, suppose someone flips a switch and causes the lights to come on in a particular room. The lights coming on will have been caused by someone flipping the switch, but we accept
that the same (token) effect could also have been produced by the person introducing a current in the wires going to the light directly. In this way of talking, the cause of the effect might have been different in the circumstances. But Mackie makes it clear that this is not the appropriate sense he thinks is right for understanding (token) causal relations. He suggests that a statement about the flipped switch causing the lights to come on should be understood to mean that, on the particular occasion in which the lights in fact came on, they were caused by the flipping of the switch. In this case, it is the sequence of events which took place that serves as the circumstances to consider. In this sense, the flipping of the switch was necessary for the lights coming on. This is because the flipping of the switch played a noneliminable role in the sequence of events that occurred. We should understand the sense in which a component is necessary in the circumstances for a capacity in a similar way. When this point is understood we can deal with a common concern raised about accounts like those I am presenting. This concerns the problem of redundant parts. Suppose it is claimed that one of the kidneys of a person is a component in a mechanism that realizes a particular quantity of blood filtering done by that kidney, and so is inus for filtering blood. The person also has a second kidney that (in addition to its own work) can perform the same job if there is a problem with the first one. In that case, it seems false to say that the first kidney is necessary in the circumstances for the blood filtering. We have already addressed this concern in our previous remarks. The sense in which the kidney is necessary in the circumstances concerns the particular occasion in which the blood filtering occurred in the individual. Understood this way, the kidney (as it was used) was necessary for the occurrence of blood filtering on the occasion, and so this should not be seen as a problem. The next point to note is that the account presented does not make any commitment to Mackie’s views beyond the notions I considered. There is no commitment, in particular, to the explanation of the general notion of necessity he offered. What is needed for the account to work is that there be some account of this latter notion that is consistent with the account I presented. Mackie himself is explicit that the utility of the inus notion does not depend on the details of his account (1965, p. 44). There is more to be said about how the account offered can be developed to address concerns about the notion of necessity and its analysis. A detailed review of this issue is beyond the scope of this paper, but it is worth noting it is merely required that there be some account of this latter notion that fits with the account offered. It is also worth noting in relation to this that Mackie’s views on the notion of necessity have been extended and refined in more recent work. Strevens (2007), for example, has suggested ways Mackie’s views on this issue can be revised to improve our understanding of causation. While I think appealing to this work is one line of development worth considering, I would allow there may be other ways of doing this as well consistent with my account. The concern I have had is how the inus notion can be used in understanding relevance, and not with giving a full account of these other issues.8 8 One other issue I have not addressed is that Mackie’s account of causation is deterministic. 
Since I have used his view to characterize the notion of constitutive relevance, this has the consequence that every mechanistic explanation deductively entails its explanandum. This appears to conflict with Craver’s view that mechanistic explanations should not be viewed as arguments that allow inferences (explanations rather are ontic). I am not sure this latter point is right, though I have not discussed the issue in this paper.
The final point to observe is just that the account provides an explanation of what the relevance relation is, and not of the evidence we have for it. This relation is distinct from whatever evidence we may have about whether the relation obtains on a particular occasion. We may have various ways of discovering that this relation obtains, but these nevertheless remain distinct from the relation itself. So, in this respect, the account goes beyond the manipulability account that was considered before.
6 Conclusion It is commonly suggested that the notion of a mechanism has an important role to play in scientific practice. Many researchers in different fields appeal to mechanisms in explaining the capacities of the phenomena that interest them. This is an important notion in the sciences, and something that philosophers are interested in understanding. Until recently, though, there has been little discussion of what exactly the notion of a mechanism involves and how to identify the components that make them up. The problem we have focused on concerns understanding how the parts of a mechanism are relevant to a capacity it performs. The account presented has attempted to show that there are resources from Mackie’s theory we can appeal to in trying to solve this problem. If the account is on the right track, it will be helpful in clarifying a number of issues about the relation between mechanisms and capacities that have been obscure. Acknowledgements I would like to thank Bernie Berofsky, Stuart Glennan, Pete Mandik, Gualtiero Piccinini, Michael Strevens, and an anonymous referee for assistance with this paper. Carl Craver has been especially helpful.
References
Aizawa, K. (2009). Neuroscience and multiple realization: A reply to Bechtel and Mundale. Synthese, 167, 493–510.
Bechtel, W., & Mundale, J. (1999). Multiple realizability revisited. Philosophy of Science, 66, 175–207.
Bechtel, W., & Richardson, R. (1993). Discovering complexity: Decomposition and localization as strategies in scientific research. Princeton, NJ: Princeton University Press.
Bickle, J. (2003). Philosophy and neuroscience: A ruthlessly reductive account. Dordrecht: Kluwer.
Bock, W., & von Wahlert, G. (1965). Adaptation and the form-function complex. In C. Allen, M. Bekoff, & G. Lauder (Eds.) (1998), Nature’s purposes (pp. 117–167). Cambridge, MA: MIT Press.
Couch, M. (2009). Functional explanation in context. Philosophy of Science, 76, 253–269.
Craver, C. (2001). Role functions, mechanisms and hierarchy. Philosophy of Science, 68, 31–55.
Craver, C. (2007). Explaining the brain: Mechanisms and the mosaic unity of neuroscience. New York, NY: Oxford University Press.
Cummins, R. (1975). Functional analysis. In C. Allen, M. Bekoff, & G. Lauder (Eds.) (1998), Nature’s purposes (pp. 169–196). Cambridge, MA: MIT Press.
Fodor, J. (1974). Special sciences. Synthese, 28, 97–115.
Glennan, S. S. (1996). Mechanisms and the nature of causation. Erkenntnis, 44, 49–71.
Glennan, S. S. (2009). Mechanisms. In H. Beebee, C. Hitchcock, & P. Menzies (Eds.), The Oxford handbook of causation (pp. 315–325). New York, NY: Oxford University Press.
Kim, J. (1992). Multiple realization and the metaphysics of reduction. Philosophy and Phenomenological Research, 52, 1–26.
Kitcher, P. (1984). 1953 and all that: A tale of two sciences. Philosophical Review, 93, 335–373.
Levine, J. (2001). Purple haze: The puzzle of consciousness. New York, NY: Oxford University Press.
Machamer, P., Darden, L., & Craver, C. (2000). Thinking about mechanisms. Philosophy of Science, 67, 1–25.
Mackie, J. L. (1965). Causes and conditions. In E. Sosa & M. Tooley (Eds.) (1993), Causation (pp. 33–55). New York, NY: Oxford University Press.
Mackie, J. L. (1974). The cement of the universe. New York, NY: Oxford University Press.
Polger, T. (2008). Two confusions concerning multiple realization. Philosophy of Science, 75, 537–547.
Psillos, S. (2007). Causal explanation and manipulation. In J. Persson & P. Ylikoski (Eds.), Rethinking explanation (pp. 93–107). Dordrecht: Springer.
Shapiro, L. (2004). The mind incarnate. Cambridge, MA: MIT Press.
Strevens, M. (2007). Mackie remixed. In J. Campbell, M. O’Rourke, & H. Silverstein (Eds.), Causation and explanation (pp. 93–118). Cambridge, MA: MIT Press.
Woodward, J. (1997). Explanation, invariance, and intervention. PSA-1996, vol. 2. Philosophy of Science, 66, S26–S41.
Woodward, J. (2000). Explanation and invariance in the special sciences. British Journal for the Philosophy of Science, 51, 197–254.
Woodward, J. (2003). Making things happen. New York, NY: Oxford University Press.
Synthese (2011) 183:389–408 DOI 10.1007/s11229-010-9869-1
Mechanistic explanation at the limit Jonathan Waskan
Received: 27 August 2010 / Accepted: 31 December 2010 / Published online: 19 January 2011 © Springer Science+Business Media B.V. 2011
Abstract Resurgent interest in both mechanistic and counterfactual theories of explanation has led to a fair amount of discussion regarding the relative merits of these two approaches. James Woodward is currently the pre-eminent counterfactual theorist, and he criticizes the mechanists on the following grounds: Unless mechanists about explanation invoke counterfactuals, they cannot make sense of claims about causal interactions between mechanism parts or of causal explanations put forward absent knowledge of productive mechanisms. He claims that these shortfalls can be offset if mechanists will just borrow key tenets of his counterfactual theory of causal claims. What mechanists must bear in mind, however, is that by pursuing this course they risk both the assimilation of mechanistic theories of explanation into Woodward’s own favored counterfactual theory and the marginalization of mechanistic explanations to a proper subset of all explanations. An outcome more favorable to mechanists might be had by pursuing an actualist-mechanist theory of the contents of causal claims. While it may not seem obvious at first blush that such an approach is workable, even in principle, recent empirical research into causal perception, causal belief, and mechanical reasoning provides some grounds for optimism. Keywords Explanation · Mechanisms · Counterfactuals · Causation · Causal perception · Mechanical reasoning · Simulation 1 Introduction Resurgent interest in both mechanistic and counterfactual theories of explanation has led to a fair amount of discussion regarding the relative merits of these two approaches
J. Waskan (B) University of Illinois at Urbana-Champaign, Urbana, IL, USA e-mail:
[email protected]
(Bogen 2004; Glennan 2002, 2010; Machamer 2004; Psillos 2004; Woodward 2002, 2003, 2004). At present, James Woodward appears to be the pre-eminent counterfactual theorist, and he has put mechanists on the defensive regarding (among others) two important issues. He claims, to a first approximation, that mechanists have not been able to make sense either of explanations put forward absent knowledge of mechanisms (viz., stand-alone causal claims) or of claims about causal interactions between mechanism parts (i.e., in cases where explanations do cite specific mechanisms). Woodward suggests that mechanists can remedy these defects by borrowing key tenets of his counterfactual theory. This may be so, but, I argue, the price of accepting this offer may be the assimilation (into Woodward’s counterfactual framework) and marginalization (to a proper subset of explanations) of the mechanistic approach to explanation. Here I consider another potential remedy for what ails mechanists—namely, an actualist-mechanist theory of the contents of causal claims. Woodward has already criticized this option, but only under the assumption that actualist-mechanist theories require that causal claims make assertions about specific productive mechanisms. In fact, an actualist-mechanist may instead propose that although causal claims always assert something about actual productive mechanisms, they in fact never (or hardly ever) assert anything about specific mechanisms. Still, at first glance it is hard to see how this proposal could be fleshed out into something remotely respectable. Searching for a way to do just that, I draw upon multiple lines of recent research in experimental psychology. Like Woodward (forthcoming), I assume that there is an intimate connection between our concept of causation (and the beliefs, suspicions, etc. in which it figures) and the contents of our causal claims. I argue that a body of research on topics ranging from infant causal perception to adult mechanical reasoning suggests that our concept of causation may (at least initially) just be an abstraction from instances in which we integrate a variety of early perceptual expectations in order to understand how events in specific mechanical systems unfold. This account allows for, without giving pride of place to, an ability to determine the consequences of counterfactual interventions. Woodward has, however, also expressed a number of concerns regarding actualist-mechanist theories of our concept of causation, including the version offered here, and so I attempt to address some of what I take to be the most pressing. Having offered a fleshed-out version of the non-specific, actualist-mechanist theory of our concept of causation, I propose that it lends itself to a corresponding theory of the contents of causal claims, with all of the accompanying benefits. I focus, in closing, on how a theory of this sort might answer the two major criticisms of mechanistic theories of explanation mentioned earlier and, thereby, stave off Woodward’s apparent attempt at assimilation and marginalization. None of this, admittedly, is meant to provide a decisive refutation of Woodward’s criticisms of mechanistic theories of explanation. Rather, my main intent is just to show that there is at least one way, a way that seems to have gone unexplored, in which mechanists might respond. 
Before getting under way, it is worth noting that there are at least three prominent views, adopted by latter-day mechanists, regarding what sorts of things explanations are, each arguably reflecting a common manner of speaking (Waskan 2006). Some adopt Salmon’s (1998) ontic conception and equate explanations with the objective mechanisms that produce the happenings that perplex us (Glennan 2002;
Craver 2007). Others take explanations to be cognitive states constituted in part by representations of possible productive mechanisms (Nersessian 2002; Waskan 2006, 2010; Wright and Bechtel 2007; Horst 2007; Thagard 2010). Still others equate explanations with the representations, often termed ‘models,’ that are used to communicate and disseminate information about possible productive mechanisms (Machamer et al. 2000; Craver 2006). The criticisms of the mechanistic approach with which I will be grappling here are directed, at least in large part, at theories of this last, communicative sort. 2 Woodward’s criticisms Getting down to business, two of Woodward’s biggest concerns about the mechanistic approach to understanding the contents of explanations (by which is hereafter meant sets of explanatory claims) are the following: (i) All explanations that do make claims about mechanisms bottom out at causal claims that are best understood in terms of counterfactuals (Woodward 2002). (ii) Some explanations are not constituted by claims about mechanisms (Woodward 2004). The first worry concerns what might be called mechanism explanations, which are sets of claims that explicitly describe the mechanisms that might be responsible for producing the happenings that perplex us. Consider, for example, the following explanation, offered by Mink (1996), for movement disorders exhibited in late-stage Huntington’s patients (viz., progressive difficulty in the execution of movements) and in a certain class of Parkinson’s patients (viz., co-contraction rigidity). Both of these are associated with pathologies of the basal ganglia (a set of nuclei lying deep within the brain). The first involves degeneration of neurons of the striatum and the second involves loss of inhibitory neurons of the substantia nigra. To explain the relationship between these neuropathologies and movement disorders, Mink puts forward a set of descriptions and depictions (a model) of the wiring of the relevant cortical and sub-cortical structures. To vastly oversimplify, he claims that motor-planning systems of the cortex initiate movement by focally exciting neurons in the striatum and globally exciting neurons of the substantia nigra. The latter in turn send inhibitory signals to each of the endogenously active neural triggers (think of a switchboard) in the cerebellum for familiar motor patterns (e.g., raising one’s arm, clenching one’s fist, etc.). At the same time, local activity in the striatum leads to selective inhibition of the substantia nigra, which has the effect of selectively disinhibiting specific motor-pattern triggers. It is easy to envision how, on this setup, degeneration of striatum might produce movement-execution deficits (i.e., no brake would ever be disabled) and how degeneration of SN would lead to co-contraction rigidity (i.e., many brakes would be disabled simultaneously). Mechanists are somewhat divided on the details (see Tabery 2004), but they tend to construe such mechanism explanations as sets of claims to the effect that the (regular or singular) happenings that perplex us are the product of parts (or components) that are arranged, act (or change), and interact with (or effect changes in) one another. According to Woodward (2002), problems arise for mechanists insofar as they attempt
to make sense of what the claims about part interactions come to without invoking counterfactuals. Some of these claims take the form of what I will call bare causal claims (i.e., claims of the form x causes y), and others have, at least superficially, a more contentful form (e.g., x inhibits y, x oxidizes y, x flips y, etc.) (see Bechtel and Richardson 1993; Machamer et al. 2000; Machamer 2004). Woodward rightly notes that mechanists owe us an account of the contents of these claims, of what they assert about the world in virtue of which they contribute to mechanism explanations. The alternative Woodward offers up is his “interventionist account,” which “connects causal claims to counterfactual claims about what would happen if (perhaps contrary to actual fact) certain interventions were carried out” (2004, p. 59). His proposal appears to be roughly that a type-causation claim to the effect that X causes Y asserts that the values or magnitudes of X and Y are such that there are possible conditions under which, when all else is held constant, intervening on X (which may not be humanly possible) would (via X, rather than some alternate route from intervention to Y) alter Y (or at least its probability distribution) (Woodward 2002). For instance, to claim that raising/lowering atmospheric pressure (A) causes an increase/decrease in the readings of an appropriately situated mercury barometer (B) is largely to assert that under various interventions on the former the correlation between it and the latter would continue to obtain. Of course, as with any high-level regularity, there will be countless conditions under which the correlation will fail to obtain (e.g., the pressure increases to an astronomical level, the temperature falls to −39°C, the instrument is smashed with a hammer, etc.). Accordingly, when we make a high-level causal claim such as this one, it is understood or implied that we are not asserting that the correlation is completely invariant, just that it is “approximately” invariant (Woodward 2002, pp. S368–S370). Thus, one might say that the causal claim, “A causes B,” asserts that an Approximate Counterfactual-Intervention Dependence (ApCoInD) relation holds between the values of A and B. As for token-causation claims, to say that an instance of A (a) caused an instance of B (b) is, I gather, mainly to assert that there are some possible (but non-actual) interventions on a that would have (via a) altered b (Woodward 2004, p. 46). Woodward claims that there are several benefits to this way of making sense of bottom-out claims regarding part interactions. He claims, for instance, that it weeds out illegitimate bottom-out claims (i.e., those that hold only accidentally or as a result of a common cause) and that it accommodates the narrow scope and exceptionful character of many legitimate bottom-out claims. A number of prominent thinkers appear to have been at least partly swayed by such considerations (see Glennan 2002; Psillos 2004; Craver 2007).1 Indeed, it has to be admitted that Woodward has offered up a promising cure for what ails mechanists. Mechanists should bear in mind, however, that on Woodward’s view mechanism explanations are explanatory precisely in virtue of making fine-grained assertions about where the happenings that perplex us fit within a nexus of alternative possibilities.
They render happenings intelligible by “enabling us to see how, if … conditions had been different or had changed in various ways, various of these alternative possibilities would have been realized instead” (Woodward 2003,
1 Lately, Glennan (2010) has seemed to qualify his earlier endorsement of Woodward’s position.
p. 191). Put differently, a “model in effect encodes a set of predictions about what will happen in various hypothetical experiments…” (2002, p. 376). In short, it appears that on Woodward’s view mechanism explanations just reduce to networks of claims asserting ApCoInD relations (2002, p. 375; 2003, pp. 222–223; 2004, pp. 65–66). Perhaps Woodward is right about all of this, but at the very least it seems to run against the grain of what most mechanists believe, or at least once believed, which is that a mechanism explanation for some happening that perplexes us is explanatory precisely in virtue of its capacity to enable us to understand how the parts of some system actually conspire to produce that happening. I doubt that many mechanists would deny that we can use mechanism explanations, or the understanding they provide, to make inferences about counterfactual conditions, but I should have thought that this is considered of ancillary importance, having to do with how we evaluate our explanations rather than with what makes them explanations in the first place. Woodward’s second criticism is that some explanations are not constituted by claims about the underlying or intervening mechanisms of action. There are, in particular, explanations constituted primarily by claims about causal interactions (e.g., “Taking mandrake root cures infertility”) of whose underlying mechanisms of action we are entirely ignorant. Thus, says Woodward, “Fans of mechanisms owe us an account of what it is that we’ve established when we learn, e.g. that a drug cures some illness but don’t yet know about the mechanism by which it does so. Counterfactual accounts provide a straightforward answer to this question” (2004, p. 66). This criticism has the potential to be far more damaging than the first. Whereas the first may lead to what some mechanists will view as a healthy amendment to their theory, the second more clearly relegates the mechanistic approach to the inferior status of handling only a proper subset of explanations. Taking stock, then, Woodward contends that insofar as mechanism explanations are concerned, the mechanistic approach needs, and ought ultimately to be assimilated into, the counterfactual approach. He would have it, moreover, that the mechanistic approach is relevant to only a portion of the total explanation pie. The counterfactual theory, in contrast, is said to provide a unified account of all explanations. In sum, what Woodward seems to have in mind is the assimilation and marginalization of the mechanistic approach to explanation. Before conceding so much, let us consider whether or not there are any alternative strategies for dealing with these concerns that paint the mechanistic approach in a more positive light.
3 Alternative strategies Notice, to start with, that the causal claims at issue in (ii) are the same basic sorts of claims at which mechanism explanations (at least often) bottom out. Thus, a viable mechanist response to (ii) will in all likelihood answer (i) as well. Notice also that just about any theory of causation will help mechanists at least stem the bleeding with regard to (i), and this may be a large part of the reason why mechanists have been receptive to Woodward’s proposition. However, if they had a viable non-counterfactual (or actualist) theory of causation, mechanists about explanation could decisively stave off Woodward’s apparent attempt at assimilating their approach into his. Moreover, in
order to answer (ii) and, thereby, resist Woodward’s attempt to marginalize the mechanistic approach to explanation, it would seem that not only is an actualist account of the contents of causal claims needed, but an actualist-mechanist account is needed.2 3.1 Alternative 1: a specific actualist-mechanist theory of causal claims The driving intuition behind some mechanists’ resistance to Woodward’s account of the contents of causal claims is that causal claims make assertions about what actually happens rather than about what would happen (Bogen 2004; Machamer 2004; also see Wolff 2007). In particular, some contend that causal claims make assertions about actual productive mechanisms connecting cause and effect. One thus might, in response to (ii), claim that bare and contentful causal claims are explanatory in virtue of this fact. Moreover, if this account of the contents of these causal claims proved viable, it would also provide mechanists with a ready response to (i). To see, more specifically, what such an account might look like and where it might get into trouble, consider the following causal claim: (1) Degeneration of dopaminergic striatal neurons (S) causes movement-execution difficulties (E). There are at least two distinct ways in which we might conclude that (1) is the case. On the one hand, we might monitor the relative values of S and E and come to believe on this basis, and finally to claim, that S causes E. Let us call this a superficial justification. For our purposes, it matters not whether our observations are passive (e.g., nature sets the values of S) or active (e.g., humans intervene on the values), so long as the general pattern is that when S occurs so does E and (let us assume) that when E fails to occur so does S. Woodward seems to agree that such observations would lend support to (1). He would, I gather, propose that under these evidentiary conditions the causal claim roughly asserts that it is approximately true that were S increased or decreased this would, respectively, (via S) increase or decrease E. As will become clearer by way of contrast, one might say that causal claim under these conditions asserts what might be termed a coarse-grained ApCoInD relation (see Woodward 2004). One might, in principle, also reach the conclusion that (1) is true by learning about the relevant physical parts and processes of the nervous system (see Glennan 1996, p. 66). For instance, one might discover that Mink’s general proposals about the mechanisms connecting motor planning centers and motor pattern triggers are correct and subsequently determine that an implication of this setup is that increasing or decreasing S will result—that is, absent such impediments as the introduction of reuptake inhibitors, electrocortical stimulation, etc.—in an increase or decrease in E. One way mechanists might try to at least gain a foothold on the issue of what causal claims assert would be to suggest that, at least when (1) has this sort of deep 2 I am here granting the central premise of (ii), which is that some genuine explanations assert little more than that the happening in question is the effect of some cause. One might contest this point, but here I will try to show that all may not be lost for mechanists even granting this assumption.
justification, it makes an assertion about the specific mechanisms by which changes in S bring about changes in E. On this sort of approach, which Woodward seems to anticipate, “once intervening mechanisms are filled in… an appeal to counterfactuals is no longer necessary” (2004, p. 65). This appears to be very much along the lines of what was originally proposed by Glennan (1996), who seemed to think that the high-level regularities that characterize causal interactions between mechanism parts are just the public face of the underlying mechanisms that sustain them. Thus, the rough idea is that S causes E if and only if the two are “connected in the appropriate ways” by mechanisms (p. 56). This is an ontological thesis, but Glennan (1996, 2010) wishes to see it carried over into the semantic realm as well, and thus he might say that (1) asserts that S and E are connected in appropriate ways by mechanisms. More specifically, what (1) asserts on this view is that S produces E through a continuous chain of interactions between the parts of some system. On one natural interpretation of this proposal, what one knows of the underlying mechanisms would, under these deep justificatory conditions, inform, and affect the content of, one’s claim that the one type of happening causes the other. Woodward’s main reaction to this proposal is to point out that the actualist-mechanist still has trouble accounting for the fact that we sometimes establish that a causal relation holds even without “provision of extensive information about mechanisms” (2004, p. 66). In other words, even if this actualist-mechanist proposal could make sense of causal claims made under deep justificatory conditions, it still has trouble with shallow conditions. Woodward, on the other hand, can claim to have offered a single, counterfactual account of what both superficially and deeply justified causal claims assert. I mention all of this in part to make clear that Woodward’s worries about the possibility of an actualist-mechanist approach to causal claims are, perhaps for good exegetical reasons, based upon the assumption of what might be called a specific reading of the approach, a reading on which causal claims always make assertions about specific mechanisms. On this reading, (1) might be taken to always assert that the occurrence of S in some specific mechanical system produces, through a continuous chain of part interactions, E. The worry, however, is that in those cases where we possess no information about the specific mechanisms of action, there is nothing for the putative mechanical assertion to assert. If there’s a way past this concern about actualist-mechanist theories, it may well begin with the rejection of the idea that causal claims always make assertions about specific mechanisms.
3.2 Alternative 2: a non-specific actualist-mechanist theory of causal claims
To see how this alternative might go, notice first that Woodward himself has a couple of options regarding how to interpret (1). On the one hand, he might opt for a semantic-continuum approach whereby varying the depth of the justificatory information alters what (1) asserts about the world—in particular, the deeper the evidentiary information, the finer the network of ApCoInD relations asserted. Although Woodward could say, in accordance with this approach, that (1) asserts ApCoInD relations under both superficial and deep evidentiary conditions, the approach would not be entirely univocal
in that the precise content of the relevant assertion will still vary with the depth of evidentiary information. On the other hand, he might adopt a genuinely univocal approach according to which (1) always asserts the very same thing regardless of the depth of evidentiary information. Of course, since we do not always possess information about fine-grained ApCoInD relations, it would make little sense to say that what a claim such as (1) always asserts is that certain fine-grained ApCoInD relationships obtain. Another (perfectly live) option would be a fully shallow version of the approach on which (1) always makes a very non-specific assertion to the effect that E bears an ApCoInD relation to S. Let us call this a non-specific ApCoInD theory of causal claims. Actualist-mechanists have similar options available to them. They might propose that the deeper the information we have to back a claim like (1), the more specific the assertion about productive mechanisms. Then again, they might go for a more univocal approach whereby what (1) asserts is the same regardless of depth of information. As with the non-specific ApCoInD theory, the challenge here is still to formulate a version that is compatible with purely shallow evidentiary information. The problem, of course, is that we still lack an actualist-mechanist account of what causal claims assert under the shallowest evidentiary conditions. Indeed, if we had one, we could use it to anchor the shallow-most endpoint of the aforementioned semantic continuum. Bearing this in mind, I will nevertheless concentrate my efforts on determining whether or not there is a viable, or potentially viable, consistently non-specific actualist-mechanist—I will use the partial acronym AcMe, under the assumption that it also connotes a lack of specificity—theory of causal claims, one that is analogous to the non-specific ApCoInD theory.3 To be compatible with purely shallow evidentiary (or other) information, the idea behind the AcMe theory would have to be that (1) always (or at least nearly so) makes a non-specific assertion to the effect that S bears some general sort of mechanical relation to E. The tricky part, of course, is figuring out how to flesh this idea out into something useful. I will not take a decisive stand here on precisely how to accomplish this, but I will consider one possibility that I find highly promising. The immediate point of the exercise will be to show that, despite how it may appear at first blush, there may be a way to put flesh on the skeletal notion of an AcMe theory. The proposal considered here originates with research into the psychology of causal perception and belief, to which I now turn.
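The parallel between the two non-specific options can be displayed side by side. The following schematic gloss is my own summary of the two readings just described—it is not a formulation offered by Woodward or by any actualist-mechanist—and is meant only to fix ideas:

\begin{align*}
\text{Non-specific ApCoInD:}\quad & \text{``}S\text{ causes }E\text{'' asserts, roughly, that }E\text{ bears an ApCoInD relation to }S\text{,}\\
& \text{i.e., that suitable possible interventions on }S\text{ would, approximately and via }S\text{, alter }E.\\[4pt]
\text{Non-specific AcMe:}\quad & \text{``}S\text{ causes }E\text{'' asserts, roughly, that }S\text{ bears some general, unspecified}\\
& \text{mechanical-production relation to }E\text{; no particular mechanism is asserted.}
\end{align*}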
3 It might be argued, moreover, that the right course is to pursue a univocal approach, regardless of what general sort of theory of causal claims one favors. Prima facie what a causal claim asserts to be the case in a particular situation will not always track everything the speaker believes to be the case about the detailed underlying or intervening mechanisms. For instance, it would be odd to suggest that (1) ever asserts anything about the wiring of the basal ganglia, even if the speaker is someone like Mink who knows a great deal about the mechanisms mediating the causal connection in question. A possible exception would be contexts in which shared background knowledge and rules of conversational implicature lead to a claim asserting more than what it would seem to when taken purely at face value.
3.2.1 Empirical inspiration for an AcMe theory: causal perception and belief In what follows, I assume a close connection between the content of our adult concept of causation, and the beliefs in which it figures, and the contents of causal claims.4 Like Woodward, I take research into causal perception in infants and young children to be a good starting point from which to launch an investigation into the origin and character of our adult concept of causation (cf. Woodward forthcoming). Particularly important in this regard is research into how infants perceive so-called launch events in which one simple object in a display is seen to move off, billiard-ball style, immediately after contact with another (Leslie and Keeble 1987; Schlottmann and Shanks 1992). As a point of comparison, launch events seem, for adults, to automatically elicit vivid, cognitively impenetrable impressions of causal interactions, impressions easily dispelled by interposing a spatial or temporal gap between the end of the first object’s movement and the start of the second’s. The way in which infants attend to launch events lends some credence to the view that infants have a similar impression of these events. A set of experiments frequently cited in support of this proposal involves habituating infants to either a launch event or an otherwise identical no-contact event—that is, one type of event is presented repeatedly until there is a substantial drop-off in the amount of attention (measured by looking time) that infants pay to it (Leslie and Keeble 1987; Scholl and Tremoulet 2000). Following habituation to one kind of event, infants are presented with the other, resulting in a relative increase in the amount of attention paid to the latter. This demonstrates, minimally, that infants can discriminate between the two sorts of stimuli. Subsequent research suggests that it is not just differences in the spatiotemporal arrangements of the items that drives dishabituation; rather there seems to be something distinctive about (apparent) causal interactions that especially garners infants’ attention. Leslie and Keeble (1987) show, for instance, that a change in the direction of motion (e.g., from rightward to leftward) of the items involved produces significantly less dishabituation if there is no contact between the items than if there is. This, they argue, may be due to the fact that there is a switch in the items’ roles (i.e., from agent of causation to patient and vice versa) only in the contact events. Oakes and Cohen (1990) have shown, further, that where there is either a temporal or spatial gap between the end of the first object’s motion and the start of the second’s, habituation carries over from one kind of non-causal sequence to the other but not from either non-causal sequence to the (apparent) causal sequence. It appears, then, that infants specifically discriminate (apparent) causal from otherwise identical non-causal sequences. However, this still does not tell us precisely what is going on in cases where an infant (or even an adult) perceives a causal interaction—that is, it does not yet tell us what the distinctive content of causal perception is. At least with regard to the findings cited above, one might put forward a minimalist expectation-based interpretation according to which infants come to expect that
4 “Concept,” in this context, refers to the implicit knowledge that underlies one’s linguistically appropriate
deployment of a term and that in some way contributes content to one’s thoughts about the term’s referent. Externalism about linguistic content might initially give one pause regarding the connection between the contents of causal beliefs and causal claims, but “cause” is not a natural kind term, at least not in the potentially troublesome sense of its referent having a non-superficial essence that one might expect experts (e.g., causation scientists) to reveal.
certain types of spatiotemporal sequences will be followed by others (see Woodward forthcoming).5 In fact, the expectation that certain types of events will be followed by others seems to be at least a part of what is involved in causal perception in infants. There is, for instance, a well-documented tendency for infants to look longer at collision events when they have an unlikely or impossible outcome (Baillargeon 1994). They appear to be in some manner or other surprised by such events and to expect a different outcome. One might, on the other hand, give a gestalt interpretation of the perception of launch events according to which content is added to the available sensory evidence in much the way that it might be added in the case of other perceptual illusions—for instance, much as illusory contours seem to be added to the immediate sensory data when viewing the Kanizsa triangle. The proposal that something similar may hold with regard to the perception of launch events does gain traction from the aforementioned fact that, for adults, launch events are like other perceptual illusions in eliciting a vivid, cognitively impenetrable impression, one that can be dispelled by even minor alterations of the sensory data. Taken only this far, however, the gestalt interpretation fails to tell us precisely what the distinctive content of causal perception is. One possibility is that the added content involves the projection of unseen forces or bodily sensations of pushing, pulling, and the like onto an experienced scene (see Wolff 2007, p. 86). One might also offer a far more opulent causal belief interpretation according to which the distinctive character of the perception of launch events in infants is due to the application of a full-blown concept of causation, a concept at least much like the one utilized by adults when they think and reason about causal sequences. As Woodward (forthcoming) points out, this interpretation of causal perception appears to be undermined by the fact that infants and some non-human primates are able to distinguish causal from non-causal sequences, and yet they fail to exhibit the sorts of behaviors one would expect, and that older children exhibit, if they were forming beliefs about the causal structure of the world. The causal belief interpretation is also called into question by the fact that causal perception and causal belief sometimes deliver conflicting verdicts about the world. For instance, superficial justificatory information (e.g., where a change in the color of one object is reliably followed by motion in another) can lead one to judge that the first happening causes the second even though it does not look that way. Likewise, it may look like a causal connection is present even though superficial (as Woodward terms it) “contingency” information—that is, information about whether or not each happening occurs under variations to the other—leads us to believe otherwise. We should bear in mind, however, that this double dissociation of causal perception and causal belief does not have specifically to do with cases where belief is driven by superficial contingency information. It has been observed with regard to deeper, mechanistic information in, for instance, a cross-sectional study conducted by Schlottmann (1999). In one of Schlottmann’s experiments, subjects from four different age groups (ages 5, 7, 10, and adults) saw two balls dropped into separate holes in a box, the second ball being dropped 3 s after the first.
5 For our purposes, it matters not whether the expectations are innate, due to learning biases, or are purely data driven.
Immediately after the second ball was dropped,
a bell rang from inside the box. Subjects of all ages exhibited a heavy preference for saying that the second ball caused the bell to ring, a tendency which was taken to reflect the triggering of causal perception by temporal contiguity information.6 In the next phase, subjects were familiarized with two mechanisms, depicted in Fig. 1a and b. For reasons that should be apparent, dropping the ball into the see-saw device will make the bell ring very quickly whereas dropping it into the ramp device will make it ring after a delay. Even the youngest age group seemed to understand all of this quickly and, after a particular device was placed under one of the holes in the box, they were able (with a few errors on the first trial among 5 year olds) to correctly predict how long it would take for the bell to ring. In the final phase, subjects saw that (but not where) one or the other device was put in the two-hole box. The balls were dropped as in the first phase, with a 3 s delay between them. The key finding has to do with the condition in which subjects saw the ramp device being put in the box. Adults in this condition consistently judged that whichever of the two balls was dropped first was the one that caused the bell to ring. In contrast, 5 year olds mostly judged that the second ball dropped (which was temporally contiguous with the effect) caused the bell to ring. In other words, despite understanding how the ramp device works and knowing that it was inside the box, their causal judgments were driven by temporal contiguity information rather than mechanism information. 10 year olds performed much like adults on this task, and 7 year olds exhibited an intermediate level of performance. It appears that at a young age causal perception takes precedence over explicit mechanism information in guiding causal belief, but later in life the opposite pattern obtains. On one reasonable interpretation of the foregoing evidence, causal perception is triggered by certain forms of temporal contiguity information (e.g., involving spatial, sonic, and presumably other properties), and the application of our concept of causation, as occurs in cases of causal belief, is triggered either by causal perception (preferentially so in young children) or by either superficial or deep justificatory information (preferentially so in adults). It is hard to fathom how, even on the gestalt interpretation, the content of that concept might issue directly from the content of causal perception in such a way as to give rise to this complicated set of application conditions (Woodward forthcoming). On the other hand, if one backs away from the claim that the content of our concept of causation derives directly and exclusively from the content of causal perception—for instance, if one allows that causal perception is just one facet of a complicated bootstrapping story—one may find that even the austere expectation-based interpretation of the content of causal perception suffices (i.e., for the purpose at hand). To see how, recall that at least a part of the distinctive content of causal perception is the expectation that a certain type of event will be followed by another. Expectations of this sort are, moreover, not restricted to paradigmatic causal interactions (e.g., collisions), as infants also have expectations about many other facets of the physical world.
For instance, they seem to have the following expectations:7 6 This putative instance of causal perception is, it bears noting, not triggered purely by spatiotemporal properties, but also by sonic ones. 7 See Baillargeon (1994), Schlottmann (1999), Scholl and Tremoulet (2000), Woodward (forthcoming).
Fig. 1 Two ways of making a bell ring (from Schlottmann 1999). a (top): Ball falling into cup forces far end of lever up, knocking object off perch and into brass cup. b (bottom): Ball falls onto tracks, rolls down, and knocks object off perch and into brass cup
Objects will not cease to exist when passing behind an occluding object (persistence).
Solid objects will not pass through one another (impenetrability).
Objects without support will fall (support).
The parts of an object will move in unison (cohesion).
The interposing of a stout barrier will prevent one object from colliding with another (blocking).
These expectations become progressively refined over the course of infancy so as, for instance, to eventually include the following (Baillargeon 1994):
When (keeping the material substratum constant) the size of an object is increased, there will be an increase in the distance to which a wheeled object that it strikes will roll.
An object will fall if the proportion of its bottom surface that hangs over its support structure is too large.
An object of one fixed size cannot fit inside of another of a smaller fixed size.
It would, in addition, be quite surprising if infants did not also come to refine their expectations regarding how the various materials from which objects are constructed affect their behaviors and interactions. While our adult concept of causation probably does not arise solely from the content of the kind of causal perception associated with launch events, it may well arise from the integrated use of an array of perceptual, or perceptually derived, expectations regarding spatial, temporal, and other properties. Consider, for instance, that independent research into the human ability to reason about mechanical systems suggests that we often understand how the occurrence of a particular event in a mechanical system (e.g., a series of interlocking gears) makes another happen by chaining together in thought or imagination primitive—which is to say that the reasoning process does not incorporate any deeper information about the mechanisms underwriting or mediating the behaviors in question—expectations regarding how one happening leads to another (Schwartz and Black 1996). Early in development, these may be the very same expectations at issue in research into the development of the perception of causation, impenetrability, support, and the like (Leslie and Keeble 1987; Schlottmann 1999; Hegarty 2004). There is, in fact, a fair amount of evidence that mechanical reasoning utilizes some of the same systems that are involved in perceptual processing and, in addition, that it often has more of a depictive character than a descriptive character (Hegarty 2004). Such considerations lead many researchers to speak of mechanical reasoning as involving, at least often, the ‘running’ of mental simulations of events (see Waskan 2006). Recall also that even the youngest children in the Schlottmann study seemed able, following exposure to the two devices, to understand why dropping the ball into the hole under which had been placed the device in Fig. 1a makes the bell ring quickly and why dropping it into the hole under which had been placed the device in Fig. 1b makes the bell ring after a delay. The research reviewed here suggests one plausible account of how it is that children come, at least initially, to understand how one event in a particular mechanical system makes another happen: It is the outcome of representing the (at least) spatiotemporal arrangement of the relevant objects and chaining together perceptual expectations regarding their behaviors. When this form of understanding arises as events are unfolding and the machinations are visible, one might say it involves ‘on-line’ integration of (at least) spatiotemporal information and various perceptual expectations. When this understanding takes place after the fact or when the known machinations have been hidden, one might say it involves ‘off-line’ integration (a simulation) of the same information.
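To make the idea of chaining expectations slightly more concrete, consider the following deliberately crude sketch. It is my own illustrative toy, not a model drawn from Schlottmann’s work or from the simulation literature; the event labels and the rough three-second delay are simply modeled on the devices in Fig. 1. The point is only that an ‘off-line’ run might amount to nothing more than stepping through a handful of primitive expectations about what each happening leads to:

# Toy illustration only: mechanical reasoning as the chaining of primitive
# expectations over a represented arrangement of parts. Event labels and
# delays are invented for the purpose of the example.
EXPECTATIONS = {
    "ball dropped into cup":     ("far end of lever rises", 0),   # see-saw device (Fig. 1a)
    "ball dropped onto tracks":  ("ball rolls down tracks", 3),   # ramp device (Fig. 1b)
    "far end of lever rises":    ("object knocked off perch", 0),
    "ball rolls down tracks":    ("object knocked off perch", 0),
    "object knocked off perch":  ("object lands in brass cup", 0),
    "object lands in brass cup": ("bell rings", 0),
}

def simulate(initial_event):
    """Chain expectations 'off-line' from an initial event, collecting the envisioned sequence."""
    sequence, delay, event = [initial_event], 0, initial_event
    while event in EXPECTATIONS:
        event, step = EXPECTATIONS[event]
        sequence.append(event)
        delay += step
    return sequence, delay

for start in ("ball dropped into cup", "ball dropped onto tracks"):
    events, delay = simulate(start)
    print(" -> ".join(events), f"(after roughly {delay} s)")

The point, of course, is not that children run anything like this code; it is only that, on the expectation-chaining picture, understanding how one happening makes another happen need involve nothing richer than stepping through such primitive expectations, with information about further contingencies (what would happen if a barrier were interposed, if the ball were not dropped, and so on) recoverable on demand rather than explicitly encoded.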
While representations of these sorts would have as their content the actual sequence of events, they might in addition enable children to infer, as needed, whether or not any of countless further circumstances—for instance, placing a stout barrier (blocking) squarely on the ramp (support) of the device in Fig. 1b—would interfere with the envisioned
process. One might say that the simulation embodies explicit knowledge of the actual sequence but also tremendous amounts of tacit knowledge—that is, knowledge that is not encoded explicitly but can be generated on demand—of the conditions that might interfere with that process (Waskan 2008). In the same way, an off-line simulation might also embody tacit knowledge of the fact that (provided there is not some other event that would also make the bell ring) the ringing will not occur if the dropping of the ball is prevented or altered in various possible ways (e.g., its trajectory is diverted away from the ramp, it is thrown with great force into the hole, etc.). Thus, a child would also be able to fallibly infer that the device in Fig. 1b is under one hole in the box not just by what happens when the ball is dropped into that hole but by what happens when the ball is not dropped, etc.—that is, on the basis of various sorts of contingency information. This proposal strikes me as plausible and consistent with at least a great deal of the relevant empirical literature, and it is not unprecedented. If this were, indeed, the basis for our early understanding of how specific events in specific mechanical systems make other events happen, it is not a big stretch to suppose that we might, over the course of many such episodes, also form some notion that there is a more general category of circumstances under which one event makes another happen. Let us call the concept in question that of one happening (X) mechanically necessitating (MeNe-ing) another (Y). It is, unfortunately, not patently obvious what the precise content of this concept would be, but early in childhood it might amount to the idea that there is a class of circumstances in which the prior occurrence of X in some underlying or mediating system of (at least) spatiotemporally arranged parts is connected (absent impediment) to the subsequent occurrence of Y through a chain of expected behaviors of the aforementioned sort (e.g., regarding impenetrability, collisions, support, etc.). The “absent impediment” qualifier simply signals the recognition, exemplified above with regard to the device in Fig. 1b, that the MeNe-ing relation in question can be defeated in various ways; it is the counterpart of “Ap” in “ApCoInD.” It is also in keeping with this view to suppose that children might, also based upon their experiences with specific mechanisms, take various sorts of contingency information—for instance, that (barring other MeNers) Y will not occur if X is prevented and that (barring bi-directional MeNe-ing such as occurs with interlocking gears) preventing Y will not alter X—to be implied by X MeNe-ing Y. Thus children might fallibly confirm or refute their beliefs, suspicions, etc. by observing the results of changes (perhaps of their own making) to X and Y. In keeping with the above, we might say that the mental states in question explicitly (albeit in highly generic form) represent the actual relationship between X and Y, but they also have implications regarding (or tacitly represent) the consequences of counterfactual interventions on X and Y. A more direct way to fallibly confirm or refute their beliefs, suspicions, etc. would, of course, be to look ‘under the hood,’ so to speak, and gather deep evidentiary information.
Anyone who has observed children older than about 3 years of age knows that both sorts of strategies are routinely employed.8 The sorts of perceptual expectations formed and refined over the course of early childhood would suffice to bootstrap understanding, with regard to at least some
8 Regarding the latter, see Bullock (1984).
mechanical systems, of how one occurrence MeNes another and, subsequently, a concept MeNe-ing. If this account is correct, one imagines that throughout our lives the set of expectations that we come to draw upon in order to understand how events in specific systems unfold, and thus the sorts that might underlie MeNe-ing more generally, will expand well beyond the basic perceptual repertoire. For instance, if, as has been suggested, superficial contingency and deep mechanical information can both clue us in to new forms of MeNe-ing, then any form so detected might itself become part of our repertoire for representing how events in any number of further systems can be expected to unfold, and so on. One would, moreover, expect that our linguistic community would, whether by helping us to master (inter alia) a great many action verbs (see Machamer 2004) or through trusted testimony, itself be a rich source of relevant expectations. In any case, what I have been building towards is the suggestion that our concept of causation might just be the concept of MeNe-ing and that to believe, suspect, etc. that X causes Y is to believe, suspect, etc. that X MeNes Y.9 This proposal strikes me as perfectly sensible, and even plausible, nor is it without precedent in the psychological literature (e.g., see Schlottmann 1999, Schlottmann et al. 2002). Most who have taken an interest in this literature are, for instance, familiar with Ahn and Kalish’s claim that “what is essential in the commonsense conception of causal relations is the belief that there is some process or mechanism mediating this relation, whether understood in detail or not” (2000, p. 202 emph. mine). It should be stressed, however, that the MeNe theory differs from some actualist-mechanist theories of our concept of causation—for instance, Wolff (2007) and even to some extent Ahn and Kalish (2000, p. 201)—in that it rejects outright the proposal that the transfer of energy, physical force, momentum, etc. lies at the heart of our concept of causation. It is, unfortunately, not possible here to take on the many empirical and philosophical challenges that must be dealt with by a theory of our concept of causation. Still, I would be remiss if I did not at least discuss the concerns that Woodward himself has voiced regarding the MeNe theory.10 Some of these—for instance, the concern that causal perception does not suffice to bootstrap the adult concept of causation, that actualist-mechanistic theories do not account for the sensitivity of causal belief to contingency information, and that actualist-mechanist theories have trouble explaining high-level (e.g., biological) causation due to their reliance on low-level notions like energy or force transmission—have, I hope, been either directly or indirectly addressed by the foregoing remarks. There are, however, still two major outstanding worries. One stems from a case, attributed to Hitchcock, in which chalk on a cue stick gets transmitted to a cue ball, and from there to an eight ball, which subsequently sinks. Woodward’s ApCoInD theory seems easily to account for why it is that we judge that 9 It might be objected that to believe that smoking causes cancer is not to believe that the occurrence of smoking always leads (even absent impediment) to the occurrence of cancer. 
In response one might follow Woodward’s lead and tack on something like, “or at the very least an increase in the probability of Y.” I have reservations regarding the force of this objection (and hence regarding the need for such an amendment) but I will not contest it here. 10 Woodward voiced these concerns while commenting on a condensed, nascent version of this essay at
the 2010 Pacific Division Meeting of the American Philosophical Association.
the movement of the cue stick, rather than that of the chalk, is what caused the eight ball to sink: the ball sinking bears an ApCoInD relation to the movement of the stick but not to the movement of the chalk. Actualist-mechanist theories that put all of the weight on spatiotemporally continuous transmission of force or energy, on the other hand, seem to have trouble explaining why we judge that one happening is causally relevant and the other is not, for there is a spatiotemporally continuous transmission from each to the effect. Woodward worries that the MeNe theory might run aground of a similar problem. The problem, as Woodward puts it, is to explain why we take one of these mechanical relationships to be of the wrong kind. To see how a proponent of the MeNe theory might respond, notice first that what we are dealing with here is a causal relationship of which we have some amount of deep evidentiary information. Recall, moreover, that children’s expectations regarding how events unfold in specific mechanical systems become more sophisticated over time. Arguably, they come to expect that a rigid ball on a flat surface will (absent impediment) roll when struck squarely by something like a cue stick, that the velocity of the ball will depend on how fast and how heavy the stick is and on whether it hits squarely, and so on. By the same token, children surely do not expect that an impact with fast moving chalk dust will result in a cue or eight ball moving off quickly. These sorts of expectations arguably have much to do with why even children would, presumably, judge that the movement of the stick is causally relevant but that of the chalk is not. If they see or, based upon a description, are asked to simulate the sequence in question, they will understand that the cue stick moving MeNes the ball sinking through a chain of fairly basic perceptual expectations and that the chalk dust moving does not. The former, in other words, is connected to the effect by the right kind of mechanical relation. The second major worry is that some causal relations (e.g., in biology) involve so-called “double prevention” (a.k.a. “disinhibition,” “double prevention,” or “negative control”) (Woodward 2002, 2004). In particular, there are cases where we believe that some X causes some Z even though we know that it does so by preventing Y from preventing Z. For instance, one who accepts Mink’s model (discussed earlier) might think that excitation of striatum causes (or is part of the total cause of) movement execution in that neurons of the striatum prevent neurons of SN from preventing particular motor programs from being executed. Woodward claims that counterfactual theories seem to do a better job than typical actualist-mechanist theories of capturing the sense in which such double-prevention relationships are believed to be causal and that the MeNe theory may be vulnerable to a similar worry. It might be fairer, however, to suggest, with regard to such cases, that whereas traditional actualist-mechanist theories appear too restrictive, counterfactual theories appear too permissive. There are, after all, cases of double-prevention which would (at least prima facie) be classified as causal on the ApCoInD approach but which few people would so classify. Consider, for instance, a case where one billiard ball alters the trajectory of a second, thereby preventing it from altering the trajectory of a third, the outcome being that the third sinks into a pocket. 
The ball sinking in this case bears an ApCoInD relation to the motion of the first ball, but there is at least some evidence that few would take the latter to have caused the former (see Walsh and Sloman 2005; Woodward forthcoming).
The MeNe theory offers at least some promise of charting a middle course. It does not rely upon notions like force or energy transfer, though, given the sorts of behavioral expectations hypothesized to bootstrap the concept of MeNe-ing, perhaps there is a tacit restriction of contact between each link in the chain of happenings connecting cause to effect. Consider, for instance, a case where a vase is placed squarely onto a supporting surface, after which a ball is thrown and knocks the support out from under (i.e., prevents it from preventing the downward motion of) the vase, which subsequently breaks. This too is ultimately an empirical matter, but probably most would judge that throwing the ball caused the vase to break. If, however, the chain of contact is broken—for instance, if the ball knocks the same support out of the way just before the vase is released onto it—I suspect that few would judge that the ball caused the vase to break (though the thrower may still be held responsible). Notice also that in both cases the vase breaking bears an ApCoInD relation to the ball being thrown, but only in the former case is the ball being thrown connected to the vase breaking by a continuous chain of contact (adhering to fairly basic perceptual expectations) between the relevant items. Only in the former case does the ball being thrown MeNe the vase breaking. This bodes well for the MeNe theory, but there is, unfortunately, at least one further wrinkle. Consider the following explanation for the link between smoking and cancer.

SMOG: Certain forms of genetic damage, which are a common occurrence whether one smokes or not, initiate (if not repaired) a certain form of runaway tissue growth (i.e., malignant tumors). Cigarette smoke contains chemicals that, when inhaled, sometimes disable the cellular devices that repair the aforementioned forms of genetic damage.

It may be that most people would, even if they believe that the described break in the chain of contact obtains, still judge that smoking causes cancer. Insofar as the MeNe theory includes a stricture that there must be a continuous chain of contact linking causes with effects, such a result would appear troubling on its face. Even so, a MeNe theorist might try to explain the result away through, for instance, one of the following proposals:

(A) Normative associations or prior beliefs about the case might counteract our natural tendency to deny a causal connection upon learning that there is a break in the chain of contact.
(B) The complexity of the SMOG model might lead us to gloss over what we would otherwise regard as a salient spatiotemporal gap between repair mechanisms and damaged genes.
(C) The action verbs (e.g., "inhale," "repair," "disable") used to describe the case may set off our (fast and automatic) System 1 causation detector (see Evans 2003). On closer examination, perhaps our (more careful and deliberative) System 2 would reach a different result—namely, that normal wear and tear causes cancer and smoking just ensures that this process goes uninterrupted.

In any case, as far as double-prevention cases are concerned, it appears that neither the MeNe theory nor the ApCoInD theory can skate to an easy and obvious victory.
Let me also emphasize, before turning finally to the contents of causal claims, that my primary goal in offering up the MeNe theory is not to demonstrate that it is far and away the best account of our concept of causation (though I do find it quite promising). Instead, I just hope, first, to combat the feeling that the AcMe approach it exemplifies is nonsensical or hopelessly ill-defined and, secondly, to give a concrete illustration of how a worked-out AcMe theory might help to address Woodward's biggest concerns about actualist-mechanist theories of causal beliefs and, as shown presently, causal claims.

3.2.2 An unexplored egress

Under the assumption that our concept of causation captures the standard application conditions for "cause" (and cognates), let us return now to the question of what causal claims assert. If you will recall, one of Woodward's main objections to the idea that causal claims make assertions about actual productive mechanisms was that sometimes—in particular, under purely shallow evidentiary conditions—there seems to be nothing for the putative mechanical assertions to assert. As we saw, Woodward here overlooks the possibility of (inter alia) a consistently non-specific (AcMe) theory of causal claims according to which "X causes Y" (typically) just asserts that X bears some general sort of mechanical relation to Y. One possible way of putting flesh on the bare bones of this proposal would be to put forward a MeNe theory of causal claims according to which claims of the form "X causes Y" simply assert that one happening MeNes another. On this view, causal claims do not (typically) assert anything about specific mechanisms by which this occurs, and Woodward's concern about shallow evidentiary conditions can thus be defused. By highlighting how Woodward has, thanks admittedly to neglect of this alternative on the part of mechanists themselves, overlooked the possibility that an AcMe theory of causal claims is correct, I hope to at least plant seeds of doubt regarding his above criticisms of mechanistic theories of explanation. The MeNe theory, for example, shows how mechanists might answer concern (ii), which is that some explanations (viz., bare and contentful causal claims) are not constituted by claims about mechanisms. On any AcMe theory, claims of this sort are about mechanisms. On the MeNe theory, for instance, they essentially just assert that the same kind of mechanical-production relation discerned in other cases (i.e., through on-line and off-line integration of specific spatiotemporal expectations) also obtains, if only covertly, in the case at hand. Claims of this sort do not make assertions about specific mechanisms, though, and so they convey at best only the wispiest understanding of the target happening. Thus, to the extent that they do pass muster as explanations, it is only just barely. As for (i), we saw that on Woodward's view a major impetus for taking mechanism explanations to be constituted by networks of assertions about counterfactuals is that they bottom out at claims about how the parts of the system being described causally interact. However, insofar as mechanists are able to address (ii), they can also address (i). A proponent of the MeNe theory of causal claims might, for instance, respond that mechanism explanations do bottom out at causal claims but that, instead of making assertions about counterfactuals, these claims make further, non-specific assertions about mechanisms. Mechanism explanations bottom out at claims asserting
that, but not how, one happening MeNes another. At the same time, one need not, on this view, deny that information about counterfactuals is any part of the contents of bottom-out claims. On the MeNe theory, causal claims explicitly assert that one happening MeNes another, but, as explained in the previous section, they may also have implications regarding, or tacitly assert, what will happen under counterfactual circumstances.

4 Conclusion

As we also saw, (i) formed the basis for Woodward's claim that mechanism explanations reduce to networks of claims asserting ApCoInD relations, and (ii) formed the basis for his claim that mechanistic theories should be relegated to handling only a proper subset of explanations. Whether or not mechanists about explanation embrace the MeNe theory of either the concept of causation or of the contents of causal claims, my hope is that the foregoing considerations will at least convince them that pursuit of a non-specific, actualist-mechanist theory of causal claims (of whatever sort) offers a way, one which has yet to be fully explored, in which they might stave off Woodward's apparent attempt at marginalization and assimilation.

Acknowledgements Thanks to Jim Woodward, Gualtiero Piccinini, Brendan Shea, Daniel Estrada, and anonymous referees at Synthese for their helpful comments.
References Ahn, W., & Kalish, C. (2000). The role of mechanism beliefs in causal reasoning. In F. Keil & R. Wilson (Eds.), Explanation and cognition (pp. 199–225). Cambridge, MA: The MIT Press. Baillargeon, R. (1994). How do infants learn about the physical world?. Current Directions in Psychological Science, 3(5), 133–140. Bechtel, W., & Richardson, R. C. (1993). Discovering complexity: Decomposition and localization as strategies in scientific research. Princeton, NJ: Princeton University Press. Bogen, J. (2004). Analyzing causality: The opposite of counterfactual is factual. International Studies in the Philosophy of Science, 18, 3–26. Brewer, W. F. (1999). Scientific theories and naive theories as forms of mental representation: Psychologism revived. Science and Education, 8, 489–505. Bullock, M. (1984). Preschool children’s understanding of causal connections. British Journal of Developmental Psychology, 2, 169–191. Bullock, M., Gelman, R., & Baillargeon, R. (1982). The development of causal reasoning. In W. J. Friedman (Ed.), The developmental psychology of time (pp. 209–254). New York, NY: Academic Press. Craver, C. (2006). When mechanistic models explain. Synthese, 153, 355–376. Craver, C. (2007). Explaining the brain. New York, NY: Oxford University Press. Evans, J. (2003). In two minds: Dual process accounts of reasoning. Trends in Cognitive Sciences, 7, 454–459. Glennan, S. (1996). Mechanisms and the nature of causation. Erkenntnis, 44, 49–71. Glennan, S. (2002). Rethinking mechanical explanation. Philosophy of Science, 69, S342–S353. Glennan, S. (2010). Mechanisms, causes, and the layered model of the world. Philosophy and Phenomenological Research. Hegarty, M. (2004). Mechanical reasoning by mental simulation. Trends in Cognitive Sciences, 8(6), 280–285. Horst, S. (2007). Beyond reduction. New York, NY: Oxford University Press. Leslie, A., & Keeble, S. (1987). Do six-month old infants perceive causality?. Cognition, 25, 265–288.
Machamer, P. (2004). Activities and causation: The metaphysics and epistemology of mechanisms. International Studies in the Philosophy of Science, 18, 27–39. Machamer, P., Darden, L., & Craver, C. F. (2000). Thinking about mechanisms. Philosophy of Science, 67, 1–25. Mink, J. (1996). The basal ganglia: Focused selection and inhibition of competing motor programs. Progress in Neurobiology, 50, 381–425. Nersessian, N. (2002). The cognitive basis of model based reasoning in science. In P. Carruthers, S. Stich, & M. Siegal (Eds.), The cognitive basis of science (pp. 133–192). Cambridge, MA: Cambridge University Press. Oakes, L. M., & Cohen, L. B. (1990). Infant perception of a causal event. Cognitive Development, 5, 193–207. Psillos, S. (2004). A glimpse of the secret connexion: Harmonizing mechanisms with counterfactuals. Perspectives on Science, 12, 288–319. Salmon, W. (1998). Causality and explanation. New York, NY: Oxford University Press. Schlottmann, A. (1999). Seeing it happen and knowing how it works: How children understand the relation between perceptual causality and underlying mechanism. Developmental Psychology, 35(5), 303–317. Schlottmann, A., Allen, D., Linderoth, C., & Hesketh, S. (2002). Perceptual causality in children. Child Development, 7(3), 1656–1677. Schlottmann, A., & Shanks, D. (1992). Evidence for a distinction between judged and perceived causality. The Quarterly Journal Of Experimental Psychology, 44, 321–342. Scholl, B., & Tremoulet, P. (2000). Perceptual causality and animacy. Trends in Cognitive Sciences, 4(8), 299–409. Schwartz, D. L., & Black, J. B. (1996). Shuttling between depictive models and abstract rules: Induction and fall-back. Cognitive Science, 20, 457–497. Tabery, J. (2004). Synthesizing activities and interactions in the concept of a mechanism. Philosophy of Science, 71, 1–15. Thagard, P. (2010). How brains make mental models. In L. Magnani, W. Carnielli, & C. Pizzi (Eds.), Modelbased reasoning in science and technology. Abduction, logic and computational discovery (pp. 447– 462). Berlin: Springer. Trout, J. D. (2002). Scientific explanation and the sense of understanding. Philosophy of Science, 69, 212–233. Walsh, C., & Sloman, S. (2005). The meaning of cause and prevent: The role of causal mechanism. Proceedings of the Twenty-Seventh Annual Conference of the Cognitive Science Society (pp. 2331–2336). Hillsdale, NJ: Erlbaum. Waskan, J. (2006). Models and cognition: Prediction and explanation in everyday life and in science. Cambridge, MA: The MIT Press. Waskan, J. (2008). Knowledge of counterfactual interventions through cognitive models of mechanisms. International Studies in Philosophy of Science, 22(3), 259–275. Waskan, J. (2010). Applications of an implementation story for non-sentential models. In L. Magnani, W. Carnielli, & C. Pizzi (Eds.), Model-based reasoning in science and technology. Abduction, logic, and computational discovery (pp. 463–476). Berlin: Springer. Wolff, P. (2007). Representing causation. Journal of Experimental Psychology: General, 136(1), 82–111. Woodward, J. (2002). What is a mechanism? A counterfactual account. Philosophy of Science, 69, 366–377. Woodward, J. (2003). Making things happen. New York: Oxford University Press. Woodward, J. (2004). Counterfactuals and causal explanation. International Studies in the Philosophy of Science, 18, 41–72. Woodward, J. (forthcoming). Causal perception and causal cognition. In J. Roessler (Ed.), Perception, causation and objectivity: Issues in philosophy and psychology. 
New York, NY: Oxford University Press. Wright, C., & Bechtel, W. (2007). Mechanisms and psychological explanation. In P. Thagard (Ed.), Philosophy of psychology and cognitive science (pp. 32–79). New York, NY: Elsevier.
Synthese (2011) 183:409–427 DOI 10.1007/s11229-011-9870-3
Mechanisms revisited James Woodward
Received: 7 October 2010 / Accepted: 3 January 2011 / Published online: 12 January 2011 © Springer Science+Business Media B.V. 2011
Abstract This paper defends an interventionist treatment of mechanisms and contrasts this with Waskan (forthcoming). Interventionism embodies a difference-making conception of causation. I contrast such conceptions with geometrical/mechanical or "actualist" conceptions, associating Waskan's proposals with the latter. It is argued that geometrical/mechanical conceptions of causation cannot replace difference-making conceptions in characterizing the behavior of mechanisms, but that some of the intuitions behind the geometrical/mechanical approach can be captured by thinking in terms of spatio-temporally organized difference-making information.

Keywords Mechanism · Interventionist theory of causation · Difference-making · Perception of causation

1 Introduction

I'm pleased to have this opportunity to discuss Jonathan Waskan's "Mechanistic Explanation at the Limit". In what follows I explore some issues dividing the interventionist treatment of mechanisms in Woodward (2004) from the alternative suggestions in Waskan's essay. Sections 1–4 lay out the basic ideas of the interventionist framework and the contrast between two different conceptions of causation, which I call difference-making and geometrical/mechanical. Sections 5–8 then comment on Waskan's ideas. It is widely agreed that in many areas of science information about "mechanisms" plays a central role: the identification of mechanisms is a major goal of theory construction, information about mechanisms is important in causal explanation and so
J. Woodward (B) University of Pittsburgh, Pittsburgh, PA, USA e-mail:
[email protected]
on. I take this to be common ground among virtually all who discuss the role of mechanisms in science, myself included. Disagreement ensues, however, when one asks how the notion of mechanism is best understood or elucidated. One problem is that the notion is used in a variety of different ways or at least has amorphous boundaries. At one end of a spectrum, there is the notion that is in play when one thinks of "mechanical" reasoning involving certain simple machines (gears, pulleys, levers) and which is associated with such "mechanical" philosophers as Galileo and Descartes. Here causal influence is conceptualized in terms of "contact" forces1 that require something like spatio-temporal contiguity and the focus is on such properties as shape, weight, rigidity, impenetrability, position, energy and momentum. At the other end of the spectrum, identifying "mechanisms" seems to mean little more than "explaining in terms of whatever entities and causal relationships are taken to be fundamental in the relevant domain of investigation, perhaps with the expectation that this often involves the identification of parts or components in some larger structure, and an account of how their interaction produces (or explains) some overall input/output relation". Discussions of "mechanism design" in economics arguably illustrate this latter notion of mechanism—the theorist looks for rules or institutions (e.g., governing different sorts of auctions) that will achieve outcomes that are normatively optimal and where the explanation of the behavior of these institutions is in terms of the interactions of the individuals (the parts) that make them up. In addition, we find writers talking about the "mechanism" of natural selection, where even the notion of decomposition into parts may not have an obvious application. Needless to say, whatever is meant by "mechanism" in this context, it probably has little to do with the identification of structures involving contact forces. As still another possibility, chemists regularly talk about chemical reactions in terms of their mechanisms, where the forces associated with these reactions, represented in terms of various claims about chemical bonds, are at bottom electromagnetic in nature, but where intuitions about spatio-temporal contact can be misplaced if taken literally.

Given this diversity, I believe it is a mistake to look for an account of "mechanism" that will cover all possible interpretations or applications of this notion. A better strategy is to try to elucidate some core elements in some applications of the concept and to understand better why and in what way those elements matter for the goals of theory construction and explanation. In what follows I pursue this strategy, focusing, among other considerations, on the role of spatio-temporal information in the characterization of mechanisms, but without any suggestion that such information is crucial to all mechanism talk. I begin with a contrast between two different ways of thinking about causation which I label Difference-making (DM) and Geometrical/Mechanical (GM). "Interventionist" treatments are one possible variety of DM account and I believe Waskan's judgment that such treatments fail to do justice to the GM aspect of causation underlies his resistance to them. After explicating the GM vs. DM contrast and explaining how
it figures in debates over the notion of mechanism, I turn to a look at some of Waskan's suggestions and then to some positive proposals of my own.

1 It is worth emphasizing, though, that the modeling of such machines often proceeds by radically abstracting away from the details of the operation of contact forces, which are both complex and often unknown.
2 Difference-Making versus Geometrical/Mechanical Accounts of Causation

2.1 Difference-making accounts of causation

A guiding intuition of many treatments of causation is that causes make a difference to their effects. There are many ways of spelling out this idea: in terms of INUS conditions, in terms of the idea that the probability of the effect conditional on the occurrence of the cause is different from the probability of the effect conditional on the absence of the cause when other factors are appropriately controlled for, and in terms of counterfactuals. Interventionist accounts are a version of a difference-making account in which the notion of difference-making is spelled out by means of counterfactuals concerning what would happen to an effect if certain interventions on the cause were to occur. DM accounts share several common commitments. One is that causal claims involve a comparison of some kind between what happens in a situation in which a cause is present and alternative situations (which may be actual or merely possible) in which the cause is absent or different. For example, when ingestion of a drug is said to cause recovery from a disease, DM treatments take this to imply a comparison between what happens to subjects with the disease who ingest the drug and otherwise similar subjects who do not ingest the drug. Both groups may actually exist, as would be the case if an experimental trial were actually performed with a treatment and control group. But on at least some formulations of the DM approach, including the interventionist version I favor, the existence of both groups is not necessary for it to be true that the drug is efficacious in promoting recovery. On such views, if, say, patients are given the drug and recover, then it will be true that the drug causes recovery as long as it is also true that if (perhaps contrary to actual fact) otherwise similar patients were not given the drug, they would not recover. In both these cases, though, the claim that C causes E is taken to "point beyond" the local, actual situation in which C and E occur and has implications for what happens or would happen in alternative situations in which C does not occur. DM accounts emphasize the role covariational or contingency information plays as evidence for causal claims—that is, information having to do with patterns of co-occurrence or correlations among different values of the variables that are candidates for causal relata, whether these are detected through passive observation or generated via experimental interventions. Such information is relevant in an obvious way to whether one variable plays a difference-making role with respect to another. In part for this reason, DM accounts seem to apply most directly to the characterization of so-called type-causal relationships—claims that some repeatable type of property or magnitude (variable) is causally relevant to another. It is of course possible to provide difference-making accounts of token-causal relationships, as in Lewis' well-known account, but in my view such theories rely, explicitly or tacitly, on type-level difference-making claims.
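To fix ideas, the probabilistic and interventionist readings of difference-making just mentioned can be written out schematically as follows. The notation, including the "do" operator for interventions, is borrowed from the causal-modelling literature rather than from the text itself and is offered only as an illustrative gloss.

\[
P(E \mid C, K) \;\neq\; P(E \mid \neg C, K)
\]
\[
\exists\, x \neq x' : \; P\bigl(Y \mid \mathrm{do}(X = x)\bigr) \;\neq\; P\bigl(Y \mid \mathrm{do}(X = x')\bigr)
\]

The first schema is the probabilistic reading, with a suitable set of background factors K held fixed; the second is one standard way of rendering the interventionist reading, on which some intervention setting X to different values changes the value or distribution of Y.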
As remarked above, the interventionist treatment of causation in Woodward (2002, 2003) is a difference-making account. The basic idea is simple: (M) Suppose X and Y are variables. Then, in the simplest case, X is causally relevant (at the type level) to Y if and only if there is a possible intervention on X such that if such an intervention were to occur, the value of Y or the probability distribution of Y would change—in other words, some interventions on X make a difference for Y . Heuristically, an intervention can be thought of as an unconfounded manipulation of the sort that might be realized in an “ideal” experiment—for details see Woodward (2003). There are many issues about how best to formulate an interventionist theory, but I will ignore most in what follows since they are not germane to the topic of mechanisms. Nonetheless, a few additional observations may be helpful. First, the employment of counterfactuals in (M) contrasts with Waskan’s aspiration to characterize causal relationships and mechanisms in a way that does not appeal to counterfactuals and instead has only to do with what is “actual”. Second, (M) characterizes a notion of causation that is relatively non-specific and uninformative: (M) will be satisfied as long as some interventions on X in some background circumstances are associated with changes in Y . One often wants to know much more than this—for example, exactly which changes in X are associated with exactly which changes in Y and in what circumstances. Within an interventionist framework, this information is spelled out by more specific “interventionist” counterfactuals linking specific changes in X to specific changes in Y . One way of accomplishing this is by using functional relationships – writing Y as a function of X and other variables (Y = F(X, Z )), with this functional relationship understood along interventionist lines, as telling us how Y will respond to specific changes in X and other variables. Such functional relationships provide difference-making information of a quantitative, rather than purely qualitative sort. Conversely, it seems hard to understand the widespread use of functional relations in science to represent causal relations unless the latter are understood in difference-making terms. I mention this because one motivation for an interest in mechanisms is the observation that if one knows only that X causes Y (e.g., aspirin causes headache relief), understood along the lines of (M) or in some other way, this is uninformative about much else one would like to know—about the circumstances in which X produces Y , and so on. The demand for information about mechanisms is often (at least in part) a demand for such more specific and detailed information. As I explain below, there is nothing in a difference-making or interventionist account which is inconsistent with this demand, which can be provided by interventionist counterfactuals that are more detailed and specific than (M). 2.2 Geometrical/mechanical theories Within philosophy, causal process theories such as Salmon (1984), and Dowe (2000) furnish familiar examples of GM theories but many more recent theories that focus on the role of mechanisms also seem largely in this camp. A distinguishing feature is that these approaches do not think of causal claims as implicitly comparative in the way
that DM accounts do. Instead, the underlying intuition about causation is something like this: whether there is a causal relationship between two events c and e just has to do with whether c and e occur and whether there is an appropriate "connecting" process (or mechanism) between them; moreover whether there is such a process depends just on what is actually true of the particular occasion of interest, and does not depend on what does or would happen on other occasions (for example, it does not depend on what does or would happen if a c-like event had occurred on some other occasion or if c had not occurred). In this sense, GM accounts treat causation as a "local" or "intrinsic" relation to be understood in terms of what "actually" happens, and not in terms of modal or counterfactual considerations. (These "actualist" commitments are apparent in Waskan's treatment of mechanisms—indeed, he takes them, not unreasonably, to be the feature distinguishing his treatment from DM accounts like mine.) Just which features of a causal transaction qualify as "actual" in the relevant sense is not entirely straightforward, but presumably these at least include spatio-temporal relationships and other "kinematic" properties, as well as properties like energy and momentum. It is thus unsurprising that these figure prominently in many accounts of "connecting mechanisms". GM treatments differ from DM treatments in other ways as well. The notion of a connecting process applies most naturally to particular events or interacting entities (as implicitly assumed above when I used lower case c and e to represent the entities standing in GM causal relationships). In contrast to DM theories, GM theories thus tend to regard token causal claims as primary, although claims relating types of events may be formulated as generalizations over token claims. In addition, unlike DM accounts, GM treatments impose substantive, domain-specific constraints on what it is for one event to stand in a causal relationship to another. Suppose I omit to water the flowers and they die, where my omission makes the difference for whether they die. Suppose also (as seems uncontroversial) there is no spatio-temporally continuous connecting process leading from omission to death. Then, on a GM account requiring such connectedness, my omission cannot be a cause of the flowers' death. As another possibility, suppose that people typically do not represent mental events as located in space or as spatio-temporally connected to behavior. Then, if representation of events as causally related requires representation of events as connected by spatiotemporally continuous processes, people will not represent mental events as causes of behavior. While DM accounts assign a central role to contingency information as a source of evidence for causal claims, GM accounts commonly assign a central evidential role to spatio-temporal or geometrical relationships or to facts about the presence or absence of the appropriate sorts of mechanical properties (rigidity, weight etc.). In particular, cases in which (it appears) one can just "read off" which causal relationships are present from geometrical or mechanical properties, without any apparent need to rely on contingency information, play an important role in GM thinking about causation. A paradigmatic example, which I discuss below and which also figures in Waskan's discussion, is the so-called "perception" of causation in "launching" events.
When one sees a moving billiard ball strike a stationary one, the second begin to move, and the spatio-temporal parameters governing the collision are right, it appears that one can simply “see” the collision cause the second ball to move—everything one
needs to reach this conclusion seems to be present in the motion of the balls and their spatio-temporal relation. In particular, it appears unnecessary to rely on covariational or counterfactual information about the movement of other balls on other occasions, or about how the second ball would have moved in the absence of a collision with the first in reaching the conclusion that the collision caused the movement of the second ball. Interactions of this sort are thus natural sources for "actualist" or "localist" intuitions about causation. Many other examples seem to exhibit a similarly transparent connection between geometrical structure/mechanical properties and causal relationships—think of seeing that one solid object supports another or that the movement of one gear causes a second, interlocking gear to turn. What is the relationship between DM and GM approaches to causation? Although, as I suggest below, both capture important strands in how ordinary adults think about causation, they also seem conceptually or logically independent of each other to a remarkable degree. This apparent independence helps to generate many of the conflicting "intuitions" that philosophers have about the role of mechanisms in explanation. First, if we interpret GM approaches as requiring a spatio-temporally continuous process or spatio-temporal contiguity between C and effect E, this is apparently not necessary for C to stand in a difference-making relationship to E. This is illustrated by the omission example described above as well as by more complex examples involving double prevention, of which more below. As another illustration, if we think of Newtonian gravitational theory as describing a force acting instantaneously at a distance, then this theory describes difference-making relationships with great accuracy, yet without specifying any mediating mechanism. Of course, Newton himself famously thought that there must exist something playing this mediating role, but subsequent generations of physicists, failing to find any such mediator, became increasingly comfortable conceiving of gravity along purely difference-making lines, thus again illustrating the apparent logical or conceptual independence of difference-making from claims about connecting mechanisms. Somewhat more subtly, numerous examples show that spatio-temporal connectedness (or, more generally, the existence of a connecting mechanism between C and E) is also not sufficient for C to cause E. Suppose (cf. Hitchcock 1995) blue chalk is applied to the tip of a cue stick, which is used to strike the cue ball which in turn strikes the eight ball, which goes into the pocket because of the impact of the cue ball. Portions of the chalk are transmitted to the cue ball, and from there to the eight ball. There is a mechanical relationship in the paradigmatic sense of spatio-temporally continuous transmission of energy/momentum between the presence of the chalk on the cue (P) and the falling of the eight ball (F), yet P does not cause F in the difference-making sense. (We will see later that this observation creates difficulties for Waskan's proposal to analyze what it is for C to cause E in terms of the existence of "some mechanism or other" linking C to E.) Another illustration: Suppose yellow fingers (Y) and cough (C) are joint effects of a common cause, smoking (S). Y and C are correlated but neither causes the other. Are there connecting processes or mechanical relationships between Y and C? Sure.
Sound waves from C impinge on Y, transferring energy and momentum. The slight increase in mass of the fingers because of the deposit of discoloring material exerts a (minute) gravitational influence on the parts of the subject's body involved in coughing. Nonetheless,
whether or not Y occurs does not make a difference for the occurrence of C or vice versa. The problem in both cases is that although there is a mechanical relationship/connecting process between F and P, and between Y and C, the relationship is not of the right character to mediate or support P's causing F or Y's causing C, in the sense of cause associated with difference-making. For example, the gravitational energy transfers associated with Y and C are far too minute and inconsequential to warrant the conclusion that Y makes a difference for C. Similarly, what makes a difference for whether F occurs are the details of energy and momentum communicated to the cue ball by the cue and not just whether some non-zero amount of energy/momentum or blue chalk is communicated. Moreover, while these may seem to be artificial examples, similar observations apply to more realistic cases—to anticipate a case discussed below, information about physical connections between neural structures typically does not tell us what sort of difference-making role (if any) these structures play. In addition to this apparent conceptual distinctness, there is substantial evidence for empirical dissociations between causal claims or judgments based on GM information and those based on DM information. In an experiment discussed by Waskan (described in more detail in Woodward, forthcoming), Schlottmann and Shanks (1992) presented subjects both with apparent collisions between moving objects that "looked causal" on a token-by-token basis (because the collisions exhibited spatio-temporal patterns characteristic of launching events) and with other events that did not look causal (e.g., because there was no spatio-temporal contact or the temporal delay in the movement of the second object was too great). The contingency between these events was also varied independently of the spatio-temporal parameters—for example, on some trials, even though the spatio-temporal parameters were such that interactions looked causal, the second object was no more likely to move in the presence of such a collision than in its absence. Subjects were asked questions designed to probe whether they perceived the interactions as causal (e.g., "Did it look to you as though A hit B and caused it to move?"). Schlottmann and Shanks take these to be measures of "causal perception", although they might equally well be described as measures of judgments of perceived causation. Subjects were also asked questions that Schlottmann and Shanks took to reflect judgments about the existence of dependencies or contingencies at the type level (e.g. subjects were asked to judge how "necessary" collisions with A were for making B move—Schlottmann and Shanks call this "judged causality"). The result was a "dissociation" between these two sets of judgments. Ratings of perceived causality "show[ed] a substantial contiguity effect but no contingency effect whatsoever" (1992, p. 337)—that is, ratings fell with increased temporal delay between collision with A and movement of B, but were uninfluenced by the contingency between collision with A and subsequent movement of B. On the other hand, judgments of causal necessity showed a "large contingency effect and a much smaller contiguity effect" (p. 337). It is natural to connect these results with the differences between DM and GM approaches to causation.2

2 I should emphasize that this connection with DM and GM conceptions of causation is not suggested by Schlottmann and Shanks but is rather my own interpretation.

On a DM theory, whether there is causation between the
movement of A and the subsequent movement of B has to do with whether there is contingency or dependence between these two classes of events. This is reflected in the subjects' responses to questions about "judged causality". By contrast, on a GM account, whether causation is present in individual interactions between A and B requires only that the appropriate connection—in this case, spatio-temporal contiguity of the right sort—is present. This is reflected in subjects' reports about perceived causality. What the Schlottmann/Shanks experiments show is that these judgments can dissociate. This possibility should not be surprising in view of the conceptual/logical independence of these two different ways of thinking about causation. As emphasized above, there is apparently nothing in the idea that there is a difference-making relation between C and E, by itself, that requires that there be a connecting process or mechanism between individual instances of C and E. Moreover, there seems to be nothing in the concept of such a connecting process being present on particular occasions from which it follows that there is an overall contingency between C and E.
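To make the two evidential bases concrete, here is a toy sketch (my illustration, not Schlottmann and Shanks's actual stimuli or analysis) in which contingency, the evidence DM accounts emphasize, and spatio-temporal contiguity, the cue GM accounts emphasize, are computed separately and can be varied independently across trials.

# Toy illustration: contingency (difference-making evidence) and
# spatio-temporal contiguity (GM evidence) vary independently.

def delta_p(trials):
    """Contingency: P(B moves | collision) - P(B moves | no collision)."""
    with_c = [t["b_moves"] for t in trials if t["collision"]]
    without_c = [t["b_moves"] for t in trials if not t["collision"]]
    return sum(with_c) / len(with_c) - sum(without_c) / len(without_c)

def looks_causal(trial, max_delay=0.1):
    """Crude stand-in for perceived causality: contact plus a short delay."""
    return trial["collision"] and trial["delay"] <= max_delay

# Non-contingent but contiguous condition: B moves equally often with or
# without a collision, yet individual collision trials still "look causal".
trials = (
    [{"collision": True, "delay": 0.05, "b_moves": True} for _ in range(10)]
    + [{"collision": False, "delay": None, "b_moves": True} for _ in range(10)]
)

print(delta_p(trials))        # 0.0 -> no contingency, no difference-making evidence
print(looks_causal(trials[0]))  # True -> the individual trial still looks causal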
3 Two Concepts of Causation?

What does all this imply about how we should think about causation? The logical differences between DM and GM treatments of causation and the different sorts of judgments to which they sometimes lead may seem to suggest that we operate with two distinct concepts of causation. This is the conclusion drawn (at one point) by Hall (2004). Hall distinguishes between a concept of causation as "dependence" which corresponds fairly closely to what I have been calling "difference-making" and a concept he calls "production". The latter is broadly similar to although not identical with the notion GM approaches attempt to capture. Hall thinks both concepts figure in ordinary causal thinking, although he regards the production concept as more fundamental. However, even if Hall is correct in thinking that we operate with two distinct causal concepts, it also seems uncontroversial that these are integrated or intertwined in ordinary adult causal thinking, rather than remaining disconnected and orthogonal to one another. This is apparent in cases of causal perception. When one "perceives" that a collision with ball one has caused a second ball to move, then, at least in the absence of information about some other causal factor that might cause movement of the second ball, most people will find it natural—indeed virtually irresistible—to judge it was the collision that made the difference for the motion of the second ball and that if the collision had not occurred, the second ball would not have moved. In other words, we assume there will be a close relationship between the spatio-temporal relationships and connecting processes on which GM accounts focus and difference-making. This is reflected in the responses of subjects to the Schlottmann/Shanks experiment—subjects found the dissociation between what geometrical cues seemed to indicate about causal relationships and what the contingency evidence seemed to indicate unexpected and bewildering. For example, subjects commented that on contingent problems with delay, they were aware that collision was necessary for B to move but that "it just did not look as if it should be." On non-contingent problems with no temporal delay, they commented that "they knew the collision was not necessary for B to move but it looked as if it should be". Similarly, when geometrical/mechanical information
tells us that one object (e.g. a plate) is supported by another object on which it rests (a table), we immediately also judge that the contact with the table is what makes a difference for whether the plate remains at rest, that if the table were removed, the plate would fall, and so on. In short, despite the conceptual distinctness of GM and DM approaches to causation, we also think that in appropriate circumstances the presence of certain spatio-temporal relationships/patterns and mechanical properties can be fallible indicators of the presence of certain kinds of difference-making and that different varieties of difference-making often have characteristic spatio-temporal "signatures". (In ordinary circumstances, one billiard ball does not make a difference for whether another moves if it does not make spatial contact with it.)

4 The Role of DM and GM Information in the Characterization of Mechanisms

The rest of my discussion will explore the relevance of these ideas to controversies about how the notion of a mechanism is best characterized and how this notion figures in causal explanation. As I see it, these controversies turn in part on issues about the relationship between the two approaches to causation I have distinguished. Insofar as the emphasis on the role of mechanisms represents a distinctive alternative to DM (and in particular, interventionist) treatments of causation, I believe this is because mechanists think of the GM approach (or at least some sort of non-comparative, "actualist" approach) to causation as more fundamental than DM approaches and as a promising way of capturing what is "really going on" at a "deeper" level when difference-making relations are present. In what follows I express skepticism about this idea but also argue, on a more positive note, that a more adequate characterization of the (or a) notion of mechanism can be found by integrating elements of GM and DM approaches.

5 Waskan's Proposals

It should be obvious that one can sometimes tell (for example, on the basis of randomized experiments) that (M) (or other similar conditions having to do with difference-making) is satisfied even if one knows nothing about the "mechanism" (in any intuitive sense of that notion) connecting X to Y. In fact, it is quite common to be in this epistemic situation. It was known for many years that ingestion of aspirin causes (in the DM sense) headache relief but only recently has the mechanism of action of aspirin come to be understood. Similarly, the role of SSRIs in causing alleviation of depression in at least some subjects is generally accepted, but the mechanism by which the drug produces this effect remains controversial. (M) seems to fit these observations in a natural way since for there to be a causal relation between X and Y, all that is required is the right sort of contingency between X and Y and one can have knowledge that such contingency obtains without knowing anything about the mechanism connecting X and Y. As Waskan notes, these observations seem to tell against one simple proposal about the semantics of causal claims: namely that to assert that C causes E is to assert that the specific mechanism that in fact connects C to E holds between C and E. The
observations also tell against a related proposal about what it is to (internally) represent that there is a causal relationship between C and E—namely that this requires that the subject represent whatever specific mechanism as a matter of fact connects C to E. Waskan's response is to interpret the claim that C causes E as "[making] a non-specific assertion to the effect that [C] bears some general sort of mechanical relation" to E (p. 13). In other words, his contention is that the claim (1) C causes E "asserts" that there exists some mechanism connecting C to E, although (1) does not specify what that mechanism is. Waskan adds, "[t]he tricky part… is figuring out how to flesh this idea out into something useful". This is indeed tricky, for at least two reasons. The first issue, which I raise but will not further pursue, is to provide some non-trivial characterization of what "some general sort of mechanical relation" includes—a characterization that is sufficiently general to cover everything that intuitively one wants to classify as a "mechanical" relation and yet at the same time is not so vague as to be entirely toothless and to exclude nothing. For example, this characterization will need to be broader than Salmon/Dowe type characterizations in terms of energy/momentum transfer (in order to include, e.g., neural mechanisms) but also needs to exclude some kinds of DM relationships (perhaps those involving omissions and double prevention?) as not providing information about "mechanisms", if the GM approach is to be a genuine alternative to DM accounts. Putting this issue aside, suppose we somehow arrive at a satisfactory characterization of what it is for C to bear "some general sort of mechanical relation" to E. Is the existence of such a relation between C and E sufficient for C to cause E? We have already seen the answer to this question is "no", at least if causation is understood in a broadly difference-making sense. This was the lesson of the cue ball and cough/yellow fingers examples: although there is a mechanical relationship (in the paradigmatic sense of spatio-temporally continuous transmission of energy/momentum) between the presence of the chalk on the cue stick (P) and the falling of the eight ball (F), P does not cause F. So "P causes F" cannot mean merely that there exists some mechanism connecting P to F. Note, moreover, that this difficulty cannot be avoided by looking for a less restrictive notion of "mechanical relationship" than "relationship mediated by energy/momentum transfer". The proposal under consideration says that as long as "there exists" a mechanical relationship between P and F, then P causes F, and there does exist such a relationship in this case, on any reasonable interpretation of "mechanical". Even if we cannot appeal to the existence of "some mechanism" between C and E to elucidate what it is for C to cause E, there remains the interesting and important issue of just what kind of information is conveyed by mechanism claims. Is this just (more detailed) DM information or does it involve something completely different? To further explore this issue, let us briefly consider how one feature of information about mechanisms might be accommodated within an interventionist framework and then contrast this with the alternative that Waskan favors.
Whatever else is meant by “information about the mechanism” connecting C and E, virtually everyone agrees that this often involves the provision of information about intermediate or intervening causal relationships that underlie or mediate the overall relationship between C and E. As an illustration, consider the mechanism by which
aspirin causes pain relief, which in rough outline is as follows: headaches and various other injuries/disorders cause the production of prostaglandins, which cause swelling of the affected part and are involved in pain signaling. The synthesis of prostaglandins requires COX enzymes, of which there are two types, COX-1 and COX-2, operating on arachidonic acid. Aspirin works by inhibiting the synthesizing action of both enzymes, thus preventing the conversion of arachidonic acid to prostaglandins. In particular, both enzymes have "tunnels", through which the arachidonic acid must pass to reach the active sites of the enzymes. Aspirin blocks these tunnels, although the details of how it does so and the active sites themselves differ from COX-1 to COX-2. These differences are important because the form of prostaglandin made by COX-1 helps to protect the lining of the stomach; it is the inhibition of this form which accounts for the role of aspirin in upset stomach. Potentially, then, it may be possible to create agents which, unlike aspirin, act only on COX-2 and not on COX-1, thus providing some pain/inflammation relief without causing stomach upset. On one natural way of understanding this mechanism information, it seems to fit well into a difference-making (and interventionist) framework—what is provided includes more detailed, fine-grained difference-making information (and relatedly, more detailed, fine-grained possibilities for intervention) that goes well beyond information about the overall difference-making relation between aspirin and pain relief. (Perhaps this information about intervening causes involves more than just more fine-grained difference-making information, but it at least includes this.) For example, we are told the presence of prostaglandins makes a difference for whether one experiences pain and inflammation (hence that intervening on whether prostaglandins are present in some other way besides aspirin ingestion would also make a difference for pain and inflammation), that the synthetic activity of COX-1 and 2 makes a difference for whether prostaglandins are present (so that intervening on this can affect whether there is pain or inflammation), and that whether aspirin is present makes a difference for whether this synthetic activity occurs. Similarly, that the active site of COX-1 is different from the active site of COX-2 suggests the possibility of an intervention that inhibits only the former. This is by no means the only way in which an interventionist account might attempt to capture some aspects of the notion of information about mechanisms (see below for some additional suggestions) but even at this point, it is at odds with Waskan's treatment. I have suggested that at least some features of mechanistic information are just "more of the same" when viewed from an interventionist perspective—more fine-grained and detailed information about interventionist counterfactuals than is provided by the original causal claim. By contrast, Waskan sees information about connecting mechanisms as different in kind from the information about overall dependence provided by the original causal claim; it is not difference-making or interventionist information at all.
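One way to picture this reading, on which mechanism information is just finer-grained interventionist information, is as a short chain of structural equations in which intervening on an intermediate variable also changes the outcome. The following sketch is purely illustrative; the variable names echo the aspirin example but the equations are not a pharmacological model.

# Toy structural-equation sketch of the intermediate-link reading
# (illustrative only): aspirin -> COX activity -> prostaglandins -> pain.

def cox_activity(aspirin):
    return 0.0 if aspirin else 1.0      # aspirin inhibits COX activity

def prostaglandins(cox):
    return cox                          # synthesis depends on COX activity

def pain(pg):
    return pg > 0.5                     # prostaglandins make a difference to pain

def run(aspirin, interventions=None):
    """Evaluate the chain, optionally overriding ('intervening on') a variable."""
    iv = interventions or {}
    cox = iv.get("cox", cox_activity(aspirin))
    pg = iv.get("pg", prostaglandins(cox))
    return pain(pg)

print(run(aspirin=False))                           # True: pain without aspirin
print(run(aspirin=True))                            # False: the overall causal claim
print(run(aspirin=False, interventions={"pg": 0}))  # False: intervening on the
                                                    # intermediate link also matters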
If I have understood him correctly, Waskan’s idea is that mechanism information differs from difference-making information in that the former is just information about what actually happens (so that in this respect, it incorporates one of the distinctive features of GM information), while the latter has a modal or counterfactual element to it. He writes:
The driving intuition behind some mechanists’ resistance to Woodward’s account of the contents of causal claims is that causal claims make assertions about what actually happens rather than about what would happen…. In particular, some contend that causal claims make assertions about actual productive mechanisms connecting cause and effect. To illustrate this idea Waskan refers to a mechanistic account of movement disorders in late stage Huntington’s and Parkinson’s put forward by Mink (1996). In Waskan’s summary: To vastly oversimplify, [Mink] claims that motor-planning systems of the cortex initiate movement by focally exciting neurons in the striatum and globally exciting neurons of the substantia nigra. The latter in turn send inhibitory signals to each of the endogenously active neural triggers ….. in the cerebellum for familiar motor patterns (e.g., raising one’s arm, clenching one’s fist, etc.). At the same time, local activity in the striatum leads to selective inhibition of the substantia nigra, which has the effect of selectively disinhibiting specific motor-pattern triggers. It is easy to envision how, on this setup, degeneration of striatum might produce movement-execution deficits (i.e., no brake would ever be disabled) and how degeneration of SN would lead to co-contraction rigidity (i.e., many brakes would be disabled simultaneously). To illustrate what is distinctive about this information, Waskan asks us to consider the following causal claim, which I re-label as [2]: [2] Degeneration of dopaminergic striatal neurons (S) causes movement-execution difficulties (E). He writes: There are at least two distinct ways in which we might conclude that [2] is the case. On the one hand, we might monitor the relative values of S and E and come to believe on this basis, and finally to claim, that S causes E. Let us call this a superficial justification… [On the other hand] one might, in principle, also reach the conclusion that [2] is true by learning about the relevant physical parts and processes of the nervous system …. For instance, one might discover that Mink’s general proposals about the mechanisms connecting motor planning centers and motor pattern triggers are correct and subsequently determine that an implication of this setup is that increasing or decreasing S will result – that is, absent such impediments as the introduction of reuptake inhibitors, electrocortical stimulation, etc – in an increase or decrease in E. In what sense is the information Mink provides just about “actual” mechanical connections or “physical parts and processes”, in contrast to information about what would happen under interventions? One thing that might be meant by “actual connections” etc. are the anatomical or physical connections involving neurons, synapses etc linking different neural structures. These do figure prominently in Mink’s discussion, since he is concerned to show that the various neural connections required
by his hypotheses about motor control are in fact present. But it is generally agreed that one cannot just read off, from facts about these anatomical connections, what the causal role (in the sense of difference-making) of structures like the substantia nigra or striatum is in motor behavior. One reflection of this is the contrast neurobiologists draw between “anatomical connectivity” and what they call “functional connectivity”—the latter being a difference-making notion having to do with whether various changes or signals originating in one neural structure cause changes in other structures. The existence of the right sort of anatomical connections is necessary for one neural area to influence another, but the existence of such connections does not in itself tell us what the pattern of functional connectivity between the two areas is or the details of how what happens in one area depends on what happens in the other.3 Still less does it yield anything like a computational theory of how normal movement is produced or how damage to the circuits and structures involved in normal movement leads to movement disorders. For these latter purposes, covariational information of some kind seems required, which may derive either from experimental interventions or from passive observation.

3 Would it help the actualist analysis if we included, within the class of “actual connections”, such properties and processes as the discharge of axons, diffusion of ions across synaptic clefts, and the like? No. We still face the same fundamental difficulty—if we want to understand the causes (in the difference-making sense) of normal and abnormal movement, we need to identify the difference-making role of those properties/processes which make a difference for these outcomes. The mere occurrence of, say, a certain pattern of neural spiking followed by normal movement does not tell us what (if anything) in that pattern made a difference for whether the movement was normal. For that we need information about what would happen if the spiking pattern were different.

In fact such evidence plays a central role in Mink’s account, as is reflected in his use of information about the impact of surgically and chemically induced lesions of various sorts on movement disorders to elucidate the mechanism underlying [2]. For example, Mink notes that the hypothesis that parkinsonism is associated with decreased activity of certain structures and increased activity of others [GPi and STN] can be tested with focal lesions of the latter structures in normal (animal) subjects. This in effect involves the use of experimental interventions to show how the normal activity of these structures makes a difference of various sorts for movement control. Mink also describes evidence involving recording from various structures when normal subjects perform tasks that involve different sorts of movements (for example, tasks involving sequences of movements versus single movements). This provides information that different structures are differentially involved in single versus sequential movement tasks and that damage to different structures leads to different sorts of movement disorders. In all of these cases, it looks as though what is represented or elucidated is the difference-making role of these different structures, and the different connections (excitatory or inhibitory) between them.

In addition, Mink’s review article (as well as his subsequent papers on movement disorders) contains several purported “circuit diagrams” (e.g., Figs. 14 and 15) showing (in a schematic way) how various sorts of activity, excitatory or inhibitory, in certain neural structures influence responses in others, and in turn influence movement. The diagrams also suggest how damage or disruption of certain of these circuit components leads to abnormal movement. Again it is natural to think of these diagrams as representations of difference-making relations, showing us that, e.g., increased activity in the striatum leads via a direct pathway to inhibition of GPi, etc. What seems distinctive about these diagrams as representations of mechanisms is not that they eschew difference-making information, but rather the way in which they integrate it with other sorts of information—for example, information about the physical connections and spatial relationships along which, so to speak, the difference-making relations operate.

I thus see this example as illustrating the same general point as the chalk transmission example. It is true enough that if the impact of the cue stick causes the eight ball to go into the pocket, there must be a “connection” of some appropriate kind between the stick and the ball. But the existence of this connection does not by itself pin down what it is about the motion of the cue stick that makes a difference for whether the ball goes into the pocket. Similarly for the role of various neural circuits in the control of movement.
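To make the difference-making reading of such diagrams concrete, one can treat a circuit diagram of this kind as a signed influence graph and read off qualitative predictions about what would change under interventions. The following Python sketch is purely illustrative: the node names, the single chain of influence, and the signs are simplifying assumptions made for the example (loosely following Waskan’s summary quoted above) and do not reproduce Mink’s actual model.

    # Illustrative sketch only: a grossly simplified signed influence graph of the
    # "brake release" idea; node names and edge signs are assumptions for the example.
    CIRCUIT = {
        "cortex": [("striatum", +1)],              # excitatory input
        "striatum": [("output_nuclei", -1)],       # focused inhibition of the output nuclei
        "output_nuclei": [("motor_trigger", -1)],  # tonic inhibitory "brake" on motor-pattern triggers
        "motor_trigger": [],
    }

    def qualitative_effect(circuit, node, delta, target, seen=frozenset()):
        """Propagate a qualitative change in `node` (+1 = increase, -1 = decrease)
        along all signed paths and return the predicted change in `target`
        (+1, -1, or None if there is no path or the paths disagree in sign)."""
        if node == target:
            return delta
        results = set()
        for successor, sign in circuit.get(node, []):
            if successor in seen:
                continue
            r = qualitative_effect(circuit, successor, delta * sign, target, seen | {node})
            if r is not None:
                results.add(r)
        return results.pop() if len(results) == 1 else None

    # Intervening to decrease striatal activity: the brake is never released,
    # so trigger activity is predicted to decrease (movement-execution deficit).
    print(qualitative_effect(CIRCUIT, "striatum", -1, "motor_trigger"))      # -1
    # Intervening to decrease activity in the output nuclei: the brake is removed,
    # so trigger activity is predicted to increase (indiscriminate disinhibition).
    print(qualitative_effect(CIRCUIT, "output_nuclei", -1, "motor_trigger"))  # +1

What such a toy makes vivid is that the diagram’s content is interventionist through and through: each signed edge is a claim about how a change at one node would make a difference at the next, which is precisely what cannot be read off from anatomical connectivity alone.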
6 More on Mechanisms and Interventionist Counterfactuals

So far, I have focused on one way in which an interventionist account might try to capture aspects of mechanistic information—this having to do with information about intermediate causal links. I turn now to some other ways in which interventionist accounts might capture other features of mechanisms. I noted above that the characterization (M) is vague and non-specific and that we often wish to know more details. Even when quantitative information about the X–Y relation is not available, it may be possible to discover more qualitative features of the relationship that go beyond the simple claim that X causes Y, and often this information will bear on the nature of the mechanism connecting X to Y. For example, if X is a variable that can increase or decrease in magnitude, then it is often of interest to learn whether such changes are associated with increases or decreases in Y (whether there is a “dose–response” relationship), whether there are threshold effects in X’s relationship to Y (as in the discharge of an axon), and so on. Again, this is naturally thought of as information about difference-making, and it can be useful for the identification of mechanisms even when imprecise.

For example, before the molecular mechanisms by which smoking causes lung cancer were elucidated, researchers had, in addition to the observation of overall correlations between smoking and lung cancer, general if vague ideas about the mechanism by which smoking produced this effect: they assumed that substances in tobacco smoke were inhaled during smoking and deposited in the lungs, and that their presence had a carcinogenic effect. This suggested additional hypotheses about the qualitative character of the smoking/lung cancer relationship, understood along interventionist or difference-making lines. For example, one might expect to see evidence of tobacco products in smokers’ lungs. Another fallible but not unreasonable inference is that those who smoke more should deposit more carcinogenic material in their lungs and hence have a higher probability of lung cancer. Similarly for those who inhale deeply while smoking in comparison with those who do not. Filtering devices, if effective in removing the carcinogenic material in smoke, should reduce the incidence of cancer. And so on. When these patterns were found, this increased confidence that smoking causes lung cancer. As this example illustrates, one reason why information about mechanisms is valuable (in addition to the increased
understanding it brings) is that it can play an important role in testing causal hypotheses. Relying just on the observed correlation between smoking and lung cancer, one might worry that this is produced by some unknown confounding factor. However, it is extremely implausible that this factor should happen to be distributed in such a way that all of the patterns described above obtain—more cancer among those who smoke more heavily, and so on. Finding such patterns helps to rule out the confounding factor hypothesis.

Yet another feature of information about mechanisms that can be naturally captured within an interventionist framework has to do with stability (discussed in more detail in Woodward 2006, 2010). The stability of a causal or counterfactual relationship has to do with whether that relationship would continue to hold as various other “background” factors change. As noted above, mechanistic explanations often provide information about intermediate causal links “underlying” an overall causal relationship, and often these intermediate links will be more stable than the overall relationship itself. A concern with identifying stable relationships thus leads directly to an interest in identifying intermediate links in underlying mechanisms. In addition, as I have attempted to explain elsewhere (2006), relationships of counterfactual dependence that are not mediated by direct physical connections, such as relationships of “double prevention”, may differ greatly in their stability. Other things being equal, the more stable such relationships are, the more likely they are to be judged as paradigmatically causal. By appealing to this consideration, an interventionist can provide a partial explanation of why many double prevention relations in biomedical contexts strike us as unproblematically causal,4 while other double prevention relations, such as those in the Walsh and Sloman experiment described by Waskan, do not.

4 Lombrozo (2010) provides empirical evidence that subjects are more likely to judge stable double prevention relationships as causal than unstable relationships.

7 Mechanisms and Spatio-temporally Organized Difference-Making Information

Despite these suggestions about how interventionist ideas might be used to elucidate mechanism claims, many mechanistas will still think something crucial has been omitted. In this section, I offer a further proposal about what this might be—it is intended as a supplement to, rather than a retraction of, the interventionist approach. When one looks at accounts of mechanisms in philosophy, as well as at examples of mechanistic explanations in science itself (especially the biomedical sciences), one thing that stands out is the role that the spatio-temporal structure or organization of the various components plays in the operation of the mechanism. For example, as emphasized in Bechtel (2006), biochemical mechanisms require that various reaction products be brought together at just the right times and at the right spatial positions with respect to each other for reactions to go forward. One expects there to be systematic relationships between difference-making and spatio-temporal relationships in other respects as well. If a reaction is claimed to proceed via some (causally) intermediate product, then one expects that this causal structure will be mirrored in the spatio-temporal
relationships that are present: we expect to find the causally intermediate product in a position that is spatially and temporally intermediate between the initial and final steps of the reaction. If not, this is at least prima facie evidence that the hypothesized reaction mechanism is mistaken. Similarly, accounts of neural mechanisms require not just that there be neuronal connections mediating putative causal influences, but that these respect various temporal and spatial constraints that are characteristic of the brain. For example, the speed at which an animal carries out some perceptual discrimination task sharply constrains the possible neural mechanisms that might carry out that task, given information about the relatively slow velocity with which neural signals propagate. As another illustration, many causal influences in the difference-making sense (fundamental physical forces, transmission of infectious diseases, some social influences) diminish in regular ways with distance, so that nearby objects or regions are more strongly affected than more distant objects and regions. These are specific illustrations of the more general point noticed in connection with launching experiments: there are often systematic connections between causation in the sense of difference-making and facts having to do with spatio-temporal relatedness and organization, with different difference-making relationships having, so to speak, different characteristic spatio-temporal signatures. One thing many mechanistic explanations do is to provide information about these connections between difference-making and spatio-temporal organization; in telling us about more intermediate or fine-grained causal relationships, they also tell us how the objects and processes standing in these relationships are spatio-temporally related in such a way that they can exercise their difference-making roles.

I emphasize, however, that in my view, unlike Waskan’s, this is not a matter of replacing “superficial” facts about difference-making with “deeper” explanations specified in purely spatio-temporal or other “actualist” terms. Rather, mechanistic explanation involves, so to speak, difference-making information all the way down, but spatio-temporally organized difference-making information. In thinking about how mechanistic explanations work, we should think in terms of understanding how difference-making and spatio-temporal information work together in an integrated fashion, rather than thinking in terms of the replacement of the former by the latter.

The interrelationships between difference-making and spatio-temporal structure are diverse and defy easy summary. There is a tendency for philosophical treatments of these interrelations to focus on just one possibility—causal relationships in which cause and effect are spatio-temporally contiguous. It is important to recognize, however, that even when difference-making relationships do not involve spatio-temporal contiguity, they may involve some other characteristic spatio-temporal pattern. For example, many drugs and diseases are such that there is a characteristic temporal delay between exposure to the cause and the occurrence of the effect. If an illness occurs immediately after exposure to some putative cause (rather than, e.g., hours or days later), this is reason to doubt that the factor in question was really efficacious. Similarly for the role of many drugs in promoting recovery from disease.
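To see how sharp the timing constraint on candidate neural mechanisms can be, consider the following toy feasibility check. Given an assumed conduction velocity and an assumed per-synapse delay, one can ask whether a hypothesized multi-stage pathway could even in principle fit within an observed behavioral latency. All of the numbers and pathway names below are placeholders invented for the illustration, not measured values.

    # Toy check: is a hypothesized multi-stage pathway fast enough to account for an
    # observed behavioral latency? All values are illustrative placeholders.
    def minimum_latency_ms(path_length_m, n_synapses,
                           conduction_velocity_m_per_s=10.0,  # assumed; real values vary widely by fiber type
                           synaptic_delay_ms=1.0):            # assumed per-synapse delay
        conduction_ms = 1000.0 * path_length_m / conduction_velocity_m_per_s
        return conduction_ms + n_synapses * synaptic_delay_ms

    observed_latency_ms = 150.0  # hypothetical behavioral measurement for the discrimination task

    candidates = {
        "short feed-forward route": {"path_length_m": 0.3, "n_synapses": 5},
        "long multi-loop route":    {"path_length_m": 2.0, "n_synapses": 60},
    }

    for name, spec in candidates.items():
        t = minimum_latency_ms(**spec)
        verdict = "consistent with the observed latency" if t <= observed_latency_ms else "ruled out on timing grounds"
        print(f"{name}: minimum {t:.0f} ms -> {verdict}")

The point of the exercise is not the particular numbers but the form of the inference: a purely spatio-temporal fact (how fast signals can travel over a path of a given length and number of stages) constrains which difference-making structures could possibly underlie the observed capacity.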
A related observation is that there are many other possible relationships between difference-making and spatial organization, besides the obvious one that some difference-making relations require spatial contact. Even when literal spatial contact is not required, many causal agents require some degree of spatial proximity to produce their effects or, as noted above, tend to affect nearby objects more strongly than more distant objects, or tend, if causally efficacious at all, to affect many or all nearby objects rather than just a few. It is for this reason that patterns of spatial clustering and spread of diseases can often convey important information about the mechanisms by which they are caused, as illustrated, for example, by Snow’s well-known investigations into cholera.5

5 The notion of modularity in Woodward (2003) provides another illustration of interconnections between DM aspects of causation and spatio-temporal relationships. Often different modules correspond to spatio-temporally distinct parts.
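As a rough illustration of how the clustering patterns just mentioned can be interrogated, the sketch below compares the average distance from a hypothesized point source for synthetic “case” locations and for the surrounding population. Everything in it (the coordinates, the source, the sample sizes) is invented for the example; it is a schematic nod to the logic of Snow’s investigation, not a reconstruction of it.

    # Toy illustration: cases produced by a proximity-dependent mechanism (e.g., a
    # contaminated point source) should cluster nearer the source than the population
    # at large; a cause with no spatial signature predicts no such gradient.
    # All data below are synthetic.
    import random
    random.seed(0)

    def mean_distance_from(source, points):
        sx, sy = source
        return sum(((x - sx) ** 2 + (y - sy) ** 2) ** 0.5 for x, y in points) / len(points)

    pump = (0.0, 0.0)  # hypothetical point source (km coordinates)
    households = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(500)]
    cases = [(random.gauss(0, 0.15), random.gauss(0, 0.15)) for _ in range(60)]  # clustered near the source

    print(f"mean household distance from source: {mean_distance_from(pump, households):.2f} km")
    print(f"mean case distance from source:      {mean_distance_from(pump, cases):.2f} km")

A markedly smaller mean distance for the cases is the kind of spatial pattern that bears on the mechanism of transmission; it is, of course, defeasible evidence, but it is difference-making evidence with a characteristic spatial signature rather than a mere record of what actually happened.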
8 Causal Perception and the Development of Causal Representation

I conclude with some brief remarks on the development of causal representation—a topic Waskan also addresses. One popular idea goes like this: infants’ capacity for causal perception in connection with launching events (and perhaps other sorts of mechanical interactions) emerges very early—perhaps it is “innate”. The representations “triggered” in such perception then become available for more general use to represent other sorts of causal relationships and for other kinds of causal reasoning and learning, but the concepts/reasoning patterns themselves undergo at best limited further change and development. As Schlottman and Shanks (1992) summarize the idea, “the mechanism of causal perception… would provide a robust intuitive understanding of the concept of cause”. Although Waskan does not endorse the claim that the adult notion of causation derives only from causal perception, he is sympathetic to the claim that this plays some role:

While our adult concept of causation probably does not arise solely from the content of the kind of causal perception associated with launch events, it may well arise from the integrated use of an array of perceptual, or perceptually derived, expectations regarding spatial, temporal, and other properties.

I agree with Waskan that the challenge is to understand how the different elements making up the adult notion of causation are (eventually) integrated or put together. But unlike what may be Waskan’s view, I think one of the central elements that needs to be integrated has to do with the role of difference-making considerations. As we have seen, conceptually (and in some respects empirically) these DM considerations are independent of notions having to do with spatio-temporal connectedness and other notions definable in actualist terms (such as the features that Waskan describes as “perceptual”). As a result, it seems unlikely that the representation of causation qua difference-making can somehow be derived from, reduced to, or replaced by a representation characterized just in actualist terms.

The acquisition of a full adult notion of causation is a very complicated matter, but here is a speculative suggestion about part of the story, which draws on my remarks about difference-making relations often having characteristic spatio-temporal signatures. Imagine the young infant begins, developmentally, by finding certain spatio-temporal relationships, such as those involved in launching and other contact events
particularly salient. At some early point—I believe it is not fully known when6—they also notice that these relationships are connected to motion (or its absence) in characteristic difference-making ways: balls move when and only when there is spatial contact with other moving objects; objects remain at rest when supported by spatial contact with solid surfaces and fall when unsupported. Infants are sensitive to covariational information in general, even when this does not have distinctive spatio-temporal features, but covariational relations are often more easily learned and incorporated into action when they do possess such features (cf. Bonawitz et al. 2010). The infant’s experiences with moving balls and the like lead to “expectations” regarding the relationship between covariational and spatio-temporal relationships (which we can test for in looking-time experiments), but these need not have all the features of full-fledged adult causal representations—among other things, the infant may lack full appreciation of their significance for manipulation. In addition, expectations about covariation do not in themselves require the idea that some patterns of covariation are causal and others are not.

Moreover, initially infants register only very coarse-grained spatio-temporal features of these experiences—e.g., they register just that support requires there be some spatial contact between the object supported and the support. As a result (as experiments show), the infant is initially not surprised by an ecologically unnatural case in which an object fails to fall even though its center of mass is located well off the support. But with additional experience, the infant begins to learn that the details of spatio-temporal relationships are also systematically connected to covariational or difference-making relationships: the infant learns that whether an object in contact with a surface will be supported depends on how much of the object overlaps the surface, that whether (and how much) an object hit by another will move depends not just on whether there is spatial contact but on the relative sizes, heaviness, and speeds of the objects, and so on. At the same time, the infant also acquires experience of difference-making relationships—for example, those involved in social interactions—that show that difference-making can exist independently of spatio-temporal contact.

The result, eventually, is an integrated understanding (and accompanying representation) of how the difference-making aspects of causation are connected to distinctive spatio-temporal relationships in some cases, but not always. In this process, the representations deployed in perception of launching and similar interactions are not retained unaltered from early infancy but are rather enriched and transformed in important ways.

6 For example, it is unclear whether the 6-month-old infants who allegedly perceive causation in launching events would show evidence of violated expectations if they were given information that is discordant between perceptual and covariational cues to causation.

References

Bechtel, W. (2006). Discovering cell mechanisms. Cambridge: Cambridge University Press.
Bonawitz, L., Ferranti, D., Saxe, R., Gopnik, A., Meltzoff, A., Woodward, J., & Schulz, L. (2010). Just do it? Investigating the gap between prediction and action in toddlers’ causal inferences. Cognition, 115, 104–117.
Dowe, P. (2000). Physical causation. Cambridge: Cambridge University Press.
Hall, N. (2004). Two concepts of causation. In J. Collins, N. Hall, & L. Paul (Eds.), Causation and counterfactuals. Cambridge, MA: MIT Press.
Hitchcock, C. (1995). Discussion: Salmon on causal relevance. Philosophy of Science, 62, 304–320.
Lombrozo, T. (2010). Causal-explanatory pluralism: How intentions, functions, and mechanisms influence causal ascriptions. Cognitive Psychology, 61, 303–332.
Mink, J. (1996). The Basal Ganglia: Focused selection and inhibition of competing motor programs. Progress in Neurobiology, 50, 381–425.
Salmon, W. (1984). Scientific explanation and the causal structure of the world. Princeton: Princeton University Press.
Schlottman, A., & Shanks, D. (1992). Evidence for a distinction between judged and perceived causality. The Quarterly Journal of Experimental Psychology, 44A, 321–342.
Woodward, J. (2002). What is a mechanism? A counterfactual account. Philosophy of Science, 69, 366–377.
Woodward, J. (2003). Making things happen: A theory of causal explanation. New York: Oxford University Press.
Woodward, J. (2006). Sensitive and insensitive causation. The Philosophical Review, 115, 1–50.
Woodward, J. (2010). Causation in biology: Stability, specificity and the choice of levels of explanation. Biology and Philosophy, 25, 287–318.
Woodward, J. (forthcoming). Causal perception and causal cognition. In J. Roessler (Ed.), Perception, causation and objectivity: Issues in philosophy and psychology. New York: Oxford University Press.