Synthese (2011) 182:1–5 DOI 10.1007/s11229-009-9622-9
Phenomena, data and theories: a special issue of Synthese
Peter Machamer
Received: 25 May 2009 / Accepted: 3 June 2009 / Published online: 10 July 2009 © Springer Science+Business Media B.V. 2009
The papers collected here are the result of an international symposium, "Data · Phenomena · Theories: What's the notion of a scientific phenomenon good for?", held in Heidelberg in September 2008. The event was organized by the research group "Causality, Cognition, and the Constitution of Scientific Phenomena" in cooperation with the Philosophy Department at the University of Heidelberg (Peter McLaughlin and Andreas Kemmerling) and the IWH Heidelberg. The symposium was supported by the "Emmy-Noether-Programm der Deutschen Forschungsgemeinschaft" and by the "Stiftung Universität Heidelberg". The workshop was held in honor of Daniela Bailer-Jones, who died on 13 November 2006 at the age of 37 (cf. my 2007 "Daniela Bailer-Jones").1 Bailer-Jones was an Emmy Noether fellow, and the symposium was arranged and run by those who were working in her research group at the time of her death: Monika Dullstein, Jochen Apel, and Pavel Radchenko. To them goes the credit for the conception, planning, and carrying out of the symposium.

1 A brief prolegomena to phenomena

In philosophy of science, the origin of much of the contemporary discussion about phenomena and data came with the publication of Bogen and Woodward (1988). There they made the important distinction between phenomena and data. Data, which play the role of evidence for the existence of phenomena, for the most part can be straightforwardly observed. However, data typically cannot be
1 Machamer (2007).
P. Machamer (B) Department of History and Philosophy of Science, University of Pittsburgh, 1017 Cathedral of Learning, Pittsburgh, PA 15260, USA e-mail:
[email protected]
predicted or systematically explained by theory. By contrast, well-developed scientific theories do predict and explain facts about phenomena. Phenomena are detected through the use of data, but in most cases are not observable in any interesting sense of the term (pp. 305–306). Here Bogen and Woodward seem to assume that phenomena are constructed from data obtained by experiment. In more detail, as Woodward wrote later:

Phenomena are stable, repeatable effects or processes that are potential objects of prediction and systematic explanation by general theories and which can serve as evidence for such theories (Woodward 2000, p. S163).

Presumably, to be consistent with the original, the phenomena spoken of are "repeatable effects or processes" of experiments or other data-producing activities. Data, by contrast, are what is produced by experiments, usually the result of measurements. Again, the later Woodward:

Data are public records…produced by measurement and experiment that serve as evidence for the existence of phenomena or for their possession of certain features (Woodward 2000, p. S163).2

Despite the influence of their work, it is important to note that the Bogen and Woodward treatment of phenomena is non-traditional, and somewhat limited. Traditionally, what we seek for a definition of "phenomena" is something more crude and open, such as: phenomena are the "stuff" in the world that we may study. I realize "stuff" is not exactly perspicuous. I mean something like the kinds of entities and activities that are in or of the world. These would include objects, properties, events, processes, artifacts, effects, and probably many other kinds.

Classically, Duhem and others took phenomena to be observable happenings. For example, observed planetary positions, risings and settings of stars, or eclipse observations were the phenomena that needed to be saved. Theories (or models), instrumental or realistic, were then constructed that would 'save the phenomena'. As G.E.L. Owen put it, "the phainomena in question are empirical observations" (1967, p. 169).3 Of course, Owen goes on to list other uses of "phenomena" in Aristotle, but none of them correspond to Bogen and Woodward's use.

There is no consistent usage across the existing literature (in many fields) as to what "phenomena" means. There are many uses in many fields, and many distinctions, many of which are quite unclear and unhelpful. The concept of data, by contrast, is surprisingly clearer, but still not exactly unproblematic.

Connecting phenomena with particular observations (pace Bogen and Woodward) could give rise to particular models or theories, e.g., about planetary orbits. But very often scientists were interested in general kinds of phenomena, e.g., falling bodies. This raises questions about how we are to classify, categorize, identify, or count kinds. This is a serious problem across many areas of science. It is not always clear what the criteria are for identifying something as one kind of thing. Is schizophrenia one disease or many? Does it have essential or prototypical characteristics or properties?

2 Woodward (1989, 2000).
3 Owen (1967).
Is hearing voices one such essential or necessary symptom? Or again, is Long Term Potentiation (LTP) of a neuron one phenomenon or many? In all cases it seems agreed that it is the strengthening of a connection between two neurons that enhances their communication. But some LTP depends on the NMDA receptor, while other LTP, even in the hippocampus, depends upon the metabotropic glutamate receptor (mGluR).4 Then there is the difference between E-LTP and late LTP. Yet somehow it is thought that they serve the same or quite similar functional roles in the nervous system, a role that allows memories to be formed and maintained. So maybe something like the same functional role in a variety of systems is sufficient to count as an identity condition. However, this would seem to depend on the use we are going to make of the phenomenon as identified.

When we are speaking of phenomena in the context of science, and more specifically experimental science, there are certain concerns that the concept is supposed to address. One of the most important, I think, is a concern about external or ecological validity. Does the experiment actually show us something important about what is going on in the real world that lies outside of the experimental setting? One form of this worry is gleaned by making a distinction between phenomena and artifacts (see Feest 2003, 2005, 2008).5 This problem arises when operationalizing a phenomenon so it may be experimentally investigated in a laboratory or other non-natural setting. Basically, we want to make sure that when we create an experimental design, we are reasonably sure that we are 'catching' the aspects of the phenomenon that originally sparked our interest or which we were seeking to explain. We want our experiments to tell us something about the world, about the phenomena. When we design experiments we try to simplify situations so that we may control the relevant variables, which will then allow us to intervene and observe what happens as a result of the intervention. We design experiments to generate data, which then may be used to tell us something about how the world is or how it works. But often we know something about the phenomenon of interest before setting up the experiment.

Moreover, I am not sure these entities and activities have to be stable (whatever that means) or even naturally existing in the real world to count as Bogen and Woodward phenomena. Consider an experimentally constructed phenomenon inferred from the data, which seems to fit Bogen and Woodward's original definition: Element 115, named Ununpentium (Uup) in 2004, decayed after a fraction of a second.6 What kind of stability is that? Maybe phenomena should be repeatable in principle, but there seem to be one-time events, like the supernova of 1604. Supernovae are a general class, but the one in 1604 was unique and important in singular ways. And surely history is

4 See Malenka and Bear (2004).
5 Feest (2003, 2005, 2008); Sullivan (2007).
6 Each calcium nucleus contains 20 protons and each americium nucleus 95. Since the number of protons determines where an element goes in the periodic table, simple addition shows the new element to bear the atomic number 115, which had never been seen before. Within a fraction of a second, the four atoms of Element 115 decayed radioactively to an element with 113 protons. That element had never been seen, either. The atoms of 113 lasted for as long as 1.2 s before decaying radioactively to known elements.
Scientists generally do not give permanent names to elements and write them into textbooks until the discoveries have been confirmed by another laboratory. By an international convention based on the numbers, element 113 will be given the temporary name Ununtrium. From NY Times, Feb 1, 2004.
filled with one-time events, such as Caesar's crossing of the Rubicon in 49 BC, when the die was cast. It is thoughtful to build into the definition of phenomena that they are potential objects of prediction or systematic explanation by general theories. But I am not convinced we have or can have a general theory that may explain why Caesar thought to confront Pompey by crossing the river. In the other, evidential direction, I am not sure that the event of Jim's evincing psychotic behavior on the occasion of seeing a particular fireworks display is ever going to serve as evidence, by itself, for any general theory of psychosis. But I think these cavils about what is included in the class of phenomena should only force us to re-cast the main problem.

I am sure, however, that in many cases part of the vindication of classifying things as being of a kind does depend upon the usefulness of that classification. Often part of the use of identifying or categorizing some thing is to warrant inferences about its etiology or mechanisms of production. Classification also licenses predictions and expectations of what things of that kind are likely to do. Both types of inference, to etiology or production and to predictions of future behaviors or events, depend heavily on the goals or purposes that the categorization scheme is to serve. Some categorizations may be useful for some purposes, e.g. psychiatric treatment, even though they do not correspond well, or even at all, to other ways in which we think of the phenomena that constitute the world. Some engineering categories may be like this.

The identity criteria for phenomena and the phenomena-artifact problem raise issues about how we may decide what the phenomena of the world really are. Individuation of phenomena, as noted, is not at all straightforward. Phenomena per se are not isolated entities, activities, things, or events. We treat them as individuals because of the interests we have in them. This is one major insight of the social constructivists and historical epistemologists. Ian Hacking's example of rent as a causally efficacious, social phenomenon is apt here.7 Sometimes we countenance relational properties as part of the nature of an entity and sometimes not. The symmetry properties of a crystal are essential relational properties that determine many of its optical behaviors. In many cases, the legitimacy of classifications depends on the goals and functions the classifications are supposed to serve.

Yet the problems do not stop here. As we run our experiments or build our models of phenomena we often discover aspects that we had not foreseen, and so come to change our view of what and how the phenomena really are. When we model a phenomenon already known (in some sense), experimentally in the lab or mathematically, as Daniela Bailer-Jones says, we often come to learn more about the phenomenon and clarify more precisely what the phenomenon is.8 And sometimes we get it wrong. My favorite case here is that of the French astronomers who were doing gas chromatic analyses of stars and encountered random sodium flashes on their screen. They tried very hard for a long time to explain the origin and relation of the sodium presence to the other data they acquired about the stars. And it was not until someone noticed that the sodium flashes were correlated
7 Hacking (2000). 8 Bailer-Jones (2009).
with the lighting of the astronomers' cigarettes (probably Gauloises) that the sodium emission was subtracted from the gaseous identity of the star.

These considerations would naturally lead to further discussion of reliability, types of validity, and the concept of categorization. But in an introduction such as this there is no time to pursue these discussions.

References

Bailer-Jones, D. (2009). Models in philosophy of science. Pittsburgh: University of Pittsburgh Press.
Bogen, J., & Woodward, J. (1988). Saving the phenomena. Philosophical Review, 97, 303–352.
Feest, U. (2003). Operationism, experimentation and concept formation. PhD dissertation, Pittsburgh.
Feest, U. (2005). Operationism in psychology—what the debate is about, what the debate should be about. Journal for the History of the Behavioral Sciences, XLI(2), 131–150.
Feest, U. (2008). Exploring implicit memory: On the interrelation between operationalizations, concept formation, and experimental artifacts. http://www.mpiwg-berlin.mpg.de/GenExpKnowl/feest.html.
Hacking, I. (2000). The social construction of what? Cambridge, MA: Harvard University Press.
Machamer, P. (2007). Daniela Bailer-Jones. International Studies in Philosophy of Science, 21(2), 211–212.
Malenka, R., & Bear, M. (2004). LTP and LTD: An embarrassment of riches. Neuron, 44(1), 5–21. http://en.wikipedia.org/wiki/Long-term_potentiation.
Owen, G. E. L. (1967). Tithenai ta Phainomena. In J. M. E. Moravcsik (Ed.), Aristotle: A collection of critical essays (pp. 167–190). Doubleday Anchor.
Sullivan, J. A. (2007). Reliability and validity of experiment in the neurobiology of learning and memory. PhD dissertation, Pittsburgh.
Woodward, J. (1989). Data and phenomena. Synthese, 79, 393–472.
Woodward, J. (2000). Data, phenomena and reliability. Philosophy of Science, 67 (Proceedings of the 1998 biennial meetings of the Philosophy of Science Association, Part II: Symposia papers), S163–S179.
Synthese (2011) 182:7–22 DOI 10.1007/s11229-009-9619-4
‘Saving the phenomena’ and saving the phenomena
Jim Bogen
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 2 July 2009 © Springer Science+Business Media B.V. 2009
Abstract  Empiricists claim that in accepting a scientific theory one should not commit oneself to claims about things that are not observable in the sense of registering on human perceptual systems (according to Van Fraassen's constructive empiricism) or experimental equipment (according to what I call "liberal empiricism"). They also claim scientific theories should be accepted or rejected on the basis of how well they save the phenomena in the sense of delivering unified descriptions of natural regularities among things that meet their conditions for observability. I argue that empiricism is both unfaithful to real world scientific practice, and epistemically imprudent, if not incoherent. To illuminate scientific practice and save regularity phenomena one must commit oneself to claims about causal mechanisms that can be detected from data, but do not register directly on human perceptual systems or experimental equipment. I conclude by suggesting that empiricists should relax their standards for acceptable beliefs.

Keywords  Empiricism · Data-phenomena · Evidence · Van Fraassen
This paper is based on the talk I gave at the conference Monika Dullstein, Jochen Apel, Pavel Radchenko and Peter McLaughlin organized in honor of Daniela Bailer-Jones. Thanks to them for inviting me, and thanks to the participants, especially Peter Machamer, Sandra D. Mitchell and Jim Woodward. Thanks to Ken Schaffner and Ted McGuire for helpful discussion, and to Mike Smith and the Trident, the wonderful espresso bar he runs in Boulder, Colorado.

J. Bogen (B) Department of HPS, University of Pittsburgh, 1112 N Highland Ave, Pittsburgh, PA 15206, USA e-mail:
[email protected]
1 ‘Saving the phenomena’

Jim Woodward and I called our first paper ‘Saving the Phenomena’ because we wanted to rescue the distinction between data and phenomena from the neglect it suffers at the hands of logical empiricists, their followers, and those who criticize them without giving up some of their central assumptions. What we call phenomena are processes, causal factors, effects, facts, regularities and other pieces of ontological furniture to be found in nature and in the laboratory. They exist and are as they are independently of the interests, concepts, and theoretical commitments of scientists, and of the political, historical, social, and cultural factors that influence them. Some phenomena are observable, but many are not. Appropriating what Bernard Williams said in another connection, to know about phenomena is to have

…knowledge of a reality that exists independently of that knowledge, and indeed…independently of any thought or experience. (Williams 1978, p. 78)

Such knowledge ‘…is of what is there anyway’ (ibid). Our original examples of phenomena included neuronal electrical signaling, weak neutral current events, solar neutrino fluxes, the melting point of lead, and the motions of moons and planets. Instances of phenomena owe their occurrences, behaviors, and characteristic features to relatively small numbers of causal factors that co-occur and interact with one another in pretty much the same way in a range of different circumstances and locations. They can be expected

… to have stable repeatable characteristics which will be detectable by means of a variety of different procedures which may yield quite different kinds of data. (Bogen and Woodward 1988, p. 317)

Phenomena are the kinds of things it is reasonable for scientists to try to predict and explain systematically by appeal to local or widespread causal mechanisms. They are the kinds of things theories are tested against.

Data, by contrast, are records of effects in investigators' sensory systems or experimental equipment. Their epistemic usefulness depends on features whose possession reflects the causal influence that phenomena1 or closely related causal factors2 exert on their production. Investigators apply background knowledge and inferential techniques to identify such features and look for answers to questions about phenomena.
1 The height of the mercury in a thermometer can help indicate something about an infectious disease by virtue of its causal connection to a patient's temperature. Investigators can use spatial configurations in photographs of tracks caused by particles traveling in a bubble chamber to study subatomic physical phenomena.
2 Edouard Machery pointed out to me that data used to study regularity phenomena reflect the influence of causal factors that should not be confused with the regularities themselves. His example was test score data indicating an inverse correlation between male and female subjects' performances on certain cognitive tasks. The regularities don't interact with experimentalists or experimental equipment to cause data. Epistemically useful features of the data reflect the causal influence of cognitive and other factors that sustain the inverse correlation. The data used to calculate the melting point of lead reflect changes in heat and phase state change effects like vaporization rather than regularities among them.
Epistemically useful data can seldom be produced except by processes that mark them with features due to causal factors that are too numerous, too different in kind, and too irregular in behavior for any single theory to take into account. For example, the data Bernard Katz used to study neuronal signaling were influenced by causal factors peculiar to the makeup and operation of his instruments, the positions of electrodes used to stimulate and record electrical activity in neurons, physiological changes in nerves deteriorating during experimentation, not to mention causes as far removed from the phenomenon of interest as the vibrations that shook the equipment in response to the heavy tread of Katz's mentor, A. V. Hill, on the stairs outside the laboratory. Some factors influenced the data only as mutually interacting components of shifting assemblies of causal influences. Such multiplicity, heterogeneity, and complexity typically render data points immune to the kinds of explanation to which phenomena are susceptible.
2 Constructive and liberal empiricism

Woodward and I appealed to differences between data and phenomena to correct a number of mistaken ideas about scientific practice. One of them is the empiricist idea that (ignoring practical applications) what scientific theories are supposed to do is to "save the phenomena" in the sense of describing, predicting, retrodicting, and systematizing "observables".3 As Hempel put it,

Empirical science has two major objectives: to describe particular phenomena in the world of our experience and to establish general principles by which they can be explained and predicted. The explanatory and predictive principles of a scientific discipline …characterize general patterns and regularities to which… individual phenomena conform and by virtue of which their occurrence can be systematically anticipated. (Hempel 1970, p. 653)

For Hempel 'explaining' means exhibiting general regularities to which the explanandum conforms. He emphatically does not mean describing causally productive mechanisms, let alone imperceptible causal influences, that give rise to things we experience. Osiander's foreword to Copernicus's On the Revolutions is a locus classicus for this conception of science.4

Since he cannot in any way attain to the true causes…[the astronomer] will adopt whatever suppositions enable…[celestial] motions to be computed correctly from the principles of geometry for the future as well as the past…[These] hypotheses need not be true or even probable…[I]f they provide a calculus consistent with the observations, that alone is enough. (Osiander 1543/1992, p. XIX)

Duhem held that

3 Some of our other targets were the myth of theory-laden perceptual observations (Bogen 2009), and IRS, the logical empiricist view that scientific reasoning should be modeled in terms of Inferential Relations among Sentential structures (Bogen and Woodward 2005).
4 But see Barker and Goldstein (1998).
[a] physical theory is not a [causal] explanation. It is a system of mathematical propositions, deduced from a small number of principles, which aim to represent as simply, as completely, and as exactly as possible a set of experimental laws…

Duhem's experimental laws are simplified or idealized general descriptions of experimentally produced observable effects. Concerning the very nature of things, or the realities hidden under the phenomena [described by experimental laws]…a theory…tells us absolutely nothing, and does not claim to teach us anything. (Duhem 1991, p. 19)

That amounts to saying that theoreticians should limit their attention to regularities among data. Woodward and I hold to the contrary that for the most part, scientific theories are incapable of predicting or explaining data. They are required instead to answer questions about phenomena, many of which are unobservable in the sense that they do not register directly on human perceptual systems, or upon experimental equipment.

Van Fraassen's constructive empiricism is a familiar version of the idea we reject. It holds that in accepting a theory one should not commit oneself to anything more than the claim that it provides '…true account of what is observable'.5 In particular, it requires no commitment to claims whose truth presupposes, assumes or implies the existence of non-observable causes and causal processes. In this respect constructive empiricism agrees with Osiander's idea that scientists shouldn't claim to know hidden causes of observable things. But its conditions on observability rule out claims about a great many phenomena Duhem and Hempel think scientific theories should try to save. For Van Fraassen, to be observable is to be perceptible by the unaided senses of a properly situated human observer under favorable observation conditions. Thus distant moons too far away to see without a telescope are observable only because we could see them with our naked eyes if we were close enough to them under favorable observation conditions.6 By contrast, things humans can't see without electron microscopes or detect without Geiger counters under any conditions are not observable (Van Fraassen 1980, p. 16).

Van Fraassen's conception of observability brings constructive empiricism into conflict with Duhem's conception of science, along with real world scientific practice that departs from Duhem.7 To estimate the velocity of the nervous impulse (we call it the action potential), Helmholtz applied an electrical stimulus to one end of a nerve fiber whose other end was attached to a bit of muscle and recorded the time at which

5 As I understand it, Bas van Fraassen's present view (it is subject to the same objections as the view I sketched) is that what scientific theories should be required to explain are statistical patterns among data points (Van Fraassen 2006, pp. 31, 41).
6 Would conditions in outer space allow us to see a speck we'd glimpsed through a high powered telescope with our naked eyes if we got close enough? If it was near the surface of the sun, why wouldn't the dazzle blind us? Van Fraassen is silent about such cases.
7 Centering as they do on questions Van Fraassen raised about what we can see with microscopes (e.g., Hacking 1983, pp. 186–209), discussions of Van Fraassen's conditions on observability tend to ignore the tension between constructive empiricism and traditional ideas about saving the phenomena.
he saw the muscle contract. To produce a velocity estimate he also needed data on the temporal interval between the application of the stimulus and the beginning of the contraction. Because 'our senses are not capable of directly perceiving an individual moment of time with such a small duration,' he could not produce this data without recourse to 'artificial methods of observation' (Olesko and Holmes 1994, p. 84). By this Helmholtz meant using a perceptible effect on a piece of experimental equipment as a measure of an imperceptible quantity. To this end he set up a galvanometer so it would respond to electrical activity by deflecting a magnetized needle. He could see (observe non-artificially) the arc the needle moved through when he stimulated the nerve. Assuming that the length of the arc was proportional to the temporal duration of the motor impulse, Helmholtz could use the former to measure the latter (ibid).

Watching the dial to estimate a temporal duration is by no means analogous to observing a marking on a distant bird one sees through a telescope. The telescope enables the observer to visually experience the marking and attend to it consciously. Because the observer could see the marking without a telescope if the bird was close enough, Van Fraassen would call this an observation. But Helmholtz could not see the duration of the nervous impulse he measured; all he could see was the arc through which the needle moved. If Helmholtz had limited his investigations to things he could observe only in Van Fraassen's sense of that term, he could not have used artificial methods of observation to calculate the velocity of the nervous impulse. As this example illustrates, working scientists (Duhem, for one) are happy to talk about observation in connection with things that register on their equipment rather than their senses.

Those who believe the goal of science is to describe, predict, and systematize observations can improve their chances of doing justice to research in the neurosciences, molecular biology, particle physics, and elsewhere by rejecting constructive empiricism in favor of what I'll call liberal empiricism. Liberal empiricism holds that scientists should commit themselves only to the truth of claims about things that register on their senses or experimental equipment. This fits real world science better than constructive empiricism. But high energy physicists, microbiologists, and scientists in other fields commit themselves to claims about phenomena like weak neutral current interactions that do not even meet liberal empiricism's conditions for observability.8
3 Scientists often try to do more than empiricists think they should

Here are two examples of scientists who succeeded in saving the phenomena in the sense of computing, describing, and predicting instances of regularities among the quantities they studied, apologized for not having accomplished more, and exhorted others to join them in looking for hidden causal mechanisms that sustain the regularities they saved.
8 See Bogen and Woodward (1988, p. 328 ff). Further examples are discussed in Bogen and Woodward (2005). Counterfactually minded empiricists might extend the notion of observability to things that would register on instruments that are yet to be developed. But it would be hard to distinguish their position from the realism that traditional empiricism rejects.
3.1 The Hodgkin–Huxley (HH) current equation

The action potential HH studied is a wave of electrical activity that travels at uniform velocity and amplitude down a neuronal axon away from the cell body. It consists of a series of electrical currents carried by charged ions moving in and out of the axon membrane at one place after another toward the synapse. The ion flows are promoted and damped as local membrane permeability changes in response to changes in membrane potential.9 The HH total current equation

I = C_M dV/dt + ḡ_K n^4 (V − V_K) + ḡ_Na m^3 h (V − V_Na) + ḡ_l (V − V_l)    (1)

provides a unifying description, and predicts instances, of regularities among electrical quantities involved in the action potential. In particular, it describes the cross-membrane current I as the sum of a capacity current, C_M dV/dt, and three cross-membrane ion currents: a potassium current, ḡ_K n^4 (V − V_K), a sodium current, ḡ_Na m^3 h (V − V_Na), and a "leakage current", ḡ_l (V − V_l), due largely to Cl− flow. Some of the variables and constants represent unobservable physical quantities. ḡ_K, ḡ_Na, and ḡ_l symbolize maximal membrane conductances for the potassium current, the sodium current, and the leakage current, respectively, calculated from measurements of electrical activity (Hodgkin and Huxley 1952, pp. 505, 518). The values of V − V_K, V − V_Na, and V − V_l are differences between the actual membrane potential, V, and the membrane potentials (V_K, V_Na, V_l) at which there is no net potassium, sodium, or leakage ion flow. None of these quantities qualifies as observable in Van Fraassen's sense, but all of them are calculated from voltage data. By contrast, n, m, and h are weighting constants devised to bring the equation into line with actual values of the cross-membrane current. HH emphasize the availability of mathematically acceptable alternatives that could have delivered equally good quantitative descriptions of the ion currents. They chose their constants for their mathematical convenience (Hodgkin and Huxley 1952, pp. 505, 512).

To establish the relevance of their equation to the action potential, HH used it to derive predictions of some crucial features of action potential propagation (Hodgkin and Huxley 1952, pp. 518–528). The total current equation and the predictions they derived from it agreed to an acceptable approximation with an impressive variety of results obtained in a limited but impressive range of experiments and background conditions. A liberal empiricist would welcome these results, asking for nothing more in the way of improvements than a broadening of their range of application and a reduction of the discrepancies between predicted and experimentally detected electrical magnitudes like those HH discuss at Hodgkin and Huxley (1952, pp. 541–544). A Duhemian empiricist would hope the regularities the current equation describes could be shown to be instances of wider regularities. But HH wanted something that neither constructive nor liberal empiricism countenances. They express dissatisfaction over their failure to provide physical interpretations of their weighting constants, and their inability to explain the regularities they described (Hodgkin and Huxley 1952, p. 541).

9 Membrane potential is the difference in voltage between charges on the membrane's inner and outer surfaces.
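To make the bookkeeping in equation (1) concrete, here is a minimal numerical sketch in Python. The parameter values and the snapshot values of V, dV/dt, n, m, and h are illustrative assumptions (roughly the magnitudes reported in the 1952 papers, written in the modern sign convention), not a reconstruction of HH's fits.

```python
# Minimal numerical sketch of equation (1). The parameter values below are
# illustrative assumptions, not Hodgkin and Huxley's fitted quantities.

C_M  = 1.0     # membrane capacitance, uF/cm^2
g_K  = 36.0    # maximal potassium conductance, mS/cm^2
g_Na = 120.0   # maximal sodium conductance, mS/cm^2
g_l  = 0.3     # leakage conductance, mS/cm^2
V_K, V_Na, V_l = -77.0, 50.0, -54.4   # zero-net-flow (reversal) potentials, mV

def total_current(V, dVdt, n, m, h):
    """Cross-membrane current I (uA/cm^2) from equation (1): the capacity
    current plus the potassium, sodium, and leakage ion currents."""
    i_cap = C_M * dVdt                    # capacity current
    i_K   = g_K * n**4 * (V - V_K)        # potassium current
    i_Na  = g_Na * m**3 * h * (V - V_Na)  # sodium current
    i_l   = g_l * (V - V_l)               # leakage current
    return i_cap + i_K + i_Na + i_l

# Hypothetical snapshot during the rising phase of an action potential:
print(total_current(V=-40.0, dVdt=100.0, n=0.4, m=0.6, h=0.4))
```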
The following illustrates what they thought was missing. HH speculated that if there were some molecules that encouraged, and other molecules that discouraged, ion flow, they could explain the membrane sodium conductance as depending on and varying with

…the number of sites on the inside of the membrane which are occupied simultaneously by three activating molecules, but are not blocked by an inactivating molecule. (Hodgkin and Huxley 1952, p. 512)

Then 'm' in the current equation could be interpreted as representing the proportion of activating molecules on the inside of the membrane, and 'h' as representing the proportion of inactivating molecules on the outside of the membrane (ibid). HH warn that agreements between calculated and experimental results are not good reasons to believe their speculation is correct. The most they show is that the current equations deliver empirically acceptable descriptions

… of the time-course of the changes in permeability to sodium and potassium. An equally satisfactory description of the…data could no doubt have been achieved by equations of very different form. (Hodgkin and Huxley 1952, p. 541)

In particular, predictive success does not argue for the existence of the activating and blocking molecules they speculated about (ibid).

Over a decade later Kenneth Cole said that the greatest challenge the HH equations presented to investigators of the action potential

… was—and still is—the explanation of the only unknowns in the Hodgkin–Huxley formulation, the sodium and potassium conductances. These were described so precisely and were such adequate expressions of so much physiology that any theory which could explain them quantitatively would be sufficient—for physiology at least. Conversely, a theory which failed to account for these conductances seemed destined to be incomplete or wrong in some, probably important, aspect. Theory thus has a clear and shining goal. (Cole 1968, pp. 292–293)

The dominant experimental research undertaken in pursuit of this goal assumes that unobservable (though detectable) ions carry electrical currents by flowing through unobservable (though detectable) channels embedded in the neuronal membrane. Contrary to Osianderian empiricism, ions, ion channels, and their components were not posited to facilitate computations. They were understood to be real physical structures that respond to and exert causal influences. Bertil Hille and others worked to find out how the channels were gated, how ions moved through them, and what structural and functional features decided which ions could pass through which channels (Hille 2001, pp. 45–51, 539–602). Roderick MacKinnon and his associates crystallized potassium channels and applied spectrographic techniques to produce computer-generated images with which to investigate the structure of the channel and the spatial configurations of its parts at different stages of voltage gating (Jiang et al. 2003a, pp. 33–41, 2003b, pp. 42–48). This was difficult and delicate work. The investigators said they had to attach molecules to stabilize the channels, apply detergents to wash away surrounding lipids, introduce toxins to immobilize channel components during
crystallization, and bombard the crystals with X-rays to produce spectrographic data. It is unthinkable that anyone who didn't believe in ions and ion channels could take such pains to manipulate them. Neither they nor the X-rays they used are perceptible in any but a Pickwickian sense of that term. The crystallographic images are computer-generated from instructions constrained by inputs from the spectrographic equipment. It is a mistake to think the investigators could see ion channels in them in anything like the way one can see Rachel Maddow in a photograph or on TV. The channels MacKinnon visualized were supposed to be, and, as far as we know, actually are, real world gating mechanisms that support, rather than merely enable the computation of, the regularities HH described. This is to say that the Nobel prize MacKinnon won for his ion channel research couldn't very well have been awarded by a committee of constructive or liberal empiricists.

3.2 A. V. Hill's tiny rainstorms

Nerve fibers release what A. V. Hill called transmission heat at a steady rate during, and perhaps for a very short time after, action potential propagation.10 After that, heat is emitted at a steeply declining rate for around 2 s. This is the first, or A, phase of what Hill calls recovery heat release. The second, or B, phase is a longer-lasting emission of gradually declining recovery heat. Hill uses the expression 'Ae^(−at)' to describe the exponentially declining rate of A phase heat release, and 'Be^(−bt)' for phase B. In Hill's recovery heat equation,

y = A e^(−at) + B e^(−bt),    (2)

y is the rate of recovery heat emission beginning immediately after the completion of an action potential at time t_0. A is the magnitude of heat at t_0. B is its magnitude at the end of phase A. t is elapsed time. e^(−at) is the exponential change in the rate at which heat is released during phase A. e^(−bt) is the exponential change in phase B emission. a and b are uninterpreted constants chosen so that A/a differs from B/b as the total heat released during phase A differs from the total heat released during phase B (Hill 1932a, p. 155, 1932b, pp. 21ff, 54). Heat magnitudes and time courses calculated from Hill's equation agree acceptably well with experimental results.

There's no reason for an empiricist to care about whether constants can be given physical interpretations as long as they facilitate computations of the quantities of interest. But Hill was no more content than HH to use uninterpreted constants to save the regularity phenomenon he studied. In his Liversidge lecture he emphasized that although he had succeeded in describing recovery heat release, he had failed to explain it. To explain it he would have to discover whether a and b correspond to real factors, and if so, how they influence heat release. More importantly, he says, he would have to identify the relevant physiological process. Because A and B heat are released when the nerve is not spiking, he would also have to explain why resting nerves waste so much energy (Hill 1932b, p. 55).

10 His equipment wasn't sensitive enough for him to decide.
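A minimal sketch of equation (2) in Python follows. The values of A, B, a, and b are hypothetical placeholders chosen only to display the two-phase decline Hill describes; they are not Hill's fitted constants.

```python
import math

# Hypothetical constants for illustration only (not Hill's fitted values).
A, a = 10.0, 2.0   # phase A: initial rate and decay constant (per second)
B, b = 2.0, 0.1    # phase B: initial rate and decay constant (per second)

def recovery_heat_rate(t):
    """Rate of recovery heat emission y at time t (seconds) after the spike,
    per equation (2): y = A*e^(-a*t) + B*e^(-b*t)."""
    return A * math.exp(-a * t) + B * math.exp(-b * t)

# The steep A phase dominates for the first couple of seconds; the gradual
# B phase dominates afterwards. Integrating each exponential gives total
# heats of A/a and B/b for the two phases, the quantities Hill compares.
for t in (0.0, 0.5, 2.0, 10.0):
    print(t, round(recovery_heat_rate(t), 3))
```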
Hill approaches these questions by looking to analogous physiological phenomena. In doing its osmotic work, the kidney wastes an enormous amount of energy on heat emission (Hill 1932b, p. 55). Muscles waste a great deal of energy in recovering from contraction. In these and related physiological processes ions are moved through membranes until they are distributed as required for the relevant systems to continue to function (e.g., for muscle fibers to resume their contracting). Inefficiency seems to be the rule for the mechanisms that do this kind of work. The propagation of an action potential disrupts the distribution of ions and charges inside and outside of the nerve membrane. The nerve cannot fire again until the original distributions have been restored. Hill speculates that the restoration is achieved by an inefficient mechanism that wastes heat energy as it moves ions and charges back through the membrane (Hill 1932b, pp. 55–56). He exhorts chemists, physicists and engineers to join in the search for an explanation of the workings of the restorative mechanism and claims that a good explanation would promote our understanding of 'one of the most important facts in nature' (Hill 1932b, p. 57). He expects this research project to be as time consuming, difficult, and important as the equally non-empiricist investigation of the structure of the atom (ibid).

It's one thing to try to satisfy an empiricist by improving or replacing Hill's equation to provide for new and better computations of restorative heat emission. It's quite another to look for the causes that interested Hill. Hill did a lot of fiddling and adjusting to avoid experimental artifacts. The natural way to understand what he was doing and why his adjustments succeeded is to invoke causal facts Hill knew about. For example, Hill used a thermopile to produce an electrical current that varied with the temperature of a bit of nerve fiber11 (Hill 1932a, p. 116). To control the nerve's temperature and shield the recording equipment from ambient heat, Hill sealed them in the core of a double walled container. To see how the nerve behaved at 0 °C, he filled the space between the walls with crushed ice. For experiments at higher temperatures he used paraffin oil. To bring the inner temperature to within a desired range above 0 °C he pumped in air from the laboratory. For lower temperature experiments he used oxygen.

When the core was cooled slowly or when oxygen was introduced, the equipment sometimes recorded 'curious fluctuations' in nerve temperature. Hill supposed the fluctuations were artifacts of warming caused by little rainstorms in the chamber. The rainstorms occurred when the charge on the wire he used to stimulate the nerve initiated condensation in supersaturated air surrounding the thermopile. Assuming that the gases he used to manipulate the temperature inside the core contributed to supersaturation, he took pains to remove moisture from his oxygen and adjusted the temperature of the room to keep the air he pumped into the container from getting colder than the air in the core. As Hill expected, his tinkering eliminated the unwanted fluctuations (Hill 1932a, p. 119). Hill's tinkering was motivated by beliefs about the existence and influence of causal factors he could not observe. He believed the curious temperature fluctuations should

11 Current from the thermopile flows to a galvanometer rigged to a light-emitting mechanism. For purposes of amplification Hill arranged his equipment so that light emitted in response to input from the galvanometer fell on a photoelectric cell connected to a second galvanometer and light source. The resulting light signal was strong enough to produce useful photographic data.
be eliminated because they were caused by condensing moisture rather than changes in post-transmission heat. He was committed to beliefs about how heat from the nerve acted on his apparatus, how supersaturation could lead to artifactual temperature fluctuations, and how warming the air and drying the oxygen could prevent them. It's not obvious how we could explain what Hill was doing and why it worked while remaining agnostic about the causal factors he manipulated.

When Carl Craver made a similar point in discussion, John Carson objected that even if you can't make sense out of experimental practices unless you assume the experimenters believe in unobservable causes, it doesn't follow that you have to believe in them too. After all, we explain ancient medical practices as motivated by beliefs about humoral disease mechanisms that we don't share. Why must empiricists assent to all the beliefs they ascribe to the experimenters? The rest of this paper is devoted [a] to answering Carson's objection, and [b] to arguing that without commitment to the truth of claims about causal factors and processes that humans can't perceive and experimental equipment can't register, it is epistemically imprudent, if not incoherent, to believe in the regularity phenomena that empiricists think scientific theories are supposed to save.

4 What's wrong with constructive and liberal empiricism

The only argument I've been able to find for constructive empiricism that seems at all plausible comes from Van Fraassen. Van Fraassen says that a scientific claim is empirically adequate

…exactly if what it says about the observable things and events in the world is true. (Van Fraassen 1980, p. 12)

Assume that you can't or shouldn't commit yourself to the correctness of a theory unless you believe in the empirical adequacy of what it says and implies about observables. Because 'empirical adequacy coincides with truth' for every claim that 'is solely about what is observable', accepting a theory commits you to the truth of what it says and implies about them (Van Fraassen 1980, p. 72). Thus if you have acceptably good reasons to believe that what a theory says or implies about observables is empirically adequate, epistemic prudence condones commitment to just that much of the theory.12

But most scientific theories make general claims (including predictions) whose empirical adequacy depends upon things and events that have occurred but were not observed, are now occurring unobserved, and still others that won't be available for observation until later on. Gravitational theories are obvious examples. Most instances of the regularities they address have not yet been, and never will be, observed. Thus, as Van Fraassen acknowledges, one cannot commit oneself to the empirical adequacy of a general scientific theory without taking some epistemic risks (Van Fraassen 1980, p. 69). It would be absurd for a scientist to try to save regularities he didn't believe in. It would be equally absurd for an agnostic philosopher to maintain that scientists

12 For our purposes, it doesn't matter what counts as a sufficiently good reason as long as we reject
skepticism by setting an attainable threshold.
should compute, describe, predict, and unify descriptions of them. Thus liberal and constructive empiricists must acknowledge, e.g., that no matter how many empirical tests a gravitational theory passes, it will always be somewhat epistemically risky to accept it. The argument for constructive empiricism is that the risk becomes unacceptably high if one goes on to commit oneself to the claims the theory makes about things that cannot be observed. Therefore epistemic prudence is supposed to preclude commitments to anything the theory says or implies about causes the unaided senses can't perceive (Van Fraassen 1985). We can transform this into an equally plausible argument for liberal empiricism by relaxing Van Fraassen's conditions for observability and empirical adequacy to admit imperceptibles that register on experimental equipment.

Something like the same thing applies to Carson's question, because one can't commit oneself to the empirical adequacy of Hill's reasons for cooling his oxygen or warming his laboratory without committing oneself to the empirical adequacy of generalizations about unobserved as well as observed electric charges and condensation effects. Agnosticism about whether unobserved as well as observed charges condense water out of unobserved as well as observed supersaturated gases makes it epistemically imprudent to commit oneself to Hill's reasons for thinking the curious temperature fluctuations he recorded were condensation artifacts, or to his explanation of how he eliminated them.

But it is more epistemically imprudent than any empiricist can tolerate13 to commit yourself to the truth (nearness to truth, or acceptably high probability) of empirical generalizations that outstrip the available observational evidence without secure (i.e., truth conducive) inductive arguments to support them. For reasons that derive from John Norton's material theory of induction, I submit that constructive and liberal empiricists can't avoid such imprudence.

John Stuart Mill observed that even if all the swans you've seen are white, and even though all the humans you've seen have heads that grow above their shoulders, the belief that no living human's head grows under her shoulders is more credible than the belief that no swans are black (Mill 1967, p. 205). Mill was right. Genetic, developmental, metabolic, and environmental factors rule out such drastic anatomical variations while rendering black swans improbable but not impossible. Whether it's epistemically prudent for you to believe a generalization that outstrips the available empirical evidence, and how prudent it is for you to believe it, depends on what else you believe and how reasonable it is for you to believe it. The more you know about the relevant biological facts, the more prudent it is for you to believe there are no living humans with under-the-shoulder heads. Given enough background knowledge about the mechanisms that influence anatomical structure, it's imprudent for you not to believe that all human heads grow above the shoulders. But the most important thing Mill's example can teach us about empiricism is that it is not

13 Epistemic (as opposed to pragmatic) prudence requires one to take reasonable measures to commit oneself to what is true, approximately true, or acceptably probable, and avoid believing what is not. (Although this paper has no room to discuss it, it is worth asking whether, or under what circumstances, it is pragmatically prudent to commit oneself to the truth of a general claim in the absence of secure inductive support.)
epistemically prudent to accept the generalization unless you know something about the relevant biology. Maybe you don't need to know much, but you do need to know something.

I learned what I've just said about Mill from John Norton's account of why the inference (call it the bismuth induction) from

(a) Some samples of the element bismuth melt at 271 °C

to

(b) All samples of the element bismuth melt at 271 °C

is secure, even though the inference (call it the wax induction) from

(c) Some samples of wax melt at 91 °C

to

(d) All samples of wax melt at 91 °C

is infirm. Since both arguments have the same logical form and both are deductively invalid, there is no formal explanation for the difference. Norton's idea is that the bismuth induction is made secure (this means that if its premise is true its conclusion is 'most likely true')

…by a fact about chemical elements: their samples are generally uniform in their physical properties… (Norton 2003, p. 651)

The claim that this is a fact is an example of what Norton calls a material postulate. '…[I]f we know the physical properties of one sample of the element,' he says, the postulate gives us '…license to infer that other samples will most likely have the same properties' (ibid). If we were agnostic about the postulate and didn't know of anything else that might keep unobserved bismuth from melting at different temperatures than observed samples, it would be epistemically imprudent to believe (b) on the basis of (a). By contrast, the wax induction is infirm (i.e., the truth of its premise doesn't make its conclusion probable enough to merit acceptance) because wax samples differ from one another with regard to the chemical factors that determine their melting points.

Norton doesn't appeal to the same facts that I would in support of the bismuth induction. I wouldn't call the assumptions that make good inductive arguments truth conducive 'postulates' unless that term is understood broadly enough to include contingent, testable, empirical generalizations. But apart from that, Norton's point is eminently sensible. Since for every secure inductive argument there's an infirm argument of the same logical form, it must be facts rather than formal features that distinguish secure from infirm inductions. An inductive argument will be

…truth conducive as opposed to being merely pragmatically or instrumentally useful, as long as the material postulates are strong enough to support it … How each induction will be truth conducive will depend upon the material postulate. (Norton 2003, p. 651)

If truth conduciveness depends on the strength of the material postulate, strength must require truth, approximation to the truth, probability, or something of the kind. But the point I want to emphasize is that no matter how secure the bismuth induction may be, it is epistemically imprudent to accept it without good reasons to believe that it is supported by a true, approximately true, or acceptably probable material postulate. And as suggested above, the less reason you have to believe the material postulate, the less epistemically prudent it is for you to accept the induction.
Norton's example of a material postulate was a general assertion of brute fact. For someone who believes for epistemically good reasons that samples of the same chemical element share the same physical properties and that melting is one of them, it will be epistemically prudent to trust the bismuth induction and accept its conclusion. By the same token, it will be prudent to invoke the claim that bismuth melts at around 271 °C to explain an experimental practice, or what results when investigators follow it. It will not be epistemically prudent for someone who disbelieves or remains agnostic about the material postulate to accept the generalization or to use it to explain the experimental practice or its results.

Before we go on, it's worth digressing to note that a material postulate that is available to philosophers, historians, or sociologists in one historical setting, intellectual tradition, academic community, etc. may not be available to those in another. Nor need it have been available to the scientists whose work they try to understand or evaluate. For example, what is known now about heat and chemical bonding and bond breaking provides better established and stronger support for the bismuth induction than any of the material postulates that were available to 18th- or 19th-century chemists. This should come as no surprise: we can't use what we don't know about to help us avoid error.

Norton's material postulate may be more palatable to liberal (and maybe even to constructive) empiricists than an appeal to mechanisms of bonding and bond breaking. That's because he formulated it without explicit appeal to unobservable bonds and the components of the mechanisms that make and break them. But it's hard to see how Norton's postulate could be confirmed, and its application to the melting point of bismuth established without question begging, unless one appeals to facts about hidden causal factors.14 Thus it may be more prudent to accept the bismuth induction on the basis of facts about chemical bonding than by appeal to the generalization Norton proposed. Analogous considerations hold for the generalizations about condensation and supercooling that Hill relied on to eliminate artifacts.

The trouble with constructive and liberal empiricists is that their views about the goals of science assume that there are regularity phenomena for scientific theories to save, even though the empirical adequacy of claims about many regularities that extend beyond observed instances can only be established through inductive inferences whose truth conduciveness depends on causal factors that do not register on their senses or their equipment. By forbidding their adherents to commit themselves to claims about things that are unobservable in this sense, constructive and liberal empiricists forbid them from accepting the material postulates without which it is epistemically imprudent for them to believe in regularity phenomena.

5 Degrees of epistemic prudence

I am indebted to Ken Schaffner for pointing out that Snell's law seems to have been epistemically well established long before physicists had any way to decide whether

14 It should go without saying that the trivializing maneuver of stipulating that bismuth is not to be classified
as an element if its melting point turns out to be irregular does the bismuth induction no good.
refraction is a wave or a particle phenomenon, let alone what makes light change direction when it travels out of one medium and into another. If the security of an inductive inference depends on material posits and epistemically good reasons to accept them, how could it have been epistemically prudent for early physicists to accept Snell's law? The answer is that the physicists had reason to think that light must consist, or most probably consists, of waves or particles, and that the regularities Snell describes can be maintained by either a particle or a wave mechanism. Just how epistemically prudent it was for such physicists to believe Snell's law describes a genuine phenomenon would depend on how good those reasons were. A reasonable belief that there is a mechanism of some kind to do the job wouldn't warrant as much confidence as well-confirmed ideas about how the mechanism operates. A commitment to Snell's law based on weak evidence for claims about refraction mechanisms would be less prudent than a commitment based on good evidence for such claims. In short, epistemic prudence comes in degrees and depends upon a variety of considerations.

The example doesn't show that material postulates are unnecessary. All it shows is that epistemic prudence allows us to accept (perhaps without much confidence) the conclusions of inductive arguments that have less than perfect support. Thus although we should expect some commitments to Snell's law to have been epistemically imprudent, we should also expect that others were minimally prudent, others more prudent, and perhaps that still others were models of epistemic virtue. It would be epistemically vicious to accept the conclusion of an inductive generalization on the basis of an inference whose security rests on epistemic posits one believes to be false. To accept an induction on the basis of posits one neither believes nor disbelieves is only slightly less imprudent. My objection to constructive and liberal empiricism is that this is just what these views endorse.
6 Conclusion: how to be a good empiricist I’ve sketched examples to illustrate that neither constructive nor liberal empiricism accurately reflects the goals, attitudes, or beliefs of working scientists. More importantly, these empiricisms are as unsatisfactory normatively as they are descriptively. If what I’ve said about induction is correct, anyone who abides by liberal or constructive empiricist constraints must conclude that for all they know, Hodgkin’s, Huxley’s, Hill’s, and a great many other scientists’ best work may well have been a colossal waste of time and effort. It would be a waste of time to investigate or attempt to manipulate unobservable causal factors. And it would be a waste of time to try to save regularity phenomena in the absence of minimally adequate inductive arguments to show that the regularities obtain. Because inductive arguments typically depend for their security on facts about imperceptible causal factors that do not register on experimental equipment, it follows that constructive and liberal empiricisms are philosophies fit only for skeptics. An epistemically prudent empiricism much better suited to the philosophical, historical, and sociological investigations of scientific practice would allow commitment to claims about unobservable phenomena (including causal factors and processes and not just regularities) that can be legitimately inferred from perceptual and instrument-generated data. That kind of empiricism would countenance beliefs about solar
neutrinos even though they cannot interact with the senses as required for perception, or interact directly with equipment to move needles, expose photographic plates, initiate tracings, affect computer displays, make noises, and so on. By embracing an empiricism that tolerates inferences from data to unobservable phenomena, philosophers, historians, and sociologists can direct our attention away from sterile debates about such things as what is and what is not observable, what, if anything, we can see through a microscope, whether observational evidence consists of perceptual experiences or features of the things observers perceive, and so on. It would direct us instead toward what I think are much more interesting questions about how data are produced and interpreted, and about the causal processes and mechanisms that determine how safe it is to accept inductive generalizations. And it would direct our attention toward fruitful questions about the cognitive, contextual, and other factors that influence the degree to which it is epistemically prudent for investigators working in different laboratories in different historical and social settings to accept material posits and the inductive generalizations they support. Someone will want to object that what I think of as good empiricism isn’t empiricism at all. But even if it isn’t the empiricism of Locke, Hume, Mill, Duhem, and their followers, it incorporates what I think was traditional empiricism’s best idea. You don’t have to agree with traditional empiricist theories of meaning, theory testing, concept formation, causality, causal explanation, etc., to embrace the principle that the natural sciences should concern themselves exclusively with things to which observational evidence can provide reasonably secure epistemic access. To accept that, you needn’t follow Locke in holding that natural science is concerned only with things that can be constructed from perceptions. You don’t have to accept Duhem’s idea that scientists should devote themselves to producing, describing, and unifying general descriptions of experimental results without looking for causes. You don’t have to follow Hume and try to replace ideas about causal processes and causes with ideas about regularities and psychological expectations. To be good empiricists, philosophers would have to relax the constraints of constructive and liberal empiricism and acknowledge how much investigators can learn by producing, analyzing, and interpreting data. But that agrees perfectly well with the central empiricist requirement that scientists must concern themselves with things to which observational evidence gives them epistemic access.
References
Barker, P., & Goldstein, B. R. (1998). Realism and instrumentalism in sixteenth century astronomy: A reappraisal. Perspectives on Science, 6(3), 232–258.
Bogen, J. (2009). Theory and observation in science. In Stanford encyclopedia of philosophy. http://plato.stanford.edu/entries/science-theory-observation/
Bogen, J., & Woodward, J. (1988). Saving the phenomena. Philosophical Review, XCVII(3), 303–352.
Bogen, J., & Woodward, J. (2005). Evading the IRS. In M. R. Jones & N. Cartwright (Eds.), Correcting the model (pp. 233–269). Amsterdam: Rodopi.
Cole, K. S. (1968). Membranes, ions, and impulses. Berkeley: University of California Press.
Duhem, P. (1991). The aim and structure of physical theory (P. P. Wiener, Trans.). Princeton: Princeton University Press.
Hacking, I. (1983). Representing and intervening. Cambridge: Cambridge University Press.
Hempel, C. (1970). Fundamentals of concept formation in empirical science. In O. Neurath, R. Carnap, & C. Morris (Eds.), Foundations of the unity of science (Vol. 2, pp. 1–52). Chicago: University of Chicago Press.
Hill, A. V. (1932a). A closer analysis of the heat production of nerve. Proceedings of the Royal Society B, 111, 106–165.
Hill, A. V. (1932b). Chemical wave transmission in nerve. Cambridge: Cambridge University Press.
Hille, B. (2001). Ion channels of excitable membranes (3rd ed.). Sunderland: Sinauer Associates.
Hodgkin, A. L., & Huxley, A. F. (1952). A quantitative description of membrane current and its application to conduction and excitation in nerve. Journal of Physiology, 117, 500–544.
Jiang, Y., Lee, A., Chen, J., Ruta, V., Cadene, M., Chait, B. T., & MacKinnon, R. (2003a). X-ray structure of a voltage-dependent K+ channel. Nature, 423, 33–41.
Jiang, Y., Ruta, V., Chen, J., Lee, A., Cadene, M., & MacKinnon, R. (2003b). The principle of gating charge movement in a voltage-dependent K+ channel. Nature, 423, 342–348.
Mill, J. S. (1967). A system of logic ratiocinative and inductive. London: Spottiswoode, Ballantine, & Co.
Norton, J. (2003). A material theory of induction. Philosophy of Science, 70(4), 647–670.
Olesko, K. M., & Holmes, F. L. (1994). Experiment, quantification and discovery. In D. Cahan (Ed.), Hermann Helmholtz (pp. 50–108). Berkeley: University of California Press.
Osiander, A. (1543/1992). Foreword to Copernicus. In E. Rosen (Ed.), On the revolutions. Baltimore: Johns Hopkins University Press.
Van Fraassen, B. C. (1980). The scientific image. Oxford: Oxford University Press.
Van Fraassen, B. C. (1985). Empiricism in philosophy of science. In P. M. Churchland & C. A. Hooker (Eds.), Images of science. Chicago: University of Chicago Press.
Van Fraassen, B. C. (2006). Structure, its shadow and substance (penultimate draft). http://www.princeton.edu/~fraassen/abstract/StructureBvF.pdf
Williams, B. (1978). Descartes: The project of pure inquiry. London: Pelican.
Synthese (2011) 182:23–38 DOI 10.1007/s11229-009-9620-y
On the meaning and the epistemological relevance of the notion of a scientific phenomenon Jochen Apel
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 9 July 2009 © Springer Science+Business Media B.V. 2009
Abstract In this paper I offer an appraisal of James Bogen and James Woodward’s distinction between data and phenomena which pursues two objectives. First, I aim to clarify the notion of a scientific phenomenon. Such a clarification is required because despite its intuitive plausibility it is not exactly clear how Bogen and Woodward’s distinction has to be understood. I reject one common interpretation of the distinction, endorsed for example by James McAllister and Bruce Glymour, which identifies phenomena with patterns in data sets. Furthermore, I point out that other interpretations of Bogen and Woodward’s distinction do not specify the relationship between phenomena and theories in a satisfying manner. In order to avoid this problem I propose a contextual understanding of scientific phenomena according to which phenomena are states of affairs which play specific roles in scientific practice and to which we adopt a special epistemic attitude. Second, I evaluate the epistemological significance of Bogen and Woodward’s distinction with respect to the debate between scientific realists and constructive empiricists. Contrary to what Bogen and Woodward claim, I argue that the distinction does not provide a convincing argument against constructive empiricism. Keywords Data · Phenomena · Bogen · Woodward · Constructive empiricism · van Fraassen 1 Introduction In their seminal paper Saving the Phenomena James Bogen and James Woodward attack philosophical models of science according to which scientific theories
J. Apel (B) Department of Philosophy, University of Heidelberg, Schulgasse 6, Heidelberg, 69117, Germany e-mail:
[email protected]
predict and explain observable facts, which in turn provide evidence for or against those theories. They argue that a closer look at scientific practice reveals that one should better distinguish between data, which are the observed outcomes of measurements, and phenomena, which are explained and predicted by theories but have to be inferred from data. Bogen and Woodward take it to be a serious shortcoming of most philosophical accounts of science that they do not take this difference into account. In this paper I scrutinize how exactly the distinction between data and phenomena has to be understood and what its philosophical impact is. So, the aim of my discussion is twofold. First, I try to provide some conceptual clarification. I seek to answer the question what exactly is meant when something is described as a phenomenon. I argue that several ways of understanding the notion, which are endorsed in the philosophical literature, are not appropriate to capture its meaning. In particular there seems to be no unambiguous interpretation of the distinction between data and phenomena. Therefore, I will make a proposal how the notion of a scientific phenomenon and Bogen and Woodward’s distinction are understood best. En passant I will argue that in the light of this proposal an objection Bruce Glymour (2000) has raised against Bogen and Woodward turns out to be misguided. Second, I want to evaluate Bogen and Woodward’s claim that the non-consideration of the data–phenomena-distinction is a serious shortcoming of traditional philosophical accounts of science. If this thesis is correct Bogen and Woodward’s distinction should shed some new light on problems we encounter in philosophy of science. Of course, in a short paper like the one at hand I cannot generally assess whether it does so or not. Rather the second aim of the paper is more modest. I only try to show that Bogen and Woodward’s thesis does not hold in one particular case. I argue that the data– phenomena-distinction is of no philosophical importance for the debate between scientific realists and constructive empiricists. Bogen and Woodward attempt to show that this central issue in contemporary philosophy of science is a prime example where their distinction displays its philosophical value by providing an argument against constructive empiricism. I will argue to the contrary: the data–phenomena-distinction does not force us to take any different point of view within this important epistemological debate. My paper’s structure is as follows: In Sect. 2 I identify two conceptual features which are characteristic for the notion of a phenomenon in scientific parlance. Section 3 deals with the traditional empiricist view of scientific phenomena and James Bogen and James Woodward’s critique of it which is based on their distinction between data and phenomena. In Sect. 4 I argue that despite its intuitive plausibility it is not exactly clear how this distinction has to be understood. I reject one widespread interpretation of it which I call the pattern view of scientific phenomena. In Sect. 5 I point out that other interpretations of Bogen and Woodward’s distinction also suffer from obscurities. In order to avoid these unclarities I propose a contextual understanding of scientific phenomena. Finally, in Sect. 6 I evaluate the philosophical significance of Bogen and Woodward’s distinction with respect to van Fraassen’s constructive empiricism.
2 Conceptual features of the notion of a scientific phenomenon in scientific parlance The expression “phenomenon” is a common part of scientists’ vocabulary. Consider for example the following two statements taken from a recent issue of Nature. In the phenomenon known as the Aharonov-Bohm effect, magnetic forces seem to act on charged particles such as electrons—even though the particles do not cross any magnetic field lines. Is this evidence for electromagnetic forces that work in new and unsuspected ways? Tonomura and Nori (2008, p. 298) The presence of a liquid-gas transition was noted to be very remarkable because there are few, if any, other experimentally known instances in localized spin systems. […] No mechanism was known to account for this phenomenon, and our theory of magnetic monopoles fills this gap. Castelnovo et al. (2008, p. 44) I take these statements to be typical examples for scientists’ talk about phenomena.1 Therefore, it is possible to extract general characteristics of the notion from them: each statement points to a specific role, which phenomena play in scientific practice. Those roles can be stated as follows: (a) Phenomena are potential evidence for scientific theories (first quotation) and (b) phenomena are potential explananda of scientific theories (second quotation).2 These two roles provide the starting point for my further discussion. In the following two sections I examine two philosophical proposals for the explication of the notion of a scientific phenomenon. Those proposals spell out the notion by invoking epistemic criteria; that means they restrict its usage to entities about which we gain knowledge in specific ways. But as it will turn out, those proposals fail at specifying the notion in way that respects its yet identified characteristic features. Of course, such a line of argumentation presupposes that the criticized proposals are descriptive approaches, i.e., that these proposals are intended to analyze the notion of a phenomenon as it is used by scientists. Obviously, if those proposals were rather meant as stipulative approaches, which use the expression “phenomenon” as a technical term with a meaning stipulated exclusively for a certain philosophical theory, my argumentation would be pointless. For this reason I will have to point out in the next section why the discussed proposals and in particular Bogen and Woodward’s account of phenomena should be understood as attempts to illuminate what scientists mean when they describe something as a phenomenon. 1 That these statements really are such typical examples is of course an empirical hypothesis. But I think it is easy to persuade oneself from its correctness, e.g., by typing the word “phenomenon” into the online search of databases of scientific journals and reading through the hits. 2 One might wonder why I added the qualification “potential” in (a) and (b), but the reason for this is simple. In many cases scientists encounter a phenomenon while possessing neither a theory to explain it nor one for which the phenomenon provides evidence, but are rather in search for such theories. Nevertheless we want to say that those scientists encountered a phenomenon.
3 Traditional accounts of science and the distinction between data and phenomena The historical roots of the expression “phenomenon” as a part of the scientific vocabulary can be traced back to the ancient astronomical research program called ‘saving the phenomena’. Here the observed motions of the celestial bodies were denoted as “phenomena” (cf. Mittelstrass 1961). Ancient astronomical theories should account for these motions without violating the principles of Greek natural philosophy. A generalization of this conception of phenomena, which I call the traditional view of scientific phenomena, can also be found in contemporary philosophy of science. The traditional view identifies phenomena with the observable structures and processes of the world. Such a view is typically held by empiricists like Pierre Duhem (1969) or Bas van Fraassen (1980). Both claim that it is the aim of science ‘to save the phenomena’, thereby explicitly echoing the name of the ancient research program. Saving the phenomena means for van Fraassen that scientific theories should make true statements about the observable structures and processes of the world, but we cannot and need not know whether what our theories say about the unobservable parts of the world is true or not. So, according to the traditional view the criterion for being a phenomenon is an epistemic one, namely that we can acquire knowledge about it by observation. Closely connected with the traditional view is a conception of science according to which science can be modeled as consisting of two parts: an observational basis and a theoretical superstructure. The observational basis confirms theories and the theories in turn allow for explanation and prediction of observable phenomena. James Bogen and James Woodward (1988) challenge this dyadic picture by introducing the distinction between data and phenomena into the philosophical discourse (cf. also Bogen and Woodward 1992, 2003; Woodward 1989, 2000). According to Bogen and Woodward the phenomena of science are typically unobservable. Rather, we have to infer them from reliable scientific data. Scientific theories, in turn, do not explain and predict observed data but inferred phenomena. Bogen and Woodward characterize data and phenomena in the following ways: Data are perceivable and publicly accessible records of measurement outcomes. They are, as they put it, “idiosyncratic” (Bogen and Woodward 1988, p. 317), which means that they are bound to specific experimental contexts and typically do not occur outside of scientists’ laboratories. Furthermore, data result from complex causal interactions in such a way that it is usually not possible to construct theories which explain or predict their occurrence. Phenomena, in turn, are not idiosyncratic. They rather are “stable and general features of the world” (Woodward 1989, p. 393), for which our theories should account. The very same phenomenon can often be detected via different experimental methods, while those methods yield different kinds of data. Bogen and Woodward criticize traditional accounts of science for leaving out these important distinctions and plead for a modification of traditional conceptions of science: the dyad of theory and observation should be replaced by a triad of data, phenomena and theories. To exemplify the main idea let us briefly consider their well-known example of a phenomenon: the melting point of lead.
Scientists do not determine the melting point of lead by liquefying one sample of lead and observing an individual thermometer reading of the melting temperature. Instead they perform a series of measurements
and even when the measuring device functions properly and systematic errors can be excluded, the various individual thermometer readings (i.e., the data) in this measurement series will differ slightly from one another. In order to explain this, one assumes that the measurement results were determined both by the ‘true melting point’ and by numerous small and contingent ‘errors’, which cannot be adequately controlled and which often even remain unknown. If we can ensure that those errors are independent of one another, have the same variance and lead with equal probability to an increase or a decrease of the measured value, we are justified in claiming that the data are normally distributed and that the arithmetic mean is a good estimate of the ‘true value’. The establishment of the phenomenon that the melting point of lead is, say, 327.46 ± 0.04◦ C therefore involves various levels of data analysis and inference. Consequently, the melting point of lead is not something we observe at all; rather, this fact has to be inferred from observed data, mainly on the basis of statistical inference. At the end of Sect. 2 I introduced a distinction between descriptive and stipulative approaches and pointed out that I take Bogen and Woodward to be pursuing the former type of account. Having introduced Bogen and Woodward’s distinction in this section, I am now in a position to make clear why I think that the framework they provide should be understood in this manner. First, the ways Bogen and Woodward characterize phenomena seem to be not only consistent with scientists’ usage of the notion, but even explicitly motivated by the fact that what scientists regard as phenomena often differs from how philosophers conceptualize scientific phenomena in their accounts. Second, Bogen and Woodward do not explicitly attempt to give a definition of the term “phenomenon”, which is something one would expect if the expression were used as a stipulated technical term. Third, the descriptive understanding fits general developments in philosophy of science. Nowadays descriptive approaches are regarded as more successful than stipulative ones. Just think of the well-known debate about the nature of theories. There we try to find out what theories are in particular by investigating which kinds of objects scientists describe as such. We regard it, for example, as a shortcoming of the syntactic view that it does not apply to things to which scientists routinely refer as theories, e.g., Darwinian evolutionary theory. If we accept that such descriptive strategies are the most successful means of philosophical investigation, the principle of charity also suggests taking Bogen and Woodward as proponents of such a strategy. For these three reasons I take it that the data–phenomena-distinction is really meant to illuminate the concept of phenomena (and the concept of data as well) as we find it in scientific parlance. 4 The pattern view of scientific phenomena and the detection of extrasolar planets If the traditional identification of phenomena with observable structures and processes is misguided in the way Bogen and Woodward claim, how can we then specify what phenomena are? In the following I discuss one particular interpretation of the data–phenomena-distinction which one frequently finds in the philosophical literature. I will refer to this interpretation as the pattern view of scientific phenomena.
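Since the pattern view takes part of its motivation from cases like the melting point of lead, it may help to display the statistical step involved there explicitly. The following is only a schematic reconstruction under the textbook assumptions described above (independent errors of equal variance), not a description of any particular laboratory’s actual procedure:

x_i = \mu + \varepsilon_i \quad (i = 1, \dots, n), \qquad \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i ,

where \mu is the ‘true’ melting point and the \varepsilon_i are the contingent errors. Under these assumptions \hat{\mu} is an unbiased estimate of \mu, and its standard error \sigma/\sqrt{n} (with \sigma estimated from the scatter of the readings) supplies the reported uncertainty in a value such as 327.46 \pm 0.04\,^{\circ}\mathrm{C}. Nothing in this reconstruction requires more than the data and general statistical assumptions, which is exactly what makes the example congenial to the pattern view.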
This view is motivated on the one hand by the just mentioned melting point example and on the other hand by statements like the following from Woodward’s paper Data and Phenomena:
The problem of detecting a phenomenon is the problem of detecting a signal in this sea of noise, of identifying a relatively stable and invariant pattern of some simplicity and generality with recurrent features—a pattern which is not just an artifact of the particular detection techniques we employ or the local environment in which we operate. Woodward (1989, p. 396) Authors like James McAllister (1997) or Bruce Glymour (2000) take statements like this to assert that phenomena are or correspond to particular patterns in data sets. These patterns can typically be represented graphically, for example by drawing a smooth curve through a set of data points. Patterns correspond to phenomena in the sense that it is assumed that those patterns can be found not only in the examined samples but also in the whole population. For example, it is assumed that not only the experimentally analyzed lead samples exhibit the pattern that lead melts at 327.46 ± 0.04◦ C but every sample of lead will do so under appropriate circumstances. According to this view phenomena are what the statistician calls population statistics and the criterion for being a phenomenon which the pattern view invokes is therefore again an epistemic one, namely that knowledge about phenomena is gained exclusively by statistical interpretations of observable data.3 As aforementioned, one of the authors who endorse the pattern view interpretation of Bogen and Woodward’s distinction is Bruce Glymour. This in turn leads him to the conclusion that the whole distinction between data and phenomena is unnecessary and philosophically uninteresting. His objection against Bogen and Woodward is straightforward and runs as follows (cf. Glymour 2000, pp. 33, 34): If phenomena are population statistics, then Bogen and Woodward simply give a new name to a well-known distinction from statistics. Instead of talking about sample and population statistics they just use other words, namely “data” and “phenomena”. Therefore, the data–phenomena-distinction does not provide any new insights for philosophy of science and is for this reason nothing more than a superfluous and at worst misleading terminological reform. In the remainder of this section I argue that the pattern view interpretation of Bogen and Woodward is misguided. Phenomena cannot be identified with patterns in data sets. I will elaborate on this by considering the first discovery of extrasolar planets. My thesis is that the existence of extrasolar planets should also be regarded as a scientific phenomenon although this phenomenon does not correspond to a pattern in a data set. En passant these considerations lead to a rejection of Glymour’s objection because his critique presupposes the pattern view interpretation. Extrasolar planets or, for short, exoplanets are planets outside of our solar system. For a long time it was an open question whether or not there are planets orbiting stars other than our sun. Astronomers were interested in this question for different reasons. One reason is of course that the correct answer could give us a hint as to whether there 3 At this point the question arises whether Bogen and Woodward themselves also endorse the pattern view. As I said, some authors suggest this, but to me this does not seem to be correct. This can be seen for example from the fact that Bogen and Woodward also regard weak neutral currents as a phenomenon (cf. for example Bogen and Woodward 1988, p. 315). This obviously does not fit the pattern view.
But I do not want to discuss exegetic questions here. For me it is enough that there are authors that endorse the pattern view interpretation to make it expedient to examine how well this suggestion works.
possibly exists life in other parts of the universe. A further important reason was that the detection of extrasolar planets would provide (and as a matter of fact did provide) strong evidence for certain astronomical theories about the formation of stars and planets. According to those theories the formation of planets is a typical by-product of the formation of stars and therefore there should be numerous exoplanets out there (cf. for example Casoli and Encrenaz 2007, pp. 91–116). If we now ask how the distinction between data and phenomena can be applied to the first detection of an exoplanet, we have to consider the detection method by which the astronomers Michel Mayor and Didier Queloz discovered this planet. This method is called velocimetry and it depends on the measurement of the radial velocity of the star 51 Pegasi (cf. Casoli and Encrenaz 2007, pp. 22–30). Radial velocity is the component of a moving object’s velocity that points directly toward the observer (or directly away from her).4 If one measures the temporal development of the radial velocity of 51 Pegasi, one finds that it shows a periodic change over time. From this change astronomers infer the existence of an exoplanet, which causes the change through its gravitational interaction with the star. If one takes a look at the following figure from Mayor and Queloz’s original paper, one seems to be confronted with a paradigmatic case for the pattern view of phenomena (Fig. 1).
Fig. 1 Radial velocity change of 51 Pegasi (Mayor and Queloz 1995, p. 357)
4 To be more precise: one actually measures the Doppler shift of the light emitted by the star and infers the change in radial velocity from it. But because this intermediate step makes no difference to my argument, I skip it for the sake of simplicity.
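To make the data-to-pattern step concrete, the following is a minimal, purely illustrative sketch of how a periodic radial-velocity signal can be recovered from noisy measurements by least-squares fitting of a sinusoid. It is not Mayor and Queloz’s actual analysis pipeline, and the period, amplitude and noise level used here are stand-in values chosen only for illustration:

import numpy as np
from scipy.optimize import curve_fit

def radial_velocity(t, K, P, phi, v0):
    # Simple circular-orbit model: semi-amplitude K (m/s), period P (days),
    # phase phi (rad), constant velocity offset v0 (m/s).
    return v0 + K * np.sin(2 * np.pi * t / P + phi)

# Synthetic "observations": a 4.2-day, 60 m/s signal plus Gaussian noise.
rng = np.random.default_rng(0)
t_obs = np.sort(rng.uniform(0.0, 60.0, size=80))     # observation epochs in days
v_obs = radial_velocity(t_obs, 60.0, 4.2, 0.3, 0.0)
v_obs += rng.normal(0.0, 10.0, size=t_obs.size)      # measurement noise

# Least-squares fit of the sinusoid to the noisy data.
p0 = [50.0, 4.0, 0.0, 0.0]                           # initial parameter guess
popt, pcov = curve_fit(radial_velocity, t_obs, v_obs, p0=p0)
K_fit, P_fit, phi_fit, v0_fit = popt
print(f"fitted semi-amplitude = {K_fit:.1f} m/s, period = {P_fit:.2f} days")

On the pattern view, the recovered sinusoidal pattern would itself be the phenomenon; the discussion below argues that the phenomenon of interest is rather the planetary companion inferred as the cause of such a pattern.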
The figure displays a distribution of data points fitted by a sine curve. This curve represents the periodic change of 51 Pegasi’s radial velocity and according to the pattern view one should say that the only phenomenon Mayor and Queloz detected was this periodic change, because it is the only thing which corresponds to a pattern in a data set. Against this I argue that it was another phenomenon that Mayor and Queloz ultimately detected, namely the fact that 51 Pegasi is orbited by a planetary companion. Taking the existence of this exoplanet to be a phenomenon can be supported by the following reasons: First, this fact plays the roles which are characteristic for phenomena: it provides evidence for astronomical theories about planetary formation, and it is also this fact that can be explained by those theories as soon as they become accepted.5 Second, the change of 51 Pegasi’s radial velocity, in contrast to the existence of the exoplanet, does not provide strong evidence for astronomers’ theories about planetary formation, because this change could also be brought about by causes different from the gravitational influence of the extrasolar planet. For example, the very same pattern could have been caused by a pulsation of the star or by so-called spot rotation.6 Scientists are interested in evidence which provides good reasons for hypotheses, and the pattern alone cannot provide them, because in order to possess good reasons one has to rule out the mentioned rival hypotheses. But in doing so one does nothing other than establish the exoplanet hypothesis.7 Third, treating the existence of extrasolar planets as a phenomenon fits very well with our everyday usage of the expression “phenomenon”. Here phenomena are things that are surprising, noteworthy, sometimes exceptional and for which we would like to have an explanation—and in the end it was the existence of extrasolar planets that astronomers were really curious about, not the radial velocity pattern. Fourth—this point complements the third—for astronomers nothing hinged on the question of which of the different available methods was used to establish the exoplanet’s existence. Other detection methods would have been as appropriate as the one they actually applied, e.g., astrometrical or photometrical methods. But those methods would have brought about totally different kinds of data and consequently totally different kinds of patterns in data sets. I want to clarify, however, that I do not deny that the periodic change of 51 Pegasi’s radial velocity is a scientific phenomenon; I only argue that the pattern view is not universally valid. As the phenomenon of extrasolar planets reveals, to identify a phenomenon we often have to infer from patterns in data sets, such as the periodic change of the radial velocity, to the causes of these patterns, such as the orbiting of 5 As the following quote from a standard textbook on astrophysics illustrates, astronomers also refer
routinely to extrasolar planets or the fact that those exist as a phenomenon: “A proper understanding of phenomena like black holes, quasars and extrasolar planets requires that we understand all physics that underlies all of astrophysics.“ (Duric 2003, p. i). 6 These are not mere theoretical possibilities invented by philosophers, who argue for some kind of underdetermination thesis, but real-life possibilities seriously considered by astronomers (cf. Mayor and Queloz 1995, pp. 357–358). 7 This consideration points to certain insufficiencies of so-called positive relevance conceptions of evi-
dence. Arguments that scientific evidence is indeed something over and above positive relevance can be found in Achinstein (2001, pp. 5–10).
the exoplanet, which brings about this change. Therefore, I conclude that the notion of a scientific phenomenon cannot be adequately explicated by the pattern view. Furthermore, it follows from these considerations that Glymour’s way of arguing that the data–phenomena-distinction is superfluous is not convincing, because his critique relies solely on the misguided pattern view. It should be noted, however, that this rebuttal of Glymour’s objection is built on the assumption that Bogen and Woodward really subscribe to a descriptive approach as I described it in the previous sections, because only in this case does the deficiency of the pattern view interpretation inevitably undermine Glymour’s critique. By contrast, if Bogen and Woodward were instead pursuing a stipulative approach, Glymour might rightly call the relevance of their distinction into question. This gives a further, indirect reason to understand Bogen and Woodward as pursuing the former type of approach.
5 Theories and claims about phenomena: a contextual approach What we found out in the preceding sections is that analyses which tie the notion of a scientific phenomenon to our ways of epistemic access to the relevant entities are not convincing. As a matter of fact science often deals with phenomena which are neither directly observable nor statistically inferred patterns in data sets. In this last step I discuss a question which results from the fact that none of the proposed epistemic criteria proved as universally valid to specify what phenomena are. This question is how phenomena and claims about phenomena are related to theories. Which of the claims scientists make are descriptions of phenomena and which are theoretical accounts for those phenomena? In a first approximation the relation between theories and phenomena is something like this: According to the syntactic view presenting a theory is stating a few, maybe quite abstract axioms which form a deductive system. By feeding some additional information into this system for example certain boundary conditions we can derive theorems from these axioms. At least some of those theorems are then the parts of the theory which correspond to phenomena. On the rival semantic view, in turn, presenting a theory is presenting a family of (set theoretic) models. Those models or at least substructures of them correspond to phenomena. Thus, on both accounts phenomena correspond to substructures of the whole theoretical structure. But which substructures exactly correspond to phenomena? While discussing this question I will adopt the following manner of speaking: I will describe substructures which correspond to phenomena as claims about phenomena and the rest of the theoretical structure as theoretical claims. By adopting this terminology I do not want to imply that my considerations are only valid if one follows the syntactic view. Rather I think that they hold independently from which conception of theories one prefers. My mode of speaking is a mere terminological convention which I adopt because it corresponds to Bogen and Woodward’s parlance. I will argue that there is no clear-cut, context-independent criterion by which claims about phenomena and theoretical claims can be distinguished. Rather claims about phenomena are a subset of the theoretical claims to which we adopt special epistemic attitudes. This thesis constitutes a difference to the yet discussed conceptions
of scientific phenomena. According to those views, there is a clear-cut distinction between claims about phenomena and theoretical claims. A proponent of the traditional view simply draws the distinction this way: The claim “Coffee cups fall to the ground when they drop from the table.” is a claim about a phenomenon, because we can observe corresponding states of affairs. In turn, all claims in which unobservable entities figure have to be considered as theoretical, for example the claim “There is a gravitational force specified by Newton’s law of gravitation which makes the coffee cup fall”.8 An analogous distinction can be drawn for the pattern view with respect to claims about patterns in data sets. But if one does not want to stick to one of these problematic views it is no longer clear which claims of science belong to mere descriptions of phenomena and which belong to theory. Nevertheless, Bogen and Woodward also suggest that there is a clear-cut distinction between both, but they draw it on different grounds. According to them, only theoretical claims have the potential to provide scientific explanations of claims about phenomena, while claims about phenomena only figure as explananda but not as explanantia. Data, phenomena and theories can therefore be distinguished by the kinds of explanatory relations between them:
– Claims about data do not explain and are not explained. They are neither explananda nor explanantia.
– Claims about phenomena are explained by theories but do not explain claims about data. They are explananda but not explanantia.
– Theoretical claims explain claims about phenomena and can be explained by other theoretical claims. They are explanantia but can also be explananda.
The basic idea can again be illustrated by the melting point example. The melting point of lead can be explained by physical theory, be it thermodynamics or a theory about atomic bonding in solid-state bodies. But these theories and also the phenomenon of the melting point of lead itself cannot be invoked to explain the individual data points. They do not explain why we measured 326.8 ◦C on one occasion and 327.1 ◦C on another. But is the proposed distinguishing criterion, according to which claims about phenomena, in contrast to theoretical claims, are not used to provide scientific explanations, really tenable? Consider, for example, the phenomenon of extrasolar planets and a corresponding claim about this phenomenon:
P1: The star 51 Pegasi is orbited by a planetary companion.
This phenomenon is inferred as a cause of the periodic change of the radial velocity and therefore it can of course be invoked to explain this periodic change. For example, envisage a student of astronomy who has basic knowledge of physics and astrophysics but does not know why the radial velocity change occurs. It is obvious that you can explain the change to him by referring to the phenomenon that there is an exoplanet orbiting the star. Consequently, claims about phenomena can very well 8 This does not presuppose a distinction between observational and theoretical vocabulary. The only rel-
evant point is whether the sentence refers to observable or unobservable entities (cf. van Fraassen 1980, p. 15).
provide explanations for other phenomena and are in this respect on a par with theoretical claims.9 Furthermore, it is not even clear whether claims about phenomena really do not explain data or at least aspects of them. In order to discuss this, consider the following sentences:
P2: Lead melts (under such and such circumstances) at 327.46 ± 0.04◦ C.
X: In this experiment (consisting in a series of measurements of melting temperatures of lead samples with a certain measuring instrument) the arithmetic mean of the melting temperatures was 326.43◦ C.
X is a description of a certain aspect of experimental data because X is explicitly bound to an experimental context. Being bound to specific experimental contexts is a characteristic of data and not of phenomena. But contrary to Bogen and Woodward, I think that P2 can provide an explanation of X. This is so because explanations typically involve a contextually determined contrast-class and there are at least some contrast-classes with respect to which P2 is explanatory. If, e.g., one asks for an explanation of the fact that we found in this particular experiment a mean value of 326.43◦ C and not of, say, 370◦ C, one could respond to this question by stating P2 and adding that the experimental situation to which X refers approximately meets the circumstances specified in P2. One might want to call into question whether this is really an explanation, but such doubts can be allayed. Admittedly the given explanation is not a very deep one, but nevertheless it is explanatory at least in a minimal sense. Explanations are answers to why-questions; they are requests for certain pieces of information. And stating that every sample of lead has the dispositional property of melting at 327.46 ± 0.04◦ C under appropriate circumstances can provide the requested information in an appropriate context. Just think of other materials like paraffin (a certain type of wax), which do not have a single melting point. A comparison with this case makes plausible that the average melting behavior of certain lead samples, i.e., experimental data, can very well be explained by referring to the stable melting point of lead. If lead did not have this property, we would have measured different data, which most likely would not have been normally distributed. In this sense there is a pattern of counterfactual dependency between the statistical properties of the considered data set and the property of having a melting point of 327.46 ± 0.04◦ C. According to Woodward’s own account of explanation, explanatory relations are grounded exactly in such patterns of counterfactual dependency (cf. Woodward and Hitchcock 2003, p. 11). These considerations provide further reason to assume that claims about phenomena can, just like theoretical claims, figure as explanantia. Therefore, I conclude that the suggestion of a clear-cut, context-independent distinction between claims about phenomena and theoretical claims is untenable. Rather, theories seem to be nothing other than putative descriptions of phenomena. Our theory 9 Note that this is not an idiosyncrasy of the exoplanet case but holds true for many other phenomena, for
example for phenomena of atomic and subatomic physics. E.g., the phenomenon of weak neutral currents can be invoked in the very same manner to explain the central characteristics of bubble chamber trails.
of atomic bonding, for example, describes phenomena of atomic bonding and states relations between the different phenomena it describes. Why do we then sometimes refer to a certain state of affairs as a phenomenon while we are in other cases more reluctant to do so? There certainly are cases in which we are more likely to say that our description of a certain state of affairs is ‘just’ a theoretical hypothesis and not a claim about a phenomenon. This indicates that we regard certain states of affairs as phenomena if we adopt a special epistemic attitude towards these states of affairs. Denoting a state of affairs as a phenomenon expresses that (in addition to the ascription of the two roles identified in Sect. 2) one takes oneself to have good reasons for accepting a corresponding claim which describes this state of affairs. Phenomena are states of affairs which we accept as given in a certain context. And this is the case if the claim counts as highly probable given our experimental data and accepted background knowledge. If we talk about the phenomenon of exoplanets or the phenomenon of weak neutral currents, we accept the claims that there are exoplanets orbiting stars and that there are weak neutral currents in certain particle interactions. Consequently there is no difference in principle between theoretical claims and claims about phenomena; rather, there is just a contextually determined difference. To be more specific, I claim that one should distinguish between strongly accepted claims of science, which are claims about phenomena, and claims which are less strongly accepted. These claims one might call “theoretical hypotheses”. Claims about phenomena and theoretical hypotheses together make up the whole set of theoretical claims. A claim can be regarded as a theoretical hypothesis in one context and as a claim about a phenomenon in another, for example when a change in background theory or the acquisition of new experimental data raises (or lowers) our acceptance of the claim. That our epistemic attitudes determine whether something is a phenomenon squares well with the two roles of phenomena which were identified in Sect. 2: Something we accept strongly is of course better suited to provide evidence for further hypotheses than something we are not so sure of. An analogous point holds true for the explanandum role: we only ask for an explanation if we have good reasons for accepting the explanandum-sentence. Thus, the notion of a phenomenon is a contextual concept which can be characterized in the following manner:
(i) States of affairs can be regarded as phenomena only if they play certain roles in science: they have to be potential evidence for and potential explananda of further scientific claims.
(ii) Claims about such states of affairs are judged as claims about phenomena only if there are sufficiently good reasons to accept those claims.
There are no further restrictions that states of affairs must meet in order to qualify as phenomena. In particular, there is no restriction regarding our way of epistemic access to the relevant states of affairs. There are various ways in which we can acquire knowledge about phenomena.
Some phenomena can be directly observed, like the phenomenon that the sky is blue or that lead melts if you heat it strongly; some correspond to patterns in data sets, like the melting point of lead or the periodic change of the radial velocity of 51 Pegasi; and some we infer as causes of such patterns, like the phenomenon of extrasolar planets or the existence of weak neutral currents.
6 Data, phenomena and constructive empiricism In the last section of this paper I want to shift the focus from the analysis of the meaning of the expression “phenomenon” as it is used in scientific discourse to the question whether we can gain something for the philosophical meta-discourse about science from the distinction between data and phenomena. Here I will focus on epistemological issues related to the scientific realism debate and argue that at least in this particular field we do not gain new insights from the distinction. Bogen and Woodward by contrast claim that van Fraassen’s constructive empiricism, arguably the most prominent rival account to scientific realism in contemporary philosophy of science, is unable to accommodate the distinction between data and phenomena into its theoretical framework (cf. Bogen and Woodward 1988, pp. 349–352; Woodward 1989, pp. 450–452). At the heart of constructive empiricism lies the notion of empirical adequacy. The aim of science according to constructive empiricism is to find empirically adequate theories, i.e., theories which make true statements about the observable. But in contrast to the scientific realist the constructive empiricist holds the view that we should be agnostic with regard to the truth-value of scientific claims about the unobservable parts of the world. Bogen and Woodward object that the consideration of the distinction between data and phenomena renders van Fraassen’s construal of empirical adequacy implausible. They write: Empirical adequacy, as we understand it, means that a theory must “save” or “be adequate to” the phenomena, which for the most part are not observed, rather than the data which are observed. By contrast, van Fraassen requires that theories save or be adequate to what can be observed. This is tantamount to requiring that a theory must save the data—that an acceptable theory of molecular structure, in Nagel’s example [i.e., the melting point example; J.A.], must fit the observed scatter of thermometer readings, rather than the true melting point of lead which is inferred from these readings. We have argued at length that this is an unreasonable requirement to impose on any theory. It seems unlikely that van Fraassen could accept our notion of empirical adequacy without abandoning many of his most central claims. Bogen and Woodward (1988, p. 351) To my knowledge van Fraassen nowhere reacts to this challenge, but prima facie there seem to be two options he could take in order to maintain constructive empiricism. The first option would be to argue that the data–phenomena-distinction shows that the realm of the unobservable is larger than commonly expected. Despite what one intuitively thought Bogen and Woodward have shown that also phenomena like the melting point of lead are in fact unobservable and the constructive empiricist has to bite the bullet: She also has to be agnostic with regard to the truth-value of claims about those phenomena just as well as in the case of any claim in which unobservable entities figure. But taking this option cannot be satisfying for van Fraassen because he claims that his position gives an as adequate account of scientific practice as scientific realism. That means in particular that its determination of the aim of science should be coherent with actual scientific practice. Indeed his position is advocated by van Fraassen as
the “equilibrium point” between epistemic modesty and adequacy to science (Monton and van Fraassen 2003, p. 407; cf. also Berg-Hildebrand and Suhm 2006). But if it is the case that science is mostly concerned with unobservable phenomena and that there are virtually no observable phenomena at all, then this casts serious doubt on van Fraassen’s claim that it is the aim of science to find empirically adequate theories, if empirically adequate theories are understood as theories which make true statements about data. Hence, if van Fraassen really were committed to taking this option, the data–phenomena-distinction would indeed show that constructive empiricism does not meet the criterion of adequacy to science, which van Fraassen himself imposes. But, as I will argue in the remainder of this paper, there is also a second option, according to which van Fraassen is not committed to such a narrow understanding of empirical adequacy but can instead construe this notion in a way that includes patterns in data sets. The defense of van Fraassen’s position goes like this: Van Fraassen readily concedes that nearly every claim about a phenomenon involves not only pure observation but also some kind of inference. This can be seen from the fact that constructive empiricism contains a commitment to the truth about the observable and not just about the observed. This step already involves some observation-transcending inference and van Fraassen is willing to take the epistemic risk involved in this ampliative inference in order to meet the adequacy to science criterion. Analogously to these inferences from the observed to the observable, he can further concede that the adequacy to science criterion also requires the acceptance of inferences by which we move from data to patterns in data sets. Nevertheless he can still maintain that those inferences involve a lesser degree of epistemic risk than inferences to phenomena involving unobservable entities. All one needs to establish the melting point of lead are observable data about melting temperatures of observable lead samples, assumptions about the absence of systematic error, and statistical methods of data analysis.10 But those assumptions and mathematical inferences, so van Fraassen might argue, do not increase epistemic risk in the same way as theoretical inferences to the existence of unobservable entities like electrons do, because they do not involve any specific physical theory to license the observation-transcending step; in particular they do not involve any new ontological commitments (cf. van Fraassen and his coauthors in Ladyman et al. 1997, p. 317, for a similar line of argument). Contrary to Bogen and Woodward’s objection, such an argument enables him to stretch the notion of empirical adequacy so that it covers patterns in data sets while his position remains epistemically more modest than scientific realism. Furthermore, it seems as if van Fraassen already endorses such a broad understanding of empirical adequacy anyway when he writes: The whole point of having theoretical models is that they should fit the phenomena, that is, fit the models of the data. van Fraassen (1985, p. 271) Here he refers to phenomena explicitly as “models of data”, a term introduced by Patrick Suppes (1962). But those are nothing other than what were called patterns in 10 Remember that van Fraassen has no quarrels with the claim that our observational vocabulary, too, is
theory-infected (cf. van Fraassen 1980, p. 14).
data sets in this paper. Hence, the practice of saving the phenomena (in van Fraassen’s sense) might very well include patterns in data sets and at the same time exclude states of affairs in which unobservable entities figure. It is worth noting that, contrary to what Bogen and Woodward claim, such a broad understanding of empirical adequacy is still in accordance with its definition as truth in regard to the observable. That phenomena which correspond to patterns in data sets are often not observed but inferred by no means entails that those phenomena are unobservable. Rather, under appropriate circumstances, i.e., in the absence of all confounding factors, we would observe, e.g., lead samples melting exactly at their melting temperature. This case is perfectly analogous to the fact that dinosaurs are observable because we would observe dinosaurs under appropriate observation conditions regardless of whether we are de facto able to realize those conditions. In summary, nothing about the distinction between data and phenomena debars the empiricist from claiming that we should have different epistemic commitments with regard to different types of phenomena depending on whether the corresponding states of affairs involve observable or unobservable entities. There may be other reasons why there are no epistemologically significant differences between the two kinds of phenomena, but such reasons would be independent of the data–phenomena-distinction. 7 Concluding remarks My appraisal of Bogen and Woodward’s distinction between data and phenomena yielded two main results. The first result is a clarification of the notion of a scientific phenomenon. I argued that ways of understanding the notion which specified phenomena via the way we gain knowledge about them were not able to account for the characteristic features of the notion in scientific parlance. It is neither an essential property of phenomena to be observable, as the traditional view claims, nor is it that phenomena solely correspond to unobservable patterns statistically inferred from data sets, as the pattern view has it. A broader reading of Bogen and Woodward’s distinction which also allows for the inclusion of, e.g., the phenomenon of extrasolar planets seemed to provide a more promising attempt to capture the notion’s meaning but seemed at the same time to be connected with an unclear distinction between claims about phenomena and theoretical claims. In order to deal with this difficulty I provided an analysis which does not specify the notion by connecting it to a special way of epistemic access to phenomena but rather by the roles of scientific phenomena in scientific practice and by our epistemic attitudes towards claims about phenomena. The second result is an evaluation of the epistemological significance of Bogen and Woodward’s distinction for the debate between scientific realists and constructive empiricists. The upshot of this investigation is that the distinction does not provide any convincing arguments within this debate. Acknowledgments I am deeply indebted to Daniela Bailer-Jones for initiating my work on scientific phenomena and I thank various people who gave valuable comments on earlier drafts of this paper, especially Monika Dullstein, Pavel Radchenko, Stephan Hartmann and the participants of the conference
“Data–Phenomena-Theories” held in Heidelberg in 2008. My research is financially supported by the “Emmy-Noether-Programm der Deutschen Forschungsgemeinschaft”.
References
Achinstein, P. (2001). The book of evidence. Oxford: Oxford University Press.
Berg-Hildebrand, A., & Suhm, C. (2006). The hardships of an empiricist. In A. Berg-Hildebrand & C. Suhm (Eds.), Bas C. van Fraassen—The fortunes of empiricism (pp. 57–67). Frankfurt/Main: Ontos.
Bogen, J., & Woodward, J. (1988). Saving the phenomena. The Philosophical Review, 97, 303–352.
Bogen, J., & Woodward, J. (1992). Observations, theories and the evolution of the human spirit. Philosophy of Science, 59, 590–611.
Bogen, J., & Woodward, J. (2003). Evading the IRS. Poznan Studies in the Philosophy of Science and the Humanities, 20, 223–245.
Casoli, F., & Encrenaz, T. (2007). The new worlds: Extrasolar planets. Berlin: Springer.
Castelnovo, C., Moessner, R., & Sondhi, S. L. (2008). Magnetic monopoles in spin ice. Nature, 451, 42–45.
Duhem, P. (1969). To save the phenomena—An essay on the idea of physical theory from Plato to Galileo. Chicago: University of Chicago Press.
Duric, N. (2003). Advanced astrophysics. Cambridge: Cambridge University Press.
Glymour, B. (2000). Data and phenomena: A distinction reconsidered. Erkenntnis, 52, 29–37.
Ladyman, J., Douven, I., Horsten, L., & van Fraassen, B. C. (1997). A defence of van Fraassen’s critique of abductive inference: Reply to Psillos. The Philosophical Quarterly, 47, 305–321.
Mayor, M., & Queloz, D. (1995). A Jupiter-mass companion to a solar-type star. Nature, 378, 355–359.
McAllister, J. W. (1997). Phenomena and patterns in data sets. Erkenntnis, 47, 217–228.
Mittelstrass, J. (1961). Die Rettung der Phänomene—Ursprung und Geschichte eines antiken Forschungsprogramms. Berlin: Walter de Gruyter & Co.
Monton, B., & van Fraassen, B. (2003). Constructive empiricism and modal nominalism. British Journal for the Philosophy of Science, 54, 405–422.
Suppes, P. (1962). Models of data. In E. Nagel, P. Suppes, & A. Tarski (Eds.), Logic, methodology, and philosophy of science—Proceedings of the 1960 international congress (pp. 252–261). Stanford: Stanford University Press.
Tonomura, A., & Nori, F. (2008). Disturbance without the force. Nature, 452, 298–299.
van Fraassen, B. (1980). The scientific image. Oxford: Clarendon Press.
van Fraassen, B. (1985). Empiricism in the philosophy of science. In P. M. Churchland & C. A. Hooker (Eds.), Images of science (pp. 245–368). Chicago: University of Chicago Press.
Woodward, J. (1989). Data and phenomena. Synthese, 79, 393–472.
Woodward, J. (2000). Data, phenomena, and reliability. Philosophy of Science, 67, S163–S179.
Woodward, J., & Hitchcock, C. (2003). Explanatory generalizations, Part I: A counterfactual account. Nous, 37, 1–24.
Synthese (2011) 182:39–55 DOI 10.1007/s11229-009-9615-8
Bogen and Woodward’s data-phenomena distinction, forms of theory-ladenness, and the reliability of data Samuel Schindler
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 12 August 2009 © Springer Science+Business Media B.V. 2009
Abstract Some twenty years ago, Bogen and Woodward challenged one of the fundamental assumptions of the received view, namely the theory-observation dichotomy, and argued for the introduction of the further category of scientific phenomena. The latter, Bogen and Woodward stressed, are usually unobservable and inferred from what is indeed observable, namely scientific data. Crucially, Bogen and Woodward claimed that theories predict and explain phenomena, but not data. But then, of course, the thesis of theory-ladenness, which has it that our observations are influenced by the theories we hold, cannot apply. On the basis of two case studies, I want to show that this consequence of Bogen and Woodward's account is rather unrealistic. More importantly, I also object to Bogen and Woodward's view that the reliability of data, which constitutes the precondition for data-to-phenomena inferences, can be secured without the theory one seeks to test. The case studies I revisit have figured heavily in the publications of Bogen and Woodward and others: the discovery of weak neutral currents and the discovery of the zebra pattern of magnetic anomalies. I show that, in the latter case, data can be ignored if they appear to be irrelevant from a particular theoretical perspective (TLI) and that, in the former case, the tested theory can be critical for the assessment of the reliability of the data (TLA). I argue that both TLI and TLA are much stronger senses of theory-ladenness than the classical thesis and that neither TLI nor TLA can be accommodated within Bogen and Woodward's account.
Keywords Data · Phenomena · Bogen and Woodward · Reliability · Theory-ladenness
S. Schindler (B) Department of Philosophy, University of Konstanz, 78457, Konstanz, Germany e-mail:
[email protected]
1 Introduction
In 1988 Bogen and Woodward introduced a 'third level' into the classical dichotomy of theory and observations: scientific phenomena. Unlike the notion of phenomena in the empiricist tradition of 'saving the phenomena' defended by van Fraassen (1980) in particular, Bogen and Woodward's phenomena are usually unobservable. However, phenomena can be inferred from observable data, after the reliability of the latter has been secured by various experimental procedures involving statistical inference, data reduction, the exclusion of confounding factors, error and noise control and the like. As a result of these inferences, phenomena are "stable" over different experimental contexts, whereas data typically lack this stability and are often highly idiosyncratic due to the causes peculiar to those contexts (Bogen and Woodward 1988, pp. 317, 319). Whereas we, for instance, have good reasons to believe that there is one true melting point of lead of 327.43 °C (i.e. the phenomenon), it may well be that none of the values of a particular set of thermometer readings will coincide with the true value. What is more, we may not even know the precise reason for the variation of those data points. The outcome of the latter depends on a multitude of factors, not all of which we will be able to fully control at all times:
The outcome of any given application of a thermometer to a lead sample depends not only on the melting point of lead, but also on its purity, on the workings of the thermometer, on the way in which it was applied and read, on interactions between the initial temperature of the thermometer and that of the sample, and a variety of other background conditions. (Bogen and Woodward 1988, p. 309)
Even though we often may be ignorant about exactly which causes were involved to which extent in the production of the data, we nevertheless may still have good reasons to believe in the existence of the phenomenon of the melting point of lead if we can infer it from the observed values.1 It is this inferred value, i.e. the phenomenon, which is the "thing-to-be-explained", not the data. In more complex situations than this "toy" example, say in the discovery of weak neutral currents, which has been cited rather extensively by Bogen and Woodward (see also Woodward 1989, 2000), the inference from the data to the phenomena, of course, is not as easy and straightforward as in the example of the melting point of lead. There, background effects, which mimicked neutral currents, had to be taken care of before neutral currents could be established as a genuine phenomenon (see below and Galison 1983; Pickering 1984). And yet, Bogen and Woodward claim that this case is analogous to the simple example of the melting point of lead. In both cases, phenomena and data possess the characteristics mentioned above and in both cases phenomena are somehow inferred from the observable data. Unfortunately, apart from the occasional (rather vague) reference to statistical techniques, Bogen and Woodward don't say much about
1 Glymour (2000) has pointed out that, in the example of the melting point of lead, what Bogen and Woodward call the phenomenon and the data seems merely to give other names to the well-defined statistical concepts of population and sample. For the sake of better illustration I shall ignore this point here.
the nature of those inferences.2 In any case, they stress that one must make sure that the inferential "base" (i.e. the data) is not flawed in order for one to carry out inferences to the phenomena:
[T]he question of whether data constitute reliable evidence for some phenomena turns (among other things) on such considerations as whether the data are replicable, whether various confounding factors and other sources of possible systematic error have been adequately controlled, on statistical arguments of various kinds, and on one's procedures for the analysis and reduction of data. (ibid., p. 327)
A point which Bogen and Woodward emphasise very strongly throughout their paper is that the reliability of the data can be secured without higher order theory explaining or predicting why particular data happened to be produced by one's experimental apparatus. On the contrary,
[…] the details of the operation of these causes will be both unknown and enormously complex, so that typically there will be no possibility of explaining why this or that individual data-point occurs. (ibid., p. 334)
Yet, Bogen and Woodward hold that we do not need to be able to explain the data in order to ensure their reliability. In particular, the theories we seek to test do not need to explain (or predict) the data. If the tested theories were involved in the procedures for ensuring data reliability, one of Bogen and Woodward's motives for introducing the data-phenomena distinction in the first place would not materialise: rebutting the thesis of theory-ladenness (cf. ibid., pp. 310, 342–347). According to the thesis of theory-ladenness, theories interfere on a very low level of cognition: our observations are significantly altered by the theories we seek to test with those same observations. This is usually seen as a threat to scientific objectivity because if the thesis of theory-ladenness were true, and if we were to hold different theories about the world, we would not be able to agree on what we actually observe. Observations thus could no longer serve as the arbiter between different theories. Bogen and Woodward's scheme rules out the thesis of theory-ladenness because it simply denies that there is any direct epistemological link between theories and observations. It is not observations but rather phenomena that theories predict and explain. Phenomena, in turn, cannot be subject to the thesis of theory-ladenness because they (a) are unobservable, and (b) are inferred from the data in an epistemologically unproblematic (if somewhat obscure) way. At this point, one or two comments about the thesis of theory-ladenness are in order. Although philosophers like Thomas S. Kuhn have often been ridiculed as questioning the reliability of our perceptual apparatus and the stability of our perceptions, this is simply not the point about theory-ladenness. For instance, van Fraassen in his Scientific Image asked us not to confuse "observing (an entity, such as a thing,
2 Bogen and Woodward explicitly say (p. 338) that they do not mean the well-known Inference to the Best Explanation (IBE), according to which one infers the truth of a hypothesis from the fact that it explains the data best (when compared to other hypotheses explaining the same data). Explanation Bogen and Woodward generally regard as being extraneous to data-to-phenomena reasoning (see below).
event, or process) and observing that (something or other is the case)" (van Fraassen 1980, p. 15; original emphasis). Van Fraassen illustrates this distinction by calling on a thought experiment in which a member of some isolated tribe is shown a tennis ball. From the behaviour of the tribe member, we can infer that she saw the tennis ball (she may throw it, chew it, etc.), but it would be going too far to claim that she has seen a tennis ball—because she does not possess the required concepts.
To say that [s]he does not see the same things and events as we do, however, is just silly; it is a pun which trades on the ambiguity between seeing and seeing that. (ibid.)
Even though van Fraassen's distinction between 'seeing' and 'seeing that' is a sound one as far as it goes, it is not at all obvious that it really weakens the problem of theory-ladenness. After all, it matters rather little for our day-to-day or scientific discourse and practices whether we have the same percepts when we (due to our differing knowledge) actually do see different things. This can be nicely illustrated with the following example from another important sense perception (for which the thesis of theory-ladenness should equally apply, if it is true). Whereas I won't hear much more than strange unfamiliar sounds when someone speaks Chinese, somebody with a knowledge of Chinese is bound to hear meaningful sentences when hearing the same utterances. Someone with a good knowledge of Chinese will not even have to think about what she hears (in terms of grammatical rules). She will just hear meaningful sentences—despite the fact that the (phonetic) sounds the Chinese speaker utters are the same for her and for me! It is this influence of our knowledge/theories on our sensations (despite leaving our "physical" percepts intact) which the thesis of theory-ladenness refers to. Although Bogen and Woodward in their original paper rejected the thesis of theory-ladenness, Woodward has recently indicated that he accepts that at least some forms of theory-ladenness (if somewhat weaker than the form discussed above) do occur in scientific practice.3 According to one of these forms, the design and the conduct of experiments can be motivated by theories (call this the thesis of "theory-drivenness" of scientific practice).4 Such a claim is epistemologically rather unproblematic. Whether or not the conduct of particular experiments has been motivated by the theory which these experiments seek to test is irrelevant to the (logical) question of whether the data obtained in these experiments do or do not support the belief in particular phenomena, and whether the latter, in turn, do or do not confirm the theory's predictions. There is another form of theory-ladenness which Woodward is happy to accept. A theory may
3 In his talk at the 'Data-Phenomena-Theories' conference at the University of Heidelberg in Germany (11–13 September 2008), Woodward has tried to weaken this apparent contradiction with his earlier claims by mentioning that he and Bogen used the word "typically" too often in their joint paper of 1988 and that their data-phenomena distinction with all its characteristics was not meant to be a universal account of how science works.
4 I'm rather reluctant to refer to this as a form of theory-ladenness. Yet not only Woodward but also Heidelberger (2003) doesn't draw the distinction between theory-ladenness and theory-drivenness which I suggest here.
provide the vocabulary for interpreting and describing the data.5 Several questions arise here. How can the theory that explains and predicts the phenomena (call this T(p)), but not the data, provide the interpretational and descriptive framework for the latter (which presumably would require some sort of theory)? If an interpretation is some sort of explanation, doesn't the theoretical interpretation of data clash with Bogen and Woodward's claim that theories do not explain the data? Why should T(p) provide such a framework for the data in the first place? After all, T(p) only predicts and explains the phenomena, not the data, so why would an interpretation of the data be required? Lastly, given one interpretational framework for the phenomena and one for the data, what is their relationship supposed to be? These are questions, I think, that Bogen and Woodward need to clarify. Yet, in this paper, I want to focus rather on the question of whether Bogen and Woodward's scheme is really as descriptively adequate of scientific practice as they have claimed. In particular, I want to challenge Bogen and Woodward on their thesis that the reliability of data can be secured without the theory which predicts and explains the phenomena that are supported by those data. Take for instance "data reduction", one of the procedures which Bogen and Woodward quote in this context. Here, Bogen and Woodward have referred to the fact that scanning bubble chamber photographs for significant events is often carried out by theoretically naïve personnel or even machines:
The extent to which such methods of data reduction are independent of any concern with explanation is illustrated by the fact that the person or machine performing these tasks can carry them out without understanding either the theory which explains the interactions for which the photographs are evidence, or the physical principles by which the equipment works. (p. 333)
I don't think routines of data reduction, whose carrying out does not require much theoretical understanding, support Bogen and Woodward's claim that the reliability of data is secured without the theory whose predictions are tested against the data.6 Even though someone might have the relevant skills to follow a routine that has been given to her, this routine has to be given to her in the first place. It is hard to imagine that the working out of such a data reduction routine, which will involve the choice for and against particular data, could be done without any theoretical understanding of what one ought to see in the data (here: certain events on bubble chamber photographs), if the routine in question is supposed to produce any meaningful results. Rather than assuming the independence of experimental practice from theoretical reasoning when it comes to the reduction of data (and other forms of establishing the reliability of data, as I shall argue in the following), a thorough interplay between the two appears to be much more plausible.
5 See again Heidelberger (2003). 6 Hacking in fact uses a very similar example of theoretically naïve lab assistants scanning cloud chamber
photographs for tracks left by positron events (Hacking 1983, p. 179). See also Duhem (1962, p. 145).
2 Strong versions of theory-ladenness After the above clarifications about the traditional concept of theory-ladenness (which I think is correct), and the epistemologically unproblematic concept of theory-drivenness, consider even stronger versions of theory-ladenness: the principled neglect of data due to theoretical predispositions (TLI), and theoretical reasons for belief in the reality of phenomena, which can prove to be critical in the assessment of the reliability of the data and the eventual acceptance of this phenomenon as being real (TLA). In the following I shall substantiate these two stronger forms of theory-ladenness by considering two historical cases which—rather ironically—have been quoted in support of Bogen and Woodward’s account. Neither TLI nor TLA seems to be compatible with Bogen and Woodward’s account, because theories, according to Bogen and Woodward, simply have no business in establishing the reliability of the data. So even if Bogen and Woodward were to accept the traditional form of theory-ladenness discussed above (which their account doesn’t seem to permit without major modifications), I don’t think it can handle either TLI or TLA.
3 Case study I: the zebra pattern of magnetic anomalies The first case I am going to discuss is the discovery of the zebra pattern of magnetic anomalies, which has been discussed by Kaiser (1995) in order to defend Bogen and Woodward’s notion of data and phenomena.7 In the late 1950s and early 1960s it was found that the sea floor is not only positively but also negatively magnetised and that these magnetisations occur in stripes running in a north-south direction. Positive anomalies were conventionally shaded black thus giving the impression of a “zebra” pattern (see Fig. 1). This pattern received its currently accepted (and perhaps final) explanation in the so-called Vine–Matthews–Morley hypothesis (VMM). This hypothesis is essentially a combination of the hypothesis of geomagnetic field reversals (GFR) and the hypothesis of sea floor spreading (SFS). GFR is the idea that the poles of the earth’s magnetic field switch repeatedly throughout time so that the North Pole becomes the South Pole and vice versa. SFS is slightly more complicated. It is the hypothesis that sea floor is formed at oceanic ridges by hot mantle rock that cools down at the surface after being elevated by convection currents within the mantle. The sea floor, which has been formed thus, then is pushed towards the periphery by succeeding hot mantle rock and submerges beneath the continental plates where it causes earthquakes and eventually is turned again into magma by the high temperatures within the mantle (see Fig. 2). Magnetites within the molten rock, which is elevated from the mantle to the surface, align with the current geomagnetic field, and remain like that when the molten rock cools down. Thus, the past geomagnetic field is “fossilised”, as it were, and the striped pattern of Fig. 1 can be explained by an evolving sea floor whose material is magnetised according to past geomagnetic fields (see Fig. 3).
7 A discussion of Kaiser's view and many more details about this case can be found in Schindler (2007).
Fig. 1 The zebra pattern of positive and negative magnetic anomalies off the west coast of North America. Positive anomalies are shown in black
Fig. 2 Sea floor spreading
It is important to note that neither GFR nor SFS was accepted as a real phenomenon within the scientific community at the time (i.e. the early 1960s); both were regarded as rather speculative. This changed when VMM combined SFS and GFR in order to explain the zebra pattern. Before VMM, the discoverer of the zebra pattern, Ron Mason of the Scripps Institution of Oceanography in California, offered his own explanation of the pattern, which is highly interesting for the purposes of this paper. Mason reasoned that the positive anomalies were likely to be caused by slabs within the sea floor consisting of "material highly magnetic by comparison with adjacent formation and almost
Fig. 3 The Vine–Matthews–Morley hypothesis as a combination of geomagnetic field reversals and sea floor spreading
Fig. 4 A profile of the magnetic anomalies (both positive and negative) and three out of a multitude of possible shapes of positively magnetised slabs within the sea floor explaining the observed positive anomalies (but not the negative ones). Diagram adapted from Raff (1961, p. 154)
certainly a basic igneous rock" (Mason 1958, p. 328). These positively magnetised slabs could take various concrete forms, some of which are depicted in Fig. 4. Crucially, Mason did not even try to account for the negative magnetisations. Retrospectively, Mason explained this thus:
I could have kicked myself for not thinking of the idea [VMM hypothesis], particularly because, had I looked more carefully at our map, I would have realized that some of the seamounts might be reversely magnetized, and this might have headed my thoughts in the right direction. (Mason 2003, p. 41)
Yet, as it turns out, Mason's attempt to retrospectively explain why he ignored the negative anomalies can be shown to be implausible. First, Mason shaded only the positive anomalies of the map of magnetic anomalies, not the negative ones. He must
thus have been aware that there were negative anomalies. This is also clear from the description of the maps. For instance, Raff, Mason's co-worker, wrote in the Scientific American:
The Lineated Pattern […] is startlingly apparent when the positive anomalies are shown in grey and the negative anomalies in white. (Raff 1961, p. 147)
Note also that Raff spoke of a lineated pattern, rather than of an (alternating) zebra pattern. A perhaps even clearer statement of the same idea can be found in Mason's earliest publication on the magnetic anomalies:
This paper has been mainly concerned with the north-south anomalies because they are the most striking feature of the magnetic map […] (Mason 1958, p. 328; my emphasis)
Now, it needs to be emphasised that Mason and Raff's description and explanation of the lineated (rather than the "zebra") pattern cannot be easily dismissed as being just bad science. There is no record of anybody criticising their description and explanation of the pattern: their articles made it into the most respected peer-reviewed scholarly journals, and no alternative proposals are reported anywhere (until the VMM hypothesis in 1963, i.e. 5 years after Mason's first publication about the pattern). Even though one could of course simply question the proper judgment of the scientific community at the time, I think it is worth asking why Mason, Raff, and others neglected the negative anomalies. Elsewhere (Schindler 2007) I have argued in more detail that Mason and Raff didn't explain the whole pattern but just the subset of positive anomalies because they either weren't aware of the concept of SFS, or they didn't know how to apply it to the pattern. Whatever the case, nowhere in their publications did Mason and Raff mention SFS.8 Even though they were well aware of GFR and even tried to apply it to the pattern, GFR is not explanatory of the pattern if not combined with SFS, as VMM later did. But because Mason and Raff weren't aware of, or didn't know how to apply, SFS, they also couldn't recognize a corollary of SFS, namely transform faults, and its importance for the interpretation of the pattern (see Fig. 5, right). Thus, rather than realising that the off-set of the ridges within the pattern was original (which is the case in transform faults), Mason and Raff were assuming that these off-sets were caused by seismic activity along the entire fault line (this being a feature of transcurrent faults; Fig. 5, left):
The north-south magnetic lineation appeared throughout the area. In several places the pattern was sharply broken along an east-west line, so that the striations above and below it did not match. Although the significance of the lineation [not the alternation of stripes] was—and still is—a mystery, the meaning of the discontinuities was immediately clear. (Raff 1961, p. 148; my emphasis)
8 Mason (2003, p. 41) himself claims that he was aware of not only GFR but also SFS. But again, we must
be careful about scientists’ own recollections, as the above discussion shows.
Fig. 5 Two hypotheses about the relationships between fracture zones and ridges. In both A and B the stars indicate expected earthquake distributions and the arrows rock motion. (A) Transcurrent fault: it is assumed that the two ridges were once continuous across the fracture zone. (B) Transform fault: it is assumed that the offset of the two ridges is original and invariant. Rock motion is caused by sea-floor spreading away from the ridges. Notice that in B only the area between the two ridges is expected to exhibit seismic activity.
Mason and Raff thus tried to reconstruct what they thought was the original lineated pattern by aligning the off-set ridges.9 Here, of course, the positive anomalies were sufficient and the alternation between positive and negative anomalies of the pattern rather irrelevant. Hence, Mason and Raff's ignoring of the negative anomalies of the pattern can be explained by the fact that they did not possess or take into consideration the idea of SFS, which was rather speculative at the time. Although a case for the traditional concept of theory-ladenness can easily be made with this episode (perception of the pattern as lineation vs. perception of the pattern as alternation due to different theoretical beliefs), this case supports an even stronger version of theory-ladenness according to which data are bluntly ignored if they are found irrelevant from a particular theoretical standpoint (TLI).10 Even if Bogen and Woodward were to accept the traditional thesis of theory-ladenness, it is hard to see how their account could accommodate TLI. Bogen and Woodward hold that the reliability of the data can be established without the theory that one seeks to test. Now, as pointed out above, the reduction of data, according to Bogen and Woodward, is one of those procedures one applies when trying to achieve the reliability of the data independently of the theory to be tested. Nevertheless, in the case discussed in this section, the theory at stake clearly was involved in the reduction of data. In fact, it led to a principled neglect of some data in favour of others. Hence, Bogen and Woodward's scheme, quite apparently, is not descriptive of scientific practice here. One may be inclined to dismiss my example on the grounds that the theoretically motivated reduction of the data did not lead to a real phenomenon. Leaving aside that TLI can be observed in cases where this sort of data reduction did result in an actual phenomenon (see Schindler 2008), again, Mason & Raff's explanatory models were not attacked or criticised by the scientific community for their descriptions of a lineated
9 They realised however that the "displacements" had rather awkward features, actually characteristic of transform faults: "Horizontally displaced faults have always presented a puzzle: no distinctive topographic features mark their ends; the cracks simply stop" (Raff 1961, p. 148).
10 Kuhn has of course stipulated something like TLI in his Structure of Scientific Revolutions. Note, however, that in Kuhn's account the neglect of observations which do not fit theoretical expectations (i.e., so-called anomalies) occurs in the context of normal science and well-established paradigms. The case discussed here clearly does not fall into this category.
(rather than an alternating) pattern and for their wholesale neglect of negative anomalies. And if one wants to defend an account that is descriptive of actual scientific practice (and this is clearly what Bogen and Woodward seek to do11), one cannot simply discard such cases because, clearly, Mason & Raff's work was seen as good scientific practice and the lineation of anomalies was clearly perceived as a genuine phenomenon at the time. It therefore appears plausible to give theories a more active role in the assessment of data and in the discovery of the phenomena in scientific practice. This will become even more apparent in the other case I want to consider in the following.
4 Case study II: the discovery of weak neutral currents
The discovery of weak neutral currents (WNC) figures quite heavily in Bogen and Woodward's publications. It is cited as a prime example for their notions of data and phenomena. Yet, as I shall argue in the following, the discovery of WNC, when correctly construed, does not support their account. WNC are a form of weak interaction between subatomic particles, mediated by the so-called Z0 boson. In contrast to their charged current counterpart (mediated by a W+ or W− boson), WNC are characterised by their lack of muons (μ−) in neutrino-nucleon scattering experiments (see Fig. 6, right). Since only charged particles leave tracks in bubble and spark chambers (which are used to detect WNC) and since neutrinos (and Z0 bosons) are electrically neutral, the non-production of muons became the main identifier of WNC. This way of identifying WNC, however, is highly problematic. In spark chamber experiments, where the production and detection of muons is spatially separated, wide-angle muons could escape the detector.12 In bubble chamber experiments muons could get stuck in the shielding of the chamber. If one didn't take extra care in estimating these undetected but nevertheless present muons, one could end up counting as WNC events what in fact were merely charged current events. Moreover, in bubble chamber experiments, incoming neutrinos could knock off neutrons within the shielding, which in turn would propagate into the chamber where they would scatter off hadrons, thus emulating WNC events (Fig. 7). The latter came to be known as the neutron background, and it became the main challenge in bubble chamber experiments to estimate this neutron background correctly in order to discern the genuine WNC events from the charged current events that only "mimicked" them. In the history and philosophy of science literature, the discovery of WNC (with the pioneering work of Galison 1983) is not uncontroversial. Pickering (1984) has claimed that bubble and spark chamber experiments in the 1960s were already capable of producing WNC and that they actually did produce data which indicated
11 E.g.: "[…] we do not regard our discussion as an argument […] Our discussion is rather intended as an empirical description of various features of scientific practice that have been overlooked or misdescribed in the philosophical literature." (Bogen and Woodward 1988, p. 337; original emphasis)
12 On the other hand, with not enough space between generator and detector, hadrons could penetrate into the detector where they would be counted as muons. See Galison (1983), Pickering (1984), and Schindler (under review) for more details.
Fig. 6 Neutrino-nucleon scattering. a Charged current event, mediated by a W+ boson, which carries a positive charge from the reaction ν → μ− to the reaction n → p (where ν = neutrino, n = nucleon, p = proton, μ− = muon); b neutral current event, mediated by an electrically neutral Z0 boson (also characterised as a massive analogue of the photon). Neutral current events produced in scattering are characterised by the charge remaining the same for incoming and outcoming particles (upper and lower parts of the diagrams, respectively). From Pickering (1984)
Fig. 7 Neutron stars in the bubble chamber. This figure shows two forms of neutron stars triggered by a neutrino beam. Above: a neutrino hits a nucleus, producing a muon (μ−), hadrons, and a neutron (n). The neutron, again, hits another nucleus, producing even more hadrons, but without producing a muon. The event caused by the neutron can unproblematically be associated with the neutrino beam (and hence be identified as a charged current event), which is why these events are called "associated events" (AS) [also called neutron stars (n∗)]. All this happens within the visible chamber. Below: a neutron star is triggered in the invisible shielding, making the muon (μ−) undetectable. The latter gives the appearance of a neutral current event [non-associated or "background event" (B)]. The interaction length of the AS events within the chamber serves as the basis for estimating the number of unobservable B events by means of Monte Carlo programmes. Diagram from Haidt (2004)
the presence of WNC. The question then, of course, is: why were WNC not discovered in the 1960s but only in the 1970s? Pickering has argued that this was so because WNC were a "socially desirable phenomenon" (p. 109) in the 1970s, after theoretical physicists had come up with the Salam-Weinberg model, which postulated WNC. Pickering concluded that "particle physicists accepted the existence of
the neutral current because they could see how to ply their trade more profitably in a world in which the neutral current was real" (ibid., p. 87; my emphasis). Miller and Bullock (1994), however, have challenged Pickering's interpretation by denying Pickering's claim that the 1960s bubble chamber experiments were capable of producing a WNC "signal". As I have shown elsewhere (Schindler under review),13 Miller and Bullock's argument doesn't go through. On the other hand, I don't think one needs to fall back on any sociological arguments in order to make sense of the discovery. But then, how does one explain that the discovery was made in the 1970s and not in the 1960s? Perhaps one should take a closer look at how the eventual discovery was made in 1974. In this year, the HPWF and the CERN groups each published a paper, both of which are usually regarded as having established WNC as a genuine phenomenon. Here is how both groups, rather surprisingly for "discovery papers", conclude:14
A possible, but by no means unique, interpretation of this effect [muonless events] is the existence of a neutral weak current. (Benvenuti et al. 1974, p. 800; my emphasis)
It has to be emphasized that the neutral current hypothesis is not the only interpretation of the observed events. (Hasert et al. 1973, p. 20; added emphasis)
Likewise, a review article in Physics Today reads:
Although both groups [Gargamelle and HPWF] suggest that they may be seeing neutral currents, they also offer alternative explanations. And many experimenters are sceptical that either group has demonstrated the existence of neutral currents. (Lubkin 1973, p. 17)
In other words, even in the articles that (retrospectively) came to be seen as "discovery papers", quite obviously a fair amount of scepticism about the reality of WNC remained. But it gets even worse than that. As Perkins, a former researcher at CERN, has pointed out recently (see also Fig. 8):
It is interesting to note that the HPWF result is actually inconsistent with the Salam–Weinberg theory, while the Gargamelle result shows a value of R [i.e., NC/CC] that is only about two-thirds of the present-day value […]. The value deduced for sin²θW = 0.38 ± 0.009 has to be compared with the present value of 0.23. (Perkins 1997, p. 442)
In other words, the data of 1974 were not at all as compelling as we may think they should be (because they were either inaccurate or inconsistent with currently accepted values), given that we take it that WNC were discovered in that year by the two groups. Hence one may again wonder, with much more urgency than before, how the research
13 In this publication, I discuss Bogen and Woodward's characterisation of the discovery of WNC in terms of their notions of data and phenomena in more detail. There, I also have a few things to say about Mayo's construal of the discovery in terms of her statistical error account.
14 The first quote is by the HPWF group, the second by the CERN group. Note that the paper by the HPWF group was already submitted in August 1973 and that the CERN group published a short paper in 1973, which they elaborated in the article that appeared a year later. See Pickering (1984) for details.
Fig. 8 Comparison of the results of the main Neutral Current experiments. The figure shows the results obtained from Gargamelle and HPWF as compared to the predictions of the Salam–Weinberg model. From Perkins (1997)
community eventually became convinced of the reality of WNC in 1974? Since there wasn't any other evidence for WNC than that provided by the experiments of the HPWF and CERN groups, we can assume that the data themselves obviously were insufficient for making a discovery claim. So the reasons for the research community to accept existence claims about WNC must have been extra-empirical. Now, I have already said that I'm not willing to resort to any socio-economic arguments here. I simply think they are not necessary. I certainly don't think physicists are as opportunistic as Pickering suggests when he says that they welcomed WNC just because it allowed them to "ply their trade more profitably". Nor do I think one should draw any other relativist conclusions. Rather, I believe the reasons why physicists accepted WNC as a genuine phenomenon—despite remaining doubts about the reduction of the neutron background—have to be sought in the properties of the theory that postulated WNC. The Weinberg–Salam model,15 proposed in 1967 and shown to be renormalisable in 1972, unified theories of electromagnetic and weak interactions into a single theory. Since the Weinberg–Salam model required the existence of WNC, it provided, in the form of this unificatory benefit, good reasons for the belief in WNC. Of course, the Weinberg–Salam model, by itself, would have been far from sufficient for physicists to accept WNC as real. But, as we saw above, the experiments quite apparently were not sufficient either for a discovery claim. Nevertheless, the data combined with the theoretically motivated "need" for WNC did the trick! Although doubts about the conclusiveness of the experiments remained, these doubts were outweighed by the benefits the Weinberg–Salam model offered in terms of conceptually unifying electromagnetic and weak interactions.
15 Even though it is called a "model", given all its characteristics philosophers would probably prefer to call it a theory.
Contrary to the conclusions of this section, Bogen and Woodward in their publications assumed all along that a clear case for the existence of WNC could be made simply on the basis of the experimental evidence in 1974.16 And yet, the discovery of WNC is only a case supporting their account if one makes this (false) assumption. But quite obviously, the experimental procedures aiming at achieving the reliability of the data were not sufficient for establishing the reality of WNC. Rather, as I suggested here, it was the "tested" theory that provided good reasons (in the form of unificatory power) for accepting WNC as real. Nothing in Bogen and Woodward's account, however, allows for such a role.
5 Conclusion
In their well-known article "Saving the phenomena", Bogen and Woodward (1988) made four central claims: (i) phenomena in science, more often than not and contrary to traditional intuitions, are unobservable, (ii) the phenomena are inferred from observable data, (iii) the reliability of the data is secured on the basis of experimental and statistical means, (iv) theories do not predict or explain data but rather phenomena. As a corollary to (i) and (iv), Bogen and Woodward also rejected the thesis of theory-ladenness (roughly, what we see is influenced by the theories we hold). Although Bogen and Woodward did not mention any other forms of inference in their original article (making (ii) seem like a necessary condition for the inference of phenomena), they recently retreated on claim (ii) and conceded that their scheme should be complemented by other forms of inference, most notably by theory-driven inferences. This is largely unproblematic. Although one may be interested in how Bogen and Woodward would spell out those kinds of inferences, which they haven't even mentioned in their publications to date, their retreat on claim (ii) seems to leave their other claims and their mutual consistency intact. Woodward has furthermore partially backed down on claim (iv), conceding that data may well be described and interpreted in terms of theoretical vocabulary. As pointed out above, this leads to serious conceptual difficulties, which Bogen and Woodward would need to clarify. But regardless of this, there are much more serious questions to be raised about the descriptive adequacy of their account with respect to claim (iii) and its relation to claim (iv). Some of these questions arise from re-visiting two case studies which Bogen and Woodward and others have quoted in support of the descriptive adequacy of their account. The first case study concerns the discovery of the so-called zebra pattern of magnetic anomalies (Kaiser 1991, 1995) and the second case study concerns the discovery of weak neutral currents (Bogen and Woodward 1988; Woodward 1989, 2000). With respect to the first case study I showed that not only is the traditional thesis of theory-ladenness substantiated, but a case can also be made for an even stronger version of theory-ladenness according to which data are neglected if they appear irrelevant from a certain theoretical perspective (I called this TLI). In the second case, a positive property of the theory at stake, namely the unifying power of the Salam–Weinberg model, obviously played
16 Bogen and Woodward are in good company here. See Galison (1983, 1987). For the contrary view see Pickering (1984).
a critical role in the assessment of the reliability of the data (I called this TLA). Both case studies directly challenge claim (iii) and claim (iv) of Bogen and Woodward's account. Since Bogen and Woodward and others have depicted the first and the second case discussed here as paradigm examples for their distinction, I don't see how they could dismiss the cases as "untypical", "exceptions" or the like. On a last note, despite the grave disagreements with Bogen and Woodward that I articulated in this paper, I do agree with them that there is a distinction to be made between data and phenomena and that many important phenomena in science are unobservable. It is here that they have made a significant contribution to the philosophy of science.
Acknowledgements I would like to thank the Swedish Institute for financing a postdoctoral research fellowship at the Department of Philosophy at the University of Stockholm, during which I wrote this paper, and in particular Paul Needham for supporting my application for this fellowship. I also want to thank the organizers and the participants of the Data, Phenomena, Theories conference at the University of Heidelberg, where I presented this paper. I'm furthermore indebted to Steven French for comments on an earlier draft of this paper.
References
Benvenuti, A., et al. (1974). Observation of muonless neutrino-induced inelastic interactions. Physical Review Letters, 32(14), 800–803.
Bogen, J., & Woodward, J. (1988). Saving the phenomena. Philosophical Review, 97, 303–352.
Duhem, P. (1962). The aim and structure of physical theory (trans: Wiener, P. P., second French edition). Princeton: Princeton U.P.
Galison, P. (1983). How the first neutral-current experiments ended. Reviews of Modern Physics, 55, 477–509.
Galison, P. (1987). How experiments end. Chicago: University of Chicago Press.
Glymour, B. (2000). Data and phenomena: A distinction reconsidered. Erkenntnis, 52(1), 29–37.
Hacking, I. (1983). Representing and intervening. Cambridge: Cambridge University Press.
Haidt, D. (2004). The discovery of neutral currents. European Physical Journal C, 34, 25–31.
Hasert, F. J., et al. (1973). Observation of neutrino-like interactions without muon or electron in the Gargamelle neutrino experiment. Physics Letters B, 46, 138–140.
Heidelberger, M. (2003). Theory-ladenness and scientific instruments in experimentation. In H. Radder (Ed.), The philosophy of scientific experimentation (pp. 138–151). Pittsburgh, PA: University of Pittsburgh Press.
Kaiser, M. (1991). From rocks to graphs: The shaping of phenomena. Synthese, 89, 111–133.
Kaiser, M. (1995). The independence of scientific phenomena. In W. Herfel, W. Krajewski, I. Niiniluoto, & R. Wojcicki (Eds.), Theories and models in scientific process. Poznan Studies in the Philosophy of Science and the Humanities (Vol. 44, pp. 179–200). Amsterdam: Rodopi.
Lubkin, G. B. (1973). CERN and NAL groups claim evidence for neutral currents. Physics Today, 26, 17–19.
Mason, R. (2003). In N. Oreskes & H. LeGrand (Eds.), Plate tectonics: An insider's history of the modern theory of the earth. Boulder, CO: Westview Press.
Mason, R. G. (1958). A magnetic survey off the west coast of the United States between latitudes 32° and 36° N. and longitudes 121° and 128° W. Geophysical Journal of the Royal Astronomical Society, 1, 320–329.
Miller, A., & Bullock, W. (1994). Neutral currents and the history of scientific ideas. Studies in History and Philosophy of Science, 25(6), 895–931.
Perkins, D. (1997). Gargamelle and the discovery of neutral currents. In L. Hoddeson (Ed.), The rise of the standard model: Particle physics in the 1960s and 1970s. Cambridge: Cambridge University Press.
Pickering, A. (1984). Against putting the phenomena first. Studies in History and Philosophy of Science, 15, 85–114.
Raff, A. D. (1961). The magnetism of the ocean floor. Scientific American, 205(4), 320–329.
Schindler, S. (2007). Rehabilitating theory: The refusal of the bottom-up construction of scientific phenomena. Studies in History and Philosophy of Science, 38(1), 160–184.
Schindler, S. (2008). Model, theory and evidence in the discovery of the DNA structure. The British Journal for the Philosophy of Science, 59, 619–658.
Schindler, S. (under review). Real phenomena vs. background noise: The discovery of neutral currents. Studies in History and Philosophy of Modern Physics.
van Fraassen, B. (1980). The scientific image. Oxford: Clarendon.
Woodward, J. (1989). Data and phenomena. Synthese, 79, 393–472.
Woodward, J. (2000). Data, phenomena, and reliability. Philosophy of Science, 67 (Supplement: Proceedings of the 1998 Biennial Meetings of the Philosophy of Science Association, Part II: Symposia Papers), S163–S179.
Synthese (2011) 182:57–71 DOI 10.1007/s11229-009-9616-7
What exactly is stabilized when phenomena are stabilized? Uljana Feest
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 2 July 2009 © Springer Science+Business Media B.V. 2009
Abstract The last two decades have seen a rising interest in (a) the notion of a scientific phenomenon as distinct from theories and data, and (b) the intricacies of experimentally producing and stabilizing phenomena. This paper develops an analysis of the stabilization of phenomena that integrates two aspects that have largely been treated separately in the literature: one concerns the skills required for empirical work; the other concerns the strategies by which claims about phenomena are validated. I argue that in order to make sense of the process of stabilization, we need to distinguish between two types of phenomena: phenomena as patterns in the data ("surface regularities") and phenomena as underlying (or "hidden") regularities. I show that the epistemic relationships that data bear to each of these types of phenomena are different: Data patterns are instantiated by individual data, whereas underlying regularities are indicated by individual data, insofar as they instantiate a data pattern. Drawing on an example from memory research, I argue that neither of these two kinds of phenomenon can be stabilized in isolation. I conclude that what is stabilized when phenomena are stabilized is the fit between surface regularities and hidden regularities.
Keywords Data · Phenomena · Stabilization · Validation · Bogen/Woodward · Hacking · Epistemology of experimentation
1 Introduction
In the philosophy of science literature of the past 20-some years, the term "phenomenon" has been used by the proponents of two, partially overlapping, bodies of literature: One goes back to Bogen and Woodward's (1988) paper "Saving the Phenomena".
U. Feest (B) Institut für Philosophie, Wissenschaftstheorie, Wissenschafts- und Technikgeschichte, Technische Universität Berlin, Straße des 17. Juni 135, Sekr. H72, 10623 Berlin, Germany e-mail:
[email protected]
Phenomena”. In this article, the authors introduced the notion of phenomenon as distinct from theories and data, thereby questioning the traditional dichotomy between theory and observation as well as the evidential and explanatory relations usually thought to hold between them (Bogen and Woodward 1988). The other body of literature, mainly associated with the history and philosophy of experimentation, has begun to take a much more detailed look at the physical and cognitive intricacies of identifying and “stabilizing” phenomena in experimental contexts (see Hacking 1983, p. 239). While there are obvious affinities between the philosophical intuitions expressed by these two approaches, their respective insights and suggestions have not, to my knowledge, been explicitly brought to bear upon each other. In this article, I provide such an analysis. In doing so I show that a careful analysis of the notion of stabilization can be utilized to fine-tune the notion of phenomenon by differentiating between two kinds of phenomena that have—in my view—not been sufficiently distinguished in the literature: surface phenomena and hidden phenomena. I conclude by arguing that what is stabilized when phenomena are stabilized is the mutual fit between these two kinds of phenomena, and I emphasize the substantial conceptual work that is required in the process. I begin by providing some background for my usage of the key terms of this article: “phenomena” and “stabilization”, drawing an analytical distinction between two aspects of stabilization that have been emphasized by different contributors to the literature, respectively: a skill- aspect and a validational aspect (Sect. 2). It is then argued that a philosophically satisfactory account of stabilization has to integrate these two aspects. This claim is made plausible by refuting two arguments that might be put forward in favor of treating skill and validation separately (Sect. 3). I then go on to develop a sketch of a positive account of the stabilization of phenomena. This account draws on two crucial points developed in the previous section, i.e., (a) a distinction between two types of phenomena, and (b) an emphasis on the conceptual prerequisites that are needed to identify data points as instantiating or indicating phenomena (Sect. 4). This analysis is followed by a discussion of an example from memory research (Sect. 5). I conclude with some reflections on the scope of my analysis. 2 The key terms explained The ways in which the notion of phenomenon is used in the above-mentioned literature self-consciously positions itself vis-à-vis traditional philosophy in two ways: (1) In contrast to traditional epistemological and phenomenological usages, where phenomena are usually associated with fleeting sensory experiences, or with the irreducible subjectivity of such experiences, phenomena are here taken to be “relatively stable and general features of the world” (Woodward 1989, p. 393). Hacking in particular makes it quite clear that his usage of the term “phenomenon” is not informed by philosophical tradition, but rather follows that of physics, where phenomenology is a branch of research that attempts to provide detailed calculations and descriptions of a given object of research, especially in an experimental context (Hacking 1983, Chap. 13). While Bogen and Woodward are less explicit about this point, their usage is also clearly inspired by scientific, rather than philosophical, usage. 
This choice of terminology reflects their intention to do justice to scientific practice. (2) By using the
term “phenomenon” in the way they do, Bogen and Woodward, like Hacking, not only distance themselves from certain usages in traditional epistemology, but also express a critique of a widespread assumption within twentieth-century philosophy of science, according to which the notion of phenomenon is closely tied to an idea of unproblematic observability. Countering this assumption, Bogen and Woodward emphasize a distinction between that which is actually empirically registered by scientists (the data), and that which constitutes the objects of scientific research (the phenomena). Their claim is that scientists do not directly observe phenomena, but rather use data to gain evidence for claims about phenomena. Hacking is less concerned with the question of whether phenomena are observable in principle. His point rather is that even if we grant that phenomena are observable, it is not at all a trivial matter to actually observe them. He thereby draws attention to the fact that empirical work in the sciences is not a matter of passively registering facts about the world, but that a good deal of empirical work in the sciences consists on efforts to make phenomena observable.1 Prima facie it may appear that Bogen and Woodward (1988) on the one hand and Hacking (1983) on the other are directly contradicting each other, since Hacking appears to affirm what Bogen and Woodward want to deny, i.e., that phenomena are observable. However, as I hope to show in this paper, the two analyses are in fact compatible. In order to appreciate this, we have to make slight amendments to both accounts. First, Bogen and Woodward are right to emphasize that phenomena are not observable. Second, Hacking is right to draw attention to the intricacies of empirical work, though he is probably ill-advised to use the term “observation” in this context, especially since he, like Bogen and Woodward, wants to distance himself from much of the philosophical baggage that comes with the term. These pitfalls and potential misunderstandings can be avoided if we assert that what Hacking really means when talking about the complexities of making a phenomenon observable is what he, in other places, refers to as “stabilization”. It is therefore this concept that I want to turn to next. The notion of stabilization is found in some of the literature on scientific experimentation, though its scope is not restricted to experimental sciences, but also encompasses other forms of empirical work. Roughly, this notion is taken to refer to the processes whereby scientists (a) empirically identify a given phenomenon or entity, and (b) gradually come to agree that the phenomenon is indeed a stable and robust feature of the world, rather than being (say) an artifact of any particular experiment or instrument. Philosophers tend to think of this process as one that is rationally guided (or can at least be rationally reconstructed), whereas proponents of the social studies of science have focused more on other (e.g., social) mechanisms that contribute to scientific consensus formation. However, even within the philosophical community, it is generally taken for granted that the process of stabilization has two aspects: One has to do with the ability to make empirical results physically stable, i.e., to “make an experiment work”, 1 Of course there is a finer distinction to be made here, regarding the question of whether Hacking means
that certain phenomena literally did not exist before they were produced in the laboratory, or whether he merely means that the laboratory sciences make stable phenomena empirically accessible by producing data that instantiate them. As will become apparent, it is this latter reading that I endorse.
or make an instrument reliably reproduce particular results, where this clearly involves the ability to recognize that the experiment/instrument did in fact work. The other has to do with strategies of establishing that the phenomenon is genuine, or "robust". Versions of the first aspect were touched on by Kuhn (1962) and Polanyi (1966), with their emphasis on tacit knowledge in the use of scientific concepts. It is this aspect that Hacking has in mind when he talks about Caroline Herschel's remarkable ability to identify and classify planets, or about the ability of certain lab assistants to read X-ray pictures (Hacking 1983, Chap. 10). The focus here is on the question of how scientists (individually or collectively) come to converge in their delineation and classification of relevant phenomena. Versions of the second aspect of stabilization are treated in the literature on the calibration of instruments and the validation of claims about phenomena. Keywords here are "robustness" (e.g., Wimsatt 1981; Hacking 1981; Culp 1994, 1995) and the experimental strategies scientists draw on in establishing the validity of a finding. Such strategies include, amongst others, that of "multiple determination", i.e., the identification of a phenomenon by means of more than one instrument (e.g., Franklin 1993; Woodward 1989). In the following, the two aspects of stabilization outlined in the previous paragraph shall be referred to as the "skill"- and the "validation"-aspect of stabilization. Let me clarify the distinction by responding to a potential objection. The objection states that if the skill aspect of stabilization is connected to some form of tacit knowledge, then this implies that the results delivered by a person who possesses the skill would thereby automatically also be validated. In response I would like to suggest that the practical knowledge required to run an experiment or to use an instrument has to be distinguished from the kinds of arguments required to back up the claim that the results of the experiments or instrument readings indeed indicate the type of thing the scientist takes them to be. For example, while Caroline Herschel may have been better than anybody else at using the light telescope to identify and classify a particular kind of structure, it is still conceivable that this structure might have turned out to be an artifact of the instrument rather than indicating a heavenly body, let alone a planet. This would have become apparent later, when different kinds of telescopes were used to replicate her findings. In such a case we could still credit her with a particular kind of skill, while maintaining that her claim to have identified planets was not validated. In such a case we might also conclude that her skill has turned out to not be terribly relevant to any question of current scientific interest.
3 Two readings of the skill/validation distinction—and their problems
Having drawn an analytical distinction between two aspects of the stabilization of phenomena, let me point out that there are no attempts in the literature to provide an account of stabilization that integrates them. Instead, scholars have focused either on the one or the other aspect. The question I want to raise here is whether a philosophically satisfactory account of stabilization can afford to neglect one of these two aspects at the expense of the other, simply treating them as two processes that can be analyzed separately.
My thesis is that it cannot (i.e., the answer is “no”), and that an analysis of stabilization is needed that integrates the two aspects. My argument for this thesis comes in two parts, one positive and one negative. In Sects. 3.1 and 3.2 my focus will
be on the negative arguments. I proceed by discussing—and rejecting—two possible rationales for not integrating the two aspects of stabilization outlined above. In the process of discussing and criticizing these rationales, two insights will emerge that form the cornerstones of my positive account of the stabilization of phenomena. The two insights are that if we want to understand the ways in which phenomena are stabilized, we need (a) a more fine-grained analysis of the very notion of phenomenon, and (b) an appreciation of the central ability that underlies (and thereby integrates) both aspects of stabilization.
3.1 Skills versus validation: different aspects of stabilizing the same phenomena?
Of the two notions of stabilization of phenomena I have delineated, one focuses on the skills that enable scientists to empirically classify phenomena, whereas the other focuses on methodologies that enable them to validate claims about (previously classified) phenomena. One way of justifying separate philosophical treatments of those two issues might be to say that while each of those two aspects is involved in the stabilization of any given phenomenon, the distinction between them can be construed along the lines of that between discovery and justification, understood in a particular way, i.e., as a distinction between (a) the skills required for empirical work, and (b) the ways in which the results can be justified (or the justification can be rationally reconstructed). Of course, this is only one of several readings of the discovery/justification dichotomy, and it has not remained uncontested (e.g., Hoyningen-Huene 1987). Nonetheless, there is a tendency within philosophy of science to assume such a dividing line. Accordingly, the question of the craft aspect of experimentation is often (implicitly or explicitly) relegated to the realm of psychology or sociology of science, whereas philosophers prefer to focus on the rational criteria that can be appealed to in validating claims about the robustness of phenomena.
Of course, I do not mean to suggest that philosophers of science who have analyzed epistemic strategies used in experimental science accept the distinction between discovery and justification as just described. My point, rather, is that they might appeal to something like this distinction when explaining why they have neglected to provide philosophical analyses of the first type of stabilization in favor of the second type of stabilization. The underlying rationale would be that philosophers and sociologists/psychologists of science ask different kinds of questions about the stabilization of phenomena, and the results of the two complement each other rather than needing to be integrated. While this reasoning has a certain plausibility, it is worth taking a closer look at what exactly it claims. One possible interpretation, which I shall refer to as the “genetic” reading of the skill/validation distinction, holds that the skill- and validational aspects of stabilization reflect the actual order in which research proceeds, such that scientific research would first converge on a particular classification of a phenomenon and then proceed to show that the classification is robust. But this reading is suspect in several ways. Firstly, even if research really did proceed in this sequence, it would be hard to align it with the distinction between discovery and justification.
That is to say, it would be rather implausible to claim that the skillful design and execution of empirical work involves no elements of rational justification, and—conversely—it is not so clear that the application of validational strategies cannot constitute part of the discovery
process. Hence, an argument for a genetic reading of the skill/validation distinction that appeals to the distinction between discovery and justification is questionable. Secondly, the very premise of the genetic reading of the skill/validation distinction, i.e., that stabilization occurs in two consecutive stages, may be doubted. While it is perhaps the case that the skillful design and execution of empirical work about phenomena does not necessarily aim at the validation of claims about those phenomena, the converse is surely not true: any and all attempts to empirically validate claims about the stability of a given concept are going to draw on precisely the skills that—on the genetic reading of the skill/validation distinction—are supposed to precede it. It is impossible to empirically investigate the question of whether a given phenomenon is robust without empirically identifying instances of it. This means that at least the validational aspect of stabilization is not temporally distinct from the skill aspect of stabilization.
Now, this leaves open a second reading of the skill/validation distinction. I shall refer to it as the “systematic” reading. According to it, the practice of validation may well coincide with the skills required to engage in it, but it does not follow that they are not logically distinct.2 Therefore, it might be argued that a philosophical treatment of validation can take the skill aspect of stabilization for granted, leaving it to others to give a theoretical account of it. I would like to offer two critical responses to this argument, one weak and one strong. The weak argument grants that it is possible to give a satisfactory account of validation that exclusively focuses on reasoning strategies employed to demonstrate the robustness and stability of a previously classified phenomenon. But if our aim is to give an account of stabilization, we cannot restrict our attention to a purely logical evaluation of reasoning strategies, since stabilization is a process in time, which involves two aspects (skill and validation). An account of stabilization, therefore, needs to say something about how they work together. While I embrace this argument, I would like to go further and put forward a stronger thesis, which states that a distinct treatment of the skill and validational aspects of stabilization fails to provide an adequate analysis not only of stabilization, but also of validation.3
To provide some background for this thesis, let us return to the question of what precisely constitutes the “skill aspect” of stabilization. Above I have characterized it as the ability to empirically identify and classify phenomena, where this ability was further differentiated into (a) an element of physical craftsmanship (being able to make an experiment or instrument work) and (b) an element of cognitive judgment (being able to recognize that an experiment or instrument in fact works). I argue that this latter, conceptual, skill should take center stage in an account of stabilization that integrates notions of skill and validation, since it is not only required in the process of identifying potentially interesting empirical phenomena, but it also has to be drawn on in attempts to validate claims about those phenomena. As I will argue more fully
2 Indeed, I made a similar point above when I argued that the two aspects of stabilization are analytically
distinct. 3 I am here talking about the problem of providing a descriptively adequate account of validation. This is
to be distinguished from the normative question of what it would take to actually validate a given finding. My overall focus in this article is on descriptive questions, but we will return to the normative question in the last section of this paper.
in Sects. 4 and 5 below, however, it is not merely the case that conceptual skills are required to even start inquiring about the validity of specific claims about phenomena. The reverse is also true: the results of validational procedures will have an effect on our judgment of what are relevant conceptual skills. Hence, I suggest that the stabilization of phenomena, on the account provided here, has to be viewed as a process of mutual adjustment between notions of what are stable phenomena and notions of what are relevant conceptual skills.
3.2 Skills versus validation: stabilizing different kinds of phenomena?
There is a second way in which one might attempt to construe the distinction between the two notions of stabilization outlined at the outset, namely as applying to two different types of phenomena, where the first notion of phenomenon is equated with empirical data patterns that are either found in the world or created in the lab, whereas the second notion of phenomenon is more removed from particular regularities, i.e., is, in some sense, unobservable. To avoid the problematic language of “observable” and “unobservable” phenomena, I will refer to these two types of phenomena as “surface phenomena” and “hidden phenomena”. To stabilize the first type of phenomenon, then, would mean to identify some empirical data pattern and to provide evidence for its robustness. To stabilize the second type of phenomenon would mean to provide evidence for, and validate claims about, some regularity that is more removed from empirical data, though the evidence would presumably be provided by means of data.
It might be suggested that data patterns, especially when they are experimentally produced, are not really phenomena in Bogen and Woodward’s sense. I believe that this suggestion is wrong, and I would like to argue for this by way of a discussion of Ian Hacking’s notion of a phenomenon. According to Hacking: “To experiment is to create, produce, refine and stabilize phenomena” (Hacking 1983, p. 239). Phenomena, on this reading, are empirical regularities that can be (though are not necessarily) created in the lab. Hacking specifically talks about experimental effects, such as the Hall effect. In contrast, Bogen and Woodward downplay the importance of experimental production, suggesting instead that phenomena are general features in the world, as opposed to the idiosyncratic features of data production in experiments. This prompts them to argue that “[the] features which Hacking ascribes to phenomena are more characteristic of data” (Bogen and Woodward 1988, p. 306, footnote 6). Should we then conclude that when Hacking uses the term “phenomenon” he is really talking about data in Bogen and Woodward’s sense, since he is specifically interested in effects that are created under the idiosyncratic conditions of a given experiment? There are two reasons why I want to resist this conclusion. Firstly, some of the examples of phenomena presented by Bogen and Woodward are very similar to the kinds of experimental effects reported by Hacking (e.g., the chunking effect, to which I will return in Sect. 5 below). More importantly, Hacking’s own definition of the term “phenomena” as “noteworthy discernible regularities” (Hacking 1983, p. 225) suggests that he does not think (or at any rate should not think) that any one experiment literally creates a phenomenon, since the data that constitute the
outcome of any one individual experiment cannot display a regularity. Regularities, on my understanding, are events that occur regularly. Experiments create data. Data may be classified as instantiating patterns of what I have called surface phenomena. This way of paraphrasing Hacking’s notion of phenomenon has two interesting consequences. First, it implies the recognition that phenomena—even in the sense of “mere” empirical regularities—transcend the idiosyncratic features of any one experiment or data point. If we talk about the results of two different experiments in which the Hall effect was demonstrated, there is a sense in which we are talking about something that is very idiosyncratic to the specific conditions of the two experiments, respectively. But of course we can only compare them if we assume that both results are instances of the same empirical regularity. In other words, empirical regularities meet Bogen and Woodward’s criterion of phenomenonhood. Second, if individual experimental data can at best instantiate (rather than constitute) phenomena, the question arises: what is required to classify data as instantiating a particular phenomenon? It should be clear that we are here touching on an issue that is closely related to a topic that was discussed in the previous subsection, namely, the question of the conceptual skills required to identify a phenomenon. We will return to this in the next section.
Having argued that surface regularities and hidden regularities both qualify as phenomena, we can now return to the main question of this section, i.e., whether the skill and validational aspects of stabilization might be two modes of stabilization that apply to these two different kinds of phenomena, respectively. This suggestion is implausible in several ways. First, it suggests that claims about surface regularities do not need to be validated, since they are delivered to us as self-evident by means of our conceptual skills. Second, it makes the assumption that the validation of hidden regularities does not (or at least not to any significant extent) require conceptual skills. I would like to question both assumptions. The first assumption implies (a) that surface regularities are directly observable, and (b) that these observations are veridical. But, of course, direct observability is precisely what we denied by granting surface regularities (data patterns) the status of phenomena in the sense of Bogen and Woodward. Furthermore, the mere fact that a given empirical result is classified as instantiating a surface regularity does not mean that there really is a stable regularity. In other words, claims about the robustness of data patterns need to be validated, just as claims about hidden regularities need to be validated. Now, with respect to the second assumption, it is indeed hard to see at first glance how conceptual skills might be involved in the empirical identification and validation of hidden phenomena, since the latter are, after all, more removed from the realm of the empirical than are the surface phenomena. If hidden phenomena are inferred from data, as suggested by Bogen and Woodward, then conceptual skills do not seem to enter. However, we need to be careful here: while it is correct that the raw material of inferences from data to phenomena are individual data points, I find it less obvious that individual data points, as such, qualify as an inductive basis for such inferences. They have to be classified in a particular way,
which in turn will determine the kinds of inferences that are made from them.4 I would like to suggest that individual data points are treated as providing an inductive basis for inferences from data to phenomena insofar as they are classified as instantiating a particular surface regularity. As I argued above, the identification and validation of surface regularities requires conceptual skills. Hence, the validation of claims about hidden regularities indirectly also requires conceptual skills, because it relies on the identification of surface regularities. If these arguments are correct, then the skill and the validational aspect of stabilization each play a role in the stabilization of both types of phenomena delineated above: surface phenomena and hidden phenomena. Thus, they are not different modes of stabilization that are specific to different kinds of phenomena.
4 Towards a positive account of the stabilization of phenomena
We have come to the end of our discussion of whether there are any defensible reasons for not integrating aspects of skill and validation in an account of stabilization. I conclude that the two reasons discussed in Sects. 3.1 and 3.2 do not hold up. It is conceivable that there might be other reasons, but for now I will shift the burden of proof to those who are still inclined to disagree with my thesis. Of course, arguing that an analysis of stabilization requires an integrated account of the ways in which skills and validational processes work together is one thing. Providing such an account is quite another. I do not pretend that I have yet done so. However, in the course of the previous section, some insights about the nature of phenomena and of stabilization have emerged that we can now draw on in an effort to construct such a positive account.
The first important insight concerns our analysis of the very notion of phenomenon. I agree with Bogen and Woodward’s general characterization of phenomena as stable features of the world, distinct from the empirical data that are used to test claims about them. However, I argue that if we are interested in the processes of stabilization (i.e., the ways in which scientists come to identify phenomena and validate claims about them by empirical means), a more fine-grained analysis of the notion of phenomena and their epistemic relationship to data is required. Cornerstones of such a more fine-grained analysis will be (a) my distinction between surface phenomena and hidden phenomena, and (b) my thesis that data can only play the role of evidence for phenomena by virtue of being classified as either instantiating a surface phenomenon or indicating a hidden phenomenon. In turn, data can only be treated as indicating a hidden phenomenon insofar as they are also viewed as instantiating a surface phenomenon. (To put this differently, scientists will not be inclined to treat an individual data point as indicative of a hidden phenomenon as long as they do not
4 I take it to be the central insight of Goodman’s (1955) new riddle of induction to have pointed out that
there is nothing in the raw data that privileges one classification over another. This, then, gives rise to the descriptive question of how we arrive at classifications and inferences that seem “natural”, as well as to the normative question of whether we have any reason to believe that what appears natural is in fact correct. As indicated in the previous footnote, the bulk of this paper is concerned with the descriptive question, though some implications for normative questions will be discussed below.
think that the data point is at least in principle replicable, and can therefore be treated as instantiating a data pattern or surface regularity.) My emphasis on the importance of treating data as instantiating or indicating phenomena once again draws our attention to the conceptual prerequisites that must be in place if we want to even begin to inquire about phenomena and their evidential relations to data. In previous sections I have referred to this ability to identify surface phenomena as a conceptual skill to distinguish it from the more “crafty” ability to physically create data that instantiate or indicate phenomena. The term “conceptual skill” will do if our aim is to indicate a specific kind of ability, though the ability itself remains rather mysterious. It is something like this ability that Thomas Kuhn attributed to the narrower of his two notions of paradigm (Kuhn 1962), later attempting to provide a semi-naturalistic account of it (Kuhn 1974). It is not my ambition here to provide an account of the conceptual skill itself. My interest lies rather in analyzing the contribution made by this skill to the process of attempting to stabilize a phenomenon.
One interesting feature of this skill is that on the one hand it seems to be possible to attribute a conceptual skill to someone even if they are mistaken about the nature of the objects they are able to identify empirically. For example, recall the fictitious scenario of what would have happened if it had turned out that the perceptual objects Caroline Herschel was so good at identifying were not planets, but some systematic artifact of her instrument. I maintain that we would still attribute some kind of conceptual skill to her, whereas we would not do so if her classifications had been entirely random. On the other hand, it is clear that what counts as a scientifically relevant conceptual skill will depend heavily on the current state of knowledge at a given time. Staying with the fictitious example: assuming that at a later point in time we had a better understanding of what planets are, we would then also view a different set of skills as relevant to the task of empirically classifying planets.
The reason why I am emphasizing the close connection between knowledge and what counts as a relevant conceptual skill is that it seems to me to point to an important dynamic: On the one hand (as I suggested in Sect. 3.1 above) we cannot even begin to empirically investigate claims about the robustness and stability of phenomena (say, the orbits of planets) unless we assume that we have a way of identifying the objects/phenomena in question (say, by means of a specific instrument that provides us with data that we treat as instantiating surface phenomena, which we in turn treat as indicating the phenomena in question). Differently put, our ontology of surface phenomena will inform our ontology of (presumed) hidden phenomena. On the other hand, our attempts to validate claims about hidden phenomena may result in the finding that the surface phenomena we used as indicators of the hidden phenomena were in fact quite inadequate for the purpose. Thus, our ontology of hidden regularities may well alter our ontology of surface phenomena.
If this analysis is correct, what follows from it for the question of this essay (“What exactly is stabilized when phenomena are stabilized?”)? The answer—in a nutshell—is that neither surface phenomena nor hidden phenomena can be stabilized in isolation. They stabilize each other.
In other words, what is stabilized
when phenomena are stabilized is the fit between surface phenomena and hidden phenomena.
5 Memory phenomena: surface and hidden
I would like to claim some generality for the account that I have developed so far. However, even if something like the distinction between surface phenomena and hidden phenomena, and the epistemic relationship that I have claimed holds between them, are general features of empirical research, the details of how this relationship plays out in the process of stabilization will be quite different for different fields of research. Hence, the following case study will hardly constitute conclusive evidence for my analysis, but is intended to (a) point to the relevance of the issues addressed by my account, and (b) make my account more tangible by linking it to a specific example. Memory research provides us with a nice example of the distinction between surface phenomena and hidden phenomena, since in the literature we find references to phenomena like encoding and long-term memory, but also to phenomena like chunking effects or ceiling effects, where the latter are obviously very closely tied to particular experimentally generated data patterns, whereas the former are not. In the following, I will argue that the identity conditions of a surface regularity like a memory effect cannot be established conclusively without drawing on additional assumptions about hidden memory phenomena that are causally responsible for instances of the memory effects. This will raise the question of how the identity conditions of hidden memory phenomena are established.
5.1 Chunking, recency, priming, and other memory effects
As already mentioned, Bogen and Woodward list among their examples of phenomena some findings from experimental psychology, commonly called “effects”, such as the chunking effect, and the recency effect. Other effects one might wish to add are the primacy effect, the ceiling effect, and the priming effect. Our focus here will be on the chunking effect. This expression refers to the fact that in memory experiments human subjects can recall more items when they are instructed to “chunk” them than without such an instruction. For example, it is easier to remember a phone number when it is chunked than to remember every single digit individually (6-6-1-8-4-0-0 vs. 661-8-400). There is an intuitive sense in which this is an empirical regularity. Moreover, it is one that is thought to be well-confirmed. But what would it take to show that it is a genuine, robust phenomenon? Presumably one would need to show that the effect can be replicated. But here we face a problem: On the one hand it seems obvious that the conditions of the experiment need to be varied in order to rule out that the data in question are specific to one very specific experimental set-up, thereby demonstrating the robustness of the phenomenon. For example, one has to show that instances of the chunking effect hold for one subject in different situations, that it holds for different subjects, that it holds with different materials (different digits, words instead of digits, nonsense-syllables instead of digits, etc…), and that it holds regardless of the type of memory test that is used. But on the other hand, every variation of the conditions of the
experiment raises the question of whether the experiment in question is really an apt instrument for triggering instantiations of the chunking effect. The question is on what grounds a given experimental result can be classified as instantiating one phenomenon, as opposed to some other, entirely different phenomenon, or as not instantiating an interesting phenomenon at all. A version of this question was asked by Collins (1985) when he raised the issue of how one tells the difference between a replication of an experiment with a negative result, and a failure to replicate the experiment. While we do not have to agree with his account of how this issue is resolved (i.e., that there are no identity criteria that can be explicated and that the question is ultimately resolved by way of power relations), he puts his finger on an important question, namely by what criteria data of different experiments can be classified as instantiating the same empirical regularity. For example, how might we decide that the results of chunking experiments, when replicated with words, are really instances of the chunking effect, rather than, say, of a depth-of-processing effect, or an effect of some other brain process? Or even more generally, on what grounds do we classify the outcomes of chunking experiments as memory phenomena rather than, say, attention phenomena? On what grounds do we decide that a given test used to explore the nature of chunking is even a memory test?
One possibility would be to say that some psychologists just happen to have the conceptual skill to home in on the kinds of experimental data that happen to instantiate the chunking effect. But this is not very satisfactory, since it does not provide us with any criteria for distinguishing between people who have this skill and people who do not. Nor does it tell us how it might be determined that the conceptual skill in question is one of any particular scientific relevance. The questions I am raising here are (a) how the identity conditions of surface phenomena like the chunking effect can be established, and (b) how claims about them can be validated. I argue that neither question has an answer if we stay strictly on the level of surface phenomena. That is, our classifications of surface phenomena, no matter how self-evident they may appear, will not be able to shake off an air of arbitrariness. My thesis, then, is that even though surface regularities like the memory effects in psychology can be objects of study in their own right, attempts to stabilize them will ultimately turn to something other than surface regularities. More precisely, in their search for criteria that determine the identity conditions of surface regularities, scientists assume that such identity conditions are determined by hidden phenomena that stand in a relevant explanatory relationship to the data that instantiate the surface regularities. In the case at hand: if the chunking effect is a memory phenomenon, its identity conditions are presumably determined by a brain process that is associated with memory (i.e., with a hidden memory phenomenon), where that process is causally responsible for all the data that instantiate the chunking effect.
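To make the kind of classification at issue here more tangible, the following minimal sketch shows how results from varied chunking experiments get pooled under a single label. It is purely hypothetical: all numbers, condition names, and the assumed size of the recall advantage are invented for illustration and are not drawn from the psychological literature cited above.

```python
# Hypothetical illustration (all numbers invented): results from chunking
# experiments run with different materials are pooled under one label,
# "the chunking effect", which is the classification the argument above questions.
import random
import statistics

random.seed(1)

def recalled_items(chunked, material):
    """Simulate how many items one subject recalls in a single trial."""
    base = {"digits": 7, "words": 6, "syllables": 5}[material]   # assumed baselines
    boost = 3 if chunked else 0                                  # assumed chunking benefit
    return max(0, round(random.gauss(base + boost, 1.5)))

for material in ["digits", "words", "syllables"]:
    with_chunking = [recalled_items(True, material) for _ in range(30)]
    without_chunking = [recalled_items(False, material) for _ in range(30)]
    advantage = statistics.fmean(with_chunking) - statistics.fmean(without_chunking)
    # Each positive difference is then classified as another instance of the
    # same surface regularity, even though the materials and set-ups differ.
    print(f"{material}: mean recall advantage with chunking = {advantage:.2f}")
```

The sketch only restates the point made in the text: nothing in the pooled numbers themselves settles whether the varied set-ups instantiate one regularity or several.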
5.2 Memory phenomena as (hidden) physical regularities
But of course the thesis of the previous section (that the identity conditions of surface phenomena are determined by hidden regularities) immediately raises the question of how the identity conditions of hidden regularities are determined. In this case, if we
assume—as was done for many decades within the memory literature—that the chunking effect is to be explained by reference to the phenomenon of short-term memory (or by some cognitive process that takes place within short-term memory), then the issue is how the identity conditions of this phenomenon might be established. In the philosophical literature this kind of question has been addressed by proponents of experimentalism, who have argued that it is possible to validate claims about what I here call hidden phenomena by drawing on methodological strategies that will determine whether a given phenomenon is robust. Such strategies include controlling for known factors that might interfere between phenomenon and data, etc. (Franklin 1993; see also Woodward 1989). These methodological strategies will then allow for the generation of data, which in turn will provide evidence for the robustness and stability of the phenomenon in question. With respect to the example, let us assume that we tentatively explain the chunking effect by reference to a chunking mechanism that is located in working memory. Evidence for the existence of such a chunking mechanism might be provided by controlling for other known mental operations, by testing for correlations and/or dissociations between different ways of testing for chunking, and by causally manipulating the brain area where the chunking mechanism is thought to be located.
These all appear to be promising ways to go. The only worry one might have is that the data used to validate claims about the hidden phenomenon in question are not neutral. It seems that if the empirical data we use are to have any relevance at all to the hidden chunking phenomenon, they would have to be data from chunking experiments. But if the data produced in chunking experiments are supposed to instantiate the chunking effect, which in turn is supposed to be causally related to the chunking mechanism, this appears to beg the question, since the very issue of what are the identity conditions of this purported empirical regularity was supposed to be settled by reference to its cause, i.e., some yet to be determined hidden regularity. In other words: knowledge of the identity conditions of the mental operation of chunking (the hidden regularity) was supposed to help us delineate the identity conditions of empirical chunking phenomena (the surface regularity). But how can this be achieved if the only way we can gain access to the presumed mental operation of chunking is by way of the already existing criteria of delineating the surface regularity?
What I am pointing to here may at first glance sound a lot like the type of argument that has come to be known by names such as the “experimenter’s regress”. However, is the threat of circularity real? I do not think so. It is quite conceivable that we use a specific chunking experiment under various different conditions, e.g., with subjects that have specific forms of brain damage with known functional impairments; or under conditions where subjects are given other cognitive tasks to see how they affect their performance on the chunking task. Consider for example the scenario where an attention-demanding task greatly diminishes the subject’s chunking ability. This might lead scientists to suspect that chunking is really a phenomenon of attention, rather than memory.5 Or consider the scenario where patients with known memory impairments do well on some chunking tasks, but not on others.
This might lead scientists to suspect that what was previously classified as one surface regularity is in fact a
5 This is in fact entertained as a live possibility in recent psychological publications (Jonides et al. 2008).
much more complex phenomenon, in that what were previously thought to be different indicators of the same hidden regularity are in fact indicators of different hidden regularities. These are both examples of how the surface regularity is instrumental in an investigation of a hidden regularity, which in turn may change the ways in which the surface regularity is classified: in the former case, the outcome might be that the chunking effect is no longer classified as a memory phenomenon. In the latter case, the outcome might be that we end up distinguishing between different types of chunking effects. The point of this example is to provide some plausibility to my thesis that neither surface regularities nor hidden regularities can be stabilized individually. Insofar as anything is stabilized here, what is stabilized are not claims about the robustness of specific regularities, but rather claims about the fit between surface regularities and hidden regularities.
6 Conclusion
In this article I have proposed an account of the stabilization of phenomena that integrates two aspects that have previously not been discussed systematically in relation to one another: skill and validation. Central to my analysis was the insight that what I called “conceptual skills”, i.e., the ability to classify data as instantiating or indicating phenomena of interest, are a vital prerequisite for the very possibility of empirical research about the phenomena in question. This kind of skill, then, is a necessary condition for the possibility of validating claims about phenomena. In turn, however, skills do not embody some kind of mysterious and eternal wisdom. Rather, what counts as a skill, and moreover, what counts as a relevant skill, can change as a result of changing ideas about what kinds of phenomena are in fact stable and robust. Hence, I argued, the skill and validational aspects of stabilization are strongly interconnected.
Turning then to an analysis of the process of stabilizing phenomena, I presented a fine-tuned version of Bogen and Woodward’s notion of phenomenon, one that distinguishes between what I called “surface regularities” and “hidden regularities.” Both, I suggested, are phenomena in Bogen and Woodward’s sense, and claims about either kind of phenomenon call for validation. I argued that neither kind of phenomenon can be stabilized and validated on its own. Ultimately, if anything is stabilized, then it will be the fit between those two kinds of phenomena.
The main aim of my analysis has been to give a descriptively adequate account of how the stabilization process proceeds. I thereby responded to what I argued are descriptively inadequate or one-sided treatments of stabilization in the literature. However, given my treatment of processes of validation as constituting one aspect of stabilization, the question still remains whether there is a normative conclusion to be drawn from my analysis. In other words, can we say something about what it takes for a claim about a phenomenon to be validated? In response to this question we need to be clear on what exactly we want to assert when we say that a claim about the existence of a phenomenon has been validated. Some of the authors who have written about robustness have viewed their work as potentially being able to show that there can still be a realm of secure knowledge, even if it is no longer thought to be delivered
by neutral observations. The basic idea, which we find, for example, in Hacking’s entity realism, is that knowledge about certain robust phenomena can prevail, even if our theories about the phenomena change. We therefore do not have to make a choice between being realists or antirealists about theories. We can simply be realists about the stability of phenomena. By severing the tie between theory and observation and emphasizing the partial autonomy of research on phenomena, Bogen and Woodward at times also lend themselves to this reading, though it is not clear whether this is in fact what they intended. Given my own analysis of stabilization, it should be clear that I do not think that any such attempts at isolating a small pool of secure knowledge will work. As was suggested by my example of the chunking effect, claims about phenomena require for their stabilization and validation assumptions about other phenomena and about the ways in which they are causally and systematically related. Hence, my conclusion about the validation of claims about the robustness of phenomena is mainly negative, in that I reject philosophical attempts to show that individual phenomena can be validated in isolation. A more positive account might try to argue that once we have a large number of claims about interrelated phenomena, the very fact that they fit so well together might in itself constitute an argument for their validity (see Hacking 1992 for a version of this argument). A discussion of this suggestion will have to be deferred to another paper.
References
Bogen, J., & Woodward, J. (1988). Saving the phenomena. The Philosophical Review, XCVII(3), 303–352.
Collins, H. M. (1985). Changing order: Replication and induction in scientific practice. London: Sage Publications.
Culp, S. (1994). Defending robustness: The bacterial mesosome as a test case. PSA 1994, 1, 46–57.
Culp, S. (1995). Objectivity in experimental inquiry: Breaking data-technique circles. Philosophy of Science, 62, 430–450.
Franklin, A. (1993). The epistemology of experiment. In D. Gooding, T. Pinch, & S. Schaffer (Eds.), The uses of experiment (pp. 437–460). Cambridge: Cambridge University Press.
Goodman, N. (1955). Fact, fiction, and forecast. Cambridge: Harvard University Press.
Hacking, I. (1981). Do we see through a microscope? Pacific Philosophical Quarterly, 62, 305–322.
Hacking, I. (1983). Representing and intervening. Introductory topics in philosophy of science. Cambridge: Cambridge University Press.
Hacking, I. (1992). The self-vindication of the laboratory sciences. In A. Pickering (Ed.), Science as practice and culture (pp. 29–64). Chicago: University of Chicago Press.
Hoyningen-Huene, P. (1987). Context of discovery and context of justification. Studies in History and Philosophy of Science, 18, 501–515.
Jonides, J., Lewis, R. L., Nee, D. E., Lustig, C. A., Berman, M., & Moore, K. S. (2008). The mind and brain of short-term memory. Annual Review of Psychology, 59, 193–224. Retrieved February, 2008 from http://psych.annualreviews.org.
Kuhn, T. (1962). The structure of scientific revolutions. Chicago, IL: University of Chicago Press.
Kuhn, T. (1974). Second thoughts on paradigms. In T. Kuhn (Ed.), The essential tension. Selected studies in scientific tradition and change (pp. 293–319). Chicago, IL: University of Chicago Press.
Polanyi, M. (1966). The tacit dimension. Gloucester, Mass.: P. Smith.
Wimsatt, W. (1981). Robustness, reliability, and overdetermination. In R. Brewer & B. Collins (Eds.), Scientific inquiry and the social sciences (pp. 124–163). San Francisco: Jossey-Bass.
Woodward, J. (1989). Data and phenomena. Synthese, 79, 393–472.
Synthese (2011) 182:73–87 DOI 10.1007/s11229-009-9613-x
What do patterns in empirical data tell us about the structure of the world? James W. McAllister
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 7 July 2009 © The Author(s) 2009. This article is published with open access at Springerlink.com
Abstract This article discusses the relation between features of empirical data and structures in the world. I defend the following claims. Any empirical data set exhibits all possible patterns, each with a certain noise term. The magnitude and other properties of this noise term are irrelevant to the evidential status of a pattern: all patterns exhibited in empirical data constitute evidence of structures in the world. Furthermore, distinct patterns constitute evidence of distinct structures in the world. It follows that the world must be regarded as containing all possible structures. The remainder of the article is devoted to elucidating the meaning and implications of the latter claim.
Keywords Empirical data · Evidence · Noise · Patterns · Phenomena · Structures
1 Introduction
Empirical science aims to ascertain the structure of the world. It does so by analyzing empirical data, consisting of the outcomes of observations and measurements, which constitute our sole epistemic access to reality. Empirical science must thus rest on what I will call a “principle of evidential correspondence” that relates structures in the world to features of empirical data. This principle licenses the move from perceiving a particular feature in a given empirical data set to claiming to have ascertained that the world has a particular structure. The principle of evidential correspondence is central to the epistemology, methodology, and ontology of empirical science. Although we can interpret many debates among philosophers of science as addressing aspects of this principle, we have not yet seen a systematic attempt to give a general formulation of it. What can be said about
J. W. McAllister (B) Institute of Philosophy, University of Leiden, P.O. Box 9515, 2300 RA Leiden, The Netherlands e-mail:
[email protected]
the content of the principle of evidential correspondence that underlies the modern empirical sciences? One intuition on which we may draw is that empirical data in most cases must be analyzed into components before they reveal structures in the world. The global value of empirical data rarely delivers any clear evidence. This standpoint was first taken by Galileo Galilei, who thought that observed physical outcomes were the effect of the combination of one or more phenomena and of several perturbations: phenomena were fundamental constituents of reality that formed the object of his science and were to be described by laws of nature, whereas perturbations were irregular, ephemeral, and not amenable to systematic analysis. For example, Galileo regarded the fall of a body under everyday conditions as determined partly by a phenomenon, described by his law of free fall, and partly by various perturbations involving air resistance, the shape and composition of the body, and so on. Galileo concluded that observational data, since they recorded the resultant of all these causes, were often misleading: evidence about the fundamental structures of the world could be gathered only by analyzing data into components (Koertge 1977; McAllister 2005).
Present-day experimental scientists share Galileo’s intuition. They assume that, while individual observations are affected by perturbations and errors, we can rely on the underlying constancies, regularities, and trends in empirical data to reveal structures in the world. The practice of repeating and varying experiments is designed partly to allow these features of data to shine through more clearly. A great merit of the work of James Bogen and James Woodward (Bogen and Woodward 1988; Woodward 1989) is to have placed this intuition at the centre of discussions of scientific practice: they argue persuasively that the crucial feature of empirical data sets is the patterns that they exhibit, rather than the precise value of each data point. On Bogen’s and Woodward’s view, patterns in empirical data constitute evidence that the world contains particular fundamental and stable structures, which they dub “phenomena”. Thus, detecting a particular pattern in an empirical data set licenses the conclusion that the world contains a particular phenomenon. Correspondingly, the empirical content of laws and theories amounts to a description of certain potential patterns in data. This holds most clearly for so-called empirical laws, such as Kepler’s laws and Snell’s law, but the same applies to theories that are farther removed from the appearances. A law or theory is confirmed if, roughly, the relevant pattern is detected in data, and disconfirmed otherwise.
Building on Bogen’s and Woodward’s insight that patterns are the crucial evidential element of empirical data sets, this article aims to give a general formulation of the principle of evidential correspondence between features of empirical data and structures in the world. I will then suggest that the sole admissible principle of evidential correspondence leads to unexpected and radical conclusions about the structure of the world—conclusions that deviate markedly from those that Bogen and Woodward themselves draw.
2 Patterns, noise, and their properties
Empirical evidence comes in various forms. Many important bodies of data in present-day science take the form of strings of numbers, in which each number represents
the outcome of a discrete measurement act. Other important data take the form of images, as Bogen (2009) notes. However, many images nowadays are digitally generated, and other images are used as a source of numerical data. Crystallographers, for example, extract from X-ray diffraction photographs numerical measurements of the angular positions and intensities of the individual spots, which constitute evidence for and against hypotheses about crystal structures. For these reasons, I will conceive empirical data in this article as strings of numbers. If we conceive an empirical data set as a string of numbers, a pattern that is exhibited in a data set is an additive component of that string: it is what remains once some other component has been subtracted, item by item, from the data set (Grenander 1996). Mathematically, there is no limit to the number and variety of patterns that are exhibited in a data set. This is because a string of numbers can be expressed as the sum of any string whatever and a second string that matches the discrepancy between the two. Drawing on the metaphor of signalling theory, I shall call the discrepancy between a data set and a pattern identified within it the “noise” with which that pattern is exhibited in that data set. The term “noise” thus denotes purely the mathematical difference between the original data set and the pattern into which one has chosen to decompose the data set. Given that, mathematically, an empirical data set can be decomposed into any one of all conceivable patterns and an associated noise term, what does this fact reveal about the structure of the world? Providing a full answer to this question goes a long way towards specifying the content of the principle of evidential correspondence defined in the previous section. There would seem to be two possible initial responses. One is to claim that only some of the patterns that we can discern in data sets correspond to structures in the world, and then to specify the respect in which patterns that correspond to structures in the world differ from patterns that do not. The other response is to deny that any such ontologically significant distinction between patterns can be drawn, to admit that all patterns exhibited in empirical data sets correspond to structures in the world, and then to consider the meaning and implications of this claim. Let us begin by exploring the first option. Most writers in philosophy of science assume that only some of the patterns exhibited by empirical data sets correspond to structures in the world, whereas other patterns that could be discerned in empirical data have no evidential significance. Bogen and Woodward are among them. They hold that the world contains only a relatively small number of the structures that they call phenomena, and they do not explicitly raise the possibility that the world contains further structures of other kinds. In the light of this, they believe that only a few of the patterns that might be discerned in empirical data sets are evidence of structures in the world. Indeed, their examples suggest that they think that no more than one pattern in a typical empirical data set corresponds to a phenomenon. In order to avoid circularity, however, those who take this option must specify the respect in which patterns that correspond to structures differ from patterns that do not. 
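The additive decomposition described earlier in this section can be stated compactly. The following display is my own formalization of the idea, not notation taken from the article: for a data set of n measurement outcomes, any candidate pattern whatsoever determines a noise term by item-by-item subtraction.

```latex
d = (d_1, \ldots, d_n), \qquad p = (p_1, \ldots, p_n), \qquad
e_i = d_i - p_i \quad (i = 1, \ldots, n),
\qquad \text{so that} \qquad d_i = p_i + e_i \ \text{for every } i.
```

Since the noise term e is defined simply as the discrepancy, the identity holds for every choice of p; this is the sense in which, mathematically, a data set exhibits all possible patterns, each with its own noise term.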
If Bogen and Woodward cannot specify which patterns correspond to what they call phenomena, they have no justification for believing that the world contains a certain set of phenomena rather than a different set (McAllister 1997). For example, in order to substantiate his claim that it is an objective fact that some but not all patterns in data correspond to “true causes”, Bogen (2009) must explain how they differ.
In what respect does a pattern exhibited by an empirical data set that corresponds to a structure in the world differ from a pattern that does not? In attempting to answer this question, we operate under a restriction: we cannot appeal to knowledge about which structures the world contains. If we could, we would be able to say simply that the patterns in empirical data that have evidential significance are those that correspond to the list of verified structures in the world. In fact, this answer presupposes what we are attempting ultimately to ascertain—what structures the world contains. Instead, in answering this question, we are restricted to using information contained in empirical data themselves, since empirical data constitute our sole epistemic access to the world and its structures. We must somehow induce, from the empirical data themselves, criteria for distinguishing patterns that correspond to structures in the world from patterns that do not. From this, it follows that we must be able to specify the difference between patterns that correspond to structures in the world and patterns that have no evidential significance purely in terms of morphological parameters of the patterns themselves or of the noise terms with which they are exhibited in data.
The infinitely many patterns into which one can decompose a given empirical data set can be ranked and assessed on various morphological parameters. One is the algorithmic complexity of the pattern. A pattern of high algorithmic complexity admits no description substantially shorter than the data set itself, whereas a pattern of low algorithmic complexity is highly regular and thus can be captured in a much shorter description (Li and Vitányi 1997). A second parameter is the magnitude of the noise term, or of the discrepancy between the pattern and the original data set. A pattern exhibited with lower noise will account for a larger part of the variance present in the data set, whereas a pattern exhibited with higher noise will account for a smaller part of that variance. A third parameter is the degree to which the noise term associated with the pattern shows patterns of its own, including correlations with the pattern picked out from the data. For example, if the values in a time series have roughly constant magnitude, and we pick out from that data set a pattern described by a sine curve, the associated noise term will show a pattern corresponding to the inverse of that sine curve.
We might hope that the difference between patterns that correspond to structures in the world and patterns that have no evidential significance would manifest itself in different values of one or more of these morphological parameters. A natural suggestion is to say that the patterns that correspond to structures in the world differ from patterns that do not in respect of the noise terms with which they are exhibited in empirical data: for example, that a pattern fails to correspond to a structure if the noise with which it is exhibited is high, shows patterns of its own, or is not randomly distributed. Woodward (2009), for example, suggests that a pattern fails to correspond to what Bogen and he call a phenomenon if it is exhibited with noise that is not randomly distributed. These proposals seem initially to accord with a simplified model of inferential statistics, but they are undermined partly by systematic considerations and, more crucially, by examples of scientific practice.
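As a rough, illustrative sketch of how these three morphological parameters might be estimated for a chosen pattern, the code below uses compressed length as a crude stand-in for algorithmic complexity (which is itself uncomputable), the residual share of variance for the magnitude of the noise term, and a correlation coefficient for one aspect of whether the noise shows patterns related to the extracted one. The toy data, the helper names, and the choice of proxies are all mine, not the article's.

```python
# Illustrative sketch (helper names and toy data invented): estimating three
# morphological parameters of a candidate pattern picked out of a data set.
import math
import statistics
import zlib

def decompose(data, pattern):
    """The noise term: the item-by-item discrepancy between data set and pattern."""
    return [d - p for d, p in zip(data, pattern)]

def description_length(values, digits=3):
    """Crude proxy for algorithmic complexity: length of a compressed encoding.
    (Kolmogorov complexity is uncomputable; compressed size is only a stand-in.)"""
    text = ",".join(f"{v:.{digits}f}" for v in values)
    return len(zlib.compress(text.encode()))

def noise_fraction(data, noise):
    """Share of the data set's variance that the pattern leaves unaccounted for."""
    return statistics.pvariance(noise) / statistics.pvariance(data)

def correlation(xs, ys):
    """Pearson correlation between the extracted pattern and its noise term."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Toy data set: a sine-like regularity plus smaller irregular excursions.
data = [math.sin(2 * math.pi * t / 20) + 0.3 * math.sin(1.7 * t) for t in range(200)]
pattern = [math.sin(2 * math.pi * t / 20) for t in range(200)]   # one candidate decomposition
noise = decompose(data, pattern)

print("description length of pattern:", description_length(pattern))
print("noise fraction of variance:", round(noise_fraction(data, noise), 3))
print("pattern/noise correlation:", round(correlation(pattern, noise), 3))
```

Whether any threshold on such quantities marks off evidentially significant patterns is precisely what the remainder of the section puts in question.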
Before we turn to the examples, let us review two systematic considerations that limit the scope of these proposals. First, it is not sufficient to stipulate conventional numerical thresholds of these parameters, saying, for example, that a pattern corresponds to a structure in the world
only if it is exhibited in an empirical data set with noise no higher than 10%. Such a conventional stipulation might be adequate if we required a pragmatic, methodological guide to data handling; but we presume that the partition between patterns that correspond to structures in the world and patterns that have no evidential significance depends solely on the structure of the world, and we should not expect a conventional stipulation to capture it. Second, it is not adequate to say that, given two patterns exhibited in a data set, only the pattern that is exhibited with lower noise corresponds to a structure in the world. It is easy to pick out a pattern that is exhibited by a data set with zero noise: this is the unique pattern of high algorithmic complexity that coincides precisely with the original data. We would not want to conclude that this pattern alone corresponds to a structure in the world, especially in the light of the fruitful intuition that the value of individual data points is less significant than the underlying constancies, regularities, and trends in empirical data (McAllister 2003).
Let us now turn to examples of scientific practice. Scientists are ready to attribute evidential significance to patterns in data that have low and high algorithmic complexity, to patterns that are exhibited with low and high noise, and to patterns that are exhibited with noise that shows patterns of its own, including correlations with the pattern picked out from the data. They are willing to regard patterns with all these characteristics as corresponding to real structures in the world (McAllister 2007).
Let us consider a data set consisting of microwave radiation intensity readings of the sky. Cosmologists have identified multiple different patterns in this data set, each of which is exhibited with a different noise level; furthermore, they regard each of these patterns as evidence of a real structure in the universe. One pattern, named “isotropic background”, consists of a constant intensity distribution, independent of direction, with the spectrum of a black body at approximately 2.7 K. Cosmologists regard this pattern as evidence of the decoupling of matter and radiation in the early universe. A second pattern, named “dipole anisotropy”, consists of one warm and one cool pole in opposite directions in the sky, and is modelled by a first-order spherical harmonic function with an amplitude of approximately 3.4 × 10^−3 K. Cosmologists regard this pattern as evidence of the motion of the earth relative to the matter that last scattered the microwave radiation. A third pattern, dubbed “quadrupole”, consists of two warm and two cool poles, and is modelled by a second-order spherical harmonic function with an amplitude of around 1.6 × 10^−5 K. Cosmologists attribute this pattern to the radio emissions of the Milky Way. A fourth pattern, called “ripples”, consists of irregular fluctuations of radiation temperature with an amplitude of approximately 3.1 × 10^−5 K on angular scales of 10° and above. Cosmologists regard this pattern as evidence of inhomogeneities in the matter distribution in the early universe, from which galaxies and clusters of galaxies condensed. The fifth pattern identified up to now, dubbed “acoustic oscillations”, consists of irregular fluctuations of radiation temperature with an amplitude of approximately 6.9 × 10^−5 K on an angular scale of around 1°. Cosmologists regard this pattern as evidence that the universe is flat, i.e.
that the total mass and energy density of the universe is equal to the so-called critical density (Partridge 1995; Weinberg 2008). The patterns in this series are exhibited with noise terms that show all the properties that might conceivably be regarded as indicating that a pattern has no evidential
significance. First, consider the suggestion that a pattern fails to correspond to a structure in the world if it is exhibited in a data set with high noise. The pattern named “isotropic background” is exhibited in the cosmic microwave radiation data with relatively high noise—much higher than the noise with which any of the other patterns is exhibited. This does not weaken the belief of cosmologists that this pattern is evidence of the decoupling of matter and radiation in the early universe. Next, consider the suggestion that a pattern has no evidential significance if it is exhibited in a data set with noise that shows patterns of its own. Since the five patterns listed above are all superposed on one another in the cosmic microwave radiation data, it follows that, after we have subtracted any one of these patterns from the data, the remainder—which counts as the noise term with which the subtracted pattern is exhibited in the data— shows various further patterns. This does not stop cosmologists taking each of these patterns separately as evidence of a distinct physical process in the universe. Lastly, consider Woodward’s suggestion that a pattern fails to correspond to a phenomenon if it is exhibited in a data set with noise that is not randomly distributed. The pattern named “acoustic oscillations” is exhibited in the cosmic microwave radiation data with a noise term that, among other things, shows the pattern named “dipole anisotropy”, and therefore cannot be described as randomly distributed. This does not dissuade cosmologists from regarding the pattern named “acoustic oscillations” as evidence of the flatness of the universe. The mining of cosmic microwave radiation data for evidence about cosmological processes undoubtedly counts as one of the paradigmatic examples of good science in the twentieth century. If we were to discard patterns that are exhibited in data sets with noise terms that are high, show patterns of their own, or are not randomly distributed, we would disqualify this achievement. This observation suggests that the above account of the difference between patterns that correspond to structures in the world and patterns that do not cannot be correct: patterns that fail the test have no less claim to evidential significance than patterns that pass it. One could claim that the cosmic microwave radiation data are unusually rich, and that other empirical data sets are simpler or more univocal. In fact, the opposite is the case. It is usual for empirical data sets to exhibit multiple patterns with different noise levels, each of which is regarded by scientists as evidence of a distinct structure in the world. Take, as a second example, the data set comprising all atmospheric temperature measurements made on earth. This exhibits a pattern with a period of a day, associated with the earth’s rotation about its axis; a pattern with a period of a year, associated with the earth’s orbit around the sun; several patterns with periods ranging from 11 to 100,000 years, attributed to various astronomical processes; several less regular patterns associated with individual weather systems, which have a life span of a few days or weeks; patterns associated with El Niño events, occurring at intervals of between 4 and 10 years; and the “hockey stick” pattern associated with anthropogenic changes in the composition of the atmosphere in the twentieth century (Mann et al. 1998; Burroughs 2003; Ruddiman 2008). 
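The multiplicity of superposed patterns can be made vivid with a minimal sketch using synthetic data (the code, the numbers, and the helper noise_level are mine, not drawn from the article): a single series contains a daily pattern and an annual pattern, and each is exhibited in the same data set with its own noise term.

# A minimal illustrative sketch (not from the article): one synthetic data set
# exhibits a daily pattern and an annual pattern, each with its own noise term.
import numpy as np

hours = np.arange(24 * 365)                              # one year of hourly readings
daily = 5.0 * np.sin(2 * np.pi * hours / 24)             # daily temperature pattern
annual = 10.0 * np.sin(2 * np.pi * hours / (24 * 365))   # annual temperature pattern
rng = np.random.default_rng(0)
data = 15.0 + daily + annual + rng.normal(0.0, 2.0, hours.size)

def noise_level(data, pattern):
    # Relative RMS of the residual left after subtracting a candidate pattern.
    residual = data - data.mean() - pattern
    return np.sqrt(np.mean(residual ** 2)) / data.std()

# Both patterns are exhibited in the same data set, each with a different noise
# term; the residual of either subtraction still contains the other pattern.
print(noise_level(data, daily))    # noise with which the daily pattern is exhibited
print(noise_level(data, annual))   # noise with which the annual pattern is exhibited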
This series of patterns reproduces all the features shown by the cosmic microwave radiation data. The “hockey stick” pattern, for example, is exhibited with much higher noise than the pattern representing daily temperature excursions; moreover, this noise term shows patterns of its own and is not randomly distributed, since it contains all the other patterns listed above. If we accepted
the above account of the difference between patterns that correspond to structures in the world and patterns that do not, we would have to conclude that the “hockey stick” pattern is not evidence of anything in the world—a conclusion that few nowadays would embrace. Perhaps alternative accounts of the difference between patterns that correspond to structures in the world and patterns that do not can be proposed. We would have to scrutinize any such new proposal with care. The list of paradigmatic examples of good science is sufficiently heterogeneous, however, that we are likely to find examples counting against any such proposal.
3 Objections from the temporal development of science Some critics might argue that so far I have taken insufficient account of the development of science in time. Perhaps later discoveries build on earlier discoveries to reveal which patterns in data correspond to structures in the world and which lack evidential significance. There are two ways in which the temporal development of science might enable scientists to distinguish patterns that correspond to structures in the world from patterns that do not. First, as Woodward (2009) points out, scientists approach empirical data normally not as blank slates, but equipped with substantive assumptions about the world. Woodward suggests that, since these assumptions go beyond the empirical data under scrutiny, they allow us to infer which patterns exhibited in those data constitute evidence of phenomena. In many cases, these assumptions entail that one should seek patterns of a certain form. Consider, for example, a data set listing the temperatures at which samples of lead have been observed to melt in a series of trials. If we assume that the melting point of a chemical element depends only on the atomic number of the element, we might infer that the sole pattern in a graphical plot of this data set that constitutes evidence about the structure of the world is a horizontal straight line. Assumptions of this kind, while they play an important pragmatic role in scientific research, are unable to resolve the ontological question of which structures the world contains. This would certainly hold if these assumptions were stipulations: stipulations cannot be expected to latch onto a putative objective feature of the world, namely the structures that the world contains. Woodward, of course, considers these assumptions to be not stipulations but hypotheses that command empirical support. Such assumptions do not help either, however. The empirical support for an assumption can take the form only of a pattern identified in an empirical data set on a previous occasion. For example, the empirical support for the assumption that the melting point of a chemical element depends only on the atomic number of the element takes the form of a pattern consisting of another horizontal straight line exhibited in a graphical plot of empirical data on the melting of samples of various elements. The questions that I raised above about the later data set apply to this earlier data set too: what is the difference between patterns that correspond to structures in the world and patterns that do not? Reference to empirically supported assumptions about the structure of the world, therefore, does no more than advance the problem to an earlier time.
The second feature of the temporal development of science that might enable scientists to distinguish patterns that correspond to structures in the world from patterns that do not is the discovery that some patterns are projectable, whereas others are not. The argument might go as follows: it is easy for a scientist to fit multiple patterns to data that are already available, but the real test is faced when he or she checks these patterns against data gathered in future. Then, the scientist will discover that some previously identified patterns are exhibited also in the new data, whereas other patterns are not. It seems natural to suggest that the former patterns have greater claim to correspond to structures in the world than the latter. What does the discovery that a pattern is or is not projectable amount to? The statement that a pattern P(x) is projectable means that, if P(x) is exhibited with a noise level of n percent in a data set A, it will later be found to be exhibited with a noise level no higher than n percent also in a second data set B. By contrast, the statement that a pattern Q(x) is not projectable means that, whereas Q(x) is exhibited with a noise level of m percent in A, it will later be found not to be exhibited with a noise level no higher than m percent in B. These outcomes may represent an interesting discovery about the two patterns, but they provide no reason for thinking that pattern P(x) has more claim to correspond to a structure in the world than pattern Q(x). From findings about patterns in data sets established above, it is clear that the data set B exhibits both P(x) with a certain noise level and Q(x) with a certain other noise level. Insofar as patterns in data are regarded as corresponding to structures in the world, these two patterns have equal claim. The fact that at some moment we predicted that P(x) would be exhibited in B with a noise level no higher than n percent, and this prediction was later verified, and we predicted that Q(x) would be exhibited in B with a noise level no higher than m percent, and this prediction was later not verified, has no bearing on the ontological question of which patterns in data correspond to structures in the world and which do not. It merely causes us to revise our estimate of the noise level with which Q(x) is exhibited. For projectability to offer a criterion to distinguish patterns that correspond to structures in the world from patterns that do not, it would have to be supplemented with an argument for thinking that all and only patterns exhibited with a noise level no higher than n percent correspond to structures in the world. The projectability proposal, in short, far from offering a solution to the problem before us, presupposes such a solution. The irrelevance of the temporal succession of prediction and verification to the ontological question may be established also in another way. Consider the data set U comprising all observations that we will in fact make throughout the history of science, or even comprising all observations that we could make in the actual world. Presumably U provides the most reliable empirical base possible for our project of establishing from empirical evidence which structures the world contains. Now, all the results about patterns exhibited in data sets established above hold of the patterns exhibited in U. In particular, U exhibits all possible patterns, each with a certain noise level. 
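The projectability point can be made concrete with a small numerical sketch (my own construction, with synthetic data and arbitrary parameters); it also illustrates why later data leave both patterns exhibited with some noise level or other.

# An illustrative sketch of the projectability argument (my own construction).
# Pattern P is a straight line fitted to an earlier data set A; pattern Q is a
# high-order polynomial fitted to A. Both are then "exhibited" in a later data
# set B with some noise level, whether or not the predicted level is borne out.
import numpy as np

rng = np.random.default_rng(1)
x_A, x_B = np.arange(50.0), np.arange(50.0, 100.0)
y_A = 2.0 * x_A + rng.normal(0.0, 5.0, 50)   # earlier data set A
y_B = 2.0 * x_B + rng.normal(0.0, 5.0, 50)   # later data set B

def noise_percent(y, fit):
    # Noise level: RMS residual as a percentage of the RMS of the data.
    return 100.0 * np.sqrt(np.mean((y - fit) ** 2)) / np.sqrt(np.mean(y ** 2))

P = np.polynomial.Polynomial.fit(x_A, y_A, deg=1)
Q = np.polynomial.Polynomial.fit(x_A, y_A, deg=15)

print(noise_percent(y_A, P(x_A)), noise_percent(y_B, P(x_B)))  # P: similar noise in A and B
print(noise_percent(y_A, Q(x_A)), noise_percent(y_B, Q(x_B)))  # Q: far higher noise in B
# On the argument above, the second result revises our estimate of the noise
# with which Q is exhibited; it does not by itself show that P corresponds to a
# structure in the world and Q does not.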
Some writers, like Bogen and Woodward, would claim that some of the patterns exhibited in U correspond to structures in the world, and other patterns do not. These writers face the following question: what is the difference between patterns of these two classes? It is evident that historical circumstances pertaining to the gradual process
by which we ascertain the noise levels with which individual patterns are exhibited in U cannot provide grounds for resolving this question. 4 The equality of patterns Let us now turn to the other possible response to the fact that, mathematically, an empirical data set can be decomposed into any one of all conceivable patterns and an associated noise term. This second response denies that any ontologically significant distinction can be drawn between patterns, and concludes that all patterns exhibited in empirical data sets constitute evidence of structures in the world. What grounds are there for endorsing this response? The grounds for thinking that all patterns that can be discerned in an empirical data set correspond to structures in the world are the following. A pattern is an additive component of an empirical data set. If the world causes the full value of the empirical data, as it must, then some factor in the world causes every additive component of that value. Whereas it may be difficult, from a pattern in data, to reconstruct the particular element of the combination of causal factors in the world that produced it, it is incontrovertible that such a factor exists and that the pattern is an effect of it. For example, each of the patterns that cosmologists have identified in cosmic microwave radiation data—the isotropic background, the dipole anisotropy, the quadrupole, the ripples, and the acoustic oscillations—is caused by a particular physical process. A pattern in data is a trace of the causal factor that produced it. Therefore, every pattern exhibited in empirical data constitutes evidence of some structure in the world. One might object that, unlike in the case of the patterns just mentioned, we have not yet theorized distinct causal factors that are responsible for the vast majority of patterns that we might identify in empirical data. This, however, can be explained as reflecting the incompleteness of current science. A further possible objection, that there is no evidence of the existence of a distinct physical factor responsible for an arbitrary pattern that we may discern in empirical data, is misconceived: the pattern itself constitutes such evidence. For example, the evidence of the existence of physical processes responsible for the patterns identified in cosmic microwave radiation data takes the form of the patterns themselves. The conclusion that every pattern exhibited in empirical data constitutes evidence of some structure in the world holds regardless of the algorithmic complexity of the pattern, the magnitude of the noise with which it is exhibited in the data set, and whether this noise shows patterns of its own. In particular, nothing about the noise with which any pattern is displayed in a data set lowers its worth as evidence of structure in the world: even a pattern that is displayed with high noise is the effect of a causal process in the world, and thus constitutes evidence of that process. We may test this view further by scrutinizing one of its implications. According to this view, there is no component of any empirical data set that does not constitute evidence about the structure of the world. If this view is correct, it follows that even what is labelled, for the purposes of a certain research project, as the noise term in a data set contains information about some structure in the world, albeit not one that lies at the focus of that project. 
In other words, for any component of an empirical data set that counts for one research project as noise, it must be possible to formulate another
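A minimal sketch with synthetic data may make this implication concrete (the construction and the numbers are mine, not the article's): the residual that one analysis discards as noise is exactly where a second analysis finds its pattern.

# Illustrative only: project 1 fits a trend and discards the residual as noise;
# project 2 takes that residual as its data and finds a further pattern in it.
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(1000)
trend = 0.05 * t                                  # pattern of interest for project 1
oscillation = 2.0 * np.sin(2 * np.pi * t / 50)    # pattern of interest for project 2
data = trend + oscillation + rng.normal(0.0, 1.0, t.size)

slope, intercept = np.polyfit(t, data, 1)         # project 1: fit the trend
residual = data - (slope * t + intercept)         # project 1's "noise" term

spectrum = np.abs(np.fft.rfft(residual))          # project 2: analyse the noise
freqs = np.fft.rfftfreq(t.size)
print(freqs[spectrum.argmax()])                   # ~0.02, i.e. the 1/50 oscillation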
project for which that component constitutes the pattern of interest and the source of evidence. Is this implication confirmed by our experience of scientific research? I believe it is. There are countless examples of the discovery of useful information in what in other contexts is classed as noise in data. For Arno A. Penzias and Robert W. Wilson, who set out to investigate the reflection of radio waves by satellites, the sum of all the patterns now found in cosmic microwave radiation data counted originally as noise. Small fluctuations in observational data in some research projects are attributed to Brownian motion and are discarded as noise, but a research project into the properties of matter will regard them as evidence of the existence and properties of atoms, from which Avogadro’s number can be calculated. Voltage fluctuations in an electrical resistor are discarded in many research projects as thermal noise or Johnson noise, but may be used in another research project as an indicator of temperature (White et al. 1996). Indeed, scientific progress is often achieved by re-analyzing the component of empirical data that was discarded as noise in earlier research and showing that it contains information about further structures of the world. As we saw above, Galileo analyzed observational data on the fall of bodies into two components: one that was described by his law of free fall, and a second that he regarded as too irregular to yield useful information. In the nineteenth century, we discovered that this second component contained important further patterns, described by Stokes’s law. Such examples suggest that the partition of a data set into pattern and noise is context dependent and does not correspond to any objective difference between causal factors in the world (Kosko 2006, pp. 6–8). This test confirms the thesis that there is no element of empirical data sets that does not constitute evidence about the structure of the world. This enables us to give a preliminary formulation of the principle of evidential correspondence relating features of empirical data and structures in the world on which empirical science necessarily relies: all patterns exhibited in empirical data, irrespective of the magnitude or other features of the noise with which they are exhibited, correspond to structures in the world. 5 From patterns to structures Let us suppose that the preliminary formulation of the principle of evidential correspondence just given is correct, and that all the endlessly many patterns into which a data set can be decomposed constitute evidence of the structure of the world. What substantive conclusions about the structure of the world does this entail? Any pattern into which we can decompose a data set can be expressed as the constancy of a certain mathematical function. That is, all patterns in data can be expressed in the form Fi (x) = 0. Consider now a particular pattern, F1 (x) = 0. What evidence about the structure of the world does this pattern contain? This pattern is evidence that the world has the invariance or symmetry necessary to cause data to exhibit this particular constancy. In other words, there must be some causal process in the world that ensures that the quantity F1 (x) remains equal to zero. The principle of conservation of energy offers a good illustration. We discover that the world has the structure called “conservation of energy” by detecting that certain empirical data sets exhibit, with non-zero noise, a pattern E(x) = 0. From this pattern,
we infer that the world contains a combination of causal processes that ensures that the quantity E(x) remains constant. Physicists describe these, of course, as the causal processes by which energy is converted from one form into another within closed physical systems. Thus, the pattern E(x) = 0, exhibited in empirical data sets with non-zero noise, constitutes evidence that the world has a certain invariance. The same holds for all other patterns of the form Fi (x) = 0. Thus, from the fact that empirical data show all possible patterns, we must conclude that the world contains invariances that cause all possible patterns. How many distinct invariances are these? Are these all possible invariances, or merely a finite number? In other words, is there a one-to-one relation between invariances in the world and patterns in data, or a one-to-many relation? An answer to this question will be provided by a criterion for individuating invariances in the world. If our individuation criterion counts the invariances that cause two distinct patterns in data as distinct, a one-to-one relation between invariances and patterns will hold; if our criterion in some cases counts those invariances as the same, a one-to-many relation will hold. Now, our sole possible epistemic access to invariances is through the patterns in empirical data that they cause. I therefore suggest that we make our individuation criterion for invariances follow our individuation criterion for patterns in data: if two patterns differ, then the invariances that caused them differ also. On this criterion, there are as many distinct invariances in the world as there are distinct patterns in empirical data. The alternative view, that, in some cases, multiple distinct patterns are caused by the same invariance, is open to three objections. First, this view is empirically unmotivated. This view envisages a taxonomy in which patterns are attributed to a smaller number of invariances. However, there is no empirically detectable fact that can weigh for or against any such taxonomy. Second, this view does not amount to a coherent alternative. An invariance that manifests itself in multiple distinct patterns in data must be regarded as complex: it has multiple facets, each of which causes one of the patterns. If we are willing to go this far, we may equally identify each of its facets with an invariance in its own right. Third, this view is not stable. If we assume that an invariance is able to cause multiple patterns in data, then simplicity considerations will favour saying that the world in its entirety consists of just one invariance, which causes all the patterns in data that we are capable of picking out. Then, the option of saying that this invariance has as many facets as the patterns that we discern in empirical data becomes all the more compelling. These critical considerations support the conclusion that distinct patterns in empirical data constitute evidence of, and in this sense correspond to, distinct invariances in the world. Given that the class of invariances is equivalent to the class of structures, it follows that empirical data constitute evidence for the proposition that the world contains all possible structures. Let us briefly recapitulate the argument up to now: • Empirical data sets exhibit all possible patterns, each with a certain noise term. • Each pattern in empirical data, irrespective of its properties and of those of the noise term with which it is exhibited in data, is evidence of a structure in the world.
• Distinct patterns in data are evidence of distinct structures in the world. • Therefore, empirical data constitute evidence that the world contains all possible structures. We can now add detail to the preliminary formulation of the principle of evidential correspondence given at the end of Sect. 4. Our formulation then was that all patterns exhibited in empirical data correspond to structures in the world. We can now state the principle of evidential correspondence more fully as follows: Principle of evidential correspondence: Distinct patterns exhibited in empirical data, irrespective of the magnitude or other features of the noise with which they are exhibited, correspond to distinct structures in the world. Coupled with the finding that empirical data contain all possible distinct patterns, this principle yields the conclusion that the world contains all possible structures. 6 A radically polymorphous world I propose to label a world that contains all possible structures as “radically polymorphous”. The thesis that the world is radically polymorphous is alien to most thinking in philosophy of science and to our everyday conceptions of the world. Nonetheless, I believe that it is the sole possible outcome of a careful analysis of the features of empirical data and their evidential significance. In this section, I attempt to make the thesis more palatable by neutralizing some possible misgivings at it. How should we conceptualize the simultaneous existence of all structures in the world? Since our sole possible evidence that the world has some structure comes in the form of patterns in empirical data sets, I suggest that we allow our understanding of structures in the world to be guided by our understanding of patterns in data. The fact that we pick out one pattern in a data set does not establish that alternative patterns do not exist. In actuality, as we have seen, a data set exhibits all possible patterns with various noise levels. Patterns are superposed on one another and exist simultaneously in data: they are all present in the same empirical data set, available to be picked out or to be disregarded. We ought to conceive of structures in the world in the same way. Whereas we habitually think of a structure as something incompatible with alternative structures, in actuality all structures exist simultaneously in the world, available to be picked out or to be disregarded. Some might label this a form of relativism, but this characterization is wide of the mark. Relativism would involve saying that which structure the world has, depends on the observer’s standpoint. By contrast, I hold that the world has all structures absolutely: all structures are objectively real for all observers. Whatever its inadequacies, therefore, this is not a form of relativism. At most, only Bogen’s and Woodward’s label “phenomenon”—now reduced to an honorific title designating the structure that is at the focus of a particular research project—shifts from structure to structure depending on the research project being pursued, just as what counts as the pattern and as the noise in data depends on the project. A possible extension of this view is to reason that an entity that exhibits all possible symmetries is equivalent to one that has no structure at all, and conclude that the world
is amorphous. Whereas I would not quarrel with this further conclusion, I do not at present see any advantages in adding this. In virtually all practical cases, of course, observers choose to pick out just one pattern at a time from the infinitely many that are available in empirical data. Two devices in the observer normally guide the choice. The first is a preference for patterns of a certain form. For example, an observer may choose to pick out a pattern consisting of a horizontal straight line, corresponding to the constancy of some physical magnitude, from a graphical plot of a data set. As we saw in Sect. 3, the choice to pick out a pattern of a certain form in a data set may be motivated or constrained by previous choices to pick out certain other patterns in other data sets. In this sense the choice of a pattern of a particular form may be empirically motivated, although no history of earlier choices of patterns establishes that the latest pattern chosen has any more claim than any other pattern to correspond to a structure in the world. The second device is a preference for a noise level. An observer may tune in to patterns that are exhibited in a data set with higher or lower noise. A preference for a noise level enables the observer to choose whether to focus on patterns that follow the variance of the data more or less precisely. In many cases we would describe a pattern that is exhibited with lower noise as corresponding to smaller-scale structures in the world, and patterns exhibited with higher noise as corresponding to larger-scale structures. Preference for a noise level is built in to the human senses and to all observational instruments. For example, the human sense of sight has a preference for a noise level that ensures that we interpret visual stimuli in terms of structures of a size of approximately one meter. If the preferred noise level were lower, we would focus on smaller-scale structures, becoming more aware of the irregularities of a tabletop, for example. If the preferred noise level were higher, we would lose sight of many smaller objects altogether, and see landscapes as flat and bare. Structures on these alternative scales—irregularities of tabletops, featurelessness of landscapes—are self-evidently present too, but human observers in everyday circumstances usually focus on them less than on mid-sized structures. An observer’s preference for patterns of a certain form or for a certain noise level leads that observer to focus on some patterns in data and thus on some structures in the world to the exclusion of others. Neither preference, however, has any bearing on the question whether the world contains just one structure or, as I maintain, all possible structures. This question is decided exclusively by two factors: which patterns are exhibited in empirical data, and what evidential significance these patterns have. Indeed, observers would not need preferences for patterns of a certain form and for a certain noise level if the world manifestly displayed a unique structure. On a psychological level, the facility to pick out a single pattern may give observers the impression that the world has a unique structure, but this impression is corrected if they realize that they might equally legitimately focus on another pattern in the same empirical data. 
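The role of a preferred noise level can be illustrated with a small sketch (my own construction, with synthetic data): smoothing the same series with a wide window tolerates a large residual and keeps only the broad structure, while a narrow window follows the fine texture as well; both structures remain present in the data throughout.

# Illustrative sketch of a "preference for a noise level" (my own construction).
# A wide smoothing window tolerates a large residual and keeps only the broad,
# "landscape"-scale structure; a narrow window follows the fine "tabletop"
# texture as well.
import numpy as np

rng = np.random.default_rng(3)
x = np.arange(2000)
broad = 10.0 * np.sin(2 * np.pi * x / 1000)   # large-scale structure
fine = 1.0 * np.sin(2 * np.pi * x / 20)       # small-scale structure
data = broad + fine + rng.normal(0.0, 0.3, x.size)

def smooth(y, window):
    # Moving average over the given window length.
    kernel = np.ones(window) / window
    return np.convolve(y, kernel, mode="same")

coarse_view = smooth(data, 200)   # high tolerated noise: tracks only the broad pattern
fine_view = smooth(data, 5)       # low tolerated noise: tracks the fine texture too

print(np.sqrt(np.mean((data - coarse_view) ** 2)))   # noise accepted by the coarse view
print(np.sqrt(np.mean((data - fine_view) ** 2)))     # much smaller noise accepted here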
Whereas some researchers might regard science as the project to identify the unique structure of the world, anyone with experience of complex data sets—such as those pertaining to the cosmic microwave radiation or atmospheric temperature—will acknowledge that empirical data themselves contradict the assumption that the world shows a unique structure. A world with a unique
structure would not generate empirical data with the properties that we find data to have. Lastly, the options for those who wish to block the conclusion that the world has all possible structures are clear. They must either establish that empirical data sets do not exhibit all possible patterns, or propose a formulation of the principle of evidential correspondence between features of empirical data and structures in the world alternative to that given above. To do the latter, they must establish either that some of the patterns exhibited by empirical data sets do not constitute evidence of structures in the world, or that multiple distinct patterns constitute evidence of the same structure in the world. As I have tried to argue in the preceding pages, I believe that none of these challenges can be met. 7 Conclusions In this article, I have directed attention at the need to formulate the principle of evidential correspondence underlying modern empirical science. This principle is required to licence the move from perceiving a certain feature in an empirical data set to concluding that the world contains a certain structure. The principle should specify which features of empirical data constitute evidence of which structures in the world. I have argued that the sole admissible principle of evidential correspondence states that distinct patterns exhibited in empirical data correspond to distinct structures in the world. This principle holds irrespective of the magnitude or other properties of the noise with which patterns are exhibited in data sets. I have coupled this principle with the finding that empirical data sets exhibit all possible patterns, each with a certain noise level. My conclusion is that the world contains all possible structures. I have dubbed such a world “radically polymorphous”. Whereas the thesis that the world is radically polymorphous appears at first sight to conflict with our everyday experiences and with the findings of science, I have argued that this thesis is actually required to make sense of the nature of empirical data, and especially of its inexhaustibility. Whereas more work is needed to explore the implications of this view, it seems to suggest a new conceptualization of the way multiple structures co-exist in the world. Acknowledgements I dedicate this article to the memory of Daniela Bailer-Jones (1969–2006), who established and led the research programme, “Kausalität, Kognition und die Konstitution naturwissenschaftlicher Phänomene”, at the Ruprecht-Karls-Universität Heidelberg. I presented a first draft at the conference, “Data—Phenomena—Theories: What’s the Notion of a Scientific Phenomenon Good for?”, Heidelberg, September 2008. I thank the organizers for inviting me to speak and the participants for valuable comments. I am especially grateful to James Bogen and James Woodward for cordial and constructive exchanges. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References Bogen, J. (2009). ‘Saving the phenomena’ and saving the phenomena. Synthese. doi:10.1007/ s11229-009-9619-4.
Bogen, J., & Woodward, J. (1988). Saving the phenomena. Philosophical Review, 97, 303–352. Burroughs, W. J. (2003). Weather cycles: Real or imaginary? (2nd ed.). Cambridge: Cambridge University Press. Grenander, U. (1996). Elements of pattern theory. Baltimore, MD: Johns Hopkins University Press. Koertge, N. (1977). Galileo and the problem of accidents. Journal of the History of Ideas, 38, 389–408. Kosko, B. (2006). Noise. New York: Viking. Li, M., & Vitányi, P. M. B. (1997). An introduction to Kolmogorov complexity and its applications (2nd ed.). Berlin: Springer. Mann, M. E., Bradley, R. S., & Hughes, M. K. (1998). Global-scale temperature patterns and climate forcing over the past six centuries. Nature, 392, 779–787. McAllister, J. W. (1997). Phenomena and patterns in data sets. Erkenntnis, 47, 217–228. McAllister, J. W. (2003). Algorithmic randomness in empirical data. Studies in History and Philosophy of Science, 34, 633–646. McAllister, J. W. (2005). The virtual laboratory: Thought experiments in seventeenth-century mechanics. In H. Schramm, L. Schwarte, & J. Lazardzig (Eds.), Collection, laboratory, theater: Scenes of knowledge in the 17th century (pp. 35–56). New York: De Gruyter. McAllister, J. W. (2007). Model selection and the multiplicity of patterns in empirical data. Philosophy of Science, 74, 884–894. Partridge, R. B. (1995). 3 K: The cosmic microwave background radiation. Cambridge: Cambridge University Press. Ruddiman, W. F. (2008). Earth’s climate: Past and future (2nd ed.). New York: W. H. Freeman. Weinberg, S. (2008). Cosmology. Oxford: Oxford University Press. White, D. R., Galleano, R., Actis, A., Brixy, H., De Groot, M., Dubbeldam, J., et al. (1996). The status of Johnson noise thermometry. Metrologia, 33, 325–335. Woodward, J. (1989). Data and phenomena. Synthese, 79, 393–472. Woodward, J. (2009). Data and phenomena: A restatement and defense. Synthese. doi:10.1007/ s11229-009-9618-5.
Synthese (2011) 182:89–100 DOI 10.1007/s11229-009-9614-9
Data meet theory: up close and inferentially personal Ioannis Votsis
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 8 July 2009 © Springer Science+Business Media B.V. 2009
Abstract In a recent paper James Bogen and James Woodward denounce a set of views on confirmation that they collectively brand ‘IRS’. The supporters of these views cast confirmation in terms of Inferential Relations between observational and theoretical Sentences. Against IRS accounts of confirmation, Bogen and Woodward unveil two main objections: (a) inferential relations are not necessary to model confirmation relations since many data are neither in sentential form nor can they be put in such a form and (b) inferential relations are not sufficient to model confirmation relations because the former cannot capture evidentially relevant factors about the detection processes and instruments that generate the data. In this paper I have a two-fold aim: (i) to show that Bogen and Woodward fail to provide compelling grounds for the rejection of IRS models and (ii) to highlight some of the models’ neglected merits. Keywords
Data · Phenomena · Confirmation · Inferences · Bogen · Woodward
1 Introduction James Bogen and James Woodward’s much discussed work on the relationship between data, phenomena and theories denounces, among other things, positivist-inspired views of confirmation (Bogen & Woodward 1988, 1992, 2003; Woodward 1989). In their most recent article, they classify such views under the general heading ‘IRS’, for IRS supporters cast confirmation in terms of Inferential Relations between observational and theoretical Sentences. According to Bogen and Woodward the most paradigmatic of such views are the hypothetico-deductive and the positive instance models
I. Votsis (B) Philosophisches Institut, Universität Düsseldorf, Universitätsstr. 1, Geb. 23.21/04.86, 40225 Düsseldorf, Germany e-mail:
[email protected]
of confirmation. Countering IRS models, Bogen and Woodward launch two chief objections: (a) inferential relations are not necessary to model confirmation relations since many data are neither in sentential form nor can they be put in such a form and (b) inferential relations are not sufficient to model confirmation relations because the former cannot capture evidentially relevant factors about the detection processes and instruments that generate the data. In this paper I have a two-fold aim: (i) to show that Bogen and Woodward fail to provide compelling grounds for the rejection of IRS models and (ii) to highlight some of the models' neglected merits. 2 The IRS model and its presumed failures In their now classic 1988 paper, Bogen and Woodward draw a distinction between data and phenomena. Data are roughly what is observed or detected with or without the use of instruments. Examples include drawings made by surgeons, temperature readings and bubble chamber photographs. Phenomena are the objects, events and processes (in the world) under investigation. Examples include sensory processing dysfunction, the melting point of mercury and particle interactions. In Bogen and Woodward's view we use theories to systematically explain, infer or predict phenomena; we typically cannot use them to explain, infer or predict data. This view is a reaction to what they conceive of as the positivist-inspired view that "[e]xplanation, prediction, and theory-testing involve the deduction [or more generally the inference] of observation sentences from other sentences, some of which may be formulated in a 'theoretical' vocabulary..." (p. 303). While in their 1988 paper Bogen and Woodward focus on the first of these dimensions, i.e. explanation, in a recent paper they turn their attention to the theory-testing dimension, branding their nemesis 'IRS'. According to IRS, "the epistemic bearing of observational evidence on a scientific theory is best understood in terms of Inferential Relations between Sentences which represent the evidence and sentences which represent hypotheses belonging to the theory" (2003, p. 223). As already explained in the introduction, the most well-known versions of IRS are the hypothetico-deductive account and the positive instance account of confirmation. The latter is understood broadly so as to include a number of inductive models including bootstrapping accounts (see, for example, Glymour 1980) and several Bayesian accounts (see, for example, Howson & Urbach 1989).1 Against such views, Bogen and Woodward emphasise their conviction that the character of the evidential relation between data, phenomena and theories is not one best cast in inferential terms. In their eyes, "... the epistemic import of observational evidence is to be understood in terms of empirical facts about particular causal connections and about the error characteristics of detection processes. These [facts] are neither constituted by nor greatly illuminated by considering the formal relations between sentential structures which IRS models focus on" (2003, p. 223).
models face but in practice most fall prey to them (2003, f43, p. 255).
Bogen and Woodward trace the motivation for the first IRS models to philosophers like Hempel and Reichenbach, who wanted an objective theory of confirmation, one that avoids the conflation of objective, rational and logical factors with the undesired subjective, psychological ones. They cite, for example, Hempel who seems to want to de-subjectivise and generalise confirmation: “it seems reasonable to require that the criteria of empirical confirmation, besides being objective in character, should contain no reference to the specific subject matter of the hypothesis or of the evidence in question” (1965, pp. 9–10). It is essential to note that, as Bogen and Woodward concede, the IRS project is supposed to be a rational reconstruction of the confirmational practice of science. It is therefore no objection to IRS theorists to argue that their modelling does not mirror actual scientific reasoning. Having said this, IRS theorists can be held accountable if they fail to reconstruct all the evidentially significant factors that make up theory testing. That the formal tools of IRS models are not able to capture the evidential relations between data and theories is a case that Bogen and Woodward parcel into two objections. First, it is claimed that inferential relations are not necessary for confirmation relations because many data are neither in sentential form nor can they be put in such a form (pp. 232–236). Second, it is claimed that inferential relations are not sufficient for confirmation relations because: (a) they are unable to account for various actual and hypothetical cases including well-known paradoxes of confirmation (pp. 229–231) and (b) they are unable to account for the processes that generate the data and on which standards of reliability depend (pp. 249–255). In the sections that follow I raise doubts on the gloomy picture Bogen and Woodward paint on the prospects of encapsulating confirmation relations in an inferential manner. Moreover, whenever appropriate I present positive reasons for the benefits of adopting a general IRS framework, though I refrain from endorsing any particular manifestation of it.
3 Getting the sentence out of the data First up, let us consider the objection that inferential relations are not necessary for confirmation relations because several data are non-sentential in form. Such data are manifested in different ways, among them "photographs, drawings, tables of numbers, etc." (p. 226). To substantiate their objection, Bogen and Woodward throw the spotlight on the data gathered by the Eddington expedition in 1919 and used to confirm the general theory of relativity. The data were photographs and required a certain amount of processing, e.g. ascertaining scale, before they were used to calculate the phenomenon at issue, i.e. starlight deflection by the sun's gravitational pull. The result was then compared against the predictions of three competing hypotheses: (1) the General Theory of Relativity (GTR), (2) Soldner's Newtonian model and (3) the No Deflection Hypothesis. The detected starlight deflection, though not entirely in agreement with the predictions of the GTR, was much more in line with those predictions than with the ones made by Soldner's model. The upshot, as we all know, was a significant confirmational boost that helped establish GTR in the scientific world.
On its own the fact that many data are prima facie non-sentential does not suffice to trouble the IRS-theorist. Such a theorist, recall, wears a reconstructionist hat and hence need only demand that prima facie non-sentential data can be packaged in sentential form. What is really at stake then is whether such data are irreducibly non-sentential. To subdue the IRS-theorist, Bogen and Woodward need to argue that at least some of the prima facie non-sentential data—and perhaps even to argue that the vast majority of them—cannot be reduced to sentential form. Though they do not explicitly adopt such a stance, at least one of their arguments seems to target reducibility. To be exact, they argue that IRS "provides no guidance for the construction of the required observation sentences" (p. 235). Need IRS provide such guidance? It would be a boon, of course, but as Bogen and Woodward regularly point out—see Sect. 5 below—the data are often idiosyncratic in a way that prevents IRS theorists from formulating a general rule or even a rule of thumb that converts non-sentential data into sentential ones. In my view, the most likely scenario is that there is not only one rule but a great number of them, each corresponding to different types of data and conditions under which they are useful, e.g. a rule for converting photographic data when scale is not important, a rule for converting sounds coming from a Geiger counter, etc. A powerful justification for this view comes from the fact that we have learned to automate such tasks with computers. The mere digitisation of data makes them ripe for inferential manipulation—in other words just what the IRS doctor ordered. The subsequent algorithmic analyses of the data to detect phenomena illustrate that even this stage of the testing of a hypothesis is an inferential matter. We can even program the computer to output its results in natural language form, e.g. 'Phenomenon A was detected at time t1 under conditions φ, ψ, and ζ'. Take the data in the SETI@home experiment based at the University of California, Berkeley. The experiment attempts to detect extra-terrestrial broadcasts (the phenomenon under study) by analysing narrow-bandwidth signals (the data) gathered from the Arecibo radio telescope. The data are recorded, digitised and broken into small manageable chunks which are then sent around the globe to home users who have downloaded a program that analyzes the data for extra-terrestrial signals. All of this is automated, to the extent that a scientist's computer could output each step in the process in argument form. For the time being, to counter the irreducibility claim, it will thus suffice to tackle Bogen and Woodward's counterexamples on a case-by-case basis. Take the GTR example. The evidentially relevant part of the eclipse photographs consists in the distances between the images of the stars. These represent the apparent distances between the stars under starlight deflection. Foil photographs were also taken at night-time to record the 'actual' distances of the stars, i.e. the distances without starlight deflection from the sun. In sentential form these data will look something like this: 'The image of star A is n microns distant from the image of star B'. Similar reconstructions of the data can be given to Bogen and Woodward's favourite cases, namely that of the calculation of the melting point of lead and that of the detection of weak neutral currents. 
The data collected from measurements of the melting point of lead can be sententially dressed as follows: ‘Sample x1 measured by thermometer l1 under experimental conditions c1 at time t1 registered value y1 ’. The data from the Gargamelle heavy-liquid bubble chamber at CERN were also photographs. As with all particle accelerator
photographs the telltale signs of particles and of particle interactions are streaks that satisfy certain geometrical properties. In sentential form these could look something like this: 'Photograph x1 taken at the Gargamelle bubble chamber at time t1 under experimental conditions c1 exhibits geometrical properties y1'. Bogen and Woodward will surely complain that there is much else of evidential relevance in these cases. That's absolutely right. There are calculation techniques, correction factors, signal threshold assumptions, etc. Nothing prevents us from stating these in sentential form. Take the GTR case again. One of the central evidentially relevant assumptions employed was a correspondence relation between features of the photographs and features we want to attribute to the stars. The correspondence can be stated in sentential form and will look something like this: 'Distances in microns between images of the stars correspond to distances in astronomical units (AU) between the stars'. Another central assumption that was evidentially relevant was a correction for parallax effects between the eclipse and comparison photographs. In sentence form this assumption will not be very different from the following: 'For eclipse photographs taken at (long.-lat.) coordinates λ1, φ1 at time interval t2 − t1, the distance measured between images of stars A and B in microns n needs to be corrected by factor g before it can be compared to the night-time photographs taken at (long.-lat.) coordinates λ2, φ2 at the same time interval'. The irony of it all is that sometimes Bogen and Woodward unwittingly offer their own sentential formulations of evidentially relevant considerations. For example, speaking of the scale complications between the eclipse and the comparison photographs they state that this was solved by a "correspondence of radial distances between stars shown on an accurate star map to linear distances between star images on the photographs" (p. 232). A similar treatment can be given to the other high-profile cases Bogen and Woodward like to discuss in their work. Indeed, I provide examples of sentential formulations of evidentially relevant factors from the melting point of lead case (Votsis, forthcoming). Without a doubt these reconstructions are not easy to perform. Life does get a bit easier for the IRS theorist, however, in that evidentially relevant factors need not be reconstructed in detail. As Bogen and Woodward themselves note, some factors are either not explicit or not systematically understood (p. 244). In such cases all one needs to do is encapsulate whatever little information we have about the factor in sentential form. For example, not knowing the exact mechanism behind a reliable instrument does not forbid us from asserting (justifiably or unjustifiably) that the features in its outputs correspond to features in the objects that we target with it. Bogen and Woodward's discussion of Galileo's telescope fits like a glove here as they point out that the telescope was reliable "even though Galileo lacked an optical theory that explained its workings" (p. 240). At this point one may well wonder why we would trouble ourselves with a reconstruction of evidentially relevant factors about which we have little to no knowledge. The answer is the same as that given for factors we do have knowledge about, namely that by reconstructing them we can more easily examine their inferential role and ultimately hold them accountable. 
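The kind of automated conversion from digitised measurements to observation sentences gestured at above can be sketched in a few lines (the field names and values here are hypothetical, chosen only to mirror the example sentences in the text):

# A minimal sketch of converting digitised measurement records into observation
# sentences. The record fields and values are hypothetical illustrations.
melting_trials = [
    # (sample, thermometer, conditions, time, reading)
    ("x1", "l1", "c1", "t1", 327.4),
    ("x2", "l1", "c2", "t2", 327.6),
]

def to_observation_sentence(sample, thermometer, conditions, time, value):
    # Render one measurement record as a sentence of the form used in the text.
    return (f"Sample {sample} measured by thermometer {thermometer} under "
            f"experimental conditions {conditions} at time {time} "
            f"registered value {value}.")

for record in melting_trials:
    print(to_observation_sentence(*record))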
A final remark should secure enough nails on the coffin of the irreducibility objection to seal it tight. If a set of data is to play as weighty a role as confirming or disconfirming a phenomenon (and by Bogen and Woodward’s lights only indirectly a theory) then it had better be something that we can express assent or dissent towards.
This I take to be a basic requirement of scientific discourse about data. But to express assent towards a set of data D1 or dissent towards a set of data D2, both with respect to a phenomenon P, just means to believe in the first but not in the second. Since beliefs, at least for most philosophers, are propositional attitudes, it is reasonable to assume that the contents of the foregoing beliefs are propositions. But that's precisely what sentences express.2 Hence to express assent or dissent to a set of data is to implicitly (and perhaps even contrary to one's wishes) endorse the view that what is evidentially of essence in that set can be put in sentential form. To sum the section up, Bogen and Woodward's qualms are powerless to faze the IRS theorist. All that matters according to that theorist is whether we can (re)construct sentences from prima facie non-sentential data without loss of evidential content. I am sympathetic to this view and furthermore challenge Bogen and Woodward to display one datum whose evidential worth cannot be sententially packaged in any of the aforementioned ways. 4 Restoring sufficiency. Part I—A paradox and a thought experiment Let us now turn to the objection that IRS models are insufficient to capture all evidential relations. This objection amounts to the idea that there are cases where the theory and the data have the right inferential relation but no corresponding evidential relation exists. In the next section I examine how Bogen and Woodward's views about standards of reliability threaten the sufficiency of IRS models. In this section I concentrate on two of their counterexamples to the sufficiency of IRS models that are discussed independently of reliability considerations. One is the well-known raven paradox. A white shoe does not seem to be evidence for the hypothesis that 'All ravens are black'. Yet, under the IRS model of confirmation it is evidence because it stands in the right inferential relations to that hypothesis. The other example is a thought experiment which involves what I brand 'doppelganger data'. Imagine a case where a set of data D1 generated by some unknown process is identical to a set of data D2 generated at Gargamelle. Suppose further that D2 positively identifies a weak neutral current event. D1 does not seem to be evidence for weak neutral currents although it stands in the right inferential relations to them. What lands the IRS models in hot water, according to Bogen and Woodward's diagnosis, is that they neglect to take into account the actual processes that generate the data (p. 250). Confronting the IRS theorist are two options. First, such a theorist can bite the bullet and argue that data standing in such inferential relations do confer some support. This option is harder to sell in the case of the doppelganger data but it is a rather uncontroversial one in the debate over the significance of the raven paradox. Thus it has been argued that white shoes provide some support for the raven hypothesis even though much less than the support one gets from finding black ravens. The degree of support 2 We can even argue in the following way. Assent or dissent towards a set of data expresses a belief in the truth or falsity of the data respectively. But truth and falsity are properties of sentences. 
More subtly, since in most cases data encode information that is to some extent degraded, assent towards D1 amounts to, among other things, the belief in the truth of a counterfactually close data set D1 which is devoid of such imperfections.
depends on the relative size of the two classes, in this case that of non-black things and that of black ravens. Perhaps more subjectively, the degree of support depends on whether such objects were randomly picked. There is no reason why these ideas cannot be encoded as auxiliaries or even as specific rules of inference. Bayesians, for example, model the different degrees of support—via degrees of belief—afforded by each datum by assigning different values to the expectedness of the evidence. The second option available to the IRS theorist is to accept (as Bogen and Woodward anyhow insist) that such data are not evidentially relevant but at the same time to employ inferential filters to dismiss them. That way the data no longer stand in the right inferential relations to the theory and hence no insufficiency objection can be launched. This option is open to both the doppelganger data and the raven paradox data. In the raven case, we can use rules to block Bogen and Woodward's assumption that "evidence which confirms a claim also confirms claims which are logically equivalent to it" (p. 230). In cases of doppelganger data, we can use rules to block inferential relations that do not instantiate the right causal relations in much the same way as causal-reliabilist theories in epistemology block knowledge when an agent's belief in a certain proposition p is not causally connected with one or more fact(s) about p. I can conceive of two ways of implementing the second option. The first involves the postulation of rules that can be used as steps in a non-monotonic argument to prohibit a certain class of inferences. Such rules are therefore internal to the argument. The second involves the postulation of rules that discriminate between relevant and irrelevant data prior to the construction of the particular inferential model. These rules are external. A potential objection to the second option calls into question the augmentation of IRS models with additional rules, especially those that have semantic content. To do so, it might be argued, would be to go beyond the original conception of the model as a purely (first-order) syntactical account of confirmation and hence to defend a viewpoint that was never in contention. Although reasonable-sounding at first, this objection caricatures the truth on the ground. IRS models, even those touted by logical empiricists, were hardly ever purely syntactical.3 Take the deductive-nomological model, which served both as an account of explanation and as one of confirmation.4 Among other things, the model demands the satisfaction of at least one extra-syntactical condition, namely that the premises of a D-N argument all have empirical content.5 In similar fashion, we may demand that other instantiations of IRS models fulfil any number of additional syntactical or extra-syntactical conditions without forsaking their IRS pedigree.
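To make the Bayesian remark concrete, here is a minimal sketch (my own toy model, with invented class sizes; it is not Bogen and Woodward's, nor any particular Bayesian's preferred formalisation). The alternative to 'All ravens are black' is modelled, crudely, as 'exactly one raven is non-black', and objects are sampled from the relevant classes.

# A toy Bayesian illustration of the raven case. Assumptions: hypothetical class
# sizes; the alternative ~H is modelled as "exactly one raven is non-black".
R = 10_000             # ravens in the world (assumed)
K = 10_000_000_000     # non-black objects other than ravens (assumed)
prior = 0.5            # prior probability of H = "all ravens are black" (assumed)

def posterior(likelihood_ratio, prior):
    # Posterior of H given evidence with likelihood ratio P(E | H) / P(E | ~H).
    odds = (prior / (1.0 - prior)) * likelihood_ratio
    return odds / (1.0 + odds)

# Sampling a raven at random and finding it black:
# P(E | H) = 1, P(E | ~H) = (R - 1) / R, so the likelihood ratio is R / (R - 1).
lr_black_raven = R / (R - 1)

# Sampling a non-black object at random and finding it not to be a raven:
# P(E | H) = 1, P(E | ~H) = K / (K + 1), so the likelihood ratio is (K + 1) / K.
lr_white_shoe = (K + 1) / K

print(posterior(lr_black_raven, prior))   # ~0.500025: a small but appreciable boost
print(posterior(lr_white_shoe, prior))    # ~0.500000000025: a vanishingly small boost

Both observations confirm the hypothesis on this toy model, but to vastly different degrees, in line with the relative sizes of the two classes.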
3 Cf. Hempel (1943).
4 Although the deductive-nomological model was primarily meant as a model of explanation, we must not forget that for the logical empiricists explanation and prediction were intimately (and even symmetrically) tied. For that reason the model was employed in a confirmational capacity. It is worth noting that more recent models of explanation, e.g. Kitcher's unificationist account, continue to enjoy proximal relations with confirmation.
5 Another potentially extra-syntactical condition is the requirement that at least one law of nature be included in the premises of a D-N argument. Its qualification as extra-syntactical depends on whether laws of nature are understood in a purely syntactical manner.
5 Restoring sufficiency. Part II—A question of reliability
The main thrust of Bogen and Woodward's objection that IRS models are insufficient to capture all evidential relations comes not from the examples discussed in the previous section but from their discussion of reliability. Two notions of reliability are at stake here, general and local. An instrument or detection process is generally reliable with respect to a set of data "if it has a satisfactorily high probability of outputting, under repeated use, correct discriminations… of competing phenomenon-claims and a satisfactorily low probability of outputting incorrect discriminations" (p. 237). Galileo's telescope is deemed generally reliable because it presumably possesses 'repeatable error characteristics'.6 Sometimes general reliability is insufficient to establish the evidential import of data. It is then that we turn to local reliability. This involves "single-case performances of procedures, pieces of equipment, etc" (p. 226). The relevant phenomenon is thus established without recourse to repeatability but simply "by ruling out other possible causes of the data" (p. 245). So, for example, "...it might be that this particular fossil is contaminated in a way that gives us mistaken data, or that the equipment I am using has misfunctioned [sic] on this particular occasion" (p. 244). The data resulting from the radioactive dating of this fossil might still be locally reliable, provided we can rule out the contamination or the malfunctioning of the instrument as plausible causes.
6 I say 'presumably' because their account of the telescope's general reliability is thin on the details.
Bogen and Woodward contend that the IRS cannot model evidentially relevant factors concerning general or local reliability. They buttress their contention by citing two closely related reasons. The first one latches onto the idea that "assessing the general [or local] reliability of an instrument or detection technique does not require that one possess a general theory that systematically explains the operation of the instrument or technique or why it is generally reliable" (p. 240)—see also (1988, pp. 309–310, 312, 317). What is implied here is that without such a general theory the IRS cannot reconstruct the kinds of evidential relations that hold between instruments/detection techniques, data and theories. The second reason concerns the "idiosyncratic" and "highly heterogeneous" nature of the information utilised in making judgments about reliability (2003, pp. 244–245).7 This information includes such things as "the performance of the detection process in other situations in which it is known what results to expect, about the results of manipulating or interfering with the detection process in various ways, and so forth" (p. 253). The punch-line, according to Bogen and Woodward, is that it is unclear how to model such information via the IRS.
7 Bogen and Woodward seem to intend the two reasons to apply to both general and local reliability. For example, after discussing how the two reasons are pertinent to general reliability, they say "[s]imilar remarks apply to conclusions about local reliability" (f37, pp. 239–240).
A no-frills response to the first reason is that the IRS theorist need not possess a general theory that systematically explains the reliability or the operation of an instrument or technique. So long as we have some evidential basis for our belief in the reliability of an instrument or technique, I do not see why a relevant auxiliary cannot be reconstructed. Such an auxiliary, if you recall from Sect. 2, may encapsulate whatever little
information we have about a given factor and still prove helpful. Consider the case of Galileo's telescope again. The general reliability of the telescope becomes apparent when one realises that we have independent confirmation for a subset of the data it generates, e.g. features of the moon we can see with the naked eye. But this is surely easy to encode into an auxiliary. In the given case the auxiliary will look something like this: 'Galileo's telescope is generally reliable with respect to domains of phenomena P1 and P2 where it produces data sets D1 and D2 respectively because we have independent confirmation (i.e. several consistent naked-eye accounts) of the validity of D1'. The benefit of formulating such auxiliaries is that they facilitate the transparent evaluation of scientific presuppositions. In short, they allow us to throw light on the defeasibility conditions of such presuppositions. For example, if we have various independent reasons to believe that Galileo's telescope is not reliable in its production of a given proper subset of data D2, then we shall have to modify the auxiliary accordingly, as well as any of our epistemic commitments that are based on it.
A similar response can be given to Bogen and Woodward's second reason. The specificity and heterogeneity of the relevant information are irrelevant to the question whether we can construct suitable auxiliaries. Initial conditions are both specific and heterogeneous, yet they have long been a staple of IRS models without attracting objections on that score. Why should information about the local reliability of an instrument or a technique be any different? To satisfactorily answer this question we need to delve deep into the sole example Bogen and Woodward provide considerable detail for, namely the attempts to detect gravitational waves in the late 1960s and early 1970s.
The key figure in this story is Joseph Weber, who in 1969 unilaterally claimed their discovery. The measuring instrument in Weber's experiment was a metal bar which was supposed to vibrate under the influence of gravitational waves. Since other sources could cause such vibrations—e.g. seismic, electromagnetic or thermal activity—the single most important consideration in designing the experiment was to shield against, or at least to correct for, confounding factors. That's precisely what Weber set out to do. In so doing, Bogen and Woodward tell us, Weber was building up a case for the local reliability of his instruments and methods.
The problem [Weber] faced was that a number of other possible causes or factors besides gravitational radiation might in principle have caused his data. Unless Weber could rule out, or render implausible or unlikely, the possibility that these other factors might have caused the disturbances, he would not be justified in concluding that the disturbances are due to the presence of gravitational radiation (p. 247).
In the years that followed a consensus formed that Weber did not in fact positively identify gravitational waves. At issue were the very methods he employed to establish local reliability. Weber's critics, according to Bogen and Woodward, were able to challenge his detection claim on grounds that were idiosyncratic and highly heterogeneous. Two questions emerge at this point. What were these idiosyncratic and highly heterogeneous evidential considerations upon which Weber's detection claim was challenged? Are IRS models incapable of doing such evidential considerations
justice? Bogen and Woodward list three evidential considerations that were crucial to the dismissal of Weber's detection claim. Only two of them can be classified as idiosyncratic and highly heterogeneous.8 They are: (i) one of Weber's detection claims was made on the basis of an erroneous assumption about the time another team's data were recorded—the correction of this assumption annulled the detection claim; and (ii) the detections produced by Weber's own data had the desirable feature of sidereal correlation, but this feature could not subsequently be reproduced either by him or by any of the other teams.
8 The third one is that none of the competing groups of scientists were able to replicate Weber's results.
Let us look at the first problem. Even though Bogen and Woodward mention this problem in passing, we can take a more sustained look informed by Franklin's (1994) excellent historico-philosophical paper on this event.9 Franklin digs deep into the published records (articles and memoirs) and unearths the following details. A handful of teams other than Weber's ran their own detection experiments, collected data and analysed them. The data were then passed around the various groups for closer scrutiny. At one point, Weber professed a detection correlation between a segment of his own team's data and data gathered by a team led by David Douglass. According to Weber, the time stamps of the two sets of data had a 1.2 s discrepancy, but that could easily be explained away by the fact that the clocks used at each of the experiment locations were not in perfect synchrony. As a matter of fact the time discrepancy was 4 h 1.2 s. Weber had erroneously not taken into account that the clocks in the two experiment locations were set to different time zones (one used GMT and the other EDT). Douglass confronted Weber at a General Relativity conference in Cambridge (CCR-5) and Weber conceded the mistake. What is relevant to our discussion here is that we can model the time difference between the two sets of data by appealing to a correction factor. Crucially, we can do this by encoding the relevant information in sentential form more or less as follows: 'Data taken by Weber's team can be compared to data taken by Douglass' team only after they are offset by 4 h 1.2 s'.
9 Franklin discusses additional difficulties afflicting Weber's methods and results.
The second problem also needs some stage setting. Bogen and Woodward explain the appeal of sidereal correlations:
Weber also relied on facts about the causal characteristics of the signal—the gravitational radiation he was trying to detect. The detectors used by Weber were most sensitive to gravitational radiation when the direction of propagation of given radiation was perpendicular to the axes of detectors. Thus if the waves were coming from a fixed direction in space (as would be plausible if they were due to some astronomical event), they should vary regularly in intensity with the period of revolution of the earth. Moreover, any periodic variations due to human activity should exhibit the regular 24 h variation of the solar day. By contrast, the pattern of change due to an astronomical source would be expected to be in accordance with the sidereal day which reflects the revolution of the earth around the sun, as well as its rotation about its axis, and is slightly shorter than the solar day. When Weber initially appeared to find a significant correlation
with sidereal, but not solar, time in the vibrations he was detecting, this was taken by many other scientists to be important evidence that the source of the vibrations was not local or terrestrial, but instead due to some astronomical event (p. 246).
The honeymoon did not last long. None of the other groups that set about to confirm these results were able to detect a sidereal correlation. Even Weber himself was unable to repeat his earlier 'success'. Once more, what is relevant to our discussion is whether IRS models can capture the relevant idiosyncratic and highly heterogeneous evidential considerations. It is not hard to imagine that they can. The relevant formulation will not look very different from this: 'The intensity of any signal k that meets threshold r should co-vary with the sidereal day and, moreover, such positive correlations should be reproducible by other similarly constructed experiments'.
6 Conclusion
In closing, I would like to summarise some of the main points I have argued for. In Sect. 3 I have argued for the view that prima facie non-sentential data can be reconstructed in sentential form without loss of evidential content. To support this view I furnished sentential reconstructions of Bogen and Woodward's central examples of non-sentential data. What is more, I supplied general reasons why sentential reconstructions are always possible, namely that the widespread automation of data processing implies that in such cases the data are already in sentential form or can easily be converted into it, and that the expression of assent or dissent towards a set of data is at least an implicit endorsement of their having propositional content. I concluded that section by challenging Bogen and Woodward to come up with one datum whose evidential worth cannot be sententially recreated.
Arising from Sects. 4 and 5 is, I hope, a clear picture of why Bogen and Woodward's insufficiency objection is hard to maintain. Neither the raven paradox nor the doppelganger data scenario manages to box in the IRS theorist, who can always rely on syntactical as well as extra-syntactical tools to restore the model's sufficiency. Considerations concerning general and local reliability fare no better. The IRS theorist need not be in possession of a general theory that explains the workings of an instrument or method. In so far as we have independent reasons to believe in the reliability of the method or instrument, reconstructing a suitable auxiliary does not seem beyond reach. Finally, the cited examples of idiosyncratic and highly heterogeneous evidential considerations appear to be as amenable to IRS formulation as the rest of the cases discussed.
Acknowledgement This work was in part made possible by funding from the German Research Foundation (Deutsche Forschungsgemeinschaft).
References
Bogen, J., & Woodward, J. (1988). Saving the phenomena. The Philosophical Review, 97, 303–352.
Bogen, J., & Woodward, J. (1992). Observations, theories and the evolutions of the human spirit. Philosophy of Science, 59(4), 590–611.
Bogen, J., & Woodward, J. (2003). Evading the IRS. In R. Jones & N. Cartwright (Eds.), Idealization XII: Correcting the model (Poznan Studies in the Philosophy of the Sciences and the Humanities) (pp. 233–268). Amsterdam: Rodopi.
Franklin, A. (1994). How to avoid the experimenter's regress. Studies in History and Philosophy of Science, 25(3), 463–491.
Glymour, C. (1980). Theory and evidence. Princeton: Princeton University Press.
Hempel, C. G. (1943). A purely syntactical definition of confirmation. The Journal of Symbolic Logic, 8(4), 122–143.
Hempel, C. G. (1965). Aspects of scientific explanation. New York: The Free Press.
Howson, C., & Urbach, P. (1989). Scientific reasoning: The Bayesian approach. La Salle, IL: Open Court.
Votsis, I. (forthcoming). Making contact with observations. In M. Suárez, M. Dorato, & M. Rédei (Eds.), EPSA07: Launch of the European Philosophy of Science Association. Springer.
Woodward, J. (1989). Data and phenomena. Synthese, 79(3), 393–472.
Synthese (2011) 182:101–116 DOI 10.1007/s11229-009-9611-z
From data to phenomena: a Kantian stance Michela Massimi
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 7 July 2009 © Springer Science+Business Media B.V. 2009
Abstract This paper investigates some metaphysical and epistemological assumptions behind Bogen and Woodward's data-to-phenomena inferences. I raise a series of points and suggest an alternative possible Kantian stance about data-to-phenomena inferences. I clarify the nature of the suggested Kantian stance by contrasting it with McAllister's view about phenomena as patterns in data sets.
Keywords Data · Phenomena · Bogen · Woodward · McAllister · Kant
1 Introduction
Bogen and Woodward's (1988) distinction between data and phenomena marks an important turning point in recent philosophy of science. From a historical point of view, their notion of phenomena as stable and repeatable features emerging out of different data marks the beginning of a new trend in which phenomena are regarded as robust entities that scientific theories explain and predict, against a 'thinner' notion of phenomena typical of the empiricist tradition. As such, they have provided a foil to re-assess van Fraassen's constructive empiricism. It is this particular aspect of Bogen and Woodward's position that originally hooked me and that I explored in connection with a criticism of van Fraassen's constructive empiricism in Massimi (2007). In this paper, I do not want to discuss the merits of the data–phenomena distinction with respect to van Fraassen's view, nor do I want to present lengthy case studies taken from the history of physics. My starting point is instead a puzzle: how should we understand the data–phenomena distinction? Is it just descriptive of scientific practice?
M. Massimi (B) Department of Science and Technology Studies, University College London, Gower Street, London WC1E 6BT, UK e-mail:
[email protected]
Or does it have a normative status? If descriptive, we risk losing ourselves in a plurality of case studies, each of which may be right in its own terms, but it would no longer be clear what the exact role of the distinction is. If, on the other hand, we take it to have a normative status, then we need to understand how exactly data-to-phenomena inferences ought to work. Although there may well not be a unique way in which data-to-phenomena inferences ought to work, I think the distinction does have a normative status; otherwise it could not carry any epistemological implications against van Fraassen's constructive empiricism, for instance.
In this paper, I want to endorse an approach to the data–phenomena distinction which is normative and naturalised at the same time, following the Kantian tradition of epistemological naturalism. This is the tradition that claims that answers to the problem of knowledge (i.e., of how we know what we know) should be found by drawing on the natural sciences. Or better, they should be found by taking the natural sciences as paradigmatic of scientific knowledge and by investigating the conditions under which we gain knowledge of nature.
My focus is on data-to-phenomena inference, as Woodward (1989, 1998) has characterised it in terms of statistical reliability and manipulationist causation. I introduce a Kantian stance on phenomena and compare it with James McAllister's (1997, 2004, 2007) alternative account of phenomena as patterns in data sets. I agree with Bogen and Woodward's bottom-up characterization of data-to-phenomena inference, but I draw a conclusion that Bogen and Woodward would resist: namely, that phenomena are (partially) constituted by us, rather than being ready-made in nature. On the other hand, I agree with McAllister that phenomena have features which are—in some relevant sense to be clarified below—mind-dependent, but I resist McAllister's characterization of phenomena as patterns in data sets. Thus, a Kantian stance situates itself in between the externalist account of Bogen and Woodward and McAllister's internalist one. In Sect. 2, I raise three points to show that statistical reliability and manipulationist causation may not be sufficient to individuate phenomena unequivocally. In Sect. 3, I present a Kantian stance on phenomena intended as 'conceptualised appearances', and I discuss some of its main aspects by highlighting the points of convergence and divergence with respect to McAllister's view.
2 Metaphysics and epistemology in Bogen and Woodward's notion of phenomena
In this section I want to highlight some of the metaphysical and epistemological assumptions behind the distinction between data and phenomena. From a metaphysical point of view, Bogen and Woodward share realist intuitions about phenomena: phenomena exist 'out there' in nature, as stable and repeatable features emerging across a variety of experimental contexts and data. As Woodward nicely puts it: "Detecting a phenomenon is like looking for a needle in a haystack or (…) like fiddling with a malfunctioning radio until one's favourite station finally comes through clearly" (Woodward 1989, p. 438).
Despite the variety and idiosyncratic nature of the causal factors involved in data production, Woodward stresses the importance of controlling and screening off various causal influences that may undermine the reliability of phenomena detection. The usual procedure consists in (1) control of potential confounding factors; (2) elimination of background noise; (3) procedures for statistical analysis and data reduction. Woodward's analysis goes in a Cartwright–Hacking direction in claiming that (1)–(3) are relatively theory-free and fall into the province of experimentalists' expertise, rather than of theoreticians'. Not surprisingly, most of the examples from the history of high-energy physics draw on the works of historians and sociologists such as Galison (1985), Franklin (1986), Collins (1981) and Pickering (1984). Thus, the metaphysical framework is close to that of experimental realism, whereby (i) phenomena such as weak neutral currents exist in the world 'out there'; (ii) they manifest themselves by causally producing data such as bubble chamber photographs, which we then (iii) learn how to distinguish from other data due to background noise via reliable procedures of data analysis and data reduction.
Closely coupled with this experimental realist metaphysics is the epistemology of reliabilism, championed in recent times by Alvin Goldman (1986) among others. The key idea is that we are justified in believing in a phenomenon p if the process that confers justification is reliable, i.e. generates true beliefs with high frequency. In other words, we are justified in believing in a phenomenon p if the process of data production and data analysis is reliable, i.e. generates true beliefs about p with high frequency. Woodward (1998) has addressed the issue of reliability by explicitly drawing links with both Goldman's epistemology and Deborah Mayo's (1996) error-statistical approach. The central idea is to identify patterns of counterfactual dependence between data and phenomena that take the following form (Woodward 1998, S166): "(1) if Dj is produced, conclude that Pi is true. Then the ideal at which one aims is (2) the overall detection or measurement procedure should be such that each of the conditionals of form (1) recommends that one accept Pi when and only when Pi is correct. Somewhat more succinctly: the detection and measurement procedure should be such that different sorts of data D1…Dm are produced in such a way that investigators can use such data to reliably track exactly which of the competing claims P1…Pn is true".
A distinctive feature of reliabilism is that it licenses theory-free data-to-phenomena inferences: "in order for data to be reliable evidence for the existence of some phenomenon (…), it is neither necessary nor sufficient that one possesses a detailed explanation of the data in terms of the causal process leading to it from the phenomenon" (Woodward 1989, pp. 403–404):
• It is not sufficient because even if a phenomenon plays a causal role in the production of data, it may well be impossible to extract reliable information from the data about the phenomenon of interest. For example, although W and Z bosons are produced in bubble chambers, in the 1960s and 70s no one used bubble chambers to detect W and Z because it would have meant locating a very rare event among millions of pictures, "an impractical task" until Carlo Rubbia did it in the 1980s.
• It is not necessary either because data can provide reliable evidence for some phenomena even if one is ignorant of or mistaken about the character of the causal processes leading from the phenomena to the data. For instance, Donald Glaser,
who invented the bubble chamber, was himself quite mistaken about the actual causal mechanism at work in bubble chambers (which he misinterpreted as electrostatic repulsion).
Thus, there are two main interrelated assumptions behind Bogen and Woodward's notion of phenomena:
I. Metaphysics: Phenomena are ready-made in nature and they manifest themselves by causally producing data.
II. Epistemology: Reliability is the epistemic criterion for knowing phenomena from data.
2.1 Three points about Bogen and Woodward's notion of phenomena
In this section, I want to raise three points concerning both the metaphysical and epistemological assumptions above.
2.1.1 On the metaphysics of ready-made phenomena: underdetermination of phenomena by data
The problem with the metaphysics of ready-made phenomena is that it is not always clear what signal we should be looking for in a sea of noise: in fact, there could well be more than one relevant signal compatible with the same data. To put it in a different way, the data from which we have to infer the phenomena can be causally produced by a variety of different phenomena under suitable experimental conditions, so that the choice of which phenomena the data provide evidence for may well be underdetermined by the data. I have explored this entity-version of the classical underdetermination problem in relation to Hacking's (1983) experimental realism in Massimi (2004), where I concluded that "whenever we have prima facie rival potential causes for the same phenomena, in order to distinguish between them and to determine which entity-with-causal-power has actually produced the observed effect, we must in the end rely on a description of what causal powers/capacities/dispositions an entity is to have so as to produce the observed effects. This description is given by a scientific theory" (ibid., pp. 42–43).
Let me clarify this point. This is not a re-statement of the theory-ladenness of observation. I am making a more radical claim. I claim that there can be more than one way in which we can carve data production, analysis, and reduction, given the various contextual factors in the sea of noise. And what sort of phenomena we infer depends on the way we have carved and 'massaged' those data. Of course, there are constraints, and in no way do I want to suggest that anything goes, as will become clear when I present my Kantian stance. But to go back to this first point, the same data may be compatible with more than one phenomenon, and simply manipulating/controlling data at the experimental level may not be sufficient to discriminate what entity-with-causal-powers has produced them. Woodward (1998, S167) acknowledges the problem:
Consider a case in which the presence of neutral currents is in fact sufficient for (…) or has in fact caused certain bubble chamber photographs D1. If some other
competing phenomena claim P2—e.g. that background neutrons are present—is also sufficient for D1 or if background neutrons would also cause D1 (…), then the overall detection procedure is very likely such that one would conclude P1 even if P2 is correct. In this case the pattern of counterfactual dependence described above will not be satisfied and D1 will not be good evidence for P1. The pattern of counterfactual dependence just described is an ideal, which for a variety of reasons is rarely fully satisfied in practice.
The conclusion he draws is that data-to-phenomena inferences are probabilistic rather than deterministic. Thus, one possible response to the problem of underdetermination of phenomena by data consists in offering a probabilistic version of reliability. In this way, the burden of supporting the metaphysics of ready-made phenomena is shifted to the epistemology of reliabilism, to which I now turn.
2.1.2 On reliability as an epistemic criterion independent of our knowledge of the causal mechanism leading from the phenomenon to the data
The main novelty of Bogen and Woodward's notion of phenomena is their use of reliabilism to back up their experimental realist metaphysics. In epistemology, reliabilism has been championed in recent times by Alvin Goldman and Fred Dretske (1981), among others. The distinctive feature of reliabilism is that it provides justification for beliefs in a way that is independent of our knowledge of the causal mechanism behind data production, as we saw above. So it is this claim that we have to assess here.
Consider the following counterexample. An experimenter may come to believe in a phenomenon p by a reliable process, i.e. a process that generates true beliefs about p from data with high frequency, although the experimenter's undergoing this process is caused in an unreliable way (for instance, the experimenter may have learnt a reliable data reduction process from an unreliable colleague), so that although the belief-forming process is reliable we would not count the resulting belief as justified. Goldman responds to this type of counterexample by introducing second-order processes (1986, p. 115): "A second-order process might be considered metareliable if, among the methods (or first-order processes) it outputs, the ratio of those that are reliable meets some specified level, presumably greater than .50". The idea is that over time the use of second-order processes improves calibration by managing to discard poorly calibrated processes. However, there is a problem with this strategy of delegating the judgment of reliability for first-order processes to second-order processes (which is captured by the experimentalist notion of calibration—see Franklin 1997).1 And it is a problem that affects reliabilism as an externalist epistemology more generally. The problem is that it engenders a bootstrapping mechanism, as Jonathan Vogel (2000) has argued against Goldman; namely, the use of reliabilism itself to sanction its own legitimacy.
1 Woodward (1998, p. S171) refers to Franklin's analysis of calibration within the context of empirical
assessment of reliability: “One important example (…) is the strategy that Allan Franklin (1997) calls calibration. Here one assesses the error characteristics of a method by investigating its ability to detect or measure known phenomena and then assumes that the method will have similar error characteristics when used to investigate some novel phenomenon”.
Suppose someone believes whatever a spectrometer says about the spectrum of a certain chemical substance, without having justification for believing that the spectrometer is reliable.2 Suppose that the spectrometer happens to function very well. Then, an experimenter looks at it and forms the belief "In this case, the spectrometer reads 'X' for substance a, and X", where X is the proposition that substance a has an infrared spectrum. Since the experimenter's perceptual process of reading the spectrometer is presumably reliable (she does not suffer from hallucinations, etc.), given the assumption that the spectrometer is functioning alright, we can say that the experimenter is justified in believing that the spectrum of substance a is infrared. Therefore, the experimenter can deduce that "On this occasion, the spectrometer is reading accurately". Suppose she iterates this procedure many times, without ever checking whether the spectrometer is reliable because it is properly wired, etc. By induction, the experimenter infers that "The spectrometer is in general reliable", and hence goes on to use it in other cases to measure the spectrum of unknown substances. In this way, the experimenter would fall prey to bootstrapping circularity.
Goldman (2008, p. 17) replies to Vogel's bootstrapping objection by saying that "the problem is not unique to reliabilism, but it is shared by many epistemologies. (…) Thus, if a theory like reliabilism—or any form of externalism—makes easy knowledge possible, this is not a terrible thing. Skepticism is a very unwelcome alternative". While scepticism is certainly an unwelcome alternative, it is not, however, the only one. The counterexample shows, in my opinion, how, despite its attractiveness, reliabilism is not itself exempt from difficulties. One of these difficulties resides in the attempt to secure knowledge by detaching reliability from causal knowledge of the mechanism that generates true beliefs from data with high frequency. Woodward (1998, S176) again stresses this point: "The idea that one can often empirically establish that (4) a detection process is reliable without (5) deriving its reliability from some general theory of how that process works and/or why it is reliable is supported by a number of episodes in the history of science. (…) Galileo advanced a number of empirical arguments showing that his telescope was a reliable instrument in various astronomical applications even though he lacked a correct optical theory that could be used to explain how that instrument worked or why it was reliable".
I am not entirely convinced by this argument. I think that reliability cannot be entirely detached from causal knowledge of the mechanism that generates true beliefs with high frequency. Take Galileo's telescope and the controversy it sparked about the actual size of fixed stars. True, Galileo may not have had a full-blown or the right optical theory to explain how the telescope worked or why it was reliable, but he did have a causal explanation connecting the size of the stars (based on his Copernican belief) to the function of the lens of the telescope in making the objects appear more similar to the way they are in nature. His opponents endorsed the opposite causal explanation about the actual size of the stars (which they believed were bright spots on a celestial sphere) and the function of the lens, which—they thought—distorted the actual size of the stars.
Galileo famously resorted to some non-telescopic observations to avoid possible objections against the reliability of his telescope (see Frankel 1978).
2 The following example is patterned upon Vogel's (2000, pp. 612–615) reformulation of Michael Williams' 'gas-gauge case'.
Nonetheless, several people, from Christopher Clavius to Lodovico delle Colombe and Cesare Cremonini, objected to the use of the telescope and its reliability. One of the main objections was that the telescope was unreliable because it did not seem to magnify the stars, by contrast with other objects: the size of the stars observed with the naked eye at night and with the telescope was approximately the same. At stake in this debate was the issue of whether the halos of the stars visible with the naked eye should be taken into account in the estimate of the actual size of the stars: Aristotelians such as Horatio Grassi thought that they should, while Galileo thought that they should not because they were illusory. The debate was sparked when Grassi published (under the pseudonym of Lothario Sarsi) this objection to the reliability of the telescope in his 1619 Libra astronomica, which Galileo rejected in Il Saggiatore. The controversy that raged during Galileo's time around the use of the telescope and how to interpret the data testifies, in my view, to the central importance that causal explanation plays in assessing reliability claims. The final verdict went to Galileo because the scientific community eventually embraced Galileo's causal explanation of how the telescope worked and why it was reliable.
Another example could be cathode rays. Originally discovered by Julius Plücker in 1859, they became a standard tool for different generations of scientists who used them in conjunction with different causal explanations of the fluorescence observed. The current causal explanation is that cathode rays are streams of electrons projected from the cathode by electrical repulsion. But this was not the causal explanation endorsed by nineteenth-century physicists. Indeed, their different causal explanations of the mechanism led to very different data-to-phenomena inferences: from William Crookes's 'particles' (ranging from molecules to an unknown state of matter); to J.J. Thomson's 'corpuscles' as carriers of electricity; to Joseph Larmor's 'electrons' as permanent structural features of the ether. No wonder this historical episode has attracted much attention among historians and philosophers of science, who are still trying to resolve the historiographic dispute about who really discovered the electron (see Achinstein 2001; Arabatzis 2006). We see here another example of how reliability claims cannot be completely disentangled from discussions about the causal mechanism that generates data.
2.1.3 On whether the epistemology of reliabilism can support the metaphysics of ready-made phenomena
There is a third point I want to make. One may wonder to what extent reliabilism supports the metaphysics of ready-made phenomena. Or better, whether that metaphysics finds back-up in the epistemology of reliabilism. For instance, how do we know that neutral currents are 'out there' in nature? In Sect. 2.1.1 we discussed the problem of underdetermination of phenomena by data. In the Gargamelle bubble chamber, the greatest challenge was to identify whether the observed data were the result of weak neutral currents (which scientists were looking for) or rather the result of neutron background, which can produce exactly the same data. We concluded Sect. 2.1.1 by mentioning Woodward's probabilistic version of reliabilism as a way to support the metaphysics of ready-made phenomena. Can the probabilistic version of reliabilism support the metaphysics of ready-made phenomena?
I suspect that the answer to this question is negative for the following reason. Until we have specified the class of contexts within which processes of data production, analysis and reduction are to operate, it is hard to see how these processes can generate true beliefs about phenomena with high frequency. The problem with reliabilism—even in the probabilistic version—is that we need to specify in advance what our epistemic goals are and need to give an exact account for appraising reliability ('reliable with respect to what?'). Unless we somehow already know what the phenomena we are searching for should look like, how can we appraise whether data production and data reduction provide reliable evidence for them? In a way, this problem is a re-elaboration of what Harry Collins (1985/1992) has described as the experimenter's regress: in order to prove that an experimental process is reliable, we have to show that it identifies the phenomenon correctly. But in order to identify the phenomenon correctly, one has to rely on the experimental process whose reliability is precisely at stake. So reliability seems to fall back into a justificatory circle.
My claim is that unless we have a causal story, which normally comes from a scientific theory that helps us discriminate genuine phenomena from confounding factors and enters into the way experiments are conceived, designed, and thought out, it may be very hard (and sometimes practically impossible) to discern phenomena on purely experimental grounds. To paraphrase an old slogan about causes, 'No phenomena in, no phenomena out'. In sum, reliabilism presupposes the same metaphysics of ready-made phenomena that it is meant to back up.
One may respond at this point that the metaphysics of ready-made phenomena is ultimately supported by some sort of Kripkean essentialism, according to which there are natural kinds and they manifest themselves in a reliable way through a variety of data and experimental procedures, even if we either do not know or simply cannot practically identify the causal mechanism that goes from natural kinds to data production. This is not an option that Bogen and Woodward expressly discuss, as far as I am aware, but it is a fairly standard move for realists to make when pressed on metaphysical issues. But not even this move would solve the circularity between the metaphysics of ready-made phenomena and reliabilism. Indeed the circularity problem would crop up again, this time in the bottom-up inductive nature of the data-to-phenomena inference. We may indeed ask how many 'positive instances' provide increasing evidential support for data-claims which would qualify as reliable evidence for a certain phenomenon. How can we guarantee that the next few instances are not going to be negative and hence are not going to have a knock-down effect on the data-claim we are trying to build for reliably inferring a certain phenomenon? Of course, this old problem of induction—applied to bottom-up data-to-phenomena inferences—has direct implications for how we look at the history of science and scientific revolutions. When did (inductively supported) data-claims about combustion stop being considered as reliable evidence for phlogiston, and begin to be regarded as reliable evidence for oxygen? As my colleague Hasok Chang (2008) is currently reconstructing in his work on the Chemical Revolution, answering this question (and similar ones about caloric and ether) is far from trivial.
And it seems that unless we know already that phlogiston is not a natural kind, while oxygen is, reliability per se does not cut any ice in answering those questions. Thus, natural kinds can certainly back up the metaphysics
of ready-made phenomena, but not the epistemology of reliabilism: unless one knows already that something is a natural kind (e.g. that oxygen is, but phlogiston is not), one cannot legitimately claim to infer it from data in a reliable way, i.e. in a way that generates true beliefs with high frequency.
3 A Kantian stance on phenomena
I turn now to a Kantian stance on phenomena, and hence on data-to-phenomena inferences, building on some intuitions originally presented in Massimi (2007), and highlight some of the metaphysical and epistemological features that, to my eyes, make it worth pursuing.
I said a 'Kantian stance'. It is a 'stance' because, echoing van Fraassen (2002), there is an element of voluntarism in endorsing a Kantian perspective on phenomena. I do not aim to offer a scientific, quasi-scientific or metaphysical theory of phenomena-cognition, but only to describe the human epistemic conditions under which we gain knowledge of phenomena. In this sense, what I present below is only a 'stance' that is non-committal about any specific matters of fact about human cognition. So much for clarifying why I call it a 'stance'.
It is 'Kantian' because it goes back to Immanuel Kant's insight about phenomena as 'conceptualised appearances'. Kant drew an important distinction between what he called 'appearances' and 'phenomena'. An appearance, for Kant, is "the undetermined object of an empirical intuition" (Kant 1781/1787, A20/B34). Appearance refers then to an object as merely given in sensibility and conceptually still 'undetermined', not yet brought under the categories of the faculty of understanding. A phenomenon, on the other hand, is a conceptually determined appearance, namely an appearance that has been brought under the categories of the understanding: "appearances, to the extent that as objects they are thought in accordance with the unity of the categories, are called phenomena" (ibid., A249). We gain scientific knowledge of nature by subsuming appearances (i.e. spatiotemporal objects as given to our mind in empirical intuition) under a priori concepts of the understanding (via schemata). In Massimi (2008, and forthcoming) I claimed that the special role Kant assigned to Galilean–Newtonian physics should be understood precisely in the light of the new conception of phenomena Kant was putting forward. Phenomena are not ready-made in nature; instead, we have somehow to make them. And we make them by first ascribing certain spatiotemporal properties to appearances (for instance, for the phenomenon of free fall investigated by Galileo, the property of acquiring the same speed over different inclined planes with the same height), and then by subsuming them under a causal concept, such as Newton's gravitational attraction.
But to what extent can we extract Kant's conception of phenomena from its philosophical and historical context, and make it valuable for current discussions about data and phenomena? What would a Kantian stance on phenomena look like? And what good would it be? Here is my suggestion, which is only tentative and does not claim to be exhaustive or all-encompassing:
(A) The inference from data to phenomena can be understood as the relation Kant envisaged between appearances and phenomena as follows:
i. Scientific investigation starts with data, observable records of occurrences which can take the form of conditional relative frequencies (e.g. how many times the 'green' light of a particle detector goes off, given a chosen experimental setting; how many times the same degree of speed is recorded, given a plane with a certain inclination and height).
ii. Data are then plugged into salient experimental parameters (e.g. scattering cross-section in particle collisions; acceleration for free-falling objects).
iii. The salient experimental parameters are then organised in such a way as to make possible the production of graphs (which may or may not be computer-aided) showing how variation in one experimental parameter affects variation in another experimental parameter (e.g. how the scattering cross-section varies by varying the energy of collision;3 or how the space-to-time relationship varies given a certain acceleration). It is at this level that a new phenomenon can make its debut in the form of an unexpected graph solicited by data plugged into the experimental parameters.
iv. We then have to find a model with new (unobservable and more removed from data) parameters (e.g. the parameter R of hadron-to-muon production in scattering cross-sections of particles; or the gravitational acceleration g) that maximise the probability of the graph found. It is at this level that a causal concept is introduced to back up the new parameter in its job of maximising the probability of the found graph. So we say that the parameter R shows an unexpected value of 10/3 (matching the unexpected peak at 3.1 GeV in the graph) because of some anomalous hadron production caused by the presence of a fourth quark c with fractional charge 2/3 (a worked illustration of R is given at the end of this section). Or we say that free-falling objects obey Galileo's times-squared law because the gravitational acceleration g can be regarded as approximately constant on the surface of the Earth.
3 I am referring here to Burton Richter's 1974 data model that led to the discovery of an unexpected peak at 3.1 GeV in the computer-aided graph of the scattering cross-section of electron–positron collisions obtained by varying the energy of the collisions (for details of this case study in relation to my analysis of data models, see Massimi 2007).
The phenomena scientists investigate are often the end product of this series of intermediate steps, at quite a distance from the original data. Not only then can they be unobservable, as Bogen and Woodward have rightly pointed out; they may also require a significant amount of conceptual construction. By making phenomena the serendipitous result of what Kant called the faculty of sensibility and the faculty of understanding, or—as we may prefer to say today—the result of both input from nature (in the form of data) and human contribution (in the form of causal concepts), a Kantian stance can capture the data-to-phenomena inference in a novel way.
Indeed, I think that a Kantian stance on phenomena can satisfy Patrick Suppes' (1962) hierarchy of models and go in the direction of clarifying an important point about that hierarchy. Suppes (1962, p. 259) famously laid out a hierarchy of models, whereby the lowest layer is occupied by "ceteris paribus conditions", whose typical
problems are noise control with no formal statistics; the next layer up is "experimental design", dealing with problems such as randomization, assignment of subjects and any relevant information about the design of the experiment. Then we have the layer of "models of data" properly speaking, dealing with three main problems: (1) homogeneity, (2) stationarity, (3) fit of experimental parameters; followed by "models of experiment", dealing with problems such as number of trials and choice of experimental parameters "far removed from the actual data" (ibid., p. 255). Finally, on top of the hierarchy, we find "linear response models", dealing with problems such as goodness of fit to models of data and estimation of parameters that may well be unobservable (e.g. in Suppes' example from learning theory, the unobservable parameter which "is not part of the recorded data" is the learning parameter θ, which takes real numbers as values).
The first thing to note about this hierarchy is that noise control, data models and phenomena constitute distinct layers. In particular, noise control belongs to the lowest layer of ceteris paribus conditions, not to the data-models layer, still less to the level of linear response models, which I take to be identifiable with Bogen and Woodward's 'phenomena', since it involves the estimation of parameters that are not observable and are far removed from the recorded data. The crucial point about 'data models', 'models of experiment' and 'linear response models'—to use Suppes' terminology—is to spell out the link between the experimental parameters, as given by relative frequencies of occurrences (raw data), and the estimation of unobservable parameters (like θ in Suppes' example from learning theory, or R in the Richter example from high-energy physics discussed in Massimi 2007), which act at the interface between experiment and theory. This link is what a Kantian stance can take care of, in my view. The advantage of a Kantian stance is that it gives the bottom-up, empiricist approach championed by Suppes, Bogen and Woodward its due, while also acknowledging that phenomena are in part the product of the way scientists carve nature at its joints. In the next section, I take a closer look at a Kantian stance on phenomena by highlighting two main metaphysical and epistemological aspects and by comparing and contrasting it with McAllister's similar view.
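Before doing so, here is a worked illustration of the parameter R invoked in step iv above; this is the standard parton-model bookkeeping, supplied for concreteness rather than drawn from Massimi (2007). R is the ratio of the cross-section for electron–positron annihilation into hadrons to that for annihilation into muon pairs,
R = σ(e⁺e⁻ → hadrons) / σ(e⁺e⁻ → μ⁺μ⁻) ≈ 3 Σ_q Q_q²,
where the sum runs over the quark flavours that are energetically accessible, Q_q are their electric charges, and the factor 3 counts quark colours. With the three flavours u, d, s (charges 2/3, −1/3, −1/3) the expression gives 3(4/9 + 1/9 + 1/9) = 2; positing a fourth quark c of charge 2/3 raises it to 3(4/9 + 1/9 + 1/9 + 4/9) = 10/3, the 'unexpected' value mentioned above. In the same spirit, for free fall the causal concept of a locally constant gravitational acceleration g turns the recorded space-to-time data into Galileo's times-squared law s = ½gt².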
3.1 Metaphysical and epistemological aspects of a Kantian stance on phenomena
3.1.1 No ready-made phenomena
A Kantian stance does not commit to any metaphysics of ready-made phenomena. From a Kantian point of view, the operation of the scientific instruments and the ensuing production of data, together with what Kant called "principles of reason", play a pivotal role in the constitution of phenomena. In Massimi (2008, and forthcoming) I have investigated this distinctive aspect of a Kantian conception of phenomena in relation to Galileo's mathematization of nature. From a Kantian perspective, the goal of the inclined plane experiment was to extract from the appearance (motion of a bronze ball along an inclined plane) the property of uniform acceleration that Galileo had himself a priori inserted in the appearance for the sake of possible experience. Namely, for the sake of experiencing uniformly
accelerated motion, we must constitute the properties of free-falling bodies according to Galileo's kinematical reasoning. Galileo did not arrive at his law of free fall by simply curve-fitting data about balls rolling down inclined planes. There was instead an element of construction, a "principle of reason" that guided Galileo in his experiments with inclined planes and led him to the phenomenon of uniformly accelerated free-falling bodies.
Similar analyses could of course be carried out in reference to many other examples and case studies. The reason I am concentrating on Galileo is the direct link with Kant, of course, and also that the example provides a springboard to contrast a Kantian stance with James McAllister's alternative analysis of Galileo along the lines of his account of phenomena as patterns in data sets. Galileo indeed provides a nice foil to compare the mind-dependence inherent in a Kantian account with what I take to be the mind-dependence implicit in McAllister's account, to which I now turn.4 McAllister (1997) locates mind-dependence in the contingent, investigator-dependent noise control settings, which as a result engender different phenomena out of the same data set:
any given data set can be described as the sum of any one of infinitely many distinct patterns and a corresponding incidence of noise. (…) 'Pattern A + noise at m percent', 'Pattern B + noise at n percent' and so on. (…) As a data set lends itself equally readily to being described as containing any amount of noise, this value will have to be fixed by investigators (…) this option yields an investigator-relative notion of phenomena. (ibid., pp. 219, 223)
While my Kantian stance goes along with McAllister's in rejecting an ontology of phenomena as being 'out there' in nature causing patterns in data sets, at the same time it disagrees with McAllister about where to locate the mind-dependent aspect of phenomena. I would not locate it at the level of noise control, and I envisage a more robust role for it. Take McAllister's (2004, pp. 1166–1168) analysis of Galileo's experiments:
An example of a phenomenon, according to Galileo, is free fall. However, each instance of free fall is also partly determined by accidents (…) In the limiting case, if the influence of accidents could be reduced to zero, it would be possible to read off the properties of the phenomenon from an occurrence. (…) Galileo's polishing and smoothing of his experimental apparatus yielded the desired result. (…) Distinct performances of any concrete experiment with falling bodies that was technically feasible at Galileo's time would not have accorded on any clear-cut phenomenon of free-fall.
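McAllister's (1997) claim can be put in a toy formal way (the notation here is mine, not his): for any data set d_1, …, d_n and for any candidate pattern P whatsoever, one can always write d_i = P(x_i) + n_i simply by defining the residuals n_i = d_i − P(x_i); the 'noise level' is then some measure of the overall size of the residuals relative to the data. Since nothing in the data set itself dictates how large that measure is allowed to be, which pattern gets singled out as the phenomenon is settled only once the investigator fixes the admissible noise level. In the Galilean case just quoted, the pattern would be the times-squared relation and the residuals whatever is put down to friction, air resistance and other 'accidents'.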
4 James McAllister (private communication) would disagree with my use of the expression 'mind-dependence' to refer to his position, which is meant to capture a 'thoroughgoing empiricism', whereby the expression 'phenomena' does not appear any longer and 'all evidence about the structure of the world is constituted by patterns in data sets' (see McAllister's paper in this volume). As the discussion below will clarify, I use the expression 'mind-dependence' to refer to the specific 'investigator-dependent' choice of the noise level, which enters into McAllister's analysis of how some patterns in data sets are identified as phenomena.
I completely agree with McAllister’s comment that no actual amount of experience would back up Galileo’s phenomenon of free-fall. At the same time, I resist his conclusion that if Galileo had only been able to reduce the influence of accidents and noise to zero, he would have been able to read off the properties of the phenomenon of free fall from occurrences. No amount of polishing and smoothing the experimental apparatus would be tantamount to the phenomenon of free fall, unless we introduce some key assumptions or suppositions under which we construct certain kinematical properties, which is precisely what Galileo did, according to the Kantian line I am suggesting. Thus, I share McAllister’s ontological perspective in recognising that the world is a complex causal mechanism that produces data in which a variety of phenomena can be discerned, but I diverge from him when it comes to the identification of the mind-dependent feature of phenomena with noise control. I think instead that the latter is to be identified with the way in which kinematical data are organised and dynamical/causal concepts applied. Of course, this divergence is indicative of a more substantial divergence about the human contribution to the phenomena. While McAllister thinks that it is down to investigators to stipulate which patterns in data sets count as phenomena, I think there is more to this process than human ‘stipulation’. 3.1.2 Phenomena are not stipulated by investigations McAllister (1997, p. 217) defends the view that “each investigator may stipulate which patterns correspond to phenomena for him or her”, by differently setting the noise level, and hence by identifying different patterns in the same data set. He challenges Bogen and Woodward “to pick out independently of the content of scientific theories which patterns in data sets correspond to phenomena” (ibid., p. 222). Consider, for example, the phenomenon of planetary motions (ibid., p. 226): Asked in what the phenomenon of planetary orbits consists, Kepler would have replied ‘In the fact that, with such-and-such noise level, they are ellipses’, while Newton would have replied ‘In the fact that, with such-and-such noise level, they are particular curves that differ from ellipses, because of the gravitational pull of other bodies’. Thus, phenomena…vary from one investigator to another…. Which aspect of the occurrences is singled out for explanation varies from one investigator to another, notwithstanding the fixedness of physical occurrences themselves…there are no grounds for claiming that either Aristarchus, Kepler, Newton, Einstein, or anyone else has correctly identified the pattern to which the phenomenon of planetary orbits corresponds while the others have failed. I think that McAllister’s characterization of patterns in data sets plus noise is (1) neither a sufficient, (2) nor a necessary condition for phenomena. (1) It is not a sufficient condition because it is overpermissive: it multiplies phenomena without necessity, unless some suitable provisos are introduced. It is not the case that all regular patterns in data sets (obtained by simply varying levels of noise at the investigator’s discretion) qualify as phenomena. Take as an example William Prout’s chemical theory in 1815. On the basis of tables of
atomic weights available at the time, Prout thought that the atomic weight of every element was an integer multiple of that of hydrogen, and hence identified the hydrogen atom as the basic constituent of all chemical elements. Although Prout identified a genuine pattern in data sets (plus noise level/accidents) which was indeed vindicated in modern chemistry with the discovery of isotopes, we would not qualify it as a phenomenon. Or take as another example eighteenth-century chemistry. In post-Lavoisier chemistry, under the influence of Lavoisier's idea of oxygen as the principle of acidity, compound substances were normally listed in a series according to the amount of oxygen they contained, and regular patterns—namely, series of binary combinations of oxygen with simple substances up to four degrees of "oxygenation"—were identified and became popular in chemistry textbooks of the time. There was however a problem with muriatic acid. Lavoisier assumed that it was an oxide of an unknown radical which he called the 'muriatic radical', and that it could be further oxidated, to form oxygenated muriatic acid (later, chlorine). This whole muriatic series was built on an assumed analogy with the most common acids, namely vitriolic and nitric. But the analogy was not borne out, and the series listed in chemistry textbooks of the early 19th century was revised after Davy determined that chlorine was an element, that muriatic acid was hydrochloric acid (HCl), which contains no oxygen, and hence that the hypothesised muriatic radical did not exist at all.5
5 I thank Hasok Chang and Georgette Taylor, to whom I owe this example.
Again, not every identifiable pattern in a data set qualifies as a genuine phenomenon.

(2) McAllister's characterization is not a necessary condition for phenomena either, because it oversimplifies the notion of phenomena by reducing it to a linear structure of the form F(x) = a sin(ωx) + b cos(ωx) + R(x) (where R(x) is the noise term—see McAllister 1997, p. 219). A physical phenomenon is a more complex entity than this linear structure. And indeed most phenomena do not obey this structure. Phenomena crucially involve parameters and hence concepts that shape data sets in one way rather than another. And typically different parameters and concepts used to carve data engender different phenomena in a way that vindicates the complexity of scientific revolutions. McAllister's analysis of how Aristarchus, Kepler, Newton, and Einstein engendered different phenomena by setting the noise levels differently does not seem to capture the genuine conceptual revolution that took place in the passage from Aristotelianism to Copernicanism. Similarly, his analysis of Galileo's experiment with the inclined plane and the phenomenon of free fall does not seem to vindicate the genuine conceptual revolution that Galileo brought to mechanics, compared to the Aristotelians. In this respect, I think that a Kantian stance can provide a better philosophical standpoint for the history of science: in the end, we do want to defend the view that the development of physics from Galileo to Newton was a scientific revolution that improved our understanding of nature, compared to Aristotelian physics. And one possible way to go about making this claim is to look at the immediately preceding history (say, Medieval impetus theory), its transformation with Galileo's concept
of ‘impeto’ (which was at quite a distance from Buridan and Oresme and yet still somehow related to the Archimedean science of weights), and its paving the way to Newton’s new concept of ‘gravitational attraction’ (for details, see Massimi forthcoming). The specific Kantian story of how we construct phenomena (by ascribing spatio-temporal properties to appearances that we then have to prove via experiments, and subsume under suitable causal concepts) does justice to the idea of scientific progress. After all, justifying the progress achieved by Galilean–Newtonian physics was precisely what Kant’s transcendental philosophy aimed at.
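Before moving to the conclusion, the role that the stipulated noise level plays in McAllister's proposal (Sect. 3.1.2) can be made concrete with a small numerical sketch. Everything in it is invented for illustration (the sinusoidal pattern, the noise amplitude, and the two thresholds are mine, not McAllister's or Massimi's): two investigators fit the same linear structure F(x) = a sin(ωx) + b cos(ωx) + R(x) to the same data set, but because they stipulate different noise levels they reach different verdicts about whether a phenomenon has been identified.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
data = 2.0 * np.sin(1.3 * x) + rng.normal(scale=0.8, size=x.size)   # 'occurrences': pattern plus noise

# Least-squares fit of the candidate pattern a*sin(w*x) + b*cos(w*x) for a fixed trial frequency w.
w = 1.3
design = np.column_stack([np.sin(w * x), np.cos(w * x)])
coeffs, *_ = np.linalg.lstsq(design, data, rcond=None)
residual_rms = float(np.sqrt(np.mean((data - design @ coeffs) ** 2)))

# Two investigators stipulate different noise levels; the same data set then does or
# does not exhibit the pattern as a phenomenon, depending on that stipulation.
for noise_level in (1.0, 0.5):   # arbitrary, investigator-chosen thresholds
    if residual_rms <= noise_level:
        print(f"noise level {noise_level}: residual {residual_rms:.2f} -> pattern counts as a phenomenon")
    else:
        print(f"noise level {noise_level}: residual {residual_rms:.2f} -> residual exceeds the stipulated noise")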
4 Conclusion

As I hope to have clarified, a Kantian stance on phenomena can potentially offer a genuinely new perspective on the issue of how we infer phenomena from data, by distancing itself both from a metaphysics of ready-made phenomena and from conventionalist readings. Needless to say, spelling out the details of a Kantian stance and how exactly it translates into modern science (leaving aside most, if not all, of the baggage of Kant's transcendental philosophy) is a very challenging enterprise, and much more work needs to be done. But my goal in this paper was to show that this is an enterprise worth pursuing for all the metaphysical, epistemological, and also historical reasons briefly canvassed.

Acknowledgements I am very grateful to the audience of the Heidelberg conference "Data, phenomena, and theories", and in particular to Jim Bogen, James Woodward, James McAllister, Peter Machamer, and Sandra Mitchell for very stimulating comments and reactions on an earlier draft of this paper.
References
Achinstein, P. (2001). Who really discovered the electron? In J. Z. Buchwald & A. Warwick (Eds.), Histories of the electron (pp. 403–424). Cambridge, MA: MIT Press.
Arabatzis, T. (2006). Representing electrons: A biographical approach to theoretical entities. Chicago: University of Chicago Press.
Bogen, J., & Woodward, J. (1988). Saving the phenomena. Philosophical Review, 97, 303–352.
Chang, H. (2008, August). The persistence of epistemic objects through scientific change. Paper presented at the conference What (good) is historical epistemology? MPIWG, Berlin 24–26 July.
Collins, H. (1981). Son of seven sexes: The social destruction of a physical phenomenon. Social Studies of Science, 11, 33–62.
Collins, H. (1985/1992). Changing order (2nd ed.). Chicago: University of Chicago Press.
Dretske, F. (1981). Knowledge and the flow of information. Cambridge, MA: MIT Press.
Frankel, H. R. (1978). The importance of Galileo's non-telescopic observations concerning the size of the fixed stars. Isis, 69, 77–82.
Franklin, A. (1986). The neglect of experiment. Cambridge: Cambridge University Press.
Franklin, A. (1997). Calibration. Perspectives on Science, 5, 31–80.
Galison, P. (1985). Bubble chambers and the experimental work place. In P. Achinstein & O. Hannoway (Eds.), Observation, experiment, and hypothesis in modern physical science. Cambridge: MIT Press.
Goldman, A. (1986). Epistemology and cognition. Cambridge, MA: Harvard University Press.
Goldman, A. (2008, 21 April). Reliabilism. Stanford Encyclopedia of Philosophy. Retrieved August 15, 2008, from http://plato.stanford.edu/entries/reliabilism/.
Hacking, I. (1983). Representing and intervening. Cambridge: Cambridge University Press.
Kant, I. (1781/1787). Kritik der reinen Vernunft. Riga: Johann Hartknoch. English translation: Guyer, P., & Wood, A. W. (1997). Critique of pure reason. Cambridge: Cambridge University Press.
Massimi, M. (2004). Non-defensible middle ground for experimental realism: Why we are justified to believe in colored quarks. Philosophy of Science, 71, 36–60.
Massimi, M. (2007). Saving unobservable phenomena. British Journal for the Philosophy of Science, 58, 235–262.
Massimi, M. (2008). Why there are no ready-made phenomena: What philosophers of science should learn from Kant. In M. Massimi (Ed.), Kant and philosophy of science today, Royal Institute of Philosophy Supplement 63 (pp. 1–35). Cambridge: Cambridge University Press.
Massimi, M. (forthcoming). Galileo's mathematization of nature at the cross-road between the empiricist tradition and the Kantian one. Perspectives on Science.
Mayo, D. (1996). Error and the growth of experimental knowledge. Chicago: University of Chicago Press.
McAllister, J. (1997). Phenomena and patterns in data sets. Erkenntnis, 47, 217–228.
McAllister, J. (2004). Thought experiments and the belief in phenomena. Philosophy of Science, 71, 1164–1175.
McAllister, J. (2007). Model selection and the multiplicity of patterns in empirical data. Philosophy of Science, 74, 884–894.
Pickering, A. (1984). Constructing quarks. Chicago: University of Chicago Press.
Suppes, P. (1962). Models of data. In E. Nagel, P. Suppes, & A. Tarski (Eds.), Logic, methodology and philosophy of science. Stanford: Stanford University Press.
van Fraassen, B. (2002). The empirical stance. New Haven: Yale University Press.
Vogel, J. (2000). Reliabilism levelled. Journal of Philosophy, 97, 602–623.
Woodward, J. (1989). Data and phenomena. Synthese, 79, 393–472.
Woodward, J. (2000). Data, phenomena, and reliability. Philosophy of Science, 67, S163–S179.
Synthese (2011) 182:117–129 DOI 10.1007/s11229-009-9612-y
From data to phenomena and back again: computer-simulated signatures Eran Tal
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 28 July 2009 © Springer Science+Business Media B.V. 2009
Abstract This paper draws attention to an increasingly common method of using computer simulations to establish evidential standards in physics. By simulating an actual detection procedure on a computer, physicists produce patterns of data ('signatures') that are expected to be observed if a sought-after phenomenon is present. Claims to detect the phenomenon are evaluated by comparing such simulated signatures with actual data. Here I provide a justification for this practice by showing how computer simulations establish the reliability of detection procedures. I argue that this use of computer simulation undermines two fundamental tenets of the Bogen–Woodward account of evidential reasoning. Contrary to Bogen and Woodward's view, computer-simulated signatures rely on 'downward' inferences from phenomena to data. Furthermore, these simulations establish the reliability of experimental setups without physically interacting with the apparatus. I illustrate my claims with a study of the recent detection of the superfluid-to-Mott-insulator phase transition in ultracold atomic gases.

Keywords Models · Simulations · Experiments · Computers · Methodology · Physics
1 Introduction

The use of computer simulations has had profound effects on several methodological aspects of physics. This has been widely recognized by philosophers of science, who tend to focus on the role of computer simulations in model-building and prediction
E. Tal (B) Department of Philosophy, University of Toronto, 170 St. George Street (4th Floor), Toronto, ON M5R 2M8, Canada e-mail:
[email protected]
as well as on their influence on the traditional demarcation between theorizing and experimentation (Hartmann 1996; Galison 1997; Hughes 1999; Morgan 2002; Keller 2003; Winsberg 2003; Humphreys 2004). By contrast, fairly little has been said about the considerable impact that computer simulations have had on methods of evaluating scientific evidence.1 In what follows I will illustrate and analyze an increasingly common practice in experimental physics, namely the practice of justifying claims to detect new phenomena by running computer simulations of the detection process. In some areas of physics, e.g. high-energy, computer-simulated versions of detectors have become requisite tools in interpreting experimental data. Here I attempt to make epistemological sense of this practice and to show that it calls for a significant revision to existing accounts of evidential reasoning in science. My criticism will be directed in particular against Bogen and Woodward’s (1988, 2003; Woodward 2000) account of reasoning from data to phenomena. Physicists engaged in detecting new phenomena often initially lack an exact idea of what kind of data would count as good evidence. Producing standards against which to evaluate data is a necessary requirement for detection in these cases. Such evidential standards are often local, i.e. tailored specifically to the design of the experimental or observational procedure. By running simulations of the detection process, physicists can produce patterns of data that are expected to be observed if the phenomenon is present. Such patterns are called ‘signatures’, and are usually presented in visual form. Claims to detect new phenomena are evaluated by making qualitative comparisons of simulated signatures with actual data. That detection claims may be warranted on the basis of computer simulations becomes less surprising once it is acknowledged that computer-simulated signatures function in ways that go well beyond the roles traditionally ascribed to theoretical predictions. As the following will show, signatures serve not merely to contrast empirical data with theoretical consequences, but also to validate empirical detection methods. The latter function is due to the fact that simulated signatures are produced by representing the nitty-gritty of experimental design. The simulation often represents procedures for producing and probing a sample, physical constraints on the apparatus, background effects and various sources of error. These features make computer simulations powerful tools for testing the reliability of experimental procedures and consequently for validating claims to the detection of new phenomena. This use of computer simulations will be shown to flatly contradict two of the main tenets of the Bogen–Woodward view. These are (i) the claim that the reliability of experimental data-production is established by ‘empirical investigations’ of apparatuses and (ii) the claim that phenomena are inferred from data in a ‘bottom-up’ manner. The computer simulations discussed here violate the first claim because they establish the reliability of detection procedures without physically probing detectors; they violate the second claim because they infer patterns of data from a model of the phenomenon. Nevertheless, these considerations do not provide compelling reasons to return to older views on confirmation that require systematic explanations of data from theory. 
1 Notable exceptions are Hartmann (1996) and Galison (1997), who discuss the use of computer simulations in estimating background effects and noise levels in experiments.
Nor is a complete rejection of the Bogen–Woodward view entailed by the
current criticism. Instead, a considerable revision of the latter view is called for that would incorporate the methodological impacts of recent computational developments in science. To illustrate my claims, I will present a case study from the field of ultracold physics, a fast growing sub-discipline that deals with matter at temperatures near absolute zero. The case study revolves around the attempts to detect a certain phase transition, called the superfluid-to-Mott-insulator phase transition, in ultracold atomic gases. As will be shown below, computer-simulated signatures played a crucial role in establishing the reliability of relevant detection methods and hence contributed significantly to the evaluation of experimental evidence for the presence and features of the aforementioned phase transition.
2 Signatures of the superfluid-to-Mott-insulator phase transition

2.1 Quantum phase transitions: a brief introduction

Since the 1980s, advancements in laser cooling and magnetic trapping techniques have enabled physicists to cool down atomic gases to nearly absolute zero temperature. This has opened up new possibilities for detecting quantum effects, i.e. effects that are predicted exclusively by quantum mechanics. One category of such effects is quantum phase transitions. Classical phase transitions, such as boiling and freezing, are driven by microscopic thermal fluctuations that can only occur at nonzero temperatures. By contrast, quantum-mechanical systems do not reach a state of complete rest even at absolute zero. Heisenberg's uncertainty principle entails the existence of microscopic fluctuations even when thermal fluctuations freeze out. These fluctuations can, in the right circumstances, drive an abrupt change in the properties of a many-body system around a critical value of a certain parameter. A dramatic example of this kind of phase transition is the transition between superfluid and Mott-insulator states. Superfluidity is a phase of matter that displays unusual properties such as zero viscosity and frictionless flow. The mechanism behind many instances of superfluidity is known as Bose–Einstein condensation (BEC).2 Since 1995, laboratory physicists have been able to create Bose–Einstein condensates by cooling gases of alkali atoms down to the nanokelvin range. Once condensation is reached, the atoms in the gas become delocalized over the entire volume of the cloud. The many-body system then behaves like a single coherent wave, e.g. it can interact with itself to produce interference patterns. The phenomenon of Mott insulation was initially discovered when certain metal oxides were shown to be insulators despite the predictions of band theory. In 1937, Nevill Mott was one of the first physicists to offer an explanation of this phenomenon by taking into account repulsive interactions between electrons.
2 BEC theory is currently accepted as the explanation for superfluidity in bosons. It also explains certain instances of fermionic superfluidity, namely those in which fermion pairs form diatomic molecules. Other instances of fermionic superfluidity, including "conventional" superconductivity, are currently explained by BCS (Bardeen, Cooper, Schrieffer) theory.
The term
‘Mott insulation’ has since then been generalized to include any system of particles occupying a potential lattice in which motion is inhibited due to repulsive interactions. In a Mott insulator, particles remain localized to lattice sites while all phase-coherent, ‘wavelike’ characteristics are suppressed. The theoretical treatments of superfluidity and Mott insulation eventually gave rise to a generalized model, called the Bose–Hubbard Model (BHM), which tied the two phenomena together. This model predicts that if a coherent atom cloud is subjected to a potential lattice of varying depth, the cloud will undergo a transition between the superfluid and Mott insulator states around a critical depth. To speak somewhat loosely, the system will switch between ‘wavelike’ and ‘particle-like’ behaviours depending on the strength of repulsive interactions between particles. This transition is driven by quantum-mechanical fluctuations that arise from the trade-off between particle-number-uncertainty and phase-uncertainty in the many-body system.
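For concreteness, the Bose–Hubbard Hamiltonian can be written out explicitly. The following is the standard textbook form, not quoted from Clark and Jaksch (2006) or from the experimental papers, and the trap-offset term with ε_i is an optional addition of mine:

\hat{H}_{\mathrm{BH}} = -J \sum_{\langle i,j\rangle} \bigl(\hat{b}_i^{\dagger}\hat{b}_j + \hat{b}_j^{\dagger}\hat{b}_i\bigr) + \frac{U}{2}\sum_i \hat{n}_i(\hat{n}_i - 1) + \sum_i \epsilon_i \hat{n}_i ,

where J is the tunnelling amplitude between neighbouring lattice sites, U is the on-site repulsion energy (the same U whose multiples are expected to show up as excitation peaks in Sect. 2.2), \hat{b}_i^{\dagger} and \hat{b}_i create and annihilate a boson at site i, \hat{n}_i = \hat{b}_i^{\dagger}\hat{b}_i counts the atoms at that site, and \epsilon_i describes the confining potential. Deepening the lattice suppresses J, so the control parameter of the transition is the ratio U/J, the quantity that labels the axes in Figs. 1 and 2 below.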
2.2 The quest for evidential standards

Like many efforts to detect unobservable phenomena that are predicted by theory, the experimental realization of the superfluid-to-Mott-insulator phase transition required a 'smoking gun', i.e. an observable effect that is unique enough to eliminate doubts concerning alternative causes of the data other than the desired phenomenon. One of the most promising predictions of the Bose–Hubbard Model in this respect concerns the excitation spectra of Mott insulators and superfluids. A Mott insulator is expected to absorb energy in quanta of U, the energy of repulsive interactions between atoms in the same lattice site. This is because any amount of energy smaller than U would not be sufficient for an atom to move into a neighbouring site that is already occupied by another atom. Consequently, the excitation spectrum of a Mott insulator is expected to have peaks in intervals of U. By contrast, in the superfluid state atoms are free to tunnel between sites and may absorb energy in fractions of U. The excitation spectrum of a superfluid is therefore expected to be gapless. The first experimental detection of the superfluid-to-Mott-insulator phase transition in an atomic gas came late in 2001 (Greiner et al. 2002). This experiment, performed by a laboratory in Munich, was deemed persuasive largely because the measured excitation spectra were compatible to a high degree with the predictions of the Bose–Hubbard Model. Nevertheless, two years later a more in-depth study of the phase transition by an experimental group in Zurich produced unexpected results that differed significantly from the earlier ones (Stöferle et al. 2004). The group in Zurich excited the atoms by modulating the amplitude of the lattice, that is, by adding a recurrent 'kick' to the intensity of the lasers that create the lattice. Different frequencies of this 'kick' correspond to different energies. The experimenters measured how much energy the atoms absorbed at various modulation frequencies across varying lattice depths. Though not obviously incompatible with the predictions of the Bose–Hubbard model, the results of the Zurich experiment were not readily explained by the model. The spectra that supposedly belonged to Mott insulators displayed peaks, but the peaks were not evenly spaced and appeared at lower energies than multiples of U. In addition, the spectra corresponding to superfluids were not completely smooth. Instead,
Fig. 1 Excitation spectra of an ultracold gas occupying a one-dimensional potential lattice at various potential depths as measured by the Zurich group. Lower potential depths (low U/J) correspond to the superfluid side of the transition, while higher potential depths correspond to the Mott-insulator side. (Reprinted with permission from Stöferle et al. 2004, p. 3. Copyright (2004) by the American Physical Society) (colour online)
they exhibited peaks at low energies joined with an unexpected downward slope at high energies (Fig. 1). The evidential significance of the Zurich results could not be estimated simply by comparing the results to the predictions of the Bose–Hubbard model. Taken on its own, this model provides general predictions concerning the location of gaps and peaks in the spectrum. These predictions may be compatible with many different data patterns, depending on how the data is interpreted. Nothing as specific as the pattern of excitations measured in the Zurich experiment can be derived from the model without filling in specific experimental parameters. Such parameters include the number of lattice sites, the number of atoms, the shape of the confining potential, the method of excitation, the evolution of the system after leaving the lattice and the statistics involved in determining the amount of energy absorbed by the gas. Before these experimental parameters are filled into the Bose–Hubbard model, the predictions of the model are general enough to be compatible with the results of both experiments—the one in Zurich and the earlier one reported by the group in Munich. Yet the results of these experiments were significantly different, e.g. they produced starkly disparate measurements of the high-energy end of the superfluid spectrum. Unfortunately, deriving model predictions for the specific parameters of each experiment is an analytically intractable task. This situation leads to a kind of ‘gray zone’ with respect to standards of empirical adequacy, a problem that has become increasingly common in physics as detectors turn progressively more complex. As the next section will illustrate, computer simulations prove to be efficacious means of filling this vacuum in evidential standards.
2.3 Simulating the detection process

The challenge of interpreting the results of the Zurich experiment was taken up by a theory group at Oxford (Clark and Jaksch 2006). The Oxford group developed a computer simulation of an atomic gas that approximated the experimental conditions in the Zurich laboratory. The simulation was built around the assumption that the time evolution of the system is governed by the Bose–Hubbard Hamiltonian. Computing the evolution of the system on the basis of the Bose–Hubbard Hamiltonian takes exponentially longer as more atoms are involved. To overcome this difficulty, the theorists employed an approximation called the Time Evolving Block Decimation algorithm (TEBD). This algorithm simplifies the original Bose–Hubbard model by only taking into account interactions between atoms in neighbouring lattice sites.3 In accordance with the Zurich experiment, the group at Oxford simulated a one-dimensional lattice at varying depths that undergoes a periodic 'kick'. The theorists performed several runs, each time filling in different values for the number of sites, the number of atoms and the shape of the confining potential. While the simulation represented the intervention method of the actual experiment, the theorists also explored initial and background conditions that were quite different from the configuration implemented by the experimenters. As shown in Fig. 2, these different 'virtual configurations' resulted in qualitatively distinct excitation spectra. Figure 2a shows the pattern of excitations that resulted from simulating the system with experimental conditions closest to those of the actual experiment. The figure shows two strong peaks, the first one becoming wider as the lattice becomes shallower. A smaller peak can be seen which gains strength on the superfluid side and shifts to higher energies on the Mott insulator side. The theorists identified these patterns of shifting and widening as "signatures of the superfluid to Mott-insulator transition", which is also the title they gave their paper. The theorists found their simulation results in configuration (a) to be "very reminiscent" of those of the Zurich experiment (ibid., p. 15). The pattern of a double widening peak and a smaller high-energy peak comes very close to what the experimental group achieved (cf. Fig. 1). The theorists noted that the results of the simulation still cannot be "directly compared" to the experiment, due to factors that were not represented in the simulation. Such factors have to do with the procedure that the experimenters employed to measure the energy absorbed by the gas after it was excited. Nevertheless, the similarity in the results was sufficient for the group at Oxford to conclude that:
3 The TEBD algorithm provides a good approximation for the time evolution of a many-body system
as long as the degree of quantum entanglement in the system remains low (Vidal 2003). The algorithm takes into consideration only a small (low-dimensional) subset of the Hilbert space that is required to fully represent the system. This simplification is achieved by decomposing the product state of the many-body system and rewriting it with new coefficients (called Schmidt coefficients). In each iteration, the algorithm selects only the terms with the largest coefficients and drops the rest. The degree of approximation achieved in this way can be adjusted by varying the number of terms selected.
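The truncation step that footnote 3 describes can be illustrated with a minimal sketch in Python/NumPy. This is a generic Schmidt (singular-value) truncation of a two-site state, not the Oxford group's TEBD code; the local dimension and the bond dimension chi are invented for illustration.

import numpy as np

def truncate_state(psi, dim_left, dim_right, chi):
    """Keep only the chi largest Schmidt coefficients of a bipartite state."""
    # Schmidt decomposition = SVD of the state reshaped into a (left x right) matrix.
    u, s, vh = np.linalg.svd(psi.reshape(dim_left, dim_right), full_matrices=False)
    keep = min(chi, s.size)
    s_trunc = s[:keep] / np.linalg.norm(s[:keep])      # renormalise after dropping terms
    approx = (u[:, :keep] * s_trunc) @ vh[:keep, :]
    return approx.reshape(psi.shape), s

# Example: a random two-site state with local dimension 4, truncated to chi = 2.
rng = np.random.default_rng(1)
psi = rng.normal(size=16) + 1j * rng.normal(size=16)
psi /= np.linalg.norm(psi)
approx, schmidt = truncate_state(psi, 4, 4, chi=2)
print("Schmidt coefficients:", np.round(schmidt, 3))
print("overlap with truncated state:", abs(np.vdot(psi, approx)))

The approximation is good precisely when only a few Schmidt coefficients are appreciable, i.e. when entanglement is low, which is the condition stated in footnote 3.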
Fig. 2 Simulated excitation spectra for a Bose gas across varying lattice depths. These results were achieved by applying the Bose–Hubbard Hamiltonian to a system approximating the experimental conditions of the 2004 Zurich experiment. Lower potential depths (low U/J) correspond to the superfluid side of the transition, while higher potential depths correspond to the Mott-insulator side. a is the result of running the simulation with initial and background conditions closest to the configuration that was actually implemented in the experiment. b and c were produced by assuming a different shape of the confining potential than the one implemented experimentally. d was produced by assuming a lower filling (atom-to-site) ratio than the one implemented experimentally. (Reprinted with permission from Clark and Jaksch 2006) (colour online)
our results point to the fact that the BHM [Bose–Hubbard model] is sufficient to explain all the features discovered in the [Zurich] experiment and that the experiment was a clean realization of this model as expected. (Clark and Jaksch 2006, p. 17)

According to the authors, the simulation approximated the results of the experiment well enough to establish that the Bose–Hubbard model explains the empirical results. Yet this is not where their conclusions end. The success of the simulation led the theorists to another remarkable conclusion, namely that the experiment was "a clean realization" of the model. In other words, based on their simulation results, the theorists made a retroactive detection claim about what happened in the Zurich laboratory. The results of the simulation were taken by the theorists as evidence for the presence of the superfluid-to-Mott-insulator phase transition in the 2004 experiment.

3 Computer simulations as tests for experimental reliability

Are the theorists at Oxford warranted in making a retroactive detection claim based on their computer simulation? For the sake of the current discussion it can be assumed
that the simulation was internally valid, i.e. that the algorithm implemented by the theorists provided good approximate solutions to the Bose–Hubbard Hamiltonian. Given this assumption, the simulation clearly provided good reasons to believe that the Bose–Hubbard model is empirically adequate. That is, the simulation showed that if the Zurich experiment was a reliable one, then the Bose–Hubbard model can explain its results. This conclusion alone does not yet justify a detection claim. Prima facie, it seems plausible that the Zurich experiment could have been poorly controlled, i.e. that the experimental data was caused by some confounding factor. A detection claim is only warranted once it is shown that the detection procedure was reliable, namely that it was unlikely for the experimental data to turn out as it did unless the desired phenomenon were present. On a closer look, however, this is precisely what the Oxford simulation showed. That is, the simulation tested the reliability of the experimental setup in the Zurich laboratory and found it to be reliable. Consequently, the theorists were warranted in making a detection claim based on their computer simulation. Recall that the simulation approximated many of the parameters that are unique to the specific experimental design that was implemented in Zurich. The simulation represented the size of the cloud, the shape and size of the lattice, the method of intervention used to excite the gas, and even background conditions such as the shape of the confining potential. More importantly, the simulation studied the dependence of the resulting excitation spectra on variations in some of these parameters. It is worth examining how this kind of simulated representation of the experimental design contributes to an evaluation of empirical evidence. Here it is useful to refer to Woodward's (2000) account of evidential reliability. Woodward views reliability as a counterfactual relation between possible phenomena claims (P) and possible empirical data outcomes (D). According to Woodward,

we cannot tell whether D1 is evidence for P1 just by considering the relationship between D1 and P1 and whether D1 occurs. Whether D1 is evidence for P1 also depends on whether alternative data to D1 would be produced if P2 were to hold instead, even if in fact D1 are all the data that are ever produced. (Woodward 2000, p. 174)

Establishing evidential relations between data and phenomena requires showing that different data would have been produced in the presence of alternative phenomena. Indeed, the method of simulating experimental signatures establishes exactly this type of counterfactual relation. By probing the effects of varying initial and background conditions on their results, computer simulations explore the counterfactual sensitivity of a specific experimental design to phenomenal differences. The Oxford simulation is a stark example of evaluating experimental work in this way. The simulation yielded a host of data patterns that represent the data that would have been produced by the experiment under different configurations of atom number, lattice size and confining potential. The simulation then showed that the best qualitative match between simulated signatures and experimental data patterns is obtained under similar phenomenal conditions to those that the experimenters had claimed to realize.
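The counterfactual check described here can be caricatured in a few lines of code. The sketch below is deliberately toy-like: the simulate function, the configuration parameters and the correlation-based match score are invented placeholders standing in for the TEBD simulation and the qualitative comparison of spectra, not the Oxford group's actual procedure.

import numpy as np

def simulate(phenomenon_present, config, rng):
    """Toy stand-in for a detector simulation: returns a synthetic 'spectrum'."""
    x = np.linspace(0.0, 1.0, 50)
    base = np.exp(-((x - 0.5) ** 2) / 0.01) if phenomenon_present else 0.2 * np.ones_like(x)
    return config["gain"] * base + rng.normal(scale=config["noise"], size=x.size)

def match(a, b):
    """Qualitative similarity score (here: correlation) between two data patterns."""
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(2)
config = {"gain": 1.0, "noise": 0.05}          # invented experimental parameters
observed = simulate(True, config, rng)          # pretend these are the laboratory data

# Woodward-style check: would relevantly different data have been produced
# if the phenomenon were absent (or the configuration different)?
signature_present = simulate(True, config, rng)
signature_absent = simulate(False, config, rng)
print("match with 'phenomenon present' signature:", round(match(observed, signature_present), 2))
print("match with 'phenomenon absent' signature:", round(match(observed, signature_absent), 2))

A high match under the 'phenomenon present' configuration, together with a poor match under the alternative, is what licenses treating the observed data as evidence in Woodward's counterfactual sense.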
The fact that the results of the experiment qualitatively match the simulated signature can be taken as evidence for the reliability of the experimental procedure in
just the same sense of ‘reliability’ that is employed by Woodward. The counterfactually unique match between signature and data makes it unlikely that the empirical results were the outcome of the ‘wrong’ phenomenon, i.e. a confounding factor. Computer-simulated signatures therefore rightly retain some of the original sense of the term ‘signature’—a unique, reliable indicator for the presence of a person who is not immediately present. The foregoing discussion is not meant to deny that in principle things can still go wrong. For example, it may turn out that the experimental design was not implemented correctly and that as a result, the assumptions that informed the simulation did not actually hold. The interesting question, however, is not whether or not inferences from simulations to empirical facts are certain (they are not). Instead, the crucial point is that simulations provide scientists with equally good, and sometimes better, reasons to believe in the reliability of their experimental apparatuses as do ‘traditional’ methods. The physical calibration of detectors, for example, depends on showing that relevant similarities obtain between a surrogate signal and some sought-after phenomenon (Franklin 1997). In some cases, establishing such similarity relations is an arduous task that involves overcoming considerable justified doubts. It is difficult to imagine a good argument for why establishing the adequacy of surrogate signals is categorically easier than establishing the external validity of computer simulations. Indeed, in the present case it is very likely that the experimental design was implemented just as the theorists supposed. The relevant techniques of cooling and trapping atoms have all become common stock in ultracold physics long before the Zurich experiment. In this case, then, there was no reason to seriously doubt that the experimenters attained the initial and background conditions they reported. Rather, the issue in question was whether the intervention and data analysis methods used in the experiment were reliable enough, and consequently whether the experimental data constituted evidence for the desired phase transition. The Oxford computer simulation can justifiably be seen as settling both of these questions with a high level of confidence. The last few points make clear why the theorists at Oxford are warranted in making a detection claim about the phase transition, even though the latter occurred two years earlier in a laboratory in Zurich. Claims to the detection of unobservable, theoretically-predicted phenomena are essentially twofold: they consist in claiming that (i) the detection procedure is reliable and that (ii) the results of the procedure fit relevant theoretical predictions. As shown above, the Oxford simulation supports both of these claims in relation to the Zurich experiment. Consequently, the theorists were justified in making a retroactive detection claim based on the results of their simulation. While the simulation-based inference consists of two mutually supportive strands of reasoning—one establishing the reliability of the experiment and the other establishing the empirical adequacy of the theoretical model—the inference is nevertheless not circular. This is because the theorists relied on independent evidence that supports the validity of the Bose–Hubbard model regardless of experiments in cold gases. 
This independent evidence comes from experimenting with superfluids and Mott insulators as well as from well-founded theoretical considerations that underwrite the
Bose–Hubbard model.4 Taking this independent evidence as a starting point, the simulation can be viewed as involving two conceptually distinct steps:
1. Testing the reliability, i.e. counterfactual sensitivity, of the experimental design, given independent evidence that the theoretical model is valid.
2. Testing the qualitative match between the prediction of the theoretical model and the actual data from the experiment, given (from step 1) that the experiment was reliable.
The form of this procedure is reminiscent of what Chang (2004, p. 224) calls 'epistemic iteration', or a self-enriching justification structure. While the reliability of the experiment is already established by passing test 1, it nevertheless gains additional support from passing the second test.5 Similarly, although the validity of the theoretical model is presupposed in test 1, it gains additional support by passing test 2. Understood in this way, the simulation-based inference avoids circularity while remaining ampliative.

4 Inferring data patterns from models of phenomena

Bogen and Woodward develop their account of evidential reasoning as an objection to what they call 'logical' accounts of confirmation (Bogen and Woodward 2003; Woodward 2000). These include hypothetico-deductive accounts, Hempel's positive-instance theory, Carnap's inductive logic, and a subset of Bayesian approaches. What is common to all of these views is that "the inference from evidence to hypothesis is simply the instantiation of the right sort of logical relationship between the evidence and the hypothesis" (Woodward 2000, p. 173). By contrast, Bogen and Woodward claim that experimental results are only rarely directly comparable to theoretical predictions. In most cases, scientists must work their way 'upwards' from empirical data to claims about phenomena. To be able to do so, experimenters must establish the reliability of their data production procedures by fiddling with their instruments and testing them for various idiosyncratic sources of error. Bogen and Woodward thus seem to be considering only two types of views on how the reliability of data production is ascertained. Either we look at what scientists do in practice, and realize that they learn about the reliability of their methods by empirically probing their instruments; or we revert to the highly unrealistic 'logical' view and speculate about what scientists could do in principle if they had "unlimited computational power" (ibid., p. 176). As the practice of generating and using computer-simulated signatures suggests, there are more than just two options.
4 The Bose–Hubbard model is based on an earlier model, called the Hubbard model, that has proven successful in describing electron interactions in a potential lattice long before ultracold experiments were available (Hubbard 1963).
5 Note that test 2 could still fail even if test 1 passes. That is, the experimental procedure may be shown to be counterfactually sensitive to the desired phenomenon while the actual experiment nevertheless yields a data pattern that does not match the results of the simulation. In such cases additional work is required to determine whether the discordance is resolved by correcting the experimental implementation, the simulation software or the theoretical model.
Advancements in computational science have
opened up new ways of establishing the reliability of data production methods. In many cases, simulating the detection process is not only feasible, but also cheaper and faster than physically probing the apparatus. The simulations discussed here differ from the methods described by Bogen and Woodward in at least two fundamental ways. First, they establish the reliability of detection processes by investigating representations of experimental apparatuses rather than the apparatuses themselves. This implies that the notion of ‘empirical investigation’ is currently construed too narrowly by Bogen and Woodward. Second, these simulations serve to form inferences that clearly run ‘downwards’ from a theoretical model of the phenomena to possible patterns of data, in the opposite direction to the inferences described by Bogen and Woodward. On the other hand, these simulations do not mark a return to the kind of ‘logical’ derivations of data from theory that Bogen and Woodward reject. First, unlike theoretical predictions, computer-simulated signatures provide information about the counterfactual sensitivity, and hence the causal efficacy, of particular experimental setups. Even if signatures are seen as no more than results of mathematical derivations from theory plus initial conditions, one still needs to explain why derivations with counterfactual initial conditions are relevant to confirmation. Given the narrow way in which Bogen and Woodward construe so-called ‘logical’ views of confirmation, it is unlikely that these views can explain the evidential significance of the counterfactual information conveyed by simulated signatures. Second, and more importantly, the computer simulations discussed here do not fit in the category of ‘logical’ views because they do not offer systematic explanations of experimental data. Bogen and Woodward characterize systematic explanations as those that (i) provide a detailed pattern of dependency of explanandum on explanans and that (ii) unify and connect “a range of different kinds of explananda” (Bogen and Woodward 1988, p. 325). Computer-simulated signatures fit neither of these criteria. First, they provide very little insight into detailed dependencies. While the theorists at Oxford did attempt to provide guesses as to why different initial and background conditions resulted in certain data patterns rather than others, the complexity of the algorithm makes detailed dependencies between input and output impossible for humans to track. Second, the computer simulations discussed here are tailored to a particular intervention method and experimental design. There is simply no way to generalize the results of the Oxford simulation beyond the scope of the Zurich experiment. This last point does not undermine the generality of the method of evaluating empirical evidence by means of computer simulations. Computer-simulated signatures can be generated for a vast host of putative phenomena, as is illustrated by their growing use across many sub-disciplines in the physical sciences. In high-energy physics, for example, signatures are used as a matter of course to test the reliability of new detector designs even before the actual detector is built.6 While this method is widely applicable to many areas of science, its results are in each case still irrevocably bound to the context of a specific detection scheme. Given the above analysis, I suggest that the categories into which Bogen and Woodward currently classify theories of evidence are too narrow. 
6 An account of the role played by computer simulations in the design and construction of detectors in high-energy physics can be found in Merz (2006).
In particular, their views
on the justification of detection claims should be updated to include 'downward' inferences based on computational techniques. Nevertheless, this conclusion does not diminish the importance of Bogen and Woodward's emphasis on data production and reliability as central to the evaluation of empirical evidence. Similarly, the notion of 'empirical investigation' of apparatuses does not completely lose its relevance. Instead, this notion may be generalized so that simulation would be seen as 'virtual fiddling' with a representation of the laboratory apparatus. Ultimately, the generalized notion of empirical investigation would have to be grounded in a comprehensive epistemological analysis of computer simulations. Such an analysis would specify the conditions under which simulations are able to augment or replace traditional empirical methods such as calibration. For recent work tending in this direction see Parker (2009) and Winsberg (2009).

5 Conclusion

Computer simulations provide physicists with new ways of evaluating the evidential import of empirical data. By combining theoretical considerations with the details of a concrete experiment, simulations become powerful tools for producing new standards of evidence. Far from being theoretical predictions in the traditional sense, computer-simulated signatures function as tests for the reliability of the detection process. As I have argued in this paper, this use of computer simulations does not square neatly with Bogen and Woodward's views on scientific evidence. Nor do such simulation-related practices fit the views that Bogen and Woodward purport to criticize. What is required, then, is a revision of existing theories of scientific evidence that would make sense of the increasing contribution of computational methods to evidential reasoning. In particular, the revised view should recognize the evidential significance of interventions made not only on the apparatus itself, but also on representations of the apparatus.

Acknowledgements The case study discussed in this paper was developed in the course of a study on ultracold physics in which I assisted Ian Hacking. I am extremely grateful to him, as well as to Margaret Morrison, Joseph Berkovitz and Christopher Ellenor, for their helpful comments on earlier versions of this paper. This work also benefitted from comments by several members of the audience at the 2008 meeting of the Canadian Society for the History and Philosophy of Science and from lively discussions with participants of the 'Data, Phenomena, Theories' conference in Heidelberg.
References
Bogen, J., & Woodward, J. (1988). Saving the phenomena. The Philosophical Review, 97(3), 303–352.
Bogen, J., & Woodward, J. (2003). Evading the IRS. Poznan Studies in the Philosophy of the Sciences and the Humanities, 20, 223–256.
Chang, H. (2004). Inventing temperature: Measurement and scientific progress. Oxford: Oxford University Press.
Clark, S. R., & Jaksch, D. (2006). Signatures of the superfluid to Mott-insulator transition in the excitation spectrum of ultracold atoms. New Journal of Physics, 8(8), 160.
Franklin, A. (1997). Calibration. Perspectives on Science, 5(1), 31–80.
Galison, P. (1997). Image and logic. Chicago: University of Chicago Press.
Greiner, M., et al. (2002). Quantum phase transition from a superfluid to a Mott insulator in a gas of ultracold atoms. Nature, 415, 39–44.
Hartmann, S. (1996). The world as a process: Simulations in the natural and social sciences. In R. Hegselmann et al. (Eds.), Modelling and simulation in the social sciences from the philosophy of science point of view (pp. 77–100). Theory and decision library. Dordrecht: Kluwer.
Hubbard, J. (1963). Electron correlations in narrow energy bands. Proceedings of the Royal Society of London A, 276(1365), 238–257.
Hughes, R. I. G. (1999). The Ising model, computer simulation and universal physics. In M. S. Morgan & M. Morrison (Eds.), Models as mediators: Perspectives on natural and social science (pp. 97–145). Cambridge: Cambridge University Press.
Humphreys, P. (2004). Extending ourselves. Oxford: Oxford University Press.
Keller, E. F. (2003). Models, simulation, and computer experiments. In H. Radder (Ed.), The philosophy of scientific experimentation (pp. 198–215). Pittsburgh: University of Pittsburgh Press.
Merz, M. (2006). Locating the dry lab on the lab map. In J. Lenhard, G. Küppers, & T. Shinn (Eds.), Simulation: Pragmatic construction of reality (Vol. 25, pp. 155–172). Sociology of science yearbook. Dordrecht: Springer.
Morgan, M. S. (2002). Model experiments and models in experiments. In L. Magnani & N. J. Nersessian (Eds.), Model based reasoning: Science, technology, values (pp. 41–58). New York: Kluwer.
Parker, W. (2009). Does matter really matter? Computer simulations, experiments, and materiality. Synthese, 169(3), 483–496.
Stöferle, T., Moritz, H., Schori, C., Köhl, M., & Esslinger, T. (2004). Transition from a strongly interacting 1D superfluid to a Mott insulator. Physical Review Letters, 92(13), 130403(4).
Vidal, G. (2003). Efficient classical simulation of slightly entangled quantum computations. Physical Review Letters, 91(4), 147902(4).
Winsberg, E. (2003). Simulated experiments: Methodology for a virtual world. Philosophy of Science, 70(1), 105–125.
Winsberg, E. (2009). A tale of two methods. Synthese, 169(3), 575–592.
Woodward, J. (2000). Data, phenomena and reliability. Philosophy of Science, 67, S163–S179.
Synthese (2011) 182:131–148 DOI 10.1007/s11229-009-9621-x
Data and phenomena in conceptual modelling Benedikt Löwe · Thomas Müller
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 4 July 2009 © The Author(s) 2009. This article is published with open access at Springerlink.com
Abstract The distinction between data and phenomena introduced by Bogen and Woodward (Philosophical Review 97(3):303–352, 1988) was meant to help account for scientific practice, especially in relation to scientific theory testing. Their article and the subsequent discussion are primarily viewed as internal to philosophy of science. We shall argue that the data/phenomena distinction can be used much more broadly in modelling processes in philosophy.

Keywords Modelling · Methodology · Metaphilosophy
We would like to dedicate this paper to the memory of our colleague at Bonn, Daniela Bailer-Jones. B. Löwe Institute for Logic, Language and Computation, Universiteit van Amsterdam, Postbus 94242, 1090 GE Amsterdam, The Netherlands e-mail:
[email protected] B. Löwe Department Mathematik, Universität Hamburg, Bundesstrasse 55, 20146 Hamburg, Germany B. Löwe Mathematisches Institut, Rheinische Friedrich-Wilhelms-Universität Bonn, Endenicher Allee 60, 53115 Bonn, Germany T. Müller (B) Departement Wijsbegeerte, Universiteit Utrecht, Heidelberglaan 6-8, 3584 CS Utrecht, The Netherlands e-mail:
[email protected] T. Müller Institut für Philosophie, Rheinische Friedrich-Wilhelms-Universität Bonn, Lennéstraße 39, 53113 Bonn, Germany
1 Introduction

The distinction between data and phenomena introduced by Bogen and Woodward (1988) was meant to help account for scientific practice, especially scientific theory testing. Their article and the subsequent discussion are primarily viewed as internal to philosophy of science. In this paper, we apply their distinction to the general technique of conceptual modelling, a widespread methodology that is also employed in philosophy. Distinguishing between data and phenomena will allow us to shed some light on a number of philosophical and metaphilosophical issues: it provides for a stance from which one can assess the status of empirical methods in philosophy, and it helps to distinguish between good and bad uses of the technique of conceptual modelling in specific arguments employed in analytical philosophy. In Sect. 2, we give a general overview of the technique of conceptual modelling with some examples. In Sect. 3, we introduce our main metaphilosophical issue, viz., the fundamental difficulty that practitioners of philosophy face when doing "philosophy of X", e.g., philosophy of mathematics, philosophy of language, or the philosophy of time: they have to walk a thin line between two methodological dangers, the Scylla of mere armchair philosophy and the Charybdis of essentially becoming sociologists of X. In the context of conceptual modelling, we can give a precise description of these two perils. After that, in Sect. 4 we provide a recapitulation of the distinction between data and phenomena and show how it captures some fine structure of the method of conceptual modelling. In order to illustrate the general usefulness of making the data/phenomena distinction in philosophy, we give a case study from epistemology and connect it to the 1950s discussion about ordinary language philosophy. We then formulate specific lessons for the current debate about the use of experiments and intuitions in philosophy. Finally, in Sect. 5, we provide our answer to the questions raised in Sect. 3. By using the detailed analysis of potential problems generated by confusing data and phenomena, we propose a means of charting the waters between Scylla and Charybdis, and analyse why the modeller sometimes runs into Scylla and sometimes is swallowed by Charybdis.
2 Conceptual modelling

In science and engineering, mathematical modelling has long been seen as one of the most fundamental methodologies, and its scope is perceived as growing:

Nowadays, mathematical modelling has a key role also in fields such as the environment and industry, while its potential contribution in many other areas is becoming more and more evident. One of the reasons for this growing success is definitely due to the impetuous progress of scientific computation; this discipline allows the translation of a mathematical model […] into algorithms that can be treated and solved by ever more powerful computers. (Quarteroni 2009, p. 10)

Mathematical modelling thus presupposes quantitative and computational methods. However, a slight generalization of the same methodology is ubiquitous also in
non-quantitative research areas. In analogy to mathematical modelling, we call this more general technique conceptual modelling.1 Conceptual modelling is an iterative process through which a stable equilibrium is reached between a concept or a collection of concepts as explanandum and a (somewhat) formal representation of it.2 Each iteration towards the equilibrium involves three steps (a toy algorithmic sketch of the cycle follows the list):
1. Formal representation. Guided by either some initial understanding of X or the earlier steps in the iteration, one develops a formal representation of the explanandum.
2. Phenomenology. With a view towards step 3, one collects evidence in the range of the explanandum that is ideally able either to corroborate or to question the current formal representation.—We call this step "phenomenology" because it is generally broader than mere data collection; this will become important below, and will be discussed in Sects. 4 and 5.
3. Assessment. In the light of the results from step 2, one assesses the adequacy of the representation. If this assessment is positive, the modelling cycle is left—no further iteration is necessary since an equilibrium has been reached. Otherwise, the representation has to be changed, and a new iteration is started at step 1.
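As a toy illustration of the three-step cycle, consider the parameter-fitting instance mentioned below. The following sketch is purely illustrative: the quadratic model, the synthetic data, the tolerance and the gradient update are invented and are not taken from the paper.

import numpy as np

# Toy data: the 'explanandum' we want a formal representation of.
rng = np.random.default_rng(3)
x = np.linspace(-1.0, 1.0, 40)
data = 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)

a, tolerance, step_size = 0.0, 0.02, 0.1     # initial representation and assessment standard

for iteration in range(100):
    # Step 1: formal representation (here: the parameterized model a * x**2).
    model = a * x**2
    # Step 2: phenomenology (here: a goodness-of-fit measure against the data).
    misfit = np.mean((data - model) ** 2)
    # Step 3: assessment; leave the cycle once an equilibrium is reached.
    if misfit < tolerance:
        break
    # Otherwise revise the representation and start a new iteration (gradient step on a).
    gradient = -2.0 * np.mean((data - model) * x**2)
    a -= step_size * gradient

print(f"converged after {iteration} iterations with a = {a:.2f}, misfit = {misfit:.3f}")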
1 The use of the word "modelling" calls to mind the debate about theories and models in philosophy of science. This overlap in terminology may be somewhat unfortunate, but first, the entrenchment of the term "mathematical modelling" does not allow another choice of phrase, and secondly, calling the method "modelling" is also adequate from the point of view of that debate, since stressing that one is just after a model does not suggest deep metaphysical involvement. Cf. our remarks about logical analysis (which does suggest such involvement) below, note 3.
2 When speaking of "formal representation" here, we wish to leave it open whether this representation just employs some predicate-logical symbolization with a view towards regimented natural language, uses a formal language such as the language of modal logic, or specifies a mathematical model of some sort.
This method obviously covers mathematical modelling as employed in the sciences and in engineering, where the formal representation typically comes with a numerical mathematical model that allows for quantitative predictions. In this case, the phenomenology step consists of quantitative experiments and/or observations. Another instance of modelling is given by the iterative schemes of parameter fitting that are often employed in data analysis, where something like the above procedure is implemented in the form of concrete algorithms. In that case, the phenomenology step involves, e.g., the computation of a "goodness of fit" measure (based on the discrepancy between the given data points and the current parameterized model) that is then assessed in order to decide whether convergence has been reached, or whether a new iteration with updated parameters is necessary. It can also be illuminating to describe historical episodes in the sciences in terms of the modelling paradigm, when assessment in the light of new data leads to theory change. As an illustration, one may consider Kepler's two successive theories of planetary orbits, only the second of which (incorporating what is now known as "Kepler's laws") survived assessment in the light of Brahe's astronomical observations (Kuhn 1957). History also provides us with striking examples of the use of conceptual modelling in non-quantitative contexts—consider, e.g., Ventris's deciphering of Linear B starting with the faulty theoretical assumption that
the language was not related to Greek (Chadwick 1990). And in philosophy, modelling is a useful generalization of the paradigm of “analysis”, as we shall lay out below. The modelling paradigm is also broad enough to cover two opposite stances with respect to the aim of theory construction: should a theory just account for given observations, or is some deeper insight aimed at? In this debate about “saving the phenomena”, one camp joins Osiander’s preface to Copernicus’s De Revolutionibus in viewing the aim of theory construction to be the establishment of a model that simply reproduces what is observable, thus yielding, in Duhem’s words, [ …] a system of mathematical propositions, deduced from a small number of principles, which aim to represent as simply and completely, and exactly as possible, a set of experimental laws. (Duhem 1906, p. 19) At the other end of the spectrum, one may demand models that are not just empirically, but somehow also structurally appropriate in the sense of capturing the essence of what they are models of. No matter how this demand can be spelled out explicitly— within our modelling paradigm it clearly refers to a standard of assessment (step 3) that differs from the one in place in the “saving the phenomena” camp. We already stated that the method of conceptual modelling is not confined to the sciences, nor to quantitative methods. In the case of philosophy one should also note that conceptual modelling is more general and more dynamical than the prominent method of “conceptual analysis”. For an overview of this method, we refer the reader to Beaney (2008).3 The wider scope of conceptual modelling vis-à-vis conceptual analysis is important, e.g., when it comes to the debate about the use of intuitions in philosophy, which we shall address in Sect. 4.3. Thus, in his defense of the use of intuitions in philosophy against some positions of experimental philosophy, Ernest Sosa writes: It is often claimed that analytic philosophy appeals to armchair intuitions in the service of “conceptual analysis”. But this is deplorably misleading. The use of intuitions in philosophy should not be tied exclusively to conceptual analysis. Consider some main subjects of prominent debate: utilitarian versus deontological theories in ethics, for example, or Rawls’s theory of justice in social and political philosophy [ …]. These are not controversies about the conceptual analysis of some concept. [ …] Yet they have been properly conducted in terms of hypothetical examples, and intuitions about these examples. (Sosa 2007, p. 100) All the philosophical debates that Sosa mentions as falling outside the scope of conceptual analysis do fall under the scope of the more general technique of conceptual
3 The well-known distinction between “theory” and “model” in philosophy of science (cf. note 1) is approximately reflected in the distinction between “conceptual analysis” and “conceptual modelling”. At least on a certain reading, analysis is meant to render an initially imprecise and vague problem in its ‘true’ logical form via the reduction of natural language to formal logic. On the other hand, the technique of conceptual modelling makes no claims about the relationship between the model and the described concept other than adequacy in the context at hand, thus staying closer to the rôle of models in science.
modelling. The use of intuition can therefore be addressed more adequately within that framework.4 Let us add a disclaimer at this point: Even though conceptual modelling is broader than conceptual analysis, we do of course realize that conceptual modelling is not the only method of philosophy. There are philosophical argumentation techniques that will not fit into our general scheme. Our claim is just that conceptual modelling is widespread enough to guarantee that understanding the metaphilosophical implications and details of this particular philosophical technique will greatly advance our understanding of philosophical practice in general—even if some aspects of philosophical methodology are not covered. 3 “What is a philosophy of X?” One of the hardest methodological challenges in philosophy is to give an account of what one is doing in establishing a “philosophy of” something that is already there.5 Philosophical theory building often starts out with a normative agenda, which threatens to lose touch with what the theory is to be about. On the other hand, a purely descriptive approach can be criticised for failing to be distinctively philosophical. Is there a safe passage for “philosophy of X ”, generally, between the Scylla of normative subject-blindness and the Charybdis of merely descriptive non-philosophy? One can fruitfully read the historical development of philosophy of science in the twentieth century as a searching party in these difficult waters. Broadly speaking, philosophy of science as a separate subfield of philosophy started out from the movement of logical empiricism in the 1920s. In their denial of metaphysics, these philosophers only acknowledged a formal reconstruction task for philosophy, as stated forcefully by Carnap: Philosophie betreiben bedeutet nichts Anderes als: die Begriffe und Sätze der Wissenschaft durch logische Analyse klären. (Carnap 1930, p. 26)6 Carnap would certainly reject as metaphysical the suggestion that the aim of logical analysis—and thus, on his view, of philosophy—should be to capture the essence of science. But logical analysis demands more than “saving the phenomena” in terms of a description of actual scientific practice: it is meant to give a rational reconstruction, not just a description, of such practice.7 Thus, philosophy of science started out dangerously close to Scylla, as a purely normative enterprise in which logical methods were employed to build up a reconstructed version of science without any direct 4 Kuipers (2007) also stresses the fact that in analytical philosophy and especially in philosophy of science, a method broader than (static) conceptual analysis is often employed. He discusses this fact in the context of a broadening of the understanding of the method of explication originally proposed by the logical empiricists. Our framework of conceptual modelling is broader than the notion of explication, but shares its dynamical nature. 5 A subcase of this challenge is the so-called “paradox of analysis”; cf. again Beaney (2008). 6 “To pursue philosophy means nothing but: clarifying the concepts and sentences of science by logical
analysis.” 7 Cf. our remarks in notes 1 and 3 above.
link to science as actually practiced.8 The historical turn that philosophy of science took in the 1960s pushed the field towards the other extreme. While close attention to actual scientific practice was certainly a necessary corrective moment, critics saw philosophy of science dissolve into history or sociology. In the meantime, however, philosophy of science seems to have settled for a healthy compromise between the two extremes: historical case studies and sociological findings are generally taken seriously, but philosophy of science has not collapsed into history or sociology. These disciplines are rather seen as informing the field by providing material for philosophical reflection. While it would be an overstatement to claim that there was a universally accepted methodological agreement as to the importance of such empirical input, there is clearly a consensus among philosophers of science that good philosophical work needs to steer a course between a merely normative and a merely empirical approach. Overall it seems that we have here a success story of establishing a “philosophy of X ” in one specific case. Can this be generalized? In the case of a philosophical approach using the method of conceptual modelling, one can already identify Scylla and Charybdis in the three step process described above: by the way in which the phenomenological step (step 2) is performed, a philosopher decides how close he or she wants to maneuver towards either of the dangers. An extreme case of armchair philosophy would base the phenomenology solely on introspective truth, using just the philosopher’s own intuition as basis for the following assessment step—if such a step is performed at all. Especially if the philosopher is not an expert of X , this is steering dangerously close to Scylla, as one may be losing touch with the subject of investigation at issue. On the other hand, an empirically based approach will involve a data collection step that will provide a description of human behaviour with respect to X . However, how do we guarantee that this gives us more than just a description of human behaviour? How do we salvage the philosophical content of our theory? The current debate about the use of empirical methods in philosophy, under the heading of experimental philosophy, can usefully be described in our framework: it shows a normative (“armchair”) and a descriptive (“experimental”) camp pulling in different directions, corresponding to different ideas as to what a sound phenomenology step in conceptual modelling should look like. We shall address this debate in more detail in Sect. 4.3 below, making use of the data/phenomena distinction. 4 Data vs. phenomena in science—and in philosophy Bogen and Woodward (1988) famously distinguish data—local, situated facts—from phenomena—the empirical material against which theories are tested: 8 The matter is of course more diverse than this brief sketch suggests. For one thing, there were other historical developments that added to what became philosophy of science as we know it—one only has to think of the French school initiated by Poincaré. And on the other hand, as Uebel (2001) and Nemeth (2007) point out, the “left wing” of the Vienna circle itself, one of the birthplaces of logical empiricism, didn’t just incorporate Carnap’s movement of logical reconstruction, but also Neurath, who was much more open to sociological investigations—closer to the “saving the phenomena” attitude indeed; but then, closer to Charybdis, too.
Data, which play the role of evidence for the existence of phenomena, for the most part can be straightforwardly observed. However, data typically cannot be predicted or systematically explained by theory. By contrast, well-developed scientific theories do predict and explain facts about phenomena. Phenomena are detected through the use of data, but in most cases are not observable in any interesting sense of that term. (Bogen and Woodward 1988, p. 305f.) Bogen and Woodward show through a number of case studies that the step from data to phenomena is indeed crucial for scientific practice, and that specific local skills—knowledge of possible confounding factors or about the quirks of individual scientific instruments—are needed to make that step. In their simplest case study, they take up an example that Ernest Nagel, a defender of the normative approach to philosophy of science of the 1950s, discusses in his influential book The Structure of Science (Nagel 1961, p. 79): measuring the melting point of lead. Nagel suggests that the sentence "lead melts at 327°C" is a statement about an observation. Bogen and Woodward show, however, that an individual observation of a sample of lead's melting first needs to be put into its local context, a requirement reminiscent of the traditional hermeneutical rule of "sensus totius orationis" or "semantic holism".9 Without going into the details, making one such observation in a laboratory involves a number of technical and subjective factors that are crucial for the interpretation of the resulting single data point. Many such observations pooled together may allow one to infer to the existence of a stable phenomenon, which would then be expressed by the given sentence, "lead melts at 327°C". It is this phenomenon that, e.g., a quantum-mechanical model of the crystal structure of lead would have to explain—and not a single thermometer reading, which may be influenced by, e.g., draught in the laboratory because of someone's entering and opening the door. A skilled experimenter knows about such and other confounding factors (in one case Bogen and Woodward cite, involving the experimenter's boss's heavy steps on the stairwell), and takes them into account prudently when inferring to the existence of phenomena. Ex post it may well be possible to capture such effects systematically via appropriate mini-theories describing a certain class of observational effects, and anticipating them appropriately is a prerequisite of good experimental design. The structure of the actual lab work however always remains open for all kinds of unforeseen interventions, and accordingly, there is no formal recipe for the step from data to phenomena. We claim that the data/phenomena distinction makes sense not just within philosophy of science, but in all cases in which the technique of conceptual modelling is employed. The distinction allows one to discern some fine structure within the phenomenology step, which is not just a step of data collection. The actual work (the empirical work in the lab, the historical work in the archive, the sociological work with questionnaires or interviews—but also armchair introspection) indeed just leads to data: idiosyncratic local facts. However, from these, stable and reproducible phenomena have to be distilled in order to be put up against the theory at issue in the 9 Cf. (Scholz 2001, p.
71) or (Bühler and Cataldi Madonna 1996); e.g., “ein Wort hat nur eine Bedeutung keinen Sinn, ein Saz (sic) an und für sich hat einen Sinn aber noch keinen Verstand, sondern den hat nur eine völlig geschlossene Rede” (Schleiermacher 1838, p. 41).
modelling cycle.10 A phenomenology step that leaves out this move amounts to mere data collection—it is incomplete and can lead to a wrong assessment. Before we move on to employ the data/phenomena distinction in addressing the general question about “philosophy of X ” discussed in the previous section, we shall now illustrate its usefulness via two short case studies. We shall first show how the distinction can be used in epistemology by addressing the question of the factivity of knowledge (Sect. 4.1). Secondly, we shall comment on a specific discussion about ordinary language philosophy in the late 1950s (Sect. 4.2). More generally, we will then point to some lessons for the proper place of intuition and experiment in conceptual modelling (Sect. 4.3). 4.1 Empirical questions in epistemology Many central questions of epistemology center around the concept of knowledge: what knowledge is, which knowledge we have or can acquire in which circumstances, and how knowledge relates to truth, belief, and justification. It seems obvious that there is a link between these questions and empirical facts about knowledge attributions, which may be specific utterances of “X knows that p”. After all, philosophers are interested in the concept of knowledge because people think and talk about knowledge—if we had no use for knowledge attributions, the field of epistemology probably wouldn’t be there. Also quite plausibly, the socially acknowledged attribution of knowledge of p to X would appear to be a good guide as to whether some person X actually knows some p. Surely the link will not be strict—people make mistakes, and there can be pragmatic factors bringing about non-literal uses of such attributions (after all we know that “Can you pass the salt?” isn’t a question about physical ability either). But if socially acknowledged attribution of knowledge was systematically talking about a phenomenon (sic!) different from knowledge, why wouldn’t philosophers investigate that concept instead? Shouldn’t there at least be some link between the utterances and the philosophical theory? The debate about such a link is rather fierce. Some epistemologists favour a completely normative and a priori approach to knowledge that would not acknowledge the relevance of actual usage—cf., e.g., Hazlett (2009) for an overview. In this camp, the nature of knowledge is held to be fixed by philosophical theory derived via an armchair method based on singular intuitions and introspection. Actual knowledge attributions are then assessed pragmatically, i.e., for cases in which philosophical epistemology and actual use differ, a pragmatic rule is invoked to explain the discrepancy. Such a philosophical analysis is in danger of becoming revisionist philosophy of knowledge by claiming that knowledge assertions in natural language are not about the concept called “knowledge” in philosophy, but about some other concept that happens to be called the same in natural language. According to this extreme view, native speakers using the verb “to know” are referring to this other concept, and their usage has no bearing whatsoever on epistemology. At the other extreme, a purely empirical analysis 10 In fact, phenomena usually play a double rôle in modelling: from the theory, one derives predicted phenomena which can then be compared with the phenomena derived from the data, and any discrepancy here will be important for the third modelling step of assessment.
of the natural language usage of “to know” clearly loses its philosophical component and becomes a mere description of speaker behaviour. Let us illustrate this opposition by the question of the factivity of knowledge, i.e., the question of whether one can know only that which is true. Most a priori philosophical theories of knowledge hold on to this principle, while some contextualist theories deny it. It is a plain empirical fact that people sometimes use knowledge attributions, and apparently felicitously, in a way that denies the factivity of knowledge.11 What are we to make of this? From the point of view of conceptual modelling, it becomes clear that we are here confronted with a debate about the phenomenology step of modelling and about its assessment. Proponents of an a priori theory of factive knowledge may downplay the importance of the empirical material by pointing out that people are sometimes confused about the concepts they employ (which is certainly true), by giving an alternative, pragmatic reading of the knowledge attributions under consideration, or by simply insisting that such examples are philosophically irrelevant. Proponents of a more descriptive approach, on the other hand, may remain unimpressed and point to the undeniable empirical data, accusing the apriorist of an isolating strategy and of reliance on a single data point gathered via introspection. In this debate, the data/phenomena distinction can be employed to do more justice to both sides, and perhaps to advance the state of the discussion. (We refrain from taking sides here.) The facts that the descriptivists point to are first and foremost data—observable, but local and idiosyncratic facts that do not have the status of theoretically relevant phenomena yet. A further step is necessary here—and even internal consistency of the data after averaging out random errors isn’t enough: there can be systematic, not just random errors in data collection.12 Let us thus assume that knowledge of p was attributed to X in a certain situation (this already demands a certain interpretation of the words uttered, since irony, play-acting etc. have to be excluded), and that nobody complained even though p was false. Let us assume that a number of such data have been accumulated. When can we infer from there to an epistemologically relevant phenomenon of non-factivity of knowledge? We pointed out above that the move from data to phenomena is one that requires skill and acquaintance with the local idiosyncrasies of the data. For the case at hand this means that one will have to pay close attention to the local circumstances in which the data (the knowledge attributions) were recorded. The main confounding factor is the possibility of a pragmatic reading of the knowledge attribution. If in a given case a recorded knowledge attribution has a pragmatic explanation—and there are such cases—, then it will not corroborate the phenomenon of non-factive knowledge. One should however observe that the onus of proof in such cases is on the normative 11 Hazlett (2009) gives the example “Everyone knew that stress caused ulcers, before two Australian doctors in the early 80s proved that ulcers are actually caused by bacterial infection”, saying that this sentence “does not strike ordinary people as deviant, improper, unacceptable, necessarily false, etc.”. Further examples along these lines are easy to find. For some empirical evidence for the non-factive use of “to know” in mathematical practice cf. Müller-Hill (2009) and Löwe et al. 
(2009). 12 E.g., in measuring the melting point of lead, one might be using a wrongly calibrated thermometer, or consistently misjudge the visual signs of melting. To repeat, the step from data to phenomena isn’t automatic.
theoretician. As Austin (1956) knew, ordinary use may not have the last word, but it does have the first. Generalizing from the specific example, we see that an a priori approach can be correct in criticizing purported empirical refutations of a philosophical theory by single cases and in dismissing them as reports of idiosyncratic usage. This holds good as long as the empiricists have not identified phenomena but are merely relying on data. As soon as a stable and philosophically relevant phenomenon is identified that challenges the a priori theory, however, the empiricist has a valid point against the theory. 4.2 On the phenomenological basis of ordinary language philosophy: Mates vs. Cavell in 1958 In this subsection, our topic is the question of the nature of the factual foundation of the ordinary language philosophy of the 1950s. We shall discuss the debate between Mates (1958) and Cavell (1958).13 In the 1950s, much of analytic philosophy had taken a linguistic turn towards ordinary language, apparently following Wittgenstein’s maxim, “Denk nicht, sondern schau!”. The idea was that many philosophical problems should be resolvable through analysis of language as ordinarily practiced—after all, Wittgenstein had also taught that philosophy was language’s going on holiday.14 Reverting to ordinary use should thus expose many philosophical problems as spurious. If this is to be the method of philosophy, one obvious question is where the facts about ordinary use come from. Benson Mates (1958) asks this question and argues that an empirical approach is necessary: statements about ordinary use are empirical statements and as such should be addressed, e.g., by the methods of empirical linguistics.15 Stanley Cavell (1958), Mates’s co-symposionist at the 1957 APA Pacific meeting from which both papers ensued, however claims that it is a mistake to view such statements as straightforwardly empirical. According to him, a native speaker has a form of access to facts about ordinary use that is different from and philosophically much more pertinent than nose-counting. One of Mates’s main points—to which, as far as we can see, Cavell has no direct response—is that the philosophical access to the pertinent facts seems to lead to incompatible results. It may be worth quoting at length; the context of the discussion is an assessment of Ryle’s claim that (most of) the problem of free will comes from disregarding the ordinary use of the term “voluntary”, which according to him only really applies to “actions which ought not to be done” (Ryle 1949, p. 69). Mates comments: [T]he intuitive findings of different people, even of different experts, are often inconsistent. Thus, for example, while Prof. Ryle tells us that “voluntary” and “involuntary” in their ordinary use are applied only to actions which ought not 13 We should like to thank Jim Bogen for drawing our attention to this material. 14 “Philosophische Probleme entstehen, wenn die Sprache feiert” (Wittgenstein 1953, para. 38).—The
“don’t think, but look” maxim is from para. 66. 15 Mates in fact envisages at least two methodological ways of accessing such facts, cf. (Mates 1958, p. 165). We shall not go into any detail here.
to be done, his colleague Prof. Austin states in another connection: “… for example, take ‘voluntary’ and ‘involuntary’: we may join the army or make a gift voluntarily, we may hiccough or make a small gesture involuntarily …” If agreement about usage cannot be reached within so restricted a sample as the class of Oxford Professors of Philosophy, what are the prospects when the sample is enlarged? (Mates 1958, p. 165) Well, how should this question about ordinary use be resolved—and what is the proper methodological place of statements about ordinary use in philosophical arguments anyhow? Again, our main point here is not to try and resolve these questions, but to point to the usefulness of the data/phenomena distinction in modelling. Statements about ordinary use are first and foremost data. This holds no matter whether such statements are established by empirical investigation (be it nose-counting or interactive interviews) or by armchair introspection exposing intuitions. What is needed in the iterative scheme of conceptual modelling, on the other hand, is stable phenomena. The question of how and when given data may be used to infer to the existence of phenomena is, as always, a delicate one. As in the epistemology case from Sect. 4.1, empirical data about usage may be open to pragmatic maneuvers that explain away observed linguistic behaviour as not pertinent to a given philosophical question. And it is true that a competent speaker of a language, as a member of the community whose practice it is to speak that language, has a special access to that practice that allows him or her to say “we do it like this” that may not need empirical support—Cavell (1958) gives a detailed exposition of this view. One should however acknowledge the methodological pluralism that is now standard in linguistics: according to Karlsson’s (2008), “grammatical intuition [ …] is accessible by introspection, elicitation, experimental testing, and indirectly by observation of language data”. In this way, conflicting native speakers’ intuitions can be resolved by collecting more data from various sources—a standard move in inference to the existence of phenomena in the sciences (Bogen and Woodward 1988). As in the lab, so in philosophy, data only make sense if their local context is properly taken into account, and the touchstone for theories consists of phenomena, not of data. 4.3 On the proper place of experiment and intuition in conceptual modelling The discussion between Mates and Cavell outlined in Sect. 4.2 is a precursor of a current and very intense metaphilosophical debate about the rôle of intuition and experiment in philosophy. In the last section, we linked the discussion from the 1950s to the data/phenomena distinction, and similarly, we can also locate it as a crucial but neglected factor in the current debate, at least for philosophical approaches in the paradigm of conceptual modelling—which covers a great part of the locus of the current metaphilosophical debate; viz., analytical philosophy. In the mentioned debate, we find two opposed camps whose extremes correspond roughly to Scylla and Charybdis from Sect. 3. Situating the discussion explicitly within the conceptual modelling framework and insisting on the data/phenomena distinction
will help us to identify what is going on in this debate and to develop criteria for assessing whether a use of intuition or experiment is problematic or not. Experimental philosophers have attacked the use of intuition and introspection, claiming that these are too much dependent on external factors to be philosophically relevant. To take just one example, Weinberg et al. (2001) describe a cluster of empirical hypotheses about epistemic intuitions (e.g., “Epistemic intuitions vary from culture to culture”) and then argue that “a sizeable group of epistemological projects [ …] would be seriously undermined if one or more [ …] of [these] hypotheses about epistemic intuitions turns out to be true” (Weinberg et al. 2001, p. 429). They then provide empirical data that suggests that these hypotheses are in fact true.16 In general, some experimental philosophers believe that intuitions are only “suited to the task of providing an account of the considered epistemic judgements of (mostly) well-off Westerners with Ph.D.’s in Philosophy” (Bishop and Trout 2005, p. 107). Critics of experimental philosophy read this as a complete denial of the philosophical relevance of intuition. Against this they stress the fact that sometimes the experimentally observed disagreement is merely verbal and not substantive (Sosa 2007, p. 102), that “dialogue and reflection” rather than singular and possibly idiosyncratic introspection have “epistemological authority” (Kauppinen 2007, p. 113)17 and that “intuitions involve the very same cognitive capabilities that we use elsewhere” (Williamson 2004, p. 152). In slightly differing ways, both Sosa and Williamson stress the analogy between the philosophical use of intuitions and other methods of scientific endeavour: [I]ntuition is supposed to function [ …] in philosophy [ …] by analogy with the way observation is supposed to function in empirical science. (Sosa 2007, p. 106) Metaphilosophical talk about intuitions [ …] conceals the continuity between philosophical thinking and the rest of our thinking. (Williamson 2004, p. 152) We believe that the discussion about the rôle of intuition and experiment in philosophy can benefit from the two methodological points that we have argued for in this paper. First, viewing philosophical arguments as cases of conceptual modelling helps to understand the rôle that intuition or experiment are supposed to play: they constitute part of the phenomenology step. Sosa and Williamson in their respective ways are right in stressing the analogy with the sciences here. Second, making the data/phenomena distinction helps to see that there is nothing wrong with intuition, nor with experiments,
16 As just one example, they tested a Gettier-style story (similar to the ones discussed in Sect. 5.1 below) about Jill driving an American car on Western and East Asian test subjects. While the Western subjects agreed with the mainstream philosophical opinion that no true knowledge exists in a Gettier-style situation, the majority of East Asian subjects disagreed (Weinberg et al. 2001, Sect. 3.3.2). For further results concerning dependence of intuitions on the order of presentation of stories, cf. also Swain et al. (2008). 17 Kauppinen (2007) also criticizes the experimentalists for using non-participatory questionnaire methods. Note that there are other, qualitative and participatory methods of the social sciences that may be better and more reliable ways of inferring philosophically relevant phenomena. Of course, experimental philosophers are to blame if they do not employ the full richness of methods available in the empirical social sciences.
as such.18 Both are methods for arriving at data: local, idiosyncratic facts. Introspection at first tells us only something about the intuitions of one particular human being, the philosopher; similarly, experiments give us data about particular individual human beings. It is the move to stable phenomena that makes all the difference. Both intuition and experiment are problematic if the assessment step of modelling uses just data and not stable phenomena, or if the distinction is not made at all.19 It is interesting to note that the discussion about intuition in philosophy has its direct cognate in the meta-discussion in linguistics where the rôle of native speaker intuitions about grammaticality has been questioned.20 While looking at this analogy, it should strike the philosopher as interesting that native-speaker intuition in linguistics is a different concept from the philosopher's intuition in many examples of analytic philosophy: we already quoted Karlsson's (2008) point about the multiple accessibility of grammatical intuition. Using native-speaker intuitions in linguistics is methodologically sound because they can be corroborated as phenomena rather than data by elicitation and testing. The problem with philosophical intuition is that very often this seems to be not the case—and papers such as Weinberg et al. (2001) (even if one may be sceptical about their methodology) show that there is a problem. If one is not able to identify a phenomenon behind the intuitions one refers to—and such a phenomenon has to be stable across cultural differences to be philosophically relevant—then something is amiss. 5 Assessing instances of conceptual modelling In Sect. 3, we discussed the Scylla of divorcing philosophy of X from X and the Charybdis of transforming philosophy of X into an empirical analysis of human dealings with X . From our discussion, it is clear that philosophy of X , when conducted in the style of conceptual modelling, will have to maneuver between these two perils in all three modelling steps. This provides us with a kind of checklist for assessing instances of conceptual modelling in philosophy.
1. Formal representation. In this step one has to strive for a representation that is properly philosophical—not just any representation of aspects of X will do. What counts as properly philosophical, or as philosophically interesting, also has to do with the historical development of the subject of philosophy. As is well known, what now is physics used to be philosophy a few hundred years ago. Nowadays, when one envisages, e.g., a philosophy of matter, many questions will just be outside of the scope of a philosophical representation and threaten to move one into the whirlpools of Charybdis.
18 In this assessment we thus join the balanced view of Symons (2008). 19 For a clear example of running the two together, cf. Kauppinen (2007). Grundmann (2003) asks for
broadening the range of data to be used in philosophy (“die Datenbasis [ …] erweitern”, p. 50). Our point is that it isn’t the amount of data but data that allow for the isolation of stable phenomena, which makes the crucial difference. 20 Cf. Schütze (1996), Sect. 2.4, entitled “Introspection, Intuition and Judgment”, which starts with a quo-
tation by Raven McDavid: “Being a native speaker doesn’t confer papal infallibility on one’s intuitive judgements”.
2. Phenomenology. Especially in this step, the two mentioned dangers both pose a genuine threat. Collecting phenomena without philosophical relevance will push the modeller away from the field of philosophy, while failing to collect phenomena in the field of X will cause the modeller to falter at the rock of Scylla. The latter can happen in two different ways: either by collecting phenomena that fall outside of the scope of X , or, even worse, by failing to infer stable phenomena from the data about X that one has collected.
3. Assessment. The dangers in this step are related to those of the previous step. Mistaking data for phenomena, or failing to make the distinction, is widespread in philosophy, leading to a premature end of the modelling cycle. In many instances, the assessment step is even lacking altogether, which means that the cyclical nature of modelling is not taken seriously.
As an application of our methodological results, we shall now look at cases of good and bad conceptual modelling by discussing the phenomenology step of three well-known philosophical arguments. This will allow us, finally, to comment on a certain feeling of dissatisfaction with current analytic philosophy. 5.1 Stopped clocks and bogus barns The question of whether knowledge is (nothing but) justified true belief has been addressed by a number of philosophers. One purported example of justified true belief that is not knowledge, dating from before the famous paper of Gettier (1963), is by Russell: There is a man who looks at a clock which is not going, though he thinks it is, and who happens to look at it at the moment when it is right; this man acquires a true belief as to the time of day, but cannot be said to have knowledge. (Russell 1948, p. 170f.) This example plays its rôle in the phenomenology step of modelling the concept of knowledge. If it points to a true phenomenon of justified true belief without knowledge, then knowledge cannot be (nothing but) justified true belief. How are we to assess this case? In the case of a stopped clock, we have stable natural language intuitions. Clocks stop all the time, and we deal with such situations in everyday life. Telephone calls and e-mail discussions across several time zones give us experience in handling temporal shift and subtle linguistic means of dealing with the fact that the subjective time for different speakers may be different. This experience in turn allows us to confidently claim intuitions in the case of a stopped clock. In this case, thus, an appeal to natural language intuition ("cannot be said to have knowledge") is enough to infer to the existence of the relevant phenomenon. In the larger context of modelling knowledge, this means that the theory at issue is refuted and that a new modelling cycle has to begin. In epistemology, many other examples of justified true belief without knowledge have been discussed, but not all of them are equally well supported. Consider Goldman's (or Ginet's) bogus barn example:
Henry is driving in the countryside with his son. [ …] Suppose we are told that, unknown to Henry, the district he has just entered is full of papier-mâché facsimiles of barns. These facsimiles look from the road exactly like barns, but are really just facades, without back walls or interiors [ …] Henry has not encountered any facsimiles; the object he sees is a genuine barn. But if the object on that site were a facsimile, Henry would mistake it for a barn. Given this new information, we would be strongly inclined to withdraw the claim that Henry knows the object is a barn. (Goldman 1976, p. 772f.) Again, our natural language intuitions ("we would be strongly inclined …") are invoked against the background of a hypothetical situation. Goldman's data point stands: he would not say that Henry has knowledge in that case. Are we allowed to make the step to the existence of the relevant phenomenon here, too? This case seems to be more doubtful—too much depends on fine details of the story. In the given story, Henry has just entered fake barn country and he sees a real barn; why shouldn't he know it is one? Contrast this with a case in which he has already been misled (unknowingly); our intuitions may shift. In a well-conducted instance of conceptual modelling, this situation should prompt us to look for more data to corroborate the phenomenon in question. Typically, an experiment might be called for at this stage—or the switch to a different example, like Russell's, which leads one to a relevant phenomenon more directly.21 5.2 Mad pain and Martian pain On the way to making a point about the mind-body problem, David Lewis opens his paper on "Mad pain and Martian pain" as follows: There might be a strange man who sometimes feels pain, just as we do, but whose pain differs greatly from ours in its causes and effects. Our pain is typically caused by cuts, burns, pressure, and the like; his is caused by moderate exercise on an empty stomach. Our pain is generally distracting; his turns his mind to mathematics, facilitating concentration on that but distracting him from anything else. [ …] [H]e feels pain but his pain does not at all occupy the typical causal role of pain. He would doubtless seem to us to be some sort of madman [ …] I said there might be such a man. I don't know how to prove that something is possible, but my opinion that this is a possible case seems pretty firm. If I want a credible theory of mind, I need a theory that does not deny the possibility of mad pain. (Lewis 1983, p. 122) Lewis thus introduces the possibility of mad pain and goes on to use it as a touchstone for a theory of mind; he speaks of "the lesson of mad pain" being an argument against
21 For Goldman, the Gettier nature of the case is only one of the relevant aspects, which is why Russell’s
example wouldn’t be enough for the purpose of his paper.
functionalism (p. 123). In so doing he treats that possibility as a phenomenon. We may grant that already the possibility, not just the actual existence, of a case of mad pain has intriguing consequences for the philosophy of mind, as Lewis claims. Within our modelling framework, we can however also assess how successful Lewis has been in establishing the existence of the phenomenon in question. He states that he doesn't know "how to prove that something is possible"—read: how to establish the possibility of something as a phenomenon. Now clearly there are established methods for this task; medieval philosophers already knew that ab esse ad posse valet consequentia. Lewis however does not point to an actual case, or even just to a case that is relevantly similar. Instead he argues in favour of his supposition from the fact of his "pretty firm" conviction of the possibility in question; he later repeats that he is concerned here with his "naive opinions about this case" (p. 129). Lewis thus brings forward a piece of data—and a relevant one too. However, by this he has not succeeded in establishing the phenomenon that he needs for his argument against functionalism. Since Lewis in the mentioned article gives no further support for the possibility that he presupposes, his argument should be rejected as methodologically unsound—which, of course, is not the same as rejecting its conclusion. 5.3 Dissatisfaction with current analytic philosophy Having assessed a few 'first-level' philosophical arguments from the methodological perspective provided in this paper, we also mean to provide a (partial) explanation of why some philosophers may think that current analytic philosophy is taking a wrong turn. We can discern and explain two reasons. First, a substantial amount of research results is negative, consisting in the refutation of theories of other people. This is to some extent built into the method of conceptual modelling, which is falsification-prone: any model can be challenged by more data, and challenging the model of a different researcher is easier than coming up with your own model. Secondly, however, the negative image of such research has another, more problematic source: the refutations of theories of other philosophers often involve counterexamples piled on top of each other in thought-experiments dealing with fictitious universes populated with bizarre creatures and objects, such as Martians deploying M-rays (Mele 2003, Chap. 2), zombies (Chalmers 1996), twin earthlings (Putnam 1975), or watchful angels with a dislike for particular glassware (Johnston 1992). Many people feel that they are losing their grip on what such examples can prove, and would generally question the merit of science fiction in philosophy.22 Our analysis allows us to go beyond this vague feeling of dissatisfaction with such thought-experiments, and explain why certain cases are unproblematic and others correspond to a faulty phenomenology step. The data—i.e., that some philosopher is convinced by a certain story and another isn't—can be acknowledged; no war about personal intuitions need be fought. The important question is, rather, whether from such stories we may infer not just to data, but to phenomena.
22 Remember that Wittgenstein went for Westerns.
Acknowledgements The authors acknowledge the support of the Deutsche Forschungsgemeinschaft for the Wissenschaftliches Netzwerk PhiMSAMP (MU1816/5-1). Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
Austin, J. L. (1956). A plea for excuses. Proceedings of the Aristotelian Society, 57, 175–204.
Beaney, M. (2008). Analysis. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (Winter 2008 ed.). Stanford, CA: CSLI. http://plato.stanford.edu/archives/win2008/entries/analysis/
Bishop, M., & Trout, J. (2005). Epistemology and the psychology of human judgment. Oxford: Oxford University Press.
Bogen, J., & Woodward, J. (1988). Saving the phenomena. Philosophical Review, 97(3), 303–352.
Bühler, A., & Cataldi Madonna, L. (1996). Einleitung. In A. Bühler & L. Cataldi Madonna (Eds.), Georg Friedrich Meier, Versuch einer allgemeinen Auslegungskunst (1757), Vol. 482 of Philosophische Bibliothek (pp. VII–IC). Hamburg: Felix Meiner Verlag.
Carnap, R. (1930). Die alte und die neue Logik. Erkenntnis, 1, 12–26.
Cavell, S. (1958). Must we mean what we say? Inquiry, 1, 172–212.
Chadwick, J. (1990). The decipherment of Linear B (2nd ed.). Cambridge: Cambridge University Press.
Chalmers, D. (1996). The conscious mind: In search of a fundamental theory. Oxford: Oxford University Press.
Duhem, P. (1906). La théorie physique: son objet, sa structure. Paris: Rivière (cited from P. Wiener's translation, The Aim and Structure of Physical Theory, Princeton: Princeton University Press, 1991).
Gettier, E. (1963). Is justified true belief knowledge? Analysis, 23, 121–123.
Goldman, A. I. (1976). Discrimination and perceptual knowledge. Journal of Philosophy, 73(20), 771–791.
Grundmann, T. (2003). Der Wahrheit auf der Spur. Eine Verteidigung des erkenntnistheoretischen Externalismus. Paderborn: Mentis.
Hazlett, A. (2009). The myth of factive verbs. Philosophy and Phenomenological Research (to appear).
Johnston, M. (1992). How to speak of the colors. Philosophical Studies, 68, 221–263.
Karlsson, F. (2008). Early generative linguistics and empirical methodology. In A. Lüdeling & M. Kytö (Eds.), Corpus linguistics. An international handbook (Vol. 1, pp. 14–38). Berlin: De Gruyter.
Kauppinen, A. (2007). The rise and fall of experimental philosophy. Philosophical Explorations, 10(2), 95–118.
Kuhn, T. S. (1957). The Copernican revolution. Cambridge, MA: Harvard University Press.
Kuipers, T. A. F. (2007). Explication in philosophy of science. In T. A. F. Kuipers (Ed.), Handbook of the philosophy of science. General philosophy of science—focal issues (pp. vii–xxiii). Amsterdam: Elsevier.
Lewis, D. K. (1983). Philosophical papers, Vol. 1. Oxford: Oxford University Press.
Löwe, B., Müller, T., & Müller-Hill, E. (2009). Mathematical knowledge: A case study in empirical philosophy of mathematics. In B. Van Kerkhove, J. De Vuyst, & J. P. Van Bendegem (Eds.), Philosophical perspectives on mathematical practice. London: College Publications.
Mates, B. (1958). On the verification of statements about ordinary language. Inquiry, 1, 161–171.
Mele, A. R. (2003). Motivation and agency. Oxford: Oxford University Press.
Müller-Hill, E. (2009). Formalizability and knowledge ascriptions in mathematical practice. Philosophia Scientiae, 13(2) (to appear).
Nagel, E. (1961). The structure of science. New York: Harcourt, Brace and World.
Nemeth, E. (2007). Logical empiricism and the history and sociology of science. In A. Richardson & T. E. Uebel (Eds.), Cambridge companion to logical empiricism (pp. 278–302). Cambridge: Cambridge University Press.
Putnam, H. (1975). The meaning of 'meaning'. In Mind, language, and reality. Philosophical papers (Vol. 2, Chapter 12, pp. 215–271). Cambridge: Cambridge University Press.
Quarteroni, A. (2009). Mathematical models in science and engineering. Notices of the American Mathematical Society, 56(1), 10–19.
Russell, B. (1948). Human knowledge: Its scope and limits. London: Allen and Unwin.
Ryle, G. (1949). The concept of mind. New York: Barnes and Noble.
Schleiermacher, F. (1838). Hermeneutik und Kritik mit besonderer Beziehung auf das Neue Testament, Vol. I.7 of Friedrich Schleiermacher's saemmtliche Werke. Berlin: G. Reimer. Aus Schleiermachers handschriftlichem Nachlasse und nachgeschriebenen Vorlesungen herausgegeben von Dr. Friedrich Lücke.
Scholz, O. R. (2001). Verstehen und Rationalität, Vol. 76 of Philosophische Abhandlungen. Frankfurt a.M.: Vittorio Klostermann.
Schütze, C. T. (1996). The empirical base of linguistics. Grammaticality judgments and linguistic methodology. Chicago: University of Chicago Press.
Sosa, E. (2007). Experimental philosophy and philosophical intuition. Philosophical Studies, 132, 99–107.
Swain, S., Alexander, J., & Weinberg, J. M. (2008). The instability of philosophical intuitions: Running hot and cold on Truetemp. Philosophy and Phenomenological Research, 76(1), 138–155.
Symons, J. (2008). Intuition and philosophical methodology. Axiomathes, 18, 67–89.
Uebel, T. (2001). Carnap and Neurath in exile: Can their disputes be resolved? International Studies in the Philosophy of Science, 15, 211–220.
Weinberg, J. M., Nichols, S., & Stich, S. (2001). Normativity and epistemic intuitions. Philosophical Topics, 29(1&2), 429–460.
Williamson, T. (2004). Philosophical 'intuitions' and scepticism about judgment. Dialectica, 58(1), 109–153.
Wittgenstein, L. (1953). Philosophische Untersuchungen / Philosophical Investigations. Oxford: Blackwell (edited posthumously and published together with an English translation by E. Anscombe).
Synthese (2011) 182:149–163 DOI 10.1007/s11229-009-9617-6
What are the phenomena of physics? Brigitte Falkenburg
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 14 July 2009 © Springer Science+Business Media B.V. 2009
Abstract Depending on different positions in the debate on scientific realism, there are various accounts of the phenomena of physics. For scientific realists like Bogen and Woodward, phenomena are matters of fact in nature, i.e., the effects explained and predicted by physical theories. For empiricists like van Fraassen, the phenomena of physics are the appearances observed or perceived by sensory experience. Constructivists, however, regard the phenomena of physics as artificial structures generated by experimental and mathematical methods. My paper investigates the historical background of these different meanings of “phenomenon” in the traditions of physics and philosophy. In particular, I discuss Newton’s account of the phenomena and Bohr’s view of quantum phenomena, their relation to the philosophical discussion, and to data and evidence in current particle physics and quantum optics. Keywords Analytic-synthetic method · Bohr · Newton · Phenomena · Physics · Scientific realism 1 Introduction Let me begin with a commemorative remark about Daniela Bailer-Jones. I had invited her to give a talk on models at the first fall meeting of the Philosophy Working Group of the German Physical Society, at Heidelberg, in October 2004. She had accepted the invitation long before. But it turned out that she had to go to the hospital once more for an operation and would come out just a few days before the meeting. So she told me that she would not be able to come. I asked her to send me a piece of her work on
B. Falkenburg (B) Fakultaet 14, Institut fuer Philosophie und Politikwissenschaft, Technische Universitaet Dortmund, 44227 Dortmund, Germany e-mail:
[email protected]
models and let me present it at the meeting. The day of the meeting she might come for the discussion, or not, depending on her actual health state. She was very fond of this idea and sent me the chapter on Models, Theories and Phenomena1 from her book. Finally, she did not only show up, pale and tough, but gave us an impressive talk. The chapter on Models, Theories and Phenomena remained all of her work on phenomena. One of its virtues is that it relates separate discussions of recent philosophy of science which belong together. It conceives of the relation between models and phenomena in the following way. Like the Cartwright school, Bailer-Jones considered models to be mediators between theories and the phenomena. Like Bogen and Woodward, she considered the phenomena of science to be in the world. Like Hacking, she emphasized that scientific phenomena are stable, reproducible, and regular.2 In particular, she made the following claims about models and phenomena: Models are about phenomena in the world. Models are abstract, whereas phenomena are concrete and empirical. Nevertheless, the phenomena of a science are typical. They are prototypes of a certain kind.3 I take models to be about phenomena in the world […]. A phenomenon has empirical properties, whereas a model that is a structure does not. The phenomena that are explored by modelling are concrete in the sense that they are, or have to do with, real things, real things with many properties – things such as stars, genes, electrons, chemical substances, and so on. One tries to model, however, not any odd specimen of a phenomenon, but a typical one. […] The prototype has all the properties of the real phenomenon; it is merely that the properties are selected such that they do not deviate from a ‘typical’ case of the phenomenon. […] the prototype of a phenomenon still counts as concrete, because it […] could exist in just this manner. Hence, according to Bailer-Jones phenomena are concrete, empirical matters of fact in the world which exhibit certain typical features. This view of scientific phenomena is not uncontroversial. The philosophers of science do not agree about the question of what are the phenomena of science. Their answers depend on their respective positions in the debate on scientific realism. For scientific realists, phenomena are the matters of fact in nature which are explained and predicted by physical theories. According to this view defended by Bogen and Woodward, the phenomena are what the physicists call effects: the Einstein–de Haas effect, the Bohm–Aharanov effect, the quantum Hall effect, etc. But for empiricists like van Fraassen, the phenomena of physics are the appearances, that is, what can be observed or perceived by mere sensory experience. From an empiricist point of view, there are no unobservable phenomena of physics. For constructivists, in turn, the phenomena of physics are not in the world. From a 1 Bailer-Jones (2009, Chap. 6). 2 Cartwright (1983, 1999), Morgan and Morrison (1999), Bogen and Woodward (1988), Hacking (1983). 3 All quotations from Bailer-Jones (2009, Chap. 6).
constructivist point of view, the phenomena of physics are artificial; they are nothing but the structures generated by experimental and mathematical methods. As I will show in the following, Bailer-Jones' account of the phenomena agrees with the tradition of physics and philosophy. In particular, it comes close to Newton's views about the phenomena of physics, to Kant's concept of a phenomenon and to Bohr's views about quantum phenomena. Only in recent physics has their way of talking about phenomena been replaced by more specific ways of talking about data or evidence.
2 The debate on scientific realism For physics, the tradition of saving the phenomena was most important. In ancient astronomy, the attempt to save the phenomena was related to the description of the apparent motions of the celestial bodies. The phenomena were nothing but the observed positions of the moon and the planets at the night sky. The Copernican revolution, however, raised the question for the true motions of the celestial bodies and their causes. Galileo used the telescope in order to investigate the celestial motions and the experimental method in order to investigate the laws of mechanical motions. Kepler stated the laws of the elliptic motions of the planets around the sun. Newton unified Kepler’s and Galileo’s laws in his theory of universal gravitation. With Galileo’s experimental method, he investigated the optical phenomena of light propagation, diffraction, and dispersion, in particular, the decomposition of white light into colors by means of a prism. With the beginnings of modern physics, the debate about scientific realism arose. Ancient astronomy aimed at the description of the apparent motions of the celestial bodies. Ptolemy’s system of cycles, epicycles, and decentres made a most accurate description possible. It was open for corrections and made it possible to give very good approximations to the complicated observed planetary motions. With regard to the phenomena, Copernicus’ heliocentric system with circular planetary motions was much worse. But it claimed to explain the apparent planetary motions in terms of the true motions of the celestial bodies. The Aristotelian opponents of the Copernican system interpreted it from an instrumentalist point of view. They considered it to be a mere mathematical tool, as one hypothesis amongst others. The debate about the Copernican system was a debate about truth, about the claims of scientific realism. It was closely related to the question of what are the phenomena of physics. Galileo took his observations of the phases of Venus and the moons of Jupiter as phenomena that proved the truth of the Copernican system. His Aristotelian opponents refused to look through the telescope. They did not accept these observations as genuine phenomena. Hence, the opposition between empiricism and scientific realism is as old as modern physics. It deals with the question of whether the laws of mathematical physics are true and whether unobservable entities such as forces, fields, atoms, and subatomic particles exist. It also deals with the question of what are the phenomena of physics and which tools are employed to observe them. The successes of modern physics did not eliminate instrumentalism and empiricism. In view of subatomic physics, relativity, and quantum theory, the debate on scientific realism went on. The defenders of scientific realism follow the founders of modern
physics. They admit that scientific phenomena are given by means of technological devices such as the telescope, the microscope, the electron microscope, or the particle detectors and computer devices of particle physics and quantum optics. They claim that these phenomena are physical effects (such as the photoelectric effect, the Zeeman effect, the Aharonov–Bohm effect, the quantum Hall effect, etc.). The term "effect" indicates that these phenomena have causes, in particular, forces, fields, potentials, or interactions of subatomic particles. According to scientific realism, the phenomena of physics are stable facts of nature predicted and explained by physical theories. This is Bogen and Woodward's view.4 By contrast, modern defenders of empiricism and instrumentalism stand in the tradition of Aristotle and of Galileo's opponents. Empiricists claim that only the appearances are given by sensory experience. For Mach, Carnap, Suppes, or van Fraassen, the phenomena of physics consist in sense data, observable measurement outcomes, models of data, or empirical structures. More radically, constructivists consider the phenomena of physics to be artefacts, as opposed to facts of nature, and the physical effects mentioned above to be technological and mathematical constructs. There were several constructivist schools, from the Neokantian Marburg school to current social constructivism.5 According to the latter, scientific phenomena stem from the fabric of the laboratory and scientific theories are socio-cultural products. In this way, the phenomena of physics and the unobservable agents behind physical effects remain debated to this day. 3 Newton: analysis and synthesis of the phenomena In physics, there is also no unambiguous concept of the phenomena. What the phenomena are depends on the theoretical and experimental context. This ambiguity dates back to Newton's use of the term. In the Opticks, Newton identified the phenomena with the observable effects of his experiments. He investigated the phenomena of light propagation and refraction, the colours obtained from white light with a prism, the re-composition of white light from the spectra, etc.6 In contradistinction to these directly observable phenomena, the phenomena of the Principia are the planetary motions described by Kepler's laws. Hence, they are phenomenological laws, or mathematical structures, which empiricists may accept as empirical structures or data models.7 Both kinds of Newton's phenomena have common features. They are regular, predictable, law-like, and hence typical. They are closely related to Newton's methodology of analysis and synthesis. This methodology was familiar to scientists from Galileo to Bohr and to philosophers from Descartes to Mach.8 Since it is no longer 4 Bogen and Woodward (1988). 5 See Cohen (1883, 1896), Natorp (1910), Pickering (1984), Latour and Woolgar (1979), Knorr-Cetina
(1999). 6 Newton (1730). 7 Newton (1729, 401 ff., or 1999, 797 ff.). 8 The method dates back to Euclid's Elements. It was opposed to Aristotle's methods of deduction and induction; see Engfer (1982). The founders of modern physics applied it to natural phenomena, as in Galileo's resolutive-compositive method (where "resolutive-compositive" are just the Latin terms for
known in current philosophy of science, it should be briefly sketched here. In his preface to the second edition of Newton's Principia, Cotes explains it as the method of "those whose natural philosophy is based on experiment. […] they proceed by a twofold method, analytic and synthetic. From certain selected phenomena they deduce by analysis the forces of nature and the simpler laws of those forces, from which they then give the constitution of the rest of the phenomena by synthesis." The method is identical with the famous "induction" of "propositions gathered from phenomena" pinned down in Newton's methodological rules at the beginning of Book Three of the Principia, in Rule 4.9 It should not be confused with induction in a modern, empiricist sense. According to Rule 1 and Rule 2, it employs causal analysis. Rule 1 is a principle of causal parsimony, of admitting "no more causes" than are "sufficient to explain the phenomena". Rule 2 requires assigning the same causes to "natural effects of the same kind".10 Hence, the aim of causal analysis is to explain phenomena, where the latter are natural effects of a certain kind. In the Opticks, Newton describes the same method of analysis and synthesis as follows: "As in mathematics, so in natural philosophy, the investigation of difficult things by the method of analysis ought ever to precede the method of composition. This analysis consists in making experiments and observations, and in drawing general conclusions from them by induction […] By this way of analysis we may proceed from compounds to ingredients, and from motions to the forces producing them; and in general, from effects to their causes, and from particular causes to more general ones […] And the synthesis consists in assuming the causes discovered, and established principles, and by them explaining the phenomena proceeding from them, and proving the explanations."11 Here, too, the phenomena are subject to causal analysis. The quotation shows that for Newton the causal analysis of the phenomena has two aspects, namely the search for the ingredients or parts of a given compound and the search for the forces behind the motions. In the Principia, Newton explains the motions of the celestial bodies and other mechanical phenomena in terms of gravitation as a universal force. In the Footnote 8 continued "analytic-synthetic"); see, e.g. Losee (1993). Descartes gives a general version of it in the second and third rules of his Discours de la méthode. He and his rationalist followers (Leibniz, Spinoza, Wolff) carried the method over to philosophy. They attempted to model philosophical knowledge after mathematics and/or physics, in order to give undoubted foundations to metaphysics. Kant criticized the confusion to which the method led in seventeenth and eighteenth century metaphysics. In the nineteenth century, the method of analysis and synthesis disappeared from philosophy, giving place to the modern empiricist view of induction and deduction. Nevertheless, residues of it are still present in Mill's account of causal analysis [see Mill (1843), Book III: Of Induction, Chaps. V–VIII]. Mach (1905, 256 ff.) indeed only discusses the origins of the method in ancient mathematics. 9 See the rules at the beginning of Book Three of the Principia, Newton (1729, pp. 398–400; 1999, pp. 794–796). Quotation: Newton (1999, p. 796). Cf. also Falkenburg (2000, Chap. 1), Falkenburg and Ihmig (2004); Ihmig (2004a,b). 10 Ibid., pp. 794–795. 11 Newton (1730, pp. 404–405).
Fig. 1 Spectral decomposition and re-composition of light. Opticks, Book One, Part II, Exp. 13 (Newton 1730, p. 147)
Opticks, Newton explains the constitution of light and other optical phenomena in terms of colours and hypothetical light atoms. In particular, he shows how white light is decomposed into the colours by a prism and recomposed by superposing the spectra from two parallel prisms (see Fig. 1). In both fields of physics, the phenomena are the starting point of causal analysis. The only difference between the Principia and the Opticks is that Newton is able to give a full-fledged mathematical theory of gravitation explaining the phenomena of mechanics, whereas he is not able to derive any force of the atomic constituents of light or matter, to say nothing of their mathematical description. Regarding the atoms, Newton's analysis of the optical phenomena does not establish theoretical principles that explain those phenomena. Here, the second step of the analytic-synthetic method, the "synthesis", is missing. Therefore, Newton presented his atomistic hypothesis only in Query 31 at the end of the Opticks. This hypothesis is based on a further methodological principle of physics, namely the assumption of the self-conformability or unity of nature: "And thus nature will be very conformable to her self and very simple […]."12 This principle is closely related to the belief that the phenomena are connected by laws. In the Principia, the law of gravitation unifies the terrestrial and celestial phenomena of mechanics, namely Galileo's and Kepler's motions. In Cajori's appendix to Motte's English translation of the Principia, this is demonstrated by the following thought experiment. Imagine a stone cast from a very high mountain. The greater the velocity imparted to the stone, the more closely its parabolic trajectory approximates the elliptic orbit of a satellite (see Fig. 2). Later physicists followed Newton in identifying the phenomena with natural effects of a certain kind, which are given at any stage of research. This account of the phenomena is closely related to the analytic-synthetic method of Newton and 12 Ibid., p. 397. Indeed, his hypothesis that matter consists of atoms is also supported by Rule 3 of the Principia, the rule of generalizing the extensive quantities and the impenetrability of mechanical bodies to all parts of matter, including the atoms.
Fig. 2 Connection between Galileo’s and Kepler’s motions. Principia, Appendix (Newton 1729, p. 551)
his followers. The phenomena are given in the sense that they are the starting point of causal analysis. Conversely, they are the explananda of physical theories in the sense that they are the end point of the deduction from the principles found by causal analysis, i.e., of the opposite, synthetic step of the method. In this way, the phenomena of physics came to comprise all kinds of observable appearances, phenomenological laws, experimental results, measurement outcomes, and unexpected as well as predicted physical effects. The phenomena of physics may be theory-laden in many regards, insofar as they are given in this sense, waiting for causal analysis and theoretical explanation.13 From a Newtonian point of view, the empiricist claim that the phenomena of science are sense data is based on a deep misunderstanding of scientific methodology. The epistemic goals and the methods of physics are committed to scientific realism. They aim at finding phenomena that may be far from obvious, at their causal analysis, and at their mathematical explanation. 4 Kant: phenomena and the laws of nature Kant's philosophical account of the phenomena was based on Newton's views, but it also took up the traditional philosophical distinction between phenomena and noumena. According to Descartes and his rationalist followers, our sensory perceptions may be deceptive and only intellectual knowledge is reliable. The rationalist credo was that the phenomena perceived by the senses do not give us access to reality proper. Their distrust of the phenomena traced back to Plato. Leibniz' distinction is most typical. For Leibniz, the phenomena consist in the empirical matters of fact and their spatiotemporal order. The phenomena that we perceive are illusory though well-founded by the monads. His monads are conceived as unobservable, immaterial noumena or things-in-themselves which are the only real substances in the world. The empiricists, on the other hand, stuck to Aristotle's tradition. They defended the doctrine that the phenomena are all that we know. In contradistinction to Descartes or Leibniz, 13 In my Particle Metaphysics (Falkenburg 2007, Chaps. 2–3), I show how the phenomena, or empirical basis, of particle physics became more and more theory-laden and complex in the course of scientific progress, from the discovery of the electron to the era of the big particle accelerators and highly sophisticated particle detectors.
Locke criticized the notion of a substance or thing-in-itself as an unclear and confused concept. Later, Hume claimed that such metaphysical concepts are just based on psychological habit. Kant wanted to reconcile the tenable aspects of seventeenth/eighteenth century empiricism and rationalism. For him, both positions were in need of revision. Hume's empiricism denied the necessity of the principle of causality and the laws of nature, including in particular Newton's law of gravitation. But the opposing claims of rationalism about metaphysical substances as the only existing reality led to never-ending metaphysical struggles. In view of Newton's physics, both positions generated problems. In order to resolve them, Kant developed his critical philosophy.14 From rationalism, he kept the ambitious project of modeling philosophy after the exact sciences. From empiricism, he kept the principle that we can have objective knowledge only of the phenomena. According to the Critique of Pure Reason, the phenomena are concrete, empirical matters of fact in the world, whereas the noumena are things-in-themselves of which no experience is possible, and hence no objective knowledge. Kant's phenomena are neither identical with the empiricists' sense data nor with the rationalists' deceptive perceptions. They are structured sense data. Two kinds of structures are imposed on them a priori: the pure forms of intuition (space and time) and the pure concepts of the understanding (the categories, e.g. substance and causality). On the one hand, the phenomena are given in pure intuition. As such, they are singular. That is, they are given as concrete empirical things and events that are spatio-temporally individuated. On the other hand, the phenomena are subject to general laws of nature such as the principle of causality. As such, they behave in a law-like way. In particular, all empirical things are mechanical bodies that obey Newton's law of gravitation. The relation between the phenomena in general and the phenomena of Newton's physics was investigated by Kant, too. According to the Metaphysical Foundations of Natural Science, the phenomena of physics (and of any science proper) have a mathematical structure. Due to this mathematical structure, Kant's scientific phenomena are typical in the same sense as Newton's phenomena, as explained in the last section. In nineteenth-century philosophy, Kant's views about the empirical phenomena and their Newtonian structure did not become influential. Post-Kantian German idealism followed Lambert's rather than Kant's views about the phenomena. Lambert was both a physicist and a philosopher. As a physicist, he investigated phenomena of light and developed methods of photometry. As a philosopher, he referred to Plato's view that the phenomena only seem to be real. His New Organon contains a part titled Phenomenology, or Doctrine of Appearance.15 According to this doctrine, the appearances are what merely seems to be. For light phenomena, both views of the phenomena perfectly agree. This is true not only for the colours, the paradigmatic non-primary qualities. It is also true for the phenomena of luminosity that Lambert investigated. Lambert's Phenomenology became influential in the phenomenological tradition of philosophy from Hegel to Husserl and Heidegger.
14 See Falkenburg (2000). 15 See Lambert (1764).
Nevertheless, Kant’s view of the phenomena became most influential in nineteenth and twentieth century physics. In particular, they underlie Bohr’s views about quantum phenomena.16 5 Bohr: complementary quantum phenomena In order to develop the atomic model of 1913, Bohr had substantially weakened the explanatory claims of Newtonian physics. Bohr’s quantization postulates did not only violate classical radiation theory but also Newton’s principle of the unity of nature. In the transition from the classical to the quantum domain, nature is neither “conformable to her self” nor simple.17 At the end of his Nobel lecture of 1922, Bohr gave his own account of scientific explanation: By a theoretical explanation of natural phenomena we understand in general a classification of the observations of a certain domain with the help of analogies pertaining to other domains of observation, where one has presumably to do with simpler phenomena. The most one can demand of a theory is that this classification can be pushed so far that it can be contribute to the development of the field of observation by the prediction of new phenomena.18 For Bohr, too, the phenomena of physics are explananda of theories. But for him, an explanation is nothing but the classification of observations in terms of analogies. In this point, he follows Mach’s empiricist epistemology rather than Newton. Theories do not aim at the description of the “true causes” of the phenomena19 but only at the prediction of phenomena. This is an instrumentalist view. Physical reality is in the phenomena, but not in a literal understanding of the analogies that help to classify them in order to find out their structure. Nevertheless, Bohr was familiar with the traditional method of analysis and synthesis. When this method disappeared from the philosophy of science in the course of the triumph of empiricism,20 the expression “analysis and synthesis” was still omnipresent in the philosophical contexts that influenced Bohr. His philosophical teacher Høffding used “analysis and synthesis” as a standing expression in his books, and so did Bohr in his later philosophical writings without ever defining them.21 We may also assume that Bohr was familiar Newton’s Principia and Opticks. (Was there any physicist of 16 In nineteenth century science, Kant mainly was present due to Neokantian influences. Helmholtz investigated the foundations of Kant’s epistemology in physical geometry as well as in physiology. Bohr’s views about Kant were influenced by his philosophical teacher Høffding; see Faye (1991) and Pringe (2007). 17 See Newton (1730, p. 397). 18 Bohr (1922), last page (BCW, Vol. 4, p. 482). Bohr’s account of theoretical explanation traces back to
Høffding's account of analogies and explanation, where the latter was clearly influenced by Mach's views. 19 See Newton's Rule 1, Newton (1729, p. 398; 1999, p. 794). 20 See note 8. 21 See in particular Bohr (1958) and the article "Analyse et synthèse" in Chevalley (1991, pp. 373–378). Chevalley emphasizes that the term "analysis and synthesis" also had a canonical sense in the Neo-Humboldtian philosophy of language in Göttingen, the German stronghold of quantum mechanics, in the late twenties; ibid., p. 377.
Bohr's generation who did not look into Newton's Principia and Opticks?) Due to his philosophical education, he could hardly have failed to understand correctly the method of analysis and synthesis in the Queries of the Opticks and the four methodological rules in the Principia. In Bohr's philosophical writings, the Newtonian sense of the expression "analysis and synthesis" indeed repeatedly occurs.22 After the birth of quantum mechanics, he emphasized that Planck's quantum of action indicates the limitations of experimental analysis.23 Later, he claimed more generally that biology shows the limitations of mechanical analysis.24 Bohr's report of his discussions with Einstein, first published in 1949, mentions the problem of "obtaining the proper harmony between analysis and synthesis of physical phenomena" and the impossibility of "attempting a causal analysis of radiative phenomena".25 On the basis of such text passages, I take it for granted that in the context of quantum mechanics Bohr used the concepts of analysis and synthesis roughly in Newton's sense. For Bohr, the limitations of experimental and causal analysis in the quantum domain indicate that the concept of a subatomic physical object has to be rejected. There are no quantum objects; there are only quantum phenomena. The quantum mechanical wave function ψ does not have concrete physical meaning; it is only abstract and symbolic. The only objects of subatomic physics are complementary quantum phenomena. In accordance with Bohr's account of theoretical explanation, they are analogues of the corresponding phenomena of classical physics, namely particle tracks or wave interference. He introduced these ideas in his famous Como lecture of 1927. There, he emphasized that the "essence" of quantum theory may be expressed in the so-called quantum postulate, which "attributes to any atomic process an essential discontinuity, or rather individuality, completely foreign to the classical theories and symbolised by Planck's quantum of action".26 Here, "individuality" has to be understood in the sense of being indivisible.27 Planck's quantum of action indicates limitations of experimental analysis, giving rise to the discontinuity "or rather individuality" of the quantum jumps. In addition, Planck's quantum of action plays a crucial role in Heisenberg's uncertainty relations. The latter, in turn, indicate the limitations of causal analysis in the quantum domain: "The very nature of the quantum theory thus forces us to regard the space-time coordination and the claim of causality, the union of which characterises the classical theories, as complementary but exclusive features of the description […]."28 22 See the examples given in Chevalley (1991, pp. 374–375). 23 See below. 24 See Chevalley (1991, p. 374), pointing to Bohr (1961, p. 159); Light and Life (Bohr 1933). 25 Bohr (1949, pp. 10–11). 26 Bohr (1928, p. 88). 27 This was already noted by Meyer-Abich (1965). 28 Bohr (1928, p. 90).
Fig. 3 Particle tracks observed in nuclear emulsions (Powell et al. 1959, p. 31)
Fig. 4 Compton scattering (below) and event reconstruction (above) (Compton 1927)
According to the Como lecture, the classical theories of particle and wave are complementary descriptions that only together give a full account of the quantum domain. Bohr emphasizes that we are not dealing with contradictory but with complementary pictures of the phenomena, which only together offer a natural generalisation of the classical mode of description.29 The most typical particle-like phenomena of quantum physics are the particle tracks observed in nuclear emulsions or bubble chambers (Fig. 3), and the Compton effect, in which a single photon gives an observable kick, or momentum transfer, to an electron, losing energy and hence frequency in the process (Fig. 4). The most typical wave-like phenomena of quantum physics are the interference fringes observed when (a) electrons or (b) gamma rays are sent through a crystal (Fig. 5). 29 Ibid., p. 91.—For Bohr's complementarity view of quantum mechanics, see also Falkenburg (2007,
pp. 272–277).
Fig. 5 Electron diffraction (a) and gamma ray diffraction (b) (Raether 1957, p. 443)
These quantum phenomena are complementary in the following sense. Electrons may cause either particle tracks or interference fringes, depending on the experimental conditions. Similarly, photons may cause either a momentum kick or interference fringes, again depending on the experimental conditions. But there are no experimental conditions under which the tracks and the fringes (or the momentum kick and the fringes, respectively) can be observed at the same time. Why are there no quantum objects, for Bohr? According to the Como lecture, classical physical objects are defined in terms of spatio-temporal and causal properties, where these properties can be observed simultaneously. But for quantum phenomena, the spatio-temporal and causal properties cannot be observed at once. The spatio-temporal properties of a physical object are its space–time coordinates. The causal properties, for Bohr, are expressed in terms of momentum–energy conservation. For the alleged quantum 'objects', the conditions of observation are at odds with the possibilities of defining a full-fledged physical object in terms of space–time coordinates and momentum–energy conservation. Bohr's example is the Compton effect, in which the photon shows its particle-like nature by giving a momentum kick to an electron, whereas the interference fringes of light sent through a double slit are evidence of a wave-like space–time coordination of light propagation. The particle-like momentum kick is described in terms of classical mechanics, the wave-like light propagation in terms of classical electrodynamics. Hence, there are no quantum objects for Bohr, but only complementary quantum phenomena. Only the latter are concrete phenomena. Only the complementary quantum phenomena have either a spatio-temporal or a causal representation. For Bohr, they are intuitive in Kant's sense, in contradistinction to the abstract and symbolic formalism of quantum mechanics. They are observations obtained under specified circumstances, including the experimental arrangement. As such, they are individual (i.e., indivisible), complementary (i.e., mutually exclusive), and in correspondence with the classical models of wave or particle. This account of the quantum phenomena turns out to be closely related to Newton's methodology of analysis and synthesis, given that the "individuality" or indivisibility of quantum phenomena means that for Bohr,
Planck's quantum of action indicates the limitations of experimental and causal analysis. 6 Conclusions According to the approaches discussed in the preceding sections, the phenomena of physics have the following features. They are (i) spatio-temporally individuated objects and events in the world, i.e., concrete; (ii) given by observation or measurement, i.e., empirical; and (iii) explained in terms of laws and causal stories, i.e., typical, regular, or law-like. The phenomena of physics in the broadest sense cover a wide range of different levels of observation, measurement, and experimentation. Only about this last point do scientific realism and its opponents, empiricism and constructivism, disagree. Nevertheless, the latter positions rely on basically correct insights. The phenomena of physics are empirical structures constructed by experimental and mathematical methods. But strict empiricism and radical constructivism exaggerate as regards the claim that the phenomena should be directly observable. They miss the complicated architectonics of modern physics, according to which the causal explanations of one stage of research may bring out the phenomena of the next stage of research, which again are subject to causal analysis, the results of which will bring out the next deeper level of phenomena, or physical effects, or explananda of physical theories. At all levels of causal analysis, however, the phenomena of physics have the following common features. They are concrete matters of fact in nature. They are empirical insofar as they are given by means of observation, measurement, or experiment. And they are typical insofar as they are identified by classification and in terms of mathematical structures. Indeed, the approaches of Bailer-Jones, Bogen and Woodward, Newton, Kant, and Bohr, notwithstanding their differences, basically agree about these features of the phenomena. Finally, however, the following point has to be noted. The concept of a phenomenon is a pre-theoretical, informal notion of physics. In several areas of current physics, it has been eliminated in favor of more precise concepts. In current particle physics, it is replaced by talk about data, events, and evidence. Only in current quantum optics has Bohr's use of the notion of a quantum phenomenon survived. After all, the concept of a phenomenon has a philosophical origin, and it remains a philosophical term.
References
Bailer-Jones, D. M. (2009). Scientific models in philosophy of science. Pittsburgh: University of Pittsburgh Press (to appear).
Bogen, J., & Woodward, J. (1988). Saving the phenomena. The Philosophical Review, 97, 303–352.
Bohr, N. (1922). Om atomernes bygning. Nobel lecture, December 11, 1922. German translation by Pauli, W. (1923). Über den Bau der Atome. Naturwissenschaften, 11, 606–624. English translation by Hoyt, F. C. (1923). The structure of the atom. Nature, 112, 29–44. Danish and English text in BCW (Bohr's Collected Works), Vol. 4, pp. 425–482.
Bohr, N. (1927). The quantum postulate and the recent development of atomic theory. Como lecture. Modified version: Bohr, 1928. Both versions in BCW, Vol. 6, pp. 109–158.
Bohr, N. (1928). The quantum postulate and the recent development of atomic theory. Nature, 121, 580–590 (modified version). Reprinted in Wheeler, J. A., & Zurek, W. H. (Eds.). (1983). Quantum theory and measurement (pp. 87–126). Princeton: Princeton University Press. (Quoted after Wheeler and Zurek 1983).
Bohr, N. (1933). Light and life. Nature, 131, 421–423, 457–459. Reprinted in Bohr, N. (1958). Atomic physics and human knowledge. New York: Wiley. Also in BCW, Vol. 10.
Bohr, N. (1949). Discussion with Einstein on epistemological problems of atomic physics. In P. A. Schilpp (Ed.), Albert Einstein: Philosopher—scientist (pp. 115–150). Evanston, Illinois: Library of Living Philosophers. Reprinted in Wheeler, J. A., & Zurek, W. H. (Eds.). (1983). Quantum theory and measurement (pp. 9–49). Princeton: Princeton University Press. (Quoted after Wheeler and Zurek 1983).
Bohr, N. (1958). Atomic physics and human knowledge. New York: Wiley.
Bohr, N. (1961). Physique atomique et connaissance humaine (C. Chevalley, Ed.). Paris: Gallimard.
Cartwright, N. (1983). How the laws of physics lie. Oxford: Clarendon Press.
Cartwright, N. (1999). The dappled world: A study of the boundaries of science. Cambridge: Cambridge University Press.
Chevalley, C. (1991). Glossaire. In Bohr 1961, pp. 345–567.
Cohen, H. (1883). Das Prinzip der Infinitesimalmethode und seine Geschichte. Ein Kapitel zur Grundlegung der Erkenntniskritik. Berlin: Duemmler.
Cohen, H. (1896). Einleitung mit kritischem Nachtrag zur 9. Auflage der Geschichte des Materialismus von Friedrich Albert Lange. Reprinted in Werke, Vol. 5.2. Hildesheim: Olms (1977 ff.).
Compton, A. H. (1927). Compton scattering. Picture taken from: Grimsehls Lehrbuch der Physik. Neu bearbeitet von R. Tomaschek. Zweiter Band, Zweiter Teil: Materie und Äther (8th ed., pp. 138, 211). Leipzig und Berlin: Teubner.
Engfer, H. J. (1982). Philosophie als Analysis. Studien zur Entwicklung philosophischer Analysis-Konzeptionen unter dem Einfluß mathematischer Methodenideale im 17. und frühen 18. Jahrhundert. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
Falkenburg, B. (2000). Kants Kosmologie. Die wissenschaftliche Revolution der Kosmologie im 18. Jahrhundert. Frankfurt/M.: Klostermann.
Falkenburg, B. (2007). Particle metaphysics: A critical account of subatomic reality. Berlin: Springer.
Falkenburg, B., & Ihmig, K.-N. (2004). Report DFG Fa 261/5-1, Hypotheses non fingo: Newtons Methodenlehre. Dortmund (unpublished).
Faye, J. (1991). Niels Bohr: His heritage and legacy. Dordrecht: Kluwer.
Hacking, I. (1983). Representing and intervening. Cambridge: Cambridge University Press.
Ihmig, K.-N. (2004a). Die Bedeutung der Methoden der Analyse und Synthese für Newtons Programm der Mathematisierung der Natur. In U. Meixner & A. Newen (Eds.), Logical analysis and history of philosophy 7, focus: History of the philosophy of nature (pp. 91–119). Paderborn: Mentis.
Ihmig, K.-N. (2004b). Newton's program of mathematizing nature. In M. Hoffmann, J. Lenhard, & F. Seeger (Eds.), Activity and sign—grounding mathematics education. Festschrift für Michael Otte. Dordrecht: Kluwer.
Knorr-Cetina, K. D. (1999). Epistemic cultures: How the sciences make knowledge. Cambridge, Mass.: Harvard University Press.
Lambert, J. H. (1764). Neues Organon oder Gedanken über die Erforschung des Wahren und dessen Unterscheidung von Irrtum und Schein. Leipzig: Johann Wendler; Nachdruck: Berlin: Akademie-Verlag (1990).
Latour, B., & Woolgar, S. (1979). Laboratory life: The construction of scientific facts. Beverly Hills, London: Sage Publications; 2nd ed.: Princeton University Press.
Losee, J. (1993). A historical introduction to the philosophy of science (3rd ed.). Oxford: Oxford University Press.
Mach, E. (1905). Erkenntnis und Irrtum. Skizzen zur Psychologie der Forschung. Reprint of the 5th ed.: Leipzig (1926); Darmstadt: Wissenschaftliche Buchgesellschaft (1991).
Meyer-Abich, K. M. (1965). Korrespondenz, Individualität und Komplementarität. Wiesbaden: Steiner.
Mill, J. St. (1843). A system of logic. Abridged version based on the 8th ed. (New York 1881). Reprint: John Stuart Mill's philosophy of scientific method (E. Nagel, Ed.). New York: Hafner Publishing Co. (1950).
Morgan, M., & Morrison, M. (Eds.). (1999). Models as mediators: Perspectives on natural and social sciences. Cambridge: Cambridge University Press.
Natorp, P. (1910). Die logischen Grundlagen der exakten Wissenschaften (2nd ed.). Leipzig und Berlin: Teubner.
Newton, I. (1729). Principia. Sir Isaac Newton's mathematical principles of natural philosophy and his system of the world. Vol. I: The motion of bodies. Vol. II: The system of the world. Motte's translation revised by Cajori. Berkeley: University of California Press (1934, 1962).
Newton, I. (1730). Opticks or a treatise of the reflections, refractions, inflections & colours of light. Based on the fourth edition: London (1730); New York: Dover (1952, 1979).
Newton, I. (1999). The principia. Mathematical principles of natural philosophy (I. Bernard Cohen & Anne Whitman, Trans.). Berkeley: University of California Press.
Pickering, A. (1984). Constructing quarks. Edinburgh: Edinburgh University Press.
Powell, C. F., Fowler, P. H., & Perkins, D. H. (1959). The study of elementary particles by the photographic method: An account of the principal techniques and discoveries illustrated by an atlas of photomicrographs. London: Pergamon Press.
Pringe, H. (2007). Critique of the quantum power of judgment. Berlin: Walter de Gruyter.
Raether, H. (1957). Elektronenfrequenzen. Handbuch der Physik, Bd. 32, 443.
Synthese (2011) 182:165–179 DOI 10.1007/s11229-009-9618-5
Data and phenomena: a restatement and defense James F. Woodward
Received: 20 May 2009 / Accepted: 3 June 2009 / Published online: 7 July 2009 © Springer Science+Business Media B.V. 2009
Abstract This paper provides a restatement and defense of the data/phenomena distinction introduced by Jim Bogen and me several decades ago (e.g., Bogen and Woodward, The Philosophical Review, 303–352, 1988). Additional motivation for the distinction is introduced, ideas surrounding the distinction are clarified, and an attempt is made to respond to several criticisms. Keywords
Data · Phenomena · Inductive inference
1 Introduction Two decades ago, Jim Bogen and I published a paper (Bogen and Woodward 1988) in which we introduced a distinction between data and phenomena and claimed that this had important implications for how we should understand the structure of scientific theories and the role of observation in science. This initial paper was followed by a series of papers on related themes (Bogen and Woodward 1992, 2005; Woodward 1989, 2000). In the intervening years, our ideas have been embraced by some (e.g., Kaiser 1991) and extensively criticized by others (Glymour 2000; McAllister 1997; Schindler 2007). In this essay, I want to place some of our ideas within a more general context and provide some additional background and motivation for them. Along the way I will attempt to clarify and, where appropriate, correct some claims in our original paper. I will also attempt to respond to some of the criticisms that have been leveled at our account.
J. F. Woodward (B) Division of the Humanities and Social Sciences, California Institute of Technology, MC 101-40, Pasadena, CA 91125, USA e-mail:
[email protected]
Bogen and Woodward (1988) advocated a three-level picture of scientific theory or, more accurately, of those theories that were in the business of providing systematic explanations. Explanatory theories such as classical mechanics, general relativity, and the electroweak theory that unifies electromagnetic and weak nuclear forces were understood as providing explanations of what we called phenomena—features of the world that in principle could recur under different contexts or conditions. For example, the magnitude of the deflection of starlight by the sun and the gravitational red shift are phenomena that are explained by General Relativity (GR), and the existence of neutral currents is explained by the electroweak theory. The melting of lead at 327.5 °C is a phenomenon which is explained by the character of the electron bonds and the presence of so-called "delocalized electrons" in samples of this element. As explained in our original paper, in speaking of theories as "explaining" phenomena, we had in mind the capacity of theories to provide relatively detailed and systematic explanations, often taking the form of derivations from basic laws and principles. We distinguished such explanations from so-called singular causal explanations, in which it is merely specified that some causal factor plays a role in the production of an outcome, but without specifying in a detailed, quantitative way how the outcome depends on this factor or which other factors influence the outcome. Thus, GR does not just tell us that the gravitational field of the sun "causes" starlight deflection, but allows for a principled derivation of the magnitude of this deflection. Data are public records produced by measurement and experiment that serve as evidence for the existence or features of phenomena. The data from Eddington's expedition of 1919 that provided evidence for a value for the deflection of starlight by the sun that was predicted by GR took the form of photographic plates recording stellar positions, and some of the data serving as evidence for the existence of neutral currents took the form of bubble chamber photographs. Data serving as evidence for the value of the melting point of lead might take the form of a record of temperature readings taken from a thermometer of some particular design. Investigators attempt to design experiments and arrange measurement apparatus in such a way that the data produced reflect features of the phenomena they are trying to detect. However, typically (even in a well-designed experiment or measurement) data will reflect the influence of many other causal factors as well, including factors that have nothing to do with the phenomenon of interest and instead are more idiosyncratic to the measurement apparatus and experimental design employed and its local environment. Eddington's photographic plates changed in dimension because of changes in the temperature of the surrounding air—an influence for which he needed to correct in reporting his estimate of starlight deflection—and this influenced his data. Eddington's data also reflected the operation of the photographic and optical technology he employed, his decisions about the positioning of his telescopes and cameras, and so on.
In the case of the melting point of lead, repeated measurements, even with a thermometer of the same design and the same procedure for taking measurements, will result in a scatter of data (perhaps reflecting variations in the temperature of the surrounding air, idiosyncrasies of the visual perception of the observers taking the measurement, and so on). Schematically, if we think of the melting point as a fixed quantity M, then even if the measurement procedure is
working properly the data d_i observed on different occasions of measurement will be some function f of M and of additional causal factors u_i which will take different values on different occasions of measurement: d_i = f(M, u_i). The researcher will be able to observe the measurement results d_i but will usually not be able to measure or observe the values of all of the additional causal factors u_i. Her goal will nonetheless be to extract from the data d_i an estimate of the value of M. This may be accomplished if the researcher knows or has good reason to believe that various additional assumptions are satisfied: for example, as discussed briefly below, if the researcher knows that f is additive (e.g., d_i = M + u_i) and the distribution of the u_i is normal, then the mean value of the thermometer readings d_i will, in a sense that can be specified (see below), yield a reliable (unbiased, minimum variance) estimate of M. In our 1988 paper, Bogen and I claimed that while theories like GR aimed to provide systematic explanations of phenomena they "typically" or usually did not provide (and did not aim at providing) similar explanations of data. (I reiterate that "systematic explanation" meant something like the provision of a principled, non-ad hoc derivation from the basic laws and principles. For some reservations about these claims about typicality, see Sect. 3 below.) Fundamental theories like GR fail to provide such explanations of data in part because, as noted above, data commonly reflect the causal influence of many different factors besides those that fall within the purview of any single general theory. For example, GR is a theory of gravitational phenomena, not a theory that purports to explain or provide derivations concerning the behavior of cameras and optical telescopes or Eddington's decisions about experimental design. More importantly, Bogen and I claimed there is often no obvious scientific rationale or motivation for attempting to provide detailed systematic explanations of data. Instead, the role of data is to serve as evidence for the existence of phenomena, and to play this role researchers do not need to exhibit systematic explanations/derivations of data, either from fundamental theories or from such theories supplemented by hypotheses about the operation of measurement apparatus, etc. In denying that fundamental theories typically provided systematic explanations of data, we made it clear that we did not mean to deny that in well-functioning measurements phenomena will figure as one causal factor among many in the production of data. When a thermometer is used to measure the melting point of lead, the researcher certainly hopes that the data obtained will reflect the causal influence of the temperature of the sample as it melts, along with various other factors. In this sense, the melting point M will figure in a singular causal explanation of the data. However, neither the theories that explain why lead has the melting point that it does nor information about the value of M will provide what we have been calling detailed, systematic explanations (or non-trivial derivations) of the values of the observed data d_i. To provide such explanations one would need (among other things) to identify the values of the other factors u_i that contribute to the d_i, and this information is typically unknown.
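To make the additive case just described concrete, here is a minimal simulation sketch of the estimation step; the particular numbers, sample size, and Gaussian error model are illustrative assumptions of this sketch, not data or code from any measurement discussed in this paper.

import numpy as np

# Additive measurement model d_i = M + u_i: M is the fixed melting point,
# u_i lumps together the unknown disturbances on each occasion of measurement.
rng = np.random.default_rng(42)
M_true = 327.5                                # quantity to be estimated (degrees C); illustrative
n = 25                                        # number of repeated measurements; illustrative
u = rng.normal(loc=0.0, scale=0.8, size=n)    # assumed: independent, zero-mean, normal errors
d = M_true + u                                # the observed data d_i

M_hat = d.mean()                              # unbiased, minimum-variance estimate under these assumptions
std_err = d.std(ddof=1) / np.sqrt(n)          # standard error of the mean
print(f"estimated melting point: {M_hat:.2f} +/- {std_err:.2f} degrees C")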
Note that, by contrast, to use the d_i for the quite different purpose of estimating the value of M, one does not need such information about the individual values of the u_i—instead, as explained in more detail below, it is enough if one knows very general facts about the distribution of the u_i. It is thus perfectly possible to be in the position
of being able to use the data to estimate the melting point without being able to provide detailed explanations of the data or derivations of the data from independently supported premises. Some commentators (e.g., Votsis 2009) have responded to these contentions by arguing that even if claims about data are not derivable from explanatory theories by themselves (or even if such derivations are not commonly provided in science), it nonetheless must always be possible "in principle" to derive claims about data from explanatory theories when conjoined with "suitable auxiliaries" including hypotheses about phenomena themselves, the operation of instruments, the distribution of various sources of error that are present, and so on. My response to this, again developed in more detail below, is severalfold. First, the contention that claims about data are derivable "in principle" from (some) true hypotheses is in itself completely trivial and unilluminating if one is interested in understanding data to phenomena reasoning as it actually figures in science. Suppose that phenomenon P is in fact present on some occasion i in circumstances C and datum d_i occurs on this occasion. Then one can derive that datum d_i occurs from the premises (1) If P is present on occasion i in circumstances C, d_i occurs, and (2) P is present on occasion i in circumstances C. Relatedly, within a hypothetico-deductive (HD) framework of confirmation, one might attempt to take the observation of d_i to provide, in conjunction with (1), HD confirmation of (2). Needless to say, the in-principle possibility of such a derivation has no tendency at all to show that scientists actually make use of such derivations or that they play some functionally useful role in scientific reasoning. In particular, because of the well-known defects of HD confirmation, the derivability of d_i from (1) and (2), or from other claims that include the occurrence of P, does not by itself warrant any inductive conclusions about P. Our claims about the relationship between data and phenomena were intended as claims about science and scientific theorizing as actually practiced (and what is required for data to provide evidence) and not about what is possible in principle, in a sense that is disconnected from what scientists do or should do. What is most important from our perspective is that to establish that data are evidence for phenomena it is neither necessary nor sufficient as a normative matter to provide derivations/explanations of data. Indeed, focusing on the possibility of such derivations gets things backwards. What matters is not that we be able to infer "downward" to features of the data from theory and other assumptions; rather, what matters is whether we are able to infer "upward" from the data (and other assumptions) to the phenomenon of interest. Data are scientifically useful and interesting insofar as they provide information about features of phenomena. Thus Eddington infers from features of his photographs (and various background assumptions) to a value for the deflection of starlight rather than trying to infer or derive characteristics of the photographs from other assumptions. Similarly, the researcher measuring the melting point of lead infers from her data to an estimate of the melting point, rather than vice versa. The remainder of this paper is organized as follows. Section 2 describes some background motivations for the data/phenomena distinction. Section 3 qualifies some of the claims in our original papers.
Sections 4–6 explore some issues about the role of background assumptions and "theory" in data to phenomena reasoning.
2 Background and motivation My original thinking about the data/phenomena distinction was prompted in part by some issues concerning statistical explanation, and in fact these provide a natural illustration of some of the basic ideas of our 1988 paper. Consider a measurement of some observable (e.g., position) performed on a quantum mechanical system that is not in an eigenstate for the operator corresponding to that observable. With sufficient knowledge of the system, QM allows one to derive the probability with which various outcomes will occur, but not to derive (that is, deduce with certainty) which particular outcome will occur. Although they differed in important respects, the standard models of statistical explanation developed in the 1960s and 1970s by Hempel, Salmon and others assumed that in this sort of case, if appropriate other conditions were satisfied, quantum mechanics explained why individual outcomes occurred rather than just the probabilities with which they occurred, even when those probabilities were strictly between zero and one. My contrary (and, at the time, heterodox) view was that in such cases theories like QM explained only the probabilities with which outcomes occurred but not the occurrence of the individual outcomes themselves. However, it also seemed to me completely uncontroversial (and fully consistent with my heterodox view about statistical explanation) that observations of individual measurement outcomes on quantum mechanical systems, when aggregated into claims about the relative frequencies with which such outcomes occurred, could serve as evidence for or against the predictions of QM about probabilities. (Standard methodologies of statistical inference show how to use such frequency information to assess claims about probabilities.) Suppose that one thinks of the outcomes o_i of individual measurements repeatedly made on some observable under the same conditions as data, claims about the probability with which those outcomes occur as phenomena, and QM as the theory which explains such phenomena. Then one has the essential ingredients of the data–phenomena–explanatory-theory picture defended by Bogen and Woodward. The data o_i are evidence for phenomena (probability claims) but only claims about phenomena and not claims about individual data outcomes are derivable from QM. Indeed the latter are not derivable even from QM in conjunction with theories about the operation of measuring instruments and so on, since such outcomes are irreducibly stochastic. If one rejects the idea that statistical theories like QM explain individual outcomes when these have intermediate probabilities, then QM explains phenomena (in the form of claims about probabilities) but not the data that are evidence for those phenomena. (One may think of the data as indirect evidence for QM since they are evidence for probability claims that are evidence for QM.) Even if one rejects these claims about explanation, the claims about the derivability of the o_i remain. Our 1988 paper claimed that a similar set of interrelations among data, phenomena and explanatory theory occurs in many areas of science. A second motivation for our 1988 paper can be found in the body of ideas that came to be called the New Experimentalism and the associated slogan, due to Hacking (1983), that "Experiment has a life of its own". The central ideas of the new
experimentalism are probably sufficiently familiar not to require detailed recounting. But very briefly, I understand them to include (at least) the following: Experimental techniques, ideas about experimental design, design of instruments, procedures for analyzing experimental data, and so on often seem to develop relatively independently of "high" theory (meaning, roughly, explanatory theory in our sense). Although experiments are sometimes explicitly conducted with the purpose of testing previously formulated theories, it is also common for experimentation to uncover new features of the natural world (phenomena) which were not predicted by any existing theory. Sometimes experiments are (legitimately) conducted in an exploratory fashion, in which researchers look for interesting effects or investigate new features of known phenomena, in the absence of any previously worked out theory that explains those phenomena. One thought behind our 1988 paper was that similar points could be made regarding areas of scientific investigation that did not involve experimentation (understood as active manipulation of nature) but instead involved the generation of data by more passive forms of observation. Here too one found a variety of assumptions and techniques for data analysis that also seemed to have "a life of their own" in the sense that their development and the considerations that bore on their reliability often seemed relatively independent of high theory. These included many different sorts of statistical techniques for analyzing observational data and handling error found in it (with different techniques being appropriate for different sorts of data—e.g., for time series rather than cross-sectional data), various sorts of "data mining" procedures, which may or may not be statistical in character, subject-matter-specific ideas about possible confounding factors that might affect observational studies and how to deal with these, ideas about how to measure or operationalize quantities of interest (not necessarily provided by theory), subject-specific simplifying assumptions in data analysis (such as the spatial smoothing assumptions in fMRI analysis described below), and so on. By distinguishing between data and phenomena, raising the question of how scientists reasoned from the former to the latter, and taking seriously the possibility that such reasoning often had its own distinctive characteristics (which might be different from the reasoning used to establish connections between phenomena and explanatory theory), we hoped to provide a framework for the exploration of the various sorts of considerations and techniques described above. My view is that forms of inference and discovery that involve reasoning with data and that are at least somewhat independent of prior detailed theoretical understanding have become even more prominent in science than they were when we wrote Bogen and Woodward (1988). In areas ranging from genomics to neurobiology to climatology, we have huge amounts of data and often limited theoretical understanding—the current level of development in these areas is, as the cliché has it, "data rich and theory poor". The development of techniques for extracting information from data under such circumstances is thus a matter of considerable importance. Understanding the distinctive features of data to phenomena reasoning also should be high on the agenda of philosophers of science.
3 Corrections and restatements Although much of "Saving the Phenomena" seems to me to stand up fairly well, we also advanced claims that now strike me as exaggerated and insufficiently qualified and others that (with the benefit of hindsight) I wish we had expressed differently. For starters, we were far too willing to formulate our central contentions as claims about what "typically" or even always happens in "science"—thus, we said phenomena are "typically" or usually not observable, that theories don't (ever?) explain data, and so on. These formulations now strike me as overgeneralizations and (more importantly) unnecessary. When one considers the enormous variety of activities and disciplines that fall under the heading of "science" and the great variation these exhibit over time, it seems implausible that there will be many interesting exceptionless truths about how science always works. Formulating claims in terms of what is "typical" or "usual" rather than what always happens makes them vaguer and harder to assess empirically, but arguably does not much enhance their plausibility or usefulness. Moreover, such sweeping formulations now seem to me largely unnecessary for the points we were trying to make. As emphasized above, our goal was to provide a framework that made sense of various sorts of data-based reasoning, one that grants that such reasoning can be relatively independent of certain kinds of theory. In support of this framework, it is enough to show that reasoning from data to phenomena can (and not infrequently does) successfully proceed without reliance on theoretical explanation of data. It was an unnecessary diversion to claim that this was always or even usually the case. A similar point holds in connection with the claim in our original paper that phenomena, in contrast to data, are typically not observable (in senses of observable closely linked to human perception). As Paul Teller has argued,1 there are plausible examples of observable phenomena, at least if we do not restrict this notion in question-begging ways: for example, various phenomena associated with the operation of the visual system such as color constancy in changing light and pop-out effects in visual attention. Similarly for some optical phenomena, such as the so-called Poisson spot, which appears at the center of the shadow cast by a circular object illuminated by a point source. What we should have said (I now think) is that phenomena need not be observable and that in many cases standard discussions of the role of observation in science shed little light on how phenomena are detected or on the considerations that make data to phenomena reasoning reliable—asking whether phenomena are observable is often not the right question to ask if one wishes to understand how such reasoning works. This is because the reliability of such reasoning often has little to do with how human perception works. Another, related point that should have been clearer in our original paper is that (contrary to what some commentators have supposed) our goal was not to replace "top-down" theory-dominated views of science with an equally monolithic view in which scientific reasoning is always understood as "bottom-up" or purely data-driven. Instead our goal was to advance a more pluralistic understanding of science, in which, depending on goals and contexts, sometimes data-driven reasoning and sometimes
1 In a talk at PSA 2008, Pittsburgh, November, 2008.
theory played a more leading role. It goes without saying that different areas of science face very different sorts of problems and are in different epistemic situations, and these problems may require very different strategies for their solution. There is no general answer to questions about how data-driven or theory-driven scientific research should be—it all depends on what the problem is, what sorts of information are available, and so on.

4 The role of background assumptions and goals in data to phenomena reasoning

I turn next to some general remarks about the structure of data to phenomena reasoning and about the role played by "background" assumptions, as well as goals or interests, in such reasoning. Some commentators have interpreted Bogen and me as claiming that researchers reason to conclusions about phenomena from data alone, without the aid of any additional assumptions that go beyond the data—as claiming that such conclusions somehow emerge just from data and nothing else. Other commentators have taken us to be claiming that the researcher's goals or interests play no role in this process. This is absolutely not our (or at least my) view.

Data to phenomena reasoning, like inductive reasoning generally, is ampliative in the sense that the conclusion reached (a claim about phenomena) goes beyond or has additional content besides the evidence on which it is based (data). I believe it is characteristic of such reasoning that it always requires additional substantive empirical assumptions that go beyond the evidence. Thus the general form of an inductive inference is not

(4.1) Evidence → Conclusion

but rather always involves (at least) something like this:

(4.2) Evidence + substantive empirical assumptions → Conclusion

This is true of data to phenomena reasoning as well. In fact, even schema (4.2) is incomplete. In many cases, inductive inference (including a great deal of data to phenomena reasoning) also relies on more or less explicit assumptions about epistemic goals or ends, including attitudes toward risk and the costs associated with various mistakes. I will return to this point below but for now want to further explore the role of empirical assumptions in (4.2). These may take a huge variety of different forms, which vary greatly from case to case. There is thus no one single grand assumption (e.g., that nature is uniform) which serves as a foundation for all inductive reasoning, but rather many different possible assumptions, the appropriateness of which will depend on the case at hand.

Sometimes these assumptions will be highly specific. For example, when data consisting of the amount of carbon-14 in a fossil are used to arrive at a date for the fossil (as part of a reconstruction of, e.g., a migration pattern or a phylogenetic tree, the phenomenon of interest), assumptions about the half-life of carbon-14 and about the way in which soil conditions and atmospheric exposure may affect the presence of carbon in bones are among the background assumptions employed.
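A minimal sketch of how such a background assumption enters the inference in schema (4.2) might look as follows. It is purely illustrative: it uses the standard half-life value for carbon-14 and ignores calibration curves, contamination, and measurement error, all of which matter in real radiocarbon dating.

```python
import math

# Standard half-life of carbon-14 in years -- a background empirical
# assumption that is not itself read off the data at hand.
HALF_LIFE_C14 = 5730.0

def radiocarbon_age(fraction_remaining: float) -> float:
    """Age (in years) implied by the measured fraction of C-14 remaining,
    assuming simple exponential decay and no contamination or exchange
    with the surrounding soil or atmosphere."""
    return -HALF_LIFE_C14 * math.log(fraction_remaining) / math.log(2)

# A sample retaining 25% of its original C-14 is roughly two half-lives old:
print(round(radiocarbon_age(0.25)))   # ~11460 years
```

The data here are the measured carbon-14 fraction; the half-life and the decay law are the substantive empirical assumptions conjoined with those data to reach a conclusion about the phenomenon of interest.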
In other cases, the relevant assumptions may be more generic, in the sense that they may be motivated by very general features of the situations to which they apply. For example, in statistical inference, assumptions about the probability distribution governing the error—whether it is normal, whether the error is correlated with other factors, and so on—will often be crucial, since the reliability of various estimating procedures will depend on such assumptions. Thus, when measuring the melting point of lead, general empirical assumptions about the factors influencing the measurement results may be used to motivate the assumption of a particular error distribution. For example, the assumption that there are many factors influencing the measurement results, that these operate independently, and that they have a combined effect that is additive may be used, via the central limit theorem, to motivate the assumption that the error distribution is normal. As a final example, consider the common use of spatial smoothing procedures in the analysis of fMRI data. Each individual voxel measurement is noisy; it is common to attempt to improve the signal to noise ratio by averaging each voxel with its neighbors, weighted by some function that falls off with distance. This procedure depends (among other considerations) on the empirical assumption that the activity of each voxel is more closely correlated with that of its nearby spatial neighbors than with that of more distant voxels.
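The following sketch shows both moves in miniature. It is only illustrative: the grid size, kernel width, and noise levels are invented for the example, and real fMRI pipelines involve far more than this.

```python
import numpy as np

rng = np.random.default_rng(0)

# (a) Central-limit motivation for a normal error distribution: an error term
# that is the additive combination of many small, independent factors is
# approximately normal even when the individual factors are not.
n_measurements, n_factors = 10_000, 50
factors = rng.uniform(-0.01, 0.01, size=(n_measurements, n_factors))
error = factors.sum(axis=1)
within_1sd = np.mean(np.abs(error - error.mean()) < error.std())
print(f"fraction within 1 sd: {within_1sd:.3f}   (normal distribution: ~0.683)")

# (b) Spatial smoothing: replace each "voxel" by a weighted average of its
# neighbours, with Gaussian weights that fall off with distance.
truth = np.zeros((64, 64))
truth[28:36, 28:36] = 1.0                      # a small "active" region
noisy = truth + rng.normal(0.0, 0.8, truth.shape)

def gaussian_kernel(size: int = 7, sigma: float = 1.5) -> np.ndarray:
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def smooth(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    pad = kernel.shape[0] // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            window = padded[i:i + kernel.shape[0], j:j + kernel.shape[1]]
            out[i, j] = (window * kernel).sum()
    return out

smoothed = smooth(noisy, gaussian_kernel())
# Noise in the inactive background shrinks after smoothing; the gain in
# signal-to-noise depends on the empirical assumption that nearby voxels
# carry correlated signal while the noise is largely independent.
print("background sd:", round(noisy[:20, :20].std(), 2),
      "->", round(smoothed[:20, :20].std(), 2))
```

In both parts, the improvement delivered by the procedure stands or falls with the generic empirical assumptions just described, not with any explanatory theory of the data.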
In each of these cases the "additional" assumptions employed "go beyond the data" in the sense that they are not warranted just by the data generated in the particular experimental or observational context at hand, but rather have some independent source. These assumptions are conjoined with the data (or assumed in some way in the data analysis) to reach conclusions about phenomena. However, that such assumptions "go beyond the data" does not mean that they are arbitrary, empirically unfounded, untestable, or matters for stipulation or convention. Instead, assumptions like those described above are ordinary empirical assumptions, which are either true or false of the situations to which they are applied. When necessary, such assumptions often can be tested by ordinary empirical procedures. Such assumptions are sometimes "theoretical" in the sense that they concern factors that are not observed. (For example, the error distribution is not usually regarded as something that is "observed".) However, in many cases such assumptions will not be "theoretical" in the sense that they are part of a theory whose role is to provide substantive, detailed explanations. For example, one does not usually think of assumptions about the distribution of the error term as playing such an explanatory role—instead the error term is responsible for the "unexplained" variance in the measurements. Arguably a similar point holds for the use of spatial smoothing assumptions in fMRI analysis—their role is not one of explanation.

I have stressed these points about the need for additional assumptions in data to phenomena reasoning because Bogen and I are sometimes interpreted as radical inductivists. Schindler (2007, p. 165) writes:

Bogen and Woodward [believe] that unobservable phenomena are—without the mediation of the theory—inferred from observable data.

Elsewhere Schindler describes us as arguing for a "'bottom-up' construction of phenomena from data without the involvement of theory" (2007, p. 160). If "without the mediation" or "without the involvement" of theory means "without any additional substantive empirical assumptions", I repeat that this is just not our view. Our view is that such assumptions are always required in data to phenomena reasoning.
(I will also add, in anticipation of discussion below, that there are also cases in which data to phenomena reasoning is "mediated" by the very theory that explains those phenomena. What is excluded by the view defended in our 1988 paper is that this mediating role takes the form of theory providing explanations of the data.2)

I said above that, in addition to substantive empirical assumptions, inductive inference (including data to phenomena reasoning) often relies on (or is guided by) evaluative considerations having to do with the investigator's choice of goals, interests, and attitudes toward risk. Putting aside implausibly strong forms of realism about epistemic values, it seems natural to suppose that such choices can be reasonable or unreasonable, but that (unlike the empirical assumptions that figure in inductive inference) they are not straightforwardly true or false—they are not, as it were, forced on us by nature. (Nonetheless, the reasonableness of such evaluative choices may depend in part on assumptions that are straightforwardly true or false: to take an example discussed below, whether it is reasonable to employ a certain estimator may depend on whether certain descriptive assumptions about the distribution of the error term hold.) The goals and values figuring in data to phenomena reasoning can take many different forms, some of which are obvious and unremarkable, others of which may have a rich conceptual/mathematical structure.
(4.3) An investigator who wishes to determine the melting point of lead will have this as her goal or cognitive interest, and this will structure her investigation in certain ways—she will proceed differently if instead she wishes to measure the boiling point of water. The investigator's project will be misguided if certain factual assumptions on which it rests are false (for example, if there is no such thing as "the" unique melting point of all pure samples of lead). Nonetheless, the choice of this particular goal is not dictated by nature (or by the data), and the investigator might reasonably have chosen other goals instead.
(4.4) An investigator wishes to estimate the point value of some quantity from statistical data. A standard practice within frequentist statistics is to choose an estimator satisfying certain evaluative criteria. For example, it is usually thought desirable that estimators be unbiased, where an estimator x* of a parameter X is unbiased if E(x*) = X. Moreover, among unbiased estimators, it is standard to choose an (or the) estimator that is "best" in the sense of having minimum variance. If we require also that the estimator be a linear function of the data, we arrive at the familiar notion of a best linear unbiased estimator (BLUE). Again, the adoption of these criteria for what makes an estimator good is not a choice that is forced on us by nature. In some circumstances it might make sense to choose an estimator that is slightly biased but has smaller variance over an unbiased estimator with higher variance.
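A small simulation may make the trade-off at the end of (4.4) concrete. The numbers are invented for illustration; the point is only that "unbiasedness" is one evaluative criterion among others, and that a slightly biased estimator can do better by a different criterion (mean squared error).

```python
import numpy as np

rng = np.random.default_rng(1)
true_var, n, trials = 4.0, 10, 200_000

samples = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))

# Unbiased estimator of the variance (divide by n - 1) versus the slightly
# biased maximum-likelihood estimator (divide by n), which has smaller variance.
unbiased = samples.var(axis=1, ddof=1)
biased = samples.var(axis=1, ddof=0)

for name, est in [("unbiased (n-1)", unbiased), ("biased (n)    ", biased)]:
    bias = est.mean() - true_var
    mse = ((est - true_var) ** 2).mean()
    print(f"{name}  bias = {bias:+.3f}   mean squared error = {mse:.3f}")

# For normally distributed data the n-divisor estimator is biased downward
# but achieves the lower mean squared error, so preferring unbiasedness is
# an evaluative choice, not something dictated by the data.
```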
2 We also do not require that the theories which figure in data to phenomena reasoning be independent of the theory which explains or predicts those phenomena—see Sect. 6.
5 Skepticism and foundationalism in data to phenomena reasoning

If additional assumptions are always required to get from data to phenomena (and such inferences may also be guided by goals and interests), this creates an opening for the skeptically minded. This is the dialectic exploited by James McAllister in a series of papers criticizing our views (e.g., McAllister 1997). McAllister observes that claims about phenomena do not follow just from the data alone and that something more is required for such inferences. According to McAllister, this "something more" has the character of a stipulation. In particular, such reasoning involves the decomposition of data into a signal or pattern (corresponding to a phenomenon) and a "noise level", the latter being "stipulated" given the investigator's cognitive interests and attitudes toward risk. An indefinitely large number of different stipulations and decompositions are legitimate. It follows, McAllister claims, that we must conceive of the world as "radically polymorphic"—any body of data contains or corresponds to indefinitely many phenomena.

For reasons that will emerge below, I do not find McAllister's notion of the choice of a noise level very clear. If we interpret this as reflecting the investigator's interests, or attitudes toward the risk of mistakes, then (as I have already said) I fully agree with the general idea that these play an important role in inductive inference, including data to phenomena reasoning. However, as we have seen, substantive assumptions figure too. I believe McAllister is simply wrong in supposing that substantive assumptions like those described above (the assumption that carbon-14 has a certain half-life, or the assumption that the error is normally distributed in a certain measurement) are just matters of stipulation (of noise level or of anything else). Instead, such assumptions are empirical claims that, given a particular context, are either true or false. And in the case of many possible phenomena claims (e.g., the claim that the melting point of lead is 627 K), the empirical assumptions required to license their reliable inference from data are false. True empirical assumptions will not license inconsistent phenomena claims from the same data, although different true assumptions may license different but consistent phenomena claims from the same data. We need to distinguish the role of goals and interests (which may have a stipulative element) from the role of empirical assumptions in data to phenomena reasoning. McAllister's notion of a "noise level" encourages a conflation of these two kinds of considerations.3

More generally, we can say that McAllister's focus on the need for additional assumptions/stipulations in data to phenomena reasoning draws attention to general features of all inductive inference. These features are inescapable; it is not a flaw in our account that it recognizes a role for them. Our account was never intended to answer the sort of skeptic who demands to be shown how conclusions about phenomena follow from data alone, without further substantive assumptions. In Bogen and Woodward (1988), our interest was in describing scientific contexts in which (we assumed) not just data but also a justifiable basis for other sorts of substantive assumptions that go beyond the data were available.
3 "Noise level" might refer to something like the choice of a significance level, which arguably reflects a stipulation or convention, but it might also reflect assumptions about the distribution of the error term, which are not a matter of stipulation.
We saw our task as identifying those assumptions, elucidating how they figured in data to phenomena inference, and (in some cases) showing how they were justified, but not as one of providing a presuppositionless foundation for them.

Once it is recognized that goals and interests play a role in guiding data to phenomena reasoning (and inductive inference more generally), an important issue is where they (and whatever "relativity" or "subjectivity" they introduce) should be "located". One possibility, which I take to be the alternative favored by McAllister, is to locate these "out in the world", so to speak, or at least in the way we think about the world—that is, in the phenomena themselves, or in the way in which we conceptualize phenomena. In other words, one takes the fact that goals and interests affect which data to phenomena inferences investigators make to show that the phenomena themselves (or the truth-values of claims about them) are in some way relative to the interests, goals, or perspective of the investigator. This line of thought encourages the idea that reality itself should be regarded as radically polymorphic, containing different phenomena corresponding to each of the goals and stipulations the experimenter is willing to adopt.

An alternative view, which I find more plausible, is to take the relativity to goals and interests that figures in data to phenomena reasoning to attach to investigators and their choices rather than to the phenomena themselves. On this alternative view, someone who decides to measure the melting point of lead rather than the boiling point of water is influenced by her interests, but this does not mean that the value of the melting point of lead is an "interest-relative" matter, or that given one set of cognitive goals or stipulations about an appropriate error level the melting point might legitimately be regarded as, say, 327°C, and given another set of interests and goals it might legitimately be regarded as 627°C. Similarly, in an investigation of the bias of a coin in which the data are repeated coin flips, a researcher who employs a significance test with a significance level of 0.05 has a different attitude toward the costs of a certain kind of mistake (she adopts a different noise or error level) than a researcher who employs a significance level of 0.001. This might lead the first researcher to reject the hypothesis that the coin is fair in circumstances in which the second researcher does not. But this does not mean (at least on the usual interpretation of significance testing) that whether the coin is biased itself depends on the researcher's attitude toward error, or that there is no such thing as the true bias of the coin independently of the researcher's interests. It just means that the researchers have made different choices about (or have different attitudes toward) the possibility of making a certain kind of mistake about the true (interest-independent) bias of the coin.
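A brief sketch of the coin example, with invented counts, shows how the same data can lead the two researchers to different decisions without the coin's true bias depending on either choice of error level (the example assumes scipy version 1.7 or later for the exact binomial test).

```python
from scipy.stats import binomtest   # requires scipy >= 1.7

# Illustrative data: 64 heads in 100 flips of a coin suspected of bias.
result = binomtest(k=64, n=100, p=0.5, alternative="two-sided")
print(f"p-value = {result.pvalue:.4f}")        # roughly 0.007

for alpha in (0.05, 0.001):
    decision = "reject 'fair coin'" if result.pvalue < alpha else "do not reject"
    print(f"significance level {alpha}: {decision}")
# The researcher using alpha = 0.05 rejects fairness; the researcher using
# alpha = 0.001 does not. The choice of alpha reflects attitudes toward a
# certain kind of mistake; it does not make the coin's bias interest-relative.
```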
6 The "involvement" of explanatory theory in data to phenomena reasoning

I suggested above that explanatory theories often aim at providing explanations of phenomena and that it is difficult to provide systematic explanations of data from general theory, even in conjunction with theories of instruments, non-trivial auxiliaries, etc. Moreover, and more importantly, for data to provide reliable evidence for phenomena it is not necessary to provide such explanations. I want now to expand on these claims. I begin by emphasizing what they do not say. Suppose that we have a theory T that explains some claim about a phenomenon P, and P is established by reasoning from
data D. Some commentators have taken Bogen and Woodward (1988) to be contending that theory T cannot in any way be "involved" in the reasoning from D to P. We do not make this claim, and I think it is obviously false. Instead, what we claim is that T commonly does not play (and need not play) one particular kind of role in reasoning from D to P—that of providing systematic explanations of D. This is not to say that T plays no role at all in reasoning from D to P. This is not mere hair-splitting; a major problem with discussions of the role of "theory" in data to phenomena reasoning (or, for that matter, in connection with the so-called theory-ladenness of observation) is that philosophers fail to discriminate among the very different ways that theories can be "involved" in this process. They also fail to distinguish among very different things that go under the heading of "theory". Different "involvements" by different sorts of theories can have quite different epistemological implications for data to phenomena reasoning. We need to keep these distinct.

To begin with a very simple case, suppose theory T explains some phenomenon P that a researcher wishes to measure or detect from data D. Suppose also that T provides a motivation for measuring P: the researcher measures P because she wants to test T, or because T says P is an important quantity that plays a fundamental role in nature—the researcher would not have attempted to measure P if she did not regard T as a serious possibility. Perhaps also T provides a vocabulary for characterizing the results of measuring P. In this case, there will be an obvious sense in which T is "involved" in reasoning from D to P—the researcher would not have engaged in this reasoning at all, or would not have used concepts drawn from T to describe its results, if she did not accept T or at least take it seriously. However, this sort of involvement of T in data to phenomena reasoning does not necessarily mean that T is being used to explain D or that D cannot be evidence for P unless T is conceived as playing this explanatory role.

For example, it seems plausible that Eddington would not have performed the measurements on his solar eclipse expedition, and would not have described the results of these measurements in the way that he did (that is, as measurements of the deflection of light by the sun's gravitational field), if he did not accept various "theoretical" claims—most obviously, that gravity influences the behavior of light. In an obvious sense these theoretical assumptions helped to structure Eddington's measurement activities—they played a motivating role (in suggesting a goal or target to aim at) and provided a vocabulary for describing the results of the measurement. This does not mean, however, that GR or the other explanatory theories about gravity and its coupling with light under test played the role of explaining Eddington's data. Still less does it follow that Eddington's data analysis required that he assume that one of these theories played this explanatory role or even that one of them was correct.

I said above that different "involvements" by different sorts of theories in data to phenomena reasoning can have very different epistemological implications.
One particular philosophical worry is that the involvement of theory in such reasoning is of such a character as to introduce a "vicious" circularity of some kind, with the involvement of T in the data analysis somehow guaranteeing that the results of such analysis will support T regardless of what the data are, or at least making it impossible to detect that T is false when it is. It is thus important to note that it is entirely possible for T to be involved in some way in data to phenomena reasoning without introducing this
sort of vicious circularity. For example, in Millikan's oil drop experiment, the mere fact that theoretical assumptions (e.g., that the charge of the electron is quantized and that all electrons have the same charge) played a role in motivating his measurements or provided a vocabulary for describing his results does not by itself show that his experimental design and data analysis were of such a character as to guarantee that he would obtain results supporting his theoretical assumptions. His experiment was such that he might well have obtained results showing that the charge of the electron was not quantized or that there was no single stable value for this quantity. A similar point holds for Eddington's measurements.

In general, determining whether T is involved in D to P reasoning (when P is some phenomenon which is used to test T) in such a way as to introduce damaging circularity is far from straightforward—it is not obvious that this can be captured just in terms of logical relationships involving T, P, and D.4 To illustrate this point, consider a schematic example. Theory T is taken seriously but is not known to be true; T predicts (and if true would explain) phenomenon P; and data D provide a prima facie case that P occurs. (T might be the electroweak theory, P the existence of neutral currents, and D photographs produced by the Gargamelle bubble chamber.) Suppose, however, that we are at the limits of available experimental technology: the data D are very noisy, we believe there may be unknown sources of error in the experiment, and background assumptions used in analyzing the data are based on simulations in which we are not completely confident. Based just on reasoning from the data and the assumptions adopted in the data analysis (and ignoring the fact that T predicts P), we are far from fully confident that we have detected P. In such a case, the very fact that T predicts P might boost our confidence that the error has been adequately controlled for, that the simulations are not radically mistaken, and that the experiment has successfully detected P. In other words, the data-based reasoning to P and the fact that T predicts P mutually reinforce our confidence that P is real—the two sets of considerations are in a positive feedback relation with each other. That P apparently obtains might in turn be important evidence in favor of T. This is a case in which T figures in an important way in the overall process leading to acceptance of P, and P in turn supports T.

Note, however, that even in this case (i) T does not (or at least need not) play the role of explaining individual items of data, and (ii) it is far from obvious that the process described is automatically viciously circular or that it fails to provide a legitimate basis for increased confidence in P (or T, for that matter). Of course there would be no basis for increased confidence in P if (to mention an extreme possibility) T influences the researchers' reasoning in such a way that they would accept P completely independently of whatever data were produced in the experiment—i.e., if their commitment to T is such that it would lead them to interpret virtually any data as evidence for P (dismissing discrepant data as due to error, etc.). However, the story I have told does not require that this be the case.
4 Another illustration: in order to test the hypothesis that metals expand linearly with temperature, a thermometer is employed which is calibrated on the assumption that the metal it contains (e.g., mercury) expands linearly with temperature. This may seem blatantly circular, but a little thought will show that there are possible results that might falsify the hypothesis—for example, if mercury and the metal tested expand in accord with different functions of temperature, one of which is non-linear, this discrepancy could be detected. See Chang (2004). Two questions: (i) Is this procedure objectionably circular? (ii) If it is not, how should we characterize the kinds of circularity that are objectionable?
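To make the point of footnote 4 concrete, here is a tiny numerical sketch in which mercury really is linear while the tested metal is not; all coefficients are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical expansion laws (coefficients invented for the example).
def mercury_column(T):            # treated as linear when calibrating
    return 10.0 + 0.002 * T

def metal_length(T):              # the metal under test is secretly non-linear
    return 20.0 + 0.001 * T + 5e-6 * T**2

true_T = np.linspace(0.0, 100.0, 21)

# Calibrate the thermometer at two fixed points, assuming mercury is linear.
h0, h100 = mercury_column(0.0), mercury_column(100.0)
readings = 100.0 * (mercury_column(true_T) - h0) / (h100 - h0)

# Test the hypothesis that the metal expands linearly with measured temperature.
L = metal_length(true_T)
slope, intercept = np.polyfit(readings, L, 1)
residuals = L - (slope * readings + intercept)

# The residuals show a systematic curved pattern rather than random scatter,
# so the non-linearity is detectable despite the "circular" calibration.
print(np.round(residuals, 4))
```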
This last example is a case in which top-down, theory-driven reasoning plays an important role in phenomenon detection and experimental reasoning—it helps to convince researchers that error has been adequately controlled and that the phenomenon is real. Our claims about the data/phenomena distinction should not be interpreted as requiring that the kind of top-down reasoning described above never occurs. We do claim, however, that not all data to phenomena reasoning is theory-driven in the way just described. The history of science is full of examples in which phenomena are detected or noticed in observations or exploratory experiments without the investigator being in possession of any prior theory that explains or predicts the phenomenon or, indeed, provides any reason to expect it.5

Acknowledgments Many thanks to the participants at the Heidelberg conference on "Data, Phenomena, Theories" for helpful comments.
References

Bogen, J., & Woodward, J. (1988). Saving the phenomena. The Philosophical Review, 97(3), 303–352.
Bogen, J., & Woodward, J. (1992). Observations, theories and the evolution of the human spirit. Philosophy of Science, 59, 590–611.
Bogen, J., & Woodward, J. (2005). Evading the IRS. In M. Jones & N. Cartwright (Eds.), Correcting the model: Idealization and abstraction in science. Poznan studies (pp. 233–267). Holland: Rodopi Publishers.
Chang, H. (2004). Inventing temperature: Measurement and scientific progress. Oxford: Oxford University Press.
Glymour, B. (2000). Data and phenomena: A distinction reconsidered. Erkenntnis, 52, 29–37.
Hacking, I. (1983). Representing and intervening. Cambridge: Cambridge University Press.
Kaiser, M. (1991). From rocks to graphs—The shaping of phenomena. Synthese, 89, 111–133.
McAllister, J. (1997). Phenomena and patterns in data sets. Erkenntnis, 47, 217–228.
Schindler, S. (2007). Rehabilitating theory: Refusal of the 'bottom-up' construction of scientific phenomena. Studies in History and Philosophy of Science, 38, 160–184.
Votsis, I. (2009). Making contact with observations. PhilSci Archive. Proceedings of the first conference of the European Philosophy of Science Association. Accessed January 20, 2009.
Woodward, J. (1989). Data and phenomena. Synthese, 79, 393–472.
Woodward, J. (2000). Data, phenomena, and reliability. Philosophy of Science (Proceedings of PSA 1998), 67, S163–S179.
5 A number of examples are provided in Hacking (1983)—see especially pp. 152–154.